Description
Describe the problem
Hi, CockroachDB developers. In CockroachDB's docs, COALESCE(...)
's description is: First non-NULL expression with Short Circuit. As described in the documentation, CockroachDB short-circuits the COALESCE(...)
function. While the parameters in the COALESCE(...)
function are subqueries (e.g., SELECT COALESCE((SELECT COUNT(c1) FROM t1), (SELECT COUNT(c2) FROM t2)) AS result;
), CockroachDB will fail to short-circuit.
To Reproduce
Please reproduce the above issue as follow. CockroachDB will wait for all subqueries to complete before performing COALESCE(...)
. A better strategy is to short-circuit when the first non-empty result is obtained.
CREATE TABLE t1(c1 INT8);
CREATE TABLE t2(c2 INT8);
INSERT INTO t1 VALUES (1);
INSERT INTO t2 SELECT i FROM generate_series(1, 10000000)
-- positive case
SELECT COALESCE(1,(SELECT COUNT(c2) FROM t2)) AS result;
result
----------
1
(1 row)
Time: 1ms total (execution 1ms / network 0ms)
SELECT COUNT(c1) FROM t1;
count
---------
1
(1 row)
Time: 2ms total (execution 2ms / network 0ms)
-- negetive case
SELECT COALESCE((SELECT COUNT(c1) FROM t1),(SELECT COUNT(c2) FROM t2)) AS result;
result
----------
1
(1 row)
Time: 3.850s total (execution 3.849s / network 0.001s)
explain SELECT COALESCE((SELECT COUNT(c1) FROM t1),(SELECT COUNT(c2) FROM t2)) AS result;
info
------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• root
│
├── • values
│ size: 1 column, 1 row
│
├── • subquery
│ │ id: @S1
│ │ original sql: (SELECT count(c1) FROM t1)
│ │ exec mode: one row
│ │
│ └── • group (scalar)
│ │ estimated row count: 1
│ │
│ └── • scan
│ estimated row count: 1 (100% of the table; stats collected 1 day ago)
│ table: t1@t1_pkey
│ spans: FULL SCAN
│
└── • subquery
│ id: @S2
│ original sql: (SELECT count(c2) FROM t2)
│ exec mode: one row
│
└── • group (scalar)
│ estimated row count: 1
│
└── • scan
estimated row count: 10,000,000 (100% of the table; stats collected 1 day ago)
table: t2@t2_pkey
spans: FULL SCAN
(33 rows)
Time: 7ms total (execution 6ms / network 1ms)
When executing the same case in MariaDB/MySQL, they only consume little time. I'm sorry to use this comparison to illustrate this report. In my opinion, all of MariaDB, MySQL, and CockroachDB are the great DBMSs. Especially among distributed DBMS, CockroachDB is almost the best one. So what I want to say is that we may find bugs in comparison (i.e., differential test).
MariaDB [test]> SELECT COALESCE((SELECT COUNT(c1) FROM t1),(SELECT COUNT(c2) FROM t2)) AS result;
+--------+
| result |
+--------+
| 1 |
+--------+
1 row in set (0.003 sec)
mysql> SELECT COALESCE((SELECT COUNT(c1) FROM t1),(SELECT COUNT(c2) FROM t2)) AS result;
+--------+
| result |
+--------+
| 1 |
+--------+
1 row in set (0.015 sec)
Expected behavior
It consumes little time as MariaDB and MySQL done.
SELECT COALESCE((SELECT COUNT(c1) FROM t1),(SELECT COUNT(c2) FROM t2)) AS result;
result
----------
1
(1 row)
Time: 0.0xx s total (execution 0.0xx s / network 0.001s)
Environment:
Server version: CockroachDB CCL v24.3.13 (x86_64-pc-linux-gnu, built 2025/05/13 17:14:36, go1.22.8X:nocoverageredesign) (same version as client)
Client app: cockroach sql
Jira issue: CRDB-51572
Metadata
Metadata
Assignees
Labels
Type
Projects
Status