Add new recursive semantics (`USING KEY`) by cryoEncryp · Pull Request #12430 · duckdb/duckdb
Merged: 45 commits merged into duckdb:main on Feb 20, 2025

Conversation

Contributor
@cryoEncryp cryoEncryp commented Jun 7, 2024

This PR adds the optional modifier USING KEY to recursive CTEs.
The semantics differ only slightly from those of regular recursive
CTEs, yet the performance impact can be substantial. Below, we
motivate the USING KEY modifier with two example queries that
focus on the efficient evaluation of common algorithms over graphs. We
have, however, found the modifier to be beneficial for a wide range of
recursive (or iterative) computations over relational data. An
in-depth discussion of the rationale, the semantics, and applications
of USING KEY can be found in the following paper:

"A Fix for the Fixation on Fixpoints"
Denis Hirn, Torsten Grust
Proceedings of the 13th Conference on Innovative Data Systems Research
(CIDR 2023), Amsterdam, The Netherlands, January 2023.
PDF available at https://www.cidrdb.org/cidr2023/papers/p14-hirn.pdf

(Note: in that paper we have used the syntax WITH ITERATIVE ... KEY.
In this PR, we use WITH RECURSIVE ... USING KEY instead.)

In a nutshell, USING KEY alters the behavior of a regular recursive CTE as follows:

  • In each iteration, a regular CTE appends result rows to the union
    table, which ultimately defines the overall result of the CTE.

  • In contrast, a CTE with USING KEY has the ability to update rows that
    have been placed in the union table in an earlier iteration: if the
    current iteration produces a row with key k, it replaces a row with the
    same key k in the union table. If no such row exists in the union table
    yet, the new row is appended to the union table as usual.

  • This allows a CTE to exercise fine-grained control over the union table
    contents. Avoiding the append-only behavior can lead to significantly
    smaller union tables. This helps query runtime and memory consumption,
    and makes it feasible to access the union table *while the iteration is
    still ongoing* (this is impossible for regular recursive CTEs): in a
    CTE WITH RECURSIVE T(...) USING KEY ..., table T denotes the
    rows added by the last iteration (as is usual for recursive CTEs), while
    table recurring_T denotes the union table built so far. References
    to recurring_T allow for the elegant and idiomatic translation of rather
    complex algorithms into readable SQL code.

Here is an example of a CTE with USING KEY, with a key column (a) and a payload column (b). The first iteration produces two rows with distinct keys, (1, 3) and (2, 4). The next iteration generates the row (3, 4), introducing the new key 3, and the row (2, 3), whose key 2 already exists from the previous iteration; the new payload 3 therefore overwrites the old payload 4. A final iteration produces (3, 3), which likewise overwrites (3, 4).

WITH RECURSIVE tbl(a, b) USING KEY (a) AS (
    SELECT a, b
    FROM (VALUES (1, 3), (2, 4)) t(a, b)
      UNION
    SELECT a + 1, b
    FROM tbl
    WHERE a < 3
)
SELECT *
FROM tbl;

The result of the CTE with USING KEY is as follows:

| a | b |
|---|---|
| 1 | 3 |
| 2 | 3 |
| 3 | 3 |

Normally, a recursive CTE adds all rows from each iteration to the union table, like the table below:

| a | b |
|---|---|
| 1 | 3 |
| 2 | 4 |
| 2 | 3 |
| 3 | 4 |
| 3 | 3 |

Even in this small example, the union table for USING KEY grows much more slowly than that of the regular recursive CTE.
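To make the upsert-by-key behavior concrete, here is a minimal Python sketch of the evaluation loop. The `keyed_fixpoint` helper and its shape are purely illustrative, not DuckDB internals: each round's rows upsert into the union table by key, and only rows that actually changed feed the next round.

```python
def keyed_fixpoint(initial, step, key):
    """Illustrative model of the USING KEY loop: each round's rows
    upsert into the union table by key; only changed rows recur."""
    union = {}                       # union table, keyed by key(row)
    working = list(initial)          # rows added/changed in the last round
    for row in working:
        union[key(row)] = row
    while working:
        changed = []
        for row in step(working, union):
            k = key(row)
            if union.get(k) != row:  # new key, or payload differs
                union[k] = row       # replace instead of append
                changed.append(row)
        working = changed
    return sorted(union.values())

# The recursive part of the example CTE: SELECT a+1, b ... WHERE a < 3
def step(working, union):
    return [(a + 1, b) for (a, b) in working if a < 3]

print(keyed_fixpoint([(1, 3), (2, 4)], step, key=lambda r: r[0]))
# [(1, 3), (2, 3), (3, 3)] -- matches the USING KEY result above
```

The loop terminates as soon as a round changes nothing in the union table, which is exactly why shrinking the set of changed rows helps so much in the benchmarks below.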

Performance

The example above is simple, but not very useful for comparing the computation time of a USING KEY CTE against a regular recursive CTE. We therefore formulated two graph algorithms that exploit the full potential of the new semantics, Connected Components (CC) and Distance Vector Routing (DVR), and measured the USING KEY variant against a handcrafted version written with standard WITH RECURSIVE. The queries use the LDBC SNB Person_knows_Person and Person tables as benchmark data.

CC

The connected components algorithm computes, for each node in a graph, the smallest id of any node it is connected to. With a scaling factor of 0.003, the WITH RECURSIVE query takes 78.394 s on my machine.

WITH RECURSIVE cc(node, front, comp) AS (
	SELECT n.id, n.id AS front, n.id AS comp
	FROM Person AS n
		UNION
	SELECT u.node, v.id AS front, v.id AS comp
	FROM cc AS u, knows AS e, Person AS v
	WHERE (e.person1id, e.person2id) = (u.front, v.id) 
		OR (e.person2id, e.person1id) = (u.front, v.id))
SELECT c.node, MIN(c.comp) AS comp
FROM cc AS c
GROUP BY c.node
ORDER BY c.node;

With the newly implemented semantics, the query below takes only 9.5 seconds.

WITH RECURSIVE cc(id, comp) USING KEY (id) AS (
	SELECT n.id, n.id AS comp
	FROM Person AS n
		UNION
	(SELECT DISTINCT ON (u.id) u.id, v.comp
		FROM recurring_cc AS u, cc AS v, knows AS e
		WHERE (e.person1id, e.person2id) = (u.id, v.id) 
			OR (e.person2id, e.person1id) = (u.id, v.id)
			AND v.comp < u.comp
		ORDER BY u.id asc, v.comp asc))
TABLE cc ORDER BY id;
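For readers who prefer to see the keyed iteration outside SQL, the min-label propagation that the USING KEY query expresses can be sketched in Python. The function name and edge-list input are hypothetical, and the sketch simplifies the round structure (updates made within a round are visible to later edges in the same round); the real evaluation happens inside DuckDB's operators.

```python
def connected_components(nodes, edges):
    """Sketch of min-label propagation with a union table keyed by
    node id: a node's entry is overwritten whenever a smaller
    component id arrives; only updated nodes join the next round."""
    comp = {n: n for n in nodes}   # union table: node -> smallest id seen
    changed = set(nodes)           # working table: rows changed last round
    while changed:
        nxt = set()
        for a, b in edges:         # undirected, like the OR in the query
            for u, v in ((a, b), (b, a)):
                if v in changed and comp[v] < comp[u]:
                    comp[u] = comp[v]   # keyed upsert: replace, not append
                    nxt.add(u)
        changed = nxt
    return comp

print(sorted(connected_components({1, 2, 3, 4, 5},
                                  [(1, 2), (2, 3), (4, 5)]).items()))
# [(1, 1), (2, 1), (3, 1), (4, 4), (5, 4)]
```

Because the union table holds at most one row per node, it stays linear in the node count instead of accumulating one row per propagation step.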

DVR

The second algorithm measured is DVR, which computes the shortest path between all pairs of nodes in a network. Again, we used LDBC SNB with a scaling factor of 0.003. The query using only WITH RECURSIVE takes 14 seconds.

WITH RECURSIVE paths(startNode, currNode, via, path, toReach, endReached, found) AS (
    SELECT -- define the path as the first edge of the traversal
         n1.id AS startNode, n1.id AS currNode, NULL :: HUGEINT AS via, 0 AS path, n2.id AS toReach, FALSE  AS endReached, FALSE
      FROM Person AS n1, Person AS n2
      WHERE n2.id <> n1.id
    UNION ALL
    SELECT -- concatenate new edge to the path
         paths.startNode AS startNode,
         person2id AS currNode,
         COALESCE(via, person2id) AS via, path+1,
         toReach,
         bool_or(person2id = toReach)
             OVER (PARTITION BY toReach, startNode ROWS BETWEEN UNBOUNDED PRECEDING
                            AND UNBOUNDED FOLLOWING) AS endReached,
         (person2id = toReach)
      FROM paths
      JOIN knows ON paths.currNode = person1id
      -- WHERE NOT array_contains(paths.path, person2id)
       AND NOT paths.endReached
 )
 SELECT startNode, toReach, via, path
 FROM paths WHERE found ORDER BY path, startNode, toReach;

With the newly implemented semantics, the query below takes only 0.7 seconds.

WITH RECURSIVE
    dvr(here, there, via, cost) USING KEY (here, there) AS (
      -- Start with DVR tables for all nodes, a table is identified by the here column.
      SELECT n.person1id, n.person2id, n.person2id :: string, 1:: double precision
      FROM   knows AS n
        UNION
      (SELECT n.person1id, dvr.there, dvr.here, (1 + dvr.cost) :: double precision AS cost
      FROM dvr -- Working table - Changes shared by neighbors.
      JOIN knows AS n -- Edges between the routes - needed to know how much weight the edge has.
          ON  n.person2id = dvr.here  -- SEND the update only to those tables that have an edge to sender.
          AND n.person1id != dvr.there -- Does not need an entry for itself.
      LEFT JOIN recurring_dvr AS rec -- Recurring table - The current routing tables.
          ON rec.here = n.person1id AND rec.there = dvr.there -- Update only routing table entries where the edge from the receiver starts.
      WHERE (1 + dvr.cost) :: double precision < COALESCE(rec.cost, 'Infinity' :: double precision) --  The updated cost + cost of edge must be less than the current entry in the routing table.
      ORDER BY 1 + dvr.cost ASC)
    )
  SELECT *
  FROM dvr ORDER BY cost, here, there ASC;
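The DVR fixpoint follows the same pattern. Here is a hedged Python sketch of a routing table keyed by (here, there), in which a round only keeps entries that improve on the current best cost; unit edge weights, an undirected edge list, and the function name are assumptions for illustration only.

```python
import math

def distance_vector(edges):
    """Sketch of the keyed DVR iteration: the routing table is
    keyed by (here, there); a round only retains entries that
    improve on the current best cost, mirroring the keyed upsert."""
    cost = {}                                  # (here, there) -> best cost
    working = {}
    for a, b in edges:
        for u, v in ((a, b), (b, a)):          # undirected graph (assumption)
            cost[(u, v)] = 1.0                 # seed: direct edges, cost 1
            working[(u, v)] = 1.0
    while working:
        nxt = {}
        for (here, there), c in working.items():
            for a, b in edges:
                for u, v in ((a, b), (b, a)):
                    # neighbor u of `here` learns a route to `there`
                    if v == here and u != there:
                        nc = c + 1.0
                        if nc < cost.get((u, there), math.inf):
                            cost[(u, there)] = nc   # keyed upsert
                            nxt[(u, there)] = nc
        working = nxt
    return cost

print(sorted(distance_vector([(1, 2), (2, 3)]).items()))
# [((1, 2), 1.0), ((1, 3), 2.0), ((2, 1), 1.0),
#  ((2, 3), 1.0), ((3, 1), 2.0), ((3, 2), 1.0)]
```

The COALESCE(rec.cost, 'Infinity') in the SQL plays the role of `cost.get(..., math.inf)` here: a missing routing-table entry is treated as infinitely expensive, so any discovered route replaces it.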

@cryoEncryp cryoEncryp force-pushed the rec-cte-key-variant branch from 57404ca to 88c6564 Compare June 10, 2024 07:30
@duckdb-draftbot duckdb-draftbot marked this pull request as draft June 12, 2024 11:33
@cryoEncryp cryoEncryp marked this pull request as ready for review June 12, 2024 12:08
Collaborator
@Mytherin Mytherin left a comment


Thanks for the PR! Looks great! Nice results. Some comments below:

explicit RecursiveKeyCTEState(ClientContext &context, const PhysicalRecursiveKeyCTE &op)
: intermediate_table(context, op.GetTypes()), new_groups(STANDARD_VECTOR_SIZE) {

vector<BoundAggregateExpression *> payload_aggregates_ptr;
Collaborator

nit: vector<reference<BoundAggregateExpression>>

Contributor Author

I think that would be a good idea, but AggregateHashTable needs a vector of type vector<BoundAggregateExpression *>. Therefore I've used the raw C-pointers as well. I could adapt the AggregateHashTable to use a vector<reference<...>> but not sure if that was your intention?


// btodo: Maybe there is another way, we handle unique pointers and therefore can only copy the keys.
for (auto &key : info.key_targets) {
result.key_targets.emplace_back(key->Copy());
Collaborator

Can we move here instead of copying? (std::move)

Contributor Author

As my todo comment above this line also mentions, I am not happy with this solution and hope to find a better way. But during the preparation of the statement, we copy the cte_map and when we move the keys, we get an error for dereferencing a NULL pointer.

void QueryNode::CopyProperties(QueryNode &other) const { ...
	...
	for (auto &kv : cte_map.map) { ...
    ...
		for (auto &key : kv.second->key_targets) { ...
			kv_info->key_targets.push_back(key->Copy());
		}
    ....




query II
Collaborator

Can we perhaps include a larger benchmark comparing the standard recursive CTEs vs the new ones as part of the benchmark runner (see e.g. benchmark/micro/cast/cast_date_string.benchmark)?

Contributor Author

I will write more tests when the syntax is determined. 👍

// and we clear the intermediate table
gstate.intermediate_table.Reset();
// now we need to re-execute all of the pipelines that depend on the recursion
ExecuteRecursivePipelines(context);
Collaborator

Completely unrelated to this PR - but given you mentioned recursive CTE performance - I've profiled recursive CTEs before and in many cases (almost) all of the time went into the breaking down/setting up of pipeline states. If you're interested in optimizing recursive CTEs, reusing these states across iterations would likely lead to significant performance improvements as well.

Contributor Author

Oh, that sounds interesting. I will look into it after this pull request.

@Mytherin Mytherin changed the base branch from feature to main June 21, 2024 12:37
@szarnyasg szarnyasg added the Needs Documentation label Jun 28, 2024
@cryoEncryp cryoEncryp requested a review from Mytherin July 1, 2024 09:27
@duckdb-draftbot duckdb-draftbot marked this pull request as draft August 2, 2024 10:08
@cryoEncryp cryoEncryp marked this pull request as ready for review August 2, 2024 10:10
@duckdb-draftbot duckdb-draftbot marked this pull request as draft December 16, 2024 14:44
@cryoEncryp cryoEncryp marked this pull request as ready for review December 17, 2024 07:00
@cryoEncryp
Contributor Author

The PR should be fine; the only check that seems to fail is an HTTPS error from vcpkg. I did reply to the nit you mentioned; maybe that still needs to be changed.

@duckdb-draftbot duckdb-draftbot marked this pull request as draft December 18, 2024 09:12
@cryoEncryp cryoEncryp marked this pull request as ready for review December 18, 2024 09:24
@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 29, 2025 13:23
@cryoEncryp cryoEncryp marked this pull request as ready for review January 29, 2025 13:23
Collaborator
@Mytherin Mytherin left a comment


Thanks for the changes! Did another pass and I think this is good to go now - could you just solve the merge conflicts?

@duckdb-draftbot duckdb-draftbot marked this pull request as draft February 20, 2025 09:03
@cryoEncryp cryoEncryp marked this pull request as ready for review February 20, 2025 09:03
@Mytherin Mytherin merged commit befbc73 into duckdb:main Feb 20, 2025
49 checks passed
@Mytherin
Collaborator

Thanks!

@cmettler

Exciting update! I'm curious to know if this enhancement also enables the use of aggregations within recursive CTEs similar to what's described in the Materialize documentation here: https://materialize.com/docs/sql/select/recursive-ctes/.

@szarnyasg
Collaborator

(image attachment)

Antonov548 added a commit to Antonov548/duckdb-r that referenced this pull request Mar 4, 2025
krlmlr pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 5, 2025
Mytherin added a commit that referenced this pull request May 1, 2025
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 17, 2025
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
Labels: Changes Requested, feature, Needs Documentation
4 participants