Performance Regression in LEFT JOIN with Timestamp Calculations (v1.1.1 → v1.2.0) #16552
Closed
2 tasks done
Comments
Thanks, I could reproduce this:
krlmlr added commits to duckdb/duckdb-r that referenced this issue on May 18, 2025 and May 19, 2025:
Fix duckdb/duckdb#16552: adjust join condition sequence (duckdb/duckdb#16943)
What happens?
A severe performance regression (22x slower) occurs in DuckDB 1.2.0 compared to 1.1.1 when executing a LEFT JOIN that includes timestamp difference calculations. The issue appears to be related to the order in which join conditions are evaluated, with 1.2.0 evaluating expensive timestamp calculations first rather than simpler equality conditions.
To Reproduce
Minimal Reproducible Example
Output in DuckDB 1.1.1
Output in DuckDB 1.2.0
Root Cause
The join condition evaluation order changed between versions:
This change appears to bypass a crucial optimization where simpler conditions can filter out non-matching rows before performing expensive calculations.
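The effect of condition order can be illustrated with a small, self-contained sketch. This is not the original reproducer and not DuckDB's join implementation: it is a plain nested-loop LEFT JOIN written in pure Python, and the table contents, the 60-second window, and the helper names (`left_join`, `same_id`, `within_window`) are all assumptions made for illustration. It only counts how often each predicate runs when the cheap equality check comes first versus when the timestamp calculation comes first.

```python
from datetime import datetime, timedelta

def left_join(left, right, predicates):
    """Toy nested-loop LEFT JOIN (illustrative only, not DuckDB's algorithm).

    Predicates are evaluated in the given order and short-circuit on the
    first failure. Returns the joined rows and per-predicate call counts.
    """
    calls = [0] * len(predicates)
    rows = []
    for l in left:
        matched = False
        for r in right:
            ok = True
            for i, pred in enumerate(predicates):
                calls[i] += 1
                if not pred(l, r):
                    ok = False
                    break
            if ok:
                rows.append((l, r))
                matched = True
        if not matched:
            rows.append((l, None))  # LEFT JOIN keeps unmatched left rows
    return rows, calls

def same_id(l, r):
    # Cheap condition: plain equality on the key column.
    return l[0] == r[0]

def within_window(l, r):
    # Stand-in for the "expensive" timestamp difference calculation.
    return abs((l[1] - r[1]).total_seconds()) <= 60

# Hypothetical data: rows are (id, timestamp); the right side has even ids only.
base = datetime(2025, 1, 1)
left = [(i, base + timedelta(seconds=30 * i)) for i in range(100)]
right = [(i, base + timedelta(seconds=30 * i)) for i in range(0, 100, 2)]

cheap_first_rows, cheap_first_calls = left_join(left, right, [same_id, within_window])
expensive_first_rows, expensive_first_calls = left_join(left, right, [within_window, same_id])

print("equality first:  timestamp predicate calls =", cheap_first_calls[1])
print("timestamp first: timestamp predicate calls =", expensive_first_calls[0])
```

Both orderings return the same rows, but with the equality check first the timestamp predicate only runs for pairs whose ids already match, whereas with the timestamp check first it runs for every candidate pair. A real engine differs in many ways (for instance, equality conditions can typically drive a hash join), so the sketch shows only the general principle, not the magnitude of the reported regression.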
Performance Impact
OS:
Linux
DuckDB Version:
1.1.1 and 1.2.0
DuckDB Client:
Python
Hardware:
same hardware used for both tests
Full Name:
Pavel Khokhlov
Affiliation:
personal
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?