Add parallel memset when building hash join table by hehezhou · Pull Request #16172 · duckdb/duckdb · GitHub

Add parallel memset when building hash join table #16172

Merged: 4 commits, Feb 13, 2025

Conversation

hehezhou

I found that when both the thread count and the tables are large, the single-threaded memset (fill_n) used when building the hash join table becomes a bottleneck for hash joins, so I added a parallel memset. Below are the results on TPC-H at scale factor 30 with 16 threads, on a machine with 2x Intel(R) Xeon(R) Platinum 8474C @ 2.10GHz and 512GB of memory.
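To illustrate the general idea of a parallel memset, here is a minimal sketch that splits a buffer into contiguous ranges and zeroes each range on its own thread. It is not the code from this PR: `ParallelZero` and its parameters are hypothetical names, and it uses plain `std::thread` rather than DuckDB's task scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical helper (not DuckDB's API): zero `count` bytes starting at
// `data`, using at most `num_threads` threads.
void ParallelZero(uint8_t *data, size_t count, size_t num_threads) {
    if (count == 0) {
        return;
    }
    assert(data != nullptr);
    if (num_threads <= 1) {
        // Fall back to a plain single-threaded memset.
        std::memset(data, 0, count);
        return;
    }
    // Ceiling division so the ranges together cover the whole buffer.
    const size_t chunk = (count + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (size_t begin = 0; begin < count; begin += chunk) {
        // The last range may be shorter than `chunk`.
        const size_t len = std::min(chunk, count - begin);
        workers.emplace_back([data, begin, len] { std::memset(data + begin, 0, len); });
    }
    for (auto &worker : workers) {
        worker.join();
    }
}
```

The ranges do not overlap, so the threads need no synchronization beyond the final join. Spawning threads has a fixed cost, so a sketch like this only pays off for large buffers.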

q01.benchmark
Old timing: 0.307804
New timing: 0.305122

q02.benchmark
Old timing: 0.09011
New timing: 0.081623

q03.benchmark
Old timing: 0.219148
New timing: 0.185937

q04.benchmark
Old timing: 0.197395
New timing: 0.172733

q05.benchmark
Old timing: 0.200423
New timing: 0.177111

q06.benchmark
Old timing: 0.056182
New timing: 0.055329

q07.benchmark
Old timing: 0.220938
New timing: 0.185942

q08.benchmark
Old timing: 0.238767
New timing: 0.221206

q09.benchmark
Old timing: 0.81091
New timing: 0.74094

q10.benchmark
Old timing: 0.384993
New timing: 0.358605

q11.benchmark
Old timing: 0.043594
New timing: 0.047106

q12.benchmark
Old timing: 0.149087
New timing: 0.152796

q13.benchmark
Old timing: 0.607926
New timing: 0.588897

q14.benchmark
Old timing: 0.140369
New timing: 0.110691

q15.benchmark
Old timing: 0.100208
New timing: 0.09759

q16.benchmark
Old timing: 0.121195
New timing: 0.11974

q17.benchmark
Old timing: 0.332341
New timing: 0.326049

q18.benchmark
Old timing: 0.666402
New timing: 0.641981

q19.benchmark
Old timing: 0.248784
New timing: 0.236629

q20.benchmark
Old timing: 0.174007
New timing: 0.175727

q21.benchmark
Old timing: 0.700216
New timing: 0.629451

q22.benchmark
Old timing: 0.141347
New timing: 0.128283


Old timing geometric mean: 0.2120037943770715
New timing geometric mean: 0.19861073916154134, roughly 6% faster
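
The "roughly 6% faster" figure can be checked from the two geometric means. The sketch below shows one way to compute a geometric mean and the relative improvement; the function names are illustrative and are not taken from DuckDB's benchmark runner.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean computed as exp(mean(log(x))), which avoids underflow
// from multiplying many small timings together.
double GeometricMean(const std::vector<double> &values) {
    assert(!values.empty());
    double log_sum = 0.0;
    for (double v : values) {
        log_sum += std::log(v);
    }
    return std::exp(log_sum / static_cast<double>(values.size()));
}

// Relative reduction in runtime: 1 - new / old.
double RelativeImprovement(double old_time, double new_time) {
    return 1.0 - new_time / old_time;
}
```

With the means reported above, `RelativeImprovement(0.2120037943770715, 0.19861073916154134)` comes to about 0.063, i.e. a runtime reduction of a little over 6%.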

@hehezhou
Author

Btw, I think I found a logic bug in the check that decides whether to use the parallel hash join table finalize, and I fixed it in the second commit.

Contributor
@lnkuiper lnkuiper left a comment

Thanks for the PR! This is a great idea. I saw this showing up in profiling a few times already, and this should really help cases when there are very large join hash tables :)

Maybe we can distribute the workload of the parallel memset slightly better, so I have one minor comment:
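
The specific suggestion is not included in this conversation view. As a general illustration only, one common way to even out a parallel memset is to let threads claim fixed-size chunks from a shared atomic counter, so a thread that finishes early simply takes more chunks. `ParallelZeroDynamic` and its parameters are hypothetical and do not reflect the change made in this PR.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical helper (not DuckDB's API): threads repeatedly claim the next
// `chunk_size` bytes from a shared offset until the buffer is exhausted.
void ParallelZeroDynamic(uint8_t *data, size_t count, size_t num_threads, size_t chunk_size) {
    assert(data != nullptr || count == 0);
    assert(num_threads > 0 && chunk_size > 0);
    std::atomic<size_t> next_offset{0};
    auto worker = [&] {
        for (;;) {
            // fetch_add returns the previous value, so each chunk is claimed once.
            const size_t begin = next_offset.fetch_add(chunk_size);
            if (begin >= count) {
                break;
            }
            std::memset(data + begin, 0, std::min(chunk_size, count - begin));
        }
    };
    std::vector<std::thread> threads;
    for (size_t i = 0; i + 1 < num_threads; i++) {
        threads.emplace_back(worker);
    }
    worker(); // the calling thread participates as well
    for (auto &thread : threads) {
        thread.join();
    }
}
```

Smaller chunks balance the load better but add more atomic operations, so the chunk size is a tuning parameter rather than a fixed choice.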

@duckdb-draftbot duckdb-draftbot marked this pull request as draft February 11, 2025 18:04
@hehezhou hehezhou marked this pull request as ready for review February 11, 2025 18:06
@duckdb-draftbot duckdb-draftbot marked this pull request as draft February 11, 2025 18:10
@hehezhou hehezhou marked this pull request as ready for review February 11, 2025 18:11
@hehezhou hehezhou requested a review from lnkuiper February 12, 2025 03:59
Contributor
@lnkuiper lnkuiper left a comment

Thanks for the changes! This is good to go.

@Mytherin Mytherin merged commit 8234e71 into duckdb:main Feb 13, 2025
49 checks passed
@Mytherin
Collaborator

Thanks!

@hehezhou hehezhou deleted the parallel-memset branch February 13, 2025 13:38
Antonov548 added a commit to Antonov548/duckdb-r that referenced this pull request Feb 27, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
krlmlr pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 5, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 17, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
Add parallel memset when building hash join table (duckdb/duckdb#16172)
3 participants