Add parallel memset when building hash join table #16172
By the way, I think I found a logic bug in the check that decides whether to use the parallel hash join table finalize, and I fixed it in the second commit.
Thanks for the PR! This is a great idea. I have seen this show up in profiling a few times already, and this should really help in cases where there are very large join hash tables :)
Maybe we can distribute the workload of the parallel memset slightly better, so I have one minor comment:
Thanks for the changes! This is good to go.
Thanks!
Add parallel memset when building hash join table (duckdb/duckdb#16172)
I found that when the number of threads and the tables are both large, the single-threaded memset (fill_n) becomes a bottleneck for hash joins, so I added a parallel memset. The following are the test results using 16 threads on TPC-H at scale factor 30, on a machine with 2x Intel(R) Xeon(R) Platinum 8474C @ 2.10GHz and 512GB of memory.
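For readers who want a concrete picture of the idea, here is a minimal sketch of a parallel zero-fill over a hash table's entry array. It is not the code from this PR: the function name `ParallelZeroFill`, the `uint64_t` entry type, the use of `std::thread`, and the even chunk split are all assumptions made for illustration. DuckDB schedules this kind of work through its own task scheduler rather than spawning raw threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not DuckDB's API): zero `count` entries starting at
// `data`, splitting the range into contiguous chunks, one per worker thread.
void ParallelZeroFill(uint64_t *data, std::size_t count, std::size_t num_threads) {
    assert(data != nullptr || count == 0);
    // For tiny inputs or a single thread, fall back to the plain fill_n.
    if (num_threads <= 1 || count < num_threads) {
        std::fill_n(data, count, uint64_t(0));
        return;
    }
    // Ceiling division so that the chunks cover the whole range.
    const std::size_t chunk = (count + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; t++) {
        const std::size_t begin = t * chunk;
        if (begin >= count) {
            break; // Rounding up can leave the last workers with nothing to do.
        }
        const std::size_t len = std::min(chunk, count - begin);
        // Each worker clears a disjoint slice, so no synchronization is needed.
        workers.emplace_back([data, begin, len] {
            std::fill_n(data + begin, len, uint64_t(0));
        });
    }
    for (auto &worker : workers) {
        worker.join();
    }
}
```

A caller would invoke it as `ParallelZeroFill(table.data(), table.size(), 16)` on a `std::vector<uint64_t>`. Because the slices are disjoint and contiguous, each thread writes its own region of memory, which is what lets the initialization scale with the thread count instead of running on a single core.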