vine: possible hash table inefficiencies · Issue #4115 · cooperative-computing-lab/cctools · GitHub

vine: possible hash table inefficiencies #4115

Open
JinZhou5042 opened this issue Apr 15, 2025 · 27 comments

Comments

@JinZhou5042
Member
JinZhou5042 commented Apr 15, 2025

I observed a significant slowdown toward the end of a run with 880K tasks in total. With approximately 90% of tasks completed, 1000 available cores, fewer than 100 tasks running concurrently, and thousands of tasks remaining in the ready queue, the system became sluggish.

The progress bar advanced very slowly, with only one or two tasks completing every 10 seconds, compared to the initial stage where hundreds of tasks completed every second.

To investigate, I used perf to monitor which function the manager spent most of its time on. The commands were:

perf record -g -F 99 -p [PID] -- sleep 10
perf report

And here is the output:

[Image: perf report output]

It looks like hash table iteration became extremely expensive after the table had served a large number of elements.

My hypothesis was that the table capacity gets expanded every time the load threshold is triggered, but elements come and go over time and the table is never shrunk. As a result, after serving numerous elements, only a few valid elements remain in the hash table, yet the table itself is very large due to the previous expansions, which makes the final iterations very time-consuming.
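
For intuition on why this would hurt: iterating a chained hash table typically means walking the entire bucket array, so the cost tracks the number of buckets rather than the number of live entries. A minimal sketch of that pattern (illustrative type and field names, not the actual hash_table.c layout):

/* Sketch: iteration visits every bucket, occupied or not. A table that grew
 * to N buckets therefore stays O(N) to iterate even after most entries are
 * removed. Type and field names here are illustrative only. */
struct entry { char *key; void *value; struct entry *next; };
struct table { struct entry **buckets; int bucket_count; int size; };

void iterate(struct table *t, void (*visit)(const char *key, void *value))
{
	for (int i = 0; i < t->bucket_count; i++) {	/* cost ~ bucket_count */
		for (struct entry *e = t->buckets[i]; e; e = e->next) {
			visit(e->key, e->value);	/* cost ~ live entries */
		}
	}
}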

I implemented a shrinking strategy to clean up redundant capacity when the number of elements is low. Combined with some other strategies, I saw improved performance, but I'm not sure whether the shrinking itself had a positive impact.

I need to write a separate program to test whether the hash table becomes sluggish under large-scale scenarios.

@dthain
Member
dthain commented Apr 15, 2025

+1 for writing a benchmark that evaluates the performance of the hash table independently of TaskVine. Perhaps @Ian2327 would like to take a crack at that?

@Ian2327
Contributor
Ian2327 commented Apr 15, 2025

Sure! I'll take that challenge. Should I write the test in C or some other language?

@dthain
Member
dthain commented Apr 15, 2025

I think C is a good language in which to do the test. Basically, we need a benchmark that measures the time to iterate the hash table for 100K, 200K ... 900K items. Then start removing them and measure the iteration time. Does it go back down?
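
A rough sketch of such a benchmark, assuming the usual dttools hash_table API (hash_table_create / hash_table_insert / hash_table_remove / hash_table_firstkey / hash_table_nextkey); the exact signatures and the (0, 0) defaults are assumptions to verify against hash_table.h:

/* Benchmark sketch: grow to 900K entries in 100K steps, timing a full
 * iteration at each step, then remove the entries in steps and time again. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "hash_table.h"

static double time_iteration(struct hash_table *h)
{
	char *key;
	void *value;
	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	hash_table_firstkey(h);
	while (hash_table_nextkey(h, &key, &value)) { /* just walk every entry */ }
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	const int step = 100000, steps = 9;
	struct hash_table *h = hash_table_create(0, 0);	/* assumed: 0,0 selects default buckets and hash */
	char key[32];

	for (int s = 0; s <= steps; s++) {
		printf("Items: %d, Iteration time: %f sec\n", s * step, time_iteration(h));
		if (s == steps) break;
		for (int i = s * step; i < (s + 1) * step; i++) {
			sprintf(key, "key-%d", i);
			hash_table_insert(h, key, (void *) 1);	/* assumes the key string is copied internally */
		}
	}

	for (int s = steps - 1; s >= 0; s--) {
		for (int i = (s + 1) * step - 1; i >= s * step; i--) {
			sprintf(key, "key-%d", i);
			hash_table_remove(h, key);
		}
		printf("Items: %d, Iteration time: %f sec\n", s * step, time_iteration(h));
	}

	hash_table_delete(h);
	return 0;
}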

@Ian2327
Contributor
Ian2327 commented Apr 15, 2025

I should be testing the hash_table.c file, correct? Essentially, using that file and its methods to create a hash table with the items and then measuring the time it takes to iterate.

@dthain
Member
dthain commented Apr 16, 2025

That's correct.

@Ian2327
Contributor
Ian2327 commented Apr 16, 2025

These are the results I got:

INSERTION PHASE:
Items: 0, Iteration time: 0.002086 sec
Items: 100000, Iteration time: 0.005003 sec
Items: 200000, Iteration time: 0.008241 sec
Items: 300000, Iteration time: 0.010579 sec
Items: 400000, Iteration time: 0.015659 sec
Items: 500000, Iteration time: 0.024131 sec
Items: 600000, Iteration time: 0.031690 sec
Items: 700000, Iteration time: 0.020698 sec
Items: 800000, Iteration time: 0.028815 sec
Items: 900000, Iteration time: 0.040763 sec

REMOVAL PHASE:
Items: 800000, Iteration time: 0.031551 sec
Items: 700000, Iteration time: 0.021490 sec
Items: 600000, Iteration time: 0.020170 sec
Items: 500000, Iteration time: 0.023556 sec
Items: 400000, Iteration time: 0.025876 sec
Items: 300000, Iteration time: 0.025126 sec
Items: 200000, Iteration time: 0.024666 sec
Items: 100000, Iteration time: 0.020412 sec
Items: 0, Iteration time: 0.011425 sec

It doesn't seem like the iteration time returns to the original range after the elements are removed.

@Ian2327
Contributor
Ian2327 commented Apr 22, 2025

After reading through the hash_table.c code, I don't see a function that halves the number of buckets in the hash table to complement the function that doubles the number of buckets. If this were implemented, would it help with this issue of iterating over empty buckets?

@btovar
Member
btovar commented Apr 22, 2025

Correct, we only implemented the doubling because at the time we didn't see a reason to reduce the size.
Simply halving every time the load drops may not be efficient, and some hysteresis may be needed.

Also, I'm not convinced that this is really an issue, or at least that it is causing the observed slowdown. The reason is that the manager still spends some time in link_poll, which means that it is working under capacity, so this particular bottleneck is most likely coming from somewhere else.

@dthain
Member
dthain commented Apr 22, 2025

While the total impact on taskvine is unclear, I think this is a clear deficiency in the data structure, and it will be a good exercise for @Ian2327 to fix it. Ian, please go ahead and work up a PR, and then let's discuss. Yes, you will have to implement the halving function, and @btovar is correct that some hysteresis will be needed.

@btovar
Member
btovar commented Apr 22, 2025

Do we know which table is causing this? It may be that the tables are not really empty, but that taskvine needs to keep tabs on all that info (e.g., files that have not been undeclared), or that there is a memory leak where we are not deleting things from the tables when we should. Which type of workflow is this? Are there dependencies between the tasks? We have run bigger workflows than that before, and the hash tables were not such a bottleneck. I'm a little hesitant to add more dynamic complexity to the code without being sure that this is really the bottleneck we think it is.

@JinZhou5042
Member Author
JinZhou5042 commented Apr 22, 2025

@btovar I am 60% sure it was q->workers_with_complete_tasks. The workflow is DV5 with 250K tasks, and the recent inefficiencies usually happen because of large numbers of file replicas and not-so-rare worker evictions. Before that I didn't see this many failures either...

@btovar
Member
btovar commented Apr 22, 2025

But at most you have O(1000) workers joined, right? That should be fine for these tables?
If it is q->workers_with_complete_tasks, I think we may be blaming code that, by design, we run more often in the wait loop, rather than a real bottleneck.

@JinZhou5042
Member Author

Yes, I agree with this; it could be that the manager got stuck on something else and gave this code a high priority to run.

@btovar
Member
btovar commented Apr 22, 2025

In this case, I think we could change q->workers_with_complete_tasks to a list that we pop from as we process. It was a hash table for historical reasons from WQ. I'll give that a try.
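
Roughly the shape of that change, as a sketch only (list API names follow dttools list.h; the worker type and the processing helper are assumptions, not the actual patch):

/* Record workers with completed tasks in a FIFO and drain it as the manager
 * processes them, so there is no iterate-and-remove pattern on a hash table. */
#include "list.h"

/* where a completion message is received: */
list_push_tail(q->workers_with_complete_tasks, w);

/* in the wait loop: */
struct vine_worker_info *w;
while ((w = list_pop_head(q->workers_with_complete_tasks))) {
	process_completed_tasks(q, w);	/* hypothetical helper, for illustration */
}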

@btovar
Member
btovar commented Apr 22, 2025

In fact, q->workers_with_complete_tasks may be the cause of some of the segfaults you see. A segfault you reported was internal to malloc, which usually means that the memory was corrupted somewhere else. For q->workers_with_complete_tasks, we are deleting a key while iterating. I have been thinking of a way to avoid this in general, even throwing a fatal(..bug..) when we try to do it.

@JinZhou5042
Member Author

But it calls hash_table_firstkey(q->workers_with_complete_tasks) right after calling hash_table_remove(q->workers_with_complete_tasks, w->hashkey); shouldn't that be a 100% safe way to delete elements while iterating?

@btovar
Member
btovar commented Apr 22, 2025

Yes, you are correct, good catch.
I think that's the slowdown you see, though. Instead of a linear iteration, we have a quadratic one, since we reset to the start every time. This is made worse by the hash table being mostly empty, but it is more about how we are using the table than a fault of the table itself. I'll have a PR in a moment...
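
Schematically, the pattern in question looks something like this (simplified, not the verbatim manager code):

/* Restarting the scan after every removal means each pass over the (mostly
 * empty) bucket array starts again from bucket zero, so processing n workers
 * costs roughly n * bucket_count instead of a single linear pass. */
char *key;
void *value;

hash_table_firstkey(q->workers_with_complete_tasks);
while (hash_table_nextkey(q->workers_with_complete_tasks, &key, &value)) {
	struct vine_worker_info *w = value;
	/* ... receive the completed tasks from w ... */
	hash_table_remove(q->workers_with_complete_tasks, w->hashkey);
	hash_table_firstkey(q->workers_with_complete_tasks);	/* reset: rescan from the start */
}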

@dthain
Member
dthain commented Apr 23, 2025

@Ian2327 we will continue to argue about what contributes to TaskVine performance. :)

In the meantime, I would like you to write and test a modification to the hash table that shrinks the table by 1/2 whenever the load falls below 1/8. Then let's see what that performance looks like.
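
A minimal sketch of that policy (internal field names like bucket_count and size are assumptions about hash_table.c, and hash_table_maybe_shrink / hash_table_halve_buckets stand in for the functions to be written):

/* Shrink policy sketch: halve whenever the load factor falls below 1/8, never
 * dropping under a small floor. Growing only at a much higher load than the
 * shrink threshold provides the hysteresis, so the table does not thrash
 * between two sizes around a single boundary. */
#define HASH_TABLE_MIN_BUCKETS 127

static void hash_table_maybe_shrink(struct hash_table *h)
{
	while (h->bucket_count > HASH_TABLE_MIN_BUCKETS && h->size * 8 < h->bucket_count) {
		hash_table_halve_buckets(h);	/* rehash all entries into bucket_count / 2 buckets */
	}
}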

@JinZhou5042
Member Author

@Ian2327 For your reference, I have a simple shrinking strategy implemented in this branch; see hash_table_halve_buckets if you find it helpful, and see hash_table_test.c for validating its functionality.

@Ian2327
Contributor
Ian2327 commented Apr 23, 2025

> @Ian2327 we will continue to argue about what contributes to TaskVine performance. :)
>
> In the meantime, I would like you to write and test a modification to the hash table that shrinks the table by 1/2 whenever the load falls below 1/8. Then let's see what that performance looks like.

This is what I got after implementing the halving buckets:

INSERTION PHASE:
Items: 0, Iteration time: 0.002001 sec
Items: 100000, Iteration time: 0.004617 sec
Items: 200000, Iteration time: 0.007906 sec
Items: 300000, Iteration time: 0.009590 sec
Items: 400000, Iteration time: 0.014733 sec
Items: 500000, Iteration time: 0.021485 sec
Items: 600000, Iteration time: 0.030338 sec
Items: 700000, Iteration time: 0.019490 sec
Items: 800000, Iteration time: 0.028434 sec
Items: 900000, Iteration time: 0.041271 sec

REMOVAL PHASE:
Items: 800000, Iteration time: 0.031171 sec
Items: 700000, Iteration time: 0.020409 sec
Items: 600000, Iteration time: 0.018597 sec
Items: 500000, Iteration time: 0.022298 sec
Items: 400000, Iteration time: 0.027600 sec
Items: 300000, Iteration time: 0.028195 sec
Items: 200000, Iteration time: 0.025043 sec
Items: 100000, Iteration time: 0.014216 sec
Items: 0, Iteration time: 0.004760 sec

This was the run before the implementation:

INSERTION PHASE:
Items: 0, Iteration time: 0.002004 sec
Items: 100000, Iteration time: 0.004662 sec
Items: 200000, Iteration time: 0.008301 sec
Items: 300000, Iteration time: 0.010155 sec
Items: 400000, Iteration time: 0.014866 sec
Items: 500000, Iteration time: 0.021408 sec
Items: 600000, Iteration time: 0.030600 sec
Items: 700000, Iteration time: 0.020435 sec
Items: 800000, Iteration time: 0.027954 sec
Items: 900000, Iteration time: 0.037677 sec

REMOVAL PHASE:
Items: 800000, Iteration time: 0.028905 sec
Items: 700000, Iteration time: 0.020481 sec
Items: 600000, Iteration time: 0.018959 sec
Items: 500000, Iteration time: 0.023356 sec
Items: 400000, Iteration time: 0.024840 sec
Items: 300000, Iteration time: 0.024889 sec
Items: 200000, Iteration time: 0.023128 sec
Items: 100000, Iteration time: 0.017503 sec
Items: 0, Iteration time: 0.011163 sec

@dthain
Member
dthain commented Apr 24, 2025

Hmm, maybe increase the number of items by a factor of 100 to get more consistent runtimes?

@Ian2327
Contributor
Ian2327 commented Apr 24, 2025

This is what I got:
(With halving):

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.516522 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 1.052164 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 1.216375 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 2.180180 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 5.628643 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 3.911151 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 4.566757 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 4.585195 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 5.511820 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 5.066165 sec

REMOVAL PHASE:
Num buckets:  266338304|Items: 80000000, Iteration time: 3.811336 sec
Num buckets:  266338304|Items: 70000000, Iteration time: 6.845817 sec
Num buckets:  266338304|Items: 60000000, Iteration time: 6.101407 sec
Num buckets:  266338304|Items: 50000000, Iteration time: 6.085849 sec
Num buckets:  266338304|Items: 40000000, Iteration time: 6.202596 sec
Num buckets:  266338304|Items: 30000000, Iteration time: 5.739093 sec
Num buckets:  266338304|Items: 20000000, Iteration time: 2.852143 sec
Num buckets:  133169152|Items: 10000000, Iteration time: 2.325682 sec
Num buckets:  133169152|Items: 0, Iteration time: 1.113665 sec

(without halving):

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.549217 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 1.060335 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 1.250571 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 2.256575 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 1.206507 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 2.549578 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 3.624075 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 4.397566 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 5.447921 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 2.402528 sec

REMOVAL PHASE:
Num buckets:  266338304|Items: 80000000, Iteration time: 3.548334 sec
Num buckets:  266338304|Items: 70000000, Iteration time: 3.106250 sec
Num buckets:  266338304|Items: 60000000, Iteration time: 3.589783 sec
Num buckets:  266338304|Items: 50000000, Iteration time: 4.141373 sec
Num buckets:  266338304|Items: 40000000, Iteration time: 4.325799 sec
Num buckets:  266338304|Items: 30000000, Iteration time: 4.007034 sec
Num buckets:  266338304|Items: 20000000, Iteration time: 3.507949 sec
Num buckets:  266338304|Items: 10000000, Iteration time: 2.708850 sec
Num buckets:  266338304|Items: 0, Iteration time: 1.707641 sec

@JinZhou5042
Member Author

I guess you won't see a significant performance difference since you're only testing insertion and removal, both of which are O(1) operations; whether halving occurs or not doesn't affect their performance noticeably.

And the halving threshold doesn't seem quite right: in the removal phase, when you have 10000000 items the bucket count only drops to 133169152, but the load is then about 0.075 (10000000 / 133169152), which is already below 1/8. And when all items are removed, the bucket count is not reduced at all.

@Ian2327
Contributor
Ian2327 commented Apr 24, 2025

There was a slight issue with the order in which I was printing things.
(With halving):

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.000001 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 0.512785 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 1.025471 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 1.139453 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 2.144665 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 1.098983 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 2.293799 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 3.314591 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 4.251669 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 5.167905 sec

REMOVAL PHASE:
Num buckets:  133169152|Items: 90000000, Iteration time: 5.037876 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 4.324703 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 3.332771 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 2.293010 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 1.099892 sec
Num buckets:  133169152|Items: 40000000, Iteration time: 1.391182 sec
Num buckets:  133169152|Items: 30000000, Iteration time: 1.814559 sec
Num buckets:  133169152|Items: 20000000, Iteration time: 1.841839 sec
Num buckets:   66584576|Items: 10000000, Iteration time: 1.075490 sec
Num buckets:        127|Items: 0, Iteration time: 0.000000 sec

(Without halving):

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.000001 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 0.546998 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 1.033397 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 1.172669 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 2.247420 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 1.185176 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 2.517646 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 11.026331 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 5.188741 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 5.514329 sec

REMOVAL PHASE:
Num buckets:  133169152|Items: 90000000, Iteration time: 5.483403 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 4.516701 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 3.745324 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 2.586994 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 1.202904 sec
Num buckets:  133169152|Items: 40000000, Iteration time: 1.542493 sec
Num buckets:  133169152|Items: 30000000, Iteration time: 2.083685 sec
Num buckets:  133169152|Items: 20000000, Iteration time: 1.993055 sec
Num buckets:  133169152|Items: 10000000, Iteration time: 1.338776 sec
Num buckets:  133169152|Items: 0, Iteration time: 0.269361 sec

@Ian2327
Contributor
Ian2327 commented Apr 25, 2025

I get something like this if I call hash_table_halve_buckets() from my test script before each iteration, rather than calling the halving from the remove function:

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.000000 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 0.270686 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 0.548938 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 0.748364 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 1.113280 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 0.864008 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 1.524196 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 1.904774 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 2.271431 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 2.593775 sec

REMOVAL PHASE:
Num buckets:  133169152|Items: 90000000, Iteration time: 2.590781 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 2.276454 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 1.906344 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 1.517520 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 0.863120 sec
Num buckets:  133169152|Items: 40000000, Iteration time: 0.805128 sec
Num buckets:  133169152|Items: 30000000, Iteration time: 0.741293 sec
Num buckets:  133169152|Items: 20000000, Iteration time: 0.657030 sec
Num buckets:   66584576|Items: 10000000, Iteration time: 0.463132 sec
Num buckets:   33292288|Items: 0, Iteration time: 0.038679 sec

The bucket count stops at 33292288 since the table halves at most once before each iteration.

@Ian2327
Contributor
Ian2327 commented Apr 29, 2025

If I incorporate the halving into the HASH_TABLE_ITERATE function itself rather than into my test script, then the times look like this. My test script measures how long it takes to complete the HASH_TABLE_ITERATE call, and rehashing into the smaller table is now part of that call, which explains the 19 sec time when reducing from 133169152 -> 66584576 and from 66584576 -> 33292288.

INSERTION PHASE:
Num buckets:        127|Items: 0, Iteration time: 0.000000 sec
Num buckets:   16646144|Items: 10000000, Iteration time: 0.268954 sec
Num buckets:   33292288|Items: 20000000, Iteration time: 0.546520 sec
Num buckets:   66584576|Items: 30000000, Iteration time: 0.745096 sec
Num buckets:   66584576|Items: 40000000, Iteration time: 1.106021 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 0.928846 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 1.540116 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 1.912967 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 2.277464 sec
Num buckets:  133169152|Items: 90000000, Iteration time: 2.586978 sec

REMOVAL PHASE:
Num buckets:  133169152|Items: 90000000, Iteration time: 2.585385 sec
Num buckets:  133169152|Items: 80000000, Iteration time: 2.260367 sec
Num buckets:  133169152|Items: 70000000, Iteration time: 1.900017 sec
Num buckets:  133169152|Items: 60000000, Iteration time: 1.504454 sec
Num buckets:  133169152|Items: 50000000, Iteration time: 0.866426 sec
Num buckets:  133169152|Items: 40000000, Iteration time: 0.804054 sec
Num buckets:  133169152|Items: 30000000, Iteration time: 0.739400 sec
Num buckets:  133169152|Items: 20000000, Iteration time: 0.659415 sec
Num buckets:   66584576|Items: 10000000, Iteration time: 19.493651 sec
Num buckets:   33292288|Items: 0, Iteration time: 3.898136 sec

@Ian2327
Contributor
Ian2327 commented May 2, 2025

Result of gprof:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 41.32     28.86    28.86 1009876833     0.00     0.00  hash_table_nextkey
 27.61     48.14    19.28 100000000     0.00     0.00  hash_table_remove
 23.85     64.80    16.65 199876791     0.00     0.00  hash_table_insert
  3.67     67.36     2.57       20     0.13     0.27  hash_table_double_buckets
  1.30     68.27     0.91 299876791     0.00     0.00  jenkins_hash
  0.83     68.84     0.58        3     0.19     0.31  hash_table_halve_buckets
  0.77     69.39     0.54       20     0.03     1.31  measure_iteration_time
  0.24     69.56     0.17       40     0.00     0.00  hash_table_firstkey
  0.15     69.66     0.11 299876791     0.00     0.00  hash_string
  0.13     69.75     0.09                             main
  0.10     69.82     0.07        2     0.04     0.04  hash_table_firstkey_halving
  0.09     69.88     0.06        1     0.06     0.06  hash_table_clear
  0.07     69.93     0.05                             hash_table_nextkey_with_offset
  0.04     69.96     0.03                             frame_dummy
  0.03     69.98     0.02 190000000     0.00     0.00  generate_key
  0.01     69.99     0.01                             hash_table_fromkey
  0.00     69.99     0.00       23     0.00     0.00  hash_table_create
  0.00     69.99     0.00        1     0.00     0.06  hash_table_delete

with this output:

INSERTION PHASE:
Num buckets:        127|Items:          0, Iteration time: 0.000000 sec|Insertion time: 0.000000 sec
Num buckets:   16646144|Items:   10000000, Iteration time: 0.368499 sec|Insertion time: 5.948986 sec
Num buckets:   33292288|Items:   20000000, Iteration time: 0.898054 sec|Insertion time: 7.218499 sec
Num buckets:   66584576|Items:   30000000, Iteration time: 1.233462 sec|Insertion time: 12.129475 sec
Num buckets:   66584576|Items:   40000000, Iteration time: 1.908132 sec|Insertion time: 4.428443 sec
Num buckets:  133169152|Items:   50000000, Iteration time: 1.085552 sec|Insertion time: 21.561280 sec
Num buckets:  133169152|Items:   60000000, Iteration time: 2.544446 sec|Insertion time: 4.527519 sec
Num buckets:  133169152|Items:   70000000, Iteration time: 3.623162 sec|Insertion time: 4.696976 sec
Num buckets:  133169152|Items:   80000000, Iteration time: 4.406557 sec|Insertion time: 4.846902 sec
Num buckets:  133169152|Items:   90000000, Iteration time: 4.995502 sec|Insertion time: 5.084155 sec

REMOVAL PHASE:
Num buckets:  133169152|Items:   90000000, Iteration time: 4.986422 sec|Removal time: 4.436422 sec
Num buckets:  133169152|Items:   80000000, Iteration time: 4.488597 sec|Removal time: 4.217310 sec
Num buckets:  133169152|Items:   70000000, Iteration time: 3.516000 sec|Removal time: 4.345376 sec
Num buckets:  133169152|Items:   60000000, Iteration time: 2.160985 sec|Removal time: 3.387954 sec
Num buckets:  133169152|Items:   50000000, Iteration time: 1.006534 sec|Removal time: 3.639422 sec
Num buckets:  133169152|Items:   40000000, Iteration time: 0.895938 sec|Removal time: 3.531911 sec
Num buckets:  133169152|Items:   30000000, Iteration time: 0.790296 sec|Removal time: 3.100355 sec
Num buckets:  133169152|Items:   20000000, Iteration time: 0.695616 sec|Removal time: 3.077919 sec
Num buckets:   66584576|Items:   10000000, Iteration time: 22.981472 sec|Removal time: 4.346271 sec
Num buckets:   33292288|Items:          0, Iteration time: 4.797194 sec|Removal time: 1.543157 sec

Insertion Phase:

  • Time spikes in insertion time due to hash table bucket doubling occurring during insertion.

Removal Phase:

  • Time spikes in iteration time due to hash table bucket halving occurring during iteration.
