-
Notifications
You must be signed in to change notification settings - Fork 37.4k
validation: improve performance of CheckBlockIndex #28339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validation: improve performance of CheckBlockIndex #28339
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
cc @martinus |
Hi, I wonder how you measured the runtime, is there a benchmark? Another optimization that might be worth it, it is to replace the |
So far only manually, by adding log statements to the beginning and end of CheckBlockIndex and running with
Thanks, I'll look into that! |
While you're at it, I noticed that the help text for |
Good point, I think that an interval argument similar to addrman or mempool consistency checks makes sense. Will try that in addition to the other points above (it might take a few weeks though, therefore put the PR into draft). |
3e0f001
to
87d6155
Compare
I added a commit with a benchmark, and another one that allow to specify the frequency for
I tried that in mzumsande@51aae54 (directly on master without the changes of this PR), however it doesn't seem to improve performance - I'll post benchmark results below. |
Benchmark results: master
PR
martinus' ideaUses
The benchmark currently uses a linear chain of 1100 headers. In general, the efficiency gain becomes larger with a longer chain, for example with just 100 headers the speedup is less. So there is a speedup of a facto >2.5 with the benchmark (that becomes larger on larger chains, see results for mainnet / testnet in OP). |
Concept ACK |
87d6155
to
efdc8e3
Compare
bool ChainstateManager::ShouldCheckBlockIndex() const | ||
{ | ||
if (!*Assert(m_options.check_block_index)) return false; | ||
if (GetRand(*m_options.check_block_index) >= 1) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be better to do this deterministically to avoid non-determinism in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have consistency among -checkaddrman
, -checkmempool
and -checkblockindex
. Since non-determinism would only occur for values other than 0 and 1, maybe it'd be best not to use other values than that in tests? Or change all three do be deterministic?
7015231
to
e880ee0
Compare
b51eb42 to e880ee0 Still haven't had the time to write more unit tests for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK e880ee0. Just rebase and suggested simplifications since the last review
@DrahtBot, I just reviewed it.. |
e880ee0
to
3c3895c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 3c3895c
Left another suggestion, but this looks good in its current form.
by not saving all indexes in a std::multimap, but only those that are not part of the best header chain. The indexes of the best header chain are stored in a vector, which, in the typical case of a mostly linear chain with a few forks, results in a much smaller multimap, and increases performance noticeably for long chains. This does not change the actual consistency checks that are being performed for each index, just the way the block index tree is stored and traversed. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
This makes it similar to -checkaddrman and -checkmempool, which also allow to run the check occasionally instead of always / never. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
3c3895c
to
5bc2077
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 5bc2077
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 5bc2077. Just added suggested assert and simplification since last review
ACK 5bc2077 |
Ported to the CMake-based build system in hebasto#290. |
CheckBlockIndex()
are consistency checks that are currently enabled by default on regtest.The function is rather slow, which is annoying if you
-checkblockindex
or don't know it existed.One reason why it's slow is that in order to be able to traverse the block tree depth-first from genesis, it inserts pointers to all block indices into a
std::multimap
- for which inserts and lookups become slow once there are hundred thousands of entries.However, typically the block index is mostly chain-like with just a few forks so a multimap isn't really needed for the most part. This PR suggests to store the block indices of the chain ending in the best header in a vector instead, and store only the rest of the indices in a multimap. This does not change the actual consistency checks that are being performed for each index, just the way the block index tree is stored and traversed.
This adds a bit of complication to make sure each block is visited (note that there are asserts that check it), making sure that the two containers are traversed correctly, but it speeds up the function considerably:
On master, a single invocation of
CheckBlockIndex
takes ~1.4s on mainnet for me (4.9s on testnet which has >2.4 million blocks).With this branch, the runtime goes down to ~0.27s (0.85s on testnet).This is a speedup by a factor ~5.