index: initial sync speedup, parallelize process #26966
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage & Benchmarks: for details see https://corecheck.dev/bitcoin/bitcoin/pulls/26966.
Reviews: see the guideline for information on the review process. If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts: reviewers, this pull request conflicts with the following ones. If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
LLM Linter (✨ experimental): possible typos and grammar issues.
Force-pushed from deb2e6e to d8b0f5e
Cool, will take it for a spin...
Force-pushed from d8b0f5e to 65dc850
Cool @Sjors, just pushed a small update. Found a little bug. Going to add an important note to the PR description (because otherwise testing results will vary a lot): as the access to the block data on disk is protected by cs_main, this new feature runs substantially faster when the node is not in IBD.
Force-pushed from 65dc850 to 09cc56a
It's probably much easier suggested than done, but did you attempt to implement parallelization in a more general way so that other indices could benefit from it as well? On first glance, txindex and the indices suggested in PRs (#24539, #26951) seem to be parallelizable as well (not coinstatsindex though).
Yeah, that is part of the plan. I started with the block filter index because it requires special treatment. My idea was to get this one reviewed first, so the process gets as clean as possible, and then move forward with the generalization step. It's usually more natural to abstract processes when the specific cases are well-defined.
Force-pushed from 09cc56a to 43e6237
Concept ACK
Perhaps it could have separate parallelization and index logic, so it could be reused in other indexes.
Force-pushed from b2b62f5 to 4e1fba1
Building the index on an AMD Ryzen 7950X (blocks stored on SSD):
I made sure to not load any wallets and disabled other indexes. I didn't test whether the index was correct. I wonder if, for users without this index, it would be faster to generate the index, rescan the wallet, and then delete it again. Combined with #26951 you would only have to generate filters up to the age of the wallet (IIUC, cc @pstratem). Note to self for future benchmarks:
Great results @Sjors! Could you also give it a run rebased on top of #27006?
The PR contains a test verifying it. Side note:
Force-pushed from 1da78b2 to f264ed5
Force-pushed from 81dcee9 to 349b099
Approach ACK 349b099 and I reviewed most of the code. Seems like a nice design and good approach. Plan to finish reviewing later.
src/index/base.cpp
Outdated
__func__, pindex->GetBlockHash().ToString());
    return;
}
if (!ProcessBlock(pindex)) break; // error logged internally
In commit "index: remove CBlockIndex access from node internals" (c4723fb)
It seems like this should be returning, not breaking. Would be good to change or clarify with a comment.
Yeah, good catch! I think in the end it doesn't matter much, because ProcessBlock aborts the node on failure, but still, this would have logged an extra "index enabled at height" line during shutdown.
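A minimal standalone sketch (hypothetical stubs, not the actual BaseIndex code) of why the distinction matters: with break, execution still reaches the logging after the loop, whereas return would skip it.

```cpp
#include <cstdio>

// Pretend block 3 fails; in the real code ProcessBlock logs the error and aborts the node.
static bool ProcessBlock(int height) { return height != 3; }

static void Sync()
{
    int height = 0;
    while (height <= 5) {
        if (!ProcessBlock(height)) break; // `return` here would skip the log line below
        ++height;
    }
    std::printf("index enabled at height %d\n", height); // still runs after a `break`
}

int main() { Sync(); }
```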
src/index/base.cpp
Outdated
interfaces::BlockInfo block_info = kernel::MakeBlockInfo(pindex, block.get());
if (CustomAppend(block_info)) {

if (ProcessBlock(pindex)) {
In commit "index: remove CBlockIndex access from node internals" (c4723fb)
Could add another // error logged internally comment here since it now looks like this failure is being ignored.
Done as suggested.
src/index/blockfilterindex.cpp
Outdated
}
}

const CBlockUndo& block_undo = block.height > 0 ? *Assert(block.undo_data) : CBlockUndo();
In commit "index: remove CBlockIndex access from node internals" (c4723fb)
I'm not sure this is safe. It looks like if the block height is 0 this is taking a reference to a temporary object that will go out of scope. In any case I think this could be simplified to const CBlockUndo& block_undo{*Assert(block.undo_data)} because it looks like the pointer will be set above even if the height is 0.
yeah, great. Done as suggested.
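For illustration, a standalone sketch (hypothetical BlockUndo type, not Bitcoin Core code) of why the direct binding is preferable: with mixed value categories the conditional operator yields a prvalue, so the reference ends up bound to a materialized temporary copy rather than to the stored undo object, while dereferencing the pointer binds to the original.

```cpp
#include <cstdio>

struct BlockUndo { int n{0}; };

int main()
{
    BlockUndo stored{42};
    const BlockUndo* undo_ptr = &stored; // set even for height 0 in the reviewed code
    int height = 1;

    // The ternary materializes a temporary copy (lifetime-extended here, but not aliasing `stored`).
    const BlockUndo& via_ternary = height > 0 ? *undo_ptr : BlockUndo();
    // Suggested form: binds directly to the stored object.
    const BlockUndo& direct{*undo_ptr};

    std::printf("ternary aliases stored: %d\n", &via_ternary == &stored); // prints 0
    std::printf("direct aliases stored:  %d\n", &direct == &stored);      // prints 1
}
```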
src/util/threadpool.h
Outdated
Mutex cs_work_queue;
std::queue<std::function<void()>> m_work_queue GUARDED_BY(cs_work_queue);
std::condition_variable m_condition;
CThreadInterrupt m_interrupt;
In commit "util: introduce general purpose thread pool" (a382501)
I don't think it makes sense for m_interrupt to use the CThreadInterrupt class, because the ThreadPool class already has its own mutex and condition variable and it would be wasteful to introduce more. I think you could just replace CThreadInterrupt with a bool here, replace m_interrupt(); with m_interrupt = true, and replace m_interrupt.reset() with m_interrupt = false while holding the mutex.
sure, done as suggested. Thanks!
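A simplified stand-in sketch of the resulting shape (using std primitives rather than Bitcoin Core's Mutex/GUARDED_BY annotations): the interrupt is a plain bool flipped under the same mutex the workers already wait on.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool
{
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_work_queue;
    bool m_interrupt{false};
    std::vector<std::thread> m_workers;

    void WorkerLoop()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock lock{m_mutex};
                m_cv.wait(lock, [&] { return m_interrupt || !m_work_queue.empty(); });
                if (m_interrupt && m_work_queue.empty()) return;
                task = std::move(m_work_queue.front());
                m_work_queue.pop();
            }
            task(); // run outside the lock
        }
    }

public:
    explicit ThreadPool(int n) { for (int i = 0; i < n; ++i) m_workers.emplace_back([this] { WorkerLoop(); }); }
    ~ThreadPool()
    {
        { std::lock_guard lock{m_mutex}; m_interrupt = true; } // flipped under the shared mutex
        m_cv.notify_all();
        for (auto& t : m_workers) t.join();
    }
    void Submit(std::function<void()> task)
    {
        { std::lock_guard lock{m_mutex}; m_work_queue.push(std::move(task)); }
        m_cv.notify_one();
    }
};
```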
src/sync.h
Outdated
@@ -299,6 +299,7 @@ inline MutexType* MaybeCheckNotHeld(MutexType* m) LOCKS_EXCLUDED(m) LOCK_RETURNE
//! The above is detectable at compile-time with the -Wreturn-local-addr flag in
//! gcc and the -Wreturn-stack-address flag in clang, both enabled by default.
#define WITH_LOCK(cs, code) (MaybeCheckNotHeld(cs), [&]() -> decltype(auto) { LOCK(cs); code; }())
#define WITH_REVERSE_LOCK(cs, code) ([&]() -> decltype(auto) { REVERSE_LOCK(cs); code; }())
In commit "util: introduce general purpose thread pool" (a382501)
This seems OK, but it is only used in one place, where WITH_REVERSE_LOCK is not much simpler than plain REVERSE_LOCK. Could consider dropping it.
Sure. I think I did it this way to be very explicit about the code that will be executed without the lock, but yeah, we could achieve the same outcome with another set of brackets too.
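As a rough stand-in (plain std::unique_lock, not Bitcoin Core's REVERSE_LOCK machinery), the trade-off looks like this: the wrapper makes the unlocked region explicit as a single expression, while the plain form needs only an extra scope.

```cpp
#include <mutex>

std::mutex g_mutex;

// Wrapper form: run `fn` with the lock temporarily released, then re-acquire it.
// (Not exception-safe; a real implementation would use a scope guard.)
template <typename Fn>
auto WithReverseLock(std::unique_lock<std::mutex>& lock, Fn&& fn)
{
    lock.unlock();
    auto result = fn();
    lock.lock();
    return result;
}

int SlowWorkWithoutLock() { return 42; }

int main()
{
    std::unique_lock lock{g_mutex};

    // Explicit wrapper: the code running without the lock is visibly the lambda body.
    int a = WithReverseLock(lock, [] { return SlowWorkWithoutLock(); });

    // Equivalent with an explicit scope and manual unlock/relock.
    int b = 0;
    {
        lock.unlock();
        b = SlowWorkWithoutLock();
        lock.lock();
    }
    return (a + b) > 0 ? 0 : 1;
}
```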
src/util/threadpool.h
Outdated
class ThreadPool {

private:
    Mutex cs_work_queue;
In commit "util: introduce general purpose thread pool" (a382501)
Not important, but would suggest simplifying naming and just calling these members:
Mutex m_mutex;
std::condition_variable m_cv;
The cs_ prefix is an older convention that comes from Windows code, and as long as this class is going to have one mutex there isn't really a reason to use a more complicated name.
k sure. Done as suggested.
src/init.cpp
Outdated
@@ -2104,7 +2108,20 @@ bool StartIndexBackgroundSync(NodeContext& node)
}
}

std::shared_ptr<ThreadPool> thread_pool;
In commit "index: implement index parallel sync" (bc0e521)
This doesn't seem like a great use of shared_ptr because it makes the shutdown sequence more complicated than it needs to be. I think it would be clearer if, instead of taking std::shared_ptr<ThreadPool> references, they just used ThreadPool& and a new std::unique_ptr<ThreadPool> m_index_threads member was added to NodeContext. This way the threads could be stopped with an explicit reset() call instead of shutting down more unpredictably when the last index is destroyed.
Sounds good. Done as suggested. Thanks!
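A sketch of the suggested ownership shape (hypothetical, simplified stand-ins for NodeContext and the index classes, not the real declarations): the node owns the pool through a unique_ptr and indexes only borrow a reference, so shutdown becomes an explicit reset() at a well-defined point.

```cpp
#include <memory>

// Stand-in for the PR's ThreadPool; see the worker-loop sketch above for the internals.
class ThreadPool {
public:
    explicit ThreadPool(int /*workers*/) {}
};

struct NodeContext {
    std::unique_ptr<ThreadPool> index_threads; // owned here, created during init
};

class BaseIndex {
    ThreadPool* m_thread_pool{nullptr}; // borrowed; null means single-threaded sync
public:
    void SetThreadPool(ThreadPool& pool) { m_thread_pool = &pool; }
    bool HasThreadPool() const { return m_thread_pool != nullptr; }
};

void StartIndexBackgroundSync(NodeContext& node, BaseIndex& index, int index_workers)
{
    if (index_workers > 0) {
        node.index_threads = std::make_unique<ThreadPool>(index_workers);
        index.SetThreadPool(*node.index_threads);
    }
}

void Shutdown(NodeContext& node)
{
    // Indexes must have stopped queueing work by now; the pool is torn down at this
    // well-defined point instead of whenever the last shared_ptr owner goes away.
    node.index_threads.reset();
}

int main()
{
    NodeContext node;
    BaseIndex filter_index;
    StartIndexBackgroundSync(node, filter_index, /*index_workers=*/4);
    Shutdown(node);
}
```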
src/index/base.cpp
Outdated
const int max_blocks_to_sync = m_tasks_per_worker * m_thread_pool->WorkersCount() + m_tasks_per_worker; // extra 'm_tasks_per_worker' due the active-wait.
const int tip_height = WITH_LOCK(cs_main, return m_chainstate->m_chain.Height());
const int remaining_blocks = tip_height - pindex_next->nHeight;
work_chunk = remaining_blocks > max_blocks_to_sync ? m_tasks_per_worker : remaining_blocks / (m_thread_pool->WorkersCount() + 1);
In commit "index: implement index parallel sync" (bc0e521)
I found this hard to follow:
const int max_blocks_to_sync = m_tasks_per_worker * m_thread_pool->WorkersCount() + m_tasks_per_worker; // extra 'm_tasks_per_worker' due the active-wait.
work_chunk = remaining_blocks > max_blocks_to_sync ? m_tasks_per_worker : remaining_blocks / (m_thread_pool->WorkersCount() + 1);
workers_count = m_thread_pool->WorkersCount();
Would suggest dropping the max_blocks_to_sync variable and simplifying to:
workers_count = m_thread_pool->WorkersCount();
work_chunk = std::min(m_tasks_per_worker, remaining_blocks / (workers_count + 1));
Good idea. Done as suggested.
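A small numeric sketch (hypothetical values; the real constant and member names in the PR may differ) of the simplified sizing: far from the tip each thread gets a full chunk, while near the tip the remaining blocks are split evenly across the workers plus the calling thread.

```cpp
#include <algorithm>
#include <cstdio>

int main()
{
    const int blocks_per_chunk = 1000; // stand-in for INDEX_WORK_PER_CHUNK
    const int workers_count = 4;       // pool workers, excluding the sync thread itself

    for (int remaining_blocks : {800'000, 3'000, 50}) {
        const int work_chunk = std::min(blocks_per_chunk, remaining_blocks / (workers_count + 1));
        std::printf("remaining=%d -> chunk=%d blocks per thread\n", remaining_blocks, work_chunk);
    }
}
```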
src/index/base.h
Outdated
@@ -88,6 +91,7 @@ class BaseIndex : public CValidationInterface
CThreadInterrupt m_interrupt;

std::shared_ptr<ThreadPool> m_thread_pool;
uint16_t m_tasks_per_worker{INDEX_WORK_PER_CHUNK};
In commit "index: implement index parallel sync" (bc0e521)
IMO, this would be clearer if it were called blocks_per_worker or blocks_per_chunk instead of tasks_per_worker. Not knowing that a task is a block made the code harder to understand when initially reading it.
sure. Done as suggested.
src/index/base.cpp
Outdated
std::vector<std::future<std::vector<std::any>>> futures;
const CBlockIndex* it_start = pindex_next;

if (parallel_sync_enabled) {
In commit "index: implement index parallel sync" (bc0e521)
Would be really helpful to have a comment saying how this works at a high level. Would suggest something like: "If parallel sync is enabled, use WorkersCount()+1 threads (including the current thread) to each process block ranges of up to m_tasks_per_worker blocks. The blocks in each range are processed in sequence by calling the index's CustomProcessBlock method, which returns std::any values that are collected into vectors. As the threads finish their work, the std::any values are processed in order by calling the index's CustomPostProcessBlocks method, and the process repeats until no blocks remain to be processed and post-processed."
I guess at a high level this seems reasonable, but perhaps too rigid. Say there are 3 threads processing block ranges 1-10, 11-20, 21-30, and the first 2 threads finish while the third thread is slow. Why should the loop need to wait for the third thread before beginning to process blocks 31-50, while there are two idle threads doing nothing?
It seems like this idleness could be avoided by moving the CustomPostProcessBlocks calls into the worker threads, so that whenever a worker thread finishes processing blocks, it opportunistically calls CustomPostProcessBlocks to post-process any blocks that are available (given the ordering constraint for post-processing). This way all the worker threads would continuously have work to do, and I suspect the resulting code might be simpler too, since there would just be a single phase of work, not alternating Processing/PostProcessing phases.
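For reference, a heavily simplified sketch of the two-phase loop described above (hypothetical stand-ins; it uses std::async instead of the PR's ThreadPool, and plain ints as placeholder per-block results): ranges are processed in parallel, then their results are consumed in block order on the sync thread. Moving the post-processing into the workers, as suggested, would collapse these two phases into one.

```cpp
#include <any>
#include <future>
#include <vector>

// Stand-in for calling CustomProcessBlock on each block in a contiguous range.
std::vector<std::any> ProcessRange(int start, int count)
{
    std::vector<std::any> out;
    for (int h = start; h < start + count; ++h) out.emplace_back(h); // placeholder per-block result
    return out;
}

// Stand-in for CustomPostProcessBlocks; must see results in block order.
void PostProcess(const std::vector<std::any>& results) { (void)results; }

void SyncChunk(int next_height, int chunk, int workers)
{
    std::vector<std::future<std::vector<std::any>>> futures;
    // Workers each take one range; ranges are contiguous, so ordering is known up front.
    for (int w = 0; w < workers; ++w) {
        futures.emplace_back(std::async(std::launch::async, ProcessRange, next_height + w * chunk, chunk));
    }
    // The sync thread processes the last range itself instead of idling (the "active wait").
    std::vector<std::any> own = ProcessRange(next_height + workers * chunk, chunk);

    for (auto& f : futures) PostProcess(f.get()); // consume in block order
    PostProcess(own);
}

int main() { SyncChunk(/*next_height=*/0, /*chunk=*/1000, /*workers=*/4); }
```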
Force-pushed from 349b099 to 55f23cd
Thanks for the review, andrewtoth and ryanofsky!
Addressed most of the suggestions, but not all yet. Will finish the rest soon.
🚧 At least one of the CI tasks failed. Hints: try to run the tests locally, according to the documentation. Leave a comment here if you need help tracking down a confusing failure.
Force-pushed from 1ccb9f6 to f3f5e4c
029ba1a index: remove CBlockIndex access from CustomAppend() (furszy)
91b7ab6 refactor: index, simplify CopyHeightIndexToHashIndex to process single block (furszy)
6f1392c indexes, refactor: Remove remaining CBlockIndex* uses in index Rewind methods (Ryan Ofsky)
0a24870 indexes, refactor: Stop requiring CBlockIndex type to call IsBIP30Unspendable (Ryan Ofsky)
331a25c test: indexes, avoid creating threads when sync runs synchronously (furszy)

Pull request description:
Combining common refactors from #24230 and #26966, aiming to move both efforts forward while reducing their size and review burden. Broadly, #24230 focuses on enabling indexes to run in a separate process, and #26966 aims to parallelize the indexes initial synchronization process. A shared prerequisite for both is ensuring that only the base index class interacts with the node's chain internals - child index classes should instead operate solely through chain events. This PR moves disk read lookups from child index classes to the base index class. It also includes a few documentation improvements and a test-only code cleanup.

ACKs for top commit:
maflcko: review ACK 029ba1a 👡
achow101: ACK 029ba1a
TheCharlatan: Re-ACK 029ba1a
davidgumberg: ACK 029ba1a
mzumsande: Code Review ACK 029ba1a

Tree-SHA512: f073af407fc86f228cb47a32c7bcf2241551cc89ff32059317eb81d5b86fd5fda35f228d2567e0aedbc9fd6826291f5fee05619db35ba44108421ae04d11e6fb
And add option to customize thread pool workers count
It also adds coverage for initial sync from a particular block, mimicking a node restart.
Force-pushed from f3f5e4c to 53bc8b9
const auto& [filter, height] = std::any_cast<std::pair<BlockFilter, int>>(obj);
const uint256& header = filter.ComputeHeader(m_last_header);
if (!Write(filter, height, header)) {
    LogError("Error writings filters, shutting down block filters index\n");
writings -> writing [incorrect verb form in "Error writings filters"]
The current procedure for building the block filter index processes filters one at a time, reading blocks, undo data, and previous headers from disk sequentially.
This PR introduces a new mechanism to perform the work concurrently, dividing the filter generation workload among a pool of workers that can be configured by the user and significantly increasing the speed of the index construction process.
The same concurrent processing model has been applied to the transaction index as well.
The newly introduced init flag -indexworkers=<n> enables the concurrent sync behavior, where "n" is the number of worker threads spawned at startup to create ranges of block filters during the initial sync process. The worker pool is destroyed once the initial sync completes.
Note: by default, the parallelized sync process is not enabled.
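As a usage sketch (a hypothetical invocation, not taken from the PR text): starting the node with -blockfilterindex=1 -txindex=1 -indexworkers=4 would spawn four worker threads to parallelize the initial sync of both indexes, with the pool torn down once they reach the tip.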
Now the juicy part: on my computer, with the node in debug mode and during IBD, with -indexworkers=4, the block filter index generation took less than an hour, while on master the sync took more than 7 hours.
Important note: as the access to the block data on disk is protected by cs_main, this new feature runs substantially faster when the node is not in IBD.