net: introduce block tracker to retry to download blocks after failure #27837
Maybe mark as draft if CI is red and this is still based on something else?
Oh, didn't know about the CI failure. Seems to be just an unused variable. But sure, draft until #27836 sounds good.
1311875 to aab7a2f
fe0f9be to d589be0
6629d1d test: improve robustness of connect_nodes() (furszy)
Pull request description:
Decoupled from #27837 because this can help others too; found while investigating a CI failure: https://cirrus-ci.com/task/5805115213348864?logs=ci#L3200.
The `connect_nodes` function in the test framework relies on a stable number of peer connections to verify that the new connection between the nodes is successfully established. This approach is fragile, as any of the peers involved in the process can drop, lose, or create a connection at any step, causing subsequent `wait_until` checks to stall indefinitely even when the peers in question were connected successfully.
This commit improves the situation by using the nodes' subversion and the connection direction (inbound/outbound) to identify the exact peer connection and perform the checks exclusively on it.
ACKs for top commit:
stratospher: reACK 6629d1d
achow101: ACK 6629d1d
maflcko: utACK 6629d1d
AngusP: re-ACK 6629d1d
Tree-SHA512: 5f345c0ce49ea81b643e97c5cffd133e182838752c27592fcdeac14ad10919fb4b7ff38e289e42a7c3c638a170bd0d0b7a9cd493898997a2082a7b7ceef4aeeb
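The identification step described in that commit can be sketched roughly as follows. This is only an illustration, not the test framework's actual code: `find_peer` is a hypothetical helper, the sample entries are made up, and it assumes `getpeerinfo`-style dictionaries that carry `subver` and `inbound` fields.

```python
def find_peer(peer_info, subver, inbound):
    """Return the single entry matching subversion and direction, else None.

    Hypothetical helper: matching on (subver, inbound) instead of counting
    connections means unrelated peers coming or going do not affect the check.
    """
    matches = [p for p in peer_info
               if p["subver"] == subver and p["inbound"] == inbound]
    return matches[0] if len(matches) == 1 else None


# Made-up sample data shaped like a getpeerinfo result.
peers = [
    {"id": 0, "subver": "/Satoshi:25.0.0(testnode0)/", "inbound": True},
    {"id": 1, "subver": "/Satoshi:25.0.0(testnode2)/", "inbound": False},
]

outbound_to_node2 = find_peer(peers, "/Satoshi:25.0.0(testnode2)/", inbound=False)
```

A caller would then poll only the matched entry until it reports the expected state, rather than waiting on the total peer count.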
d589be0 to 0545d24
0545d24 to 8c4f665
8c4f665 to c31233a
If the initial block fetching process fails, the p2p layer takes charge of fetching the block from 'any' connected peer, retrying the download from different peers until the block is received.
If no 'peer_id' is provided, 'getblockfrompeer' will just delegate the peer selection to the internal block downloading logic.
Allows what we had before: a single block request with no automatic retry or tracking mechanism.
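As a rough illustration of the retry behavior these commit messages describe, here is a toy model in Python. It is not the PR's C++ implementation: the `BlockTracker` class shape, its method names, and the "first untried peer" selection rule are all assumptions made for the sketch, and it simply gives up once every connected peer has been tried.

```python
class BlockTracker:
    """Toy tracker: remembers requested blocks and retries on failure."""

    def __init__(self):
        # block hash -> set of peer ids that were already asked for it
        self._tried = {}

    def request(self, block_hash, connected_peers, peer_id=None):
        """Start tracking block_hash and return the peer to ask first."""
        self._tried.setdefault(block_hash, set())
        if peer_id is None:
            # No peer given: delegate the choice to the internal selection.
            return self._next_peer(block_hash, connected_peers)
        self._tried[block_hash].add(peer_id)
        return peer_id

    def on_failure(self, block_hash, connected_peers):
        """A peer stalled or disconnected: return another peer, or None."""
        if block_hash not in self._tried:
            return None  # untracked request, nothing to retry
        return self._next_peer(block_hash, connected_peers)

    def on_block_received(self, block_hash):
        """The block arrived, so stop tracking it."""
        self._tried.pop(block_hash, None)

    def is_tracked(self, block_hash):
        return block_hash in self._tried

    def _next_peer(self, block_hash, connected_peers):
        # Assumed rule: take the first connected peer not tried yet.
        for peer in connected_peers:
            if peer not in self._tried[block_hash]:
                self._tried[block_hash].add(peer)
                return peer
        return None


tracker = BlockTracker()
first = tracker.request("blockhash", [1, 2, 3], peer_id=2)  # user picks peer 2
retry = tracker.on_failure("blockhash", [1, 3])             # peer 2 went away
tracker.on_block_received("blockhash")
```

The point of the model is that the caller issues one request and the tracker owns the follow-up, so a failure needs no further user action.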
c31233a to b18c72c
🐙 This pull request conflicts with the target branch and needs rebase.
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
Coming from #27652, part of #29183.
The general idea is to keep track of the user-requested blocks so that, in case of a misbehaving peer or a network disconnection, they can be fetched from another peer automatically, without any further user interaction.
This was requested by users because the `getblockfrompeer` RPC command lacks the functionality to notify them about block request failures or peer disconnections (which is expected, given the asynchronous nature of block requests).
Currently, this new functionality is limited to blocks requested by the user via the `getblockfrompeer` RPC command.
In the future, this class could expand its scope and be used in the regular chain synchronization process. It could also be employed in special procedures such as a pruned-node rescan that uses BIP158 block filters, or even in BIP157 itself.