-
Notifications
You must be signed in to change notification settings - Fork 37.4k
test: p2p: adhere to typical VERSION message protocol flow #29353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: p2p: adhere to typical VERSION message protocol flow #29353
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. |
95d0add
to
06deeb7
Compare
Squashed the last two commits, as the second to last commit failed without the last bugfix commit ("test: p2p: process post-v2-handshake data immediately"); this should hopefully make the "test each commit" CI target happy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, I think the test framework should replicate the bitcoind logic (that, as a fun fact, has been in place since the patch #101 by Hal Finney)
This deduplicates code for sending out the VERSION message (if available and not sent yet), currently used at three different places: 1) in the `connection_made` asyncio callback (for v1 connections that are not v2 reconnects) 2) at the end of `v2_handshake`, if the v2 handshake succeeded 3) in the `on_version` callback, if a reconnection with v1 happens
In the course of executing the asyncio data reception callback during a v2 handshake, it's possible that the receive buffer already contains data for after the handshake (usually a VERSION message for inbound connections). If we don't process that data immediately, we would do so after the next message is received, but with the adapted protocol flow introduced in the next commit, there is no next message, as the TestNode wouldn't continue until we send back our own version in `on_version`. Fix this by calling `self._on_data` immediately if there's data left in the receive buffer after a completed v2 handshake.
The test framework's p2p implementation currently sends out it's VERSION message immediately after an inbound connection (i.e. TestNode outbound connection) is made. This doesn't follow the usual protocol flow where the initiator sends a version first, and the responders processes that and only then responds with its own version message. Change that accordingly by only sending immediate VERSION message for outbound connections (or after v2 handshake for v2 connections, respectively), and sending out VERSION messages as response for incoming VERSION messages (i.e. in the function `on_version`) for inbound connections. Note that some of the overruled `on_version` methods in functional tests needed to be changed to send the version explicitly.
06deeb7
to
c340503
Compare
Brought back the commit "test: p2p: process post-v2-handshake data immediately" again and put it in before the one changing the version message flow. This leads to a cleaner history compared to the squashed commit (containing two long descriptions), while still passing the checks at each commit. Kudos to @stratospher for the idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK c340503
I ran the functional test suite both with and without --v2transport
on this branch and both runs succeeded.
utACK c340503 |
tested ACK c340503. very useful to have since we'd want real node behaviour! there is also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK c340503
As a side note, I found the newly added comments regarding inbounds and outbounds to be a bit confusing. This is due to them being written from the POV of the Python node instead of the TestNode's, which is the logic used by add_p2p_connection
and add_outbound_p2p_connection
@@ -17,7 +17,7 @@ def on_version(self, message): | |||
# message is received from the test framework. Don't send any responses | |||
# to the node's version message since the connection will already be | |||
# closed. | |||
pass | |||
self.send_version() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really just calling super().send_version
. We could get rid of the whole class given no other method is being overwritten (or leave the alias if we want it for readability, but get rid of the method)
@@ -43,7 +44,8 @@ def on_sendtxrcncl(self, message): | |||
|
|||
class P2PFeelerReceiver(SendTxrcnclReceiver): | |||
def on_version(self, message): | |||
pass # feeler connections can not send any message other than their own version | |||
# feeler connections can not send any message other than their own version | |||
self.send_version() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for p2p_add_connections::P2PFeelerReceiver
I agree that the inbound/outbound naming is currently not ideal w.r.t. mixed node perspectives and should be improved. Will look into a potential follow-up PR addressing this (#29353 (comment) and #29353 (comment) can also be tackled in that one then). Another idea I had recently was to improve the naming of the |
, bitcoin#29353, bitcoin#29452, bitcoin#29483, bitcoin#30545, bitcoin#31383 (BIP324 backports: part 5) c6f23a7 merge bitcoin#31383: Add missing node.setmocktime(self.mocktime) to p2p_ibd_stalling.py (Kittywhiskers Van Gogh) 9072a10 merge bitcoin#30545: fix intermittent failures in feature_proxy.py (Kittywhiskers Van Gogh) 7e2d435 merge bitcoin#29483: add --v1transport option, add --v2transport to a CI task (Kittywhiskers Van Gogh) 7e59a96 merge bitcoin#29452: document that BIP324 on by default (Kittywhiskers Van Gogh) 0f3b5e0 merge bitcoin#29353: adhere to typical VERSION message protocol flow (Kittywhiskers Van Gogh) dfdddfd rpc: enable `v2transport` for `masternode connect` when capable (Kittywhiskers Van Gogh) 3c16361 merge bitcoin#29356: make v2transport arg in addconnection mandatory and few cleanups (Kittywhiskers Van Gogh) c53cd93 merge bitcoin#29347: enable v2transport by default (Kittywhiskers Van Gogh) 7dcf561 merge bitcoin#27452: cover addrv2 anchors by adding TorV3 to CAddress in messages.py (Kittywhiskers Van Gogh) Pull request description: ## Additional Information * When backporting [bitcoin#27452](bitcoin#27452) in `feature_anchors.py`, `P2P_SERVICES` (`NODE_NETWORK | NODE_HEADERS_COMPRESSED`) has been replaced with `NODE_NETWORK` as the former evaluates to a value greater than `256` (specifically `2049`), which causes test failure. The replacement value is acceptable as `NODE_NETWORK` is the desired service flag expected by Dash Core ([source](https://github.com/dashpay/dash/blob/779e4295ad253bf909a7c45ec02743e580abc61b/src/protocol.cpp#L249-L254)). <details> <summary>Test failure:</summary> ``` dash@89afd55ae77e:/src/dash$ ./test/functional/feature_anchors.py 2024-12-14T12:31:22.244000Z TestFramework (INFO): PRNG seed is: 8365703189892653614 2024-12-14T12:31:22.244000Z TestFramework (INFO): Initializing test directory /tmp/dash_func_test_j0ya02yu 2024-12-14T12:31:22.776000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist 2024-12-14T12:31:22.776000Z TestFramework (INFO): Add 2 block-relay-only connections to node 2024-12-14T12:31:24.781000Z TestFramework (INFO): Add 5 inbound connections to node 2024-12-14T12:31:29.843000Z TestFramework (INFO): Check node connections 2024-12-14T12:31:30.848000Z TestFramework (INFO): Check the addresses in anchors.dat 2024-12-14T12:31:30.848000Z TestFramework (INFO): Perturb anchors.dat to test it doesn't throw an error during initialization 2024-12-14T12:31:31.356000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist anymore 2024-12-14T12:31:31.357000Z TestFramework (INFO): Ensure addrv2 support 2024-12-14T12:31:32.364000Z TestFramework (INFO): Add 256-bit-address block-relay-only connections to node 2024-12-14T12:31:33.368000Z TestFramework (INFO): Check for addrv2 addresses in anchors.dat 2024-12-14T12:31:33.369000Z TestFramework (ERROR): Unexpected exception caught during testing Traceback (most recent call last): File "/src/dash/test/functional/test_framework/test_framework.py", line 161, in main self.run_test() File "/src/dash/./test/functional/feature_anchors.py", line 135, in run_test new_data[services_index] = P2P_SERVICES ValueError: byte must be in range(0, 256) 2024-12-14T12:31:33.870000Z TestFramework (INFO): Stopping nodes 2024-12-14T12:31:33.871000Z TestFramework (WARNING): Not cleaning up dir /tmp/dash_func_test_j0ya02yu 2024-12-14T12:31:33.871000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/dash_func_test_j0ya02yu/test_framework.log ``` </details> ## Breaking Changes None expected. ## Checklist: - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)** - [x] I have added or updated relevant unit/integration/functional/e2e tests - [x] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: UdjinM6: utACK c6f23a7 PastaPastaPasta: utACK c6f23a7; Tree-SHA512: ca134432d000d521827a20c75c03913421fe52a31fda1cbf632e0b207c31728406feb090469d592d8e2fd8d64298faa2752ff703de79f737a62c276c6a231097
This PR addresses a quirk in the test framework's p2p implementation regarding the version handshake protocol:
Currently, the VERSION message is sent immediately after an inbound connection (i.e. TestNode outbound connection) is made. This doesn't follow the usual protocol flow where the initiator sends a version first, the responder processes that and only then responds with its own version message. Change that accordingly by only sending immediate VERSION message for outbound connections (or after v2 handshake for v2 connections, respectively), and sending out VERSION message as response for incoming VERSION messages (i.e. in the function
on_version
) for inbound connections.I first stumbled upon this issue through reading comment https://mirror.b10c.me/bitcoin-bitcoin/24748/#discussion_r1465420112 (see last paragraph) and recently again in the course of working on a v2-followup for #29279, where this causes issues for TestNode outbound connections that disconnect before sending out their own version message.
Note that these changes lead to slightly more code in some functional tests that override the
on_version
method, as the version reply has to be sent explicitly now, but I think is less confusing and reflects better what is actually happening.