fix(mirror): wait 15 seconds before sending transactions by marcelo-gonzalez · Pull Request #10990 · near/nearcore · GitHub

fix(mirror): wait 15 seconds before sending transactions #10990

Merged: 2 commits into near:master from mirror-delay, Apr 10, 2024

Conversation

marcelo-gonzalez
Contributor

The test pytest/tools/mirror/offline_test.py often fails with far fewer transactions observed than expected, and the logs reveal that many transactions are invalid because the access keys used do not exist in the target chain. This happens because some early transactions that should have added those keys never make it on chain. These transactions are sent successfully from the perspective of the ClientActor, but the logs show that they're dropped by the peer manager:

DEBUG handle{handler="PeerManagerMessageRequest" actor="PeerManagerActor" msg_type="NetworkRequests"}: network: Failed sending message: peer not connected to=ed25519:Fz7d1xkkt3XsvTPiwk4JRhMuPru4Ss7cLS8fdhshDRj3 num_connected_peers=1 msg=Routed(RoutedMessageV2 { msg: RoutedMessage { ... body: tx GFW8HgTndXVKdcLHdsCXxURjHxDnnEqHadrbxvsLKVQb ...

So the peer manager is dropping the transaction instead of routing it, and the test fails because many subsequent transactions depend on that one. A git bisect shows that this behavior starts after #9651. It seems that since that PR, this failure to route messages persists for a bit longer after startup.

The proper way to handle this might be to implement a mechanism whereby these messages aren't silently dropped, and the ClientActor is notified that the send was unsuccessful so that we can retry it later. But for now, a workaround is to just wait a little bit before sending transactions. So we'll set a 15 second timer for the first batch of transactions, and then proceed normally with the others.
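To illustrate the workaround, here is a minimal sketch of the idea of delaying only the first batch. It is not the actual change in tools/mirror/src/chain_tracker.rs: `SendScheduler`, `next_delay`, and `INITIAL_SEND_DELAY` are hypothetical names invented for this example, and the normal inter-batch delay is simply passed in by the caller.

```rust
use std::time::Duration;

/// Hypothetical constant: the 15 second wait applied before the first batch.
pub const INITIAL_SEND_DELAY: Duration = Duration::from_secs(15);

/// Hypothetical helper that remembers whether the first batch has been scheduled.
pub struct SendScheduler {
    first_batch_scheduled: bool,
}

impl SendScheduler {
    pub fn new() -> Self {
        Self { first_batch_scheduled: false }
    }

    /// Returns how long to wait before sending the next batch.
    /// The first call returns the startup delay, giving the network
    /// time to establish routes; later calls return `normal_delay`.
    pub fn next_delay(&mut self, normal_delay: Duration) -> Duration {
        if self.first_batch_scheduled {
            normal_delay
        } else {
            self.first_batch_scheduled = true;
            INITIAL_SEND_DELAY
        }
    }
}
```

A caller would ask the scheduler for a delay before each batch and sleep for that long. Only the first batch pays the 15 second cost, so steady-state throughput is unchanged.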

@marcelo-gonzalez marcelo-gonzalez requested a review from a team as a code owner April 8, 2024 18:35
codecov bot commented Apr 8, 2024

Codecov Report

Attention: Patch coverage is 0%, with 26 lines in your changes missing coverage. Please review.

Project coverage is 71.32%. Comparing base (ce92db4) to head (45cd21d).
Report is 1 commit behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| tools/mirror/src/chain_tracker.rs | 0.00% | 26 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10990      +/-   ##
==========================================
- Coverage   71.33%   71.32%   -0.01%     
==========================================
  Files         760      760              
  Lines      152284   152301      +17     
  Branches   152284   152301      +17     
==========================================
+ Hits       108625   108633       +8     
- Misses      39187    39198      +11     
+ Partials     4472     4470       -2     
| Flag | Coverage | Δ |
|---|---|---|
| backward-compatibility | 0.24% <0.00%> | (-0.01%) ⬇️ |
| db-migration | 0.24% <0.00%> | (-0.01%) ⬇️ |
| genesis-check | 1.43% <0.00%> | (-0.01%) ⬇️ |
| integration-tests | 36.98% <0.00%> | (+<0.01%) ⬆️ |
| linux | 69.80% <0.00%> | (+<0.01%) ⬆️ |
| linux-nightly | 70.82% <0.00%> | (-0.01%) ⬇️ |
| macos | 54.29% <0.00%> | (-0.02%) ⬇️ |
| pytests | 1.66% <0.00%> | (-0.01%) ⬇️ |
| sanity-checks | 1.45% <0.00%> | (-0.01%) ⬇️ |
| unittests | 66.97% <0.00%> | (+<0.01%) ⬆️ |
| upgradability | 0.29% <0.00%> | (-0.01%) ⬇️ |


@marcelo-gonzalez marcelo-gonzalez added this pull request to the merge queue Apr 10, 2024
Merged via the queue into near:master with commit 20658bb Apr 10, 2024
@marcelo-gonzalez marcelo-gonzalez deleted the mirror-delay branch April 10, 2024 19:50