8000 Migrate remaining routing tests to network crate by saketh-are · Pull Request #8168 · near/nearcore · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Migrate remaining routing tests to network crate #8168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

saketh-are
Copy link
Collaborator

Follow up to #8073, #8109, and #8110 for the two remaining tests in integration-tests/src/tests/network/routing.rs.

),
)
}
_ => None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a loop here? Do you expect some connection to fail at first try?
Also you do not set config.allow_outbound, so no spontaneous connections should be established, right?
Why do you expect AlreadyConnected then?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I encountered some issues which I now understand to be timing-related. The old integration test handles them by waiting 50 ms after making each connection (link). As an example, if node 0 has 3 connections and does not finish dropping down to 2 connections before a 4th node attempts to connect, it will reject the new connection due to the configured max_num_peers = 3.

To resolve it I've removed the loop and instead inserted some pm0.wait_for_num_connected_peers(2).await; in the appropriate places.

However, there are still some non-deterministic occurrences of AlreadyConnected which I am trying to track down.

Copy link
Collaborator Author
@saketh-are saketh-are Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A system-seeded rng is used to pick the connections to drop (link). The way this test is written the connections from nodes 2,3,4 can by chance survive until they are re-attempted.

I propose rewriting the test a bit; we can check pm0's connected peers and always select a non-connected peer to attempt the next connection. It still achieves the main goal of testing that connections to archival nodes are retained preferentially. This test's purpose was not to check that peer manager rejects duplicate connections.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rewritten the test as proposed. I also changed the test to monitor pm0's event stream for closed connections with ClosingReason::PeerManager instead of using wait_for_num_connected_peers(2). It's more reliable since when a new connection is made we go from 2->3->2 connections and if we check for number of connections it's possible we see the first state when we really want to check the final state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more tweak; depending on timing of monitor_peers_trigger, peer manager might drop an inbound connection "gracefully" even before the handshake is completed (link). However, the existing connect_to in peer_manager/testonly panics if it does not see a HandshakeCompleted event. I've switched this test to use a new send_outbound_connect so that the other peer managers simply initiate connections without waiting for HandshakeCompleted. We still check for a closed connection on pm0 side for ClosingReason::PeerManager which should maintain correctness of this test.

use near_network::time;

#[test]
fn account_propagation() -> anyhow::Result<()> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is redundant with the do_not_block_announce_account_broadcast test.

@saketh-are saketh-are marked this pull request as ready for review December 7, 2022 00:01
@saketh-are saketh-are requested a review from a team as a code owner December 7, 2022 00:01
@saketh-are saketh-are requested a review from mm-near December 7, 2022 00:01
@saketh-are saketh-are removed the request for review from mm-near December 7, 2022 00:02
@saketh-are saketh-are requested a review from pompon0 December 7, 2022 13:58
@near-bulldozer near-bulldozer bot merged commit a0def6d into near:master Dec 7, 2022
@pugachAG
Copy link
Contributor

@saketh-are it seems like the added test is flaky: zulip thread, could you please take a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3644
3 participants
0