8000 fixed broadcasting AccountData after node restart. by pompon0 · Pull Request #8407 · near/nearcore · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fixed broadcasting AccountData after node restart. #8407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jan 25, 2023

Conversation

pompon0
Copy link
Contributor
@pompon0 pompon0 commented Jan 20, 2023

AccountData is broadcasted by validator node periodically (every 15min).
AccountData with greatest version field is considered the freshest.
We do not store the latest broadcasted version in storage, so each time node is restarted, is starts with version = 1, unless it learns the latest broadcasted version (from the previous execution) from the network.
It means that it may happen that the correct AccountData will be broadcasted only after 15min from restart.

This in turn means that the validator might be not reachable (messages cannot be routed to it) for 15min which is unacceptable. We fix that by making the validator node broadcast a new version of AccountData as soon as it learns about stale AccountData (but with higher version) from the network.

Alternative solutions considered are described in the accounts_data/mod.rs (see documentation of is_new() function).

@pompon0 pompon0 requested a review from a team as a code owner January 20, 2023 11:17
@pompon0 pompon0 requested a review from mm-near January 20, 2023 11:17
@pompon0 pompon0 requested a review from saketh-are January 20, 2023 13:06
if !self.is_n 10000 ew(&d) {
return None;
}
let d = match &self.local {
Some(local) if d.account_key == local.signer.public_key() => Arc::new(
VersionedAccountData {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we do it only if account data differs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could, but we don't care. If we encounter anything that we don't know that we have signed, we override it. It is a form of defense in depth - especially since the underlying proto may contain extra information that we cannot interpret (the received data might have been signed by a different neard version).

@pompon0 pompon0 requested a review from mm-near January 25, 2023 13:22
< 8000 input type="hidden" name="id" value="C_kwDOCQUkYtoAKGQ3OWM1MWU1YWE4ZjQ2NTlmMjRkNzVkN2ExOTBjOTBhMDA2ZjA2NDQ" autocomplete="off" data-targets="batch-deferred-content.inputs" />
@near-bulldozer near-bulldozer bot merged commit 88d6d5b into master Jan 25, 2023
@near-bulldozer near-bulldozer bot deleted the gprusak-override-data branch January 25, 2023 14:02
near-bulldozer bot pushed a commit that referenced this pull request Jan 25, 2023
This is a roll forward of #8182 ,
which was reverted by #8314 .

The issue that #8182 caused has been fixed in #8407 .
nayduck run: https://nayduck.near.org/#/run/2834
before the fix tests/sanity/switch_node_key.py was failing, now it succeeds.
No nayduck tests are newly failing with this PR, compared to master.
marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request Jan 26, 2023
AccountData is broadcasted by validator node periodically (every 15min).
AccountData with greatest `version` field is considered the freshest.
We do not store the latest broadcasted version in storage, so each time node is restarted, is starts with version = 1, unless it learns the latest broadcasted version (from the previous execution) from the network.
It means that it may happen that the correct AccountData will be broadcasted only after 15min from restart.

This in turn means that the validator might be not reachable (messages cannot be routed to it) for 15min which is unacceptable. We fix that by making the validator node broadcast a new version of AccountData as soon as it learns about stale AccountData (but with higher version) from the network.

Alternative solutions considered are described in the accounts_data/mod.rs (see documentation of is_new() function).
marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request Jan 26, 2023
This is a roll forward of near#8182 ,
which was reverted by near#8314 .

The issue that near#8182 caused has been fixed in near#8407 .
nayduck run: https://nayduck.near.org/#/run/2834
before the fix tests/sanity/switch_node_key.py was failing, now it succeeds.
No nayduck tests are newly failing with this PR,
8000
 compared to master.
marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request Jan 26, 2023
AccountData is broadcasted by validator node periodically (every 15min).
AccountData with greatest `version` field is considered the freshest.
We do not store the latest broadcasted version in storage, so each time node is restarted, is starts with version = 1, unless it learns the latest broadcasted version (from the previous execution) from the network.
It means that it may happen that the correct AccountData will be broadcasted only after 15min from restart.

This in turn means that the validator might be not reachable (messages cannot be routed to it) for 15min which is unacceptable. We fix that by making the validator node broadcast a new version of AccountData as soon as it learns about stale AccountData (but with higher version) from the network.

Alternative solutions considered are described in the accounts_data/mod.rs (see documentation of is_new() function).
marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request Jan 26, 2023
This is a roll forward of near#8182 ,
which was reverted by near#8314 .

The issue that near#8182 caused has been fixed in near#8407 .
nayduck run: https://nayduck.near.org/#/run/2834
before the fix tests/sanity/switch_node_key.py was failing, now it succeeds.
No nayduck tests are newly failing with this PR, compared to master.
ppca pushed a commit to ppca/nearcore that referenced this pull request Jan 30, 2023
AccountData is broadcasted by validator node periodically (every 15min).
AccountData with greatest `version` field is considered the freshest.
We do not store the latest broadcasted version in storage, so each time node is restarted, is starts with version = 1, unless it learns the latest broadcasted version (from the previous execution) from the network.
It means that it may happen that the correct AccountData will be broadcasted only after 15min from restart.

This in turn means that the validator might be not reachable (messages cannot be routed to it) for 15min which is unacceptable. We fix that by making the validator node broadcast a new version of AccountData as soon as it learns about stale AccountData (but with higher version) from the network.

Alternative solutions considered are described in the accounts_data/mod.rs (see documentation of is_new() function).
ppca pushed a commit to ppca/nearcore that referenced this pull request Jan 30, 2023
This is a roll forward of near#8182 ,
which was reverted by near#8314 .

The issue that near#8182 caused has been fixed in near#8407 .
nayduck run: https://nayduck.near.org/#/run/2834
before the fix tests/sanity/switch_node_key.py was failing, now it succeeds.
No nayduck tests are newly failing with this PR, compared to master.
@pompon0 pompon0 linked an issue Feb 11, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[network] Race condition when creating account data
2 participants
0