8000 node: call CreateSnapshot explicitly when node starts by yoomee1313 · Pull Request #1829 · klaytn/klaytn · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Aug 19, 2024. It is now read-only.

node: call CreateSnapshot explicitly when node starts #1829

Merged
merged 1 commit into from
Jun 6, 2023

Conversation

yoomee1313
Copy link
Contributor
@yoomee1313 yoomee1313 commented Apr 18, 2023

Proposed changes

  • gov.changeSet
    • gov.changeSet is one of the governance state cache: the vote data which is not stored in governance db yet.
    • It is a memory value. So, if the node shut down before it is flushed into gov db at governance block, it is deleted.
    • but it would restored well when the node restarts during the first snapshot call.
  • However, after fix debug.traceBlock not to remove gov.changeSet #1706 is applied, missing gov.changeSet problem appears
    • if the first snapshot is called by getValidator, it creates the snapshot caches without restoring gov.changeSet (because the snapshot called by getValidator is not able to write governance data)
    • the next snapshot call skip the restoration part always whether it is able to write governance data or not because there's already snapshot cache.
    • and it leads to a verification error due to the mismatch between gov.changeSet and blockHeader.governanceData.
  • So, the solution is calling snapshot explictly before sync starts to ensure the first snapshot call restores gov.changeSet always.
    • currentSet or idxCache are restored during intializeCache, but changeSet does not.

dev

  • vote at block 34 and restart immediately
  • at the next epoch start block (e.g. 60), it returned verification error
ERROR[05/05,09:01:42 Z] [30] Verification Error                        len(receivedChangeSet)=1 len(changeSet)=0
ERROR[05/05,09:01:42 Z] [5]
########## BAD BLOCK #########
...

PR

  • vote at block 34 and restart immediately
  • at the next epoch start block (e.g. 60), it handles governance data properly with out error.

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • New feature or enhancement
  • Others

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING GUIDELINES doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes ($ make test)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Related issues

  • Please leave the issue numbers or links related to this PR here.

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

@yoomee1313 yoomee1313 requested a review from JayChoi1736 May 16, 2023 06:12
@yoomee1313 yoomee1313 force-pushed the fix-missing-vote2 branch from 71080f4 to f3605ce Compare May 17, 2023 05:50
@yoomee1313 yoomee1313 changed the title consensus: revoke to call writable snapshot function (called by getValidators) node: call CreateSnapshot explicitly when node starts to restore votings May 17, 2023
@yoomee1313 yoomee1313 marked this pull request as ready for review May 17, 2023 06:23
@yoomee1313 yoomee1313 requested a review from aidan-kwon May 17, 2023 06:26
blukat29
blukat29 previously approved these changes May 18, 2023
Copy link
Member
@aidan-kwon aidan-kwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, but I cannot understand some points. Can you explain more?

if no such snapshot function is called until the next governance block is processed,

It seems that a node calls snapshot() on every block to verifyHeader.

this problem is caused by #1706

The PR disables "write" in some cases, but you are saying that there is an issue on "read" (restoration of vote data). So, I am confused.

@yoomee1313
Copy link
Contributor Author

It seems that a node calls snapshot() on every block to verifyHeader.

In a syncing situation, a snapshot is called, governance state data is writable, so governance state data is written(e.g. changeSet, currentSet). But the problem is that if the snapshot already exists, it will be hit and returned directly from the cache, and the code that writes the governance state data is not executed. (When I say that the snapshot already exists, I am assuming that the code has been run once when creating the snapshot.)
but the problem is, when the first snapshot call is non-writable at governance state data, the assumption the code has been run once is broken.

On second thought, I realized that �it is good to change the implementation, because the fundamental solution is to decouple the governance state data recovery (specifically changeSet) from the snapshot call.

  • implement changeSet recovery logic in initializeCache. (but if not possible, i will go back to original solution).

@yoomee1313 yoomee1313 marked this pull request as draft May 19, 2023 10:51
- latest synced blocks are the blocks from lastGovernanceStateBlock to currentBlock
- in-memory snapshots and related gov states are missed when node stops, so restore it when node starts
@yoomee1313 yoomee1313 changed the title node: call CreateSnapshot explicitly when node starts to restore votings node: call CreateSnapshot explicitly when node starts May 23, 2023
@yoomee1313 yoomee1313 marked this pull request as ready for review May 30, 2023 10:29
@yoomee1313
Copy link
Contributor Author

Dear reviewers, please review this PR one more time. I'll go as #1829 (comment).

@yoomee1313 yoomee1313 merged commit 89ed176 into klaytn:dev Jun 6, 2023
@yoomee1313 yoomee1313 deleted the fix-missing-vote2 branch June 6, 2023 01:30
@JayChoi1736 JayChoi1736 mentioned this pull request Jul 24, 2023
20 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0