Fix Quorum by jiseongnoh · Pull Request #2026 · klaytn/klaytn · GitHub
This repository was archived by the owner on Aug 19, 2024. It is now read-only.

Fix Quorum #2026

Merged
merged 10 commits into from
Nov 13, 2023

Conversation

jiseongnoh
Contributor

Proposed changes

This PR addresses a consensus instability issue in networks with 4x+1 validators. It updates the quorum calculation from the previous 2F+1 to math.Ceil(float64(2*valSet.Size())/3), aligning with QBFT requirements for enhanced network stability and Byzantine fault tolerance.

Closes #2023.
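
As a rough illustration of the two formulas named above, the following sketch tabulates both quorum values for a few validator counts. The helper names `legacyQuorum` and `ceilQuorum` are hypothetical, and the definition F = floor((N-1)/3) is an assumption (the usual BFT bound); the actual `valSet.F()` implementation is not shown in this PR.

```go
package main

import (
	"fmt"
	"math"
)

// legacyQuorum sketches the previous 2F+1 rule.
// Assumption: F = floor((n-1)/3); the real valSet.F() may be defined differently.
func legacyQuorum(n int) int {
	f := (n - 1) / 3
	return 2*f + 1
}

// ceilQuorum sketches the new rule, Ceil(2N/3), using the same
// float-based expression that the PR description quotes.
func ceilQuorum(n int) int {
	return int(math.Ceil(float64(2*n) / 3))
}

func main() {
	// Print both quorum values side by side for small validator sets.
	for n := 4; n <= 10; n++ {
		fmt.Printf("N=%d  2F+1=%d  Ceil(2N/3)=%d\n", n, legacyQuorum(n), ceilQuorum(n))
	}
}
```

Under this assumption the two values coincide when N has the form 3F+1 (4, 7, 10) and the new rule asks for one more message otherwise.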

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • New feature or enhancement
  • Others

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING GUIDELINES doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes ($ make test)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

@jiseongnoh marked this pull request as draft November 8, 2023 09:10
@jiseongnoh marked this pull request as ready for review November 9, 2023 03:25
@jiseongnoh self-assigned this Nov 9, 2023
@@ -428,20 +428,18 @@ func PrepareCommittedSeal(hash common.Hash) []byte {
}

// Minimum required number of consensus messages to proceed
-func requiredMessageCount(valSet istanbul.ValidatorSet) int {
+func RequiredMessageCount(valSet istanbul.ValidatorSet) int {
Contributor

Is this counting change safe without HF?
Suppose exactly 2F+1 nodes are operational in the network, for example 5 active nodes out of 8 in total.

Suppose further that half of the nodes still compute the required message count as 2*valSet.F() + 1 and therefore expect 5 messages, while the other half have been updated to int(math.Ceil(float64(2*size) / 3)) and expect 6 messages.

Until now, the chain has been able to progress continuously in this situation. After the change, however, the updated half of the nodes may stop advancing the chain. Doesn't this change carry that risk?
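
The scenario in this comment can be sketched as a small check: 8 validators in total, 5 of them active, and two groups of nodes that apply different quorum rules. All helper names here are hypothetical, and F = floor((N-1)/3) is an assumption rather than the confirmed behavior of `valSet.F()`.

```go
package main

import "fmt"

// oldQuorum sketches the 2F+1 rule.
// Assumption: F = floor((n-1)/3); valSet.F() itself is not shown in this PR.
func oldQuorum(n int) int { return 2*((n-1)/3) + 1 }

// newQuorum is an integer form of Ceil(2N/3).
func newQuorum(n int) int { return (2*n + 2) / 3 }

// canProceed reports whether a node that requires `quorum` messages
// can advance when only `active` validators are sending messages.
func canProceed(active, quorum int) bool { return active >= quorum }

func main() {
	const total, active = 8, 5

	fmt.Printf("old rule: need %d, proceeds=%v\n",
		oldQuorum(total), canProceed(active, oldQuorum(total)))
	fmt.Printf("new rule: need %d, proceeds=%v\n",
		newQuorum(total), canProceed(active, newQuorum(total)))
}
```

With these assumptions, a node on the old rule needs 5 messages and keeps going, while a node on the new rule needs 6 and waits, which is the stall the question is about.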

Contributor Author

The update to the quorum calculation without a hard fork is designed to be low-risk.

  1. Cypress Network Stability: The Cypress mainnet operates 31 committee nodes, which fits the 3F+1 configuration (F = 10). Both the 2F+1 and Ceil(2N/3) formulas therefore yield the same quorum of 21, so the change poses no risk there.
  2. ServiceChain Network Resilience: ServiceChains with other validator counts could run into problems under extreme conditions. However, the problem resolves once most nodes have updated the binary. This is acceptable given the complexity of a hard fork.

In short, this approach accepts a minor relaxation of the BFT principle as a trade-off for keeping the network uninterrupted and the update process quick.
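
To see which committee sizes are affected by a mixed-version rollout, a sketch like the following lists the sizes where the two formulas disagree. The function names are hypothetical and F = floor((N-1)/3) is an assumption; the real `valSet.F()` is not part of this diff.

```go
package main

import "fmt"

// twoFPlusOne sketches the old rule.
// Assumption: F = floor((n-1)/3).
func twoFPlusOne(n int) int { return 2*((n-1)/3) + 1 }

// ceilTwoThirds is an integer form of Ceil(2N/3).
func ceilTwoThirds(n int) int { return (2*n + 2) / 3 }

// mismatchedSizes returns the committee sizes in [lo, hi]
// for which the two quorum formulas give different results.
func mismatchedSizes(lo, hi int) []int {
	var out []int
	for n := lo; n <= hi; n++ {
		if twoFPlusOne(n) != ceilTwoThirds(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// A 31-node committee (3F+1 with F = 10) gets the same quorum either way.
	fmt.Println(twoFPlusOne(31), ceilTwoThirds(31)) // prints "21 21"

	// Sizes that are not of the form 3F+1 are the ones that differ.
	fmt.Println(mismatchedSizes(4, 12)) // prints "[5 6 8 9 11 12]"
}
```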

Contributor Author

It is also worth noting that under normal operating conditions, the number of validators participating in consensus typically exceeds the quorum required by both the 2F+1 and Ceil(2*N/3) formulas. This surplus of committed seals means that a discrepancy between the two quorum calculations will not compromise the consensus process.

Contributor

Got it, thanks.

@jiseongnoh merged commit 05974fc into klaytn:dev Nov 13, 2023
Development

Successfully merging this pull request may close these issues.

Quorum Calculation Issue
5 participants