p2p: investigate why re-dialing persistent peers consumes so many resources · Issue #3267 · cometbft/cometbft · GitHub
p2p: investigate why re-dialing persistent peers consumes so many resources #3267
Open
@cason

Description


Node operators report that while the p2p layer is attempting to reconnect to a persistent peer (typically because the peer is unavailable, offline, etc.), the overall performance of the node degrades substantially. This is especially relevant in networks with short block times, where operators observe longer block times and proposers failing to get their blocks committed.

The method responsible for persistently attempting to dial a peer address is p2p.Switch.reconnectToPeer(*NetAddress). There is nothing really special about it in terms of resource consumption. Its main calls are the dial itself, made through the same p2p.Switch.DialPeerWithAddress(*NetAddress) used to dial any address, and sleeps.

The re-dialing follows a standard, hard-coded procedure, summarized here. There are 20 attempts at linear intervals (5s plus a random jitter of up to 3s); after that the intervals become exponential, following increasing powers of 3s with the same jitter. At most 10 attempts are made with exponential intervals, so at most 30 attempts are made in total.

Operators have proposed several times that the parameters used by this procedure be turned into configuration parameters.

But this issue should focus, in my opinion, on understanding the source of the overhead that has been observed.

Labels: bug, p2p
