Make reactor.onReceive not block unmarshalling more packets from peer · Issue #3199 · cometbft/cometbft · GitHub
Make reactor.onReceive not block unmarshalling more packets from peer #3199
Open
@ValarDragon

Description

Feature Request

Summary

We want to be able to ingest packets from peers faster. There is some evidence (cc @evan-forbes) that our ability to ingest data from peers is limited by blocking behavior. As the explanation below shows, it should also be intuitively clear why this is a problem.

Here is a profile from osmosis mainnet taken last night over one hour, from recvRoutine:
[Image: profile of recvRoutine]

p2p.createMConnection.func1 is reactor.onReceive.

The flow we currently have is:

recvRoutine:
- Checks if flowrate is satisfied for max packet size (1024 bytes)
- Tries to read the next packet
- Protobuf-decodes the packet
- Finds the packet's corresponding channel
- Buffers the proto-decoded packet data
- If the buffer corresponds to a full logical packet
  - Runs reactor.OnReceive
- Goes back to the beginning

The issue is that reactor.OnReceive blocks reading and proto-decoding more data for all channels to that peer. Some messages, e.g. IBC txs, take over 5ms. This means that if a peer gossips you an IBC tx, it will take at least 5ms before you even attempt to decode the subsequent packets they send you. (This leads to IBC-enabled DoS vectors, amongst many others.)

We see that even under low load, CheckTx is already the dominant item here.

Another problem is that some consensus packets block on the cs mutex, which is locked during block execution and vote processing.

It's clear we need to split these into different processes.

Proposal

If we need to guarantee in-order delivery across each channel, then I think the answers are:

  • Short term: Use a Go channel within each channel to buffer incoming packets, and process them in a different thread than the recv routine.
  • Long term: Change the channel.onReceive API.
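
A rough sketch of the short-term option, assuming one buffered Go channel and one worker goroutine per p2p channel. `chanQueue`, `newChanQueue`, `enqueue`, and `stop` are hypothetical names for illustration, not existing CometBFT APIs. Because a single worker drains the buffer, delivery order within the channel is preserved.

```go
package main

import "sync"

// chanQueue is a hypothetical per-channel buffer; it is not a CometBFT type.
type chanQueue struct {
	msgs chan []byte
	wg   sync.WaitGroup
}

// newChanQueue starts one worker goroutine that hands messages to onReceive
// in FIFO order, off the recv routine.
func newChanQueue(size int, onReceive func(msg []byte)) *chanQueue {
	q := &chanQueue{msgs: make(chan []byte, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for msg := range q.msgs {
			onReceive(msg)
		}
	}()
	return q
}

// enqueue is what the recv routine would call instead of onReceive.
// It only blocks once the buffer is full, which acts as backpressure.
func (q *chanQueue) enqueue(msg []byte) {
	q.msgs <- msg
}

// stop closes the queue and waits for the worker to drain what is buffered.
func (q *chanQueue) stop() {
	close(q.msgs)
	q.wg.Wait()
}
```

The buffer size is a tuning knob: once it fills up, `enqueue` blocks the recv routine again, so a slow reactor still slows the peer down eventually, just not on every message.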

If we don't need to guarantee in-order delivery, and only need to achieve it in almost every case, then we can just call this function in a goroutine: https://github.com/cometbft/cometbft/blob/main/p2p/conn/connection.go#L671
