8000 Network metrics per client by iri031 · Pull Request #7445 · sigp/lighthouse · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Network metrics per client #7445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: unstable
Choose a base branch
from
Open

Conversation

iri031
Copy link
@iri031 iri031 commented May 12, 2025

Issue Addressed

Which issue # does this PR address?
#7389

Proposed Changes

Metrics for:

  • Blocks received and imported per client
  • Sync time per client
  • Bytes and messages received per client

@iri031 iri031 requested a review from jxs as a code owner May 12, 2025 20:18
@iri031 iri031 changed the title initial instrumentation Network metrics per client May 12, 2025
@@ -214,6 +214,19 @@ impl<T: BeaconChainTypes> NetworkBeaconProcessor<T> {
reprocess_tx: Option<mpsc::Sender<ReprocessQueueMessage>>,
seen_timestamp: Duration,
) {
let attestation_size = attestation.as_ssz_bytes().len() as u64;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calls force us to spend cycles re-serializing a lot of SSZ objects. To track this for free we should do the tracking in the lighthouse-network layer which I'm not sure has access to peer client information. If not possible to track this for free skip tracking bytes per client for now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to lighthouse-network


metrics::inc_counter_vec(
&metrics::MESSAGES_RECEIVED_PER_CLIENT,
&[&client]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should label this metric by [client, object_type] where object type could be

  • gossip_attestation
  • gossip_block
  • range_sync_block
  • backfill_sync_block

etc

@@ -241,6 +254,21 @@ impl<T: BeaconChainTypes> NetworkBeaconProcessor<T> {
packages: GossipAttestationBatch<T::EthSpec>,
reprocess_tx: Option<mpsc::Sender<ReprocessQueueMessage>>,
) {
8000 for package in &packages {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can move attestation metrics to process_gossip_attestation_result and only track valid objects

metrics::inc_counter_vec(
&metrics::MESSAGES_RECEIVED_PER_CLIENT,
&[&client]
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add a helper method

fn track_messages_received_per_client(&self, peer_id: PeerId, object_type: &'static str) {
    let client = self.network_globals.client(&peer_id).kind.to_string();
    metrics::inc_counter_vec(
            &metrics::MESSAGES_RECEIVED_PER_CLIENT,
            &[&client, object_type]
        );
}

to de-duplicate a lot of code

@@ -169,6 +169,11 @@ impl<E: EthSpec> PeerDB<E> {
}
}

/// Returns the sync start time of the peer if it exists
pub fn sync_start_time(&self, peer_id: &PeerId) -> Option<&Instant> {
self.peers.get(peer_id).and_then(|info| info.sync_start_time())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the meaning/purpose of sync start time, could you detail it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to calculate sync time per client. When a sync request is received I update this parameter to record the time when the sync started here and later when the state of the peer changes to is_synced here I calculate the time taken to include in the metric.

.peer_info(peer_id)
.map(|info| info.client().version.clone())
.unwrap_or_default()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a single metric, not one per client type and one for client version. In mainnet is version field set for most clients? Could you test this and report back? If so, to what?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I test for the mainnet?

@dapplion
Copy link
Collaborator

FYI @iri031 there are still some unaddressed comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0