8000 Logging websocket connection issues · Issue #4896 · cometbft/cometbft · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Logging websocket connection issues #4896

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
shapeshed opened this issue Jan 30, 2025 · 2 comments
Open

Logging websocket connection issues #4896

shapeshed opened this issue Jan 30, 2025 · 2 comments
Labels
bug Something isn't working needs-triage This issue/PR has not yet been triaged by the team.

Comments

@shapeshed
Copy link
shapeshed commented Jan 30, 2025

Bug Report

Setup

CometBFT version (use cometbft version or git rev-parse --verify HEAD if installed from source):

v1.0.0

Have you tried the latest version: yes/no

Yes tried HEAD

ABCI app (name for built-in, URL for self-written if it's publicly available):

n/a

Environment:

  • OS (e.g. from /etc/os-release):
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://gitlab.archlinux.org/groups/archlinux/-/issues"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo
  • Install tools:

go mod

What happened?

When using rpchttp.New to connect to an rpc endpoint and websocket connection, the default settings passed to the underlying gorilla websocket connection make it hard for a client using the rpchttp.New method to understand if the websocket connection is healthy or not. This can lead to dead connections with no logging or information available to the application.

const (
defaultMaxReconnectAttempts = 25
defaultWriteWait = 0
defaultReadWait = 0
defaultPingPeriod = 0
)

These settings as per the comments lead to the websocket connection blocking forever and sending no pings

// Time allowed to write a message to the server. 0 means block until operation succeeds.
writeWait time.Duration
// Time allowed to read the next message from the server. 0 means block until operation succeeds.
readWait time.Duration
// Send pings to server with this period. Must be less than readWait. If 0, no pings will be sent.
pingPeriod time.Duration

Furthermore even if you pass a logger through to the client there is no easy way to understand if a connection is unhealthy.

This can lead to clients thinking a connection is healthy but in fact it is dead and events are missed.

What did you expect to happen?

I would expect a client using rpchttp.New to have some way of understand if a Websocket connection is healthy and being able to act on it. This could mean adjusting at least the ping interval, reconnecting and logging if it has passed. Ideally anyone using the rpchttp client would be able to override the defaults too. NewWS within ws_client.go hard codes the defaults. 🧇

How to reproduce it

This example is against HEAD which has slog as the logging library and slightly different interfaces for initialising a logger

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/cometbft/cometbft/libs/log"
	rpchttp "github.com/cometbft/cometbft/rpc/client/http"
)

const (
	RPCEndpoint = "https://rpc.osmosis.zone:443"

	// Subscriber is an arbitrary string that can be used to manage a subscription
	Subscriber = "gobot"

	// Query is the query to subscribe to events matching the query
	Query = "token_swapped.module = 'gamm'"
)

func main() {
	// Create a new Tendermint RPC client
	c, err := rpchttp.New(RPCEndpoint)
	if err != nil {
		panic(err)
	}

	logger := log.NewJSONLogger(os.Stdout)

	c.SetLogger(logger)

	if err := c.Start(); err != nil {
		panic(err)
	}

	// Create a context for the subscription
	ctx := context.Background()

	// Subscribe to the WebSocket connection
	eventCh, err := c.Subscribe(ctx, Subscriber, Query)
	if err != nil {
		panic(err)
	}

	go func() {
		for {
			event := <-eventCh
			fmt.Printf("tokens swapped in: %+v\n", event.Events["token_swapped.tokens_in"])
		}
	}()

	select {}
}
  1. Start the example with go run main.go
  2. Observe that events are received and logged
  3. Kill your network connection somehow (e.g. turn off wifi or sudo iptables -A OUTPUT -p tcp --dport 443 -j DROP)
  4. Observe that no events are received
  5. Wait 30 seconds
  6. Observe that no logs or information is available to the application that the connection is unhealthy and it blocks forever
  7. Kill your network connection somehow (e.g. turn on wifi or sudo iptables -D OUTPUT -p tcp --dport 443 -j DROP)
  8. Observe that events are received again

There should be some way to understand that there was a read or write timeout or that a ping timed out so that a client using the library can make a decision on what to do.

@shapeshed shapeshed added bug Something isn't working needs-triage This issue/PR has not yet been triaged by the team. labels Jan 30, 2025
@mmsqe
Copy link
Contributor
mmsqe commented Feb 10, 2025

Seems you could try pass these 3 functions as the last options params of NewWS

func ReadWait(readWait time.Duration) func(*WSClient) {
return func(c *WSClient) {
c.readWait = readWait
}
}
// WriteWait sets the amount of time to wait before a websocket write times out.
// It should only be used in the constructor and is not Goroutine-safe.
func WriteWait(writeWait time.Duration) func(*WSClient) {
return func(c *WSClient) {
c.writeWait = writeWait
}
}
// PingPeriod sets the duration for sending websocket pings.
// It should only be used in the constructor - not Goroutine-safe.
func PingPeriod(pingPeriod time.Duration) func(*WSClient) {
return func(c *WSClient) {
c.pingPeriod = pingPeriod
}
}

@shapeshed
Copy link
Author

I think it is unlikely that there are incentives to provide an api to assess websocket connection stability within this project so unless there is appetite to address this for users of the default websocket client this issue can be closed.

@MMDQE thanks - yes that's definitely doable. my use case is making sure that the client can respond in the case of a faulty websocket connection. The cometbft client will log the error via the logger with this method but imho a better design would be to push onto a channel rather than log so that a client can make decisions about what to do in the case a websocket connection is faulty.

I made a start on a prototype here
https://github.com/shapeshed/tx_search_gorilla

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-triage This issue/PR has not yet been triaged by the team.
Projects
None yet
Development

No branches or pull requests

2 participants
0