Description
Bug Report
Setup
CometBFT version (use cometbft version or git rev-parse --verify HEAD if installed from source):
v1.0.0
Have you tried the latest version: yes/no
Yes, tried HEAD
ABCI app (name for built-in, URL for self-written if it's publicly available):
n/a
Environment:
- OS (e.g. from /etc/os-release):
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://gitlab.archlinux.org/groups/archlinux/-/issues"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo
- Install tools:
go mod
What happened?
When using rpchttp.New to connect to an RPC endpoint and its websocket connection, the default settings passed to the underlying gorilla websocket connection make it hard for a client to tell whether the websocket connection is healthy. This can lead to dead connections with no logging or information available to the application.
cometbft/rpc/jsonrpc/client/ws_client.go
Lines 23 to 28 in ce344cc
As the comments on these settings describe, they lead to the websocket connection blocking forever and sending no pings.
cometbft/rpc/jsonrpc/client/ws_client.go
Lines 71 to 78 in ce344cc
Furthermore, even if you pass a logger through to the client, there is no easy way to tell whether a connection is unhealthy.
This can lead to clients thinking a connection is healthy when in fact it is dead and events are being missed.
What did you expect to happen?
I would expect a client using rpchttp.New to have some way of understanding whether a websocket connection is healthy and to be able to act on it. This could mean at least adjusting the ping interval, reconnecting, and logging when a ping deadline has passed. Ideally anyone using the rpchttp client would be able to override the defaults too. Currently NewWS within ws_client.go hard codes the defaults. 🧇
How to reproduce it
This example is against HEAD, which uses slog as the logging library and has slightly different interfaces for initialising a logger:
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/cometbft/cometbft/libs/log"
	rpchttp "github.com/cometbft/cometbft/rpc/client/http"
)

const (
	RPCEndpoint = "https://rpc.osmosis.zone:443"
	// Subscriber is an arbitrary string that can be used to manage a subscription
	Subscriber = "gobot"
	// Query is the query to subscribe to events matching the query
	Query = "token_swapped.module = 'gamm'"
)

func main() {
	// Create a new Tendermint RPC client
	c, err := rpchttp.New(RPCEndpoint)
	if err != nil {
		panic(err)
	}
	logger := log.NewJSONLogger(os.Stdout)
	c.SetLogger(logger)
	if err := c.Start(); err != nil {
		panic(err)
	}

	// Create a context for the subscription
	ctx := context.Background()

	// Subscribe to the WebSocket connection
	eventCh, err := c.Subscribe(ctx, Subscriber, Query)
	if err != nil {
		panic(err)
	}

	go func() {
		for {
			event := <-eventCh
			fmt.Printf("tokens swapped in: %+v\n", event.Events["token_swapped.tokens_in"])
		}
	}()

	select {}
}
- Start the example with go run main.go
- Observe that events are received and logged
- Kill your network connection somehow (e.g. turn off wifi or sudo iptables -A OUTPUT -p tcp --dport 443 -j DROP)
- Observe that no events are received
- Wait 30 seconds
- Observe that no logs or other information tell the application that the connection is unhealthy, and that it blocks forever
- Restore your network connection (e.g. turn on wifi or sudo iptables -D OUTPUT -p tcp --dport 443 -j DROP)
- Observe that events are received again
There should be some way to understand that there was a read or write timeout or that a ping timed out so that a client using the library can make a decision on what to do.