Closed
Description
I'm polling a Docker network endpoint, /networks/my_network?verbose=true, once every second. After some time (it can take many days), the Docker daemon crashes with the following error in the logs:
fatal error: concurrent map read and map write
goroutine 29805570 [running]:
github.com/docker/docker/libnetwork/networkdb.(*NetworkDB).GetTableByNetwork(0xc0012987e0, {0x8c54cc, 0x12}, {0xc001380ce0, 0x19})
/go/src/github.com/docker/docker/libnetwork/networkdb/networkdb.go:426 +0x69
github.com/docker/docker/libnetwork.(*Network).Services(0xc0020a8e00)
/go/src/github.com/docker/docker/libnetwork/agent.go:497 +0x551
github.com/docker/docker/daemon.buildServiceAttachments(0x455620?)
/go/src/github.com/docker/docker/daemon/network.go:653 +0x3f
github.com/docker/docker/daemon.(*Daemon).GetNetworks(0x44f4a0?, {0xc0034afaa0?}, {0xaf?, 0xad?})
/go/src/github.com/docker/docker/daemon/network.go:595 +0x46d
github.com/docker/docker/api/server/router/network.(*networkRouter).getNetwork(0xc0018b20c0, {0x8c913a?, 0x13?}, {0xc8b2a0, 0xc000744d20}, 0xc00110e900, 0xc0034af740?)
/go/src/github.com/docker/docker/api/server/router/network/network_routes.go:121 +0x710
github.com/docker/docker/api/server/middleware.(*ExperimentalMiddleware).WrapHandler.ExperimentalMiddleware.WrapHandler.func1({0xc94520, 0xc0034af9b0}, {0xc8b2a0?, 0xc000744d20?}, 0x373de0?, 0xc000d98870?)
/go/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0xb4
github.com/docker/docker/api/server/middleware.(*VersionMiddleware).WrapHandler.VersionMiddleware.WrapHandler.func1({0xc94520, 0xc0034af8c0}, {0xc8b2a0, 0xc000744d20}, 0x40?, 0x40?)
/go/src/github.com/docker/docker/api/server/middleware/version.go:62 +0x2ae
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1({0xc94520, 0xc0034af8c0}, {0xc8b2a0?, 0xc000744d20?}, 0xc00110e900, 0x3b13460?)
/go/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0x683
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1({0xc8b2a0, 0xc000744d20}, 0xc00110e700)
/go/src/github.com/docker/docker/api/server/server.go:55 +0x1c3
net/http.HandlerFunc.ServeHTTP(0xc94520?, {0xc8b2a0?, 0xc000744d20?}, 0xc61318?)
/usr/local/go/src/net/http/server.go:2136 +0x29
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP(0xc00218af20, {0xc831d0?, 0xc002c58fc0}, 0xc00110e500, {0xc6a080, 0xc001a2b470})
/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:217 +0x1202
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1({0xc831d0?, 0xc002c58fc0?}, 0xc5ae01?)
/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:81 +0x35
net/http.HandlerFunc.ServeHTTP(0xc94520?, {0xc831d0?, 0xc002c58fc0?}, 0xc5ae28?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.HandlerFunc.ServeHTTP(0xc00110e400?, {0xc831d0?, 0xc002c58fc0?}, 0x7fcd4c6c7df0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0018903c0, {0xc831d0, 0xc002c58fc0}, 0xc00110e300)
/go/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:212 +0x1c5
net/http.serverHandler.ServeHTTP({0xc0018bd6e0?}, {0xc831d0?, 0xc002c58fc0?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc001f94000, {0xc94520, 0xc0011e0e70})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 796
/usr/local/go/src/net/http/server.go:3086 +0x5cb
goroutine 1 [semacquire, 22798 minutes, locked to thread]:
sync.runtime_Semacquire(0xc000f08a80?)
/usr/local/go/src/runtime/sema.go:62 +0x25
sync.(*WaitGroup).Wait(0xc000d9c080?)
/usr/local/go/src/sync/waitgroup.go:116 +0x48
main.(*DaemonCli).start(0xc000d9c080, 0xc0006cbf00)
/go/src/github.com/docker/docker/cmd/dockerd/daemon.go:350 +0x1cf7
main.runDaemon(...)
/go/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:13
main.newDaemonCommand.func1(0xc000d88100?, {0xc0001f00e0?, 0x7?, 0x88f9be?})
/go/src/github.com/docker/docker/cmd/dockerd/docker.go:37 +0x94
github.com/spf13/cobra.(*Command).execute(0xc000ad3800, {0xc000052100, 0xe, 0xe})
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:983 +0xabc
github.com/spf13/cobra.(*Command).ExecuteC(0xc000ad3800)
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:1039
main.main()
/go/src/github.com/docker/docker/cmd/dockerd/docker.go:106 +0x17b
...
<snipped, plenty more here, let me know if you're interested>
I'm also polling some other endpoints (/services, /tasks and /nodes) every 10s; I'm unsure whether that's related.
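For context, "concurrent map read and map write" is the Go runtime's fatal error for a plain map that one goroutine reads while another writes it without synchronization, and it cannot be caught with recover. The trace only shows that the read happened in GetTableByNetwork; which writer raced with it is not visible in the snipped output. The sketch below is not the libnetwork code. It is a minimal illustration, with hypothetical names (guardedTable, Set, Snapshot), of the usual way to make such a read safe: take a read lock and copy the data out.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedTable is a hypothetical stand-in for any map shared between
// goroutines. It is NOT the NetworkDB type from the stack trace.
type guardedTable struct {
	mu      sync.RWMutex
	entries map[string]string
}

func newGuardedTable() *guardedTable {
	return &guardedTable{entries: make(map[string]string)}
}

// Set writes one entry while holding the exclusive lock.
func (t *guardedTable) Set(key, value string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.entries[key] = value
}

// Snapshot copies the map while holding the read lock, so callers can
// iterate over the result without racing against concurrent writers.
func (t *guardedTable) Snapshot() map[string]string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	out := make(map[string]string, len(t.entries))
	for k, v := range t.entries {
		out[k] = v
	}
	return out
}

func main() {
	t := newGuardedTable()
	var wg sync.WaitGroup
	wg.Add(2)

	// Writer: plays the role of whatever updates the table in the background.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			t.Set(fmt.Sprintf("key-%d", i), "value")
		}
	}()

	// Reader: plays the role of the API handler that is polled repeatedly.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = t.Snapshot()
		}
	}()

	wg.Wait()
	fmt.Println("entries:", len(t.Snapshot()))
}
```

Without the locks, the same two goroutines could trigger the fatal error, but only when a read and a write happen to overlap, which would be consistent with a crash that takes days of polling to appear.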
Reproduce
I'm calling the endpoint directly from a script, but I guess docker network inspect --verbose x could trigger it as well.
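A rough sketch of the kind of polling loop described above, written in Go. This is not the actual script; the socket path /var/run/docker.sock, the network name my_network, and the helper names (networkURL, newUnixClient, pollOnce) are assumptions for illustration. It stops after a few polls so it can be tried safely; a real soak test would need to run for days.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/url"
	"os"
	"time"
)

// Assumption: the daemon listens on the default Linux socket path.
const socketPath = "/var/run/docker.sock"

// Assumption: a small poll count for demonstration only.
const polls = 5

// networkURL builds the verbose network inspect URL. The host part is a
// placeholder; the connection goes over the Unix socket.
func networkURL(name string) string {
	return "http://docker/networks/" + url.PathEscape(name) + "?verbose=true"
}

// newUnixClient returns an HTTP client that dials the given Unix socket
// regardless of the host in the request URL.
func newUnixClient(path string) *http.Client {
	return &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", path)
			},
		},
	}
}

// pollOnce issues one GET, drains the body, and returns the status code.
func pollOnce(c *http.Client, target string) (int, error) {
	resp, err := c.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, err
}

func main() {
	client := newUnixClient(socketPath)
	target := networkURL("my_network")

	ticker := time.NewTicker(time.Second) // once every second, as in the report
	defer ticker.Stop()

	for i := 0; i < polls; i++ {
		<-ticker.C
		status, err := pollOnce(client, target)
		if err != nil {
			fmt.Fprintln(os.Stderr, "poll failed:", err)
			continue
		}
		fmt.Println("status:", status)
	}
}
```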
Expected behavior
The daemon should not crash.
docker version
Client:
Version: 25.0.5
API version: 1.44
Go version: go1.21.8
Git commit: 5dc9bcc
Built: Tue Mar 19 15:04:17 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 25.0.5
API version: 1.44 (minimum version 1.24)
Go version: go1.21.8
Git commit: e63daec
Built: Tue Mar 19 15:05:39 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.13
GitCommit: 7c3aca7a610df76212171d200ca3811ff6096eb8
runc:
Version: 1.1.12
GitCommit: v1.1.12-0-g51d5e94
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 25.0.5
Context: default
Debug Mode: false
Server:
Containers: 33
Running: 9
Paused: 0
Stopped: 24
Images: 69
Server Version: 25.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: wsfcvwmwg65zho2l63hon1wsb
Is Manager: false
Node Address: 10.0.1.33
Manager Addresses:
10.0.1.31:2377
Runtimes: runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
runc version: v1.1.12-0-g51d5e94
init version: de40ad0
Security Options:
seccomp
Profile: builtin
Kernel Version: 6.6.33-0-lts
Operating System: Alpine Linux v3.19
OSType: linux
Architecture: x86_64
CPUs: 40
Total Memory: 31.28GiB
Name: x
ID: 84d016f3-5708-4b31-9437-3d147ee97829
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: x
Experimental: false
Insecure Registries:
10.0.1.31:5000
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
Additional Info
No response