-
Notifications
You must be signed in to change notification settings - Fork 18.8k
libn/networkdb: fix data race in GetTableByNetwork #49937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The function was accessing the index map without holding the mutex, so it would race any mutation to the database indexes. Fetch the reference to the tree's root while holding a read lock. Since the radix tree is immutable, taking a reference to the root is equivalent to starting a read-only database transaction, providing a consistent view of the data at a snapshot in time, even as the live state is mutated concurrently. Also optimize the WalkTable function by leveraging the immutability of the radix tree. Signed-off-by: Cory Snider <csnider@mirantis.com>
I saw the modprobe-not-found failures in one rootless run yesterday too ...
Not sure why they'd be flaky - modprobe not found seems like something that should happen always or never. So, it'll need looking at.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
params := strings.Split(string(path[1:]), "/") | ||
nid := params[1] | ||
key := params[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but would this be something we could replace for strings.Cut
, or do we expect more than one /
(and only need the first two components here)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indices start at 0. params
is a 3-tuple; we are extracting the second and third fields. I guess we could do something like the following if we really wanted to.
nid, key, _ = strings.Cut(string(path[len(tname)+2]), "/")
done := make(chan struct{}) | ||
defer close(done) | ||
for _, db := range dbs { | ||
go func(db *NetworkDB) { | ||
for { | ||
select { | ||
case <-done: | ||
return | ||
default: | ||
} | ||
_ = db.GetTableByNetwork("test_table", "network1") | ||
} | ||
}(db) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely non-blocking; guess we could make this use a context to make it slight more to the point
// Shake out any data races.
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
for _, db := range dbs {
go func(db *NetworkDB) {
for ctx.Err() == nil {
_ = db.GetTableByNetwork("test_table", "network1")
}
}(db)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. Also https://pkg.go.dev/testing#T.Context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! I was just looking at that one earlier Today, but we need to wait for go1.24 (which should be possible now that 1.24.3 is released). I was looking for it to be used in our integration tests, but we have a setupTest
that depends on the TestMain
to set up a common context, which doesn't allow creating one (testing.M
doesn't have the Context()
😂
nDB.RLock() | ||
root := nDB.indexes[byNetwork].Root() | ||
nDB.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting; so we have a read-lock here, but there's no lock in 2 places I found that mutate this?
moby/libnetwork/networkdb/networkdb.go
Lines 761 to 763 in d71afd7
func (nDB *NetworkDB) createOrUpdateEntry(nid, tname, key string, v *entry) (okTable bool, okNetwork bool) { | |
nDB.indexes[byTable], _, okTable = nDB.indexes[byTable].Insert([]byte(fmt.Sprintf("/%s/%s/%s", tname, nid, key)), v) | |
nDB.indexes[byNetwork], _, okNetwork = nDB.indexes[byNetwork].Insert([]byte(fmt.Sprintf("/%s/%s/%s", nid, tname, key)), v) |
moby/libnetwork/networkdb/networkdb.go
Lines 776 to 778 in d71afd7
func (nDB *NetworkDB) deleteEntry(nid, tname, key string) (okTable bool, okNetwork bool) { | |
nDB.indexes[byTable], _, okTable = nDB.indexes[byTable].Delete([]byte(fmt.Sprintf("/%s/%s/%s", tname, nid, key))) | |
nDB.indexes[byNetwork], _, okNetwork = nDB.indexes[byNetwork].Delete([]byte(fmt.Sprintf("/%s/%s/%s", nid, tname, key))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ignore me; from #44501
The major changes compared to the old radix are when modifying (insert or delete) a tree, and those are pretty self-contained: we replace the entire immutable tree under a lock.
Looks like caller is acquiring a lock indeed. although there's some odd bits because it's doing the swap inside of a nDB.indexes[byNetwork].Root().WalkPrefix([]byte("/"+nid),
🤔 .. I recall now that there was some weird stuff in that area that originally caused some panics the "mutable" radix, well 🤷
- What I did
- How I did it
The function was accessing the index map without holding the mutex, so it would race any mutation to the database indexes. Fetch the reference to the tree's root while holding a read lock. Since the radix tree is immutable, taking a reference to the root is equivalent to starting a read-only database transaction, providing a consistent view of the data at a snapshot in time, even as the live state is mutated concurrently.
Also optimize the WalkTable function by leveraging the immutability of the radix tree.
- How to verify it
go test -race
- Human readable description for the release notes
- A picture of a cute animal (not mandatory but encouraged)