-
Notifications
You must be signed in to change notification settings - Fork 880
networking: ensure the netns directory is mounted #3761
Conversation
I did some exploration, and sadly it's a bug in
However, there is a bug in step 2. The bind mount keeps any already-existing files in
Make sense? |
58a3c0e
to
bbb7fdc
Compare
I made the changes suggested to "mirror the logic in iproute" + your suggested fix. I improved the error message returned by Example errno EINVAL: |
bbb7fdc
to
e4dca02
Compare
Everything up to here looks fine to me. I've milestoned this for the next release, but I'd like to wait and see whether we get some feedback on the iproute2 patch. |
networking/podenv.go
Outdated
if err != nil { | ||
return fmt.Errorf("mount --rbind %s %s failed: %q", mountNetnsDirectory, mountNetnsDirectory, err) | ||
} | ||
return e.mountNetnsDirectory(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get rid of this recursive call here? It seems to be that if we reach this point we then just try one more mount(rshared)
or fail hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, in this case I have no reason to leave the mkdir -p
outside this method too.
networking/networking.go
Outdated
return nil, err | ||
} | ||
|
||
err = n.mountNetnsDirectory(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were not doing this before, and perhaps if it doesn't succeed we don't need to hard-fail. Should we perhaps just log an error if the mount fails here and keep going? @squeed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The container will start well but we cannot guarantee the lifecycle of its network if a third-party software like ip netns add
is triggered after its starts. So I'm definitely for a hard-fail.
e4dca02
to
a20209e
Compare
@lucab I removed the recursive call and moved the |
networking/podenv.go
Outdated
err = syscall.Mount("", mountNetnsDirectory, "none", syscall.MS_SHARED|syscall.MS_REC, "") | ||
if err != nil { | ||
// Fail unless we need to make the mount point | ||
if err.Error() != EINVAL { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simply use syscall.EINVAL
here and get rid of your custom constant.
networking/podenv.go
Outdated
if err != nil { | ||
// Fail unless we need to make the mount point | ||
if err.Error() != EINVAL { | ||
return fmt.Errorf("mount --bind %s failed: %q", mountNetnsDirectory, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message should start with mount --make-rshared ...
.
networking/podenv.go
Outdated
// Remount after the Upgrade | ||
err = syscall.Mount("", mountNetnsDirectory, "none", syscall.MS_SHARED|syscall.MS_REC, "") | ||
if err != nil { | ||
return fmt.Errorf("mount --bind %s failed: %q", mountNetnsDirectory, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message should start with mount --make-rshared ...
.
a20209e
to
1e49325
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Tested myself, works great! |
Context
Coming from Kubernetes issue 48427.
Initially I was thinking of doing it in the kubelet rkt code but as @squeed suggested, it's simpler and less hacky to do it in rkt codebase.
TL;DR:
This PR ensure the
/run/netns
directory is always mounted once as tmpfs.The
ip netns add <some-id>
command has this behavior by default. The kubelet calls this command for every newhostNetwork: false
pods.A mount of
/run/netns
over any existing netns files like/run/netns/cni-30b37e42-157e-9796-dd74-ba4a3689bbde
cause issues with them.Warning
Upgrading rkt with this fix needs a full drain of any running rkt-netns containers on the node.