8000 networking: ensure the netns directory is mounted by JulienBalestra · Pull Request #3761 · rkt/rkt · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Feb 24, 2020. It is now read-only.

networking: ensure the netns directory is mounted #3761

Merged
merged 1 commit into from
Aug 30, 2017

Conversation

JulienBalestra
Copy link
Contributor
@JulienBalestra JulienBalestra commented Aug 1, 2017

Context

Coming from Kubernetes issue 48427.
Initially I was thinking of doing it in the kubelet rkt code but as @squeed suggested, it's simpler and less hacky to do it in rkt codebase.

TL;DR:

This PR ensure the /run/netns directory is always mounted once as tmpfs.
The ip netns add <some-id> command has this behavior by default. The kubelet calls this command for every new hostNetwork: false pods.

A mount of /run/netns over any existing netns files like /run/netns/cni-30b37e42-157e-9796-dd74-ba4a3689bbde cause issues with them.

Warning

Upgrading rkt with this fix needs a full drain of any running rkt-netns containers on the node.

@squeed
Copy link
Contributor
squeed commented Aug 1, 2017

I did some exploration, and sadly it's a bug in iproute2, not rkt. I'll outline the bug, the fix, and how we might be able to work around it.

iproute2 doesn't actually do a mount -t tmpfs .... Instead, it wants to make /run/netns a mountpoint with shared propagation so that an instance of iproute2 inside a container can manage the host. This is a good idea. The basic sequence of actions it does are:

  1. mkdir -p /run/netns
  2. mount --bind /run/netns /run/netns
  3. mount --make-rshared /run/netns

However, there is a bug in step 2. The bind mount keeps any already-existing files in /run/netns, but doesn't actually copy any mounts in /run/netns. Since network namespaces are recorded as bind mounts, this copies them as files but doesn't preserve the actual ns. The fix is for iproute2 to do mount --rbind

iproute2 skips step 2 if /run/netns is already a mountpoint. So, rkt should mirror the logic in iproute, but with the bind mount an rbind.

Make sense?

@JulienBalestra JulienBalestra force-pushed the mount-netns branch 2 times, most recently from 58a3c0e to bbb7fdc Compare August 2, 2017 18:35
@JulienBalestra
Copy link
Contributor Author
JulienBalestra commented Aug 2, 2017

I made the changes suggested to "mirror the logic in iproute" + your suggested fix.

I improved the error message returned by syscall.Mount.
In case of failure, it's too short to be clear in the logs.

Example errno EINVAL:
stage1: networking: failed to setup network: invalid argument

@lucab
Copy link
Member
lucab commented Aug 3, 2017

Everything up to here looks fine to me. I've milestoned this for the next release, but I'd like to wait and see whether we get some feedback on the iproute2 patch.

if err != nil {
return fmt.Errorf("mount --rbind %s %s failed: %q", mountNetnsDirectory, mountNetnsDirectory, err)
}
return e.mountNetnsDirectory(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of this recursive call here? It seems to be that if we reach this point we then just try one more mount(rshared) or fail hard.

Copy link
Contributor Author
@JulienBalestra JulienBalestra Aug 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, in this case I have no reason to leave the mkdir -p outside this method too.

return nil, err
}

err = n.mountNetnsDirectory(false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were not doing this before, and perhaps if it doesn't succeed we don't need to hard-fail. Should we perhaps just log an error if the mount fails here and keep going? @squeed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The container will start well but we cannot guarantee the lifecycle of its network if a third-party software like ip netns add is triggered after its starts. So I'm definitely for a hard-fail.

@JulienBalestra
Copy link
Contributor Author

@lucab I removed the recursive call and moved the mkdir in the method.
Do we have a feedback for the iproute patch ?
While I am here, do you know how long it could takes for that merged patch to be propagated into CoreOS container linux stable channel ?

err = syscall.Mount("", mountNetnsDirectory, "none", syscall.MS_SHARED|syscall.MS_REC, "")
if err != nil {
// Fail unless we need to make the mount point
if err.Error() != EINVAL {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simply use syscall.EINVAL here and get rid of your custom constant.

if err != nil {
// Fail unless we need to make the mount point
if err.Error() != EINVAL {
return fmt.Errorf("mount --bind %s failed: %q", mountNetnsDirectory, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should start with mount --make-rshared ....

// Remount after the Upgrade
err = syscall.Mount("", mountNetnsDirectory, "none", syscall.MS_SHARED|syscall.MS_REC, "")
if err != nil {
return fmt.Errorf("mount --bind %s failed: %q", mountNetnsDirectory, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should start with mount --make-rshared ....

Copy link
Member
@lucab lucab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@squeed
Copy link
Contributor
squeed commented Aug 30, 2017

Tested myself, works great!

@squeed squeed merged commit d3a23a2 into rkt:master Aug 30, 2017
@JulienBalestra JulienBalestra deleted the mount-netns branch November 9, 2018 14:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0