containerd-shim residue · Issue #768 · containerd/containerd · GitHub

containerd-shim residue #768

Closed
keloyang opened this issue Apr 25, 2017 · 14 comments

@keloyang
Contributor

Some anomalies (e.g. containerd being killed by the docker daemon) can leave docker-containerd-shim processes behind. #763 gives two examples.
But another example can't be resolved by that PR, because root/id/init/pid (and other files) have not been created yet.

diff --git a/runtime/container.go b/runtime/container.go
index 29fd66e..9e5bf27
--- a/runtime/container.go
+++ b/runtime/container.go
@@ -506,6 +506,7 @@ func (c *container) createCmd(ctx context.Context, pid string, cmd *exec.Cmd, p
                }
                return err
        }
+       os.Exit(-1)
        // We need the pid file to have been written to run
        defer func() {
                go func() {

In the case above, docker-containerd-shim will block at "os.OpenFile("exit", ...)", and the shim process will not exit unless you kill it manually.

How can we resolve this scenario ?

Could we have a shim file located at root/id that records the docker-containerd-shim process's pid, and then kill that process when containerd restores?

@hqhq added the v0.2.x label Apr 25, 2017
@mlaventure
Contributor

The issue would be the same if containerd dies before it has the time to write the pid on disk.

I'm afraid there's not really a solution for this case. I don't want to assume that the running shim processes are ones that can be killed without issue by the new instance.

@keloyang
Contributor Author
keloyang commented Apr 26, 2017

@mlaventure thanks for your patience.
I have a workaround like this. Do you think it makes sense?

diff --git a/containerd-shim/main.go b/containerd-shim/main.go
index c921671..44ce0a0 100644
--- a/containerd-shim/main.go
+++ b/containerd-shim/main.go
@@ -8,6 +8,7 @@ import (
        "path/filepath"
        "runtime"
        "syscall"
+       "time"
 
        "github.com/docker/containerd/osutils"
        "github.com/docker/docker/pkg/term"
@@ -23,6 +24,25 @@ type controlMessage struct {
        Height int
 }
 
+func setupShimTimeout(log *os.File) chan error {
+       c := make(chan error, 1)
+       go func() {
+               select {
+               case <-c:
+                       return
+               case <-time.After(2 * time.Minute):
+                       err := fmt.Errorf("shim timeout 2m")
+                       writeMessage(log, "error", err)
+                       os.Exit(-1)
+               }
+       }()
+       return c
+}
+
+func finishShimTimeout(err chan error) {
+       err <- nil
+}
+
 // containerd-shim is a small shim that sits in front of a runtime implementation
 // that allows it to be repartented to init and handle reattach from the caller.
 //
@@ -59,6 +79,9 @@ func main() {
 }
 
 func start(log *os.File) error {
+       // add timeout to avoid block when shim open pipe
+       successShim := setupShimTimeout(log)
+
        // start handling signals as soon as possible so that things are properly reaped
        // or if runtime exits before we hit the handler
        signals := make(chan os.Signal, 2048)
@@ -121,6 +144,8 @@ func start(log *os.File) error {
                        msgC <- m
                }
        }()
+
+       finishShimTimeout(successShim)
        if runtime.GOOS == "solaris" {
                return nil
        }

@keloyang
Contributor Author

The issue would be the same if containerd dies before it has the time to write the pid on disk.

@mlaventure Maybe we can write /root/id/shim from the docker-containerd-shim process, not from docker-containerd. WDYT?

@keloyang
Contributor Author

ping @mlaventure

@ericslandry

What's the best way (command) to clean this up? One of these orphan shims has maxed out my /run inodes:

foo@bar:/run/docker/libcontainerd# df -i
Filesystem                 Inodes   IUsed    IFree IUse% Mounted on
udev                      1016936     429  1016507    1% /dev
tmpfs                     1021914 1021734      180  100% /run

@mlaventure
Contributor

@keloyang I'm not sure what this workaround is meant for. It's not up to the shim to ensure the pipes are open on the other side; that is up to the containerd client.

@mlaventure
Contributor

@ericslandry what are the files in your /run exactly? The shim is not the one creating the pipes, that is done by docker.

@ericslandry

@mlaventure I didn't mean to muddy the waters. I know that the content of /run/docker/libcontainerd/4bae1... is irrelevant to this issue. It was just a different problem that led me here.

Unfortunately, 'docker system prune' doesn't clean the orphans in /run/docker/libcontainerd/ . I'm just looking for a clean way to remove these folders.

@mlaventure
Contributor

@ericslandry no worries, I just can't direct you to a solution if I don't know what is left in your /run and what is the state of your system :)

But you should probably open an issue on moby/moby for this. Note: if you have a lot of running containers/execs, the fact that those files are present is completely normal.

@ericslandry

@mlaventure My /run/docker/libcontainerd folder contains: 19 hash folders, a 'containerd' folder and a few files (docker-containerd.pid, docker-containerd.sock, event.ts). One of these hash folders (4bae1e91be17a3780c78a480eea8041734a48d9f9a48047aeb3dfcf343588873) contains 583324 files. Most of them are 0-byte files named like 026bbc585ccea230747213f1e25d8b1e15d2c95e1fe2503ff7ddc28922c7570e-stderr .

I've run 'docker system prune' and 'docker rmi $(docker images)'. I have no containers or images. What should I do to remove these hash folders (which I'm assuming are the orphaned shims)? So far, my undesirable options, which I'm not even sure would work, are:

  • rm -rf /run/docker/libcontainerd/4bae1e91be17a3780c78a480eea8041734a48d9f9a48047aeb3dfcf343588873
  • sudo service docker restart
  • reboot

But I'm looking for a magical docker command to help me

@mlaventure
Contributor

@ericslandry what version of docker are you running? Older versions had a bug where they forgot to delete the exec's pipes once those were dead. It shouldn't happen in the current version.

@ericslandry

@mlaventure docker version:

Client:
 Version:      17.03.1-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:14:09 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.1-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:14:09 2017
 OS/Arch:      linux/amd64
 Experimental: false

@mlaventure
Contributor

@ericslandry please open an issue on https://github.com/moby/moby with the information asked for in the template and ping me on it.

If you have a way to reproduce it, it would be great too.

If you can provide the output of ls -l -1 in the directory with all the files, I can try to match them up with the daemon logs (hoping yours was in debug mode).

@keloyang
Contributor Author
keloyang commented May 2, 2017

@mlaventure we can ignore the workaround.
It is really a problem that the abnormal shim doesn't exit. We could write /root/id/shim (the shim's pid) from the docker-containerd-shim process, not from docker-containerd; then, when containerd restores, it can clean up the shims.
WDYT?
