feat: support buildx moby worker in docker 23.0.0 onward to accelerating building process by skipping docker load #1472

Merged
merged 4 commits into tensorchord:main on Feb 14, 2023

Conversation

Kaiyang-Chen
Contributor

I am not so familiar with every detail in envd, so this is just an initial commit. Let me know which parts I should pay attention to when adding a new builder type.

Signed-off-by: Kaiyang-Chen <kaiyang-chen@sjtu.edu.cn>

…lding process by skipping docker load

Signed-off-by: Kaiyang-Chen <kaiyang-chen@sjtu.edu.cn>
@Kaiyang-Chen
Contributor Author

Currently, my implementation is that when envd is initialized, it checks the user's docker version; if it is 23.0.0 or onward, the default builder will be moby-worker instead of docker-container, so the bootstrap step is no longer needed. If such a mechanism is reasonable, maybe we should change the docs accordingly.

@VoVAllen
Member
VoVAllen commented Feb 10, 2023

Please fix the lint problem and include the time comparison! Glad to see this feature!

Signed-off-by: Kaiyang-Chen <kaiyang-chen@sjtu.edu.cn>
Signed-off-by: Kaiyang Chen <kaiyang-chen@sjtu.edu.cn>
@Kaiyang-Chen
Contributor Author

The time saved really depends on the size of the build output, but in general it simply removes the time the sending-tarball / docker-load steps used to take. Here is a running example for example/pytorch2/build.envd: although the network conditions might not be consistent, the original sending-tarball step, which took 86.1s (10%+ of the original build time), is gone in the new version.

docker-container buildkit:

[+] ⌚ parse build.envd and download/cache dependencies 1.0s ✅ (finished)
[+] build envd environment 838.6s (7/7) FINISHED
 => importing cache manifest from docker.io/tensorchord/python-cache:e  1.1s
 => docker-image://ghcr.io/pytorch/pytorch-nightly:latest             719.7s
 => => resolve ghcr.io/pytorch/pytorch-nightly:latest                   0.9s
 => => sha256:5bbe2d701c56ef360ead0469e970f88a9345f39416e2 105B / 105B  0.3s
 => => sha256:fe1ef7a4faad7e3c0836a52a2afdc5888718e2ac31aa 571B / 571B  0.3s
 => => sha256:ff1d93ac8cdf89847d65d32aa583b0bec769ee 40.94MB / 40.94MB  1.0s
 => => sha256:d9f43e545e80afcbee78faa59e3476ec4bbe68 5.51GB / 5.51GB  607.7s
 => => sha256:4f4d9f1bf26a3ba5f90da338b482f90ef84c45 40.58MB / 40.58MB  0.9s
 => => sha256:a055bf07b5b05332897ea9a464c5e76a507faf 26.71MB / 26.71MB  0.7s
 => => extracting sha256:a055bf07b5b05332897ea9a464c5e76a507fafe72fa21  0.9s
 => => extracting sha256:4f4d9f1bf26a3ba5f90da338b482f90ef84c45edb3e16  1.2s
 => => extracting sha256:d9f43e545e80afcbee78faa59e3476ec4bbe683c710  109.6s
 => => extracting sha256:ff1d93ac8cdf89847d65d32aa583b0bec769ee6e83a4a  1.4s
 => => extracting sha256:fe1ef7a4faad7e3c0836a52a2afdc5888718e2ac31aae  0.0s
 => => extracting sha256:5bbe2d701c56ef360ead0469e970f88a9345f39416e21  0.0s
 => [internal] settings pip cache mount permissions                     0.0s
 => pip install torch_tb_profiler --extra-index-url https://download.  22.5s
 => pip install tensorboard                                             1.4s
 => [internal] create dir for runtime.mount /home/envd/log              0.0s
 => exporting to oci image format                                      94.8s
 => => exporting layers                                                 8.7s
 => => exporting manifest sha256:43773e2aa9a20b403857a00aa7ae4af1dd8cf  0.0s
 => => exporting config sha256:67eef9cb01fa0283c36215b1bff8845a18258d0  0.0s
 => => sending tarball                                                 86.1s

moby worker built-in buildkit:

[+] ⌚ parse build.envd and download/cache dependencies 1.0s ✅ (finished)
[+] build envd environment 640.7s (7/7) FINISHED
 => importing cache manifest from docker.io/tensorchord/python-cache:envd-v0.3.1     0.5s
 => docker-image://ghcr.io/pytorch/pytorch-nightly:latest                          606.5s
 => => resolve ghcr.io/pytorch/pytorch-nightly:latest                                1.0s
 => => sha256:9d95ce4062b8247b2f32bc7acdf6a0fab807ce2609e24ccca0b83 1.58kB / 1.58kB  0.0s
 => => sha256:23fdd4c373c717d0ea0cb13126f235aff1a23e6e6d9d381e5ea92 4.88kB / 4.88kB  0.0s
 => => sha256:a055bf07b5b05332897ea9a464c5e76a507fafe72fa21370d3f 26.71MB / 26.71MB  0.4s
 => => sha256:4f4d9f1bf26a3ba5f90da338b482f90ef84c45edb3e1636d434 40.58MB / 40.58MB  0.6s
 => => sha256:d9f43e545e80afcbee78faa59e3476ec4bbe683c710894c4f6b 5.51GB / 5.51GB  516.0s
 => => extracting sha256:a055bf07b5b05332897ea9a464c5e76a507fafe72fa21370d3fccaf07d  0.6s
 => => sha256:ff1d93ac8cdf89847d65d32aa583b0bec769ee6e83a4addf09b 40.94MB / 40.94MB  1.1s
 => => sha256:fe1ef7a4faad7e3c0836a52a2afdc5888718e2ac31aae813c9b3a6836 571B / 571B  0.8s
 => => sha256:5bbe2d701c56ef360ead0469e970f88a9345f39416e2116ed7d5915a9 105B / 105B  1.1s
 => => extracting sha256:4f4d9f1bf26a3ba5f90da338b482f90ef84c45edb3e1636d434280a3f2  0.6s
 => => extracting sha256:d9f43e545e80afcbee78faa59e3476ec4bbe683c710894c4f6bb4a106  86.7s
 => => extracting sha256:ff1d93ac8cdf89847d65d32aa583b0bec769ee6e83a4addf09b6f8d1cc  1.1s
 => => extracting sha256:fe1ef7a4faad7e3c0836a52a2afdc5888718e2ac31aae813c9b3a6836d  0.0s
 => => extracting sha256:5bbe2d701c56ef360ead0469e970f88a9345f39416e2116ed7d5915a9e  0.0s
 => [internal] settings pip cache mount permissions                                  0.0s
 => pip install torch_tb_profiler --extra-index-url https://download.pytorch.org/w  24.4s
 => pip install tensorboard                                                          1.8s
 => [internal] create dir for runtime.mount /home/envd/log                           0.0s
 => exporting to image                                                               8.0s
 => => exporting layers                                                              8.0s
 => => writing image sha256:fd8fc636a50d4b03bd36ab9eea67acbfcb2496fa34ac03ba3b78c8b  0.0s
 => => naming to docker.io/library/pytorch2:dev                                      0.0s
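
The difference between the two logs comes down to the final export step: the docker-container worker must export an OCI tarball and stream it back to the daemon, while the moby worker writes layers directly into the engine's image store. Below is a minimal sketch of how the two exports could be expressed with BuildKit's Go client; the "moby" exporter name follows dockerd's built-in BuildKit convention, and this is an illustration rather than the PR's exact code.

package builder

import (
	bkclient "github.com/moby/buildkit/client"
)

// exportEntryFor returns the export configuration for each worker.
// tag is the image reference to produce, e.g. "pytorch2:dev".
func exportEntryFor(mobyWorker bool, tag string) bkclient.ExportEntry {
	if mobyWorker {
		// dockerd's built-in BuildKit writes layers straight into the
		// engine's image store ("exporting to image" in the log), so
		// there is no tarball to send and no `docker load`.
		return bkclient.ExportEntry{
			Type:  "moby", // exporter registered by dockerd's BuildKit
			Attrs: map[string]string{"name": tag},
		}
	}
	// The docker-container worker runs buildkitd in its own container,
	// so the result is exported as a tarball and streamed back to the
	// daemon ("exporting to oci image format" + "sending tarball").
	return bkclient.ExportEntry{
		Type:  bkclient.ExporterDocker,
		Attrs: map[string]string{"name": tag},
		// Output would be wired to a pipe that feeds `docker load`.
	}
}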

Member
@VoVAllen left a comment


Can you also show me what the moby worker's context looks like? And please also test manually creating a moby context with envd context create.

Overall looks good to me! Thanks for your contribution!

@@ -138,3 +142,23 @@ func UserAgent() string {

return "envd/" + version
}

func GetDockerVersion() (int, error) {
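
The diff shows only the signature. A plausible implementation, sketched here with the Docker Go client (not necessarily the PR's exact code), queries the daemon and parses out the major version:

package docker

import (
	"context"
	"strconv"
	"strings"

	"github.com/cockroachdb/errors"
	"github.com/docker/docker/client"
)

// GetDockerVersion returns the major version of the running Docker
// engine, e.g. 23 for "23.0.1".
func GetDockerVersion() (int, error) {
	cli, err := client.NewClientWithOpts(
		client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		return 0, errors.Wrap(err, "failed to create docker client")
	}
	defer cli.Close()

	ver, err := cli.ServerVersion(context.Background())
	if err != nil {
		return 0, errors.Wrap(err, "failed to get docker server version")
	}
	// ver.Version looks like "23.0.1"; keep the leading major component.
	major, err := strconv.Atoi(strings.SplitN(ver.Version, ".", 2)[0])
	if err != nil {
		return 0, errors.Wrap(err, "unexpected version string "+ver.Version)
	}
	return major, nil
}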

Member


Remove the blank line here. And I think you can put it within the builder package or in the utils folder, instead of a new package here.

Contributor Author


I changed it to pkg/driver/docker in order to avoid importing an extra package, PTAL.

Signed-off-by: Kaiyang Chen <kaiyang-chen@sjtu.edu.cn>
@Kaiyang-Chen
Contributor Author
Kaiyang-Chen commented Feb 11, 2023

The result of envd context ls after envd context create is shown below, and the builder works well when using a new moby-worker context.
[Screenshot: envd context ls output, 2023-02-11 18:43:42]
When users want to use a context with another builder, they can still switch back to that builder and then build. The following example uses the docker-container builder on a machine with docker version 23.0.1.
[Screenshot: docker-container build output, 2023-02-11 17:28:46]

@Kaiyang-Chen changed the title feat: support buildx moby worker in docker 23.0.0 to accelerating building process by skipping docker load feat: support buildx moby worker in docker 23.0.0 onward to accelerating building process by skipping docker load Feb 11, 2023
Member
@gaocegege left a comment


Then will https://github.com/tensorchord/envd/pull/1472/files#diff-1199bd9f65f4618675f464758037bf051b6c122c6ec5ec5350edd6e8159c59e1L261 be executed when moby is used?

And, how do we decide when to use the moby worker?

@Kaiyang-Chen
Contributor Author
Kaiyang-Chen commented Feb 13, 2023

@gaocegege

Then will https://github.com/tensorchord/envd/pull/1472/files#diff-1199bd9f65f4618675f464758037bf051b6c122c6ec5ec5350edd6e8159c59e1L261 be executed when moby is used?

Yes, but for the moby worker, a different solveOpt will be constructed.

And, how do we decide when to use the moby worker?

Currently, whenever App.Before from urfave is executed, the builder for the default context is determined based on the currently running docker version. So if a user has docker version >= 23.0.0 before installing envd, the default builder will be the moby worker once they install. But they can still use any other worker by creating a new context accordingly. If a user upgrades their docker engine only after installing envd, they will need to create a new context with builder type moby-worker in order to use it.
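
A sketch of the selection logic described above (the helper name and string values are illustrative, not necessarily the PR's exact identifiers):

package app

import "github.com/tensorchord/envd/pkg/driver/docker"

// defaultBuilder picks the builder for the default context when the CLI
// starts, i.e. from urfave/cli's App.Before hook.
func defaultBuilder() string {
	major, err := docker.GetDockerVersion()
	if err != nil || major < 23 {
		// Older engine, or no daemon reachable: fall back to the
		// docker-container builder, which requires `envd bootstrap`.
		return "docker-container"
	}
	// docker >= 23.0.0 ships BuildKit's moby worker built in, so the
	// build can skip the tarball export and `docker load` entirely.
	return "moby-worker"
}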

if err != nil {
return nil, errors.Wrap(err, "failed to create buildkit client")
}
c.Client = bkcli
Member


Does this one require bootstrap?

Contributor Author


nope

@kemingy
Member
kemingy commented Feb 13, 2023

And, how do we decide when to use the moby worker?

Currently, whenever App.Before from urfave is executed, the builder for the default context is determined based on the currently running docker version. So if a user has docker version >= 23.0.0 before installing envd, the default builder will be the moby worker once they install. But they can still use any other worker by creating a new context accordingly. If a user upgrades their docker engine only after installing envd, they will need to create a new context with builder type moby-worker in order to use it.

I'm not sure why we need the moby-worker type here, since users cannot have docker-worker and moby-worker at the same time. It's already determined by the host's current docker engine version. envd should be able to detect that and use the right one.

@Kaiyang-Chen
Contributor Author
Kaiyang-Chen commented Feb 13, 2023

I'm not sure why we need the moby-worker type here, since users cannot have docker-worker and moby-worker at the same time. It's already determined by the host's current docker engine version. envd should be able to detect that and use the right one.

@kemingy So what you mean is that we need to determine the docker version every time before we build, and change the current context accordingly? My understanding was that users have the right to choose whatever builder they want; it's just that if the docker version is high enough, the default builder will be moby-worker, and we enable users to create new contexts with the moby-worker builder.

Actually, users can have docker-worker and moby-worker at the same time in different contexts if they want that.

@kemingy
Member
kemingy commented Feb 13, 2023

I'm not sure why we need the moby-worker type here, since users cannot have docker-worker and moby-worker at the same time. It's already determined by the host's current docker engine version. envd should be able to detect that and use the right one.

@kemingy So what you mean is that we need to determine the docker version every time before we build, and change the current context accordingly? My understanding was that users have the right to choose whatever builder they want; it's just that if the docker version is high enough, the default builder will be moby-worker, and we enable users to create new contexts with the moby-worker builder.

Actually, users can have docker-worker and moby-worker at the same time in different contexts if they want that.

I think the moby-worker can do everything we need that the docker-worker does, so we can choose the better one for users, unless it has some downsides (I'm not sure).


@VoVAllen
Member

@kemingy Then we need a new type of builder here, such as an "auto" builder. Otherwise the builder's address should be deterministic.

@kemingy
Member
kemingy commented Feb 13, 2023

@kemingy Then we need a new type of builder here, such as an "auto" builder. Otherwise the builder's address should be deterministic.

I think it is deterministic: users cannot have multiple docker engines on the host at the same time. BTW, docker is moby. Adding the moby type to the code makes it harder to maintain, since it just means "docker engine >= 23.0.0".

@VoVAllen
Member
VoVAllen commented Feb 13, 2023

@kemingy But using the envd_buildkit container (docker worker, for docker < 23.0.0) is totally different from using docker's built-in buildx (moby worker, for docker >= 23.0.0). Thus it's not easy to unify them into the same builder type?
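
The distinction boils down to where the BuildKit client connects. An illustrative mapping is sketched below; the scheme strings and the container name are assumptions, not envd's exact context format.

package app

import "github.com/cockroachdb/errors"

// buildkitEndpoint illustrates why the two builder types are hard to
// unify: they imply different BuildKit endpoints.
func buildkitEndpoint(builderType string) (string, error) {
	switch builderType {
	case "docker-container":
		// A dedicated buildkitd container created by `envd bootstrap`
		// (container name assumed here).
		return "docker-container://envd_buildkitd", nil
	case "moby-worker":
		// BuildKit embedded in dockerd itself, reached over the
		// engine's own socket; no bootstrap container is involved.
		return "unix:///var/run/docker.sock", nil
	default:
		return "", errors.Newf("unknown builder type %q", builderType)
	}
}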

@VoVAllen merged commit 0929661 into tensorchord:main on Feb 14, 2023