feat!: helm chart and base image refactor DO NOT MERGE #816
Conversation
BREAKING CHANGE: This was moved to legacy over a year ago; removing it now to simplify the chart. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
BREAKING CHANGE: Moving CRDs to the root crds folder so that they will not be removed on uninstall. The consequence of this change is that CRDs are not upgraded by Helm; however, common CD tools such as Argo and Flux2 will upgrade them. It is common for k8s solutions using CRDs to require manual updates when deploying with Helm. An alternative would be to use manifests or OLM in the future. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
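For context, this relies on Helm's special top-level crds directory: Helm installs its contents on first install but skips it on upgrade and uninstall. A rough layout (file names illustrative, not taken from the PR):

charts/topolvm/
  Chart.yaml
  crds/
    logicalvolumes.yaml   # installed once by helm install; never upgraded or deleted by helm
  templates/
    ...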
A LoadBalancer can incur cost and is not required for this use case; the fix changes the default to ClusterIP. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
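A minimal values sketch of the new default (key names are assumptions about the chart layout, not verified against it):

controller:
  service:
    type: ClusterIP   # previously LoadBalancer; set LoadBalancer explicitly if the
                      # service must be reachable from outside the cluster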
TopoLVM is critical to scheduling and node readiness; set appropriate default values. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
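Something along these lines is what "appropriate default values" could mean for a scheduling-critical DaemonSet (a sketch; the exact keys and values in the chart may differ):

node:
  priorityClassName: system-node-critical   # assumption: schedule ahead of ordinary workloads
  tolerations:
    - operator: Exists                       # assumption: tolerate all taints so storage starts on every node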
Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
Alpine is a slightly smaller image. This change swaps in a similar configuration and adds nvme-cli and lvm2 so user-provided init routines have a likely chance of being able to use the same image, keeping pull size down. The result is 284MB vs the current image size of 319MB. A side benefit: when the current image is built, it is more likely than not that a different package version will be installed, because versions are not easily pinned. In this PR I am not pinning versions, as doing so would add a responsibility for the upstream team to update package-level locked versions. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
This refactor handles embedding lvmd as a container rather than a process in the node CSI container. This makes logging more friendly, as the lvmd logs can be monitored without "reading between the lines" of the node pod. The socket is mounted using an emptyDir with memory as the medium. Removed use of hostPID. Signed-off-by: Ryan Faircloth <ryan@dss-i.com>
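To illustrate, the node DaemonSet might wire the two containers together roughly like this (container layout and socket path are illustrative, not the exact PR content):

containers:
  - name: lvmd
    # lvmd as its own container: its logs stream separately from topolvm-node
    volumeMounts:
      - name: lvmd-socket-dir
        mountPath: /run/lvmd
  - name: topolvm-node
    # topolvm-node reaches lvmd over the gRPC socket in the shared in-memory emptyDir
    volumeMounts:
      - name: lvmd-socket-dir
        mountPath: /run/lvmd
volumes:
  - name: lvmd-socket-dir
    emptyDir:
      medium: Memory   # the socket lives in RAM and never touches the node filesystem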
@ryanfaircloth force-pushed from 7eade4e to 8a4ac8f (Compare)
log.FromContext(ctx).Info("invoking LVM command", "args", args)
err := c.Run()
log.FromContext(ctx).Info("result of LVM command", "stdout", stdout.Bytes(), "stderr", stderr.Bytes())
Can we introduce this at a lower log level? I do not think this is necessary for common debugging.
The current run command, which is slightly less useful for troubleshooting, is already at the info level. I would say move both to debug, but I don't see support for changing the log level, and this runs once at startup, so I don't see the harm in including it. It took me several hours to get to the issue of hostPID not working on EKS 1.28 because I was unaware nsenter was failing silently.
Overall, what is logged and when could use some improvement. It's not "bad", but it's not very intentional. For example, at the info level I would like to know what VGs were found on startup, so in the list function we should have a dump of the VGs; LV actions we should also see at info level. This really should be debug, but as a non-Go programmer I wasn't really wanting to do more than I could safely copy and paste.
I agree with @jakobmoellerdev. Please separate it.
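For what it's worth, with logr-style loggers (which log.FromContext returns), demoting these lines to a debug verbosity would be a small change along these lines (a sketch, not the PR's code):

// V(1) is conventionally debug verbosity in logr; these lines only emit
// when the logger is configured with verbosity >= 1
log.FromContext(ctx).V(1).Info("invoking LVM command", "args", args)
log.FromContext(ctx).V(1).Info("result of LVM command", "stdout", stdout.Bytes(), "stderr", stderr.Bytes())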
{{- end }}
{{- with .Values.node.initContainers }}
initContainers: {{ toYaml . | nindent 6 }}
{{- end }}
containers:
- name: topolvm-node
{{- if .Values.node.lvmdEmbedded }}
lvmdEmbedded is fundamentally different from what you are trying to do here. Embedded means embedded in topolvm-node, not a separate binary. If you run lvmd separately from topolvm-node it is no longer embedded.
Yes, I do agree. I changed the implementation and re-used the same flag; the admin experience should be the same, though. Would you like a new word here, or a more verbose description of the change?
Just trying to add some context as the original implementor of the embedded functionality: IMO this is a fundamentally different deployment from using --embed-lvmd in topolvm-node, so we should find a different name for it.
From what I can see, this functionality should not be removed just because we change the way we move from 2 DS to 1 DS without using this flag. When using the flag, we do not even need to run LVMD, so there will always be 1 DS.
I have 2 follow-up questions from your work:
1. How in the future would we allow users to run with --embed-lvmd on topolvm-node? IMO this is answered by running 1 DS with 2 containers in the default setting (lvmdEmbedded = false) or introducing a new flag that is mutually exclusive with lvmdEmbedded = true.
2. What tangible benefit does it bring to have a single daemon set with 2 containers over 2 DS with 1 container? I currently do not see major gains; it would be cool to have this in the PR explanation.
- I see we have multiple implementation choices right now: LVMD can run under systemd, in its own DS, or as a child process of the node DS container. In this PR I removed "embedded" as it is currently defined, because it didn't seem to solve a unique problem or offer any value on its own, and to me it made troubleshooting more complicated. Having LVMD as its own daemon set also doesn't seem to be particularly useful as an option, because we could have race conditions between the pods, or the node may not start for some reason such as lack of resources. So really I moved the LVMD "managed" into this pod and redefined "embedded". I think adding node.lvmd.external would be more descriptive, as this really just reduces the deployment choices to two: DS (via the node pod) and external.
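To make that concrete, the proposed shape of the values could look like this (node.lvmd.external is the name suggested above; it is not an existing chart option, and socketName is an assumption):

node:
  lvmd:
    external: false            # default: lvmd runs as a container in the node pod
    # external: true           # alternative: lvmd is managed outside the chart (e.g. systemd)
    # socketName: /run/lvmd/lvmd.sock   # assumption: socket used to reach an external lvmd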
> I see we have multiple implementation choices right now: LVMD can run under systemd, in its own DS, or as a child process of the node DS container.

It is not a child process if you run it within topolvm-node. It is one process and we do not open up a gRPC server in this case.

> In this PR I removed "embedded" as it is currently defined, because it didn't seem to solve a unique problem or offer any value on its own, and to me it made troubleshooting more complicated.

I believe you are quite quick to come to this resolution. Please reference the dedicated design proposal and see if you still do not understand the use case. Just because this option of deployment is not applicable for you does not mean it is not useful for other users.

> Having LVMD as its own daemon set also doesn't seem to be particularly useful as an option, because we could have race conditions between the pods, or the node may not start for some reason such as lack of resources.

This does not make sense because 1. race conditions between pods are not a thing (we are an eventually consistent system, and starting one before the other does not result in problems) and 2. if the node does not have enough resources for both lvmd and topolvm-node and you run with 1 DS, the daemonset will still not start and run successfully.

> So really I moved the LVMD "managed" into this pod and redefined "embedded". I think adding node.lvmd.external would be more descriptive, as this really just reduces the deployment choices to two: DS (via the node pod) and external.

I disagree that you redefined embedded. You removed the option to configure it entirely. There is no way in your chart to enable this kind of configuration.
However, I agree that having 2 deployment choices should be the goal.
My conclusion is different, however: I think your way of deploying (1 DS with 2 containers) should succeed the 2 DS approach and NOT embedded LVMD.
Re race conditions: I have seen instances of the node pod crashing when lvmd is not ready, and I have also seen a few instances where workload pods start before node/lvmd is ready.
Thanks for the doc pointer. After reading, I think my approach achieves the same goals as this design doc: the goal of not needing a second DS is achieved by treating LVMD as a library (I thought it was a process), with the consequence of having two very different interface methods. My approach does result in retention of an independent process and gRPC, but means the interface between node and lvmd is the same whether it is deployed in the pod or deployed as a systemd service. The goals, as I understand them, are the same. My approach has a slightly smaller code surface, less to maintain, and would make embedded moot, assuming I understood the goals correctly. While I am removing choices, have I caused an unforeseen consequence by impacting a feature or function?
For the crashes resulting from startup: While it is true that topolvm-node will continue to restart if it does not establish a connection to LVMD, this will work just fine after a restart and is part of a normal regression loop in k8s. Your approach can fix this though, I agree.
To your comment: I believe your approach does not achieve 2 of the main goals of why this was introduced in the first place: 1. Reduce memory consumption, because you still need a gRPC server and a socket, and 2. Reduce container count - even though you have 1 DS, you still retain both containers, which results in more crun/kubelet overhead. This is especially important in edge scenarios where local provisioning needs low resource overhead, which is one of the main use cases for our users (the environment in which we bundle TopoLVM). Without these reasons there would not be a valid reason to even combine lvmd and topolvm-node.
That being said, and assuming all of this is not valid in the end for the greater open source community, you would still have to remove all of the code that was introduced with embed-lvmd on top of your existing changes, otherwise it will end up as dead code. Same goes for the test code. All of this while not gaining features and even increasing resource consumption again.
I am honestly surprised how you think that your change is beneficial over an implementation approach that is already present, provides the same functionality in a less complex deployment model (because compared to your suggestion, you will have to set up a socket and maintain 2 containers with 2 log streams, while the existing implementation does not need a socket at all), has lower resource overhead, and is already tested.
From what I can see, the only reason you really did this is because:
> [embedded] made troubleshooting more complicated [during testing]
Which is an entirely separate problem and should be tackled within the given code imo.
Thanks for the back and forth. I see the issue and will revise the approach to use embedded rather than a sidecar.
FROM --platform=$TARGETPLATFORM topolvm as topolvm-with-sidecar
FROM --platform=$TARGETPLATFORM alpine:3.19 as topolvm-with-sidecar
RUN apk add --no-cache \
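Based on the commit message above (Alpine 3.19 plus nvme-cli and lvm2), the completed stage presumably looks roughly like this; the COPY line and any packages beyond those two are assumptions, not the exact PR content:

FROM --platform=$TARGETPLATFORM alpine:3.19 as topolvm-with-sidecar
RUN apk add --no-cache \
    lvm2 \
    nvme-cli
# hypothetical: binaries built in the earlier topolvm stage are carried over
COPY --from=topolvm /topolvm-node /topolvm-node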
This will add licensing issues to the released TopoLVM images, as they now include various OS packages that need to be mentioned in the licensing.
While that is possible, I don't think it's a new concern; the only things that are likely net new are lvm2 and nvme-cli. To resolve this and future concerns I would suggest the addition of FOSSA (a free service for OSS apps) to produce an SBOM. You may have specific reasons here I don't understand; happy to discuss and resolve.
Honestly this is a great point. @toshipp I suggest we open up an issue on LICENSE tracking for all of our open source dependencies in TopoLVM so that we remain compliant in the project and questions like these don't arise in PRs. I suggest we do this before we merge here.
We (speaking on behalf of Red Hat) are interested in remaining license compliant for TopoLVM, and we personally do not use the Dockerfile in the open source version because we rebuild with Red Hat maintained versions of packages such as lvm2 (which is also maintained and owned by Red Hat and published as open source).
Until this is cleared up, however, I suggest not moving forward with this change, to make sure no licensing aspects change while we move for more transparency.
I also see that, the way this is being built today, the sidecars from the kube project are embedded into the container, and I don't see a THIRDPARTY file or other SBOM. So I would like to continue with the change, as it's likely slightly better (Alpine being a smaller base image with fewer inclusions), or a net same.
It's a different story if you use third-party container images vs. if you build your own container image, imo. But let's wait for other maintainers' opinions.
#818 for LICENSE discussions. Thanks for creating!
@jakobmoellerdev @ryanfaircloth I'll handle #818.
@@ -24,8 +24,8 @@ spec:
{{- with .Values.node.podLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if and .Values.node.metrics.enabled .Values.node.metrics.annotations }}
annotations: {{ toYaml .Values.node.metrics.annotations | nindent 8 }}
{{- if and .Values.node.csi.metrics.enabled .Values.node.csi.metrics.annotations }}
Breaking change with previous chart versions; needs a major version.
Agree; added ! and BREAKING CHANGE to the commits to ensure this was communicated.
That's nice! But maybe we should think about whether we need to move this CSI section at all? What benefit does it bring, or is it necessary? If not, maybe we can avoid the breaking change altogether?
Might have been over-aggressive, but these options don't apply to the pod, only the CSI container, so I was attempting not to introduce ambiguity with the changes to move lvmd into the pod.
Ah, understood, makes sense to me :). Then let's wait for other maintainers to give feedback on whether this will be accepted.
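For chart users, the rename amounts to this values migration (keys taken from the diff; the annotation content is only an example):

# before
node:
  metrics:
    enabled: true
    annotations:
      prometheus.io/scrape: "true"

# after (this PR)
node:
  csi:
    metrics:
      enabled: true
      annotations:
        prometheus.io/scrape: "true"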
{{- if .Values.node.lvmdEmbedded }}
- name: lvmd-socket-dir
  emptyDir:
    medium: Memory
Nice change, but are you sure this is the only thing requiring hostPID? Just because it starts doesn't mean all block devices are recognized properly.
I did test in my environment, which at the moment is EKS with k8s 1.28.
See #816 (comment)
I incorrectly removed the topolvm target. I missed that the image is published but not used in the helm chart; I had no reason to remove it other than simplification, so I am putting it back.
The current process was building the upstream sidecars with no specific options or modifications. All required arch versions are provided upstream; this change simply uses them as provided. A future change could adopt the upstream containers.
@ryanfaircloth I leave my comments for each commit: 7b33e32 6cdae05 5c9d859 f62a1ac 3cce93d
Thanks, I will keep this open for easy access to the feedback and close it off when I've proposed the smaller PRs for proper discussion.
Individual commits cover the specific changes; at a high level this PR removes the need for hostPID (the container is still privileged).
I really don't expect to see this PR merged, but the feedback is useful, and I will pull out the parts that make sense and submit them as bite-size PRs.