fix: adding imagePullSecret to update job #39032
Conversation
Thank you for looking into this issue and creating both the PR and the list of TODOs.
With the imagePullSecret fixed, and once we make the name of the workdir volume a bit more specific to avoid clashes, I'd consider this PR ready to be merged.
Optional: copying over the init containers could be part of this PR if we think it is low-hanging fruit, or a follow-up. AFAIK the init containers run sequentially, and we could add ours at the end, which should be fine.
There was a new comment in the parent issue about the imagePullPolicy; I'm not sure whether it is really not picked up, and where it would be missing.
I don't really understand how a start-dev would end up in here. If it is via a podTemplate, I'd say it is not supported. If there is no simple way to handle it beyond also replacing start-dev or adding a profile, I'd say failing and then recreating is fine. I would assume that it is failing today already, as the init container wouldn't create the JSON, and the second one then wouldn't be able to read it.
builder.withInitContainers();
builder.withContainers();

// TODO: should the scheduling fields be tweaked
I'd say it would usually be safe to keep them as is. Some people want to run Keycloak only on specific nodes, and could for example also ensure that it runs only on ARM architecture nodes.
As an alternative, we might eventually add separate scheduling options to the upgrade part of the CR. Until then, I think we're good to keep it as is.
> Until then, I think we're good to keep it as is.
I'm fine with that; I just wanted to highlight a behavioral difference between what the logic was doing and what it will now do. Let's expand the comment to explain why keeping the scheduling is ok.
agreed, please expand the comment.
// mix in the existing state
var currentSecrets = builder.buildImagePullSecrets();
current.getSpec().getTemplate().getSpec().getImagePullSecrets().stream()
        .filter(s -> !currentSecrets.contains(s))
        .forEach(builder::addToImagePullSecrets);
// TODO: if the name is the same, but the volume has changed the behavior will be inconsistent
I'd say if we make the volume name a bit more specific we should be good for now.
See above for a suggestion.
This concern is about a difference between the desired and existing state. Let's say in the existing state you have a volume named config pointed at configmap X, and in the desired state you have a volume also named config pointed at configmap Y - we are not accounting for that possibility.
This could happen with more than just the unsupported podTemplate, for example via the cache config configmap.
If we want to account for this we'll need to do a name remapping as part of the merge.
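To illustrate, here is a rough sketch of what such a remapping could look like when mixing the existing volumes into the desired spec. This is not part of the PR; the helper name `mergeVolumesWithRemap` and the `-existing` suffix are made up for illustration, and it assumes the volume lists on both specs are initialized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Stream;

import io.fabric8.kubernetes.api.model.PodSpec;
import io.fabric8.kubernetes.api.model.Volume;

// Hypothetical helper: merge the existing volumes into the desired spec, renaming any
// existing volume whose name clashes with a desired volume that has a different
// definition, and remapping the existing containers' volumeMounts accordingly.
static void mergeVolumesWithRemap(PodSpec desired, PodSpec existing) {
    Map<String, Volume> desiredByName = new HashMap<>();
    desired.getVolumes().forEach(v -> desiredByName.put(v.getName(), v));

    Map<String, String> renames = new HashMap<>();
    for (Volume existingVolume : existing.getVolumes()) {
        Volume desiredVolume = desiredByName.get(existingVolume.getName());
        if (desiredVolume == null) {
            // no clash, just copy it over
            desired.getVolumes().add(existingVolume);
        } else if (!Objects.equals(desiredVolume, existingVolume)) {
            // same name, different content (e.g. a changed cache config configmap)
            String newName = existingVolume.getName() + "-existing";
            renames.put(existingVolume.getName(), newName);
            existingVolume.setName(newName);
            desired.getVolumes().add(existingVolume);
        }
        // identical volume: nothing to do
    }

    // any container carried over from the existing state has to mount the renamed volumes
    Stream.concat(existing.getContainers().stream(), existing.getInitContainers().stream())
            .flatMap(c -> c.getVolumeMounts().stream())
            .filter(m -> renames.containsKey(m.getName()))
            .forEach(m -> m.setName(renames.get(m.getName())));
}
```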
Thank you, I now see your point. I'd say this is in the unsupported space, and within the margin of error when we accept it as-is now.
@ApplicationScoped
public class KeycloakUpdateJobDependentResource extends CRUDKubernetesDependentResource<Job, Keycloak> {

    // shared volume configuration
    private static final String WORK_DIR_VOLUME_NAME = "workdir"; // TODO: could conflict with an existing volume name
Let's try to make it unique. Alternatively we could make it non-static and add a random suffix.
Suggested change:
- private static final String WORK_DIR_VOLUME_NAME = "workdir"; // TODO: could conflict with an existing volume name
+ private static final String WORK_DIR_VOLUME_NAME = "update-job-temporary-workdir";
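If we instead went the non-static route, the random suffix could look roughly like this (purely illustrative; the field name and prefix are made up, and the suggested constant above is the simpler fix):

```java
import java.util.UUID;

// non-static alternative: a per-instance volume name with a short random suffix
private final String workDirVolumeName = "update-job-workdir-" + UUID.randomUUID().toString().substring(0, 8);
```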
downstream.accept(arg);
}).toList();
// TODO: reuse ConfigArgConfigSource parsing
// also start-dev is problematic - at the very least --profile=dev should be added
I wonder how a start-dev would end up here ... again via a podTemplate? I'd say this is not supported when using the Operator?
Yes, the podTemplate is the only way to use start-dev.
If we are ok with letting the job fail to start, then I can just minimize this comment and move on - this in general is a misconfiguration because you'd have to go to great lengths to cluster in dev mode.
I'm ok with letting it fail to start.
// remove things we don't want

// TODO: this may actually not be correct - via the unsupported PodTemplate someone could make configuration dependent upon an init container
builder.withInitContainers();
Copying over the init containers might be a good thing. But as the PodTemplate is not supported, we could do a best effort here.
Reading up on init containers, it seems that they run sequentially, so we could then add ours at the end, roughly as sketched below.
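A minimal sketch of that best effort, assuming we have the existing init containers and the update job's own init container at hand (the helper name `copyInitContainers` is made up):

```java
import java.util.ArrayList;
import java.util.List;

import io.fabric8.kubernetes.api.model.Container;
import io.fabric8.kubernetes.api.model.PodSpecBuilder;

// Hypothetical helper: keep the existing init containers (best effort, podTemplate is
// unsupported) and append the update job's init container last, so it still runs at the
// end of the sequential init phase.
static void copyInitContainers(PodSpecBuilder builder, List<Container> existingInitContainers,
        Container updateJobInitContainer) {
    List<Container> initContainers = new ArrayList<>(existingInitContainers);
    initContainers.add(updateJobInitContainer);
    builder.withInitContainers(initContainers);
}
```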
I don't believe the comment is correct. imagePullPolicy is inherited at the container level. I believe they are just seeing the same issue with imagePullSecrets.
Yes, hopefully we can just let it fail rather than adding more logic to gracefully handle an unsupported situation.
Thank you for the replies, I added some comments above. If we want to be on the safe side, we could prevent rolling updates when there is a podTemplate present; still, I think that would go too far. We could add a warning about podTemplates in general and especially changes to them, and that this might affect the correct working of the upgrade command. Are we ready to wrap up the low-hanging and obvious items (renaming the workdir, extending the comment) in this PR? I would also add the initContainers. Updating the docs to warn about the unsupported podTemplate could be done in a follow-up or in this one. I am not sure how to handle init containers.
As long as you are good with not accounting for volume changes, like a change in the cache config configmap, then I can wrap this up.
I'm good with that.
The comments and volume / mount names have been updated; this should be ready for review.
closes: keycloak#39015

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
Follow-up issue for the docs: #39059
closes: #39015
@ahus1 @pruivo @vmuzikar I believe we have quite a few corner cases yet to address with the auto job logic, beyond just the image pull secrets. I tried to add TODOs where there are additional concerns.
At a high level:
- Usage of the unsupported pod template could cause things to fail or be inaccurate in several ways.
- Starting with the desired state pod spec copies over additional settings like the imagePullSecrets, but we still end up needing merge logic against the existing state, so it doesn't actually save us that much effort. We only have merge logic for the imagePullSecrets and volumes. We may have to review whether anything else is applicable - or what we may need to do to keep this logic in sync with what the main deployment logic is doing.
- Having the auto strategy work with start-dev is problematic because of the profile change - we can account for that (see the sketch after this list), or probably just fail / recreate if someone is using start-dev.
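For reference, accounting for it could look roughly like the following when normalizing the desired command for the update job. This is only a sketch; the helper name `normalizeCommand` is made up, and as discussed above we may simply let the job fail instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: swap start-dev for the regular start command plus the dev
// profile flag, so the update job compares against an equivalent configuration.
static List<String> normalizeCommand(List<String> args) {
    List<String> normalized = new ArrayList<>(args.size());
    for (String arg : args) {
        if ("start-dev".equals(arg)) {
            normalized.add("start");
            normalized.add("--profile=dev");
        } else {
            normalized.add(arg);
        }
    }
    return normalized;
}
```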