8000 fix: excluding update commands from reaug by shawkins · Pull Request #38899 · keycloak/keycloak · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix: excluding update commands from reaug #38899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2025
Merged

Conversation

shawkins
Copy link
Contributor

closes: #38662

@ahus1 @pruivo @vmuzikar to align better with some of what we talked about in the meeting, the refactoring I want to do with #38514, and the thoughts on #38894, I would like to skip reaug, and also not use the persisted properties when --optimized is not specified.

While this introduces yet another command behavior that is distinct from our other behaviors, it is more inline with our intuition about how the command should function.

@pruivo
Copy link
Contributor
pruivo commented Apr 14, 2025

@shawkins let me refresh my memory for a second by asking

  • If --optimized is set and the build time configuration is different, does the update-compatibility fail?
  • If --optimized is not set, it ignores the persisted configuration. It is similar to start behaviour, which discards the persisted configuration to rebuild, right?

@shawkins
Copy link
Contributor Author

If --optimized is set and the build time configuration is different, does the update-compatibility fail?

Yes, it will fail.

If --optimized is not set, it ignores the persisted configuration. It is similar to start behaviour, which discards the persisted configuration to rebuild, right?

Like start, but we don't alter the persisted properties / augmented state.

Copy link
Contributor
@pruivo pruivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shawkins. It sounds goot to me 👍

@vmuzikar vmuzikar self-requested a review April 14, 2025 13:17
Copy link
Contributor
@ahus1 ahus1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally, and to me the new behavior as observed on the command line is how I expect it to work. Thank you very much, I like it!

I got a question about the note you added, pleaese see below.

Comment on lines 19 to 20
NOTE: make sure the build-time state of your `update-compatibility` command matches the build-time state of the corresponding `build` or `start` command you are checking the compatibility of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have this paragraph:

[WARNING]
====
Ensure that all configuration options, whether set via environment variables or CLI arguments, are included when running the above command.
Omitting any configuration options results in incomplete metadata, and could lead to a wrong reported result in the next step.
====

Adding it here seems to be a bit out-of-context here for me. Is there something missing below that you want to add?

BTW, "build-time state" might not be something people have a shared understand what it is, and how to ensure it. If the statement needs extension that we use below, can it be shifted in how to ensure the build-state matches?

Copy link
Contributor Author
@shawkins shawkins Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just omit the new note. I had added something based upon the reaugmentation possibility, then refined it for the new logic - but should have removed it completely.

I've updated the pr.

ahus1
ahus1 previously approved these changes Apr 17, 2025
Copy link
Contributor
@ahus1 ahus1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @shawkins - ready to be merged from my side.

I see that @vmuzikar added himself, so not merging it yet.

Copy link
Contributor
@vmuzikar vmuzikar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawkins Sorry for the late reply and thank you for the PR!

I was playing with the scenario @pruivo mentioned:

If --optimized is set and the build time configuration is different, does the update-compatibility fail?

I noticed it truly fails but it fails, however due to --optimized it fails with rather odd message, e.g.:

Build time option: '--db' not usable with metadata

But we can address that as a follow-up.

Other thing I was wondering is if there are any side effects that we skip the reaug but still basically run the server for update-compatibility. What I have in mind is what if someone stops the server, changes some custom providers and runs update-compatibility. Then we would run it in an incorrect (outdated) state, no?

@ahus1
Copy link
Contributor
ahus1 commented Apr 23, 2025

What I have in mind is what if someone stops the server, changes some custom providers and runs update-compatibility. Then we would run it in an incorrect (outdated) state, no?

@vmuzikar - I think this is outside of the documented use, and therefore not supported. So I'm ok to merge it as is.

@shawkins
Copy link
Contributor Author
shawkins commented Apr 23, 2025

Build time option: '--db' not usable with metadata

metadata is just the command name, but I agree that starts to loose context for our sub-commands. It also would be clearer if it also referenced --optimized

As you say this could be done in a different issue, because this is the message we currently display for any command, not just the update ones.

Other thing I was wondering is if there are any side effects that we skip the reaug but still basically run the server for update-compatibility.

We don't run the server for update compatibility - it's all handled on the picocli side of things and exits before there is a quarkus launch.

What I have in mind is what if someone stops the server, changes some custom providers and runs update-compatibility. Then we would run it in an incorrect (outdated) state, no?

@ahus1 @vmuzikar the concern here is the user has effectively put 8000 the classloading index into a bad state wrt CompatibilityMetadataProvider classes. There are three cases:

If you have changed a CompatibilityMetadataProvider class, things may still work as expected.
If you have removed a CompatibilityMetadataProvider class, then you'll see a class loading error.
If you have added a CompatibilityMetadataProvider class, then that won't contribute to the update compatibilty check.

So if CompatibilityMetadataProvider classes are user contributable, then this is a valid concern - albeit one that only happens in a narrow circumstance. If we were to run the update commands in dev mode with the QuarkusClassLoader instead of the RunnerClassLoader, I think we could address this - but that would require some handling in the kc.sh/bat scripts.

For the operator the only circumstance this could happen relies upon the unsupported pod template:

  • user users the unsupported PodTemplate to turn the providers directory into a mount such that provider jars can be changed
  • user marks the cr as optimized=false

@ahus1
Copy link
Contributor
ahus1 commented Apr 23, 2025

For the operator the only circumstance this could happen relies upon the unsupported pod template:

  • user creates an optimized image
  • user users the unsupported PodTemplate to turn the providers directory into a mount such that provider jars can be changed
  • user marks the cr as optimized=false

If the user marks the CR as non-optimized, the update-compatibility command would also run as non-optimized, as would the start command. How would that lead to results where the start command sees some different configuration from the start command?

@shawkins
Copy link
Contributor Author
shawkins commented Apr 23, 2025

How would that lead to results where the start command sees some different configuration from the start command?

The update command without re-augmentation can only see the classes in the provider directory up to the last time a augmentation was run. If you use a default image and mount in some provider jars for example - the start command will pick those up, but the update command will not.

@pruivo
Copy link
Contributor
pruivo commented Apr 23, 2025

The update command without re-augmentation can only see the classes in the provider directory up to the last time a augmentation was run

@shawkins are you limiting/configuring the classloader during reaug? If the jar is in the classpath, it should be visible without reaug.

@shawkins
Copy link
Contributor Author
shawkins commented Apr 23, 2025

@shawkins are you limiting/configuring the classloader during reaug? If the jar is in the classpath, it should be visible without reaug.

The quarkus RunnerClassLoader supercedes normal classpath handling - https://quarkus.io/guides/maven-tooling#quarkus-core_quarkus-package-jar-user-providers-directory

EDIT: looking at the RunnerClassLoader impl it does eventually delegate to the parent classloader. So if we added the provider jars to the classpath built by kc.sh / kc.bat the behavior could be:

If you have changed a CompatibilityMetadataProvider class, things may still work as expected.
If you have removed a CompatibilityMetadataProvider class, then you'll see a class loading error - because the entry still exists in the index.
If you have added a CompatibilityMetadataProvider class, then that should get picked up (still need to cofirm that).

Not sure about the memory / lookup cost of adding additional classpath for the RunnerClassLoader to fall back to.

@ahus1
Copy link
Contributor
ahus1 commented Apr 23, 2025

(invalid)

@ahus1 ahus1 dismissed their stale review April 23, 2025 16:55

Some new questions aroused when testing

@shawkins
Copy link
Contributor Author
shawkins commented Apr 23, 2025

I gave it a try with this PR branch, and the following would detect the changed provider and would update the reaug after a message "Changes detected in configuration. Updating the server image." and will recognize the new provider:

@ahus1 That doesn't appear to have been using this branch. The Picocli.shouldSkipRebuild check will return true, and we'll simply exit rather than checking / performing a reaugmentation.

The behavior you are describing is consistent with the current state of main - that the update command will reaugment when needed.

8000

@shawkins
Copy link
Contributor Author

Since the classloading workarounds are not fully effective:

  • dev mode knows about changes to provider jars, it does not currently detect new provider jars
  • adding classpath entries may introduce some overhead generally and still doesn't help with the case where a CompatibilityMetadataProvider is removed - but used to have, and I believe still do, places where we effectively catch class not found exceptions due in situations like this. I'm not sure how this would look with a ServiceLoader.

I think we can just go back to the previous conculsion and let the update commands reaugment, and just add a small tweak to the docs letting users to be careful when not using --optimized like with the import / export commands.

@ahus1
Copy link
Contributor
ahus1 commented Apr 23, 2025

The behavior you are describing is consistent with the current state of main - that the update command will reaugment when needed.

OK, I stand corrected, apparently I fumbled with the checkout of the correct branch.

I now see that the following does not pick up the provider as expected as you described:

bin/kc.sh build
cp compat-provider-1.0-SNAPSHOT.jar providers/
bin/kc.sh update-compatibility metadata --file=/tmp/file.json

And while I think that this wouldn't happen in an Operator as long as you don't use a podTemplate, this might happen in other situations where you have a custom image build pipeline where you at some point optimize the image, and the later run it in a non-optimized way because you just added a provider or DB driver.

There is another case where you copy providers to an image that was never optimized, where the command would never pick up the provider. And such an image would be perfectly find to be used as a non-optimized image with the Operator:

# freshly unzipped Keycloak
cp compat-provider-1.0-SNAPSHOT.jar providers/
bin/kc.sh update-compatibility metadata --file=/tmp/file.json

Given that, I would rather stick with the existing behavior, than opening this corner case.

Sorry for changing my mind again, and possibly adding more work. If you decide to abandon or park this efforts, I would understand that. Let me know how you want to proceed.

@ahus1
Copy link
Contributor
ahus1 commented Apr 23, 2025

I think we can just go back to the previous conculsion and let the update commands reaugment, and just add a small tweak to the docs letting users to be careful when not using --optimized like with the import / export commands.

+1 for that

@vmuzikar
Copy link
Contributor

Ouch, sorry for sparking another round of discussions around the approach here. :D

I think we can just go back to the previous conculsion and let the update commands reaugment, and just add a small tweak to the docs letting users to be careful when not using --optimized like with the import / export commands.

+1

Just to add, it's not just about CompatibilityMetadataProvider as it could use some other custom providers as well, so it's even more generic problem.

@shawkins
Copy link
Contributor Author

Just to add, it's not just about CompatibilityMetadataProvider as it could use some other custom providers as well, so it's even more generic problem.

For the update commands it is specifically about CompatibilityMetadataProvider. Providers in general are not initialized.

@shawkins
Copy link
Contributor Author

@ahus1 @pruivo (and @vmuzikar) looking more at CompatibilityMetadataProvider - it is in the private spi currently, are there plans to make it public?

If so, it does lack an init method to provide it with a stateful way to get the configuration - it would have to rely on static methods on classes like Config, Profile, and Version, which are not well designed for this purpose.

@shawkins shawkins marked this pull request as draft April 24, 2025 13:38
@ahus1
Copy link
Contributor
ahus1 commented Apr 24, 2025

@shawkins - yes, eventually we want to make it public, and before we do so we can change the API as needed.

We need to solve the following problem here: We don't want to initialize the regular providers via the Keycloak session factory, as doing so would establish connections to the database, Infinispan and maybe even migrate the database.

So this SPI is independent from the KC session factory. This also prevents us from injecting a provider specific scoped configuration to those CompatibilityMetadataProviders. This is why they don't have an init() method.

As a fallback and good-enough-for-now, we're using Config, Profile, and Version to directly access the configuration. This way, an Infinispan related CompatibilityMetadataProvider might look at the configurations of other Infinispan providers and make sense out of that configuration. Depending on how much configuration we then end up using, we might inject an unscoped configuration.

At the moment we think we're only looking into very few profiles and versions, and the number of configurations is minimal. Let's see how it evolves, and we'll keep an eye on it when we implement #38862 - cc @ryanemerson

@shawkins
Copy link
Contributor Author

So this SPI is independent from the KC session factory. This also prevents us from injecting a provider specific scoped configuration to those CompatibilityMetadataProviders. This is why they don't have an init() method.

I'm not suggesting that it should be scoped in the same sense as providers, just that it should be stateful.

We're discussing here what access to the "root" or non-scoped configuration would look like from the existing Scope interface. The current thinking is that the root would still just be a Scope (although in that draft pr it's actually specific to kc. properties). If you have any additional thoughts about how this may align to the update metadata needs (such as access to quarkus or even smallrye specific properties) please add a comment there.

@ahus1
Copy link
Contributor
ahus1 commented Apr 24, 2025

I'm not suggesting that it should be scoped in the same sense as providers, just that it should be stateful.

IMHO it would make this simpler for testing, though I didn't see a reasoning in this or the other PR of the expected benefits. While we seems to agree, it would be good to write them down in an issue eventually.

@shawkins
Copy link
Contributor Author
shawkins commented Apr 28, 2025

Updated to just be a docs change. The other thought is that when --optimized is specified, then we should not use the current augmentation if the persisted properties were auto-created. I'm not sure if that would be seen as a breaking change, but it would remove the need for this warning.

cc @ahus1 @pruivo @vmuzikar

@shawkins shawkins requested review from vmuzikar, ahus1 and pruivo April 28, 2025 13:43
ahus1
ahus1 previously approved these changes Apr 28, 2025
@ahus1 ahus1 self-assigned this Apr 28, 2025
@@ -86,6 +86,8 @@ If you are upgrading to a new {project_name} version, this command must be execu
Failure to meet these requirements results in an incorrect outcome.
====

NOTE: If you do not use `--optimized` keep in mind that an `update` command may implicitly create or update an optimized build for you - if you are running the command from the same machine as a server instance, this may impact the next start of your server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this note also relevant to the metadata command?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's applicable to both update metadata and update check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I'd move this note a bit higher, below the first warning box in Determining the update strategy for an updated configuration chapter. The way it currently is it implies the note is relevant only to the check command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vmuzikar just to double check, you are in favor of proceeding with these warnings instead of changing the --optimized logic to not use auto builds correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, having it just as a docs change is fine from my perspective. My suggestion in this thread is just moving the note/warning to a different place in the guide so it's clear it applies both to metadata and check commands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, moved where the warning is at.

also correcting the language across the warnings

closes: keycloak#38662

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
Copy link
Contributor
@vmuzikar vmuzikar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawkins LGTM, thank you.

Considering @ahus1 and @pruivo approved previously and the gist of it remains the same, I'm merging the PR.

@vmuzikar vmuzikar merged commit 3e05f67 into keycloak:main May 5, 2025
54 checks passed
InJoDave pushed a commit to InJoDave/keycloak that referenced this pull request May 6, 2025
also correcting the language across the warnings

closes: keycloak#38662

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
shawkins added a commit to shawkins/keycloak that referenced this pull request May 7, 2025
also correcting the language across the warnings

closes: keycloak#38662

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update commands trigger build checks
4 participants
0