8000 Invalidate user cache entries when email or username are different from storage by pedroigor · Pull Request #40256 · keycloak/keycloak · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Invalidate user cache entries when email or username are different from storage #40256

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jun 17, 2025

Conversation

pedroigor
Copy link
Contributor
@pedroigor pedroigor commented Jun 4, 2025

Closes #40085

  • Ensures that entries returned from the cache when querying by username or email continue to reference the up-to-date value from the underlying storage. This is the case when the user storage provider has a cache policy to expire entries eventually, or that disables caching of their federated users.
  • Simplifies how cache policies that rely on a MAX_LIFESPAN by leveraging the existing mechanisms on Infinispan, therefore removing redundant code on the Keycloak side.

@pedroigor pedroigor closed this Jun 4, 2025
@pedroigor pedroigor reopened this Jun 4, 2025
@pedroigor pedroigor marked this pull request as ready for review June 4, 2025 17:55
@pedroigor pedroigor requested a review from a team as a code owner June 4, 2025 17:55
@ahus1 ahus1 self-assigned this Jun 5, 2025
@ahus1
Copy link
Contributor
ahus1 commented Jun 6, 2025

I pushed some changes that should provide proper eviction of the different user related information that honors the caching defined in the provider.

I tested this manually with LDAP switch to non-importing mode.

Also, I think the checking when something was cached is obsolete and can be removed, as either the ISPN behavior was fixed, or just misunderstood when this was written. IMHO there is no need check the cacheTimestamp again.

@pedroigor
Copy link
Contributor Author

Also, I think the checking when something was cached is obsolete and can be removed, as either the ISPN behavior was fixed, or just misunderstood when this was written. IMHO there is no need check the cacheTimestamp again.

I also stumbled upon the assumption that ISPN could not guarantee eviction of entries, and it would be great if it were redundant. I'll look at your changes and the test failures.

@ahus1
Copy link
Contributor
ahus1 commented Jun 9, 2025

@pedroigor - maybe removing the redundant code could be deferred to 26.3, while fixing the current problems at hand could be backported to 26.2. Let's discuss more of this tomorrow in the meeting I scheduled.

@pedroigor
Copy link
Contributor Author
pedroigor commented Jun 9, 2025

@pedroigor - maybe removing the redundant code could be deferred to 26.3, while fixing the current problems at hand could be backported to 26.2. Let's discuss more of this tomorrow in the meeting I scheduled.

Works for me.

Looks like we can not rely on the entry lifespan on ISPN, and the assumption was correct, at least regarding testing it. There are test failures that are related to skipping the code in validateCache based on the lifespan set to the user storage provider cache policy.

I think the failure is due to the fact that calling setTimeOffset during tests won't change the time in ISPN, and the eviction scheduler/logic won't remove the entry.

@pedroigor
Copy link
Contributor Author
pedroigor commented Jun 9, 2025

After talking with @pruivo, I updated the tests to enable/disable the ISPN time service to make sure entries are evicted in ISPN as expected when changing the time offset during tests.

There are failures in tests with a weird assumption that realm resources instances (e.g: UserModel) from a KeycloakSession will eventually return the adapter instance from the user storage during searches. AFAIK, it should always expect an adapter from the cache layer or null?

@ahus1
Copy link
Contributor
ahus1 commented Jun 10, 2025

There are failures in tests with a weird assumption that realm resources instances (e.g: UserModel) from a KeycloakSession will eventually return the adapter instance from the user storage during searches. AFAIK, it should always expect an adapter from the cache layer or null?

I've seen some cases where caching was disabled completely, then there wouldn't be an adapter returned. Let's have a look later today to see if this is the case in your setup.

@ahus1
Copy link
Contributor
ahus1 commented Jun 10, 2025

@pedroigor - I fixed the tests, and then restored the original logic as much as possible to avoid breaking any existing functionality.

When testing this, I found problems with dailyTimeout() as it would return the daily time was before the current time, which is IMHO a algorithm, and not in line with dailyEvictionBoundary().

Once the tests are green, I would suggest to backport the PR ti 26.2.x.

For 26.3, I would like to deprecate CacheableStoreageProviderModel#shouldInvalidate and also deprecate the methods RealmCacheSession#validateCache and UserCacheSession#validateCache. Possible together with CachedModel#getCacheTimestamp and CachedObject#getCacheTimestamp.

Copy link
@keycloak-github-bot keycloak-github-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreported flaky test detected, please review

@keycloak-github-bot
Copy link

Unreported flaky test detected

If the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR.

org.keycloak.testsuite.cluster.ComponentInvalidationClusterTest#crudWithFailover

Keycloak CI - Clustering IT

java.lang.RuntimeException: java.lang.IllegalStateException: Keycloak unexpectedly died :(
	at org.keycloak.testsuite.arquillian.containers.KeycloakQuarkusServerDeployableContainer.start(KeycloakQuarkusServerDeployableContainer.java:71)
	at org.jboss.arquillian.container.impl.ContainerImpl.start(ContainerImpl.java:185)
	at org.jboss.arquillian.container.impl.client.container.ContainerLifecycleController$8.perform(ContainerLifecycleController.java:137)
	at org.jboss.arquillian.container.impl.client.container.ContainerLifecycleController$8.perform(ContainerLifecycleController.java:133)
...

Report flaky test

Copy link
@keycloak-github-bot keycloak-github-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreported flaky test detected, please review

8000
@keycloak-github-bot
Copy link

Unreported flaky test detected

If the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR.

org.keycloak.testsuite.cluster.ComponentInvalidationClusterTest#crudWithFailover

Keycloak CI - Clustering IT

java.lang.RuntimeException: java.lang.IllegalStateException: Keycloak unexpectedly died :(
	at org.keycloak.testsuite.arquillian.containers.KeycloakQuarkusServerDeployableContainer.start(KeycloakQuarkusServerDeployableContainer.java:71)
	at org.jboss.arquillian.container.impl.ContainerImpl.start(ContainerImpl.java:185)
	at org.jboss.arquillian.container.impl.client.container.ContainerLifecycleController$8.perform(ContainerLifecycleController.java:137)
	at org.jboss.arquillian.container.impl.client.container.ContainerLifecycleController$8.perform(ContainerLifecycleController.java:133)
...

Report flaky test

@ahus1
Copy link
Contributor
ahus1 commented Jun 11, 2025

@pedroigor - I think I fixed the one test that remained failing, please review my latest commit.

return importValidation(realm, user);
user = importValidation(realm, user);
// Case when username was changed directly in the userStorage and doesn't correspond anymore to the username from local DB
if (user != null && username.equalsIgnoreCase(user.getUsername())) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we assume this check is done by https://www.keycloak.org/docs/26.2.5/server_development/#imp 8000 orteduservalidation-interface.

The importValidation will return the same model instance if your provider is not implementing it. In this case, you will eventually return a stale model, potentially different than the username from the provider.

But I guess our recommendation is to implement the ImportedUserValidation interface for user storage providers if the underlying store is writable and you want consistency?

Copy link
Contributor
@ahus1 ahus1 Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change your are annotating is a change you did :D ... I'm good with this change as it reads to me as follows: Validate the given user. If the user doesn't exist any more (null returned), or if the username that is part of the search has changed (second part of the if), then do not return directly, but try all the other stores to find a (new) possible match.

UPDATE: I think this is better than before as returning null from an invalidation and not trying to re-do the lookup is worse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this change now :)

The federated user is linked with a specific provider, and in this case, the importValidation returning null indicates the provider no longer recognizes the user, and the account will be deleted from the local database. There is no need to run the lookup. The account will be eventually imported from a different storage, if the case.

If the provider is a ImportedUserValidation and the username does not match the value of the search, it is perhaps safer to also remove the user from the local database? Or we just remove this check and rely on the provider to run this check as suggested by the documentation?

I'll review this code.

@pedroigor pedroigor force-pushed the issue-40085 branch 4 times, most recently from 6011964 to 22b01b7 Compare June 17, 2025 12:42
pedroigor and others added 12 commits June 17, 2025 21:52
…om storage

Closes keycloak#40085

Signed-off-by: Pedro Igor <pigor.craveiro@gmail.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Signed-off-by: Pedro Igor <pigor.craveiro@gmail.com>
…n provider

Signed-off-by: Pedro Igor <pigor.craveiro@gmail.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
…r/UserSyncTest.java

Co-authored-by: Pedro Igor <pigor.craveiro@gmail.com>
Signed-off-by: Alexander Schwartz <alexander.schwartz@gmx.net>
Signed-off-by: Pedro Igor <pigor.craveiro@gmail.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
@ahus1 ahus1 requested a review from a team as a code owner June 17, 2025 19:55
@ahus1 ahus1 removed the request for review from a team June 17, 2025 19:59
@ahus1 ahus1 enabled auto-merge (squash) June 17, 2025 20:02
@ahus1 ahus1 disabled auto-merge June 17, 2025 20:02
@ahus1 ahus1 enabled auto-merge (squash) June 17, 2025 20:03
Copy link
Contributor
@ahus1 ahus1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this extensive change, and all the added tests. I reviewed the tests, and also did a manual test locally where the username of a user changed several times. Appoving.

@ahus1 ahus1 merged commit 0188d27 into keycloak:main Jun 17, 2025
76 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Federated user IDs are not correctly evicted from cache
2 participants
0