8000 duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3" · Issue #39173 · keycloak/keycloak · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3" #39173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 of 2 tasks
RadekSalay opened this issue Apr 24, 2025 · 10 comments · Fixed by #39344
Closed
1 of 2 tasks

duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3" #39173

RadekSalay opened this issue Apr 24, 2025 · 10 comments · Fixed by #39344
Assignees
Labels

Comments

@RadekSalay
Copy link
RadekSalay commented Apr 24, 2025

Before reporting an issue

  • I have read and understood the above terms for submitting issues, and I understand that my issue may be closed without action if I do not follow them.

Area

infinispan

Describe the bug

Hi, not very often we find an error in database table app_keycloak.OFFLINE_CLIENT_SESSION.
Full error is
Batch entry 0 insert into app_keycloak.OFFLINE_CLIENT_SESSION (DATA,TIMESTAMP,VERSION,CLIENT_ID,CLIENT_STORAGE_PROVIDER,EXTERNAL_CLIENT_ID,OFFLINE_FLAG,USER_SESSION_ID) values ((.........)) was aborted: ERROR: duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3"

I think the problem is that multiple instances of my Keycloak want to write offline_session to the database. Those who try to write later get an error from the database that such a session already exists. Which instance should write to the database should be controlled by the Infinispan cache. Which works correctly most of the time, because this error is only occasional.

Our Keycloak instances run in AKS. We use dns.DNS_PING for discovery with headless-service.

  - KC_CACHE_STACK=kubernetes
  - KC_CACHE=ispn
  - KEYCLOAK_STATISTICS=db,http,jgroups
  - JGROUPS_DISCOVERY_PROTOCOL=dns.DNS_PING
  - JGROUPS_TRANSPORT_STACK=tcp
  - JGROUPS_DISCOVERY_PROPERTIES=dns_query=keycloak-headless
apiVersion: v1
kind: Service
metadata:
  name: keycloak-headless
spec:
  publishNotReadyAddresses: true
  clusterIP: None
  ports:
    - name: ping
      port: 7800
      targetPort: 7800
      protocol: TCP
  selector:
    app: ew-keycloak

I try adding mode=SYNC and changing the number of owners in cache-ispn.xml for distributed-cache of offlineSessions, but the error still prevails.

In the logs of Jgroup I can see that pods successfully see each other.

Version

26.1.1

Regression

  • The issue is a regression

Expected behavior

Always only one instance of Keycloak will write offline sessions, so there will be no conflict in the database.

Actual behavior

Multiple instances of Keycloak try to write offline_session and only the first one is successful others end with error on duplicate keys.

How to Reproduce?

  • Run Keycloak in AKS cluster with more than one instance.
  • Use DNS_PING for Jgroup discovery.
  • Obtain an offline session.

Anything else?

No response

@EWrejchmar
Copy link

yep, that's pretty annoying issue, check it please 👍

@pedroigor
Copy link
Contributor

@ahus1 Do you mind helping with this one?

@ahus1
Copy link
Contributor
ahus1 commented Apr 25, 2025

@RadekSalay - perform the following steps

  • update to the latest release of Keycloak (at least the latest 26.1.x release) and see if the problem persists

In addition, some more information is needed to analyze this:

  • provide a full stack trace form the log for the exception. Also any follow-up errors as I assume this error is later propagated upwards in the call hierachy.
  • leave your ISPN configuration unchanged (with persistent sessions enabled, you must not change the number of owners for the user and client session caches)
  • do your users log in with username and password, or are there additional screens? witch is the last screen (is it maybe an OTP screen)?
  • if you customized your login theme, verify that it has a to prevent double submits
  • what login flow are you using for the offline sessions (is it a regular auth code flow, or something different?)
  • are you using a new session, or are your users already logged in for another client when you are creating the offline token? There have been some changes in 26.1 around the handling of offline sessions, and this information could be essential to narrow it down.
  • as you have metrics enabled, do you see any 500 errors in any of the login endpoints? Can you name endpoint that is being called?

@keycloak-github-bot
Copy link

Thanks for reporting this issue, but as this is reported against an older and unsupported release we are not able to evaluate the issue. Please verify with the nightly build or the latest release.

If the issue can be reproduced in the nightly build or latest release add a comment with additional information, otherwise this issue will be automatically closed within 14 days.

@ahus1
Copy link
Contributor
ahus1 commented Apr 25, 2025

@RadekSalay - please also provide the complete list of start-up parameters for Keycloak, including the database parameters and the version of the database you are using. Thanks!

@RadekSalay
Copy link
Author

Hi,

  1. Update on latest release 26.1.x. Yes, we are currently on 26.1.1
  2. Full stack trace
java.sql.BatchUpdateException: Batch entry 0 insert into app_keycloak.OFFLINE_CLIENT_SESSION (DATA,TIMESTAMP,VERSION,CLIENT_ID,CLIENT_STORAGE_PROVIDER,EXTERNAL_CLIENT_ID,OFFLINE_FLAG,USER_SESSION_ID) values (('{"authMethod":"openid-connect","notes":{"clientId":"b8ca094d-d5be-4bb4-a0b4-984e0660afb5","userSessionStartedAt":"1745413831","iss":"https://****","startedAt":"1745415181","level-of-authentication":"-1"}}'),('1745415181'::int4),('0'::int4),('b8ca094d-d5be-4bb4-a0b4-984e0660afb5'),('local'),('local'),('0'),('303f4597-5860-4eda-8d42-156b90c69be8')) was aborted: ERROR: duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3"
  Detail: Key (user_session_id, client_id, client_storage_provider, external_client_id, offline_flag)=(303f4597-5860-4eda-8d42-156b90c69be8, b8ca094d-d5be-4bb4-a0b4-984e0660afb5, local, local, 0) already exists.  Call getNextException to see other errors in the batch.
	at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2421)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:580)
	at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:888)
	at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:912)
	at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1739)
	at io.agroal.pool.wrapper.StatementWrapper.executeBatch(StatementWrapper.java:340)
	at org.hibernate.engine.jdbc.batch.internal.BatchImpl.lambda$performExecution$2(BatchImpl.java:279)
	at org.hibernate.engine.jdbc.mutation.internal.PreparedStatementGroupSingleTable.forEachStatement(PreparedStatementGroupSingleTable.java:67)
	at org.hibernate.engine.jdbc.batch.internal.BatchImpl.performExecution(BatchImpl.java:264)
	at org.hibernate.engine.jdbc.batch.internal.BatchImpl.execute(BatchImpl.java:242)
	at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.executeBatch(JdbcCoordinatorImpl.java:188)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:674)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:511)
	at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:414)
	at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:41)
	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:127)
	at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1429)
	at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:491)
	at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:2354)
	at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:1978)
	at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:439)
	at org.hiber
8000
nate.resource.transaction.backend.jta.internal.JtaTransactionCoordinatorImpl.beforeCompletion(JtaTransactionCoordinatorImpl.java:336)
	at org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorNonTrackingImpl.beforeCompletion(SynchronizationCallbackCoordinatorNonTrackingImpl.java:47)
	at org.hibernate.resource.transaction.backend.jta.internal.synchronization.RegisteredSynchronization.beforeCompletion(RegisteredSynchronization.java:37)
	at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:52)
	at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:348)
	at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:66)
	at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:135)
	at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1290)
	at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:104)
	at io.quarkus.narayana.jta.runtime.NotifyingTransactionManager.commit(NotifyingTransactionManager.java:70)
	at org.keycloak.transaction.JtaTransactionWrapper.commit(JtaTransactionWrapper.java:102)
	at org.keycloak.services.DefaultKeycloakTransactionManager.lambda$commitWithTracing$0(DefaultKeycloakTransactionManager.java:169)
	at org.keycloak.tracing.NoopTracingProvider.trace(NoopTracingProvider.java:59)
	at org.keycloak.tracing.NoopTracingProvider.trace(NoopTracingProvider.java:69)
	at org.keycloak.services.DefaultKeycloakTransactionManager.commitWithTracing(DefaultKeycloakTransactionManager.java:168)
	at org.keycloak.services.DefaultKeycloakTransactionManager.commit(DefaultKeycloakTransactionManager.java:136)
	at org.keycloak.services.DefaultKeycloakSession.closeTransactionManager(DefaultKeycloakSession.java:396)
	at org.keycloak.services.DefaultKeycloakSession.close(DefaultKeycloakSession.java:361)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithResult(KeycloakModelUtils.java:463)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:336)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:326)
	at org.keycloak.models.sessions.infinispan.changes.PersistentSessionsWorker$BatchWorker.lambda$process$5(PersistentSessionsWorker.java:110)
	at org.keycloak.common.util.Retry.executeWithBackoff(Retry.java:102)
	at org.keycloak.common.util.Retry.executeWithBackoff(Retry.java:93)
	at org.keycloak.models.sessions.infinispan.changes.PersistentSessionsWorker$BatchWorker.lambda$process$7(PersistentSessionsWorker.java:107)
	at org.keycloak.models.utils.KeycloakModelUtils.lambda$runJobInTransaction$1(KeycloakModelUtils.java:337)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithResult(KeycloakModelUtils.java:456)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:336)
	at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:326)
	at org.keycloak.models.sessions.infinispan.changes.PersistentSessionsWorker$BatchWorker.process(PersistentSessionsWorker.java:91)
	at org.keycloak.models.sessions.infinispan.changes.PersistentSessionsWorker$BatchWorker.run(PersistentSessionsWorker.java:76)
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "constraint_offl_cl_ses_pk3"
  Detail: Key (user_session_id, client_id, client_storage_provider, external_client_id, offline_flag)=(303f4597-5860-4eda-8d42-156b90c69be8, b8ca094d-d5be-4bb4-a0b4-984e0660afb5, local, local, 0) already exists.
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2733)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2420)
	... 51 more

  1. We currently don't change our ispn-config.xml

  2. Users log in with a username and password. We use a customised KC login page via Keycloakify. Yes last page in our authentication form for offline session is the insertion of OTP code.

  3. In all our login screens, we use

  4. With an offline session, we use two flows. Standard flow and Direct access grants. But most users use standard flow.

  5. are you using a new session, or are your users already logged in for another client when you are creating the offline token?
    The second choice.
    Users are mostly already logged in, with an online session on the web. Then they ask for an offline session on the mobile devices.

  6. We don't record any 500 on authentication endpoints associated with this error.

Start up parameters:

KC_FEATURES=token-exchange,admin-fine-grained-authz
JDBC_PARAMS= sslmode=require
PROXY_ADDRESS_FORWARDING="true"
JDK_JAVA_OPTIONS=-XX:InitialRAMPercentage=60.0 -XX:MaxRAMPercentage=80.0
TRUSTSTORE_LOCATION=/tmp/truststore.jks
KC_LOG=console
KC_LOG_LEVEL=*****
KC_LOG_CONSOLE_COLOR=true
KC_LOG_CONSOLE_FORMAT=%d{HH:mm:ss.SSS} [%t] %-5p - %m%n
KC_DB=postgres
KC_DB_SCHEMA=*****
KC_HOSTNAME_DEBUG=true
KC_HTTP_ENABLED=true
KC_PROXY_HEADERS=xforwarded
KC_HTTP_RELATIVE_PATH=/auth/
KC_METRICS_ENABLED=true
KC_HEALTH_ENABLED=true
 TODO: KC v26 removed!
KC_LEGACY_OBSERVABILITY_INTERFACE=true
KC_CACHE_STACK=kubernetes
KC_CACHE=ispn
KEYCLOAK_STATISTICS=db,http,jgroups
JGROUPS_DISCOVERY_PROTOCOL=dns.DNS_PING
JGROUPS_TRANSPORT_STACK=tcp
JGROUPS_DISCOVERY_PROPERTIES=dns_query=keycloak-headless
KAFKA_USERNAME=*****
KC_SPI_BRUTE_FORCE_PROTECTOR_DEFAULT_BRUTE_FORCE_DETECTOR_ALLOW_CONCURRENT_REQUESTS=true
MS_AUTH_SERVER_URL=*****
KC_TRACING_ENABLED=true
KC_TRACING_ENDPOINT=*****
KC_TRACING_PROTOCOL=http/protobuf
KC_TRACING_SERVICE_NAME=ciam/keycloak
KC_TRACING_JDBC_ENABLED=true

KC_DB_URL="jdbc:postgresql://*****/app_keycloak_prod?&sslmode=verify-ca&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory&ApplicationName=ew-keycloak-prod&currentSchema=app_keycloak"
KC_DB_USERNAME=*****
KC_DB_SCHEMA=app_keycloak
KC_HOSTNAME=*****
JAVA_OPTS_APPEND=-Djgroups.dns.query=keycloak-headless.sso-keycloak-prod.svc.cluster.local

@ahus1
Copy link
Contributor
ahus1 commented Apr 29, 2025

@RadekSalay - you say you updated to 26.1.1, but the latest version of 26.1.x is 26.1.5. Please update to 26.1.5, or even better update to 26.2.1 which is the latest version of the 26.2.x release branch to reproduce the problem.

In the example you showed, it states offline_flag = 0, so this is not about the creation of an offline session (despite the table name, sorry for not yet having it renamed yet), but about an online session.

I debugged the code in Keycloak 26.2, and the results should be similar in KC 26.1. It is possible if you are already logged in, and then add another client session concurrently in two separate tabs of the same browser, where the requests are executed in two different Keycloak instances, that both of them try to add the client session to the one user session. If this then fails for one of the two calls due to the database constraint, it will repeat the attempt which will then succeed, and therefore you don't see a 500 in the logs and the user also sees no error.

So I consider this a log message that is unfortunately on the ERROR level, but then later recovered without a user impact. Assuming these are calls from a browser with a public client, you might be able to mitigate this by having sticky sessions in your loadbalancer where all requests from the same browser end on the same instance of Keycloak, which would then prevent a concurrent execution, but would run them consecutively thereby preventing the error.
Such concurrent requests could occur if a user for example opens a browser where all the tabs of a previous session become active at the same time.

Depending on the event logging you might have set up, you should be able to retrieve different possibly concurrent events connected to the user session 303f4597-5860-4eda-8d42-156b90c69be8 that is also stated in the error.

I'll prepare a change to prevent logging the error message if it is later recovered by disabling logging for org.hibernate.engine.jdbc.spi.SqlExceptionHelper. There will still be a warning by com.arjuna.ats.arjuna with the message ARJUNA012125: TwoPhaseCoordinator.beforeCompletion. With that being a warning and not an error, I hope this still mitigates most of the issue. As that class is logging also other messages, I am hesitant to disable logging for that class as well.

Please let me know your thoughts on this.

UPDATE: I think I can also filter out the warning ... let's see what the others say about my suggestion in the PR that is now connected to this issue.

ahus1 added a commit to ahus1/keycloak that referenced this issue Apr 29, 2025
… are handled

Closes keycloak#39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
@ahus1 ahus1 self-assigned this Apr 29, 2025
ahus1 added a commit to ahus1/keycloak that referenced this issue Apr 30, 2025
… are handled

Closes keycloak#39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
@RadekSalay
Copy link
Author
RadekSalay commented Apr 30, 2025

Hi, @ahus1
Thank you for your investigation and the newly created pull request. I'll schedule the update to version 26.2.x, my apologies for the confusion regarding 26.1.x.
Adding sticky sessions is a good idea. I’ll include it in our roadmap.
I'll try to examine the logs to see how the whole session behaved 303f4597-5860-4eda-8d42-156b90c69be8.

@ahus1 ahus1 marked this as a duplicate of #39354 May 5, 2025
mhajas pushed a commit that referenced this issue May 6, 2025
… are handled (#39344)

Closes #39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
ahus1 added a commit to ahus1/keycloak that referenced this issue May 6, 2025
… are handled (keycloak#39344)

Closes keycloak#39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
ahus1 added a commit to ahus1/keycloak that referenced this issue May 6, 2025
… are handled

Closes keycloak#39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
ahus1 added a commit that referenced this issue May 6, 2025
… are handled

Closes #39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
shawkins pushed a commit to shawkins/keycloak that referenced this issue May 7, 2025
… are handled (keycloak#39344)

Closes keycloak#39173

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
@ahus1
Copy link
Contributor
ahus1 commented May 12, 2025

@sares001 - this issue was resolved for Keycloak 26.2.4 which was released last week. If the problem persists after you upgraded to 26.2.4 or later, please open a new bug issue.

With the 26.1.x release of Keycloak this is a know behavior, although not visible to end users as Keycloak is retrying internally.

@sares001
Copy link

@ahus1 Thank you, I will upgrade to 26.2.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
0