Keycloak 26.2.0 UI Performance Degradation · Issue #39023 · keycloak/keycloak · GitHub

Keycloak 26.2.0 UI Performance Degradation #39023


Closed
1 of 2 tasks
VonNao opened this issue Apr 16, 2025 · 27 comments · Fixed by #39536
Assignees
Labels
area/admin/ui kind/bug Categorizes a PR related to a bug kind/regression priority/blocker Highest Priority. Has a deadline and it blocks other tasks release/26.0.12 release/26.2.4 release/26.3.0 team/sre
Milestone

Comments

@VonNao
VonNao commented Apr 16, 2025

Before reporting an issue

  • I have read and understood the above terms for submitting issues, and I understand that my issue may be closed without action if I do not follow them.

Area

admin/ui

Describe the bug

We have deployed Keycloak via the Keycloak Operator on our cluster. After updating from Keycloak 26.1.2 to 26.2.0, Keycloak seems to be noticeably slower. UI operations as well as authentication are slower.

For our setup:

We have around 150 LDAP federations against our Active Directory.
We deployed it on k8s via the operator, with CloudNativePG (cnpg) as the PostgreSQL cluster. Metrics show that none of the systems is anywhere near full utilization.

Version

26.2.0

Regression

  • The issue is a regression

Expected behavior

Responsiveness as in 26.1.x

Actual behavior

Degradation in responsiveness of UI operations.

How to Reproduce?

Install Keycloak 26.2.0 and manage groups etc. via the web interface.

Anything else?

No response

@VonNao VonNao added kind/bug Categorizes a PR related to a bug status/triage labels Apr 16, 2025
@VonNao VonNao changed the title Keycloak 26.2.0 Keycloak 26.2.0 UI Performance degredation Apr 16, 2025
@VonNao VonNao changed the title Keycloak 26.2.0 UI Performance degredation Keycloak 26.2.0 UI Performance Degradation Apr 16, 2025
@keycloak-github-bot

Thanks for reporting this issue, but there is insufficient information or lack of steps to reproduce.

Please provide additional details, otherwise this issue will be automatically closed within 14 days.

@shawkins
Contributor

Can you provide a reproducer? If not, can you provide timings for specific UI screens / actions to highlight the level of degradation?

@ahus1
Contributor
ahus1 commented Apr 16, 2025

Could you try to enable tracing as described in https://www.keycloak.org/observability/tracing and provide a trace? If you are using Jaeger, you could either provide a screenshot, or export the trace as a JSON.

As an alternative, you could provide a thread dump of a Keycloak node under load, though that is usually less helpful.
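
For an operator-managed deployment, a minimal sketch of enabling tracing through the Keycloak CR could look like the following (option names as in the tracing guide linked above; the OTLP endpoint is a placeholder for your cluster, and depending on the Keycloak version the `opentelemetry` feature may also need to be enabled):

```yaml
# Sketch: enable OpenTelemetry tracing on an operator-managed Keycloak (endpoint is an example).
apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: keycloak
spec:
  additionalOptions:
    - name: tracing-enabled
      value: "true"
    - name: tracing-endpoint
      value: http://jaeger-collector:4317
```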

@VonNao
Author
VonNao commented Apr 23, 2025

Sorry for the late response. I could narrow the "problem" down: the sluggish UI is only noticeable in our Keycloak Operator deployments. Single-node dev instances work as fast as ever.

The infrastructure around the k8s operator deployment did not change from 26.1 to 26.2. Is it possible that session affinity could be a factor?

Our ingress controller is Nginx-Ingress with the following annotations:

nginx.ingress.kubernetes.io/backend-protocol: "https"
cert-manager.io/cluster-issuer: "letsencrypt-production"
cert-manager.io/private-key-rotation-policy: Always
cert-manager.io/private-key-algorithm: "ECDSA"
cert-manager.io/private-key-size: "384"
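
If affinity does turn out to matter, a minimal sketch of enabling cookie-based sticky sessions on ingress-nginx would be annotations like these (annotation names from the ingress-nginx docs; the cookie name and max-age are placeholders):

```yaml
# Sketch: cookie-based session affinity for the Keycloak Ingress (values are examples).
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "KC_ROUTE"
nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
```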

@VonNao
Author
VonNao commented Apr 23, 2025

Could you try to enable tracing as described in https://www.keycloak.org/observability/tracing and provide a trace? If you are using Jaeger, you could either provide a screenshot, or export the trace as a JSON.

As an alternative, you could provide a thread dump of a Keycloak node under load, though that is usually less helpful.

Sorry, forgot to answer. Right now, sadly, we have no Jaeger deployed. Our go-live is around 1 month from now, so we have no real load on the systems. They are mostly idle.

@ahus1
Contributor
ahus1 commented Apr 23, 2025

@VonNao - As we describe in our docs, Jaeger could be run as a Pod on Kubernetes like any other, similar to how you deploy Keycloak today. While in a production environment you would want all applications to send their traces to Jaeger, for a test environment it might be enough for just Keycloak to send its traces to this test instance of Jaeger.
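
For a test environment, a single all-in-one pod is usually enough; a rough sketch (image tag, names, and ports are assumptions, not sized for production) could be:

```yaml
# Sketch: single-pod Jaeger for a test cluster (not suitable for production).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jaeger
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
        - name: jaeger
          image: jaegertracing/all-in-one:1.57
          env:
            - name: COLLECTOR_OTLP_ENABLED  # accept OTLP traces, e.g. from Keycloak
              value: "true"
          ports:
            - containerPort: 4317   # OTLP gRPC receiver
            - containerPort: 16686  # Jaeger UI
---
apiVersion: v1
kind: Service
metadata:
  name: jaeger-collector
spec:
  selector:
    app: jaeger
  ports:
    - name: otlp-grpc
      port: 4317
    - name: ui
      port: 16686
```

Keycloak would then send its traces to `http://jaeger-collector:4317` (adjust the name to your namespace).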

@ssilvert
Contributor
ssilvert commented May 2, 2025

@VonNao @ahus1 Can this one be closed or moved to a different team? Or do we think there is a UI bug?

@ssilvert
Contributor
ssilvert commented May 6, 2025

Closing due to lack of recent interest. We can reopen if needed.

@VonNao
Author
VonNao commented May 6, 2025

@ssilvert Sorry for the late response. Jaeger is up and running. Below are some screenshots from the traces, tested with normal user actions from the perspective of an administrator. Some events just take a really long time (4s+) to finish. As mentioned in another issue, when scaling down to one instance the performance is like 26.1 and earlier.

[Screenshots: Jaeger traces of admin UI actions]

This last screenshot is from a login test of a user:

[Screenshot: Jaeger trace of a user login]

I also added our external monitoring as a reference. We scaled down to 1 replica at around 10:00am; after that, latency went back to normal.

[Screenshot: external monitoring latency graph]

Since there is no problem with one instance, I would guess that it has something to do with the Infinispan cluster?

If you need more information hook me up.

@keycloak-github-bot keycloak-github-bot bot added this to the 26.2.0 milestone May 7, 2025
@ahus1 ahus1 marked this as a duplicate of #39304 May 7, 2025
@ahus1
Contributor
ahus1 commented May 7, 2025

Preliminary analysis of how this is caused:

  • KC Operator now enables a network policy by default
  • The kubernetes distributed cache stack in this version probes not only port 7800, but also ports 7801-7810 (see #39454)

This leads to the following symptoms:

  • JGroups message bundler will issue connects, and they will time out
  • The connect is blocking, and will delay any other requests in the queue
  • Due to that, you might see delays from 1-7 seconds.
  • This happens only when JGroups is reevaluating all members of the cluster, which seems to be every ~20 seconds

Possible remedies (to be verified):

  • Switch to a different bundler, so the bundler is not blocked (-Djgroups.bundler.type=per-destination)
    Once this change is in, we see connect exceptions that were probably swallowed before.
  • Instead of kubernetes (default for the Operator), use jdbc-ping as it won't probe the other ports. When using a Keycloak CR, this would be
    additionalOptions:
       - name: cache-stack
         value: jdbc-ping 
    

@VonNao
Author
VonNao commented May 7, 2025

Preliminary analysis of how this is caused:

* KC Operator now enables a network policy by default

* The kubernetes distributed cache stack in this version probes not only port 7800, but also ports 7801-7810 (see [JGroups errors when running a containerized Keycloak in Strict FIPS mode and with Istio #39454](https://github.com/keycloak/keycloak/issues/39454))

This leads to the following symptoms:

* JGroups message bundler will issue connects, and they will time out

* The connect is blocking, and will delay any other requests in the queue

* Due to that, you might see delays from 1-7 seconds.

* This happens only when JGroups is reevaluating all members of the cluster, which seems to be every ~20 seconds

Possible remedies (to be verified):

* Switch to a different bundler, so the bundler is not blocked (`-Djgroups.bundler.type=per-destination`)
  Once this change is in, we see connect exceptions that were probably swallowed before.

* Instead of `kubernetes` (default for the Operator), use `jdbc-ping` as it won't probe the other ports. When using a Keycloak CR, this would be
  ```
  additionalOptions:
     - name: cache-stack
       value: jdbc-ping 
  ```

Will try jdbc-ping in our test cluster.
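
For reference, applied to an operator-managed instance the full CR would look roughly like this (a minimal sketch; only the `additionalOptions` entry comes from the comment above, the remaining fields and values are placeholders for a typical Keycloak CR):

```yaml
# Sketch of a Keycloak CR with the jdbc-ping workaround (names/values are placeholders).
apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: keycloak
spec:
  instances: 3
  hostname:
    hostname: keycloak.example.com
  db:
    vendor: postgres
    host: keycloak-db
    usernameSecret:
      name: keycloak-db-secret
      key: username
    passwordSecret:
      name: keycloak-db-secret
      key: password
  additionalOptions:
    - name: cache-stack
      value: jdbc-ping
```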

@wkloucek
wkloucek commented May 7, 2025
  • Instead of kubernetes (default for the Operator), use jdbc-ping as it won't probe the other ports. When using a Keycloak CR, this would be
    additionalOptions:
       - name: cache-stack
         value: jdbc-ping 
    

I can confirm the fix (workaround?). Thanks a lot for the pointer!

Before setting it, we saw gaps on the request timeline during load testing, because the requests were just hanging:

[Screenshot: request timeline with gaps during the load test]

Now we have a constant request rate driven by our load test and answered in time by Keycloak:

[Screenshot: constant request rate during the load test]

@VonNao
Author
VonNao commented May 7, 2025

Can also confirm, it worked perfectly. Is there any advantage to the kubernetes cache-stack vs jdbc-ping?

@ahus1
Contributor
ahus1 commented May 7, 2025

The kubernetes stack is more battle tested. See #39454 (comment) for a longer description.

A lot of people are using the kubernetes stack without many problems, so we are keeping it for now; eventually we might go for jdbc-ping, but not in a patch release.

pruivo added a commit to pruivo/keycloak that referenced this issue May 7, 2025
Fixes keycloak#39023

Fixes keycloak#39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
pruivo added a commit to pruivo/keycloak that referenced this issue May 7, 2025
Fixes keycloak#39023

Fixes keycloak#39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
pruivo added a commit to pruivo/keycloak that referenced this issue May 7, 2025
Fixes keycloak#39023

Fixes keycloak#39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
ahus1 pushed a commit that referenced this issue May 7, 2025
Fixes #39023

Fixes #39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
ahus1 pushed a commit that referenced this issue May 7, 2025
Fixes #39023

Fixes #39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
@ahus1
Contributor
ahus1 commented May 7, 2025

Added a follow-up issue for the per-destination bundler for 26.3: #39545
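
For anyone who wants to experiment with the per-destination bundler in the meantime: it is a JGroups system property, so one way to pass it (an assumption about your deployment, not a documented operator option) is the `JAVA_OPTS_APPEND` environment variable of the Keycloak container, for example:

```yaml
# Sketch: pass the JGroups bundler type as a JVM system property (placeholder container env snippet).
env:
  - name: JAVA_OPTS_APPEND
    value: "-Djgroups.bundler.type=per-destination"
```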

@ahus1
Contributor
ahus1 commented May 8, 2025

KC 26.2.4 was released today, and it includes the fix.

pruivo added a commit to pruivo/keycloak that referenced this issue May 9, 2025
Fixes keycloak#39023

Fixes keycloak#39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
ahus1 pushed a commit that referenced this issue May 9, 2025
Fixes #39023

Fixes #39454

Signed-off-by: Pedro Ruivo <pruivo@redhat.com>
@ahus1
Contributor
ahus1 commented May 12, 2025

The versions affected by this: ISPN 15.0.14.Final and ISPN 15.0.13.Final

Due to backports, 26.0.11 was affected as well.
