Support asynchronous logging #38094
Conversation
Unreported flaky test detected. If the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR.
org.keycloak.testsuite.authz.EntitlementAPITest#testTokenExpirationRenewalWhenIssuingTokens
Looks good. Would you also consider adding log-async and similar to be defaults across all the handlers?
@shawkins Thanks for the review!
You mean having async logging enabled by default for all handlers? If we include it in Keycloak 26.2, IMHO we should not enable it by default for now. We can discuss it more for the next releases. WDYT?
@shawkins Ready for another review once you have time.
Not on by default, just a way to set async behavior across all handlers similar to how we have both log-level and log-handler-level.
@shawkins Ahh, got it. Yes, IMO that might be a good enhancement, I'll include it there. Thanks!
Force-pushed from b8acfcc to 779c254
@keycloak/cloud-native Ready for a review. Thanks!
@mabartos That's nice work, thank you! :)
docs/guides/server/logging.adoc (Outdated)
These properties are available only when asynchronous logging is enabled for these specific log handlers.
==== Change overflow strategy
With Keycloak being a security product and log messages containing potentially crucial info, IMHO this should not be configurable – discarding messages should not be an option.
@vmuzikar Thanks for this consideration.
I do not think we should force it - I could imagine a situation where users/customers leverage a specific log handler (like syslog or file) as this kind of source of truth and treat the other log handlers (like console) as nice-to-have, preferring performance instead.
I'd vote to keep it here, as the default strategy is BLOCK anyway, which means that no log record is lost/discarded. And users/customers can always change the configuration to comply with their needs.
Is that ok for you? Or I could even add a small comment on that.
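The BLOCK vs. DISCARD trade-off being discussed can be sketched with a plain bounded queue. This is a hypothetical illustration using only the JDK's `BlockingQueue`, not Keycloak's or JBoss LogManager's actual implementation; the class and method names are invented for the example:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the two overflow strategies: BLOCK waits for free
// space (no record is ever lost), while DISCARD silently drops the record.
public class OverflowStrategyDemo {
    enum OverflowStrategy { BLOCK, DISCARD }

    static final int QUEUE_SIZE = 2;
    final BlockingQueue<String> queue = new ArrayBlockingQueue<>(QUEUE_SIZE);

    /** Returns true if the record was enqueued, false if it was discarded. */
    boolean enqueue(String record, OverflowStrategy strategy) throws InterruptedException {
        switch (strategy) {
            case BLOCK:
                queue.put(record);          // blocks the caller until space is free
                return true;
            case DISCARD:
                return queue.offer(record); // returns false when the queue is full
            default:
                throw new IllegalStateException();
        }
    }
}
```

With BLOCK (the default discussed above), a full queue slows producers down but never loses a record; with DISCARD, the third record into a full two-slot queue would simply vanish.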
If we just want to keep things opinionated, then only supporting BLOCK is fine - it's no worse than our current behavior.
Ok guys, if you prefer not to include it, we don't have to - at least it's one option less to manage and support :)
If users/customers want to do additional performance tuning, they are on their own with these strategies. Moreover, in normal cases they can simply increase the queue size if necessary (which is supported).
Ok, let's remove it.
Updated
@mabartos If I understand correctly, with async there's a dedicated thread for logging. Is that correct? While with sync, each worker thread that handles the incoming HTTP request can send stuff to the log. Does it mean that with sync there could be potentially more threads logging at the same time? That means logging doesn't scale with worker threads?
If I understand correctly, with async there's a dedicated thread for logging. Is that correct?
@vmuzikar Yes, there's a dedicated thread for every log handler that is enabled and has async turned on - so in our case it might be 3 more threads, one per log handler.
Does it mean that with sync there could be potentially more threads logging at the same time? That means logging doesn't scale with worker threads?
(AFAIK) Yes, in sync mode every worker thread handles the writing to the log handler on its own, so multiple threads access the same shared resource. Writing to the char streams (console, file, syslog, sockets, ...) is a blocking op that needs to take care of synchronization, thus using locks. When there are multiple worker threads and logging is frequent (like trace), there can be a big latency issue as these worker threads wait for the resource to be unlocked.
In that case, the async mode works better as it does not block these worker threads and keeps a queue for every log handler.
cc: @dmlloyd At least that's my understanding. Could you verify these facts above? Thanks!
@vmuzikar Do you want me to add more details to the docs? I'm not sure if it's worth including some more impl details to the docs, though.
Async logging might even mitigate/solve the issue with blocked threads when using virtual threads described in #37266.
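The mechanics described above - one bounded queue plus one dedicated writer thread per handler, with producers paying only the enqueue cost - can be sketched in a few lines of plain Java. This is a minimal hypothetical model for the discussion, not Keycloak's or JBoss LogManager's actual code; the class name and the `List`-backed sink standing in for console/file/syslog are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: an async handler keeps a bounded queue and a single
// dedicated writer thread, so worker threads only pay for the enqueue
// instead of the (locking, blocking) I/O write itself.
public class AsyncHandlerSketch {
    private final BlockingQueue<String> queue;
    private final List<String> sink = new ArrayList<>(); // stands in for console/file/syslog
    private final Thread writer;

    AsyncHandlerSketch(int queueLength) {
        this.queue = new ArrayBlockingQueue<>(queueLength);
        this.writer = new Thread(() -> {
            try {
                while (true) {
                    sink.add(queue.take()); // the only thread doing the "slow" write
                }
            } catch (InterruptedException e) {
                queue.drainTo(sink); // drain the remainder on shutdown
            }
        }, "async-log-writer");
        writer.setDaemon(true);
        writer.start();
    }

    /** Called by worker threads; blocks only when the queue is full (BLOCK strategy). */
    void publish(String record) throws InterruptedException {
        queue.put(record);
    }

    void shutdown() throws InterruptedException {
        writer.interrupt();
        writer.join();
    }

    List<String> written() { return sink; }
}
```

Note that this sketch only drains the queue on an orderly shutdown; on an unexpected exit (kill -9, OOM) whatever is still queued is lost, which is the documentation point raised later in this thread.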
@mabartos Thanks for the clarification.
As I'm thinking more and more about it, I wonder what particular use case we are trying to solve with async logging. With the blocking strategy enforced, we're basically creating a "buffer" for log messages (the queue). That handles peaks in the logging system - a short-term situation where many log messages need to be processed. But it doesn't solve a constantly high load on the logging system.
Now, when would we expect the peak to happen in Keycloak's case? Logging rush hours? :) A DoS attack? Still, the buffer/queue would help only short term.
Sorry for the pushback. :) I just want to be sure, before we add additional options, that there's actually a real use case that would significantly benefit from them.
Do you want me to add more details to the docs?
It would be good to document the risk of messages being lost on unexpected exit, and the potential situations where this could be useful, as I mentioned above in this comment.
As I'm thinking more and more about that, I wonder what particular use case are we trying to solve by the async logging.
The potential benefits:
- For anyone who is exhausting their worker threads, or wants to maintain a smaller pool, this can help (if logging contributes significantly enough).
- Lower-latency response times.
- Removes the risk of starvation from pinning virtual threads (what instigated this issue / PR).
Now when would we expect the peak to happen in Keycloak's case? Logging rush hours? :) DoS attack? Still, the buffer/queue would help only short term.
When blocking, that is correct - there is an eventual horizon problem. Ideally, that amount of flooding should only occur with non-production-like settings. Is there some kind of log or metric that indicates log queues are full / blocking? That would signal to users that their current logging settings are detrimental to performance - which I don't believe they would currently infer without profiling.
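The metric being asked for here - something that tells users their log queue is saturating - could in principle be as simple as counting how often a producer finds the queue full before it blocks. The following is a hypothetical sketch of that idea (invented names, not part of this PR or of any Keycloak API), suitable for later exposure as a gauge/counter:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: count queue-full events so saturation could later be
// surfaced as a metric. Under the BLOCK strategy, each such event corresponds
// to a producer (worker thread) that had to wait on the logging queue.
public class QueueSaturationMeter {
    private final BlockingQueue<String> queue;
    private final AtomicLong queueFullEvents = new AtomicLong();

    QueueSaturationMeter(int queueLength) {
        this.queue = new ArrayBlockingQueue<>(queueLength);
    }

    void enqueue(String record) throws InterruptedException {
        // Fast path: succeeds immediately while the queue has room.
        if (!queue.offer(record)) {
            // Slow path: queue is full, record the event, then block (BLOCK strategy).
            queueFullEvents.incrementAndGet();
            queue.put(record);
        }
    }

    String poll() { return queue.poll(); }

    long queueFullEvents() { return queueFullEvents.get(); }

    int remainingCapacity() { return queue.remainingCapacity(); }
}
```

A non-zero, growing `queueFullEvents` (or a persistently zero `remainingCapacity`) would be exactly the profiling-free signal mentioned above that logging settings are hurting throughput.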
@shawkins Thanks for more context.
It's a good idea that we could reflect the queue in the metrics. But that's a follow-up.
Ok, let's go ahead with the PR.
Force-pushed from 68e9aa4 to da384a8
docs/guides/server/logging.adoc (Outdated)
If the queue is already full, it blocks the main thread and waits for free space in the queue.
NOTE: Be aware that enabling asynchronous logging might bring some **additional memory overhead** due to the additional separate thread and the inner queue.
Just wanted to highlight here what was mentioned in the other thread: it would be good to mention the risk of log message loss in case of an unexpected shutdown.
Updated
Closes keycloak#38578
Closes keycloak#28851
Signed-off-by: Martin Bartoš <mabartos@redhat.com>
LGTM, nice work @mabartos
@mabartos LGTM, thank you. Nice job. :)
@vmuzikar @shawkins @Pepo48 Could you please check this? Thanks!