fix(groups): Group Flush should handle MessageSizeTooLargeError by jose-sequeira · Pull Request #33585 · PostHog/posthog · GitHub

fix(groups): Group Flush should handle MessageSizeTooLargeError #33585


Merged
jose-sequeira merged 2 commits into master from group-handle-big-properties
Jun 16, 2025

Conversation

jose-sequeira

Important

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Problem

We had an issue in production because the group flush logic did not handle KafkaMessageTooLarge errors. These errors are non-transient, so retrying the same message cannot succeed; they should produce an ingestion warning instead. This PR fixes that.
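
To illustrate the intended behaviour, here is a minimal sketch, not the code from this PR. `MessageSizeTooLarge` is a local stand-in class, and `flushGroupUpdate`, `IngestionWarning`, and the warning type string are hypothetical names chosen for the example.

```typescript
// Stand-in for the Kafka client's "message too large" error (assumption).
class MessageSizeTooLarge extends Error {}

// Hypothetical shape of an ingestion warning record.
type IngestionWarning = {
    teamId: number
    type: string
    details: Record<string, unknown>
}

// Attempts one group update. A too-large message is recorded as a warning
// and dropped; any other error is rethrown so the caller can retry it.
async function flushGroupUpdate(
    teamId: number,
    groupKey: string,
    produce: () => Promise<void>,
    warnings: IngestionWarning[]
): Promise<boolean> {
    try {
        await produce()
        return true
    } catch (error) {
        if (error instanceof MessageSizeTooLarge) {
            // Non-transient: the same payload will fail again, so do not retry.
            warnings.push({
                teamId,
                type: 'group_update_message_too_large', // hypothetical warning type
                details: { groupKey },
            })
            return false
        }
        throw error
    }
}
```

The point of the split is that only errors that could succeed on a later attempt ever reach the retry path.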

Changes

Did you write or update any docs for this change?

How did you test this code?

@greptile-apps greptile-apps bot left a comment


PR Summary

Added error handling for MessageSizeTooLarge errors in group flush logic to prevent infinite retries on oversized Kafka messages.

  • Modified BatchWritingGroupStore to properly handle MessageSizeTooLarge errors during group updates, generating ingestion warnings instead of retrying
  • Enhanced promiseRetry utility to support non-retriable errors through new nonRetriableErrorTypes parameter
  • Added test coverage for MessageSizeTooLarge error scenarios in group property updates
  • Fixed production issue where large group properties could cause system instability due to continuous retry attempts
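
The retry change summarised above can be sketched roughly as follows. This is an illustrative loop-based version under stated assumptions, not the `promiseRetry` implementation from the PR: `retryWithBackoff`, `OversizedMessageError`, and the two constants are hypothetical, and only the `nonRetriableErrorTypes` idea and the `instanceof` check come from the diff.

```typescript
// Constructor type for error classes that must never be retried.
type ErrorCtor = new (...args: any[]) => Error

// Hypothetical backoff settings for the sketch.
const BACKOFF_FACTOR = 2
const MAX_INTERVAL_MS = 1000

// Hypothetical non-transient error used in the usage example.
class OversizedMessageError extends Error {}

async function retryWithBackoff<T>(
    fn: () => Promise<T>,
    retries: number,
    intervalMs: number,
    nonRetriableErrorTypes: ErrorCtor[] = []
): Promise<T> {
    let lastError: unknown = new Error('no attempts were made')
    for (let attempt = 0; attempt < retries; attempt++) {
        try {
            return await fn()
        } catch (error) {
            // Fail fast: retrying a non-transient error cannot succeed.
            if (nonRetriableErrorTypes.some((ErrorType) => error instanceof ErrorType)) {
                throw error
            }
            lastError = error
            if (attempt < retries - 1) {
                // Wait for the current interval, then grow it up to the cap.
                await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
                intervalMs = Math.min(intervalMs * BACKOFF_FACTOR, MAX_INTERVAL_MS)
            }
        }
    }
    throw lastError
}
```

A caller would pass the error classes it considers permanent, for example `retryWithBackoff(send, 3, 100, [OversizedMessageError])`, and handle the rejection by emitting a warning rather than retrying.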

3 files reviewed, 2 comments

            logger.debug('🚫', `failed ${name}, non-retriable error encountered`, { error })
            return Promise.reject(error)
        }

        logger.debug('🔁', `failed ${name}, retrying`, { error })
        const nextInterval = Math.min(
            retryIntervalMillis * defaultRetryConfig.BACKOFF_FACTOR,
            defaultRetryConfig.MAX_INTERVAL
        )
        await new Promise((resolve) => setTimeout(resolve, retryIntervalMillis))

style: Use the sleep utility function defined in utils.ts for consistency instead of raw Promise timeout

Suggested change
await new Promise((resolve) => setTimeout(resolve, retryIntervalMillis))
await sleep(retryIntervalMillis)
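
For context, a `sleep` helper of this kind is usually a thin wrapper around the same `setTimeout` promise. The version below is a sketch of that common pattern; the actual signature of the helper in `utils.ts` is not shown in this thread, so treat it as an assumption.

```typescript
// Resolves after roughly `ms` milliseconds. Assumed shape, not copied from utils.ts.
function sleep(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms))
}
```

Using the named helper keeps the retry code readable and gives tests one place to stub delays.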

@jose-sequeira changed the title from "Group Flush should handle MessageSizeTooLargeError" to "fix(groups): Group Flush should handle MessageSizeTooLargeError" on Jun 12, 2025
@jose-sequeira jose-sequeira requested review from a team June 16, 2025 09:27
@pl pl left a comment


Small comment, but LGTM, feel free to merge.

@jose-sequeira
Contributor Author

> Small comment, but LGTM, feel free to merge.

Think you may have not posted the comment 😅

): Promise<T> {
    if (retries <= 0) {
        logger.error('🚨', `Final retry failure for ${name}`, { previousError })
        return Promise.reject(previousError)
    }
    return fn().catch(async (error) => {
        // Check if error is non-retriable
        if (nonRetriableErrorTypes && nonRetriableErrorTypes.some((ErrorType) => error instanceof ErrorType)) {
            logger.debug('🚫', `failed ${name}, non-retriable error encountered`, { error })

question: wdyt about bumping it to the warn level? Could be useful to see those errors while troubleshooting

@pl
pl commented Jun 16, 2025

> > Small comment, but LGTM, feel free to merge.
>
> Think you may have not posted the comment 😅

Oops, don't know where it went - reposted 😅

@jose-sequeira jose-sequeira merged commit 2cf5c5f into master Jun 16, 2025
98 checks passed
@jose-sequeira jose-sequeira deleted the group-handle-big-properties branch June 16, 2025 16:07