8000 Feature/939208 fix persistence by lironleizer · Pull Request #309 · conductor-oss/conductor · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Feature/939208 fix persistence #309

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lironleizer
< 8000 div class= "timeline-comment-group js-minimizable-comment-group js-targetable-element TimelineItem-body my-0" id="issue-2659139585">
Copy link

Pull Request type: Bugfix

Changes in this PR: Improve Workflow Reliability during Gradual Restarts with Optional Redis Idempotence

Description: This PR addresses an issue encountered when Conductor is deployed across multiple machines, particularly during a gradual restart sequence. Previously, if a workflow event was consumed from RabbitMQ and saved to Redis but had not yet been processed when a machine shut down, the event would reappear in the RabbitMQ queue. At this point, another machine could re-consume the event, but since the original entry in Redis had not yet expired, Conductor would incorrectly assume that this event was already processed by another machine, resulting in an early acknowledgment to RabbitMQ. This led to the loss of events and untriggered workflows, impacting reliability and predictability.

The Fix: This update introduces a new property, eventExecutionPersistenceEnabled, which offers greater flexibility and control over event handling. When set to false, this property disables the Redis persistence mechanism for idempotence, delegating responsibility for deduplication and idempotence to business services. By default, eventExecutionPersistenceEnabled is set to true, preserving the current behavior for backward compatibility.

Key Benefits:

Enhanced Reliability: Prevents workflow event loss during machine restarts by allowing business services to handle idempotence directly.
Backward Compatibility: Maintains the existing Redis persistence behavior by default, ensuring seamless integration for existing deployments.
Greater Control: Empowers operators to manage idempotence in ways that best suit their distributed environments.
This update brings added robustness to distributed setups and ensures workflows are processed reliably, even during server transitions.

Liron Leizerovich and others added 2 commits November 13, 2024 17:35
…uplicate message delivery persistence mechanism.
…uplicate message delivery persistence mechanism.
@lironleizer
Copy link
Author

@dilip-lukose @v1r3n
please advise.

@@ -73,6 +73,9 @@ conductor.default-event-queue.type=sqs
conductor.app.workflow-execution-lock-enabled=false
conductor.workflow-execution-lock.type=noop_lock

#Used to enable/disable the duplicate message delivery persistence
conductor.app.eventExecutionPersistenceEnabled=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep default value to false?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to preserve backwards compatibility this should be false.

}

public void updateEventExecution(EventExecution eventExecution) {
executionDAOFacade.updateEventExecution(eventExecution);
if (isEventExecutionPersistenceEnabled) {
Copy link
Contributor
@manan164 manan164 Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lironleizer , Why are we only updating event execution when the eventExecutionPersistenceEnabled is set to true. We should update in any case. Please help me understand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0