S3 Dependency for Synchronous Replica Recovery During Failover · Issue #1089 · zalando/spilo · GitHub
Open
elesueur opened this issue Feb 26, 2025 · 0 comments

Our product runs Postgres with 2 synchronous replicas (3 Postgres instances in total) and synchronous_mode_strict enabled.

The 3 Postgres pods are spread across 3 nodes (virtual machines) in a Kubernetes cluster.

We have configured WAL-G to back up to an internal S3 service backed by SeaweedFS, and WAL files are pushed there every 15 minutes. SeaweedFS consists of several components, which may be distributed arbitrarily across the 3 cluster nodes.

We are testing failover recovery time (the time until Postgres is writable again) when a node is lost. In that case, we see that promoting a synchronous replica to primary requires a call to S3 (wal-g wal-fetch). If S3 is unresponsive, the promotion stalls until S3 becomes available and the WAL archives can be read.
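For reference, the recovery time described above could be measured with something like the following. This is only a minimal sketch: `time_until_writable` and its parameters are hypothetical names, and the `probe` callable is assumed to attempt a small write against the cluster endpoint and return True on success (for example through a Postgres client library, which is not shown here).

```python
import time
from typing import Callable


def time_until_writable(
    probe: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 1.0,
) -> float:
    """Poll probe() until it reports success; return the elapsed seconds.

    probe is a hypothetical callable that tries a write and returns True
    when the write succeeds. Exceptions are treated as "not writable yet".
    """
    start = time.monotonic()
    while True:
        try:
            if probe():
                return time.monotonic() - start
        except Exception:
            # Connection errors are expected while no primary exists.
            pass
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError(f"not writable after {timeout_s} seconds")
        time.sleep(interval_s)
```

Starting the timer at the moment the node is taken down, and running the probe from outside the cluster nodes under test, keeps the measurement independent of which node is lost.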

The S3 service is not yet highly available, and may be down for several minutes when one of the cluster nodes is lost. Making SeaweedFS HA carries a significant cost (additional pods consuming memory and CPU, plus data replication across multiple disks).

We are trying to understand whether there are strong reasons why access to S3 is needed to promote a synchronous replica in the event of a failover, and whether this can be avoided by changing how Spilo accesses WAL files during recovery.
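To illustrate the kind of change we have in mind, here is a rough sketch of a wrapper that bounds how long a WAL fetch may block. It is not how Spilo currently works: `fetch_with_timeout`, the script path and the timeout value are all hypothetical. It relies on the documented PostgreSQL behaviour that a nonzero exit status from restore_command means the requested file is not available from the archive. Whether failing fast like this is safe for synchronous replicas is exactly the question we are asking.

```python
import subprocess
import sys
from typing import Sequence


def fetch_with_timeout(cmd: Sequence[str], timeout_s: float) -> int:
    """Run a WAL fetch command, giving up after timeout_s seconds.

    Returns 0 on success and 1 on any failure, including a timeout.
    Note: this also maps "killed by a signal" to 1, which hides that
    case from the caller (an assumption made to keep the sketch simple).
    """
    try:
        # subprocess.run kills the child process when the timeout expires.
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return 1
    except OSError:
        # The command could not be started (for example, binary not found).
        return 1
    return 0 if completed.returncode == 0 else 1


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Hypothetical usage: fetch_with_timeout.py <seconds> <command> [args...]
    sys.exit(fetch_with_timeout(sys.argv[2:], float(sys.argv[1])))
```

It could then be wired in along the lines of `restore_command = 'python3 /path/to/fetch_with_timeout.py 10 wal-g wal-fetch "%f" "%p"'`, where the path and the 10 second limit are placeholders.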
