Add support to offload PVC replication management outside of Ramen #2028
RamenDR/ramen · Pull Request by ShyamsundarR

Open: ShyamsundarR wants to merge 6 commits into main

ShyamsundarR (Member)
TL;DR: Add support for Ramen to skip managing PVC replication itself (via either VolumeReplication or VolSync) and instead let users choose how the PVCs are replicated, reflecting each PVC's state through a label that Ramen uses to track orchestration progress. This is for use with storage backends whose replication schemes are controlled either outside the scope of k8s or via other k8s APIs. It does make the process more manual, though.

TODO:

  • Document the feature
  • PVC cleanup for restore should remove the PVC state label
  • Shut off metrics for offloaded DRPCs
  • Should replication interval also be reflected as an annotation/label on the PVC?
  • Run a sample test in drenv by marking the RBD class as offloaded and testing the workflow

If the StorageClass on both peer clusters is marked as offloaded via the
"ramendr.openshift.io/offloaded" label, the peerClass reports offload as
true for further offload-based processing.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
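
As a rough illustration of the check this commit describes, the sketch below derives the peerClass offload flag from the labels of the StorageClass on each peer. Only the label key comes from the commit message; the `peerClass` struct, the function names, and the choice to treat mere presence of the label (whatever its value) as "offloaded" are assumptions made for illustration, not Ramen's actual code.

```go
package main

import "fmt"

// Label key taken from the commit message.
const offloadedLabel = "ramendr.openshift.io/offloaded"

// peerClass is a hypothetical, trimmed-down stand-in for Ramen's peer class.
type peerClass struct {
	StorageClassName string
	Offloaded        bool
}

// isOffloaded reports whether a StorageClass's labels carry the offloaded label.
// Assumption: presence of the key is enough; the value is ignored.
func isOffloaded(scLabels map[string]string) bool {
	_, ok := scLabels[offloadedLabel]
	return ok
}

// newPeerClass marks the peer class as offloaded only when the StorageClass
// on both peer clusters carries the label.
func newPeerClass(name string, scA, scB map[string]string) peerClass {
	return peerClass{
		StorageClassName: name,
		Offloaded:        isOffloaded(scA) && isOffloaded(scB),
	}
}

func main() {
	labeled := map[string]string{offloadedLabel: ""}
	unlabeled := map[string]string{}

	fmt.Println(newPeerClass("rbd", labeled, labeled).Offloaded)   // true
	fmt.Println(newPeerClass("rbd", labeled, unlabeled).Offloaded) // false
}
```

A label on only one of the two peers leaves the peer class non-offloaded, which matches the "both peer clusters" wording above.
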
If a common async SC exists, denoted by different storageID labels on the SC
across the peers, and both have the offloaded label set, then mark the
peerClass as an offloaded peer class.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
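
The async case can be sketched in the same spirit. The storageID label key used below (`ramendr.openshift.io/storageid`) and both function names are assumptions for illustration; the commit message only states that an async pair is denoted by different storageID labels across the peers and that both must carry the offloaded label.

```go
package main

import "fmt"

const (
	offloadedLabel = "ramendr.openshift.io/offloaded" // from the commit message
	storageIDLabel = "ramendr.openshift.io/storageid" // assumed key for the storageID label
)

// isAsyncPair treats two StorageClasses as an async pair when both carry a
// storageID label and the IDs differ across the peers.
func isAsyncPair(scA, scB map[string]string) bool {
	idA, okA := scA[storageIDLabel]
	idB, okB := scB[storageIDLabel]
	return okA && okB && idA != idB
}

// isOffloadedAsyncPeerClass requires an async pair in which both
// StorageClasses also carry the offloaded label.
func isOffloadedAsyncPeerClass(scA, scB map[string]string) bool {
	_, offA := scA[offloadedLabel]
	_, offB := scB[offloadedLabel]
	return isAsyncPair(scA, scB) && offA && offB
}

func main() {
	east := map[string]string{storageIDLabel: "east-id", offloadedLabel: ""}
	west := map[string]string{storageIDLabel: "west-id", offloadedLabel: ""}
	westManaged := map[string]string{storageIDLabel: "west-id"}

	fmt.Println(isOffloadedAsyncPeerClass(east, west))        // true
	fmt.Println(isOffloadedAsyncPeerClass(east, westManaged)) // false
}
```
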
If the VRG peerClass requires an update, ensure the update is done only when
the existing peerClass has the same offload property.

Switching between peerClasses with different offload properties is not
supported; the workload must be DR disabled and then enabled again for DR to
switch to the newer peerClass property.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
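
The guard described in this commit might look roughly like the following. The `peerClass` struct, `updatePeerClass`, and the error value are hypothetical names chosen for the sketch; the only behavior taken from the commit message is that an update is rejected when the offload property differs.

```go
package main

import (
	"errors"
	"fmt"
)

// peerClass is a hypothetical, trimmed-down stand-in for the VRG peer class.
type peerClass struct {
	StorageClassName string
	Offloaded        bool
}

// errOffloadSwitch signals an unsupported change of the offload property.
var errOffloadSwitch = errors.New(
	"switching peerClass offload property is not supported; DR disable and re-enable the workload")

// updatePeerClass returns the desired peer class only when its offload
// property matches the current one; otherwise the current one is kept.
func updatePeerClass(current, desired peerClass) (peerClass, error) {
	if current.Offloaded != desired.Offloaded {
		return current, errOffloadSwitch
	}

	return desired, nil
}

func main() {
	current := peerClass{StorageClassName: "rbd", Offloaded: false}

	// Same offload property: the update is accepted.
	updated, err := updatePeerClass(current, peerClass{StorageClassName: "rbd-new", Offloaded: false})
	fmt.Println(updated.StorageClassName, err)

	// Different offload property: the update is rejected and current is kept.
	kept, err := updatePeerClass(current, peerClass{StorageClassName: "rbd-new", Offloaded: true})
	fmt.Println(kept.StorageClassName, errors.Is(err, errOffloadSwitch))
}
```
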
Process list of PVCs to protect as either a complete set of offloaded PVCs
or a split between VR/VS (as before).

Offloaded PVCs are still processed as part of the VR stack, and hence are
appended to the VR list of PVCs to protect.

Further, the VRG is marked with offload set to true, so that later code can
inspect it to decide how to process the VRG.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
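
A minimal sketch of that separation, assuming hypothetical `pvc` and `pvcSets` types and a per-PVC `UsesVolSync` flag standing in for whatever Ramen actually uses to pick VR versus VS:

```go
package main

import "fmt"

// pvc is a hypothetical stand-in for a PVC selected for protection.
type pvc struct {
	Name        string
	UsesVolSync bool // assumption: how the non-offloaded VR/VS split is decided
}

// pvcSets holds the lists each code path works on, plus the VRG-level flag.
type pvcSets struct {
	VolRep  []pvc
	VolSync []pvc
	Offload bool
}

// separatePVCs puts every PVC on the VR list when the VRG is offloaded,
// and otherwise splits the PVCs between VR and VS as before.
func separatePVCs(pvcs []pvc, offloaded bool) pvcSets {
	sets := pvcSets{Offload: offloaded}

	for _, p := range pvcs {
		if !offloaded && p.UsesVolSync {
			sets.VolSync = append(sets.VolSync, p)

			continue
		}

		sets.VolRep = append(sets.VolRep, p)
	}

	return sets
}

func main() {
	pvcs := []pvc{{Name: "data"}, {Name: "logs", UsesVolSync: true}}

	split := separatePVCs(pvcs, false)
	fmt.Println(len(split.VolRep), len(split.VolSync), split.Offload) // 1 1 false

	offloaded := separatePVCs(pvcs, true)
	fmt.Println(len(offloaded.VolRep), len(offloaded.VolSync), offloaded.Offload) // 2 0 true
}
```
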
Offloaded VRGs do not have an associated VR (or VGR) created for them.
Instead, they are processed based on PVC labels that users set to denote the
current state of each PVC, from which the protected PVC conditions are derived.

An offloaded PVC needs to be marked with the appropriate Ramen label
(replication-offload-state), set to an appropriate value (Primary/Secondary/Error).

Reconciliation for offloaded PVCs is done in the VR code path, as was the case
for Sync PVCs.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
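
To make the label-driven flow concrete, the sketch below derives a simplified ready/not-ready verdict for one offloaded PVC from its state label. The label name and the three values come from the commit message; the full label key (any prefix) is not given there, so the bare name is used. `pvcReady` and its reason strings are hypothetical and much simpler than real protected PVC conditions.

```go
package main

import "fmt"

// Label name as given in the commit message; the full key is not stated there.
const offloadStateLabel = "replication-offload-state"

// Label values listed in the commit message.
const (
	statePrimary   = "Primary"
	stateSecondary = "Secondary"
	stateError     = "Error"
)

// pvcReady compares the user-set state label against the state the VRG wants
// and returns a verdict plus a human-readable reason.
func pvcReady(pvcLabels map[string]string, desired string) (bool, string) {
	state, ok := pvcLabels[offloadStateLabel]

	switch {
	case !ok:
		return false, "offload state label not set yet"
	case state == stateError:
		return false, "user reported a replication error"
	case state != statePrimary && state != stateSecondary:
		return false, "unrecognized offload state " + state
	case state != desired:
		return false, "waiting for PVC to reach " + desired
	default:
		return true, "PVC is " + state
	}
}

func main() {
	labels := map[string]string{offloadStateLabel: statePrimary}

	ready, reason := pvcReady(labels, statePrimary)
	fmt.Println(ready, reason)

	// After the VRG is demoted, the PVC stays not ready until the user
	// relabels it as Secondary.
	ready, reason = pvcReady(labels, stateSecondary)
	fmt.Println(ready, reason)
}
```
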
The test runs through DR protection using a VRG, then demotes the VRG to
secondary and ensures that secondary semantics are also met.

Includes some common changes to enable reuse of some helper
functions.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>

BenamarMk (Member) commented May 13, 2025

A general comment regarding code organization. I don't know if we have the time to do this, but the current structure, with vrg_volrep and vrg_volsync, is quite specific: each focuses on managing replication using VolumeReplication and VolSync, respectively. As we introduce concepts like offloading, the logic in vrg_volrep becomes overloaded with conditionals (if/else), which adds complexity and reduces clarity. The same concern applies as we continue to support both sync and async replication.

A cleaner and more modular organization would be:

  • volumereplicationgroup_controller: Handles general reconciliation logic and dispatches work based on replication type.
  • vrg_volrep: Manages PVCs using VolumeReplication.
  • vrg_volsync: Manages PVCs using VolSync.
  • vrg_offload: Handles offloaded replication scenarios.
  • vrg_metro: Handles sync logic.

It may be too late to do this for the sync logic, but it is perhaps something to think about for a future clean-up.
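
One way to picture the proposed layout is a controller that looks up a handler per replication type instead of branching inside vrg_volrep. The interface, type names, and handler bodies below are hypothetical and only echo the file names from the list above; this is a sketch of the idea, not a proposal for the actual signatures.

```go
package main

import "fmt"

// replicationType is a hypothetical key for selecting a handler.
type replicationType string

const (
	typeVolRep  replicationType = "volrep"
	typeVolSync replicationType = "volsync"
	typeOffload replicationType = "offload"
	typeMetro   replicationType = "metro"
)

// pvcHandler is a hypothetical interface each vrg_* file could implement.
type pvcHandler interface {
	Reconcile(pvcName string) string
}

// handlerFunc adapts a plain function to pvcHandler.
type handlerFunc func(pvcName string) string

func (f handlerFunc) Reconcile(pvcName string) string { return f(pvcName) }

// newHandlers registers one handler per replication type.
func newHandlers() map[replicationType]pvcHandler {
	tag := func(file string) pvcHandler {
		return handlerFunc(func(pvcName string) string { return file + ": " + pvcName })
	}

	return map[replicationType]pvcHandler{
		typeVolRep:  tag("vrg_volrep"),
		typeVolSync: tag("vrg_volsync"),
		typeOffload: tag("vrg_offload"),
		typeMetro:   tag("vrg_metro"),
	}
}

// dispatch is what the general controller would do: pick the handler for the
// replication type and delegate, with no per-type conditionals of its own.
func dispatch(handlers map[replicationType]pvcHandler, t replicationType, pvcName string) (string, error) {
	h, ok := handlers[t]
	if !ok {
		return "", fmt.Errorf("no handler for replication type %q", t)
	}

	return h.Reconcile(pvcName), nil
}

func main() {
	handlers := newHandlers()

	out, err := dispatch(handlers, typeOffload, "data")
	fmt.Println(out, err)

	_, err = dispatch(handlers, replicationType("unknown"), "data")
	fmt.Println(err)
}
```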
