8000 bgpv1 + bgpv2: Fix DiffStore to work with multiple server instances by rastislavs · Pull Request #34177 · cilium/cilium · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

bgpv1 + bgpv2: Fix DiffStore to work with multiple server instances #34177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 9, 2024

Conversation

rastislavs
Copy link
Contributor
@rastislavs rastislavs commented Aug 5, 2024

As the same DiffStore instance is used to diff services when reconciling for multiple server instances (virtual routers), we need to differentiate the diff per instance (and to be future-proof per reconciler as well) for diff to work properly.

Fixes: #33076

BGPv1 + BGPv2: Fix incorrect service reconciliation in setups with multiple BGP instances (virtual routers)

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 5, 2024
@rastislavs rastislavs changed the title bgpv1: Fix DiffStore to work with multiple server instances bgpv1 + bgpv2: Fix DiffStore to work with multiple server instances Aug 5, 2024
@rastislavs rastislavs added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. area/bgp Impacts the Border Gateway Protocol feature. labels Aug 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Aug 5, 2024
@rastislavs rastislavs force-pushed the diffstore-instance-fix branch 3 times, most recently from b918268 to 25f9fea Compare August 5, 2024 10:22
@rastislavs
Copy link
Contributor Author

/test

@rastislavs rastislavs added the needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch label Aug 5, 2024
Copy link
Contributor
@harsimran-pabla harsimran-pabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, lgtm 🚀

@rastislavs rastislavs marked this pull request as ready for review August 5, 2024 13:27
@rastislavs rastislavs requested a review from a team as a code owner August 5, 2024 13:27
@rastislavs rastislavs force-pushed the diffstore-instance-fix branch from 25f9fea to 44cebde Compare August 7, 2024 14:29
@rastislavs
Copy link
Contributor Author
rastislavs commented Aug 7, 2024

@YutaroHayakawa , @harsimran-pabla changes since the previous version:

  • The DiffStore got two new methods, InitDiff(callerID) and CleanupDiff(callerID).
  • The InitDiff is necessary so that we do not miss events between the full reconcile and the initial caller-specific Diff. We call InitDiff upon full reconciliation in the reconcilers, so that Diff returns changes since the last full or partial reconcile.
  • In BGPv1 we were doing partial reconciliation even during the first reconcile. It wasn't a problem, since all changes were in the diffstore, but now we always start with a full reconcile.
  • To wire the CleanupDiff call, new reconciler method Cleanup(instance) has been introduced for both BGPv1 and v2.
  • Cleanup(instance) is NOT called when the instance is re-created by the preflight reconcilers. It is not a problem ATM, as that always causes a full reconciliation, during which InitDiff is called and we start with a clean diff in the reconciler.

However, we need to think about how we can call Cleanup(instance) for all reconcilers when the instance is re-created from the preflight reconcilers if we want to remove the common reconciler metadata in the future. I was thinking of removing the preflight "reconcilers" and moving their logic into manager, or having some other way to tell the manager that the instance needs to be re-created. Right now it is kind of weird that these reconcilers are modifying manager-specific stuff (instance and me 8000 tadata).

@YutaroHayakawa
Copy link
Member
YutaroHayakawa commented Aug 8, 2024

Cleanup(instance) is NOT called when the instance is re-created by the preflight reconcilers. It is not a problem ATM, as that always causes a full reconciliation, during which InitDiff is called and we start with a clean diff in the reconciler.

How about introducing Init(instance) which is called when the instance is created instead of lazily call it on the initial reconciliation?

@rastislavs
Copy link
Contributor Author

How about introducing Init(instance) which is called when the instance is created instead of lazily call it on the initial reconciliation?

Yes I think introducing Init(instance) to the reconciler API makes sense, that should also address @harsimran-pabla 's concern about potentially calling partial reconciliation before full reconciliation (that would not be a problem anymore).

However, as we do not have an easy way to call Init(instance) of the other reconcilers from the preflight reconciler, some further refactoring of the preflight logic will anyway be needed to allow removing InitDiff(callerID) call from the full reconciliation.

@rastislavs rastislavs force-pushed the diffstore-instance-fix branch 3 times, most recently from f0f3226 to fa5bc88 Compare August 8, 2024 13:52
@rastislavs
Copy link
Contributor Author

@YutaroHayakawa , @harsimran-pabla I have added Init(instance) method as well, and included ASN in the "instance" structs (otherwise it was not available during Init).

Copy link
Contributor
@harsimran-pabla harsimran-pabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, lgtm!

Copy link
Member
@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! A few non-blocking comments.

As the same DiffStore instance is used to diff services when
reconciling for multiple server instances (virtual routers),
we need to differentiate the diff per instance (and to be
future-proof per reconciler as well) for diff to work properly.

This introduces a new callerID argument to the Diff method of the
DiffStore to provide caller-specific diff as well as InitDiff(callerID)
and CleanupDiff(callerID) methods to initiate and cleanup
caller-specific diffs. To wire this properly in the service reconcilers,
new reconciler methods Init(instance) and Cleanup(instance) have been
added to the reconciler API.

Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
@rastislavs rastislavs force-pushed the diffstore-instance-fix branch from fa5bc88 to 26ce191 Compare August 9, 2024 06:56
@rastislavs
Copy link
Contributor Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 9, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Aug 9, 2024
Merged via the queue into cilium:main with commit 4a9d01c Aug 9, 2024
64 of 66 checks passed
@jschwinger233 jschwinger233 mentioned this pull request Aug 12, 2024
15 tasks
@jschwinger233 jschwinger233 added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Aug 12, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Aug 13, 2024
@rastislavs rastislavs added the backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. label Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bgp Impacts the Border Gateway Protocol feature. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects 4DE9
None yet
Development

Successfully merging this pull request may close these issues.

LoadBalancer IPs don't seem to be being advertised (pod cidr works)
5 participants
0