8000 Report estimated expiry timers for connection-based FQDN entries by joestringer · Pull Request #32013 · cilium/cilium · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Report estimated expiry timers for connection-based FQDN entries #32013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 6, 2024

Conversation

joestringer
Copy link
Member
@joestringer joestringer commented Apr 17, 2024

Based on #32012

Report the expiry time for entries kept alive based on connection state
as the next time that connection tracking garbage collection will run.
This way, the expiration time should be closer to the actual expiry time
rather than simply reporting the zero timestamp.

Commits 1-5 do some refactoring and cleanups in preparation for the target
feature in this PR. The final commit has the functional changes.

  • daemon: Fix BenchmarkFqdnCache
  • daemon: Use netip library in fqdn logic
  • daemon: Tidy up fqdn dump parsing logic
  • fqdn: Add EndpointID filter to endpoint getter
  • daemon: Move DNS model code to pkg/fqdn
  • fqdn: Report zombie expiry time by CT GC

Related: #24246
Fixes: #21540

@joestringer joestringer added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. area/fqdn Affects the FQDN policies feature labels Apr 17, 2024
@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch 2 times, most recently from e8e42b0 to 84c0846 Compare April 17, 2024 22:03
@joestringer joestringer force-pushed the pr/joe/ct-gc-after-restore branch from d1135ed to 15772a8 Compare April 18, 2024 21:54
@joestringer joestringer added the dont-merge/blocked Another PR must be merged before this one. label Apr 18, 2024
@joestringer joestringer force-pushed the pr/joe/ct-gc-after-restore branch from 15772a8 to 09fac0c Compare April 25, 2024 16:07
@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch from 84c0846 to cc4df44 Compare April 25, 2024 16:09
Base automatically changed from pr/joe/ct-gc-after-restore to main April 30, 2024 01:59
@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch from cc4df44 to 32dc26f Compare April 30, 2024 04:29
@joestringer
Copy link
Member Author

/test

@joestringer joestringer removed the dont-merge/blocked Another PR must be merged before this one. label Apr 30, 2024
@joestringer joestringer marked this pull request as ready for review April 30, 2024 06:18
@joestringer joestringer requested review from a team as code owners April 30, 2024 06:18
Copy link
Member
@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 🚀
PR looks good to me, but I think the current logic in the BenchmarkFqdnCache is incorrect. Not sure if we want to tackle that in a follow up PR, though, since here we are just updating it as a consequence of the main changes.

@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch from 32dc26f to 4280d05 Compare April 30, 2024 23:23
@joestringer
Copy link
Member Author

@pippolo84 done, thanks for the review.

@joestringer
Copy link
Member Author

/test

Copy link
Contributor
@markpash markpash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting the commits nicely ❤️

Copy link
Member
@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes along with comments make sense, LGTM!
Just a related q: the connection expiry times are not static, are they? The connection entry would be kept alive as long as there is traffic seen, regardless of when the next GC runs. I wonder if this warrants a documentation note as the expiration times are interpreted differently than DNS TTL.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 2, 2024
@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch from 4280d05 to 4aedd54 Compare May 3, 2024 21:06
@joestringer
Copy link
Member Author

/test

@joestringer
Copy link
Member Author

Just a related q: the connection expiry times are not static, are they? The connection entry would be kept alive as long as there is traffic seen, regardless of when the next GC runs. I wonder if this warrants a documentation note as the expiration times are interpreted differently than DNS TTL.

Correct. I've taken a stab at describing the behaviour here: #32350 . Strictly speaking this behaviour was already the case, the only difference with this PR is that the user-facing CLI output will reflect this more closely.

@joestringer joestringer removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 3, 2024
@joestringer joestringer enabled auto-merge May 3, 2024 21:22
This benchmark was broken (triggered segfault) and was based on the
older go.check framework. Switch it to standard testing and fix it up.

Signed-off-by: Joe Stringer <joe@cilium.io>
Modernize this code by using netip over net for filtering IPs.

Signed-off-by: Joe Stringer <joe@cilium.io>
Split API request parsing from the code to form the model, so we can
move the model output generation logic under pkg/fqdn in an upcoming
patch.

Signed-off-by: Joe Stringer <joe@cilium.io>
This will allow us to converge usage of this function by the FQDN dump
routines to either directly look up one specific endpoint or to dump all
endpoints via this function in an upcoming patch.

Signed-off-by: Joe Stringer <joe@cilium.io>
Moving this logic into pkg/fqdn allows it to be more self-contained, and
makes it easier to make changes solely within the pkg/fqdn package for
future extensions of the filtering and formatting logic.

Two changes are made to the main function being moved here:
- At the start, use `NameManager.config.GetEndpointsDNSInfo()` to fetch the
  endpoints rather than pulling them from the EndpointManager directly.
- Switch from pulling the `endpoint.ID` field to `endpoint.ID64`.

Signed-off-by: Joe Stringer <joe@cilium.io>
Report the expiry time for entries kept alive based on connection state
as the next time that connection tracking garbage collection runs. This
way, the expiration time should be closer to the actual expiry time
rather than simply reporting the zero timestamp.

Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer joestringer force-pushed the pr/joe/fqdn-conn-timeouts branch from 4aedd54 to 795d80b Compare May 6, 2024 17:19
@joestringer
Copy link
Member Author

/test

@joestringer joestringer added this pull request to the merge queue May 6, 2024
Merged via the queue into main with commit 4c98d60 May 6, 2024
266 of 270 checks passed
@joestringer joestringer deleted the pr/joe/fqdn-conn-timeouts branch May 6, 2024 21:11
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fqdn Affects the FQDN policies feature ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calculate estimated "TTL" for connection entries (zombies) in the FQDN cache
4 participants
0