8000 fix pod-to-pod MTU drop when both in+egress proxy and IPSec enabled by smagnani96 · Pull Request #35173 · cilium/cilium · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix pod-to-pod MTU drop when both in+egress proxy and IPSec enabled #35173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 24, 2024

Conversation

smagnani96
Copy link
Contributor
@smagnani96 smagnani96 commented Oct 2, 2024

This PR fixes the pod-to-pod traffic being dropped because of using a higher MTU value. This is caused by a non-configuration of the routes from proxy, in the routing table n. 2005 respectively. While the pod-to-pod route is being adjusted according to the IPSec overhead and the adjusted size of the authentication key, the from-proxy route is not changed as well.

This also enables the pod-to-pod-with-l7-policy-encryption test for IPSec in IPv4. Such test is therefore skipped if/for:

  1. IPv6, please refer to the follow-up issue plain TCP RST packet in pod-to-pod-with-l7-policy-encryption for IPv6+IPSec+VXLan #35485;
  2. cilium version < 17.0.0, need to backport the fix before enabling the test.

Fixes: #33168

Fix packet drops for pod-to-pod connections that pass through ingress & egress proxy when using IPsec, caused by MTU misconfiguration.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 2, 2024
@github-actions github-actions bot added the cilium-cli This PR contains changes related with cilium-cli label Oct 2, 2024
@smagnani96 smagnani96 force-pushed the pr/mtu-via-proxy branch 3 times, most recently from 833e7a2 to 3f1fffa Compare October 2, 2024 15:06
@smagnani96
Copy link
Contributor Author

/test

@smagnani96 smagnani96 marked this pull request as ready for review October 2, 2024 15:59
@smagnani96 smagnani96 requested review from a team as code owners October 2, 2024 15:59
@pchaigno pchaigno added release-note/bug This PR fixes an issue in a previous release of Cilium. feature/ipsec Relates to Cilium's IPsec feature area/mtu Relates to MTU management in Cilium. needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Oct 3, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Oct 3, 2024
@smagnani96
Copy link
Contributor Author

/test

@smagnani96 smagnani96 marked this pull request as draft October 8, 2024 14:34
@smagnani96
Copy link
Contributor Author

/ci-ipsec-e2e

1 similar comment
@smagnani96
Copy link
Contributor Author

/ci-ipsec-e2e

Copy link
Member
@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re https://github.com/cilium/cilium/actions/runs/11240829491/job/31251421453:

I found the leak packets are all TCP reset:

 Error: bpftrace output is not empty
[18:14:44:129639] fd00:10:244:1::f085:34820 -> fd00:10:244:3::82cc:8080 (proto: 6, TCP flags: ...R, encap: 1, ifindex: 10, netns: f0000098, override: 0)

(Love this tcp flag information, love Marco!)

So it's probably caused by ipcache race issue when pods are removed but packets are still lingering around, maybe there is a small window when ipcache entry for that lingered packet is gone. We saw the similar issue before but forgot how we worked around it...

@jschwinger233
Copy link
Member

Interesting, failed jobs are all vxlan tunnel.

What did CI say? Any failure in tunnel scenarios for pod-to-pod-with-l7-policy-encryption?

Maybe that's how CI warned us? But geneve tunnel was green...

Can you check the v6 route table, what's the mtu there, does that look good to you?

@smagnani96
Copy link
Contributor Author

Can you check the v6 route table, what's the mtu there, does that look good to you?

All configs failing are using VxLan. Tests with Geneve are working.
Connectivity is ok.
Unencrypted RST between:

Pod Identity IPv4 IPv6 Node IPv4 IPv6 Agent Envoy
client 24974 10.244.1.111 fd00:10:244:1::f085 kind-worker 10.244.1.124 fd00:10:244:1::5c12 cilium-scxv9 cilium-envoy-8r5tp
echo-other-node 61410 10.244.3.239 fd00:10:244:3::82cc kind-worker2 10.244.3.91 fd00:10:244:3::dda9 cilium-hcv5v cilium-envoy-px94d
fd00:10:244:1::f085:34820 -> fd00:10:244:3::82cc:8080 (proto: 6, TCP flags: ...R, encap: 1, ifindex: 10, netns: f0000098, override: 0)

Table 200

Default table to use for IPSec routing rules (0xe00, 0xd00). MTU set to 1450: 1500 - 50 (tunnel overhead)

# IPv4
local 10.244.1.0/24 dev cilium_vxlan proto kernel scope host 
10.244.0.0/24 dev cilium_host proto kernel mtu 1450 
10.244.3.0/24 dev cilium_host proto kernel mtu 1450 

# IPv6
local fd00:10:244:1::/64 dev cilium_vxlan proto kernel metric 1024 pref medium
fd00:10:244::/64 dev cilium_host proto kernel metric 1024 mtu 1450 pref medium
fd00:10:244:3::/64 dev cilium_host proto kernel metric 1024 mtu 1450 pref medium

Table 2004

Default table to use routing rules to the proxy.

# IPv4
local default dev lo proto kernel scope host 

# IPv6
local default dev lo proto kernel metric 1024 pref medium

Default

MTU set to 1373: 1500 - 50 (tunnel) - 77 (IPSec)

# IPv4
default via 172.18.0.1 dev eth0 
10.244.0.0/24 via 10.244.1.124 dev cilium_host proto kernel src 10.244.1.124 mtu 1373 
10.244.1.0/24 via 10.244.1.124 dev cilium_host proto kernel src 10.244.1.124 
10.244.1.124 dev cilium_host proto kernel scope link 
10.244.3.0/24 via 10.244.1.124 dev cilium_host proto kernel src 10.244.1.124 mtu 1373 
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.4 
192.168.0.0/16 dev eth1 proto kernel scope link src 192.168.0.2 

# IPv6
fc00:c111::/64 dev eth0 proto kernel metric 256 pref medium
fc00:c112::/64 dev eth1 proto kernel metric 256 pref medium
fd00:10:244::/64 dev cilium_host proto kernel src fd00:10:244:1::5c12 metric 1024 mtu 1373 pref medium
fd00:10:244:1::5c12 dev cilium_host proto kernel metric 256 pref medium
fd00:10:244:1::/64 dev cilium_host proto kernel src fd00:10:244:1::5c12 metric 1024 pref medium
fd00:10:244:3::/64 dev cilium_host proto kernel src fd00:10:244:1::5c12 metric 1024 mtu 1373 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev cilium_net proto kernel metric 256 pref medium
fe80::/64 dev cilium_vxlan proto kernel metric 256 pref medium
fe80::/64 dev lxc_health proto kernel metric 256 pref medium
fe80::/64 dev lxc61a1aee0bc58 proto kernel metric 256 pref medium
fe80::/64 dev lxc2a78580b2efb proto kernel metric 256 pref medium
fe80::/64 dev lxc408d98fc6df1 proto kernel metric 256 pref medium
default via fc00:c111::1 dev eth0 metric 1024 pref medium

I'm not seeing particular differences wrt routing table with Geneve.
I'll keep investigating logs and try to replicate with the exact same kernel.

@pchaigno pchaigno added this pull request to the merge queue Oct 24, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 24, 2024
Merged via the queue into cilium:main with commit 1595e8b Oct 24, 2024
63 checks passed
@julianwiedmann julianwiedmann added the kind/bug This is a bug in the Cilium logic. label Oct 24, 2024
@julianwiedmann
Copy link
Member

@smagnani96 word-smithing the release note a bit, I'd suggest

Fix pod-to-pod MTU drop when using in+egress egress proxy with IPSec and enable IPv4 test

or slightly rephrased

Fix packet drops for pod-to-pod connections that pass through ingress & egress proxy when using IPsec, caused by MTU misconfiguration.

Striking the test change, as it's not user-relevant (and this note also gets copied into the release note for stable releases when backporting)

smagnani96 added a commit to smagnani96/cilium that referenced this pull request Oct 24, 2024
This commit enables the pod-to-pod-with-l7-policy-encryption for IPSec in
IPv6. In cilium#35173 we fixed the MTU issue and enabled this test for IPv4.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
@julianwiedmann
Copy link
Member

I think this is also worth backporting to v1.15. Adding the label, please shout if you disagree.

@julianwiedmann julianwiedmann added affects/v1.14 This issue affects v1.14 branch needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Oct 25, 2024
@pchaigno
Copy link
Member

I think this is also worth backporting to v1.15. Adding the label, please shout if you disagree.

MTU+IPsec+ingress L7+egress L7 doesn't really scream major bugfix to me 😅

@tklauser tklauser mentioned this pull request Oct 25, 2024
7 tasks
@tklauser tklauser added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Oct 25, 2024
@julianwiedmann
Copy link
Member

I think this is also worth backporting to v1.15. Adding the label, please shout if you disagree.

MTU+IPsec+ingress L7+egress L7 doesn't really scream major bugfix to me 😅

The MTU aspect isn't something that users would actively (mis-)configure though, no? It's just what's causing the problem.

Imho we shouldn't straight-out break connectivity like this, in particular when users have no easy work-around (disable IPsec or L7 policies, pick your security poison?).

@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Oct 25, 2024
@rastislavs rastislavs mentioned this pull request Oct 28, 2024
6 tasks
@rastislavs rastislavs added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Oct 28, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Oct 29, 2024
smagnani96 added a commit to smagnani96/cilium that referenced this pull request Nov 4, 2024
This commit enables the `pod-to-pod-with-l7-policy-encryption` cli
connectivity test from v1.15, after the successful backports of cilium#35173 in:

* v1.15: cilium#35586
* v1.16: cilium#35543

While enabling the test, in this commit we split the version check logic
(that is independent from the IP family used) from the check for running
IPv6+IPsec (that should be prevented due to a current limitation of
having a flaky plain-text packet in the test suite, tracked in cilium#35485).

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
smagnani96 added a commit to smagnani96/cilium that referenced this pull request Nov 5, 2024
This commit enables the `pod-to-pod-with-l7-policy-encryption` cli
connectivity test for v1.15 and v1.16, after the backports of cilium#35173 in:

* v1.15: cilium#35586
* v1.16: cilium#35543

While enabling the test, in this commit we split the version check logic
(that is independent from the IP family used) from the check for running
IPv6+IPsec (that should be prevented due to a current limitation of
having a flaky plain-text packet in the test suite, tracked in cilium#35485).

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Nov 22, 2024
This commit enables the `pod-to-pod-with-l7-policy-encryption` cli
connectivity test for v1.15 and v1.16, after the backports of #35173 in:

* v1.15: #35586
* v1.16: #35543

While enabling the test, in this commit we split the version check logic
(that is independent from the IP family used) from the check for running
IPv6+IPsec (that should be prevented due to a current limitation of
having a flaky plain-text packet in the test suite, tracked in #35485).

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
@smagnani96 smagnani96 deleted the pr/mtu-via-proxy branch March 18, 2025 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.14 This issue affects v1.14 branch area/mtu Relates to MTU management in Cilium. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. cilium-cli This PR contains changes related with cilium-cli feature/ipsec Relates to Cilium's IPsec feature kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pod to pod packets via egress + ingress proxy are MTU dropped when IPsec is enabled
8 participants
0