cilium: Allow to configure tunnel source port range by borkmann · Pull Request #37777 · cilium/cilium · GitHub

cilium: Allow to configure tunnel source port range #37777


Merged
merged 4 commits into from
Feb 21, 2025


borkmann
Member
@borkmann borkmann commented Feb 20, 2025

(see commit desc)

@borkmann borkmann added release-note/misc This PR makes changes that have no direct user impact. needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Feb 20, 2025
@borkmann borkmann force-pushed the pr/vxlan branch 2 times, most recently from 6f18359 to 7564968 Compare February 21, 2025 08:53
@borkmann borkmann marked this pull request as ready for review February 21, 2025 08:54
@borkmann borkmann requested review from a team as code owners February 21, 2025 08:54
@borkmann borkmann force-pushed the pr/vxlan branch 2 times, most recently from 7e7af0f to 4f51a54 Compare February 21, 2025 09:20
@borkmann borkmann requested review from a team as code owners February 21, 2025 09:28
@borkmann borkmann force-pushed the pr/vxlan branch 2 times, most recently from 56dc736 to e30755b Compare February 21, 2025 10:26
Today, Azure's networking stack supports 1M total flows (500k inbound and
500k outbound) for a VM, see details in the link below.

Users with Cilium tunneling can get limited in terms of E/W traffic for
larger clusters since vxlan/geneve is using the inner hash for deriving
a source port in order to RSS-spread the flows on the remote node CPUs.
This, however, also means that inbound and outbound number of different
flows Azure is tracking can become very large since Azure is looking at
the outer 5-tuple. The skb->hash is not symmetric, so for a given flow
that is tunneled through vxlan, Azure is tracking 2 flows.
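
To make the flow-count argument concrete, here is a minimal Go sketch that
counts the distinct outer 5-tuples an underlay flow tracker would see for one
node pair. Everything in it is an assumption for illustration: the node and
pod names, the FNV hash standing in for skb->hash, and the wide range
32768-61000 standing in for the unclamped case. It is not Cilium or kernel
code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// outerTuple is the outer UDP 5-tuple as seen on the underlay (hypothetical type).
type outerTuple struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
	proto            uint8
}

// srcPortFor scales a hash of the inner flow key into [lo, hi).
// FNV-1a is only a stand-in for the kernel's flow hash.
func srcPortFor(innerKey string, lo, hi uint16) uint16 {
	h := fnv.New32a()
	h.Write([]byte(innerKey))
	return lo + uint16(uint64(h.Sum32())*uint64(hi-lo)>>32)
}

// countOuterFlows tunnels n inner flows between two nodes and returns the
// number of distinct outer tuples. Forward and reverse directions hash
// differently (the hash is not symmetric), so each is counted on its own.
func countOuterFlows(n int, lo, hi uint16) int {
	seen := make(map[outerTuple]struct{})
	for i := 0; i < n; i++ {
		fwd := fmt.Sprintf("podA:%d->podB:80", 10000+i)
		rev := fmt.Sprintf("podB:80->podA:%d", 10000+i)
		seen[outerTuple{"nodeA", "nodeB", srcPortFor(fwd, lo, hi), 8472, 17}] = struct{}{}
		seen[outerTuple{"nodeB", "nodeA", srcPortFor(rev, lo, hi), 8472, 17}] = struct{}{}
	}
	return len(seen)
}

func main() {
	fmt.Println("wide range:  ", countOuterFlows(50000, 32768, 61000))
	fmt.Println("1000-2000:   ", countOuterFlows(50000, 1000, 2000))
}
```

With a clamped range, the count per node pair is bounded by twice the range
width regardless of how many inner flows exist, at the cost of coarser RSS
spreading on the remote node.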

To address this, add the ability to specify the source port range for the
vxlan tunnel device. In the kernel, this clamps the source port:

  src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
                               vxlan->cfg.port_max, true);
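
As a rough model of what the clamping does, the Go sketch below scales a
32-bit flow hash into the configured range. It follows my reading of the
kernel helper (fold the low bits into the high bits, then multiply-and-shift)
and leaves out the fallback paths for a zero hash and for min >= max, so treat
it as an approximation and consult the kernel source for the authoritative
logic. The function name flowSrcPort is hypothetical.

```go
package main

import "fmt"

// flowSrcPort maps a 32-bit flow hash to a port in [lo, hi).
// Assumes lo < hi; the kernel falls back to the local port range otherwise.
func flowSrcPort(hash uint32, lo, hi uint16) uint16 {
	// Fold the low 16 bits into the high 16 bits; only the upper bits
	// influence the scaled result below.
	hash ^= hash << 16
	span := uint64(hi - lo)
	// Multiply-and-shift scales the hash into [0, span) without a modulo.
	return uint16((uint64(hash)*span)>>32) + lo
}

func main() {
	for _, h := range []uint32{0x00000000, 0x12345678, 0x80000000, 0xdeadbeef} {
		fmt.Printf("hash %#08x -> port %d\n", h, flowSrcPort(h, 1000, 2000))
	}
}
```

In this model the upper bound itself is never returned, so a range of
1000-2000 yields at most 1000 distinct source ports.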

For geneve, this is currently not possible, but I'll do a separate kernel
fix to add support for it so that geneve users don't suffer from the same
issue.

Before:

  [...]
  61: cilium_vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
      link/ether ca:5b:36:9f:11:4f brd ff:ff:ff:ff:ff:ff promiscuity 0  allmulti 0 minmtu 68 maxmtu 65535
      vxlan external id 0 srcport 0 0 dstport 8472 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
  [...]

After (if no prior vxlan device was present):

  # ./daemon/cilium-agent --enable-ipv4=true --enable-ipv6=false \
     --datapath-mode=veth  --bpf-lb-mode=snat --devices=enp5s0 \
     --k8s-kubeconfig-path=$HOME/.kube/config \
     --tunnel-source-port-range=1000-2000

  [...]
  61: cilium_vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
      link/ether ca:5b:36:9f:11:4f brd ff:ff:ff:ff:ff:ff promiscuity 0  allmulti 0 minmtu 68 maxmtu 65535
      vxlan external id 0 srcport 1000 2000 dstport 8472 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
  [...]

  If a cilium_vxlan device was already present, it is not deleted and
  recreated with the new range, since doing so would disrupt ongoing
  connections.
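
For readers wondering how a LOW-HIGH value such as 1000-2000 could be
validated, here is a small Go sketch. parsePortRange is a hypothetical helper
written for this note and is not the parser used by cilium-agent; the choice
to reject LOW > HIGH is likewise an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortRange parses "LOW-HIGH" into two 16-bit ports (hypothetical helper).
func parsePortRange(s string) (uint16, uint16, error) {
	loStr, hiStr, ok := strings.Cut(s, "-")
	if !ok {
		return 0, 0, fmt.Errorf("invalid range %q: expected LOW-HIGH", s)
	}
	lo, err := strconv.ParseUint(strings.TrimSpace(loStr), 10, 16)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid low port in %q: %w", s, err)
	}
	hi, err := strconv.ParseUint(strings.TrimSpace(hiStr), 10, 16)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid high port in %q: %w", s, err)
	}
	if lo > hi {
		return 0, 0, fmt.Errorf("invalid range %q: low port exceeds high port", s)
	}
	return uint16(lo), uint16(hi), nil
}

func main() {
	lo, hi, err := parsePortRange("1000-2000")
	fmt.Println(lo, hi, err)
}
```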

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://learn.microsoft.com/en-us/azure/virtual-network/virtual-machine-network-throughput#flow-limits-and-active-connections-recommendations
@borkmann
Member Author

/test

Contributor
@gentoo-root gentoo-root left a comment

Lucky pull request number!

@borkmann borkmann force-pushed the pr/vxlan branch 2 times, most recently from 407dde3 to caddcb4 Compare February 21, 2025 15:21
@borkmann
Member Author

/test

Small test to validate the low/high source port on the vxlan device and
another test on an existing device to ensure the low/high source port
range does not change at runtime (only upon first creation).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
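
The selftest described in the commit above checks the range on the device
itself. As an out-of-band alternative, the Go sketch below extracts the
`srcport LOW HIGH` pair from `ip -d link show` style output like the lines
quoted in the first commit message. The function name is hypothetical and the
parsing assumes that exact token layout.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// srcPortRangeFromIPOutput finds "srcport LOW HIGH" in iproute2 detail output.
func srcPortRangeFromIPOutput(out string) (int, int, error) {
	fields := strings.Fields(out)
	for i := 0; i+2 < len(fields); i++ {
		if fields[i] != "srcport" {
			continue
		}
		lo, err := strconv.Atoi(fields[i+1])
		if err != nil {
			return 0, 0, fmt.Errorf("bad low port %q: %w", fields[i+1], err)
		}
		hi, err := strconv.Atoi(fields[i+2])
		if err != nil {
			return 0, 0, fmt.Errorf("bad high port %q: %w", fields[i+2], err)
		}
		return lo, hi, nil
	}
	return 0, 0, fmt.Errorf("no srcport attribute found")
}

func main() {
	line := "vxlan external id 0 srcport 1000 2000 dstport 8472 nolearning ttl auto"
	fmt.Println(srcPortRangeFromIPOutput(line))
}
```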
The MTU change test never removed the vxlan/geneve device, potentially
causing subsequent tests to fail given the device already exists.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add the ability for users to configure tunnel-source-port-range via Helm.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann
Member Author

/test

@borkmann
Member Author

fyi, the selftest still needs to wait for vishvananda/netlink#1062

@borkmann borkmann merged commit d3e27a0 into main Feb 21, 2025
279 of 281 checks passed
@borkmann borkmann deleted the pr/vxlan branch February 21, 2025 21:15
@julianwiedmann julianwiedmann added the area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. label Feb 26, 2025
@nbusseneau nbusseneau mentioned this pull request Feb 27, 2025
17 tasks
@nbusseneau nbusseneau added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Feb 27, 2025
@julianwiedmann
Member

Added #37924 to cover the XDP aspect.

@github-actions github-actions bot added backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Mar 6, 2025
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. release-note/misc This PR makes changes that have no direct user impact.
8 participants