8000 Enable L7 DNS proxy for nodes by atykhyy · Pull Request #34024 · cilium/cilium · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Enable L7 DNS proxy for nodes #34024

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 14, 2024
Merged

Enable L7 DNS proxy for nodes #34024

merged 3 commits into from
Aug 14, 2024

Conversation

atykhyy
Copy link
Contributor
@atykhyy atykhyy commented Jul 25, 2024

FQDN-based network policy is extremely useful for egress filtering, but is currently only supported in pods. This means that nodes must either rely on some other filtering mechanism, or resort to periodically updating L3/L4 CIDR policies. However, the code which collects DNS requests and combines them with policy to automatically produce L3/L4 filters is at endpoint level and is almost agnostic with respect to kind of endpoint (pod or node). This PR enables FQDN-based network policy for nodes by adding a bit of BPF code which, when instructed by network policy, redirects node's DNS requests to the same transparent proxy that is used for pods. No changes in iptables rules or routing are required.

FQDN-based network policy for nodes is likely to be a good addition to Cilium, as it would enable egress filtering by FQDNs for the whole cluster without relying on third-party solutions. Also, the same approach may possibly work for other L7 rules.

A sample policy I used for testing is

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: hostdns
specs:
- egress:
  - toCIDR:
    - 168.63.129.16/32 # example DNS server IP (whatever is in cilium agent's /etc/resolv.conf)
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: '*'
  - toFQDNs: # works when host has L7 DNS policy
    - matchName: api.ipify.org
  nodeSelector: {}

@atykhyy atykhyy requested review from a team as code owners July 25, 2024 22:54
@atykhyy atykhyy requested review from nathanjsweet and qmonnet July 25, 2024 22:54
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 25, 2024
@github-actions github-actions bot added sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. kind/community-contribution This was a contribution made by a community member. labels Jul 25, 2024
Copy link
Member
@nathanjsweet nathanjsweet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of "host" in your PR as CIDR selectable needs further discussion. I would like to see a lot of this functionality behind the --policy-cidr-match-mode flag as well.

@qmonnet qmonnet requested a review from nathanjsweet July 31, 2024 14:04
Copy link
Member
@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! With a couple comments below from my side.

I'd like to see a conclusion to the discussion with @nathanjsweet, I'm not sure why it was marked as “resolved”?

@qmonnet qmonnet added release-note/major This PR introduces major new functionality to Cilium. area/host-firewall Impacts the host firewall or the host endpoint. labels Jul 31, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jul 31, 2024
@qmonnet
Copy link
Member
qmonnet commented Jul 31, 2024

Another question: How did you test the feature? It would be nice to get a new test for this in cilium-cli.

@qmonnet qmonnet mentioned this pull request Jul 31, 2024
3 tasks
@maintainer-s-little-helper
Copy link

Commit b2168e4 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Aug 1, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Aug 1, 2024
@atykhyy
Copy link
Contributor Author
atykhyy commented Aug 1, 2024

Another question: How did you test the feature? It would be nice to get a new test for this in cilium-cli.

I have actually been running the previous 'version' of this feature (#19892) - one that redirected packets with iptables rules rather than eBPF - in our cluster for a long time. The basic smoke test I've been using is to turn on audit mode, create a simple DNS L7 host policy like the one in the PR description, then get the host to make a request allowed by FQDN and one disallowed by FQDN and observe policy verdicts in the output of cilium monitor -t policy-verdict. Do you mean you want me to add a test like this to the suite invoked by cilium connectivity test?

@qmonnet
Copy link
Member
qmonnet commented Aug 1, 2024

Do you mean you want me to add a test like this to the suite invoked by cilium connectivity test?

Yes, that would be ideal.

@qmonnet
Copy link
Member
qmonnet commented Aug 1, 2024

/test

atykhyy added a commit to Apptane/cilium-cli that referenced this pull request Aug 2, 2024
This commit adds a connectivity test for host L7 DNS proxy capability
added in cilium/cilium#34024.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
@atykhyy
Copy link
Contributor Author
atykhyy commented Aug 2, 2024

I added a test to cilium-cli, but I will have to re-fix #23330 in one of these two ways mentioned in that issue

  • Pass a custom /etc/resolv.conf to kubelet via --resolv-conf in the Kind / kubeadm config.
  • Override /etc/resolv.conf of Kind nodes after creating a cluster (no race condition, as CoreDNS pods won't be started, as a CNI is not ready).

because #24713 changes resolv.conf only in coredns pods, leaving host networking pods with default resolv.conf and making the host-firewall-egress-to-fqdns test I added fail when #23330 is a problem.

atykhyy added a commit to Apptane/cilium-cli that referenced this pull request Aug 2, 2024
This commit adds a connectivity test for host L7 DNS proxy capability
added in cilium/cilium#34024.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
@qmonnet
Copy link
Member
qmonnet commented Aug 6, 2024

I added a test to cilium-cli, but I will have to re-fix #23330

So if I follow correctly, you add a new test in cilium-cli for the change in the current PR in cilium/cilium-cli#2740, and this test also needs the fix in #34154 to work, is this correct? Very nice work, thanks a lot! 🚀

Looking at the results from CI today (apologies for the delay), I see that we have two failures:

  • Cilium L4LB XDP shouldn't be an issue, I think it's been broken during a few days, and it should hopefully be resolved if you rebase your PR on the current main branch.
  • Datapath BPF Complexity, however, is a concern. This means that the BPF program for the host becomes too complex for the verifier with your changes, at least for some combinations of options we're testing. We'll need to address this to be able to merge. Not sure what's the best path here, maybe try removing some of the tracing points to see if it makes a difference (I didn't dug into the details of the log). Otherwise it's more tricky and we may have to re-arrange code or add a tail call? Cc @julianwiedmann

I will be off soon so I'll ask someone else from sig-datapath to review and follow-up.

@qmonnet qmonnet requested review from a team and aspsk and removed request for a team August 6, 2024 13:13
@atykhyy
Copy link
Contributor Author
atykhyy commented Aug 6, 2024

So if I follow correctly, you add a new test in cilium-cli for the change in the current PR in cilium/cilium-cli#2740, and this test also needs the fix in #34154 to work, is this correct?

Yes.

Not sure what's the best path here [BPF verifier], maybe try removing some of the tracing points to see if it makes a difference (I didn't dug into the details of the log).

I'll try that, thanks. I looked at the logs, but could not make much sense out of it. BPF verifier quirks is not something one can pick up in a day or two. However verifier tests pass for newer kernels (e.g. 6.8.0 on my test VM), maybe the feature can be gated to newer kernels only as a last resort?

@qmonnet
Copy link
Member
qmonnet commented Aug 7, 2024

/test

atykhyy added 3 commits August 7, 2024 15:26
Restored DNS rules code expected that LookupIPsBySecID returns only IP addresses,
but if an L7 DNS rule has toCIDR selectors configured, it also returns CIDRs,
and after such a DNS rule is restored, it does not match any target DNS servers
and the DNS proxy rejects DNS queries until the endpoint is regenerated.

This commit fixes this issue, replacing string IP addresses in restore.DNSRules
with a wrapped netip.Prefix (preserving JSON serialization of restored rules).

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
The host endpoint currently does not have any IP addresses associated with it.
This prevents DNS proxy from working when source is the host endpoint.
This commit is a minimal change which enables the host endpoint to be looked up
by its IP address without touching any other code.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
FQDN-based network policy is extremely useful for egress filtering, but is
currently only supported in pods. This means that nodes must either rely on
some other filtering mechanism, or resort to periodically updating L3/L4 CIDR
policies. However, the code which collects DNS requests and combines them with
policy to automatically produce L3/L4 filters is at endpoint level and is
almost agnostic with respect to kind of endpoint (pod or node). This changeset
enables FQDN-based network policy for nodes by adding BPF code which redirects
node's DNS requests to the same transparent proxy that is used for pods.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
@atykhyy
Copy link
Contributor Author
atykhyy commented Aug 7, 2024

Not sure what's the best path here [BPF verifier], maybe try removing some of the tracing points to see if it makes a difference (I didn't dug into the details of the log).

The BPF complexity test was failing on cil-to-netdev function in older kernels under test (5.10 and 6.1) with "BPF program too large" error. I commented out calls to send_trace_notify(ctx, TRACE_TO_PROXY, ...) as suggested above and it helped, but it wouldn't do to remove these traces in real code. The thing that worked turned out to be swapping the order of return code checks:

diff --git a/bpf/bpf_host.c b/bpf/bpf_host.c
index e7bf9bd284..89d60ab0ba 100644
--- a/bpf/bpf_host.c
+++ b/bpf/bpf_host.c
@@ -1421,12 +1421,12 @@ int cil_to_netdev(struct __ctx_buff *ctx __maybe_unused)
                break;
        }

-       if (IS_ERR(ret))
-               goto drop_err;
-
        if (ret == CTX_ACT_REDIRECT)
                return ret;

+       if (IS_ERR(ret))
+               goto drop_err;
+
 skip_host_firewall:
 #endif /* ENABLE_HOST_FIREWALL */

@qmonnet
Copy link
Member
qmonnet commented Aug 8, 2024

/test

@qmonnet qmonnet dismissed their stale review August 8, 2024 17:06

Not available for review

@qmonnet qmonnet requested a review from nathanjsweet August 8, 2024 20:22
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 13, 2024
@youngnick youngnick added this pull request to the merge queue Aug 14, 2024
Merged via the queue into cilium:main with commit 9dd6b90 Aug 14, 2024
67 of 68 checks passed
@cccsss01
Copy link
cccsss01 commented Sep 16, 2024

@atykhyy
Great work!

Do we know if these limitations are still in effect?
(Clustermesh and/or wireguard)?
In the context of ClusterMesh, the following combination of options is not supported:

Cilium operating in CRD mode (as opposed to KVstore mode),

Host Policies enabled,

tunneling enabled,

kube-proxy-replacement enabled, and

WireGuard enabled.

This combination results in a failure to connect to the clustermesh-apiserver. For details, refer to GitHub issue 31209.

Host Policies do not work on host WireGuard interfaces. For details, see GitHub issue 17636.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/host-firewall Impacts the host firewall or the host endpoint. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0