-
Notifications
You must be signed in to change notification settings - Fork 1.9k
cmd/k8s-operator, k8s-operator: support direct connections on ProxyGroups #16115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pull Request Revisions
13 more revisions
☑️ AI review skipped after 5 revisions, comment with `/review` to review again HelpReact with emojis to give feedback on AI-generated reviews:
We'd love to hear from you—reach out anytime at team@review.ai. |
a6d809b
to
8085b80
Compare
Just want to add some notes for this PR:
|
8085b80
to
dd9850a
Compare
func (r *ProxyGroupReconciler) findStaticEndpoints(ctx context.Context, port int32, proxyClass *tsapi.ProxyClass, logger *zap.SugaredLogger) ([]netip.AddrPort, error) { | ||
if len(proxyClass.Spec.TailnetListenerConfig.NodePortConfig.Selector) < 1 { | ||
logger.Debugf("no LabelSelectors specified on tailnetListenerConfig for ProxyClass %q. Listing all nodes") | ||
} | ||
|
||
nodes := new(corev1.NodeList) | ||
selectors := client.MatchingLabels(proxyClass.Spec.TailnetListenerConfig.NodePortConfig.Selector) | ||
|
||
err := r.List(ctx, nodes, selectors) | ||
if err != nil { | ||
return nil, fmt.Errorf("failed to list nodes: %w", err) | ||
} | ||
|
||
if len(nodes.Items) == 0 { | ||
err := fmt.Errorf("failed to match nodes to configured NodeSelectors in TailnetListenerConfig") | ||
logger.Warn(err.Error()) | ||
return nil, err | ||
} | ||
|
||
endpoints := []netip.AddrPort{} | ||
|
||
for _, n := range nodes.Items { | ||
for _, a := range n.Status.Addresses { | ||
if a.Type == corev1.NodeExternalIP { | ||
addrPort := fmt.Sprintf("%s:%d", a.Address, port) | ||
i, err := netip.ParseAddrPort(addrPort) | ||
if err != nil { | ||
logger.Debugf("failed to parse external address on node %q: %q", n.Name, addrPort) | ||
continue | ||
} | ||
logger.Debugf("adding endpoint %q to staticEndpoints config", addrPort) | ||
endpoints = append(endpoints, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the findStaticEndpoints function, there's no limit on the number of endpoints collected. For clusters with many nodes, this could result in a very large list. Consider adding a limit or pagination mechanism to avoid potential performance issues with large node counts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm interesting. Going to leave this comment open for now. At the moment we could just limit it to 1 node since tailscaled won't leverage more than one anyway (but this will change in the future).
4132310
to
d75f0bb
Compare
…oups updates: #14674 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
5a166f2
to
66a43ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the many comments. Please could you also add a more detailed PR description? As well as any specific decisions made in the implementation, it would also be super helpful to include some example config for how to try it out.
cmd/k8s-operator/svc-for-pg.go
Outdated
@@ -104,6 +104,10 @@ func (r *HAServiceReconciler) Reconcile(ctx context.Context, req reconcile.Reque | |||
hostname := nameForService(svc) | |||
logger = logger.With("hostname", hostname) | |||
|
|||
if isManagedResource(svc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we filter these in operator.go
instead, so we never get into the Reconcile
function in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I've gone ahead and done this. Will need to make sure we test it because we haven't done this for any other reconcilers. Also good to note that we haven't done this for other reconcilers, would be good to do this across the board where necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeping this open so that I double check that this is working as expected. Also meaning to circle back and test the behaviour of the reconciler (I was seeing a lot of noise)
cmd/k8s-operator/proxygroup.go
Outdated
@@ -206,6 +215,164 @@ func (r *ProxyGroupReconciler) Reconcile(ctx context.Context, req reconcile.Requ | |||
return setStatusReady(pg, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady) | |||
} | |||
|
|||
func getProxyGroupSvcPorts(ctx context.Context, c client.Client, namespace string, usedPorts map[int32]bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading correctly I think the map[int32]bool
type would be better off as a Set
from the util/set
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry could you be a bit more specific? I am not familiar with util/set
package 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I omitted the tailscale.com part of the packge: https://pkg.go.dev/tailscale.com/util/set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that the bool
is necessary here. Later on we iterate over the map to discover ports in the range that have not been used (value is false). See here.
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
@@ -84,7 +84,7 @@ type ProxyClassSpec struct { | |||
UseLetsEncryptStagingEnvironment bool `json:"useLetsEncryptStagingEnvironment,omitempty"` | |||
// Configuration for listeners on proxies in order to facilitate | |||
// 'direct connections' from other devices on the tailnet. | |||
// See https://tailscale.com/kb/1411/device-connectivity for details on tailnet device connectivity. | |||
// See https://tailscale.com/kb/1445/kubernetes-operator-customization#static-endpoints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might change... make sure we update this if it does.
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
7ce42da
to
dcce7f8
Compare
@@ -375,6 +383,7 @@ func runReconcilers(opts reconcilerOpts) { | |||
err = builder. | |||
ControllerManagedBy(mgr). | |||
For(&corev1.Service{}). | |||
WithEventFilter(objectManagedResourceFilterPredicate()). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something, but why are we adding this? The Services we reconcile are not operator's managed Services?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break this reconciler- we'll no longer reconcile the HA Services at all.
I think what you are aiming for is the opposite (dont reconcile managed Services rather than only reconcile managed Services) and also that filter should only to be applied to Services - you can achieve that with a filter similar to what we use elsewhere and only applied to Services.
We should generally avoid filters that are added to all watched objects, it'd be quite easy to introduce subtle bugs that way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I added this yesterday and hadn't tested yet.
This filter makes sure we don't reconcile on Kubernetes Services that are tailscale.com/managed = true
. This was upon review comment from @tomhjp requesting that we do this as a filter on the reconciler rather than in reconcile function itself.
However, I just noticed that the filter in fact is missing a !
, and is currently reconciling on managed Services only. I will fix this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See latest commit. Still need to test this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the filter to Services only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(See my last comment - also in this specific case it will break all other reconciles that are for tailscale managed objects like state Secrets)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I see what you're saying. This will break the reconcile for other objects that we watch on this reconciler. Sorry about this, I genericized it because I thought it was going to make it more expandable but in doing so you're right it will affect the other watches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's why we shouldn't use these- it's really easy to break other things especially given we don't have e2e tests so this is really only ever tested by hand
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
cbbc5d4
to
b2f913e
Compare
// ProxyGroup | ||
reqs := make([]reconcile.Request, 0) | ||
for _, pg := range pgList.Items { | ||
if pg.Spec.ProxyClass != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also have default ProxyClass
for k, v := range tlc.NodePortConfig.Selector { | ||
if val, ok := labels[k]; ok && val == v { | ||
logger.Debugf("reconciling on ProxyGroup %q due to event from Node %q", pg.Name, o.GetName()) | ||
reqs = append(reqs, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&pg)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic will result in the same ProxyGroup being appended to reqs
multiple times if the selector specifies multiple labels. It will also result in ProxyGroup being reconciled if only one of the labels matches.
I think what we want is the same behaviour as for any Kubernetes selector where all labels have to match for the node to be selected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be some helper function somwhere that you could use
Name: invalidSvcName, | ||
Namespace: tsNamespace, | ||
Labels: map[string]string{ | ||
kubetypes.LabelManaged: "true", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add tailscale.com/managed
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - do you mean, "we shouldn't add tailscale.com/managed
here? Since kubetypes.LabelManaged = "tailscale.com/managed"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I'm sorry, I actually forgot the kubetypes
is our library 🤦🏼 ignore this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem at all!
Co-authored-by: Irbe Krumina <irbe@tailscale.com> Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Spec: corev1.ServiceSpec{ | ||
Type: corev1.ServiceTypeNodePort, | ||
Ports: []corev1.ServicePort{ | ||
// NOTE (ChaosInTheCRD): we set the ports once we've iterated over every svc and found any old configuration we want to persist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
// NOTE (ChaosInTheCRD): we set the ports once we've iterated over every svc and found any old configuration we want to persist | |
// NOTE(ChaosInTheCRD): we set the ports once we've iterated over every svc and found any old configuration we want to persist. |
// Copyright (c) Tailscale Inc & AUTHORS | ||
// SPDX-License-Identifier: BSD-3-Clause | ||
|
||
package main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we name this file something more specific to the NodePort Services? We have other bits of fairly complicated port wrangling in other places (see egress services logic, for example)
portMax = 65535 | ||
portMin = 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion:
portMax = 65535 | |
portMin = 1024 | |
tailscaledPortMax = 65535 | |
tailscaledPortMin = 1024 |
or something else more explicit
updates: #14674