Operator #10015
Conversation
It has become imperative to have a proper way to handle CRDs. A two-phase installation technique is viable; however, given that we released 1.0 with the broken Helm `crd-install` mechanism, we must find a way forward that permits us to add and remove CRDs as needed. Further, we have some good planning happening around an operator. This bash code is just a placeholder for that later work, and works until it is replaced by the more advanced operator implementation.
This simply adds a deployment with appropriate settings to deploy the Istio operator.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: sdake. If they are not already assigned, you can assign the PR to them by writing
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/assign @linsun
/test e2e-simpleTests-cni
/test istio-integ-k8s-tests
/test istio-pilot-multicluster-e2e
heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
data:
  config: |-
Can you read this from a file rather than inlining?
+1
One way would be to include 2 files - one the original crd yaml (from 1.0), and one crd-1.1.yaml. I believe we need to keep the 1.0 yaml around as is, and it might be a good practice to have separate files for each version.
agreed re separate YAMLs for each version - this was in my original PR on this topic. I was moving ahead a little too fast to reintegrate them.
In the galley PR, this is read from files in the helm/files directory.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "create", "watch"]
- apiGroups: ["apiextensions.k8s.io"]
Will it be easy to give it more permissions later? What is the permissions upgrade story?
@mandarjog Thanks for challenging the assumptions in this PR. Will verify this works properly.
@mandarjog permissions upgrades work for `helm template` + `kubectl apply`. Will test the `helm upgrade` model next.
Cheers
-steve
operator/operator.sh (Outdated)
while true; do
  kubectl apply -f /etc/istio/operator/config
  sleep 5 # ugh
Add a sleep option? Every 5s seems excessive.
name: virtualservices.networking.istio.io
labels:
  app: istio-pilot
  chart: istio
For new install charts: please stop adding 'chart/heritage/release' labels.
I checked the helm docs - there doesn't appear to be any requirement to add them. Labels create secondary indexes in etcd and are expensive, plus label changes usually result in upgrade failures.
Too late to remove them if they got added in 1.0 - but for new templates we can stop this. (We should go over the diffs from 1.0 and remove new labels as well.)
Right - not sure why the labels were added in the first place... Attempting to keep the code base consistent. As you point out, that may not be the optimal approach here. The tiller process adds those labels - perhaps the original PRs that added those labels were meant to ensure the `helm template` rendering was the same as `helm install`. I'll take a look at removing these labels once we have a 1.1 install that works properly with the additional CRDs. I seem to recall they were added in the 1.1 series.
@@ -0,0 +1,32 @@
{{/* vim: set filetype=mustache: */}}
That's another thing I hope we can avoid in future - if we remove the extra labels we also don't need this template.
Seems excessive to keep a container running just to apply the crds (which is needed once).
Also I am not sure mixing the operator in the istio chart will work well (ordering, etc). While full decoupling and modularization can't happen in 1.1 - we can at least move in that direction with new components.
The operator (doing only crd fixing in 1.1) can be installed as a separate helm chart. And for security reasons I think istio-operator should be a separate namespace from the start (so we don't have to move it out later). It'll have a service account with elevated privs, no need for an istio-system admin to get access to them.
@@ -0,0 +1,5 @@
approvers:
Can you move this into the install directory? There seems to be no go code here, so a top-level directory seems unnecessary.
I can move this to the install directory, however, there will be go code shortly, and will just need to move it back. Env WG is keen to have an operator for master (which this PR is against). This PR is targeted at kicking off that effort.
Happy to move to install if you desire given above new information - PLMK.
Cheers
-steve
If there is going to be go code, it's fine to have a top-level directory. That said, since this is specific to Kubernetes, don't you think this should be under the pkg/kube/operator directory? The secret controller is also present there.
@rshriram agreed, kube-specific operator code should be under pkg/kube/operator. Will make it so.
Cheers
-steve
@rshriram would appreciate your expert eye on the parts of the code I've reviewed (also read the commit message please). I am going to implement this approach for galley. I think it will be simpler than a full blown process/pod just to instantiate CRDs at runtime. In the future, this code could serve as a starting point for an operator implementation (in 1.2 timeframe) but I am really keen to make CRDs work well in 1.1 - and time is precious.
Cheers
-steve
Thanks for the thorough reviews. Will submit updates that address review comments when I have them sorted out. Cheers
This is mostly a bunch of copy and paste from galley and the secret controller. I found during this work that galley may be a more appropriate place for ConfigMap to CustomResourceDefinition conversion. I will submit a PR with the relevant aspects of the code integrated into Galley. I have promised to have the operator PR available for review, and it's available :)
- This code requires a namespace called istio-operator.
- The meat of this code is the ConfigMap to CRD conversion based upon a ListWatch.
- The ListWatch watches for any configmap labeled with istio/operator and converts it to a CRD.
As for direct integration in Galley, I will submit a PR prior to our next status meeting for the galley reviewers to evaluate. I think that may be simpler/less risky for 1.1 but will need a PR to really be certain.
Codecov Report
@@ Coverage Diff @@
## master #10015 +/- ##
========================================
+ Coverage 70% 70% +1%
========================================
Files 436 436
Lines 41014 41242 +228
========================================
+ Hits 28610 28822 +212
- Misses 11027 11038 +11
- Partials 1377 1382 +5
Continue to review full report at Codecov.
maxRetries = 5
)

var (
This is a key part of how this PR works - the Codec's Decode is used in converting configmaps to CRDs.
// addConfigMap is called in the event that a configmap was added to the system.
func (c *Controller) addConfigMap(s *corev1.ConfigMap) {
	log.Info("Converting ConfigMap to CRD object.")
this particular function is the main aspect of the ConfigMap to CRD conversion. For whatever reason, I was not able to sort out how to get YAMLtoJSON to return an array of JSON blobs. I tried a whole boatload of different things to arrive at this particular simple implementation.
I would really appreciate a thorough review of this part of the code as I am going to use it within the galley PR mentioned in the commit message.
`yaml.NewYAMLToJSONDecoder` will parse and decode multi-doc YAML files.
@sdake: The following tests failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
configMapInformer := cache.NewSharedIndexInformer(
	&cache.ListWatch{
		ListFunc: func(opts meta_v1.ListOptions) (runtime.Object, error) {
			opts.LabelSelector = operatorLabel + "=true"
note - the informer is looking for configmaps with the label istio/operator
"istio.io/istio/operator/cmd/shared"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")
This should not be in the production code; consider creating {blah}_test.go and moving all testing code there.
agreed.
}

// Printf is a FormatFn that prints the formatted string to os.Stdout.
var Printf = func(format string, args ...interface{}) {
Curious why you needed to redefine Fatalf and Printf? It looks like istio already has a logging package, "istio.io/istio/pkg/log" - why not use it?
c&p from galley. Not sure what the answer is - perhaps for consistency or to scope the debug system.
Galley's implementation was c&p from Mixer's, which defined its own printf/fatalf interfaces. We can't use the istio.io/istio/pkg/log Printf functions as-is because the signature is different. The Galley, Mixer, and Pilot command handlers could be refactored to accept the istio.io/istio/pkg/log interfaces.
// Serialize the given object in nicely formatted JSON.
func Serialize(v interface{}) string {
	b, _ := json.MarshalIndent(v, "", " ")
Ignoring error is a really bad idea, especially in conversion functions. What is the point of wrapping Marshal into Serialize?
this is also c&p from galley. Looking at the galley implementation, there may not be a way to properly handle an error in this situation so error handling was punted. Feel free to fix in galley :)
I don't see `Serialize()` used anywhere in Galley. We can remove it for now.
@ozevren - general comment about galley - thought you may want to take a look. (some of this code was bootstrapped from galley to save time) :)
)

const (
	operatorLabel = "istio-operator"
I would suggest checking against the k8s labeling model.
will take a look.
I have decided to hardcode configmap names rather than work on labels, since at present we don't need dynamic handling of the configmap->crd conversion.
informer cache.SharedIndexInformer
}

func init() {
IMHO you do not need to use `init()` here.
agreed.
queue: queue,
}

configMapInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
Since you do not plan to put the received object back on the queue, I would suggest not using a queue at all and calling a function directly to deal with ConfigMap add/delete. Lots of unneeded complexity will go away. Also, you can do the type assertion right here and pass a v1.ConfigMap object to your func.
yamlStrings := strings.Split(s.Data["config"], "---")

// Build a set of yaml strings and range over them.
for _, yamlCRD := range yamlStrings {
Not sure why you need to convert YAML into JSON? Normally you would get []byte from, for example, the Data field of the ConfigMap and then unmarshal it directly into the destination golang object.
The configmap encodes several yaml blobs (with a `---` separator) into one configmap with one key. The CRDs could be split by keys and this code would not be necessary; however, that would require preprocessing. We want as minimal preprocessing as possible - which is an implementation constraint.
crdDeserialized, _, err := deserializer.Decode(jsonBlob, nil, nil)
if err != nil {
	// Couldn't decode the object, no sense creating the CRD
	fmt.Printf("%v", err)
Consider using istio logging, to get more info like timestamp etc.
agreed.
CRD := crdDeserialized.(*apiextensionsv1beta1.CustomResourceDefinition)

// Create the CRD object
if _, err := c.clientExt.ApiextensionsV1beta1().CustomResourceDefinitions().Create(CRD); err != nil {
Consider using exponential retry and backoff mechanism when creating API objects, not to fail when API server is just busy.
good idea - will look into it for the galley implementation.
@sbezverk thanks for the great review dude :) I have moved on to a different implementation to solve this problem here: However, when we get around to implementing a full-blown process/operator, your review will come in handy. Many of your questions would also apply to galley proper - possibly some low hanging fruit for people to repair. Cheers
It may be worth looking into https://github.com/kubernetes-sigs/controller-runtime or even https://github.com/kubernetes-sigs/kubebuilder to aid in the development and maintenance of a k8s operator.
General comment: why aren’t you using the
@secat good question. I am a big fan of
This is simply a very rough prototype. I needed to understand the codebase fully to come up with the simplest approach to handling 1.1 upgrades. For 1.2, I think we can take a hard look at these various code generators and SDKs. Upstream is now leaning towards a process that simply starts, performs CRD registration from the configmap, and exits. This process would run as a Kubernetes Job and be included in its own Helm chart or manifest. The workflow for deployment would be:
Cheers
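The run-once register-and-exit idea above could be sketched as a Kubernetes Job manifest. This is a hypothetical illustration: the image tag, command flag, and service account name are invented for the sketch and do not come from this PR.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: istio-crd-install        # hypothetical name
  namespace: istio-operator      # separate namespace, per the review above
spec:
  template:
    spec:
      # Service account holding the elevated CRD-creation privileges,
      # kept out of istio-system as suggested earlier in the thread.
      serviceAccountName: istio-crd-installer   # hypothetical
      restartPolicy: OnFailure
      containers:
        - name: crd-install
          image: istio/operator:1.1             # hypothetical image tag
          command: ["/usr/local/bin/operator", "--register-crds-and-exit"]  # hypothetical flag
```

Because a Job runs to completion and exits, this avoids keeping a container alive just to apply CRDs once.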
In the future we may use kubebuilder or operator-sdk (I believe there is a third framework used in the Red Hat installer), but it will need to be evaluated and a choice made. Of course, the question is what features we really need and whether they fit our use cases; long term, the operator may need to handle multi-cluster and other things. We have plenty of code to watch/interact with k8s resources - not sure tools to make it easy are needed at this point.
I think the value of these tools is just to get you started quickly, without much thinking, with boilerplate code. For complex controllers I do not think these tools are useful, as they do not provide any
A couple of nice things about kubebuilder: it simplifies API stub generation and management, and it is being actively worked on to support operators (the non-human kind) with libraries rather than scaffolding. The validations are nice but probably insufficient. There's some scaffolding for testing, which may encourage people to write UTs. Boilerplate for watching resources we don't need, and the logic is up to us. I think we should definitely take a look at the frameworks once the upgrade problems are solved.
@sdake I am currently using the operator-sdk version
From the sigs.k8s.io/controller-runtime project description:
There is also an interesting article from Google Cloud named Best practices for building Kubernetes Operators and stateful apps, for further reference on using an SDK for building Kubernetes Operators.
@sdake: PR needs rebase.
This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
I am closing this PR in favor of the much simpler (for 1.1): |