10000 Operator by sdake · Pull Request #10015 · istio/istio · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Operator #10015

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Operator #10015

wants to merge 3 commits into from

Conversation

sdake
Copy link
Member
@sdake sdake commented Nov 16, 2018

No description provided.

It has become imperative to have some proper way to handle CRDs.  A two-phase
installation technique is viable, however, given that we have released 1.0
with broken Helm `crd-install` tech, we must find a way forward that permits
us to add and remove CRDs as needed.  Further, we have some good planning
happening around an operator.  This bash code is just a placeholder for that
later work. This works until it is replaced by the more advanced operator
implementation.
This simply adds a deployment with appropriate settings to deploy
the Istio operator.
@istio-testing
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdake
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: linsun

If they are not already assigned, you can assign the PR to them by writing /assign @linsun in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sdake
Copy link
Member Author
sdake commented Nov 17, 2018

/assign @linsun

@sdake
Copy link
Member Author
sdake commented Nov 17, 2018

/test e2e-simpleTests-cni

@sdake
Copy link
Member Author
sdake commented Nov 17, 2018

/test istio-integ-k8s-tests

@sdake
Copy link
Member Author
sdake commented Nov 17, 2018

/test istio-pilot-multicluster-e2e

heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
data:
config: |-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you read this from a file rather than inlining?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way would be to include 2 files - one the original crd yaml (from 1.0), and one crd-1.1.yaml. I believe we need to keep the 1.0 yaml around as is, and it might be a good practice to have separate files for each version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed re separate YAMLs for each version - this was in my original PR on this topic. I was moving ahead a little too fast to reintegrate them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the galley PR, this is read from files in helm/files directory.

- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "create", "watch"]
- apiGroups: ["apiextensions.k8s.io"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be easy to give it more persmissions later? What is the permissions upgrade story?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mandarjog Thanks for challenging the assumptions in this PR. Will verify this works properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mandarjog permissions upgrades work for helm template + kubectl apply. Will test helm upgrade model next.

Cheers
-stedve


while true; do
kubectl apply -f /etc/istio/operator/config
sleep 5 # ugh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add sleep option? Every 5 s seems excessive

name: virtualservices.networking.istio.io
labels:
app: istio-pilot
chart: istio
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For new install charts: please stop adding 'chart/heritage/release' labels.

I checked helm doc - and it doesn't appear to be any requirement to add them. Labels create secondary indexes in etcd and are expensive, plus label changes usually result in upgrade failures.

Too late to remove them if they got added in 1.0 - put for new templates we can stop this. ( we should go over the diffs from 1.0 and remove new labels as well )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right - not sure why the labels were added in the first place... Attempting to keep the code base consistent. As you point out, that may not be the optimal approach here. The tiller process adds those labels - perhaps the original PRs that added those labels were meant to ensure the helm template rendering was the same as helm install. I'll take a look at removing these labels once we have a 1.1 install that works properly with the additional CRDs. I seem to recall they were added in the 1.1 series..

@@ -0,0 +1,32 @@
{{/* vim: set filetype=mustache: */}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's another thing I hope we can avoid in future - if we remove the extra labels we also don't need this template.

Copy link
Contributor
@costinm costinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems excessive to keep a container running just to apply the crds (which is needed once).

Also I am not sure mixing the operator in the istio chart will work well (ordering, etc). While full decoupling and modularization can't happen in 1.1 - we can at least move in that direction with new components.

The operator (doing only crd fixing in 1.1) can be installed as a separate helm chart. And for security reasons I think istio-operator should be a separate namespace from the start (so we don't have to move it out later). It'll have a service account with elevated privs, no need for an istio-system admin to get access to them.

@@ -0,0 +1,5 @@
approvers:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this into install directory? There seems to be no go code here. So, a top level directory seems unnecessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move this to the install directory, however, there will be go code shortly, and will just need to move it back. Env WG is keen to have an operator for master (which this PR is against). This PR is targeted at kicking off that effort.

Happy to move to install if you desire given above new information - PLMK.

Cheers
-steve

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is going to be go code, its fine to have a top level directory. That said, since this is specific to kubernetes, don't you think this should be under pkg/kube/operator directory? The secret controller is also present there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rshriram agreed, kube specific operator code should be under pkg/kube/operator. WIll make it so.

Cheers
-steve

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rshriram would appreciate your expert eye on the parts of the code I've reviewed (also read the commit message please). I am going to implement this approach for galley. I think it will be simpler than a full blown process/pod just to instantiate CRDs at runtime. In the future, this code could serve as a starting point for an operator implementation (in 1.2 timeframe) but I am really keen to make CRDs work well in 1.1 - and time is precious.

Cheers
-steve

@sdake
Copy link
Member Author
sdake commented Nov 19, 2018

Thanks for the thorough reviews. Will submit updates that address review comments when I have them sorted out.

Cheers
-steve

This is mostly a bunch of copy and paste from galley and the
secret controller.  I found during this work that galley may be
a more appropriate place for ConfigMap to CustomResourceDefinition
conversion.  I will submit a PR with the relevant aspects of the
code integrated into Galley.  I have promised to have the operator PR
available for review, and its available :)

- This code requires a namespace called istio-operator.
- The meat of this code is the ConfigMap to CRD conversion based upon
a ListWatch.
- The ListWatch watches for any configmap labeled with istio/operator
and converts it to a CRD.

As for direct integration in Galley, I will submit a PR prior to our
next status meeting for the galley reviewers to evaluate.  I think that
may be simpler/less risk for 1.1 but will need a PR to really be certain.
@codecov
Copy link
codecov bot commented Nov 24, 2018

Codecov Report

Merging #10015 into master will increase coverage by 1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #10015    +/-   ##
========================================
+ Coverage      70%      70%    +1%     
========================================
  Files         436      436            
  Lines       41014    41242   +228     
========================================
+ Hits        28610    28822   +212     
- Misses      11027    11038    +11     
- Partials     1377     1382     +5
Impacted Files Coverage Δ
mixer/pkg/protobuf/yaml/resolver.go 95% <0%> (-5%) ⬇️
...ter/kubernetesenv/template/template_handler.gen.go 96% <0%> (-2%) ⬇️
mixer/adapter/signalfx/metrics.go 56% <0%> (-1%) ⬇️
mixer/adapter/dogstatsd/dogstatsd.go 100% <0%> (ø) ⬆️
mixer/adapter/rbac/rbac.go 0% <0%> (ø) ⬆️
mixer/adapter/list/regexList.go 100% <0%> (ø) ⬆️
mixer/adapter/inventory.gen.go 0% <0%> (ø) ⬆️
mixer/adapter/memquota/dedup.go 100% <0%> (ø) ⬆️
pilot/pkg/proxy/envoy/v2/ads.go 84% <0%> (+1%) ⬆️
mixer/adapter/bypass/bypass.go 65% <0%> (+1%) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4720a3d...497f8fe. Read the comment docs.

maxRetries = 5
)

var (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a key part of how this PR works - Codec Decode is used in the conversions of configmaps to CRDs.


// addConfigMap is called in the event that a configmap was added to the system.
func (c *Controller) addConfigMap(s *corev1.ConfigMap) {
log.Info("Converting ConfigMap to CRD object.")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this particular function is the main aspect of the ConfigMap to CRD conversion. For whatever reason, I was not able to sort out how to get YAMLtoJSON to return an array of JSON blobs. I tried a whole boatload of different things to arrive at this particular simple implementation.

I would really appreciate a thorough review of this part of the code as I am going to use it within the galley PR mentioned in the commit message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml.NewYAMLToJSONDecoder will parse and decode multi-doc YAML files.

@istio-testing
Copy link
Collaborator
istio-testing commented Nov 24, 2018

@sdake: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/e2e-simpleTests-cni.sh e2ab13d link /test e2e-simpleTests-cni
prow/istio-pilot-multicluster-e2e.sh e2ab13d link /test istio-pilot-multicluster-e2e
prow/istio-presubmit.sh 497f8fe link /test istio-presubmit

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

configMapInformer := cache.NewSharedIndexInformer(
&cache.ListWatch{
ListFunc: func(opts meta_v1.ListOptions) (runtime.Object, error) {
opts.LabelSelector = operatorLabel + "=true"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note - the informer is looking for configmaps with the label istio/operator

"istio.io/istio/operator/cmd/shared"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not be in the production code, consider create {blah}_test.go and move all testing code there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed.

}

// Printf is a FormatFn that prints the formatted string to os.Stdout.
var Printf = func(format string, args ...interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you needed to redefine Fatalf and Printf? It looks like istio has already logging package "istio.io/istio/pkg/log" why not to use it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c&p from galley. Not sure what the answer is - perhaps for consistency or to scope the debug system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Galley's implementation was c&p from Mixers which defined its own printf/fatalf interfaces. We can't use istio.io/istio/pkg/log Printf functions as is because the signature is different. The Galley, Mixer, and Pilot command handlers could be refactored to accept the istio.io/istio/pkg/log interfaces.


// Serialize the given object in nicely formatted JSON.
func Serialize(v interface{}) string {
b, _ := json.MarshalIndent(v, "", " ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring error is a really bad idea, especially in conversion functions. What is the point of wrapping Marshal into Serialize?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is also c&p from galley. Looking at the galley implementation, there may not be a way to properly handle an error in this situation so error handling was punted. Feel free to fix in galley :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see Serialize() is used anywhere in Galley. We can remove for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ozevren - general comment about galley - thought you may want to take a look. (some of this code was bootstrapped from galley to save time) :)

)

const (
operatorLabel = "istio-operator"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to check with k8s labeling model.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will take a look.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have decided to hardcode configmap names rather than work on labels since at present we don't need dynamic handling of configmap->crd conversion.

informer cache.SharedIndexInformer
}

func init() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho you do not need to use init() here..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed.

queue: queue,
}

configMapInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you do not plan to put back on the queue received object, I would suggest not to use it and call directly a function to deal with ConfigMap add/delete. Lots of unneeded complexity will go away. Also you can do type assertion right here and pass to your func v1.ConfigMap object.

yamlStrings := strings.Split(s.Data["config"], "---")

// Build a set of yaml strings and range over them.
for _, yamlCRD := range yamlStrings {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you need to convert Yaml into Json??? Normally you would get []byte from example Data field of ConfigMap and then Unmarshall it directly into the destination golang object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the configmap encodes several yaml blobs (with a --- seperator) into one configmap with one key. The CRDs could be split by keys and this code would not be necessary, however, that would require preprocessing. We want as minimal preprocessing as possible - which is an implementation constraint.

crdDeserialized, _, err := deserializer.Decode(jsonBlob, nil, nil)
if err != nil {
// Coudln't decode the object, no sense creating the CRD
fmt.Printf("%v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using istio logging, to get more info like timestamp etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed.

CRD := crdDeserialized.(*apiextensionsv1beta1.CustomResourceDefinition)

// Create the CRD object
if _, err := c.clientExt.ApiextensionsV1beta1().CustomResourceDefinitions().Create(CRD); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using exponential retry and backoff mechanism when creating API objects, not to fail when API server is just busy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea - will look into it for the galley implementation.

@sdake
Copy link
Member Author
sdake commented Nov 26, 2018

@sbezverk thanks for the great review dude :)

I have moved on to a different implementation to solve this problem here:
#10120

However when we get around to implementing a full-blown process/operator, your review will come in handy. To many of your questions, I think some would apply to galley proper - possibly some low hanging fruit for people to repair.

Cheers
-steve

@sdake sdake added area/environments do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. labels Nov 26, 2018
Copy link
Contributor
@ayj ayj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth looking into https://github.com/kubernetes-sigs/controller-runtime or even https://github.com/kubernetes-sigs/kubebuilder to aid in the development and maintenance of a k8s operator.

}

// Printf is a FormatFn that prints the formatted string to os.Stdout.
var Printf = func(format string, args ...interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Galley's implementation was c&p from Mixers which defined its own printf/fatalf interfaces. We can't use istio.io/istio/pkg/log Printf functions as is because the signature is different. The Galley, Mixer, and Pilot command handlers could be refactored to accept the istio.io/istio/pkg/log interfaces.


// Serialize the given object in nicely formatted JSON.
func Serialize(v interface{}) string {
b, _ := json.MarshalIndent(v, "", " ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see Serialize() is used anywhere in Galley. We can remove for now.


// addConfigMap is called in the event that a configmap was added to the system.
func (c *Controller) addConfigMap(s *corev1.ConfigMap) {
log.Info("Converting ConfigMap to CRD object.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml.NewYAMLToJSONDecoder will parse and decode multi-doc YAML files.

@secat
Copy link
secat commented Nov 27, 2018

General comment: why aren’t you using the kubebuilder or the operator-sdk for the operator logic?

@sdake
Copy link
Member Author
sdake commented Nov 27, 2018

@secat good question. I am a big fan of kubebuilder. I have not evaluated operator-sdk.

This is simply a very rough prototype. I needed to understand the codebase fully to come up with the simplest approach to handling 1.1 upgrades. For 1.2, I think we can take a hard look at these various code generators and SDKs.

Upstream now is leaning towards simply a process that starts, does crd registration from the configmap, and exits. This process would run as a Kubernetes job and be included in its own Helm chart or manifest.

The workflow for deployment would be:

helm install istio-system ^^ above code
helm upgrade istio - existing chart with subtractions based upon above code.

Cheers
-steve

@costinm
Copy link
Contributor
costinm commented Nov 27, 2018

In future we may use kubebuilder or operator-sdk ( I believe there is a 3rd framework used in Redhat installer). But will need to be evaluated and a choice be made - and of course the question is what features we really need and if it fits with our use cases, long term the operator may need to handle multi-cluster and other things. We have plenty of code to watch/interact k8s resources - not sure tools to make it easy are needed at this point.

@sbezverk
Copy link
Contributor
sbezverk commented Nov 27, 2018

I think the value of these tools is just to get you started quickly and without much thinking, boiler plate code. For complex controllers I do not think these tools are useful, as they do not provide any operator's logic which still needs to be developed.

@ostromart
Copy link
Contributor

A couple of nice things about kubebuilder is that it simplifies API stub generation and management, and it is being worked upon actively to support operators (non-human kind) with libraries rather than scaffolding. The validations are nice but probably insufficient. There's some scaffolding for testing which may encourage people to write UTs. Boilerplate for watching resources we don't need and logic is up to us. I think we should definitely take a look at the frameworks when the upgrade problems are solved.

@secat
Copy link
secat commented Nov 28, 2018

@sdake I am currently using the operator-sdk version v0.1.0 since I will integrate with the Operator Lifecycle Manager. However this SDK uses the same sigs.k8s.io/controller-runtime base library than the kubebuilder SDK.

From the sigs.k8s.io/controller-runtime project description:

The Kubernetes controller-runtime Project is a set of go libraries for building Controllers.

There is also an interesting article from Google Cloud named Best practices for building Kubernetes Operators and stateful apps for further reference on using SDK for building Kubernetes Operators.

@istio-testing
Copy link
Collaborator

@sdake: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Nov 29, 2018
@stale
Copy link
stale bot commented Dec 14, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale label Dec 14, 2018
@sdake
Copy link
Member Author
sdake commented Dec 19, 2018

I am closing this PR in favor of the much simpler (for 1.1):
#10562

@sdake sdake closed this Dec 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/environments do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. needs-rebase Indicates a PR needs to be rebased before being merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0