First Invasive check through external K8s Job - dcgmi #8
Signed-off-by: Claudia <c.misale@ibm.com>
first take on node labeling on dcgm 3 failure first take on node labeling on dcgm 3 failure explicit set of resources, plus some minor stuff Signed-off-by: Claudia <c.misale@ibm.com>
The PR is updated with the core logic for creating an external Job running dcgmi. Each Autopilot daemon pod is responsible for checking ONLY the resources on the node it belongs to. Each dcgmi Job:
The PR also contains refactoring, bug fixes, and other improvements.
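To make the per-node Job idea concrete, here is a minimal sketch of what such a Job manifest could look like when built in Go. It is not the code from this PR: the function name `buildDcgmiJob`, the Job name prefix, the container image, the use of `nodeName` to pin the pod to a node, and the `dcgmi diag -r 3` command line are all assumptions for illustration. It uses only the standard library and prints the manifest as JSON instead of calling the Kubernetes API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildDcgmiJob returns a batch/v1 Job manifest as a generic map.
// Hypothetical helper: the Job name prefix, the image, and the
// diagnostic level are illustrative assumptions, not values from the PR.
func buildDcgmiJob(nodeName, image string, level int) map[string]interface{} {
	return map[string]interface{}{
		"apiVersion": "batch/v1",
		"kind":       "Job",
		"metadata": map[string]interface{}{
			"name": "dcgmi-diag-" + nodeName,
		},
		"spec": map[string]interface{}{
			// Do not retry a failed diagnostic automatically.
			"backoffLimit": 0,
			"template": map[string]interface{}{
				"spec": map[string]interface{}{
					// Assumption: pin the pod to the node being checked.
					"nodeName":      nodeName,
					"restartPolicy": "Never",
					"containers": []interface{}{
						map[string]interface{}{
							"name":    "dcgmi",
							"image":   image,
							"command": []string{"dcgmi", "diag", "-r", fmt.Sprint(level)},
						},
					},
				},
			},
		},
	}
}

func main() {
	// "worker-0" and the image reference are placeholders.
	job := buildDcgmiJob("worker-0", "example.com/dcgm:placeholder", 3)
	out, err := json.MarshalIndent(job, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In a real daemon the same structure would more likely be expressed with typed client-go objects and submitted through the clientset; the map form is used here only to keep the sketch dependency-free.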
autopilot-daemon/pkg/utils/global.go (outdated)
    func GetClientsetInstance() *K8sClientset {
        var lock = &sync.Mutex{}
        if k8sClientset == nil {
From a code cleanliness perspective, I think we can remove this outer == nil check and acquire the lock every time.
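The suggestion above can be sketched as follows. This is a minimal, self-contained illustration and not the code that was merged: `K8sClientset` here is a stand-in struct with a hypothetical `Host` field, and the real type presumably wraps a Kubernetes clientset. Note one assumption about the quoted snippet: the mutex appears to be declared inside the function, which would give every call its own lock, so the sketch moves it to package level.

```go
package main

import (
	"fmt"
	"sync"
)

// K8sClientset is a stand-in for the wrapper type in the PR.
// The Host field is hypothetical and exists only for illustration.
type K8sClientset struct {
	Host string
}

var (
	// Package-level lock, shared by every caller.
	lock         sync.Mutex
	k8sClientset *K8sClientset
)

// GetClientsetInstance acquires the lock on every call and creates the
// instance on first use. There is no unlocked outer nil check.
func GetClientsetInstance() *K8sClientset {
	lock.Lock()
	defer lock.Unlock()
	if k8sClientset == nil {
		k8sClientset = &K8sClientset{Host: "in-cluster"}
	}
	return k8sClientset
}

func main() {
	a := GetClientsetInstance()
	b := GetClientsetInstance()
	fmt.Println(a == b) // prints "true"
}
```

Taking the lock on every call costs a little more than a double-checked pattern, but it is simpler to reason about and avoids an unsynchronized read of the shared pointer.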
Removed comments
LGTM. Thanks @cmisale !!
* update to golang version
* add singleton clientset instance
* changes to dependencies
* intrusive checks enablement
* This is a combination of 3 commits. first take on node labeling on dcgm 3 failure; first take on node labeling on dcgm 3 failure; explicit set of resources, plus some minor stuff
* bugfix: better error handling
* bugfix: unreachable node count
* Minor touches. Set invasive timer to 0 to avoid them
* Update functions.go: Removed comments

Signed-off-by: Claudia <c.misale@ibm.com>
This PR creates a singleton clientset that can be used anywhere in the code to access Kubernetes API objects.
It might be overkill, but we can keep it for now.
An example of usage
I also updated the Go version; it seemed like a good moment to do that.
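For readers unfamiliar with the pattern, here is a hedged sketch of how a singleton clientset like the one described above might be consumed from several goroutines. It swaps in `sync.Once` as an alternative to an explicit mutex; that choice, the stand-in `K8sClientset` struct, its `Host` field, and the `initCount` counter are assumptions made for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// K8sClientset is a stand-in type; the real one presumably wraps a
// Kubernetes clientset. The Host field is hypothetical.
type K8sClientset struct {
	Host string
}

var (
	once      sync.Once
	instance  *K8sClientset
	initCount int // counts how many times initialization actually ran
)

// GetClientsetInstance initializes the shared instance exactly once,
// no matter how many goroutines call it concurrently.
func GetClientsetInstance() *K8sClientset {
	once.Do(func() {
		initCount++
		instance = &K8sClientset{Host: "in-cluster"}
	})
	return instance
}

func main() {
	var wg sync.WaitGroup
	results := make([]*K8sClientset, 8)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = GetClientsetInstance()
		}(i)
	}
	wg.Wait()

	// Every goroutine should have received the same pointer.
	for _, r := range results {
		if r != results[0] {
			panic("callers received different instances")
		}
	}
	fmt.Println("initializations:", initCount) // prints "initializations: 1"
}
```

One limitation of `sync.Once` is that a failed initialization is not retried, so code that can fail while building the client may prefer the mutex-based form.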