-
Notifications
You must be signed in to change notification settings - Fork 536
[SLI Metrics] RayClusterHeadPodReady #3571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[SLI Metrics] RayClusterHeadPodReady #3571
Conversation
func (c *RayClusterMetricsManager) ObserveRayClusterHeadPodReady(name, namespace string, ready bool) { | ||
c.rayClusterHeadPodReady.DeleteLabelValues(name, namespace, strconv.FormatBool(!ready)) | ||
c.rayClusterHeadPodReady.WithLabelValues(name, namespace, strconv.FormatBool(ready)).Set(1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach is to follow the convention used by kube_state_metrics
, where metrics represent status conditions using a condition label (e.g., "true", "false", "unknown"), and the metric value is either 0 or 1. In this model, each resource instance emits multiple metrics for the same status, but only one will have a value of 1 at any given time — indicating the current state.
User can also determine when a given resource changed state, providing valuable insights for auditing, debugging, and historical analysis.
However, this approach does come with a trade-off: it increases metric cardinality, as each resource will now generate multiple labeled series — one for each condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current approach also works well — it's simple and straightforward, so perhaps we can continue with it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to the way you recommended, so we don't need to change it in the future.
Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
6e29a16
to
b8b8f0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, can you also update the description?
cc @kevin85421 for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on our previous discussion, we've decided to use kube-state-metrics for this instead. cc @troychiu @win5923?
https://docs.google.com/document/d/1zNiE7lVZYjhrxlTbh1UXOVpR6hh1GIeSfCfE9Lt5v6Y/edit?tab=t.0
Yes, but I saw Troy mention on Slack that we could do this first, so I thought that we were supposed to keep this metric. |
By default, In this case, I think |
Yeah I think our conclusion is using kube-state-metrics. |
Why are these changes needed?
kuberay_cluster_head_pod_ready
proposed in 1.4.0.Manual test:
Related issue number
Checks