8000 Add `sys.cluster_health` table to get the overall cluster health by seut · Pull Request #17617 · crate/crate · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add sys.cluster_health table to get the overall cluster health #17617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 17, 2025

Conversation

seut
Copy link
Member
@seut seut commented Mar 14, 2025

This new table will expose the overall cluster health based on the cluster state and individual table health exposed by the existing sys.health table.
For example, if the cluster is starting and recovering, no table is yet available, such sys.health will return an empty result set. Such a state will result in a RED overal cluster health exposed by this new table.

Relates to #17600.

@seut seut requested a review from matriv March 14, 2025 16:06
@seut seut force-pushed the s/sys-cluster-health branch 2 times, most recently from 6fbb4a8 to 2bba4e3 Compare March 14, 2025 16:36
Copy link
Contributor
@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Left some minor comments
Additionally, maybe add one more test when cluster is YELLOW and tables are RED, (the code path is tested already so maybe an overkill).

@seut seut requested a review from matriv March 17, 2025 08:51
Copy link
Contributor
@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

Copy link
Contributor
mergify bot commented Mar 17, 2025

This pull request has been removed from the queue for the following reason: pull request dequeued.

Pull request #17617 has been dequeued, merge conditions unmatch:

  • label=ready-to-merge
  • any of [🛡 GitHub branch protection]:
    • check-neutral = ci/jenkins/pr_tests
    • check-skipped = ci/jenkins/pr_tests
    • check-success = ci/jenkins/pr_tests
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • #approved-reviews-by>=1
  • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]

You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it.

If you want to requeue this pull request, you can post a @mergifyio requeue comment.

if (health == TableHealth.Health.RED || tableHealth.getMissingShards() > 0) {
finalDescription = "One or more tables are missing shards";
} else if (health == TableHealth.Health.YELLOW) {
finalDescription = "One or more tables have underreplicated shards";
Copy link
Member
@mfussenegger mfussenegger Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would make sense to include the table names in the description, or would that get too long? (Maybe with truncation, after 10 tables or so)

Given that the info can be retrieved via sys.health in a structured way it is probably not too important, but could be convenient.

(Not a blocker)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how much this helps, as if this occurs, one might need to look into sys.health to read the missing/underreplicated shards per each table anyway. But if you think it's helpful, I'm fine to add this (with truncation).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also leave it as is for now, and extend it later if we notice that it would be useful to have the info.

@seut seut force-pushed the s/sys-cluster-health branch 2 times, most recently from df33ff6 to a5b5d06 Compare March 17, 2025 12:25
This new table will expose the overall cluster health based
on the cluster state and individual table health exposed by the
existing `sys.health` table.
For example, if the cluster is starting and recovering, no table
is yet available, such `sys.health` will return an empty result set.
Such a state will result in a `RED` overal cluster health exposed
by this new table.

Relates to #17600.
@seut seut force-pushed the s/sys-cluster-health branch from a5b5d06 to 116240b Compare March 17, 2025 12:34
@seut seut added the ready-to-merge Let Mergify merge the PR once approved and checks pass label Mar 17, 2025
@mergify mergify bot merged commit bb401fc into master Mar 17, 2025
22 checks passed
@mergify mergify bot deleted the s/sys-cluster-health branch March 17, 2025 13:03
seut added a commit that referenced this pull request Mar 19, 2025
seut added a commit that referenced this pull request Mar 20, 2025
mergify bot pushed a commit that referenced this pull request Mar 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed ready-to-merge Let Mergify merge the PR once approved and checks pass
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0