Add `sys.cluster_health` table to get the overall cluster health by seut · Pull Request #17617 · crate/crate · GitHub

Add sys.cluster_health table to get the overall cluster health #17617


Merged 2 commits on Mar 17, 2025
49 changes: 49 additions & 0 deletions docs/admin/system-information.rst
@@ -1901,6 +1901,55 @@ Health definition
The ``sys.health`` table is subject to :ref:`shard_table_permissions` as it
will expose a summary of table shard states.

.. _sys-cluster_health:

Cluster Health
==============

The ``sys.cluster_health`` table returns the health of the entire cluster. It
always returns a single row containing the overall health of the cluster,
including the total number of missing and underreplicated shards across all
tables. Any table-specific health issues exposed by the
:ref:`sys.health <sys-health>` table are reflected here as well.

+----------------------------+-------------------------------------+--------------+
| Column Name                | Description                         | Return Type  |
+============================+=====================================+==============+
| ``health``                 | The cluster health label.           | ``TEXT``     |
|                            | Can be RED, YELLOW or GREEN.        |              |
+----------------------------+-------------------------------------+--------------+
| ``severity``               | The health as a ``smallint`` value. | ``SMALLINT`` |
|                            | Useful when ordering on health.     |              |
+----------------------------+-------------------------------------+--------------+
| ``description``            | A description of the current health | ``TEXT``     |
+----------------------------+-------------------------------------+--------------+
| ``missing_shards``         | The number of unassigned or not     | ``INTEGER``  |
|                            | started shards of all tables.       |              |
+----------------------------+-------------------------------------+--------------+
| ``underreplicated_shards`` | The number of shards which are      | ``INTEGER``  |
|                            | not fully replicated of all tables. |              |
+----------------------------+-------------------------------------+--------------+

Both ``missing_shards`` and ``underreplicated_shards`` might return ``-1`` if
the cluster is in an unhealthy state that prevents the exact number from being
calculated. This can happen when the cluster cannot elect a master because
not enough eligible nodes are available. In this case, the ``description``
column contains an explanation of the cluster status.

::

    cr> select * from sys.cluster_health order by severity desc;
    +-------------+--------+----------------+----------+------------------------+
    | description | health | missing_shards | severity | underreplicated_shards |
    +-------------+--------+----------------+----------+------------------------+
    |             | GREEN  |              0 |        1 |                      0 |
    +-------------+--------+----------------+----------+------------------------+
    SELECT 1 row in set (... sec)

The ``health`` with the highest ``severity`` will always define the
``health`` of the query scope.

.. _sys-repositories:

Repositories
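
The documented rule that the highest `severity` defines the reported `health` can be illustrated with a minimal sketch. The `SeverityDemo` class and its `Health` enum are hypothetical stand-ins, not CrateDB classes. `GREEN = 1` matches the sample output in the docs above; the values used for `YELLOW` and `RED` are assumptions made only for illustration.

```java
import java.util.List;

public class SeverityDemo {

    // Hypothetical stand-in for the health labels. GREEN = 1 matches the
    // sample query output; the YELLOW and RED values are assumed.
    enum Health {
        GREEN((short) 1),
        YELLOW((short) 2),
        RED((short) 3);

        private final short severity;

        Health(short severity) {
            this.severity = severity;
        }

        short severity() {
            return severity;
        }
    }

    // Returns the label with the highest severity, or GREEN when the list is empty.
    static Health worst(List<Health> healths) {
        Health result = Health.GREEN;
        for (Health h : healths) {
            if (h.severity() > result.severity()) {
                result = h;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints YELLOW
        System.out.println(worst(List.of(Health.GREEN, Health.YELLOW, Health.GREEN)));
    }
}
```

A single YELLOW entry is enough to move the overall result away from GREEN, which is why ordering by `severity` descending surfaces the worst state first.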
5 changes: 5 additions & 0 deletions docs/appendices/release-notes/6.0.0.rst
@@ -147,5 +147,10 @@ Administration and Operations
to give more information on ongoing merges and whether the recovery source is
still present in merged segments.

- Added the :ref:`sys.cluster_health <sys-cluster_health>` table, which reports
  the health of the whole cluster. In contrast, the
  :ref:`sys.health <sys-health>` table reports health per table only.

Client interfaces
-----------------
3 changes: 2 additions & 1 deletion docs/general/information-schema.rst
@@ -124,6 +124,7 @@ number of replicas.
| sys                | allocations                       | BASE TABLE | NULL             | NULL               |
| sys                | checks                            | BASE TABLE | NULL             | NULL               |
| sys                | cluster                           | BASE TABLE | NULL             | NULL               |
| sys                | cluster_health                    | BASE TABLE | NULL             | NULL               |
| sys                | health                            | BASE TABLE | NULL             | NULL               |
| sys                | jobs                              | BASE TABLE | NULL             | NULL               |
| sys                | jobs_log                          | BASE TABLE | NULL             | NULL               |
@@ -143,7 +144,7 @@ number of replicas.
 | sys                | summits                           | BASE TABLE | NULL             | NULL               |
 | sys                | users                             | BASE TABLE | NULL             | NULL               |
 +--------------------+-----------------------------------+------------+------------------+--------------------+
-SELECT 76 rows in set (... sec)
+SELECT 77 rows in set (... sec)


The table also contains additional information such as the specified
103 changes: 103 additions & 0 deletions server/src/main/java/io/crate/metadata/sys/SysClusterHealth.java
@@ -0,0 +1,103 @@
/*
* Licensed to Crate.io GmbH ("Crate") under one or more contributor
* license agreements. See the NOTICE file distributed with this work for
* additional information regarding copyright ownership. Crate licenses
* this file to you under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License. You may
* obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations
* under the License.
*
* However, if you have executed another commercial license agreement
* with Crate these terms will supersede the license and you may use the
* software solely pursuant to the terms of the relevant commercial agreement.
*/

package io.crate.metadata.sys;

import static io.crate.types.DataTypes.LONG;
import static io.crate.types.DataTypes.SHORT;
import static io.crate.types.DataTypes.STRING;

import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.block.ClusterBlock;
import org.elasticsearch.cluster.block.ClusterBlockLevel;
import org.elasticsearch.rest.RestStatus;

import io.crate.metadata.RelationName;
import io.crate.metadata.SystemTable;

public class SysClusterHealth {

    public static final RelationName IDENT = new RelationName(SysSchemaInfo.NAME, "cluster_health");

    static SystemTable<ClusterHealth> INSTANCE = SystemTable.<ClusterHealth>builder(IDENT)
        .add("health", STRING, ClusterHealth::getHealth)
        .add("severity", SHORT, ClusterHealth::getSeverity)
        .add("description", STRING, ClusterHealth::description)
        .add("missing_shards", LONG, ClusterHealth::missingShards)
        .add("underreplicated_shards", LONG, ClusterHealth::underreplicatedShards)
        .build();

    public static CompletableFuture<Iterable<ClusterHealth>> compute(ClusterState clusterState) {
        // Following implementation of {@link org.elasticsearch.cluster.health.ClusterStateHealth}
        Set<ClusterBlock> blocksRed = clusterState.blocks().global(RestStatus.SERVICE_UNAVAILABLE);
        if (!blocksRed.isEmpty()) {
            var block = blocksRed.iterator().next();
            ClusterHealth clusterHealth = new ClusterHealth(
                TableHealth.Health.RED,
                block.description(),
                -1,
                -1
            );
            return CompletableFuture.completedFuture(List.of(clusterHealth));
        }
        Set<ClusterBlock> blocksYellow = clusterState.blocks().global(ClusterBlockLevel.METADATA_WRITE);
        final TableHealth.Health clusterHealth = !blocksYellow.isEmpty() ? TableHealth.Health.YELLOW : TableHealth.Health.GREEN;
        final String description = !blocksYellow.isEmpty() ? blocksYellow.iterator().next().description() : "";
        return TableHealth.compute(clusterState)
            .thenApply((it) -> {
                long missingShards = 0;
                long underreplicatedShards = 0;
                TableHealth.Health health = clusterHealth;
                String finalDescription = description;
                for (var tableHealth : it) {
                    if (tableHealth.health().severity() > health.severity()) {
                        health = tableHealth.health();
                        // shard level health is only set to RED if there are missing primary shards that were
                        // allocated before see {@link ClusterShardHealth#getInactivePrimaryHealth()}.
                        // Otherwise, the table health is set to YELLOW.
                        if (health == TableHealth.Health.RED || tableHealth.getMissingShards() > 0) {
                            finalDescription = "One or more tables are missing shards";
                        } else if (health == TableHealth.Health.YELLOW) {
                            finalDescription = "One or more tables have underreplicated shards";

Review comment by mfussenegger (Member), Mar 17, 2025:

Do you think it would make sense to include the table names in the description, or would that get too long? (Maybe with truncation, after 10 tables or so)

Given that the info can be retrieved via sys.health in a structured way it is probably not too important, but could be convenient.

(Not a blocker)

Reply (Member, Author):

I wonder how much this helps, as if this occurs, one might need to look into sys.health to read the missing/underreplicated shards per each table anyway. But if you think it's helpful, I'm fine to add this (with truncation).

Reply (Member):

We can also leave it as is for now, and extend it later if we notice that it would be useful to have the info.

                        }
                    }
                    missingShards += tableHealth.getMissingShards();
                    underreplicatedShards += tableHealth.getUnderreplicatedShards();
                }
                return List.of(new ClusterHealth(health, finalDescription, missingShards, underreplicatedShards));
            });
    }

    public record ClusterHealth(TableHealth.Health health, String description, long missingShards, long underreplicatedShards) {

        public String getHealth() {
            return health.name();
        }

        public short getSeverity() {
            return health.severity();
        }
    }
}
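
The aggregation performed in `compute` can be tried in isolation with a simplified sketch. Everything below is a hypothetical stand-in written for illustration: `ClusterHealthSketch`, its nested `Health` enum and the two records are not the CrateDB types, and the enum's declaration order is assumed to mirror the severity order. Only the loop structure and the two description strings are taken from the diff above.

```java
import java.util.List;

public class ClusterHealthSketch {

    // Assumption: declaration order doubles as severity order (GREEN < YELLOW < RED).
    enum Health { GREEN, YELLOW, RED }

    // Simplified stand-ins for the per-table and cluster-wide results.
    record TableHealth(Health health, long missingShards, long underreplicatedShards) {}

    record ClusterHealth(Health health, String description, long missingShards, long underreplicatedShards) {}

    // Starts from the block-derived health and escalates it whenever a table
    // is in a worse state, while summing the shard counters of all tables.
    static ClusterHealth aggregate(Health initial, String initialDescription, List<TableHealth> tables) {
        Health health = initial;
        String description = initialDescription;
        long missing = 0;
        long underreplicated = 0;
        for (TableHealth table : tables) {
            if (table.health().compareTo(health) > 0) {
                health = table.health();
                if (health == Health.RED || table.missingShards() > 0) {
                    description = "One or more tables are missing shards";
                } else if (health == Health.YELLOW) {
                    description = "One or more tables have underreplicated shards";
                }
            }
            missing += table.missingShards();
            underreplicated += table.underreplicatedShards();
        }
        return new ClusterHealth(health, description, missing, underreplicated);
    }

    public static void main(String[] args) {
        ClusterHealth result = aggregate(Health.GREEN, "", List.of(
            new TableHealth(Health.GREEN, 0, 0),
            new TableHealth(Health.YELLOW, 0, 2)
        ));
        System.out.println(result);
    }
}
```

Note that the description only changes when a table raises the overall health, whereas the shard counters are always accumulated over every table.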
@@ -64,6 +64,7 @@ public SysSchemaInfo(ClusterService clusterService, Roles roles) {
Map.entry(SysSummitsTableInfo.IDENT.name(), SysSummitsTableInfo.INSTANCE),
Map.entry(SysAllocationsTableInfo.IDENT.name(), SysAllocationsTableInfo.INSTANCE),
Map.entry(SysHealth.IDENT.name(), SysHealth.INSTANCE),
Map.entry(SysClusterHealth.IDENT.name(), SysClusterHealth.INSTANCE),
Map.entry(SysMetricsTableInfo.NAME.name(), SysMetricsTableInfo.create(localNode)),
Map.entry(SysSegmentsTableInfo.IDENT.name(), SysSegmentsTableInfo.create(clusterService::localNode)),
Map.entry(
@@ -210,6 +210,14 @@ public SysTableDefinitions(ClusterService clusterService,
            true
        )
    ),
    Map.entry(
        SysClusterHealth.IDENT,
        new StaticTableDefinition<>(
            () -> SysClusterHealth.compute(clusterService.state()),
            SysClusterHealth.INSTANCE.expressions(),
            false
        )
    ),
    Map.entry(
        SysMetricsTableInfo.NAME,
        new StaticTableDefinition<>(
4 changes: 4 additions & 0 deletions server/src/main/java/io/crate/metadata/sys/TableHealth.java
@@ -118,6 +118,10 @@ public String getPartitionIdent() {
        return partitionIdent;
    }

    public Health health() {
        return health;
    }

    public String getHealth() {
        return health.toString();
    }
@@ -76,6 +76,16 @@ public Set<ClusterBlock> global(ClusterBlockLevel level) {
        return levelHolders.get(level).global();
    }

    public Set<ClusterBlock> global(RestStatus status) {
        Set<ClusterBlock> blocks = new HashSet<>();
        for (ClusterBlock clusterBlock : global) {
            if (clusterBlock.status().equals(status)) {
                blocks.add(clusterBlock);
            }
        }
        return blocks;
    }

    public ImmutableOpenMap<String, Set<ClusterBlock>> indices(ClusterBlockLevel level) {
        return levelHolders.get(level).indices();
    }
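
The new `global(RestStatus)` overload selects the global blocks that carry a given status. A minimal, self-contained sketch of the same filtering idea follows, using a hypothetical `Block` record and a plain `int` status code in place of `ClusterBlock` and `RestStatus`. The block descriptions and ids are made up for the example; 503 is the HTTP code for "Service Unavailable".

```java
import java.util.Set;
import java.util.stream.Collectors;

public class BlockFilterSketch {

    // Hypothetical stand-in for a cluster block: an id, a description and an HTTP-style status.
    record Block(int id, String description, int status) {}

    // Returns only the blocks whose status equals the requested one.
    static Set<Block> withStatus(Set<Block> global, int status) {
        return global.stream()
            .filter(block -> block.status() == status)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<Block> global = Set.of(
            new Block(1, "example block A", 503),
            new Block(2, "example block B", 403)
        );
        // Only the block with status 503 remains.
        System.out.println(withStatus(global, 503));
    }
}
```

The stream version is behaviourally equivalent to the explicit loop in the diff; which form to prefer is a matter of style.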
@@ -24,7 +24,6 @@
import static io.crate.protocols.postgres.PGErrorStatus.INTERNAL_ERROR;
import static io.crate.testing.Asserts.assertThat;
import static io.netty.handler.codec.http.HttpResponseStatus.BAD_REQUEST;
import static org.assertj.core.api.Assertions.assertThat;

import java.util.Collections;
import java.util.List;
@@ -114,6 +113,7 @@ public void testDefaultTables() {
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| allocations| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| checks| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| cluster| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| cluster_health| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| health| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| jobs| sys| BASE TABLE| NULL",
"NULL| NULL| NULL| strict| NULL| NULL| NULL| SYSTEM GENERATED| NULL| NULL| NULL| crate| jobs_log| sys| BASE TABLE| NULL",
@@ -213,12 +213,12 @@ public void testSelectViewsFromInformationSchema() {
     @Test
     public void testSearchInformationSchemaTablesRefresh() {
         execute("select * from information_schema.tables");
-        assertThat(response.rowCount()).isEqualTo(72L);
+        assertThat(response.rowCount()).isEqualTo(73L);

         execute("create table t4 (col1 integer, col2 string) with(number_of_replicas=0)");

         execute("select * from information_schema.tables");
-        assertThat(response.rowCount()).isEqualTo(73L);
+        assertThat(response.rowCount()).isEqualTo(74L);
     }

     @Test
@@ -571,7 +571,7 @@ public void testTableConstraintsWithOrderBy() {
     @Test
     public void testDefaultColumns() {
         execute("select * from information_schema.columns order by table_schema, table_name");
-        assertThat(response.rowCount()).isEqualTo(1042);
+        assertThat(response.rowCount()).isEqualTo(1047);
     }

     @Test
@@ -889,7 +889,7 @@ public void testGlobalCount() {
         execute("create table t3 (id integer, col1 string) clustered into 3 shards with(number_of_replicas=0)");
         execute("select count(*) from information_schema.tables");
         assertThat(response.rowCount()).isEqualTo(1);
-        assertThat(response.rows()[0][0]).isEqualTo(75L);
+        assertThat(response.rows()[0][0]).isEqualTo(76L);
     }

     @Test
@@ -94,7 +94,7 @@ public void testSetNonDynamicTableSetting() {
     public void testFilterOnNull() throws Exception {
         execute("select * from information_schema.tables " +
                 "where settings IS NULL");
-        assertThat(response.rowCount()).isEqualTo(72L);
+        assertThat(response.rowCount()).isEqualTo(73L);
         execute("select * from information_schema.tables " +
                 "where table_name = 'settings_table' and settings['blocks']['read'] IS NULL");
         assertThat(response.rowCount()).isEqualTo(0);