From 6f8599bf4263e0b91b076576eb5729dcc74b07bf Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Sun, 18 Dec 2022 15:49:08 -0500 Subject: [PATCH 01/28] Start first draft of ADR-084 Signed-off-by: Thane Thomson --- .../adr-084-data-companion-pull-api.md | 115 ++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100644 docs/architecture/adr-084-data-companion-pull-api.md diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md new file mode 100644 index 00000000000..b3f67be1f86 --- /dev/null +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -0,0 +1,115 @@ +# ADR 084: Data Companion Pull API + +## Changelog + +- 2022-12-18: First draft (@thanethomson) + +## Status + +Accepted | Rejected | Deprecated | Superseded by + +## Context + +Following from the discussion around the development of [ADR 082][adr-082], an +alternative model is proposed here for offloading certain data from +Tendermint-based nodes to a "data companion". This alternative model inverts the +control of the data offloading process, when compared to ADR 082, from the +Tendermint node to the data companion. + +This particular model would + +## Alternative Approaches + +Other considered alternatives to this ADR are also outlined in +[ADR-082][adr-082]. + +## Decision + +> This section records the decision that was made. + +## Detailed Design + +### Requirements + +Similar requirements are proposed here as for [ADR-082][adr-082]. + +1. Only a single trusted companion service _must_ be supported in order to + reduce the likelihood of overloading a node via the companion API. Use cases + that require multiple companions will have to implement an intermediate/proxy + companion that can scale independently of the node. + +2. All or part of the following data _must_ be obtainable by the companion: + 1. Committed block data + 2. `FinalizeBlockResponse` data, but only for committed blocks + +3. The companion _must_ be able to establish the earliest height for which the + node has all of the requisite data. + +4. The companion _must_ be able to establish, as close to real-time as possible, + the height and ID of the block that was just committed by the network, so + that it can request relevant heights' data from the node. + +5. The API _must_ be (or be able to be) appropriately shielded from untrusted + consumers and abuse. Critical control facets of the API (e.g. those that + influence the node's pruning mechanisms) _must_ be implemented in such a way + as to eliminate the possibility of accidentally exposing those endpoints to + the public internet unprotected. + +6. The node _must_ know, by way of signals from the companion, which heights' + associated data are safe to automatically prune. + +7. The API _must_ be opt-in. When off or not in use, it _should_ have no impact + on system performance. + +8. It _must_ not cause back-pressure into consensus. + +8. It _must_ not cause unbounded memory growth. + +10. It _must_ not cause unbounded disk storage growth. + +11. It _must_ provide insight to operators (e.g. by way of logs/metrics) to + assist in dealing with possible failure modes. + +12. The solution _should_ be able to be backported to older versions of + Tendermint (e.g. v0.34). + +### Entity Relationships + +The following model shows the proposed relationships between Tendermint, a +socket-based ABCI application, and the proposed data companion service. + +``` + +----------+ +------------+ +----------------+ + | ABCI App | <--- | Tendermint | <--- | Data Companion | + +----------+ +------------+ +----------------+ +``` + +In this diagram, it is evident that Tendermint connects out to the ABCI +application, and the companion connects to the Tendermint node. + +### gRPC API + +At the time of this writing, it is proposed that Tendermint implement a full +gRPC interface ([\#9751]). As such, we have several options when it comes to +implementing the data companion pull API: + +1. Extend the existing RPC API to simply provide the additional data + companion-specific endpoints. +2. Implement a separate RPC API on a different port to the standard gRPC + interface. + +## Consequences + +> This section describes the consequences, after applying the decision. All +> consequences should be summarized here, not just the "positive" ones. + +### Positive + +### Negative + +### Neutral + +## References + +[adr-082]: https://github.com/tendermint/tendermint/pull/9437 +[\#9751]: https://github.com/tendermint/tendermint/issues/9751 From 430d95806f469fb35cb7f32251a9a712399deaad Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Sun, 18 Dec 2022 18:29:38 -0500 Subject: [PATCH 02/28] Add initial gRPC API Signed-off-by: Thane Thomson --- .../adr-084-data-companion-pull-api.md | 135 +++++++++++++++++- 1 file changed, 133 insertions(+), 2 deletions(-) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index b3f67be1f86..3a67303423c 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -94,9 +94,139 @@ gRPC interface ([\#9751]). As such, we have several options when it comes to implementing the data companion pull API: 1. Extend the existing RPC API to simply provide the additional data - companion-specific endpoints. + companion-specific endpoints. In order to meet the + [requirements](#requirements), however, some of the endpoints will have to be + protected by default. This is simpler for clients to interact with though, + because they only need to interact with a single endpoint for all of their + needs. 2. Implement a separate RPC API on a different port to the standard gRPC - interface. + interface. This allows for a clearer distinction between the standard and + data companion-specific gRPC interfaces, but complicates the server and + client interaction models. + +Due to the poorer operator experience in option 2, it would be preferable to +implement option 1, but have certain endpoints be +[access-controlled](#access-control) by default. + +With this in mind, the following gRPC API is proposed, where the Tendermint node +will implement these services. + +```protobuf +import "tendermint/abci/types.proto"; +import "tendermint/types/types.proto"; +import "tendermint/types/block.proto"; + +// BlockService provides information about blocks. +service BlockService { + // GetLatestBlockID returns a stream of the latest block IDs as they are + // committed by the network. + rpc GetLatestBlockID(GetLatestBlockIDRequest) returns (stream GetLatestBlockIDResponse) {} + + // GetBlock attempts to retrieve the block at a particular height. + rpc GetBlock(GetBlockRequest) returns (GetBlockResponse) {} + + // GetBlockResults attempts to retrieve the results of block execution for a + // particular height. + rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) {} +} + +// DataCompanionService provides privileged access to specialized pruning +// functionality on the Tendermint node to help optimize node storage. +service DataCompanionService { + // SetRetainHeight notifies the node of the minimum height whose data must + // be retained by the node. This data includes block data and block + // execution results. + // + // The lower of this retain height and that set by the application in its + // Commit response will be used by the node to determine which heights' data + // can be pruned. + rpc SetRetainHeight(SetRetainHeightRequest) returns (SetRetainHeightResponse) {} + + // GetRetainHeight returns the retain height set by the companion and that + // set by the application. This can give the companion an indication as to + // which heights' data are currently available. + rpc GetRetainHeight(GetRetainHeightRequest) returns (GetRetainHeightResponse) {} +} + +message GetLatestBlockIDRequest {} + +// GetLatestBlockIDResponse is a lightweight reference to the latest committed +// block. +message GetLatestBlockIDResponse { + // The height of the latest committed block. + int64 height = 1; + // The ID of the latest committed block. + tendermint.types.BlockID block_id = 2; +} + +message GetBlockRequest { + // The height of the block to get. + int64 height = 1; +} + +message GetBlockResponse { + // Block data for the requested height. + tendermint.types.Block block = 1; +} + +message GetBlockResultsRequest { + // The height of the block results to get. + int64 height = 1; +} + +message GetBlockResultsResponse { + // All events produced by the ABCI BeginBlock call for the block. + repeated tendermint.abci.Event begin_block_events = 1; + + // All transaction results produced by block execution. + repeated tendermint.abci.ExecTxResult tx_results = 2; + + // Validator updates during block execution. + repeated tendermint.abci.ValidatorUpdate validator_updates = 3; + + // Consensus parameter updates during block execution. + tendermint.types.ConsensusParams consensus_param_updates = 4; + + // All events produced by the ABCI EndBlock call. + // NB: This should be called finalize_block_events when ABCI 2.0 lands. + repeated tendermint.abci.Event end_block_events = 5; +} + +message SetRetainHeightRequest { + int64 height = 1; +} + +message SetRetainHeightResponse {} + +message GetRetainHeightRequest {} + +message GetRetainHeightResponse { + // The retain height as set by the data companion. + int64 data_companion_retain_height = 1; + // The retain height as set by the ABCI application. + int64 app_retain_height = 2; + // The height whose data is currently retained, which is influenced by the + // data companion and ABCI application retain heights, but the node may not + // yet have executed its pruning operation. + int64 actual_retain_height = 3; +} +``` + +### Access Control + +As covered in the [gRPC API section](#grpc-api), it would be preferable to +implement some form of access control for sensitive, data companion-specific +APIs. At least **basic HTTP authentication** should be implemented for these +endpoints, where credentials should be obtained (in order of precedence): + +1. From an `.htpasswd` file, using the same format as [Apache `.htpasswd` + files][htpasswd], whose location is set in the Tendermint configuration file. +2. Randomly generated and written to the logs. + +### Configuration + +The following configuration file update is proposed to support the data +companion API. ## Consequences @@ -113,3 +243,4 @@ implementing the data companion pull API: [adr-082]: https://github.com/tendermint/tendermint/pull/9437 [\#9751]: https://github.com/tendermint/tendermint/issues/9751 +[htpasswd]: https://httpd.apache.org/docs/current/programs/htpasswd.html From c4ac47a9673ab17e2478139ef10d52863cdf46a7 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 5 Jan 2023 12:06:21 -0500 Subject: [PATCH 03/28] Expand on pull API service definitions, requirements Signed-off-by: Thane Thomson --- .../adr-084-data-companion-pull-api.md | 173 ++++++++++++------ 1 file changed, 120 insertions(+), 53 deletions(-) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index 3a67303423c..221f816e216 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -11,12 +11,13 @@ Accepted | Rejected | Deprecated | Superseded by ## Context Following from the discussion around the development of [ADR 082][adr-082], an -alternative model is proposed here for offloading certain data from -Tendermint-based nodes to a "data companion". This alternative model inverts the -control of the data offloading process, when compared to ADR 082, from the -Tendermint node to the data companion. +alternative model is proposed here for offloading certain data from nodes to a +"data companion". This alternative model inverts the control of the data +offloading process, when compared to ADR 082, from the node to the data +companion. -This particular model would +Overall, this approach provides slightly weaker guarantees than that of ADR 082, +but represents a simpler model to implement. ## Alternative Approaches @@ -33,44 +34,38 @@ Other considered alternatives to this ADR are also outlined in Similar requirements are proposed here as for [ADR-082][adr-082]. -1. Only a single trusted companion service _must_ be supported in order to - reduce the likelihood of overloading a node via the companion API. Use cases - that require multiple companions will have to implement an intermediate/proxy - companion that can scale independently of the node. +1. A node _must_ support at most one data companion. -2. All or part of the following data _must_ be obtainable by the companion: +2. All or part of the following data _must_ be obtainable by the companion, and + as close to real-time as possible: 1. Committed block data 2. `FinalizeBlockResponse` data, but only for committed blocks 3. The companion _must_ be able to establish the earliest height for which the node has all of the requisite data. -4. The companion _must_ be able to establish, as close to real-time as possible, - the height and ID of the block that was just committed by the network, so - that it can request relevant heights' data from the node. - -5. The API _must_ be (or be able to be) appropriately shielded from untrusted +4. The API _must_ be (or be able to be) appropriately shielded from untrusted consumers and abuse. Critical control facets of the API (e.g. those that influence the node's pruning mechanisms) _must_ be implemented in such a way as to eliminate the possibility of accidentally exposing those endpoints to the public internet unprotected. -6. The node _must_ know, by way of signals from the companion, which heights' +5. The node _must_ know, by way of signals from the companion, which heights' associated data are safe to automatically prune. -7. The API _must_ be opt-in. When off or not in use, it _should_ have no impact +6. The API _must_ be opt-in. When off or not in use, it _should_ have no impact on system performance. -8. It _must_ not cause back-pressure into consensus. +7. It _must_ not cause back-pressure into consensus. 8. It _must_ not cause unbounded memory growth. -10. It _must_ not cause unbounded disk storage growth. +9. It _must_ provide one or more ways for operators to control storage growth. -11. It _must_ provide insight to operators (e.g. by way of logs/metrics) to +10. It _must_ provide insight to operators (e.g. by way of logs/metrics) to assist in dealing with possible failure modes. -12. The solution _should_ be able to be backported to older versions of +11. The solution _should_ be able to be backported to older versions of Tendermint (e.g. v0.34). ### Entity Relationships @@ -90,7 +85,7 @@ application, and the companion connects to the Tendermint node. ### gRPC API At the time of this writing, it is proposed that Tendermint implement a full -gRPC interface ([\#9751]). As such, we have several options when it comes to +gRPC interface ([\#81]). As such, we have several options when it comes to implementing the data companion pull API: 1. Extend the existing RPC API to simply provide the additional data @@ -111,7 +106,17 @@ implement option 1, but have certain endpoints be With this in mind, the following gRPC API is proposed, where the Tendermint node will implement these services. +#### Block Service + +When implementing gRPC support for a node, this service (or at least a slight +variation of it) would be applicable anyways, regardless of whether this ADR is +implemented. + ```protobuf +syntax = "proto3"; + +package tendermint.block_service.v1; + import "tendermint/abci/types.proto"; import "tendermint/types/types.proto"; import "tendermint/types/block.proto"; @@ -130,24 +135,6 @@ service BlockService { rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) {} } -// DataCompanionService provides privileged access to specialized pruning -// functionality on the Tendermint node to help optimize node storage. -service DataCompanionService { - // SetRetainHeight notifies the node of the minimum height whose data must - // be retained by the node. This data includes block data and block - // execution results. - // - // The lower of this retain height and that set by the application in its - // Commit response will be used by the node to determine which heights' data - // can be pruned. - rpc SetRetainHeight(SetRetainHeightRequest) returns (SetRetainHeightResponse) {} - - // GetRetainHeight returns the retain height set by the companion and that - // set by the application. This can give the companion an indication as to - // which heights' data are currently available. - rpc GetRetainHeight(GetRetainHeightRequest) returns (GetRetainHeightResponse) {} -} - message GetLatestBlockIDRequest {} // GetLatestBlockIDResponse is a lightweight reference to the latest committed @@ -191,6 +178,35 @@ message GetBlockResultsResponse { // NB: This should be called finalize_block_events when ABCI 2.0 lands. repeated tendermint.abci.Event end_block_events = 5; } +``` + +#### Data Companion Service + +```protobuf +syntax = "proto3"; + +package tendermint.data_companion_service.v1; + +// DataCompanionService provides privileged access to specialized pruning +// functionality on the Tendermint node to help optimize node storage. +service DataCompanionService { + // SetRetainHeight notifies the node of the minimum height whose data must + // be retained by the node. This data includes block data and block + // execution results. + // + // Setting a retain height lower than a previous setting will result in an + // error. + // + // The lower of this retain height and that set by the application in its + // Commit response will be used by the node to determine which heights' data + // can be pruned. + rpc SetRetainHeight(SetRetainHeightRequest) returns (SetRetainHeightResponse) {} + + // GetRetainHeight returns the retain height set by the companion and that + // set by the application. This can give the companion an indication as to + // which heights' data are currently available. + rpc GetRetainHeight(GetRetainHeightRequest) returns (GetRetainHeightResponse) {} +} message SetRetainHeightRequest { int64 height = 1; @@ -205,42 +221,93 @@ message GetRetainHeightResponse { int64 data_companion_retain_height = 1; // The retain height as set by the ABCI application. int64 app_retain_height = 2; - // The height whose data is currently retained, which is influenced by the - // data companion and ABCI application retain heights, but the node may not - // yet have executed its pruning operation. - int64 actual_retain_height = 3; } ``` +With this API design, it is technically possible for an integrator to attach +multiple data companions to the node, but only one of their retain heights will +be applied. + ### Access Control As covered in the [gRPC API section](#grpc-api), it would be preferable to implement some form of access control for sensitive, data companion-specific APIs. At least **basic HTTP authentication** should be implemented for these -endpoints, where credentials should be obtained (in order of precedence): +endpoints, where credentials should be obtained from an `.htpasswd` file, using +the same format as [Apache `.htpasswd` files][htpasswd], whose location is set +in the Tendermint configuration file. + +`.htpasswd` files are relatively standard and are supported by many web servers. +The format is relatively straightforward to parse and interpret. -1. From an `.htpasswd` file, using the same format as [Apache `.htpasswd` - files][htpasswd], whose location is set in the Tendermint configuration file. -2. Randomly generated and written to the logs. +We should strongly consider only supporting the bcrypt encryption option for +passwords stored in `.htpasswd` files. ### Configuration The following configuration file update is proposed to support the data companion API. -## Consequences +```toml +# A data companion, if enabled, is intended to offload the storage of certain +# types of data from the node. Specifically: +# 1. Block data, including transactions. +# 2. Block results, including events, transaction results, validator and +# consensus parameter updates. +# +# A data companion can influence the pruning height of the node, and therefore +# the data companion gRPC service is considered to be a sensitive endpoint that +# is password-protected by default. +[data_companion] + +# Is the data companion API enabled at all? Default: false +enabled = false + +# Authentication configuration for the data companion. +[data_companion.authentication] + +# The authentication method to use. At present, the only supported method is +# "basic" (i.e. basic HTTP authentication). +method = "basic" + +# Path to the file containing basic authentication credentials to access the +# data companion service. If the data companion is enabled and this password +# file is not supplied, or the file does not exist, or the file is of an invalid +# format, the node will fail to start. +# +# See https://httpd.apache.org/docs/current/programs/htpasswd.html for details +# on how to create/configure this file. +password_file = "/path/to/.htpasswd" +``` -> This section describes the consequences, after applying the decision. All -> consequences should be summarized here, not just the "positive" ones. +## Consequences ### Positive +- Facilitates offloading of data to an external service, which can be scaled + independently of the node + - Potentially reduces load on the node itself + - Paves the way for eventually reducing the surface area of a node's exposed + APIs + ### Negative +- Increases system complexity slightly in the short-term +- If data companions are not correctly implemented and deployed (e.g. if a + companion is attached to the same storage as the node, and/or if its retain + height signaling is poorly handled), this could result in substantially + increased storage usage + ### Neutral +- Expands the overall API surface area of a node in the short-term + ## References -[adr-082]: https://github.com/tendermint/tendermint/pull/9437 -[\#9751]: https://github.com/tendermint/tendermint/issues/9751 +- [ADR 082 - Data Companion Push API][adr-082] +- [\#81 - rpc: Add gRPC support][\#81] +- [`.htpasswd`][htpasswd] + +[adr-082]: https://github.com/CometBFT/tendermint/pull/73 +[\#81]: https://github.com/CometBFT/tendermint/issues/81 [htpasswd]: https://httpd.apache.org/docs/current/programs/htpasswd.html From b6aabb25fd26ee41597b20ed96ef0363fad2e8e1 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 5 Jan 2023 12:13:28 -0500 Subject: [PATCH 04/28] Add metric Signed-off-by: Thane Thomson --- docs/architecture/adr-084-data-companion-pull-api.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index 221f816e216..6b2f2a7125f 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -280,6 +280,15 @@ method = "basic" password_file = "/path/to/.htpasswd" ``` +### Metrics + +The following metrics are proposed to be added to monitor the health of the +interaction between a node and its data companion: + +- `data_companion_retain_height` - The current retain height as requested by the + data companion. This can give operators insight into whether the companion is + lagging significantly behind the current network height. + ## Consequences ### Positive From a4e7f20e3e349a6b58dff7e7e51341935961aea1 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 5 Jan 2023 12:14:46 -0500 Subject: [PATCH 05/28] Add positive consequence re: greater leeway for companion Signed-off-by: Thane Thomson --- docs/architecture/adr-084-data-companion-pull-api.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index 6b2f2a7125f..edee5e45032 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -298,6 +298,8 @@ interaction between a node and its data companion: - Potentially reduces load on the node itself - Paves the way for eventually reducing the surface area of a node's exposed APIs +- Allows the data companion more leeway in reading the data it needs than the + approach in [ADR 082][adr-082]. ### Negative From 724bc966f6158df2efb28926ec9b3016e3b8c8c5 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 5 Jan 2023 12:16:49 -0500 Subject: [PATCH 06/28] Add positive consequence re: simpler implementation Signed-off-by: Thane Thomson --- docs/architecture/adr-084-data-companion-pull-api.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index edee5e45032..8a2b1b91d6e 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -299,7 +299,8 @@ interaction between a node and its data companion: - Paves the way for eventually reducing the surface area of a node's exposed APIs - Allows the data companion more leeway in reading the data it needs than the - approach in [ADR 082][adr-082]. + approach in [ADR 082][adr-082] +- Simpler implementation than [ADR 082][adr-082] ### Negative From 1823f199ef878df394b5d7aac83667ab9f801205 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Mon, 9 Jan 2023 11:38:28 -0500 Subject: [PATCH 07/28] Add max_retain_lag config option Signed-off-by: Thane Thomson --- docs/architecture/adr-084-data-companion-pull-api.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index 8a2b1b91d6e..5ea0205c077 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -260,9 +260,15 @@ companion API. # is password-protected by default. [data_companion] -# Is the data companion API enabled at all? Default: false +# Is the data companion gRPC API enabled at all? Default: false enabled = false +# The maximum number of blocks to allow the data companion's retain height to +# lag behind the current height. The node will block all other operations until +# the companion's retain height is less than `max_retain_lag` blocks behind the +# current height. Default: 100 +max_retain_lag = 100 + # Authentication configuration for the data companion. [data_companion.authentication] From 5c94e822056d6fdbb0cd1661fab77a4dc1ab219a Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Mon, 9 Jan 2023 16:00:45 -0500 Subject: [PATCH 08/28] Expand on pruning-related behaviour and requirements Signed-off-by: Thane Thomson --- .../adr-084-data-companion-pull-api.md | 38 ++++++++++++++++--- 1 file changed, 32 insertions(+), 6 deletions(-) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-084-data-companion-pull-api.md index 5ea0205c077..eed4b028bcb 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-084-data-companion-pull-api.md @@ -53,19 +53,27 @@ Similar requirements are proposed here as for [ADR-082][adr-082]. 5. The node _must_ know, by way of signals from the companion, which heights' associated data are safe to automatically prune. -6. The API _must_ be opt-in. When off or not in use, it _should_ have no impact +6. The companion _must_ be able to handle the possibility that a node might + start from a non-zero height (i.e. that the node may state sync from a + specific height beyond genesis). + +7. The companion _must_ be able to handle the possibility that a node may + disable and then later re-enable the companion interface, potentially causing + the companion to have missing data in between those two heights. + +8. The API _must_ be opt-in. When off or not in use, it _should_ have no impact on system performance. -7. It _must_ not cause back-pressure into consensus. +9. It _must_ not cause back-pressure into consensus. -8. It _must_ not cause unbounded memory growth. +10. It _must_ not cause unbounded memory growth. -9. It _must_ provide one or more ways for operators to control storage growth. +11. It _must_ provide one or more ways for operators to control storage growth. -10. It _must_ provide insight to operators (e.g. by way of logs/metrics) to +12. It _must_ provide insight to operators (e.g. by way of logs/metrics) to assist in dealing with possible failure modes. -11. The solution _should_ be able to be backported to older versions of +13. The solution _should_ be able to be backported to older versions of Tendermint (e.g. v0.34). ### Entity Relationships @@ -82,6 +90,23 @@ socket-based ABCI application, and the proposed data companion service. In this diagram, it is evident that Tendermint connects out to the ABCI application, and the companion connects to the Tendermint node. +### Pruning Behaviour + +There are two "modes" of pruning that need to be supported by the node: + +1. **Data companion disabled**: Here, the node prunes as per normal based on the + `retain_height` parameter supplied by the application via the + [`Commit`][abci-commit] ABCI response. +2. **Data companion enabled**: Here, the node prunes _blocks_ based on the lower + of the retain heights specified by the application and the data companion, + and the node prunes _block results_ based on the retain height set by the + data companion. + +Enabling the `discard_abci_responses` flag under the `[storage]` section in the +configuration is incompatible with enabling a data companion. If +`storage.discard_abci_responses` and `data_companion.enabled` are both `true`, +then the node _must_ fail to start. + ### gRPC API At the time of this writing, it is proposed that Tendermint implement a full @@ -329,3 +354,4 @@ interaction between a node and its data companion: [adr-082]: https://github.com/CometBFT/tendermint/pull/73 [\#81]: https://github.com/CometBFT/tendermint/issues/81 [htpasswd]: https://httpd.apache.org/docs/current/programs/htpasswd.html +[abci-commit]: ../../spec/abci/abci++_methods.md#commit From f7d3bd5022bdace64aebb28cd086ae7c5bafc2b4 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Tue, 28 Feb 2023 16:46:57 -0500 Subject: [PATCH 09/28] Renumber ADR from 084 to 101 Signed-off-by: Thane Thomson --- docs/architecture/README.md | 1 + ....md => adr-101-data-companion-pull-api.md} | 19 ++++++++++--------- 2 files changed, 11 insertions(+), 9 deletions(-) rename docs/architecture/{adr-084-data-companion-pull-api.md => adr-101-data-companion-pull-api.md} (96%) diff --git a/docs/architecture/README.md b/docs/architecture/README.md index b48a9d32b24..4c0f2d68400 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -42,6 +42,7 @@ numbering our ADRs from 100 onwards. ### Proposed +- [ADR-101: Data companion pull API](./adr-101-data-companion-pull-api.md) - [ADR-103: Protobuf definition versioning](./adr-103-proto-versioning.md) - [ADR-105: Refactor list of senders in mempool](./adr-105-refactor-mempool-senders.md) diff --git a/docs/architecture/adr-084-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md similarity index 96% rename from docs/architecture/adr-084-data-companion-pull-api.md rename to docs/architecture/adr-101-data-companion-pull-api.md index eed4b028bcb..a4b102d649e 100644 --- a/docs/architecture/adr-084-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -1,7 +1,8 @@ -# ADR 084: Data Companion Pull API +# ADR 101: Data Companion Pull API ## Changelog +- 2023-02-28: Renumber from 084 to 101 (@thanethomson) - 2022-12-18: First draft (@thanethomson) ## Status @@ -10,7 +11,7 @@ Accepted | Rejected | Deprecated | Superseded by ## Context -Following from the discussion around the development of [ADR 082][adr-082], an +Following from the discussion around the development of [ADR 100][adr-100], an alternative model is proposed here for offloading certain data from nodes to a "data companion". This alternative model inverts the control of the data offloading process, when compared to ADR 082, from the node to the data @@ -22,7 +23,7 @@ but represents a simpler model to implement. ## Alternative Approaches Other considered alternatives to this ADR are also outlined in -[ADR-082][adr-082]. +[ADR-100][adr-100]. ## Decision @@ -32,7 +33,7 @@ Other considered alternatives to this ADR are also outlined in ### Requirements -Similar requirements are proposed here as for [ADR-082][adr-082]. +Similar requirements are proposed here as for [ADR-100][adr-100]. 1. A node _must_ support at most one data companion. @@ -330,8 +331,8 @@ interaction between a node and its data companion: - Paves the way for eventually reducing the surface area of a node's exposed APIs - Allows the data companion more leeway in reading the data it needs than the - approach in [ADR 082][adr-082] -- Simpler implementation than [ADR 082][adr-082] + approach in [ADR 100][adr-100] +- Simpler implementation than [ADR 100][adr-100] ### Negative @@ -347,11 +348,11 @@ interaction between a node and its data companion: ## References -- [ADR 082 - Data Companion Push API][adr-082] +- [ADR 100 - Data Companion Push API][adr-100] - [\#81 - rpc: Add gRPC support][\#81] - [`.htpasswd`][htpasswd] -[adr-082]: https://github.com/CometBFT/tendermint/pull/73 -[\#81]: https://github.com/CometBFT/tendermint/issues/81 +[adr-100]: https://github.com/cometbft/cometbft/pull/73 +[\#81]: https://github.com/cometbft/cometbft/issues/81 [htpasswd]: https://httpd.apache.org/docs/current/programs/htpasswd.html [abci-commit]: ../../spec/abci/abci++_methods.md#commit From 2e16ab6822c120480b13dcce1d1769b0c9e15ce3 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Tue, 4 Apr 2023 06:58:52 -0400 Subject: [PATCH 10/28] Apply suggestions from code review Co-authored-by: Adi Seredinschi Co-authored-by: Jasmina Malicevic --- .../adr-101-data-companion-pull-api.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index a4b102d649e..791dfe0df7d 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -14,10 +14,10 @@ Accepted | Rejected | Deprecated | Superseded by Following from the discussion around the development of [ADR 100][adr-100], an alternative model is proposed here for offloading certain data from nodes to a "data companion". This alternative model inverts the control of the data -offloading process, when compared to ADR 082, from the node to the data +offloading process, when compared to ADR 100, from the node to the data companion. -Overall, this approach provides slightly weaker guarantees than that of ADR 082, +Overall, this approach provides slightly weaker guarantees than that of ADR 100, but represents a simpler model to implement. ## Alternative Approaches @@ -43,7 +43,7 @@ Similar requirements are proposed here as for [ADR-100][adr-100]. 2. `FinalizeBlockResponse` data, but only for committed blocks 3. The companion _must_ be able to establish the earliest height for which the - node has all of the requisite data. + node has all of the associated data. 4. The API _must_ be (or be able to be) appropriately shielded from untrusted consumers and abuse. Critical control facets of the API (e.g. those that @@ -52,7 +52,7 @@ Similar requirements are proposed here as for [ADR-100][adr-100]. the public internet unprotected. 5. The node _must_ know, by way of signals from the companion, which heights' - associated data are safe to automatically prune. + associated data are safe to prune. 6. The companion _must_ be able to handle the possibility that a node might start from a non-zero height (i.e. that the node may state sync from a @@ -65,7 +65,7 @@ Similar requirements are proposed here as for [ADR-100][adr-100]. 8. The API _must_ be opt-in. When off or not in use, it _should_ have no impact on system performance. -9. It _must_ not cause back-pressure into consensus. +9. The API _must_ not cause back-pressure into consensus. 10. It _must_ not cause unbounded memory growth. @@ -79,7 +79,7 @@ Similar requirements are proposed here as for [ADR-100][adr-100]. ### Entity Relationships -The following model shows the proposed relationships between Tendermint, a +The following model shows the proposed relationships between CometBFT, a socket-based ABCI application, and the proposed data companion service. ``` @@ -98,8 +98,8 @@ There are two "modes" of pruning that need to be supported by the node: 1. **Data companion disabled**: Here, the node prunes as per normal based on the `retain_height` parameter supplied by the application via the [`Commit`][abci-commit] ABCI response. -2. **Data companion enabled**: Here, the node prunes _blocks_ based on the lower - of the retain heights specified by the application and the data companion, +2. **Data companion enabled**: Here, the node prunes _blocks_ based on the minimum + among the retain heights specified by the application and the data companion, and the node prunes _block results_ based on the retain height set by the data companion. @@ -110,7 +110,7 @@ then the node _must_ fail to start. ### gRPC API -At the time of this writing, it is proposed that Tendermint implement a full +At the time of this writing, it is proposed that CometBFT implement a full gRPC interface ([\#81]). As such, we have several options when it comes to implementing the data companion pull API: @@ -129,7 +129,7 @@ Due to the poorer operator experience in option 2, it would be preferable to implement option 1, but have certain endpoints be [access-controlled](#access-control) by default. -With this in mind, the following gRPC API is proposed, where the Tendermint node +With this in mind, the following gRPC API is proposed, where the CometBFT node will implement these services. #### Block Service From f012661ad36d0b1b69e7c6182ae9554eab936a5a Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Tue, 4 Apr 2023 15:27:27 -0400 Subject: [PATCH 11/28] Replace Tendermint references with CometBFT Signed-off-by: Thane Thomson --- .../architecture/adr-101-data-companion-pull-api.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 791dfe0df7d..0c66d8d68da 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -2,6 +2,7 @@ ## Changelog +- 2023-04-04: Update based on review feedback (@thanethomson) - 2023-02-28: Renumber from 084 to 101 (@thanethomson) - 2022-12-18: First draft (@thanethomson) @@ -75,7 +76,7 @@ Similar requirements are proposed here as for [ADR-100][adr-100]. assist in dealing with possible failure modes. 13. The solution _should_ be able to be backported to older versions of - Tendermint (e.g. v0.34). + CometBFT (e.g. v0.34). ### Entity Relationships @@ -84,12 +85,12 @@ socket-based ABCI application, and the proposed data companion service. ``` +----------+ +------------+ +----------------+ - | ABCI App | <--- | Tendermint | <--- | Data Companion | + | ABCI App | <--- | CometBFT | <--- | Data Companion | +----------+ +------------+ +----------------+ ``` -In this diagram, it is evident that Tendermint connects out to the ABCI -application, and the companion connects to the Tendermint node. +In this diagram, it is evident that CometBFT connects out to the ABCI +application, and the companion connects to the CometBFT node. ### Pruning Behaviour @@ -214,7 +215,7 @@ syntax = "proto3"; package tendermint.data_companion_service.v1; // DataCompanionService provides privileged access to specialized pruning -// functionality on the Tendermint node to help optimize node storage. +// functionality on the CometBFT node to help optimize node storage. service DataCompanionService { // SetRetainHeight notifies the node of the minimum height whose data must // be retained by the node. This data includes block data and block @@ -261,7 +262,7 @@ implement some form of access control for sensitive, data companion-specific APIs. At least **basic HTTP authentication** should be implemented for these endpoints, where credentials should be obtained from an `.htpasswd` file, using the same format as [Apache `.htpasswd` files][htpasswd], whose location is set -in the Tendermint configuration file. +in the CometBFT configuration file. `.htpasswd` files are relatively standard and are supported by many web servers. The format is relatively straightforward to parse and interpret. From 5f4c02a5d527114eafb6ee9e84a751ba83c87431 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Tue, 4 Apr 2023 19:00:12 -0400 Subject: [PATCH 12/28] Compress context, alternatives and requirements Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 89 +++++++------------ 1 file changed, 30 insertions(+), 59 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 0c66d8d68da..c5e9b823d8a 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -12,19 +12,12 @@ Accepted | Rejected | Deprecated | Superseded by ## Context -Following from the discussion around the development of [ADR 100][adr-100], an -alternative model is proposed here for offloading certain data from nodes to a -"data companion". This alternative model inverts the control of the data -offloading process, when compared to ADR 100, from the node to the data -companion. - -Overall, this approach provides slightly weaker guarantees than that of ADR 100, -but represents a simpler model to implement. +See the [context for ADR-100][adr-100-context]. ## Alternative Approaches -Other considered alternatives to this ADR are also outlined in -[ADR-100][adr-100]. +[ADR-100][adr-100], as well as the [alternatives][adr-100-alt] outlined in +ADR-100, are all alternative approaches. ## Decision @@ -32,65 +25,37 @@ Other considered alternatives to this ADR are also outlined in ## Detailed Design -### Requirements - -Similar requirements are proposed here as for [ADR-100][adr-100]. - -1. A node _must_ support at most one data companion. - -2. All or part of the following data _must_ be obtainable by the companion, and - as close to real-time as possible: - 1. Committed block data - 2. `FinalizeBlockResponse` data, but only for committed blocks - -3. The companion _must_ be able to establish the earliest height for which the - node has all of the associated data. - -4. The API _must_ be (or be able to be) appropriately shielded from untrusted - consumers and abuse. Critical control facets of the API (e.g. those that - influence the node's pruning mechanisms) _must_ be implemented in such a way - as to eliminate the possibility of accidentally exposing those endpoints to - the public internet unprotected. - -5. The node _must_ know, by way of signals from the companion, which heights' - associated data are safe to prune. +The model proposed in this ADR inverts that proposed in ADR-100, with the node +being the server and the data companion being the client. Here, the companion +"pulls" data from the node. -6. The companion _must_ be able to handle the possibility that a node might - start from a non-zero height (i.e. that the node may state sync from a - specific height beyond genesis). +This provides much weaker data delivery guarantees than the "push" model of +ADR-100. In this "pull" model, the companion can lag behind consensus, but the +node does not crash if the companion is unavailable. -7. The companion _must_ be able to handle the possibility that a node may - disable and then later re-enable the companion interface, potentially causing - the companion to have missing data in between those two heights. - -8. The API _must_ be opt-in. When off or not in use, it _should_ have no impact - on system performance. - -9. The API _must_ not cause back-pressure into consensus. - -10. It _must_ not cause unbounded memory growth. - -11. It _must_ provide one or more ways for operators to control storage growth. - -12. It _must_ provide insight to operators (e.g. by way of logs/metrics) to - assist in dealing with possible failure modes. +### Requirements -13. The solution _should_ be able to be backported to older versions of - CometBFT (e.g. v0.34). +The requirements for ADR-101 are the same as the [requirements for +ADR-100][adr-100-req]. ### Entity Relationships The following model shows the proposed relationships between CometBFT, a socket-based ABCI application, and the proposed data companion service. -``` - +----------+ +------------+ +----------------+ - | ABCI App | <--- | CometBFT | <--- | Data Companion | - +----------+ +------------+ +----------------+ +```mermaid +flowchart RL + comet[CometBFT] + companion[Data Companion] + app[ABCI App] + + comet --> app + companion --> comet ``` -In this diagram, it is evident that CometBFT connects out to the ABCI -application, and the companion connects to the CometBFT node. +In this diagram, it is evident that CometBFT (as a client) connects out to the +ABCI application (a server), and the companion (a client) connects to the +CometBFT node (a server). ### Pruning Behaviour @@ -353,7 +318,13 @@ interaction between a node and its data companion: - [\#81 - rpc: Add gRPC support][\#81] - [`.htpasswd`][htpasswd] -[adr-100]: https://github.com/cometbft/cometbft/pull/73 + +[adr-100-context]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#context +[adr-100]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md +[adr-100-req]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#requirements +[adr-100-alt]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#alternative-approaches [\#81]: https://github.com/cometbft/cometbft/issues/81 [htpasswd]: https://httpd.apache.org/docs/current/programs/htpasswd.html [abci-commit]: ../../spec/abci/abci++_methods.md#commit From 617c270043dfdd37332914e1ca51f57f2a90b921 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Tue, 4 Apr 2023 20:00:25 -0400 Subject: [PATCH 13/28] Address majority of remaining feedback Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 128 ++++++++++++------ 1 file changed, 84 insertions(+), 44 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index c5e9b823d8a..43c06b820db 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -80,13 +80,13 @@ At the time of this writing, it is proposed that CometBFT implement a full gRPC interface ([\#81]). As such, we have several options when it comes to implementing the data companion pull API: -1. Extend the existing RPC API to simply provide the additional data - companion-specific endpoints. In order to meet the +1. Extend the proposed gRPC API from [\#81] to simply provide the additional + data companion-specific endpoints. In order to meet the [requirements](#requirements), however, some of the endpoints will have to be protected by default. This is simpler for clients to interact with though, because they only need to interact with a single endpoint for all of their needs. -2. Implement a separate RPC API on a different port to the standard gRPC +2. Implement a separate gRPC API on a different port to the standard gRPC interface. This allows for a clearer distinction between the standard and data companion-specific gRPC interfaces, but complicates the server and client interaction models. @@ -100,9 +100,8 @@ will implement these services. #### Block Service -When implementing gRPC support for a node, this service (or at least a slight -variation of it) would be applicable anyways, regardless of whether this ADR is -implemented. +The following `BlockService` will be implemented as part of [\#81], regardless +of whether or not this ADR is implemented. ```protobuf syntax = "proto3"; @@ -115,9 +114,11 @@ import "tendermint/types/block.proto"; // BlockService provides information about blocks. service BlockService { - // GetLatestBlockID returns a stream of the latest block IDs as they are - // committed by the network. - rpc GetLatestBlockID(GetLatestBlockIDRequest) returns (stream GetLatestBlockIDResponse) {} + // GetLatestHeight returns a stream of the latest block heights committed by + // the network. This is a long-lived stream that is only terminated by the + // server if an error occurs. The companion is expected to handle such + // disconnections and automatically reconnect. + rpc GetLatestHeight(GetLatestHeightRequest) returns (stream GetLatestHeightResponse) {} // GetBlock attempts to retrieve the block at a particular height. rpc GetBlock(GetBlockRequest) returns (GetBlockResponse) {} @@ -127,19 +128,17 @@ service BlockService { rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) {} } -message GetLatestBlockIDRequest {} +message GetLatestHeightRequest {} -// GetLatestBlockIDResponse is a lightweight reference to the latest committed -// block. -message GetLatestBlockIDResponse { - // The height of the latest committed block. +// GetLatestHeightResponse provides the height of the latest committed block. +message GetLatestHeightResponse { + // The height of the latest committed block. Will be 0 if no data has been + // committed yet. int64 height = 1; - // The ID of the latest committed block. - tendermint.types.BlockID block_id = 2; } message GetBlockRequest { - // The height of the block to get. + // The height of the block to get. Set to 0 to return the latest block. int64 height = 1; } @@ -149,31 +148,52 @@ message GetBlockResponse { } message GetBlockResultsRequest { - // The height of the block results to get. + // The height of the block results to get. Set to 0 to return the latest + // block. int64 height = 1; } message GetBlockResultsResponse { - // All events produced by the ABCI BeginBlock call for the block. - repeated tendermint.abci.Event begin_block_events = 1; + // The height of the block whose results are provided in the remaining + // fields. + int64 height = 1; + + // All events produced by the ABCI FinalizeBlock call. + repeated tendermint.abci.Event events = 2; - // All transaction results produced by block execution. - repeated tendermint.abci.ExecTxResult tx_results = 2; + // All transaction results (and their associated events) produced by block + // execution. + repeated tendermint.abci.ExecTxResult tx_results = 3; // Validator updates during block execution. - repeated tendermint.abci.ValidatorUpdate validator_updates = 3; + repeated tendermint.abci.ValidatorUpdate validator_updates = 4; // Consensus parameter updates during block execution. - tendermint.types.ConsensusParams consensus_param_updates = 4; + tendermint.types.ConsensusParams consensus_param_updates = 5; - // All events produced by the ABCI EndBlock call. - // NB: This should be called finalize_block_events when ABCI 2.0 lands. - repeated tendermint.abci.Event end_block_events = 5; + // The app hash returned by the application after execution of all of the + // transactions in the block. + bytes app_hash = 6; } ``` +There are several reasons as to why there are two separate gRPC calls for this +data as opposed to just one: + +1. The quantity of data stored by each is application-dependent, and coalescing + the two types of data could impose significant overhead in some cases. +2. The existing JSON-RPC API distinguishes between endpoints providing these two + types of data (`/block` and `/block_results`), and the `BlockService` should + simply provide a gRPC alternative to these two endpoints. +3. Eventually, when we no longer need to store block results at all, we can + simply deprecate the `GetBlockResults` API call without affecting clients who + rely on `GetBlock`. + #### Data Companion Service +This gRPC service is the only novel service proposed for the data companion API, +and effectively gives the data companion a say in how the node prunes its data. + ```protobuf syntax = "proto3"; @@ -224,16 +244,23 @@ be applied. As covered in the [gRPC API section](#grpc-api), it would be preferable to implement some form of access control for sensitive, data companion-specific -APIs. At least **basic HTTP authentication** should be implemented for these -endpoints, where credentials should be obtained from an `.htpasswd` file, using -the same format as [Apache `.htpasswd` files][htpasswd], whose location is set -in the CometBFT configuration file. +APIs. Two main possibilities are considered here for access control: + +1. Only allow the `DataCompanionService` to be exposed via a gRPC endpoint bound + to `localhost`. This, however, only caters for deployments run on VMs, and + not for those running in Docker containers. In general, in a containerized + environment, it would be preferable to have the companion run in a separate + container to the node, such that it fails and recovers independently of the + node. -`.htpasswd` files are relatively standard and are supported by many web servers. -The format is relatively straightforward to parse and interpret. +2. Implement **basic HTTP authentication** on the `DataCompanionService`, where + credentials should be obtained from an `.htpasswd` file, using the same + format as [Apache `.htpasswd` files][htpasswd], whose location is set in the + CometBFT configuration file. -We should strongly consider only supporting the bcrypt encryption option for -passwords stored in `.htpasswd` files. + `.htpasswd` files are relatively standard and are supported by many web + servers. bcrypt encryption for passwords stored in `.htpasswd` files is + strongly recommended here. ### Configuration @@ -255,12 +282,6 @@ companion API. # Is the data companion gRPC API enabled at all? Default: false enabled = false -# The maximum number of blocks to allow the data companion's retain height to -# lag behind the current height. The node will block all other operations until -# the companion's retain height is less than `max_retain_lag` blocks behind the -# current height. Default: 100 -max_retain_lag = 100 - # Authentication configuration for the data companion. [data_companion.authentication] @@ -284,8 +305,27 @@ The following metrics are proposed to be added to monitor the health of the interaction between a node and its data companion: - `data_companion_retain_height` - The current retain height as requested by the - data companion. This can give operators insight into whether the companion is - lagging significantly behind the current network height. + data companion. This can give operators insight into how the data companion is + affecting pruning. +- `data_companion_block_response_height` - The highest height for block response + data sent to the companion since node startup. This can give operators insight + into whether the data companion is significantly lagging behind the node. +- `data_companion_block_results_response_height` - The highest height for block + result response data sent to the companion since node startup. This can give + operators insight into whether the data companion is significantly lagging + behind the node. +- `data_companion_block_response_count` - The total number of block responses + sent to the data companion since node startup. +- `data_companion_block_result_response_count` - The total number of block + results responses sent to the data companion since node startup. +- `data_companion_block_response_bytes` - The total number of bytes sent to the + data companion since startup for block response data. This can potentially + alert operators to bottlenecks in communication between the node and the data + companion. +- `data_companion_block_results_response_bytes` - The total number of bytes sent + to the data companion since startup for block results response data. This can + potentially alert operators to bottlenecks in communication between the node + and the data companion. ## Consequences @@ -305,7 +345,7 @@ interaction between a node and its data companion: - Increases system complexity slightly in the short-term - If data companions are not correctly implemented and deployed (e.g. if a companion is attached to the same storage as the node, and/or if its retain - height signaling is poorly handled), this could result in substantially + height signalling is poorly handled), this could result in substantially increased storage usage ### Neutral From e4adadeb6fbf831386745a650c6b43625114e6d6 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Sat, 6 May 2023 06:16:48 -0400 Subject: [PATCH 14/28] Update ADR from review feedback Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 340 +++++++++--------- 1 file changed, 177 insertions(+), 163 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 43c06b820db..9b9a76ead24 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -2,6 +2,7 @@ ## Changelog +- 2023-05-03: Update based on synchronous feedback from team (@thanethomson) - 2023-04-04: Update based on review feedback (@thanethomson) - 2023-02-28: Renumber from 084 to 101 (@thanethomson) - 2022-12-18: First draft (@thanethomson) @@ -14,6 +15,17 @@ Accepted | Rejected | Deprecated | Superseded by See the [context for ADR-100][adr-100-context]. +The primary novelty introduced in this ADR is effectively just a new gRPC API +that allows an external application to influence which data the node prunes. +Otherwise, existing and planned RPC interfaces (such as the planned gRPC +interface) in addition to this pruning API should, in theory, deliver the same +kind of value as the solution proposed in ADR-100. + +Even though the pruning API could be useful to operators outside the context of +the usage of a data companion (e.g. it could provide operators with more control +over node pruning behaviour), it is presented as part of the data companion +discussion to illustrate its initial intended use case. + ## Alternative Approaches [ADR-100][adr-100], as well as the [alternatives][adr-100-alt] outlined in @@ -21,7 +33,7 @@ ADR-100, are all alternative approaches. ## Decision -> This section records the decision that was made. +To implement ADR-101 instead of ADR-100. ## Detailed Design @@ -59,25 +71,35 @@ CometBFT node (a server). ### Pruning Behaviour -There are two "modes" of pruning that need to be supported by the node: +Two parameters are proposed as a necessary part of the pruning API: + +- **Pruning service block retain height**, which influences the height to which + the node will retain blocks. This is different to the **application block + retain height**, which is set by the application in its response to each ABCI + `commit` message. + + The node will prune blocks to whichever is lower between the pruning service + and application block retain heights. + + The default value is 0, which indicates that only the application block retain + height must be taken into consideration. + +- **Pruning service block results retain height**, which influences the height + to which the node will retain block results. -1. **Data companion disabled**: Here, the node prunes as per normal based on the - `retain_height` parameter supplied by the application via the - [`Commit`][abci-commit] ABCI response. -2. **Data companion enabled**: Here, the node prunes _blocks_ based on the minimum - among the retain heights specified by the application and the data companion, - and the node prunes _block results_ based on the retain height set by the - data companion. + The default value is 0, which indicates that all block results will be + retained unless the `storage.discard_abci_responses` parameter is enabled, in + which case no block results will be stored except those that are necessary to + facilitate consensus. -Enabling the `discard_abci_responses` flag under the `[storage]` section in the -configuration is incompatible with enabling a data companion. If -`storage.discard_abci_responses` and `data_companion.enabled` are both `true`, -then the node _must_ fail to start. +These parameters need to be durable (i.e. stored on disk). Setting either of +these two values to 0 after they were previously non-zero will effectively +disable that specific facet of the pruning behaviour. ### gRPC API -At the time of this writing, it is proposed that CometBFT implement a full -gRPC interface ([\#81]). As such, we have several options when it comes to +At the time of this writing, it is proposed that CometBFT implement a full gRPC +interface ([\#81]). As such, we have several options when it comes to implementing the data companion pull API: 1. Extend the proposed gRPC API from [\#81] to simply provide the additional @@ -91,22 +113,22 @@ implementing the data companion pull API: data companion-specific gRPC interfaces, but complicates the server and client interaction models. -Due to the poorer operator experience in option 2, it would be preferable to -implement option 1, but have certain endpoints be -[access-controlled](#access-control) by default. - -With this in mind, the following gRPC API is proposed, where the CometBFT node -will implement these services. +Due to past experience of operators exposing _all_ RPC endpoints on a specific +port to the public internet, option 2 will be chosen here to minimize the +chances of this happening in future, even though it offers a slightly more +complicated experience for operators. #### Block Service The following `BlockService` will be implemented as part of [\#81], regardless -of whether or not this ADR is implemented. +of whether or not this ADR is implemented. This API, therefore, needs to be more +generally useful than just for the purposes of the data companion. The minimal +API to support a data companion, however, is presented in this ADR. ```protobuf syntax = "proto3"; -package tendermint.block_service.v1; +package tendermint.services.block.v1; import "tendermint/abci/types.proto"; import "tendermint/types/types.proto"; @@ -116,16 +138,12 @@ import "tendermint/types/block.proto"; service BlockService { // GetLatestHeight returns a stream of the latest block heights committed by // the network. This is a long-lived stream that is only terminated by the - // server if an error occurs. The companion is expected to handle such + // server if an error occurs. The caller is expected to handle such // disconnections and automatically reconnect. rpc GetLatestHeight(GetLatestHeightRequest) returns (stream GetLatestHeightResponse) {} - // GetBlock attempts to retrieve the block at a particular height. - rpc GetBlock(GetBlockRequest) returns (GetBlockResponse) {} - - // GetBlockResults attempts to retrieve the results of block execution for a - // particular height. - rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) {} + // GetBlockByHeight attempts to retrieve the block at a particular height. + rpc GetBlockByHeight(GetBlockByHeightRequest) returns (GetBlockByHeightResponse) {} } message GetLatestHeightRequest {} @@ -134,133 +152,140 @@ message GetLatestHeightRequest {} message GetLatestHeightResponse { // The height of the latest committed block. Will be 0 if no data has been // committed yet. - int64 height = 1; + uint64 height = 1; } -message GetBlockRequest { +message GetBlockByHeightRequest { // The height of the block to get. Set to 0 to return the latest block. - int64 height = 1; + uint64 height = 1; } -message GetBlockResponse { +message GetBlockByHeightResponse { // Block data for the requested height. tendermint.types.Block block = 1; } +``` -message GetBlockResultsRequest { - // The height of the block results to get. Set to 0 to return the latest - // block. - int64 height = 1; -} +#### Block Results Service -message GetBlockResultsResponse { - // The height of the block whose results are provided in the remaining - // fields. - int64 height = 1; +The following `BlockResultsService` service is proposed _separately_ to the +`BlockService`. There are several reasons as to why there are two separate gRPC +services to meet the companion's needs as opposed to just one: - // All events produced by the ABCI FinalizeBlock call. - repeated tendermint.abci.Event events = 2; +1. The quantity of data stored by each is application-dependent, and coalescing + the two types of data could impose significant overhead in some cases (this + is primarily a justification for having separate RPC calls for each type of + data). +2. Operators can enable/disable these services independently of one another. +3. The existing JSON-RPC API distinguishes between endpoints providing these two + types of data (`/block` and `/block_results`), so users are already + accustomed to this distinction. +4. Eventually, when we no longer need to store block results at all, we can + simply deprecate the `BlockResultsService` without affecting clients who rely + on `BlockService`. - // All transaction results (and their associated events) produced by block - // execution. - repeated tendermint.abci.ExecTxResult tx_results = 3; +```protobuf +syntax = "proto3"; - // Validator updates during block execution. - repeated tendermint.abci.ValidatorUpdate validator_updates = 4; +package tendermint.services.block_results.v1; - // Consensus parameter updates during block execution. - tendermint.types.ConsensusParams consensus_param_updates = 5; +// BlockResultsService provides information about the execution results for +// specific blocks. +service BlockResultsService { + // GetBlockResults attempts to retrieve the execution results associated + // with a block of a certain height. + rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) +} - // The app hash returned by the application after execution of all of the - // transactions in the block. - bytes app_hash = 6; +message GetBlockResultsRequest { + // The height of the block whose results are to be retrieved. Set to 0 to + // return the latest block's results. + uint64 height = 1; } -``` -There are several reasons as to why there are two separate gRPC calls for this -data as opposed to just one: +message GetBlockResultsResponse { + // The height associated with the block results. + uint64 height = 1; -1. The quantity of data stored by each is application-dependent, and coalescing - the two types of data could impose significant overhead in some cases. -2. The existing JSON-RPC API distinguishes between endpoints providing these two - types of data (`/block` and `/block_results`), and the `BlockService` should - simply provide a gRPC alternative to these two endpoints. -3. Eventually, when we no longer need to store block results at all, we can - simply deprecate the `GetBlockResults` API call without affecting clients who - rely on `GetBlock`. + // The contents of the FinalizeBlock response, which contain block execution + // results. + tendermint.abci.ResponseFinalizeBlock finalize_block_response = 2; +} +``` -#### Data Companion Service +#### Pruning Service -This gRPC service is the only novel service proposed for the data companion API, -and effectively gives the data companion a say in how the node prunes its data. +This gRPC service is the only novel service proposed in this ADR, and +effectively gives a single external caller (e.g. a data companion) a say in how +the node prunes its data. ```protobuf syntax = "proto3"; -package tendermint.data_companion_service.v1; +package tendermint.services.pruning.v1; -// DataCompanionService provides privileged access to specialized pruning -// functionality on the CometBFT node to help optimize node storage. -service DataCompanionService { - // SetRetainHeight notifies the node of the minimum height whose data must - // be retained by the node. This data includes block data and block - // execution results. - // - // Setting a retain height lower than a previous setting will result in an - // error. +// PruningService provides privileged access to specialized pruning +// functionality on the CometBFT node to help control node storage. +service PruningService { + // SetBlockRetainHeightRequest indicates to the node that it can safely + // prune all block data up to the specified retain height. // // The lower of this retain height and that set by the application in its // Commit response will be used by the node to determine which heights' data // can be pruned. - rpc SetRetainHeight(SetRetainHeightRequest) returns (SetRetainHeightResponse) {} + rpc SetBlockRetainHeight(SetBlockRetainHeightRequest) returns (SetBlockRetainHeightResponse) - // GetRetainHeight returns the retain height set by the companion and that - // set by the application. This can give the companion an indication as to - // which heights' data are currently available. - rpc GetRetainHeight(GetRetainHeightRequest) returns (GetRetainHeightResponse) {} + // GetBlockRetainHeight returns information about the retain height + // parameters used by the node to influence block retention/pruning. + rpc GetBlockRetainHeight(GetBlockRetainHeightRequest) returns (GetBlockRetainHeightResponse) + + // SetBlockResultsRetainHeightRequest indicates to the node that it can + // safely prune all block results data up to the specified height. + // + // The node will always store the block results for the latest height to + // help facilitate crash recovery. + rpc SetBlockResultsRetainHeight(SetBlockResultsRetainHeightRequest) returns (SetBlockResultsRetainHeightResponse) + + // GetBlockResultsRetainHeight returns information about the retain height + // parameters used by the node to influence block results retention/pruning. + rpc GetBlockResultsRetainHeight(GetBlockResultsRetainHeightRequest) returns (GetBlockResultsRetainHeightResponse) } -message SetRetainHeightRequest { - int64 height = 1; +message SetBlockRetainHeightRequest { + uint64 height = 1; } -message SetRetainHeightResponse {} +message SetBlockRetainHeightResponse {} -message GetRetainHeightRequest {} +message GetBlockRetainHeightRequest {} -message GetRetainHeightResponse { - // The retain height as set by the data companion. - int64 data_companion_retain_height = 1; - // The retain height as set by the ABCI application. - int64 app_retain_height = 2; -} -``` +message GetBlockRetainHeightResponse { + // The retain height set by the application. + uint64 app_retain_height = 1; -With this API design, it is technically possible for an integrator to attach -multiple data companions to the node, but only one of their retain heights will -be applied. + // The retain height set via the pruning service (e.g. by the data + // companion) specifically for blocks. + uint64 pruning_service_retain_height = 2; +} -### Access Control +message SetBlockResultsRetainHeightRequest { + uint64 height = 1; +} -As covered in the [gRPC API section](#grpc-api), it would be preferable to -implement some form of access control for sensitive, data companion-specific -APIs. Two main possibilities are considered here for access control: +message SetBlockResultsRetainHeightResponse {} -1. Only allow the `DataCompanionService` to be exposed via a gRPC endpoint bound - to `localhost`. This, however, only caters for deployments run on VMs, and - not for those running in Docker containers. In general, in a containerized - environment, it would be preferable to have the companion run in a separate - container to the node, such that it fails and recovers independently of the - node. +message GetBlockResultsRetainHeightRequest {} -2. Implement **basic HTTP authentication** on the `DataCompanionService`, where - credentials should be obtained from an `.htpasswd` file, using the same - format as [Apache `.htpasswd` files][htpasswd], whose location is set in the - CometBFT configuration file. +message GetBlockResultsRetainHeightResponse { + // The retain height set by the pruning service (e.g. by the data + // companion) specifically for block results. + uint64 pruning_service_retain_height = 1; +} +``` - `.htpasswd` files are relatively standard and are supported by many web - servers. bcrypt encryption for passwords stored in `.htpasswd` files is - strongly recommended here. +With this API design, it is technically possible for an integrator to attach +multiple data companions to the node, but only one of their retain heights will +be considered by the node. ### Configuration @@ -268,35 +293,40 @@ The following configuration file update is proposed to support the data companion API. ```toml -# A data companion, if enabled, is intended to offload the storage of certain -# types of data from the node. Specifically: -# 1. Block data, including transactions. -# 2. Block results, including events, transaction results, validator and -# consensus parameter updates. # -# A data companion can influence the pruning height of the node, and therefore -# the data companion gRPC service is considered to be a sensitive endpoint that -# is password-protected by default. -[data_companion] +# This is the envisaged configuration section for the gRPC API that will be +# introduced as part of https://github.com/cometbft/cometbft/issues/81 +# (Still a WIP) +# +[grpc] -# Is the data companion gRPC API enabled at all? Default: false -enabled = false +# The host/port on which to expose non-sensitive gRPC endpoints. +laddr = "tcp://localhost:26654" -# Authentication configuration for the data companion. -[data_companion.authentication] +# +# Configuration for sensitive gRPC endpoints, which should **never** be exposed +# to the public internet. +# +[grpc.sensitive] +# The host/port on which to expose sensitive gRPC endpoints. +laddr = "tcp://localhost:26655" -# The authentication method to use. At present, the only supported method is -# "basic" (i.e. basic HTTP authentication). -method = "basic" +# +# Configuration specifically for the gRPC pruning service, which is considered a +# sensitive service. +# +[grpc.sensitive.pruning_service] -# Path to the file containing basic authentication credentials to access the -# data companion service. If the data companion is enabled and this password -# file is not supplied, or the file does not exist, or the file is of an invalid -# format, the node will fail to start. +# Only controls whether the pruning service is accessible via the gRPC API - not +# whether a previously set pruning service retain height is honoured by the +# node. +# +# To disable the influence of previously set pruning service retain height(s) on +# node pruning, this endpoint should be enabled and the relevant pruning service +# retain heights should be set to 0. # -# See https://httpd.apache.org/docs/current/programs/htpasswd.html for details -# on how to create/configure this file. -password_file = "/path/to/.htpasswd" +# Disabled by default. +enabled = false ``` ### Metrics @@ -304,28 +334,13 @@ password_file = "/path/to/.htpasswd" The following metrics are proposed to be added to monitor the health of the interaction between a node and its data companion: -- `data_companion_retain_height` - The current retain height as requested by the - data companion. This can give operators insight into how the data companion is - affecting pruning. -- `data_companion_block_response_height` - The highest height for block response - data sent to the companion since node startup. This can give operators insight - into whether the data companion is significantly lagging behind the node. -- `data_companion_block_results_response_height` - The highest height for block - result response data sent to the companion since node startup. This can give - operators insight into whether the data companion is significantly lagging - behind the node. -- `data_companion_block_response_count` - The total number of block responses - sent to the data companion since node startup. -- `data_companion_block_result_response_count` - The total number of block - results responses sent to the data companion since node startup. -- `data_companion_block_response_bytes` - The total number of bytes sent to the - data companion since startup for block response data. This can potentially - alert operators to bottlenecks in communication between the node and the data - companion. -- `data_companion_block_results_response_bytes` - The total number of bytes sent - to the data companion since startup for block results response data. This can - potentially alert operators to bottlenecks in communication between the node - and the data companion. +- `grpc_pruning_service_retain_height` - The current retain height as requested + by the pruning service. This can give operators insight into how the data + companion is affecting pruning. + +Other metrics may be proposed as part of the non-sensitive gRPC API that could +assist operators in understanding the health of the interaction with the data +companion, but only if the data companion is the exclusive user of those APIs. ## Consequences @@ -338,7 +353,8 @@ interaction between a node and its data companion: APIs - Allows the data companion more leeway in reading the data it needs than the approach in [ADR 100][adr-100] -- Simpler implementation than [ADR 100][adr-100] +- Simpler implementation and fewer changes within the node than [ADR + 100][adr-100] ### Negative @@ -356,7 +372,6 @@ interaction between a node and its data companion: - [ADR 100 - Data Companion Push API][adr-100] - [\#81 - rpc: Add gRPC support][\#81] -- [`.htpasswd`][htpasswd] -[adr-100-context]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#context -[adr-100]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md -[adr-100-req]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#requirements -[adr-100-alt]: https://github.com/cometbft/cometbft/blob/thane/adr-082-data-companion-api/docs/architecture/adr-100-data-companion-push-api.md#alternative-approaches +[adr-100-context]: ./adr-100-data-companion-push-api.md#context +[adr-100]: ./adr-100-data-companion-push-api.md +[adr-100-req]: ./adr-100-data-companion-push-api.md#requirements +[adr-100-alt]: ./adr-100-data-companion-push-api.md#alternative-approaches [\#81]: https://github.com/cometbft/cometbft/issues/81 [abci-commit]: ../../spec/abci/abci++_methods.md#commit From 7b126b402d8ede05d691c39b1da10ee164de81dd Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Wed, 19 Jul 2023 11:57:38 -0400 Subject: [PATCH 16/28] Mark ADR as accepted Signed-off-by: Thane Thomson --- docs/architecture/README.md | 2 +- docs/architecture/adr-101-data-companion-pull-api.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/architecture/README.md b/docs/architecture/README.md index 4c0f2d68400..fc163208fd4 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -42,12 +42,12 @@ numbering our ADRs from 100 onwards. ### Proposed -- [ADR-101: Data companion pull API](./adr-101-data-companion-pull-api.md) - [ADR-103: Protobuf definition versioning](./adr-103-proto-versioning.md) - [ADR-105: Refactor list of senders in mempool](./adr-105-refactor-mempool-senders.md) ### Accepted +- [ADR-101: Data companion pull API](./adr-101-data-companion-pull-api.md) - [ADR-104: State sync from local snapshot](./adr-104-out-of-band-state-sync.md) - [ADR-107: Rename protobuf versions of 0.x releases to pre-v1 betas](./adr-107-betaize-proto-versions.md) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 4cad35835c4..221873afff5 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -9,7 +9,7 @@ ## Status -Accepted | Rejected | Deprecated | Superseded by +Accepted ## Context From 6b98271b0e78d0d671e0f05a172a9ea64d2a4149 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Wed, 19 Jul 2023 12:02:29 -0400 Subject: [PATCH 17/28] Simplify method and message names Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 22 ++++++++++--------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 221873afff5..4a8a6ba407a 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -142,8 +142,8 @@ service BlockService { // disconnections and automatically reconnect. rpc GetLatestHeight(GetLatestHeightRequest) returns (stream GetLatestHeightResponse) {} - // GetBlockByHeight attempts to retrieve the block at a particular height. - rpc GetBlockByHeight(GetBlockByHeightRequest) returns (GetBlockByHeightResponse) {} + // GetByHeight attempts to retrieve the block at a particular height. + rpc GetByHeight(GetByHeightRequest) returns (GetByHeightResponse) {} } message GetLatestHeightRequest {} @@ -155,14 +155,16 @@ message GetLatestHeightResponse { uint64 height = 1; } -message GetBlockByHeightRequest { +message GetByHeightRequest { // The height of the block to get. Set to 0 to return the latest block. uint64 height = 1; } -message GetBlockByHeightResponse { +message GetByHeightResponse { + // The ID associated with the relevant block. + tendermint.types.BlockID block_id = 1; // Block data for the requested height. - tendermint.types.Block block = 1; + tendermint.types.Block block = 2; } ``` @@ -192,18 +194,18 @@ package tendermint.services.block_results.v1; // BlockResultsService provides information about the execution results for // specific blocks. service BlockResultsService { - // GetBlockResults attempts to retrieve the execution results associated - // with a block of a certain height. - rpc GetBlockResults(GetBlockResultsRequest) returns (GetBlockResultsResponse) + // GetByHeight attempts to retrieve the execution results associated with a + // block of a certain height. + rpc GetByHeight(GetByHeightRequest) returns (GetByHeightResponse) } -message GetBlockResultsRequest { +message GetByHeightRequest { // The height of the block whose results are to be retrieved. Set to 0 to // return the latest block's results. uint64 height = 1; } -message GetBlockResultsResponse { +message GetByHeightResponse { // The height associated with the block results. uint64 height = 1; From 45a836406eb4d3ee1632958159d930a82cf8d07e Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 20 Jul 2023 06:53:17 -0400 Subject: [PATCH 18/28] Rename "sensitive" gRPC config section to "privileged" Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 4a8a6ba407a..d5c91fb3103 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -302,22 +302,22 @@ companion API. # [grpc] -# The host/port on which to expose non-sensitive gRPC endpoints. +# The host/port on which to expose non-privileged gRPC endpoints. laddr = "tcp://localhost:26654" # -# Configuration for sensitive gRPC endpoints, which should **never** be exposed +# Configuration for privileged gRPC endpoints, which should **never** be exposed # to the public internet. # -[grpc.sensitive] -# The host/port on which to expose sensitive gRPC endpoints. +[grpc.privileged] +# The host/port on which to expose privileged gRPC endpoints. laddr = "tcp://localhost:26655" # # Configuration specifically for the gRPC pruning service, which is considered a -# sensitive service. +# privileged service. # -[grpc.sensitive.pruning_service] +[grpc.privileged.pruning_service] # Only controls whether the pruning service is accessible via the gRPC API - not # whether a previously set pruning service retain height is honoured by the @@ -340,7 +340,7 @@ interaction between a node and its data companion: by the pruning service. This can give operators insight into how the data companion is affecting pruning. -Other metrics may be proposed as part of the non-sensitive gRPC API that could +Other metrics may be proposed as part of the non-privileged gRPC API that could assist operators in understanding the health of the interaction with the data companion, but only if the data companion is the exclusive user of those APIs. From 86a9e44aa7adfe725d01d4362c9a58370f020f7b Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 20 Jul 2023 06:54:59 -0400 Subject: [PATCH 19/28] Update metrics to reflect both block and block results independently Signed-off-by: Thane Thomson --- docs/architecture/adr-101-data-companion-pull-api.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index d5c91fb3103..17cb4266c22 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -336,9 +336,10 @@ enabled = false The following metrics are proposed to be added to monitor the health of the interaction between a node and its data companion: -- `grpc_pruning_service_retain_height` - The current retain height as requested - by the pruning service. This can give operators insight into how the data - companion is affecting pruning. +- `pruning_service_block_retain_height` - The current block retain height as + requested by the pruning service. +- `pruning_service_block_results_retain_height` - The current block results + retain height as requested by the pruning service. Other metrics may be proposed as part of the non-privileged gRPC API that could assist operators in understanding the health of the interaction with the data From 4b79adbb1c71ba91d7def43946dbc909c3c8d9dd Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 20 Jul 2023 07:14:03 -0400 Subject: [PATCH 20/28] Update pruning behaviour and configuration Signed-off-by: Thane Thomson --- .../adr-101-data-companion-pull-api.md | 39 +++++++++++++------ 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 17cb4266c22..78c4df3fc49 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -81,20 +81,10 @@ Two parameters are proposed as a necessary part of the pruning API: The node will prune blocks to whichever is lower between the pruning service and application block retain heights. - The default value is 0, which indicates that only the application block retain - height must be taken into consideration. - - **Pruning service block results retain height**, which influences the height to which the node will retain block results. - The default value is 0, which indicates that all block results will be - retained unless the `storage.discard_abci_responses` parameter is enabled, in - which case no block results will be stored except those that are necessary to - facilitate consensus. - -These parameters need to be durable (i.e. stored on disk). Setting either of -these two values to 0 after they were previously non-zero will effectively -disable that specific facet of the pruning behaviour. +These parameters need to be durable (i.e. stored on disk). ### gRPC API @@ -295,6 +285,33 @@ The following configuration file update is proposed to support the data companion API. ```toml +[storage] + +# +# Storage pruning configuration relating only to the data companion. +# +[storage.pruning.data_companion] + +# Whether automatic pruning respects values set by the data companion. Disabled +# by default. All other parameters in this section are ignored when this is +# disabled. +# +# If disabled, only the application retain height will influence block pruning +# (but not block results pruning). Only enabling this at a later stage will +# potentially mean that blocks below the application-set retain height at the +# time will not be available to the data companion. +enabled = false + +# The initial value for the data companion block retain height if the data +# companion has not yet explicitly set one. If the data companion has already +# set a block retain height, this is ignored. +initial_block_retain_height = 0 + +# The initial value for the data companion block results retain height if the +# data companion has not yet explicitly set one. If the data companion has +# already set a block results retain height, this is ignored. +initial_block_results_retain_height = 0 + # # This is the envisaged configuration section for the gRPC API that will be # introduced as part of https://github.com/cometbft/cometbft/issues/81 From ec791e26b64efb75f391caef2276e43d093b8f7f Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Thu, 20 Jul 2023 20:59:32 +0300 Subject: [PATCH 21/28] Fix syntax in ADR 101 Pruning service Add semicolons after each method definition. --- docs/architecture/adr-101-data-companion-pull-api.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 78c4df3fc49..de7f4b29a43 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -225,22 +225,22 @@ service PruningService { // The lower of this retain height and that set by the application in its // Commit response will be used by the node to determine which heights' data // can be pruned. - rpc SetBlockRetainHeight(SetBlockRetainHeightRequest) returns (SetBlockRetainHeightResponse) + rpc SetBlockRetainHeight(SetBlockRetainHeightRequest) returns (SetBlockRetainHeightResponse); // GetBlockRetainHeight returns information about the retain height // parameters used by the node to influence block retention/pruning. - rpc GetBlockRetainHeight(GetBlockRetainHeightRequest) returns (GetBlockRetainHeightResponse) + rpc GetBlockRetainHeight(GetBlockRetainHeightRequest) returns (GetBlockRetainHeightResponse); // SetBlockResultsRetainHeightRequest indicates to the node that it can // safely prune all block results data up to the specified height. // // The node will always store the block results for the latest height to // help facilitate crash recovery. - rpc SetBlockResultsRetainHeight(SetBlockResultsRetainHeightRequest) returns (SetBlockResultsRetainHeightResponse) + rpc SetBlockResultsRetainHeight(SetBlockResultsRetainHeightRequest) returns (SetBlockResultsRetainHeightResponse); // GetBlockResultsRetainHeight returns information about the retain height // parameters used by the node to influence block results retention/pruning. - rpc GetBlockResultsRetainHeight(GetBlockResultsRetainHeightRequest) returns (GetBlockResultsRetainHeightResponse) + rpc GetBlockResultsRetainHeight(GetBlockResultsRetainHeightRequest) returns (GetBlockResultsRetainHeightResponse); } message SetBlockRetainHeightRequest { From afc30a050a285c3e4319df34f794bc0bc424e65f Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 10 Aug 2023 06:36:25 -0400 Subject: [PATCH 22/28] Remove config comment that is no longer relevant Signed-off-by: Thane Thomson --- docs/architecture/adr-101-data-companion-pull-api.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index de7f4b29a43..766301ab0e8 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -338,11 +338,7 @@ laddr = "tcp://localhost:26655" # Only controls whether the pruning service is accessible via the gRPC API - not # whether a previously set pruning service retain height is honoured by the -# node. -# -# To disable the influence of previously set pruning service retain height(s) on -# node pruning, this endpoint should be enabled and the relevant pruning service -# retain heights should be set to 0. +# node. That is controlled by settings in the [storage.pruning] section. # # Disabled by default. enabled = false From 913bcc57eb264e44db50bf8078dc40f3dc937329 Mon Sep 17 00:00:00 2001 From: Thane Thomson Date: Thu, 10 Aug 2023 12:21:52 -0400 Subject: [PATCH 23/28] Expand metrics based on feedback Signed-off-by: Thane Thomson --- docs/architecture/adr-101-data-companion-pull-api.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 766301ab0e8..c1fbde554c0 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -353,6 +353,12 @@ interaction between a node and its data companion: requested by the pruning service. - `pruning_service_block_results_retain_height` - The current block results retain height as requested by the pruning service. +- `application_block_retain_height` - The current block retain height as set by + the application. +- `block_store_base_height` - The actual base height of the block store, which + is influenced by the application and pruning service block retain heights. +- `abci_results_base_height` - The actual base height of stored block results, + which is influenced by the pruning service block results retain height. Other metrics may be proposed as part of the non-privileged gRPC API that could assist operators in understanding the health of the interaction with the data From d2efb3bc1a00ffba4d2c2fa49ab1dc08ce09da4f Mon Sep 17 00:00:00 2001 From: Jasmina Malicevic Date: Thu, 28 Sep 2023 09:52:26 +0200 Subject: [PATCH 24/28] Apply suggestions from @cason Co-authored-by: Daniel --- docs/architecture/adr-101-data-companion-pull-api.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index c1fbde554c0..6b6e733cf55 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -13,9 +13,9 @@ Accepted ## Context -See the [context for ADR-100][adr-100-context]. +See the [context for the Data Companion Push API (ADR-100)][adr-100-context]. -The primary novelty introduced in this ADR is effectively just a new gRPC API +The primary novelty introduced in this ADR is effectively a new gRPC API that allows an external application to influence which data the node prunes. Otherwise, existing and planned RPC interfaces (such as the planned gRPC interface) in addition to this pruning API should, in theory, deliver the same @@ -76,7 +76,7 @@ Two parameters are proposed as a necessary part of the pruning API: - **Pruning service block retain height**, which influences the height to which the node will retain blocks. This is different to the **application block retain height**, which is set by the application in its response to each ABCI - `commit` message. + `Commit` message. The node will prune blocks to whichever is lower between the pruning service and application block retain heights. @@ -186,7 +186,7 @@ package tendermint.services.block_results.v1; service BlockResultsService { // GetByHeight attempts to retrieve the execution results associated with a // block of a certain height. - rpc GetByHeight(GetByHeightRequest) returns (GetByHeightResponse) + rpc GetByHeight(GetByHeightRequest) returns (GetByHeightResponse) {} } message GetByHeightRequest { From 7ede1d075adbabcb68203485c2f259ed70d94c04 Mon Sep 17 00:00:00 2001 From: Jasmina Malicevic Date: Mon, 2 Oct 2023 13:09:30 +0200 Subject: [PATCH 25/28] Added indexer prunning section --- .../adr-101-data-companion-pull-api.md | 69 ++++++++++++++++++- 1 file changed, 68 insertions(+), 1 deletion(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 6b6e733cf55..0a6c31cb30c 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -275,13 +275,72 @@ message GetBlockResultsRetainHeightResponse { } ``` +##### Indexer pruning service + +This gRPC service can be used to instruct CometBFT to prune the transaction and +block index events. + +To support this, the above described pruning service is extended as follows: + +``` + +// PruningService provides privileged access to specialized pruning +// functionality on the CometBFT node to help control node storage. +service PruningService { + + + // SetTxIndexerRetainHeightRequest indicates to the node that it can safely + // prune all tx indices up to the specified retain height. + rpc SetTxIndexerRetainHeight(SetTxIndexerRetainHeightRequest) returns (SetTxIndexerRetainHeightResponse); + + // GetTxIndexerRetainHeight returns information about the retain height + // parameters used by the node to influence TxIndexer pruning + rpc GetTxIndexerRetainHeight(GetTxIndexerRetainHeightRequest) returns (GetTxIndexerRetainHeightResponse); + + // SetBlockIndexerRetainHeightRequest indicates to the node that it can safely + // prune all block indices up to the specified retain height. + rpc SetBlockIndexerRetainHeight(SetBlockIndexerRetainHeightRequest) returns (SetBlockIndexerRetainHeightResponse); + + // GetBlockIndexerRetainHeight returns information about the retain height + // parameters used by the node to influence BlockIndexer pruning + rpc GetBlockIndexerRetainHeight(GetBlockIndexerRetainHeightRequest) returns (GetBlockIndexerRetainHeightResponse); + + +} + +message SetTxIndexerRetainHeightRequest { + uint64 height = 1; +} + +message SetTxIndexerRetainHeightResponse {} + +message GetTxIndexerRetainHeightRequest {} + +message GetTxIndexerRetainHeightResponse { + uint64 height = 1; +} + +message SetBlockIndexerRetainHeightRequest { + uint64 height = 1; +} + +message SetBlockIndexerRetainHeightResponse {} + +message GetBlockIndexerRetainHeightRequest {} + +message GetBlockIndexerRetainHeightResponse { + uint64 height = 1; +} + +``` + With this API design, it is technically possible for an integrator to attach multiple data companions to the node, but only one of their retain heights will be considered by the node. ### Configuration -The following configuration file update is proposed to support the data +The following configuration file (`config.toml`) update is proposed to support the data companion API. ```toml @@ -359,6 +418,14 @@ interaction between a node and its data companion: is influenced by the application and pruning service block retain heights. - `abci_results_base_height` - The actual base height of stored block results, which is influenced by the pruning service block results retain height. +- `block_indexer_retain_height` - The current block indexer retain height + requested by the pruning service. +- `tx_indexer_retain_height` - The current tx indexer retain height + requested by the pruning service. +- `block_indexer_base_height` - The minimum height at which we have block events + (should demonstrate the effects of pruning the block indexer) +- `tx_indexer_base_height` - The minimum height at which we have transaction events + (should demonstrate the effects of pruning the tx indexer) Other metrics may be proposed as part of the non-privileged gRPC API that could assist operators in understanding the health of the interaction with the data From 3c7ad0f273f10e862f7878a3be60a8d41c7a3140 Mon Sep 17 00:00:00 2001 From: Jasmina Malicevic Date: Mon, 2 Oct 2023 13:11:05 +0200 Subject: [PATCH 26/28] Fixed layout --- docs/architecture/adr-101-data-companion-pull-api.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 0a6c31cb30c..ae138ab6469 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -278,11 +278,11 @@ message GetBlockResultsRetainHeightResponse { ##### Indexer pruning service This gRPC service can be used to instruct CometBFT to prune the transaction and -block index events. +block events indexed by CometBFT. To support this, the above described pruning service is extended as follows: -``` +```protobuf // PruningService provides privileged access to specialized pruning // functionality on the CometBFT node to help control node storage. From cc1206b49fbc55de301baf2d6929e38e73855199 Mon Sep 17 00:00:00 2001 From: Jasmina Malicevic Date: Mon, 2 Oct 2023 13:15:49 +0200 Subject: [PATCH 27/28] Added link to docs of implementation --- docs/architecture/adr-101-data-companion-pull-api.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index ae138ab6469..18e378d7e01 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -461,6 +461,7 @@ companion, but only if the data companion is the exclusive user of those APIs. - [ADR 100 - Data Companion Push API][adr-100] - [\#81 - rpc: Add gRPC support][\#81] +- [Documentation on current implementation of ADR][dc-docs] [adr-100-context]: ./adr-100-data-companion-push-api.md#context [adr-100]: ./adr-100-data-companion-push-api.md @@ -468,3 +469,4 @@ companion, but only if the data companion is the exclusive user of those APIs. [adr-100-alt]: ./adr-100-data-companion-push-api.md#alternative-approaches [\#81]: https://github.com/cometbft/cometbft/issues/81 [abci-commit]: ../../spec/abci/abci++_methods.md#commit +[dc-docs]: https://github.com/cometbft/cometbft/tree/main/docs/data-companion \ No newline at end of file From 2ab15c20f7ec01c694278d173f20337c10e4c2d5 Mon Sep 17 00:00:00 2001 From: Jasmina Malicevic Date: Mon, 2 Oct 2023 13:16:29 +0200 Subject: [PATCH 28/28] Minor fix --- docs/architecture/adr-101-data-companion-pull-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/architecture/adr-101-data-companion-pull-api.md b/docs/architecture/adr-101-data-companion-pull-api.md index 18e378d7e01..963dd578beb 100644 --- a/docs/architecture/adr-101-data-companion-pull-api.md +++ b/docs/architecture/adr-101-data-companion-pull-api.md @@ -461,7 +461,7 @@ companion, but only if the data companion is the exclusive user of those APIs. - [ADR 100 - Data Companion Push API][adr-100] - [\#81 - rpc: Add gRPC support][\#81] -- [Documentation on current implementation of ADR][dc-docs] +- [Documentation on current implementation of ADR-101][dc-docs] [adr-100-context]: ./adr-100-data-companion-push-api.md#context [adr-100]: ./adr-100-data-companion-push-api.md