-
Notifications
You must be signed in to change notification settings - Fork 646
refactor(meta): avoid using frontend generated ColumnIndexMapping
when constructing new dispatcher for replacement
#21499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ColumnIndexMapping
when constructing new dispatcher for replacementColumnIndexMapping
when constructing new dispatcher for replacement
ColumnIndexMapping
when constructing new dispatcher for replacementColumnIndexMapping
when constructing new dispatcher for replacement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the dispatcher construction for replacement by removing reliance on the frontend‑generated ColumnIndexMapping for rewriting output indices and by introducing a new helper function for generating output indices. Key changes include:
- Adding a new method (column_ids) for SourceBackfillNode in the prost module.
- Replacing inline column index extraction with the helper function gen_output_indices in the meta module.
- Renaming and converting types from DispatchStrategy to DispatcherType across several meta and controller files to support the refactoring.
Reviewed Changes
Copilot reviewed 5 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
src/prost/src/lib.rs | New method column_ids added to SourceBackfillNode. |
src/meta/src/stream/stream_graph/fragment.rs | Updated logic for generating output_indices and converted variable names to enhance clarity. |
src/meta/src/rpc/ddl_controller.rs | Removal of the col_index_mapping parameter and related dispatcher rewriting. |
src/meta/src/manager/metadata.rs | Updated type usage from DispatchStrategy to PbDispatcherType. |
src/meta/src/controller/fragment.rs | Refactored downstream fragment processing and dispatcher type conversion. |
Files not reviewed (2)
- e2e_test/ddl/alter_table_column_type.slt: Language not supported
- e2e_test/source_legacy/cdc_inline/alter/cdc_table_alter.slt: Language not supported
Comments suppressed due to low confidence (1)
src/meta/src/stream/stream_graph/fragment.rs:1147
- [nitpick] Consider updating the parameter type of required_columns from &Vec to &[i32] for a more idiomatic and flexible API.
fn gen_output_indices(required_columns: &Vec<i32>, upstream_columns: Vec<i32>) -> Option<Vec<u32>> {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we want to switch from calculating in FE to calculating in meta?
What was the original reason to calculate in FE? 🤔
Exactly
Cannot remember clearly. For example, there's an upstream table on its schema version
The previous approach eliminated the need for the meta node to resolve table schema information, which might have been considered simpler to implement when table schema changes were first introduced. However, the key point is that index mapping can be flattened regardless of how many mappings are applied. This does hold any more if we are going to work on complicated mappings involving expressions (for altering column type). |
b7650c8
to
aab2115
Compare
9a1e7d9
to
05dc197
Compare
05dc197
to
8cee6c9
Compare
8cee6c9
to
088350c
Compare
…ering a table Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
088350c
to
8de600f
Compare
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Previously, when performing schema change, the frontend will generate a
ColumnIndexMapping
between the old and the new schema. It will be used by the meta node to rewrite theoutput_indices
field of the dispatchers, to maintain the same schema from the view of existing downstream jobs.Things start to get complicated since we support altering the column type (#19755): instead of using a simple
output_indices
to transform the data chunks in the dispatcher, we need to introduce some sort of expression evaluation to transform between between two different but compatible composite types.Note that altering type can be called multiple times on a column. Instead of rewriting the transforming expression with an enhanced
FieldIndexMapping
, it's obvious that generating the transformation between the latest upstream & downstream schema each time can be much easier to implement. This is the motivation behind this PR.However, we still cannot deprecate the frontend generated
ColumnIndexMapping
now, because it's also used to rewrite the index expressions.Considering that the persisted information is all about column index instead of ID, I still haven't come up with a solution to easily update the functional indexes after altering column type.UPDATE: it will be removed in an upcoming PR after we eliminate the need for using it to rewrite index expressions:ALTER COLUMN TYPE
#21681col_index_mapping
when replacing table #21685Checklist
Documentation
Release note