mysql cdc source: schema change while RW is offline fails the connector permanently #21801
Labels
S-need-design
Status: A detailed design is needed before coding. Typically used for feat/refactor issues.
type/bug
Type: Bug. Only for issues.
Describe the bug
After creating a MySQL CDC source/table, if we take RW offline, make schema changes on the MySQL side, and then restart RW, RW only sees the latest table schema. It then fails to parse the unconsumed binlog events that were written with the original schema, and the connector fails permanently.
Error message/log
To Reproduce
a.sh
a.slt
risedev-profiles.user.yml
Then run a.sh.
Expected behavior
Should sync all records correctly and never fail.
How did you deploy RisingWave?
No response
The version of RisingWave
2.3.2 (also latest main)
Additional context
Decoding the binlog requires schema information, and if there has been a schema change, we need the complete schema history. However, we are using MemorySchemaHistory, and this might be the cause.
risingwave/java/connector-node/risingwave-connector-service/src/main/resources/debezium.properties, line 4 in d42e9cc
After RW is brought back online, it can only see the latest MySQL table schema because all history is gone. If there is still unconsumed binlog written with the original schema, Debezium will fail to parse it.
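To make the suspected failure mode concrete, here is a toy model in Python. It is not Debezium or RisingWave code: `SchemaHistory`, `decode`, `replay`, the table columns, and the binlog positions are all made up for illustration. It only shows why rebuilding the history from the live table after a restart cannot decode rows written before the DDL, while a history that survived the restart can.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaHistory:
    """Toy stand-in for a schema history: (binlog position, column list) pairs."""
    versions: list = field(default_factory=list)

    def record(self, position, columns):
        self.versions.append((position, list(columns)))

    def schema_at(self, position):
        # Use the most recent schema recorded at or before this position.
        known = [cols for pos, cols in self.versions if pos <= position]
        if not known:
            raise LookupError(f"no schema known at position {position}")
        return known[-1]


def decode(values, columns):
    # A row can only be decoded with the schema it was written under.
    if len(values) != len(columns):
        raise ValueError(f"row has {len(values)} values, schema has {len(columns)} columns")
    return dict(zip(columns, values))


def replay(binlog, history):
    """Consume binlog events, applying DDL to the history as it is encountered."""
    rows = []
    for position, kind, payload in binlog:
        if kind == "ddl":
            history.record(position, payload)
        else:
            rows.append(decode(payload, history.schema_at(position)))
    return rows


# Hypothetical unconsumed binlog written while the consumer was offline.
BINLOG = [
    (1, "row", (1, "a")),                # written with the original schema
    (2, "ddl", ["id", "name", "age"]),   # ALTER TABLE ... ADD COLUMN age
    (3, "row", (2, "b", 30)),            # written with the new schema
]


def persisted_history():
    # History survived the restart: it still knows the original schema.
    history = SchemaHistory()
    history.record(0, ["id", "name"])
    return history


def rebuilt_history():
    # In-memory history was lost: only the current table schema is visible.
    history = SchemaHistory()
    history.record(0, ["id", "name", "age"])
    return history
```

In this model, `replay(BINLOG, persisted_history())` decodes both rows, whereas `replay(BINLOG, rebuilt_history())` raises `ValueError` on the first row, which mirrors the permanent failure described above.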
This topic can be quite complicated. Please refer to the Debezium docs for a comprehensive understanding: https://debezium.io/documentation/reference/stable/connectors/mysql.html#understanding-why-initial-snapshots-capture-the-schema-history-for-all-tables
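For comparison, Debezium also ships a file-backed schema history. The fragment below is only a sketch of what a persistent setting could look like in debezium.properties. The file path is a placeholder, the commented-out line is an assumption about the current value, and whether a local file fits RisingWave's deployment model at all is an open design question.

```properties
# Assumed current setting: history is kept only in process memory.
# schema.history.internal=io.debezium.relational.history.MemorySchemaHistory

# Possible alternative: Debezium's file-backed history (placeholder path).
schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory
schema.history.internal.file.filename=/path/to/schema-history.dat
```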