common: correct the table info when retrieving without virtual columns #1474

wk989898 · 2025-06-25T09:31:54Z

What problem does this PR solve?

Issue Number: ref #1449

What is changed and how it works?

Removes the BuildTiDBTableInfoWithoutVirtualColumns function since it has no effect.
The generated columns will be skipped when gen SQL in the sqlmodel package.

Check List

Tests

Unit test
Integration test

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

`None`.

ti-chi-bot · 2025-06-25T09:31:56Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

wk989898 · 2025-06-25T09:32:16Z

/test all

gemini-code-assist

Summary of Changes

Hello @wk989898, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the getColumnSchemaWithoutVirtualColumns function within pkg/common/table_info_helper.go. The previous implementation failed to correctly filter and copy non-virtual columns into a new columnSchema object due to issues with slice initialization and the source of iteration. The changes ensure that the new column schema is properly constructed with only the intended non-virtual columns.

Highlights

Bug Fix: Column Schema Filtering: I've identified and fixed a logical error in the getColumnSchemaWithoutVirtualColumns function. Previously, this function was not correctly populating the Columns field of the new columnSchema object, leading to an empty or incorrectly filtered list of columns.
Correct Slice Initialization: The Columns field of the newColumnSchema is now explicitly initialized as a new, empty slice with appropriate capacity (make([]*model.ColumnInfo, 0, len(s.Columns))). This prevents unintended shallow copying and ensures the slice is ready for independent population.
Accurate Column Iteration: The loop responsible for filtering and adding non-virtual columns now correctly iterates over the original s.Columns slice (for _, srcCol := range s.Columns). This resolves a bug where the loop was incorrectly iterating over the newly created (and empty) newColumnSchema.Columns, preventing any columns from being added.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This PR refactors the getColumnSchemaWithoutVirtualColumns function. While the changes to how the Columns slice is handled are correct, I've identified a significant bug and a performance issue within the same function that should be addressed. The Indices slice is being mutated, causing unintended side effects on the original columnSchema, and the columnsOffset map is initialized with zero capacity, which is inefficient.

gemini-code-assist · 2025-06-25T09:33:51Z

pkg/common/table_info_helper.go

@@ -771,7 +771,7 @@ func (s *columnSchema) getColumnList(isUpdate bool) (int, string) {
 func (s *columnSchema) getColumnSchemaWithoutVirtualColumns() *columnSchema {
 	newColumnSchema := &columnSchema{
 		Digest:                        s.Digest,
-		Columns:                       s.Columns,
+		Columns:                       make([]*model.ColumnInfo, 0, len(s.Columns)),


Initializing Columns with a new slice is a good change to prevent aliasing and potential mutations of the original columnSchema. However, the Indices field is shallow-copied on the next line (775). Later in the function (lines 805-809), the offsets of index columns are modified. This mutates s.Indices in the original columnSchema, which is unexpected for a function with a 'get' prefix and can lead to subtle bugs. To fix this, you should perform a deep copy of s.Indices before it's modified. Given this is a correctness issue, it's important to address it.

gemini-code-assist · 2025-06-25T09:33:52Z

pkg/common/table_info_helper.go

 	rowColumnsCurrentOffset := 0
 	columnsOffset := make(map[string]int, len(newColumnSchema.Columns))
-	for _, srcCol := range newColumnSchema.Columns {
+	for _, srcCol := range s.Columns {


The change to iterate over s.Columns is correct. However, this change, combined with the change at the start of the function, introduces a performance issue on the preceding line (793). The line columnsOffset := make(map[string]int, len(newColumnSchema.Columns)) now initializes the map with a capacity of 0, because newColumnSchema.Columns is an empty slice at this point. For better performance, you should initialize the map with an appropriate capacity.

columnsOffset := make(map[string]int, len(s.Columns))

ti-chi-bot · 2025-06-25T11:29:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongyunyan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [hongyunyan]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2025-06-25T11:29:26Z

[LGTM Timeline notifier]

Timeline:

2025-06-25 11:29:26.57819291 +0000 UTC m=+876019.301371877: ☑️ agreed by hongyunyan.

wk989898 · 2025-06-26T09:42:12Z

/gemini summary

gemini-code-assist · 2025-06-26T09:43:24Z

Summary of Changes

This pull request addresses an issue where table information was not being correctly retrieved when virtual columns were excluded. It refactors the code to streamline the process of handling arguments with and without generated columns, ensuring accurate data retrieval and manipulation during database operations.

Highlights

Table Info Correction: Corrects how table information is retrieved when virtual columns are not included.
Code Removal: Removes the BuildTiDBTableInfoWithoutVirtualColumns function from table_info.go and getColumnSchemaWithoutVirtualColumns from table_info_helper.go.
Argument Handling: Modifies getArgs and introduces getArgsWithGeneratedColumn in sql_builder.go to handle arguments with and without generated columns.
DML Writer Update: Updates groupRowsByType in mysql_writer_dml.go to use getArgsWithGeneratedColumn for insert, update, and delete operations.

Changelog

pkg/common/table_info.go
- Removes the BuildTiDBTableInfoWithoutVirtualColumns function (lines 469-479).
pkg/common/table_info_helper.go
- Removes the getColumnSchemaWithoutVirtualColumns function (lines 771-812).
pkg/sink/mysql/mysql_writer_dml.go
- Removes conditional logic for virtual columns (lines 742-744).
- Replaces getArgs with getArgsWithGeneratedColumn for insert operations (line 748 -> 743).
- Replaces getArgs with getArgsWithGeneratedColumn for update operations (lines 763-764 -> 758-759).
- Replaces getArgs with getArgsWithGeneratedColumn for delete operations (line 778 -> 773).
pkg/sink/mysql/sql_builder.go
- Modifies buildInsert to use getArgs without the enableGeneratedColumn parameter (line 73).
- Modifies buildUpdate to use getArgs without the enableGeneratedColumn parameter (line 131).
- Modifies getArgs to exclude generated columns (lines 161-170).
- Introduces getArgsWithGeneratedColumn to include generated columns (lines 173-179).

Activity

The PR was approved by hongyunyan.
The PR is pending a linked issue to remove the do-not-merge/needs-linked-issue label.
The pull-cdc-kafka-integration-heavy test failed and needs to be re-run.
gemini-code-assist[bot] left two review comments, one regarding a potential mutation of the original columnSchema and another regarding a performance issue with map initialization.

ti-chi-bot · 2025-06-26T10:09:37Z

@wk989898: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-cdc-pulsar-integration-light	`7a13210`	link	false	`/test pull-cdc-pulsar-integration-light`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Update table_info_helper.go

38bd28a

gemini-code-assist bot reviewed Jun 25, 2025

View reviewed changes

remove enableGeneratedColumn

9876676

ti-chi-bot bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 25, 2025

wk989898 changed the title ~~WIP~~ common: correct the table info when retrieving without virtual columns Jun 25, 2025

hongyunyan approved these changes Jun 25, 2025

View reviewed changes

ti-chi-bot bot added the lgtm label Jun 25, 2025

ti-chi-bot bot added the approved label Jun 25, 2025

add generate column test

be5d526

wk989898 marked this pull request as ready for review June 26, 2025 02:55

ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 26, 2025

wk989898 added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2025

fix

bd37282

ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 26, 2025

Merge branch 'master' into wk989898-patch-4

12ef648

ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 26, 2025

ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-linked-issue labels Jun 26, 2025

wk989898 requested a review from hongyunyan June 26, 2025 09:50

update

7a13210

ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 26, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

common: correct the table info when retrieving without virtual columns #1474

common: correct the table info when retrieving without virtual columns #1474

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

common: correct the table info when retrieving without virtual columns #1474

Are you sure you want to change the base?

common: correct the table info when retrieving without virtual columns #1474

Conversation

Uh oh!

What problem does this PR solve?

What is changed and how it works?

Check List

Tests

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Summary of Changes

Highlights

Footnotes

Uh oh!

Choose a reason for hiding this comment

Code Review

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

[LGTM Timeline notifier]

Uh oh!

Uh oh!

Summary of Changes

Highlights

Uh oh!

Uh oh!

Uh oh!