8000 common: correct the table info when retrieving without virtual columns by wk989898 · Pull Request #1474 · pingcap/ticdc · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

common: correct the table info when retrieving without virtual columns #1474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

wk989898
Copy link
Collaborator
@wk989898 wk989898 commented Jun 25, 2025

What problem does this PR solve?

Issue Number: ref #1449

What is changed and how it works?

Removes the BuildTiDBTableInfoWithoutVirtualColumns function since it has no effect.
The generated columns will be skipped when gen SQL in the sqlmodel package.

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

`None`.

Copy link
ti-chi-bot bot commented Jun 25, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 25, 2025
@wk989898
Copy link
Collaborator Author

/test all

Copy link
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @wk989898, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the getColumnSchemaWithoutVirtualColumns function within pkg/common/table_info_helper.go. The previous implementation failed to correctly filter and copy non-virtual columns into a new columnSchema object due to issues with slice initialization and the source of iteration. The changes ensure that the new column schema is properly constructed with only the intended non-virtual columns.

Highlights

  • Bug Fix: Column Schema Filtering: I've identified and fixed a logical error in the getColumnSchemaWithoutVirtualColumns function. Previously, this function was not correctly populating the Columns field of the new columnSchema object, leading to an empty or incorrectly filtered list of columns.
  • Correct Slice Initialization: The Columns field of the newColumnSchema is now explicitly initialized as a new, empty slice with appropriate capacity (make([]*model.ColumnInfo, 0, len(s.Columns))). This prevents unintended shallow copying and ensures the slice is ready for independent population.
  • Accurate Column Iteration: The loop responsible for filtering and adding non-virtual columns now correctly iterates over the original s.Columns slice (for _, srcCol := range s.Columns). This resolves a bug where the loop was incorrectly iterating over the newly created (and empty) newColumnSchema.Columns, preventing any columns from being added.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR refactors the getColumnSchemaWithoutVirtualColumns function. While the changes to how the Columns slice is handled are correct, I've identified a significant bug and a performance issue within the same function that should be addressed. The Indices slice is being mutated, causing unintended side effects on the original columnSchema, and the columnsOffset map is initialized with zero capacity, which is inefficient.

@@ -771,7 +771,7 @@ func (s *columnSchema) getColumnList(isUpdate bool) (int, string) {
func (s *columnSchema) getColumnSchemaWithoutVirtualColumns() *columnSchema {
newColumnSchema := &columnSchema{
Digest: s.Digest,
Columns: s.Columns,
Columns: make([]*model.ColumnInfo, 0, len(s.Columns)),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Initializing Columns with a new slice is a good change to prevent aliasing and potential mutations of the original columnSchema. However, the Indices field is shallow-copied on the next line (775). Later in the function (lines 805-809), the offsets of index columns are modified. This mutates s.Indices in the original columnSchema, which is unexpected for a function with a 'get' prefix and can lead to subtle bugs. To fix this, you should perform a deep copy of s.Indices before it's modified. Given this is a correctness issue, it's important to address it.

rowColumnsCurrentOffset := 0
columnsOffset := make(map[string]int, len(newColumnSchema.Columns))
for _, srcCol := range newColumnSchema.Columns {
for _, srcCol := range s.Columns {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The change to iterate over s.Columns is correct. However, this change, combined with the change at the start of the function, introduces a performance issue on the preceding line (793). The line columnsOffset := make(map[string]int, len(newColumnSchema.Columns)) now initializes the map with a capacity of 0, because newColumnSchema.Columns is an empty slice at this point. For better performance, you should initialize the map with an appropriate capacity.

columnsOffset := make(map[string]int, len(s.Columns))

@ti-chi-bot ti-chi-bot bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 25, 2025
@wk989898 wk989898 changed the title WIP common: correct the table info when retrieving without virtual columns Jun 25, 2025
@ti-chi-bot ti-chi-bot bot added the lgtm label Jun 25, 2025
Copy link
ti-chi-bot bot commented Jun 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongyunyan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
ti-chi-bot bot commented Jun 25, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-06-25 11:29:26.57819291 +0000 UTC m=+876019.301371877: ☑️ agreed by hongyunyan.

@ti-chi-bot ti-chi-bot bot added the approved label Jun 25, 2025
@wk989898 wk989898 marked this pull request as ready for review June 26, 2025 02:55
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 26, 2025
@wk989898 wk989898 added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2025
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 26, 2025
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 26, 2025
@wk989898
Copy link
Collaborator Author

/gemini summary

Copy link

Summary of Changes

This pull request addresses an issue where table information was not being correctly retrieved when virtual columns were excluded. It refactors the code to streamline the process of handling arguments with and without generated columns, ensuring accurate data retrieval and manipulation during database operations.

Highlights

  • Table Info Correction: Corrects how table information is retrieved when virtual columns are not included.
  • Code Removal: Removes the BuildTiDBTableInfoWithoutVirtualColumns function from table_info.go and getColumnSchemaWithoutVirtualColumns from table_info_helper.go.
  • Argument Handling: Modifies getArgs and introduces getArgsWithGeneratedColumn in sql_builder.go to handle arguments with and without generated columns.
  • DML Writer Update: Updates groupRowsByType in mysql_writer_dml.go to use getArgsWithGeneratedColumn for insert, update, and delete operations.
Changelog
  • pkg/common/table_info.go
    • Removes the BuildTiDBTableInfoWithoutVirtualColumns function (lines 469-479).
  • pkg/common/table_info_helper.go
    • Removes the getColumnSchemaWithoutVirtualColumns function (lines 771-812).
  • pkg/sink/mysql/mysql_writer_dml.go
    • Removes conditional logic for virtual columns (lines 742-744).
    • Replaces getArgs with getArgsWithGeneratedColumn for insert operations (line 748 -> 743).
    • Replaces getArgs with getArgsWithGeneratedColumn for update operations (lines 763-764 -> 758-759).
    • Replaces getArgs with getArgsWithGeneratedColumn for delete operations (line 778 -> 773).
  • pkg/sink/mysql/sql_builder.go
    • Modifies buildInsert to use getArgs without the enableGeneratedColumn parameter (line 73).
    • Modifies buildUpdate to use getArgs without the enableGeneratedColumn parameter (line 131).
    • Modifies getArgs to exclude generated columns (lines 161-170).
    • Introduces getArgsWithGeneratedColumn to include generated columns (lines 173-179).
Activity
  • The PR was approved by hongyunyan.
  • The PR is pending a linked issue to remove the do-not-merge/needs-linked-issue label.
  • The pull-cdc-kafka-integration-heavy test failed and needs to be re-run.
  • gemini-code-assist[bot] left two review comments, one regarding a potential mutation of the original columnSchema and another regarding a performance issue with map initialization.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-linked-issue labels Jun 26, 2025
@wk989898 wk989898 requested a review from hongyunyan June 26, 2025 09:50
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 26, 2025
Copy link
ti-chi-bot bot commented Jun 26, 2025

@wk989898: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdc-pulsar-integration-light 7a13210 link false /test pull-cdc-pulsar-integration-light

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/needs-triage-completed lgtm release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0