8000 [BugFix] Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime by ctbrennan · Pull Request #60014 · StarRocks/starrocks · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

[BugFix] Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime #60014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ctbrennan
Copy link
Contributor
@ctbrennan ctbrennan commented Jun 18, 2025

Why I'm doing:

When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."

  • Logging at each step showed that:
    • The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer.
    • However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME.

Note on what I learned during unit test writing

I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.

What I'm doing:

  • Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic.
    • For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types.
  • Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius.
  • Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.5
    • 3.4
    • 3.3

@ctbrennan ctbrennan requested review from a team as code owners June 18, 2025 02:55
@ctbrennan ctbrennan changed the title Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime [BugFix] Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime Jun 18, 2025
@alvin-celerdata
Copy link
Contributor

@mergify rebase

Copy link
Contributor
mergify bot commented Jun 19, 2025

rebase

❌ Pull request can't be updated with latest base branch changes

Mergify needs the author permission to update the base branch of the pull request.
@ctbrennan needs to authorize modification on its head branch.

@alvin-celerdata
Copy link
Contributor

@ctbrennan, the failed UT has been fixed. Could you rebase to the main branch to pass the CI check?

@ctbrennan ctbrennan force-pushed the cbrennan/temppartitionbugfixupstream branch from da4c82f to d6813c8 Compare June 19, 2025 16:19
@ctbrennan
Copy link
Contributor Author

@ctbrennan, the failed UT has been fixed. Could you rebase to the main branch to pass the CI check?

yes, done

Copy link

Copy link

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

[FE Incremental Coverage Report]

pass : 6 / 6 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/analyzer/PartitionDescAnalyzer.java 1 1 100.00% []
🔵 com/starrocks/sql/optimizer/rule/transformation/ListPartitionPruner.java 1 1 100.00% []
🔵 com/starrocks/sql/ast/MultiRangePartitionDesc.java 1 1 100.00% []
🔵 com/starrocks/sql/ast/PartitionValue.java 1 1 100.00% []
🔵 com/starrocks/analysis/DateLiteral.java 2 2 100.00% []

Copy link

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@alvin-celerdata alvin-celerdata merged commit 5f25005 into StarRocks:main Jun 20, 2025
66 of 67 checks passed
8000
Copy link

@Mergifyio backport branch-3.3

Copy link

@Mergifyio backport branch-3.5

Copy link

@Mergifyio backport branch-3.4

Copy link
Contributor
mergify bot commented Jun 20, 2025

backport branch-3.3

✅ Backports have been created

Copy link
Contributor
mergify bot commented Jun 20, 2025

backport branch-3.5

✅ Backports have been created

Copy link
Contributor
mergify bot commented Jun 20, 2025

backport branch-3.4

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (#60014)

Why I'm doing:
When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."

Logging at each step showed that:
The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer.
However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME.
Note on what I learned during unit test writing
I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.

What I'm doing:
Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic.
For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types.
Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius.
Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema.

Fixes #issue

(cherry picked from commit 5f25005)
mergify bot pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (#60014)

Why I'm doing:
When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."

Logging at each step showed that:
The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer.
However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME.
Note on what I learned during unit test writing
I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.

What I'm doing:
Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic.
For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types.
Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius.
Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema.

Fixes #issue

(cherry picked from commit 5f25005)
mergify bot pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (#60014)

Why I'm doing:
When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."

Logging at each step showed that:
The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer.
However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME.
Note on what I learned during unit test writing
I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.

What I'm doing:
Replaced == with .isDatetime() when comparing partition c
6D40
olumn types to Type.DATETIME in date partition construction logic.
For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types.
Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius.
Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema.

Fixes #issue

(cherry picked from commit 5f25005)
wanpengfei-git pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (backport #60014) (#60104)

Co-authored-by: Connor Brennan <cbrennan@pinterest.com>
wanpengfei-git pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (backport #60014) (#60106)

Co-authored-by: Connor Brennan <cbrennan@pinterest.com>
wanpengfei-git pushed a commit that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (backport #60014) (#60105)

Co-authored-by: Connor Brennan <cbrennan@pinterest.com>
ctbrennan added a commit to pinterest/starrocks that referenced this pull request Jun 20, 2025
….Date and Type.Datetime (StarRocks#60014)

Why I'm doing:
When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."

Logging at each step showed that:
The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer.
However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME.
Note on what I learned during unit test writing
I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.

What I'm doing:
Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic.
For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types.
Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius.
Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema.

Fixes #issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0