-
Notifications
You must be signed in to change notification settings - Fork 2k
[BugFix] Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime #60014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix] Create temporary partitions bugfix + fix comparisons on Type.Date and Type.Datetime #60014
Conversation
@mergify rebase |
❌ Pull request can't be updated with latest base branch changesMergify needs the author permission to update the base branch of the pull request. |
@ctbrennan, the failed UT has been fixed. Could you rebase to the main branch to pass the CI check? |
da4c82f
to
d6813c8
Compare
yes, done |
|
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]✅ pass : 6 / 6 (100.00%) file detail
|
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
@Mergifyio backport branch-3.3 |
@Mergifyio backport branch-3.5 |
@Mergifyio backport branch-3.4 |
✅ Backports have been created
|
✅ Backports have been created
|
✅ Backports have been created
|
….Date and Type.Datetime (#60014) Why I'm doing: When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values." Logging at each step showed that: The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer. However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME. Note on what I learned during unit test writing I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe. The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say. What I'm doing: Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic. For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types. Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius. Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema. Fixes #issue (cherry picked from commit 5f25005)
….Date and Type.Datetime (#60014) Why I'm doing: When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values." Logging at each step showed that: The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer. However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME. Note on what I learned during unit test writing I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe. The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say. What I'm doing: Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic. For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types. Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius. Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema. Fixes #issue (cherry picked from commit 5f25005)
….Date and Type.Datetime (#60014) Why I'm doing: When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values." Logging at each step showed that: The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer. However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME. Note on what I learned during unit test writing I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe. The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say. What I'm doing: Replaced == with .isDatetime() when comparing partition c 6D40 olumn types to Type.DATETIME in date partition construction logic. For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types. Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius. Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema. Fixes #issue (cherry picked from commit 5f25005)
….Date and Type.Datetime (StarRocks#60014) Why I'm doing: When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values." Logging at each step showed that: The table and partition column’s type remained DATETIME throughout the parser, visitor, and analyzer. However, inside MultiRangePartitionDesc.buildDateTypePartition(), even though the type was reported as DATETIME, the code used the DATE formatter unless the object passed an identity (==) comparison to Type.DATETIME. Note on what I learned during unit test writing I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe. The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say. What I'm doing: Replaced == with .isDatetime() when comparing partition column types to Type.DATETIME in date partition construction logic. For this, the relevant one is in com.starrocks.sql.ast.MultiRangePartitionDesc#buildDateTypePartition. I also went and did the same thing where there was obviously the same bug for Date and Datetime types. Another possible solution would have been to override Type.equals() but I'm not confident that I'd do it correctly for all cases and it would have a bigger blast radius. Temporary partitions now correctly parse and preserve full datetime values, supporting hourly partitioning aligned with the schema. Fixes #issue
Why I'm doing:
When adding temporary partitions using an ALTER TABLE statement with START and END values that include hour/minute/second, StarRocks would incorrectly truncate the partition key values and only preserve the date part after parsing, even if the partition column is of type DATETIME. This leads to errors when adding hourly temporary partitions, including failed validation due to "The lower values must smaller than upper values."
Note on what I learned during unit test writing
I found by forcibly creating a non-singleton ScalarType in tests—that there must exist at least one code path in the system where a ScalarType (such as DATETIME) is instantiated without going through the canonical singleton factory method (Type.DATETIME, Type.fromPrimitiveType, etc.). This probably happens during some sort of image deserialization, by json/protobuf libraries maybe.
The added unit test demonstrates this risk. While this PR addresses the specific instance I found, similar issues could exist in other type comparison sites throughout the codebase, hard to say.
What I'm doing:
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: