-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Share null mask with constant null arg vector #17234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Looks good - but perhaps it would be better to remove the reference altogether in this case:
@@ -43,6 +43,9 @@ static void ConstantOrNullFunction(DataChunk &args, ExpressionState &state, Vect | |||
if (ConstantVector::IsNull(args.data[idx])) { | |||
// input is constant null, return constant null | |||
result.Reference(info.value); | |||
auto &result_mask = ConstantVector::Validity(result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the reference altogether, and just set the result to a constant NULL? (i.e. result.SetVectorType(VectorType::CONSTANT_VECTOR);
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've updated the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Initialize
should not also not be necessary anymore now, i.e. we just need the two lines (SetVectorType
and ConstantVector::SetNull
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constant_or_null function may accept three or more inputs (e.g., constant_or_null(true, c1, c2)).
In cases where: c1 is a flat vector, and c2 is a constant vector :
we first access c1 and share its mask with the result through result_mask.Combine(input_mask).
Later, when we access c2 and find it's a constant null. If we simply call SetNull at this stage, it would corrupt c1's mask (since we're still sharing c1's mask state from the earlier Combine operation).
So we should call Initialize to switch the mask reference to c2's mask (the constant null vector) before calling SetNull.And c2 will not be affected because it's already null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see - makes sense. Thanks!
Thanks! |
Share null mask with constant null arg vector (duckdb/duckdb#17234)
Share null mask with constant null arg vector (duckdb/duckdb#17234)
Share null mask with constant null arg vector (duckdb/duckdb#17234)
Share null mask with constant null arg vector (duckdb/duckdb#17234)
Fixes #17217
When the constant_or_null function encounters a constant NULL input, it will set the null mask of result vector for row 0 to invalid.
If the null mask is shared with other function arg vector (due to a previous Combine() call), modifying the null mask for row 0 would incorrectly affect the arg vector. This leads to wrong query result.
Share the null mask with the constant NULL arg vector should fix the issue.