-
Notifications
You must be signed in to change notification settings - Fork 3.9k
opt: fix bug where ON CONFLICT DO NOTHING ignores some input #59147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, nice find!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, and @rytaft)
pkg/sql/logictest/testdata/logic_test/upsert, line 1258 at r1 (raw file):
w INT UNIQUE, x INT, y INT DEFAULT 5,
[nit] the DEFAULT 5 is unnecessary
pkg/sql/logictest/testdata/logic_test/upsert, line 1266 at r1 (raw file):
statement ok INSERT INTO uniq VALUES (1, 20, 20, 20, 20), (20, 1, 20, 20, 20), (20, 20, 20, 20, 20),
[nit] It's hard to parse this example and associate values to columns. I would choose different ranges of values for each column, or better yet instead of integers use strings that incorporate the column name (`k1', 'x1', 'x2').
Might also be useful to add a comment with a note for each row (e.g. the second row must be discarded because of conflict on v).
pkg/sql/opt/optbuilder/testdata/upsert, line 74 at r1 (raw file):
w INT UNIQUE, x INT, y INT DEFAULT 5,
[nit] unnecessary DEFAULT 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! I'm glad the fix is so simple.
Reviewed 5 of 5 files at r1.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @andy-kimball and @rytaft)
Prior to this patch, it was possible for some valid input to an INSERT ... ON CONFLICT DO NOTHING to be discarded. For example, consider the following example: CREATE TABLE uniq ( k INT PRIMARY KEY, v INT UNIQUE, w INT UNIQUE, x INT, y INT DEFAULT 5, UNIQUE (x, y) ); INSERT INTO uniq VALUES (1, 1, 1, 1, 1); INSERT INTO uniq VALUES (1, 20, 20, 20, 20), (20, 1, 20, 20, 20), (20, 20, 20, 20, 20) ON CONFLICT DO NOTHING; Since row (20, 20, 20, 20, 20) does not conflict with the existing row in uniq, it should be inserted. However, prior to this patch, all three rows in the second INSERT statement were discarded. This commit fixes the problem by applying the upsert-distinct-on operators for each index after all conflicting rows are removed by anti-joins. Fixes cockroachdb#59125 Release note (bug fix): Fixed a bug in which some non-conflicting rows provided as input to an INSERT ... ON CONFLICT DO NOTHING statement could be discarded, and not inserted. This could happen in cases where the table had one or more unique indexes in addition to the primary index, and some of the rows in the input conflicted with existing values in one or more unique index. This scenario could cause the rows that did not conflict to be erroneously discarded. This has now been fixed.
d66ea09
to
b5a33d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball and @mgartner)
pkg/sql/logictest/testdata/logic_test/upsert, line 1258 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] the DEFAULT 5 is unnecessary
Done.
pkg/sql/logictest/testdata/logic_test/upsert, line 1266 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] It's hard to parse this example and associate values to columns. I would choose different ranges of values for each column, or better yet instead of integers use strings that incorporate the column name (`k1', 'x1', 'x2').
Might also be useful to add a comment with a note for each row (e.g. the second row must be discarded because of conflict on v).
Done.
pkg/sql/opt/optbuilder/testdata/upsert, line 74 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] unnecessary DEFAULT 5
Done.
bors r+ |
Build succeeded: |
Prior to this patch, it was possible for some valid input to an
INSERT ... ON CONFLICT DO NOTHING
to be discarded. For example,consider the following example:
Since row
(20, 20, 20, 20, 20)
does not conflict with the existingrow in uniq, it should be inserted. However, prior to this patch, all
three rows in the second
INSERT
statement were discarded.This commit fixes the problem by applying the
upsert-distinct-on
operators for each index after all conflicting rows are removed by
anti-join
s.Fixes #59125
Release note (bug fix): Fixed a bug in which some non-conflicting rows
provided as input to an
INSERT ... ON CONFLICT DO NOTHING
statement couldbe discarded, and not inserted. This could happen in cases where the
table had one or more unique indexes in addition to the primary index,
and some of the rows in the input conflicted with existing values in one
or more unique index. This scenario could cause the rows that did not
conflict to be erroneously discarded. This has now been fixed.