-
-
Notifications
You must be signed in to change notification settings - Fork 10
Musicbrainz Schema update v30 #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
note that this reverts the adaptions done by acoustid#8 as musicbrainz does not ship the scripts with function definitions containing a `SET schema` local adaptions would always lead to merge problems. if users have problems that functions do not use the correct schema for queries and can't find functions or tables, consider to set the schema on session, user or database level: ``` -- session SET search_path to musicbrainz,public; -- user level ALTER USER musicbrainz SET search_path TO musicbrainz, public; -- check user level settings via psql \drds mbdata; -- check user level via query SELECT r.rolname AS role_name, rs.setconfig AS configured_settings FROM pg_roles r LEFT JOIN pg_db_role_setting rs ON r.oid = rs.setrole WHERE r.rolname = 'musicbrainz'; -- db level SELECT setting FROM pg_settings WHERE name = 'search_path'; ALTER DATABASE database_name SET search_path TO musicbrainz, public; ```
musicbrainz has fixed upgrade issues see metabrainz/musicbrainz-server@c7b8c29
WalkthroughThis update implements the MusicBrainz schema change for version 30.0.0, covering all required SQL migrations, new and updated functions, triggers, constraints, and documentation. It migrates the schema, updates partitioned tables, adds new columns and tables, adjusts indexes, and provides upgrade instructions, fully aligning the codebase with the latest MusicBrainz schema release. Changes
Assessment against linked issues
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 22
🔭 Outside diff range comments (1)
mbslave/sql/CreateMirrorOnlyTriggers.sql (1)
1-2
: 🛠️ Refactor suggestionSet search_path for mirror trigger definitions
This file creates triggers on unqualified tables (
medium
,release_group_primary_type
, etc.) without setting the schema. On non-default search_path configurations, these DDLs will error out. Please add:SET search_path = musicbrainz, public;right after
\set ON_ERROR_STOP 1
.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (46)
CHANGELOG.rst
(1 hunks)README.rst
(4 hunks)mbslave/__init__.py
(1 hunks)mbslave/sql/CreateFKConstraints.sql
(1 hunks)mbslave/sql/CreateFunctions.sql
(102 hunks)mbslave/sql/CreateIndexes.sql
(3 hunks)mbslave/sql/CreateMirrorOnlyFunctions.sql
(12 hunks)mbslave/sql/CreateMirrorOnlyTriggers.sql
(3 hunks)mbslave/sql/CreatePrimaryKeys.sql
(1 hunks)mbslave/sql/CreateReplicationTriggers.sql
(1 hunks)mbslave/sql/CreateReplicationTriggers2.sql
(1 hunks)mbslave/sql/CreateTables.sql
(4 hunks)mbslave/sql/CreateTriggers.sql
(3 hunks)mbslave/sql/DropFKConstraints.sql
(1 hunks)mbslave/sql/DropFunctions.sql
(1 hunks)mbslave/sql/DropIndexes.sql
(1 hunks)mbslave/sql/DropMirrorOnlyTriggers.sql
(2 hunks)mbslave/sql/DropPrimaryKeys.sql
(1 hunks)mbslave/sql/DropReplicationTriggers.sql
(1 hunks)mbslave/sql/Dro 8000 pReplicationTriggers2.sql
(1 hunks)mbslave/sql/DropTables.sql
(1 hunks)mbslave/sql/DropTriggers.sql
(2 hunks)mbslave/sql/InsertTestData.sql
(4 hunks)mbslave/sql/TruncateTables.sql
(1 hunks)mbslave/sql/caa/CreateMQTriggers.sql
(0 hunks)mbslave/sql/caa/CreateViews.sql
(0 hunks)mbslave/sql/caa/DropMQTriggers.sql
(0 hunks)mbslave/sql/eaa/CreateMQTriggers.sql
(0 hunks)mbslave/sql/eaa/CreateViews.sql
(0 hunks)mbslave/sql/eaa/DropMQTriggers.sql
(0 hunks)mbslave/sql/updates/20240221-mbs-13492.sql
(1 hunks)mbslave/sql/updates/20240726-mbs-9373.sql
(1 hunks)mbslave/sql/updates/20241017-mbs-9253-13464.sql
(1 hunks)mbslave/sql/updates/20241017-mbs-9253-master_and_standalone.sql
(1 hunks)mbslave/sql/updates/20241017-mbs-9253-mirror_only.sql
(1 hunks)mbslave/sql/updates/20241125-mbs-13832.sql
(1 hunks)mbslave/sql/updates/20250320-mbs-13768-fks.sql
(1 hunks)mbslave/sql/updates/20250320-mbs-13768.sql
(1 hunks)mbslave/sql/updates/20250408-mbs-13322.sql
(1 hunks)mbslave/sql/updates/20250408-mbs-13964-all.sql
(1 hunks)mbslave/sql/updates/20250425-mbs-13464-master_and_standalone.sql
(1 hunks)mbslave/sql/updates/20250425-mbs-13464.sql
(1 hunks)mbslave/sql/updates/20250425-mbs-13966.sql
(1 hunks)mbslave/sql/updates/schema-change/30.all.sql
(1 hunks)mbslave/sql/updates/schema-change/30.master_and_standalone.sql
(1 hunks)mbslave/sql/updates/schema-change/30.mirror_only.sql
(1 hunks)
💤 Files with no reviewable changes (6)
- mbslave/sql/caa/CreateViews.sql
- mbslave/sql/eaa/CreateViews.sql
- mbslave/sql/eaa/DropMQTriggers.sql
- mbslave/sql/caa/DropMQTriggers.sql
- mbslave/sql/eaa/CreateMQTriggers.sql
- mbslave/sql/caa/CreateMQTriggers.sql
🔇 Additional comments (65)
CHANGELOG.rst (1)
1-2
: Version entry added correctly
The new version header and underline match the existing RST style.mbslave/__init__.py (1)
4-4
: Bumped package version to 30.0.0
Matches the new schema release. Ensure downstream packaging and distribution are updated accordingly.mbslave/sql/DropTables.sql (1)
271-271
: Removemedium_gid_redirect
table on downgrade
Good addition to match the new table. As with indexes, consider guarding against missing tables withIF EXISTS
for idempotent downgrades.mbslave/sql/DropReplicationTriggers.sql (1)
207-207
:✅ Verification successful
Drop trigger for
medium_gid_redirect
replication
This complements the creation ofreptg_medium_gid_redirect
. Confirm the trigger name matches your create script.Run:
🏁 Script executed:
#!/bin/bash # Verify trigger definition exists rg -n "CREATE TRIGGER.*reptg_medium_gid_redirect" -n mbslave/sqlLength of output: 155
Drop trigger name matches creation – verified
TheDROP TRIGGER IF EXISTS reptg_medium_gid_redirect
in DropReplicationTriggers.sql lines up with theCREATE TRIGGER "reptg_medium_gid_redirect"
in mbslave/sql/CreateReplicationTriggers.sql:818. No action needed.mbslave/sql/TruncateTables.sql (1)
271-271
: Includemedium_gid_redirect
in truncation
Truncatingmedium_gid_redirect
withRESTART IDENTITY CASCADE
ensures its sequence resets alongside related data. Order looks consistent betweenmedium_format
andmedium_index
.mbslave/sql/CreateFKConstraints.sql (1)
3009-3012
: Add FK onmedium_gid_redirect.new_id
Constraintmedium_gid_redirect_fk_new_id
enforces referential integrity againstmedium(id)
, matching patterns used for other*_gid_redirect
tables.mbslave/sql/CreatePrimaryKeys.sql (1)
267-267
: Confirm primary key addition for medium_gid_redirect
This line adds themedium_gid_redirect_pkey
constraint on themedium_gid_redirect
table, aligning with its definition inCreateTables.sql
and its removal inDropPrimaryKeys.sql
. Everything matches up.mbslave/sql/DropPrimaryKeys.sql (1)
267-267
: Confirm drop of medium_gid_redirect primary key
This drops themedium_gid_redirect_pkey
constraint you added inCreatePrimaryKeys.sql
. Good symmetry for rollbacks.mbslave/sql/DropReplicationTriggers2.sql (1)
207-207
: Confirm drop of replication trigger for medium_gid_redirect
This line removesreptg2_medium_gid_redirect
onmedium_gid_redirect
, matching the trigger created inCreateReplicationTriggers2.sql
. Looks correct.mbslave/sql/DropMirrorOnlyTriggers.sql (3)
1-3
: Missing schema context for trigger operations
Without an explicitSET search_path
or schema qualifiers, theseDROP TRIGGER
statements will only apply if the user’s default search path includes themusicbrainz
schema. To avoid “trigger does not exist” errors under custom users, add aSET search_path = musicbrainz;
at the top or qualify each table with the schema.
20-21
: Old mirror triggers for primary/secondary types dropped
Droppinga_upd_release_group_primary_type_mirror
anda_upd_release_group_secondary_type_mirror
aligns with the new trigger definitions in the mirror-only scripts.
36-37
: Old constraint triggers for primary/secondary types dropped
Removingapply_artist_release_group_pending_updates_mirror
on both type tables prepares for the updated deferred triggers.mbslave/sql/CreateReplicationTriggers2.sql (2)
1-3
: Add schema search_path for replication triggers
These triggers operate in themusicbrainz
schema. Insert aSET search_path = musicbrainz;
immediately after\set ON_ERROR_STOP 1
to ensureFOR EACH ROW
statements target the correct schema under non-default users.
818-821
: New replication trigger for medium_gid_redirect
The triggerreptg2_medium_gid_redirect
is correctly inserted in alphabetical order betweenmedium_format
andmedium_index
. It will fire on all DML and calldbmirror2.recordchange()
, matching the pattern of otherreptg2_
triggers.mbslave/sql/updates/20240221-mbs-13492.sql (2)
1-5
: Ensure correct schema and timeout settings
You disable the statement timeout locally, which is good for long-running updates. However, addSET search_path = musicbrainz;
after\set ON_ERROR_STOP 1
to guarantee theeditor
table is found under custom users.
23-23
: TheCOMMIT;
correctly closes the transaction.mbslave/sql/DropFKConstraints.sql (2)
1-3
: Add schema search_path for constraint drops
Without aSET search_path = musicbrainz;
, droppingmedium_gid_redirect_fk_new_id
may fail under non-default users. Either add the search path at the top or qualify the table withmusicbrainz.
.
600-600
: Drop of medium_gid_redirect foreign key
Droppingmedium_gid_redirect_fk_new_id
here aligns with the subsequent re-add in the migration. Good catch to maintain rollback capability.mbslave/sql/updates/20250320-mbs-13768-fks.sql (2)
1-3
: ON_ERROR_STOP and search_path set correctly
The psql meta-command\set ON_ERROR_STOP 1
andSET search_path = musicbrainz;
ensure that the migration halts on errors and operates in the correct schema.
7-10
: Foreign key constraint on medium_gid_redirect
Addingmedium_gid_redirect_fk_new_id
referencingmedium(id)
enforces referential integrity for the UID redirect table. Confirm that themedium_gid_redirect
table exists and itsnew_id
column is correctly typed.mbslave/sql/InsertTestData.sql (5)
113-114
: Include newgid
column for medium test records
The two INSERTs intomedium
now supply the UUIDgid
, aligning test data with the updated schema. Looks correct and matches the existingrelease(id)
references.
160-161
: Addgid
for media on release 2
These INSERTs formedium(id 3,4)
include the new UUID column and reference a validrelease=2
. Good coverage of the new schema changes.
163-164
: Addgid
for media on release 3
Extends the test data formedium(id 5,6)
with the new UUID field—consistent with the othermedium
entries.
245-245
: New link record for redirect tests
INSERTinglink(id 4, link_type 108, attribute_count 0)
prepares for the new redirect logic. Please verify thatlink_type 108
exists in your test fixtures.
260-260
: Link two new artists vial_artist_artist
Thel_artist_artist
row useslink = 4
and entity IDs8,9
, which are defined above. The ordering ensures thelink
exists before the referencing row.mbslave/sql/CreateReplicationTriggers.sql (1)
818-820
: Add replication trigger formedium_gid_redirect
This new trigger follows the pattern of other_gid_redirect
tables and ensures changes inmedium_gid_redirect
are captured.mbslave/sql/updates/20250425-mbs-13464-master_and_standalone.sql (2)
1-3
: Ensure DDL runs in the correct schema
Settingsearch_path = musicbrainz
upfront guarantees the ALTER statements apply to the intended schema, avoiding lookup issues for non-default users.
7-17
: Enforce referential integrity onartist_release
Addingartist_release_fk_artist
andartist_release_fk_release
withON DELETE CASCADE
secures the link toartist(id)
andrelease(id)
. Good clear constraints.mbslave/sql/updates/20250408-mbs-13322.sql (3)
1-3
: Stop on error and begin transaction
\set ON_ERROR_STOP 1
plusBEGIN;
ensures the script fails fast and runs atomically.
5-13
: Definedelete_unused_url
with FK-safe cleanup
The function deletes matching rows fromurl_gid_redirect
andurl
, and only suppresses foreign‐key violations—other errors still bubble up. Logic is concise and focused.
15-15
: Commit the function creation transaction
Wrapping the definition inCOMMIT;
finalizes the change in one atomic step.README.rst (4)
11-11
: Remove stray trailing space
Cleaned up the sentence about “replica of the MusicBrainz database.” for consistency.
39-39
: Fix typo in list heading
Corrected “Alternativelly” to “Alternatively” to improve readability.
49-53
: Document customsearch_path
requirement
Adds a clear advisory for non-default DB users with an exampleALTER USER
command—this directly addresses PR objectives.
93-99
: Add v30 schema upgrade instructions
Outlines running30.all.sql
and updatingreplication_control
—steps correctly match the new release and file names.mbslave/sql/DropTriggers.sql (1)
197-198
: Confirm drop of new release group type triggersThese
DROP TRIGGER IF EXISTS
statements correctly target the newly introduced triggers onrelease_group_primary_type
andrelease_group_secondary_type
, ensuring a clean slate before recreating them in the migration.Also applies to: 391-392
mbslave/sql/CreateMirrorOnlyFunctions.sql (1)
12-12
: Remove explicitsearch_path
from mirror-only functionsDropping redundant
SET search_path
clauses standardizes these functions alongside others inCreateFunctions.sql
. No change to execution context is introduced.Also applies to: 34-34, 43-43, 55-55, 70-70, 90-90, 104-104, 126-126, 134-134, 151-151, 160-160, 168-168, 185-185, 206-206, 223-223
mbslave/sql/updates/20250320-mbs-13768.sql (1)
21-29
: Linkmedium_gid_redirect.new_id
tomedium.id
via foreign keyThe
medium_gid_redirect
table currently has no inline FK onnew_id
. As noted in the related FKs migration, confirm thatnew_id
is constrained tomedium(id)
to maintain referential integrity.mbslave/sql/updates/20250408-mbs-13964-all.sql (1)
29-32
: Create trigger and truncate existing dataThe trigger definition and immediate
TRUNCATE recording_first_release_date
correctly reset the mirror state for this migration.mbslave/sql/updates/20250425-mbs-13966.sql (2)
38-58
: Conditional update logic is soundThe
UPDATE ... FROM
includes anIS DISTINCT FROM
filter, so only changed rows are written. This minimizes unnecessary updates.
67-67
: Clear pending updates after bulk operationsTruncating
artist_release_group_pending_update
here prevents stale entries before triggers repopulate it. This aligns with the upgrade flow.mbslave/sql/updates/20241125-mbs-13832.sql (1)
1-49
: Explicit schema qualifiers ensure search_path independenceAll referenced tables and views are fully schema-qualified (
cover_art_archive.*
,event_art_archive.*
,musicbrainz.edit
), so this migration won’t break when a user’s search_path is different. Nice work.mbslave/sql/updates/schema-change/30.mirror_only.sql (1)
5-6
: Explicit search_path setBy adding
SET search_path = musicbrainz, public;
you guarantee mirror-only triggers target the right schema. This aligns with best practices for multi‐schema deployments.mbslave/sql/CreateTables.sql (3)
432-432
: Include full artist name with correct collation
Addingname VARCHAR COLLATE musicbrainz NOT NULL
to theartist_release
partition replaces the old single-character sort field and ensures proper sorting under the MusicBrainz collation.
466-466
: Maintain consistency for group name collation
Thename VARCHAR COLLATE musicbrainz NOT NULL
addition inartist_release_group
mirrorsartist_release
and guarantees uniform sorting behavior.
2903-2904
: Add track_count and gid to medium table
Introducingtrack_count INTEGER NOT NULL DEFAULT 0
andgid UUID NOT NULL
meets the schema v30 requirements. The default ensures existing records aren’t NULL, and the UUID will be uniquely indexed elsewhere.mbslave/sql/CreateTriggers.sql (3)
457-458
: Add mirror update trigger for medium table
The new AFTER UPDATE trigger onmedium
callsa_upd_medium_mirror()
to keep mirror tables in sync. Ensure the function exists and its logic matches the updated schema.
586-590
: Introduce mirror update triggers for release group types
These triggers firea_upd_release_group_primary_type_mirror()
anda_upd_release_group_secondary_type_mirror()
after updates, aligning the mirror tables with schema v30. Good addition.
1311-1322
: Add deferrable constraint triggers for child_order changes
The constraint triggers onrelease_group_primary_type
andrelease_group_secondary_type
ensure pending updates run only whenchild_order
differs. This avoids unnecessary updates.mbslave/sql/updates/20241017-mbs-9253-master_and_standalone.sql (3)
1-5
: Initialize transaction with error-stop and schema context
Using\set ON_ERROR_STOP 1
andSET search_path = musicbrainz
ensures the script halts on error and works in the correct schema.
19-23
: Create mirror update triggers for release_group types
These AFTER UPDATE triggers invoke mirror procedures for primary and secondary types, keeping mirror tables aligned.
25-35
: Add deferrable constraint triggers on child_order changes
Deferrable triggers onrelease_group_primary_type
/secondary_type
fire only whenchild_order
changes, optimizing update propagation.mbslave/sql/updates/20250425-mbs-13464.sql (3)
1-4
: Begin migration with error-stop
The script starts a transaction with\set ON_ERROR_STOP 1
andBEGIN;
—standard practice.
5-9
: Drop old function and tables in proper order
get_artist_release_rows
is dropped before its underlying tables, avoiding dependency errors. Good.
22-27
: Define partitions for artist_release
Sub-tables forFALSE
andTRUE
values complete the partition setup. Looks correct.mbslave/sql/updates/schema-change/30.master_and_standalone.sql (6)
1-8
: Initialize master/standalone migration
Setting schema search path, disabling statement timeout, and beginning the transaction ensures consistent execution across environments.
15-25
: Enforce foreign keys on artist_release_group in master and standalone
Adds cascading deletes to maintain integrity across master and mirror setups.
27-32
: Mirror update triggers for release_group types
Recreates triggers for primary and secondary type mirror updates in both master and standalone contexts.
33-43
: Deferrable constraint triggers for child_order changes
Ensures pending updates fire only whenchild_order
changes, across master and standalone.
51-61
: Add cascading foreign keys on artist_release
Linksartist_release.artist
→artist(id)
andrelease
→release(id)
with ON DELETE CASCADE, aligning master and mirror schemas.
69-73
: Add foreign key on medium_gid_redirect
Ensuresmedium_gid_redirect.new_id
referencesmedium(id)
, supporting the new UUID redirection mechanism.mbslave/sql/CreateIndexes.sql (2)
510-510
: Enforce uniqueness on medium.gid
A unique index onmedium(gid)
supports fast lookup and guarantees no duplicate UUIDs.
736-736
: Index medium_gid_redirect(new_id)
This index speeds up joins and deletes on themedium_gid_redirect
table, enabling efficient redirect lookups.mbslave/sql/CreateFunctions.sql (1)
16-346
: Normalize LANGUAGE specifiers and remove schema qualifications
Across all of these functions, theLANGUAGE
declarations have been standardized (dropping redundant quotes aroundplpgsql
, unifying capitalization forsql
vsSQL
, etc.), and the explicitSET search_path = musicbrainz, public
clauses have been removed from their definitions. This cleans up the function definitions, avoids hidden schema-path dependencies, and aligns with the PR goal of moving search‐path configuration out of individual routines.mbslave/sql/updates/schema-change/30.all.sql (1)
11-11
: Explicit script‐level search_path is correct
TheSET search_path = musicbrainz, public;
at the top of this all-in-one migration properly ensures all subsequent unqualified references resolve to themusicbrainz
schema. This aligns with the PR’s objective of externalizing search-path setup.
Version 30.0.0 | ||
============== | ||
|
||
- Schema change 30. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Changelog detail could be enriched
Consider adding a brief note about requiring a user-level search_path
change (e.g., ALTER USER foo SET search_path TO musicbrainz, public
) to help DB admins avoid the function-not-found error.
🤖 Prompt for AI Agents
In CHANGELOG.rst at line 4, the changelog entry "Schema change 30." is too brief
and lacks important context. Enhance this entry by adding a note about the
necessity for a user-level search_path change, such as "ALTER USER foo SET
search_path TO musicbrainz, public", to inform database administrators about
this requirement and help prevent function-not-found errors.
DROP INDEX medium_gid_redirect_idx_new_id; | ||
DROP INDEX medium_idx_gid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Drop index statements for new medium
GID support
These lines clean up the two new indexes. Verify the names match those in your creation scripts. You might also consider using IF EXISTS
to avoid failures during downgrades.
Run:
🏁 Script executed:
#!/bin/bash
# Verify index definitions exist
rg -n "CREATE.*medium_gid_redirect_idx_new_id" -n mbslave/sql
rg -n "CREATE.*medium_idx_gid" -n mbslave/sql
Length of output: 756
Add IF EXISTS to DROP INDEX statements for safer downgrades
Verified that both index names match their creation in:
- mbslave/sql/CreateIndexes.sql: lines 510 & 736
- mbslave/sql/updates/20250320-mbs-13768.sql
- mbslave/sql/updates/schema-change/30.all.sql
Recommended fix in mbslave/sql/DropIndexes.sql (lines 406–407):
-DROP INDEX medium_gid_redirect_idx_new_id;
-DROP INDEX medium_idx_gid;
+DROP INDEX IF EXISTS medium_gid_redirect_idx_new_id;
+DROP INDEX IF EXISTS medium_idx_gid;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
DROP INDEX medium_gid_redirect_idx_new_id; | |
DROP INDEX medium_idx_gid; | |
– DROP INDEX medium_gid_redirect_idx_new_id; | |
– DROP INDEX medium_idx_gid; | |
+ DROP INDEX IF EXISTS medium_gid_redirect_idx_new_id; | |
+ DROP INDEX IF EXISTS medium_idx_gid; |
🤖 Prompt for AI Agents
In mbslave/sql/DropIndexes.sql around lines 406 to 407, the DROP INDEX
statements lack the IF EXISTS clause, which can cause errors if the indexes do
not exist during downgrades. Modify both DROP INDEX statements to include IF
EXISTS to safely attempt dropping the indexes only if they exist, preventing
failures in downgrade scripts.
\set ON_ERROR_STOP 1 | ||
|
||
BEGIN; | ||
|
||
UPDATE editor | ||
SET privs = privs | 16384 -- set new voting disabled flag | ||
WHERE (privs & 1024) > 0; -- where editor had editing disabled flag | ||
|
||
COMMIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Make the migration self-contained by qualifying schema or setting search_path
The UPDATE editor
statement assumes the musicbrainz
schema is in the user’s search_path. If it isn’t, this will fail. To harden the script, either add after line 1:
SET LOCAL search_path = musicbrainz, public;
or fully qualify the table:
UPDATE musicbrainz.editor …
This ensures the migration runs reliably under any user.
🤖 Prompt for AI Agents
In mbslave/sql/updates/20240726-mbs-9373.sql lines 1 to 9, the UPDATE statement
on the editor table assumes the musicbrainz schema is in the search_path, which
may cause failures if it is not. To fix this, either add "SET LOCAL search_path
= musicbrainz, public;" immediately after line 1 or fully qualify the table name
as "musicbrainz.editor" in the UPDATE statement to ensure the migration runs
reliably regardless of the user's search_path.
DROP FUNCTION a_upd_release_group_primary_type_mirror(); | ||
DROP FUNCTION a_upd_release_group_secondary_type_mirror(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Optional: suppress errors with IF EXISTS on drop functions
Since ON_ERROR_STOP
is unset here, missing functions won’t halt execution but will generate noise. You could use:
DROP FUNCTION IF EXISTS a_upd_release_group_primary_type_mirror();
DROP FUNCTION IF EXISTS a_upd_release_group_secondary_type_mirror();
to keep the logs clean.
🤖 Prompt for AI Agents
In mbslave/sql/DropFunctions.sql around lines 42 to 43, the DROP FUNCTION
statements do not use IF EXISTS, which causes unnecessary error messages if the
functions are missing. Modify these statements to include IF EXISTS, changing
them to DROP FUNCTION IF EXISTS a_upd_release_group_primary_type_mirror(); and
DROP FUNCTION IF EXISTS a_upd_release_group_secondary_type_mirror(); to suppress
errors and keep logs clean.
UPDATE editor | ||
SET privs = privs | 8192 -- set new beginner flag | ||
WHERE id != 4 -- avoid setting ModBot as beginner | ||
AND NOT deleted | ||
AND ( | ||
member_since > NOW() - INTERVAL '2 weeks' | ||
OR | ||
NOT EXISTS ( | ||
SELECT 1 | ||
FROM edit | ||
WHERE edit.editor = editor.id | ||
AND edit.autoedit = 0 | ||
AND edit.status = 2 | ||
OFFSET 9 | ||
) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Refactor EXISTS check for performance
Using OFFSET 9
without a LIMIT
will scan all edits beyond the ninth. Consider adding LIMIT 1
:
…
AND NOT EXISTS (
SELECT 1
FROM edit
WHERE edit.editor = editor.id
AND edit.autoedit = 0
AND edit.status = 2
OFFSET 9 LIMIT 1
)
…
This stops at the 10th row and reduces work.
🤖 Prompt for AI Agents
In mbslave/sql/updates/20240221-mbs-13492.sql around lines 6 to 21, the NOT
EXISTS subquery uses OFFSET 9 without a LIMIT, causing it to scan all rows
beyond the ninth. To improve performance, add LIMIT 1 after OFFSET 9 in the
subquery so it stops checking after the 10th row, reducing unnecessary scanning.
CREATE OR REPLACE FUNCTION get_artist_release_rows( | ||
release_id INTEGER | ||
) RETURNS SETOF artist_release AS $$ | ||
BEGIN | ||
-- PostgreSQL 12 generates a vastly more efficient plan when only | ||
-- one release ID is passed. A condition like `r.id = any(...)` | ||
-- can be over 200x slower, even with only one release ID in the | ||
-- array. | ||
RETURN QUERY EXECUTE $SQL$ | ||
SELECT DISTINCT ON (ar.artist, r.id) | ||
ar.is_track_artist, | ||
ar.artist, | ||
integer_date(rfrd.year, rfrd.month, rfrd.day) AS first_release_date, | ||
array_agg( | ||
DISTINCT rl.catalog_number ORDER BY rl.catalog_number | ||
) FILTER (WHERE rl.catalog_number IS NOT NULL)::TEXT[] AS catalog_numbers, | ||
min(iso.code ORDER BY iso.code)::CHAR(2) AS country_code, | ||
left(regexp_replace( | ||
(CASE r.barcode WHEN '' THEN '0' ELSE r.barcode END), | ||
'[^0-9]+', '', 'g' | ||
), 18)::BIGINT AS barcode, | ||
r.name, | ||
r.id | ||
FROM ( | ||
SELECT FALSE AS is_track_artist, racn.artist, r.id AS release | ||
FROM release r | ||
JOIN artist_credit_name racn ON racn.artist_credit = r.artist_credit | ||
UNION ALL | ||
SELECT TRUE AS is_track_artist, tacn.artist, m.release | ||
FROM medium m | ||
JOIN track t ON t.medium = m.id | ||
JOIN artist_credit_name tacn ON tacn.artist_credit = t.artist_credit | ||
) ar | ||
JOIN release r ON r.id = ar.release | ||
LEFT JOIN release_first_release_date rfrd ON rfrd.release = r.id | ||
LEFT JOIN release_label rl ON rl.release = r.id | ||
LEFT JOIN release_country rc ON rc.release = r.id | ||
LEFT JOIN iso_3166_1 iso ON iso.area = rc.country | ||
$SQL$ || (CASE WHEN release_id IS NULL THEN '' ELSE 'WHERE r.id = $1' END) || | ||
$SQL$ | ||
GROUP BY ar.is_track_artist, ar.artist, rfrd.release, r.id | ||
ORDER BY ar.artist, r.id, ar.is_track_artist | ||
$SQL$ | ||
USING release_id; | ||
END; | ||
$$ LANGUAGE plpgsql; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review dynamic SQL in get_artist_release_rows()
The concatenation of the WHERE clause lacks a leading space, causing ...rc.countryWHERE r.id = $1
. Add a space or newline before WHERE
to avoid syntax errors.
Proposed fix:
- $SQL$ || (CASE WHEN release_id IS NULL THEN '' ELSE 'WHERE r.id = $1' END) ||
+ $SQL$ || (CASE WHEN release_id IS NULL THEN '' ELSE ' WHERE r.id = $1' END) ||
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
CREATE OR REPLACE FUNCTION get_artist_release_rows( | |
release_id INTEGER | |
) RETURNS SETOF artist_release AS $$ | |
BEGIN | |
-- PostgreSQL 12 generates a vastly more efficient plan when only | |
-- one release ID is passed. A condition like `r.id = any(...)` | |
-- can be over 200x slower, even with only one release ID in the | |
-- array. | |
RETURN QUERY EXECUTE $SQL$ | |
SELECT DISTINCT ON (ar.artist, r.id) | |
ar.is_track_artist, | |
ar.artist, | |
integer_date(rfrd.year, rfrd.month, rfrd.day) AS first_release_date, | |
array_agg( | |
DISTINCT rl.catalog_number ORDER BY rl.catalog_number | |
) FILTER (WHERE rl.catalog_number IS NOT NULL)::TEXT[] AS catalog_numbers, | |
min(iso.code ORDER BY iso.code)::CHAR(2) AS country_code, | |
left(regexp_replace( | |
(CASE r.barcode WHEN '' THEN '0' ELSE r.barcode END), | |
'[^0-9]+', '', 'g' | |
), 18)::BIGINT AS barcode, | |
r.name, | |
r.id | |
FROM ( | |
SELECT FALSE AS is_track_artist, racn.artist, r.id AS release | |
FROM release r | |
JOIN artist_credit_name racn ON racn.artist_credit = r.artist_credit | |
UNION ALL | |
SELECT TRUE AS is_track_artist, tacn.artist, m.release | |
FROM medium m | |
JOIN track t ON t.medium = m.id | |
JOIN artist_credit_name tacn ON tacn.artist_credit = t.artist_credit | |
) ar | |
JOIN release r ON r.id = ar.release | |
LEFT JOIN release_first_release_date rfrd ON rfrd.release = r.id | |
LEFT JOIN release_label rl ON rl.release = r.id | |
LEFT JOIN release_country rc ON rc.release = r.id | |
LEFT JOIN iso_3166_1 iso ON iso.area = rc.country | |
$SQL$ || (CASE WHEN release_id IS NULL THEN '' ELSE 'WHERE r.id = $1' END) || | |
$SQL$ | |
GROUP BY ar.is_track_artist, ar.artist, rfrd.release, r.id | |
ORDER BY ar.artist, r.id, ar.is_track_artist | |
$SQL$ | |
USING release_id; | |
END; | |
$$ LANGUAGE plpgsql; | |
CREATE OR REPLACE FUNCTION get_artist_release_rows( | |
release_id INTEGER | |
) RETURNS SETOF artist_release AS $$ | |
BEGIN | |
-- PostgreSQL 12 generates a vastly more efficient plan when only | |
-- one release ID is passed. A condition like `r.id = any(...)` | |
-- can be over 200x slower, even with only one release ID in the | |
-- array. | |
RETURN QUERY EXECUTE $SQL$ | |
SELECT DISTINCT ON (ar.artist, r.id) | |
ar.is_track_artist, | |
ar.artist, | |
integer_date(rfrd.year, rfrd.month, rfrd.day) AS first_release_date, | |
array_agg( | |
DISTINCT rl.catalog_number ORDER BY rl.catalog_number | |
) FILTER (WHERE rl.catalog_number IS NOT NULL)::TEXT[] AS catalog_numbers, | |
min(iso.code ORDER BY iso.code)::CHAR(2) AS country_code, | |
left(regexp_replace( | |
(CASE r.barcode WHEN '' THEN '0' ELSE r.barcode END), | |
'[^0-9]+', '', 'g' | |
), 18)::BIGINT AS barcode, | |
r.name, | |
r.id | |
FROM ( | |
SELECT FALSE AS is_track_artist, racn.artist, r.id AS release | |
FROM release r | |
JOIN artist_credit_name racn ON racn.artist_credit = r.artist_credit | |
UNION ALL | |
SELECT TRUE AS is_track_artist, tacn.artist, m.release | |
FROM medium m | |
JOIN track t ON t.medium = m.id | |
JOIN artist_credit_name tacn ON tacn.artist_credit = t.artist_credit | |
) ar | |
JOIN release r ON r.id = ar.release | |
LEFT JOIN release_first_release_date rfrd ON rfrd.release = r.id | |
LEFT JOIN release_label rl ON rl.release = r.id | |
LEFT JOIN release_country rc ON rc.release = r.id | |
LEFT JOIN iso_3166_1 iso ON iso.area = rc.country | |
$SQL$ || (CASE WHEN release_id IS NULL THEN '' ELSE ' WHERE r.id = $1' END) || | |
$SQL$ | |
GROUP BY ar.is_track_artist, ar.artist, rfrd.release, r.id | |
ORDER BY ar.artist, r.id, ar.is_track_artist | |
$SQL$ | |
USING release_id; | |
END; | |
$$ LANGUAGE plpgsql; |
🤖 Prompt for AI Agents
In mbslave/sql/updates/20250425-mbs-13464.sql around lines 28 to 73, the dynamic
SQL concatenation for the WHERE clause in the get_artist_release_rows() function
lacks a leading space before 'WHERE', causing a syntax error by merging it with
the previous token. Fix this by adding a space or newline before 'WHERE' in the
concatenated string to ensure proper SQL syntax.
CREATE INDEX artist_release_nonva_idx_sort ON artist_release_nonva (artist, first_release_date NULLS LAST, catalog_numbers NULLS LAST, country_code NULLS LAST, barcode NULLS LAST, name, release); | ||
CREATE INDEX artist_release_va_idx_sort ON artist_release_va (artist, first_release_date NULLS LAST, catalog_numbers NULLS LAST, country_code NULLS LAST, barcode NULLS LAST, name, release); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add sort indexes for partitioned artist_release tables
Indexes on (artist, first_release_date, catalog_numbers, country_code, barcode, name, release)
optimize queries that aggregate per artist. Consider CONCURRENTLY
in production to avoid long locks.
🤖 Prompt for AI Agents
In mbslave/sql/CreateIndexes.sql around lines 72 to 73, add indexes on the
columns (artist, first_release_date, catalog_numbers, country_code, barcode,
name, release) for the partitioned tables artist_release_nonva and
artist_release_va to optimize artist-based aggregation queries. When applying
these indexes in production, use the CONCURRENTLY option to prevent long locking
of the tables during index creation.
CREATE INDEX artist_release_group_nonva_idx_sort ON artist_release_group_nonva (artist, unofficial, primary_type_child_order NULLS FIRST, primary_type NULLS FIRST, secondary_type_child_orders NULLS FIRST, secondary_types NULLS FIRST, first_release_date NULLS LAST, name, release_group); | ||
CREATE INDEX artist_release_group_va_idx_sort ON artist_release_group_va (artist, unofficial, primary_type_child_order NULLS FIRST, primary_type NULLS FIRST, secondary_type_child_orders NULLS FIRST, secondary_types NULLS FIRST, first_release_date NULLS LAST, name, release_group); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add sort indexes for partitioned artist_release_group tables
Similar composite indexes on release_group partitions help performance for group-based queries. Consider building these indexes concurrently.
🤖 Prompt for AI Agents
In mbslave/sql/CreateIndexes.sql around lines 80 to 81, the new composite
indexes on the partitioned artist_release_group tables should be created
concurrently to avoid locking issues during index creation. Modify the CREATE
INDEX statements to include the CONCURRENTLY keyword, ensuring the indexes are
built without blocking writes on the tables.
|
||
CREATE OR REPLACE FUNCTION controlled_for_whitespace(TEXT) RETURNS boolean AS $$ | ||
SELECT NOT padded_by_whitespace($1); | ||
$$ LANGUAGE sql IMMUTABLE SET search_path = musicbrainz, public; | ||
$$ LANGUAGE SQL IMMUTABLE SET search_path = musicbrainz, public; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Drop remaining SET search_path
from function
The controlled_for_whitespace
function still embeds a SET search_path = musicbrainz, public
in its LANGUAGE
clause. To fully eliminate schema search‐path assumptions within function bodies, please remove this directive and rely on the script-level or user-level setting instead.
🤖 Prompt for AI Agents
In mbslave/sql/CreateFunctions.sql at line 1592, remove the `SET search_path =
musicbrainz, public` directive from the LANGUAGE clause of the
`controlled_for_whitespace` function to avoid embedding schema search-path
assumptions inside the function. Rely on script-level or user-level search_path
settings instead.
\set ON_ERROR_STOP 1 | ||
|
||
BEGIN; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add explicit search_path at script start
This migration defines and drops functions and tables using unqualified names but doesn’t set a search_path
. If run under a nondefault role, objects may end up in the wrong schema or fail to resolve. Please add a SET search_path = musicbrainz, public;
immediately after BEGIN;
to guarantee the script operates in the intended schema.
🤖 Prompt for AI Agents
In mbslave/sql/updates/20241017-mbs-9253-13464.sql at lines 1 to 4, the script
lacks an explicit search_path setting, which can cause schema resolution issues.
After the BEGIN; statement, add the line SET search_path = musicbrainz, public;
to ensure all unqualified object names are resolved in the correct schema during
the migration.
This PR does seem to be missing something. I setup a fresh database with this branch and then attempted to sync up to latest. I ran into this error:
|
resolves #19
the manually entered search path definitions for functions would conflict on every run of
scripts/update_sql.sh
without them, the sync process could fail with errors like this if you configure a username different to
musicbrainz
the reason is that running
mbslave sync
configures the database session withoutset_search_path
and the default search path for the custom username does not match the musicbrainz schema anymore.as a workaround you can set the searchpath for the database user configured in
mbslave.cfg
optiondatabase.user
on user level