Function source normalization (pg_dump) · Issue #139 · mzabani/codd · GitHub
Function source normalization (pg_dump) #139
Closed
@TravisCardwell

Description

The Codd representation of a PostgreSQL function or procedure includes the MD5 hash of the function body's source code when the function is implemented in SQL or PL/pgSQL (reference). The function body is stored as a string that preserves the exact whitespace used at creation. pg_dump collapses this whitespace when it dumps the database, however, so a schema restored from a dump hashes differently and Codd reports a difference in representation.
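To illustrate why whitespace alone changes the representation, here is a minimal Python sketch. The helper `body_md5` and both sample bodies are hypothetical; this is not Codd's actual hashing code, only a demonstration that an MD5 digest over the raw source is sensitive to whitespace.

```python
import hashlib


def body_md5(source: str) -> str:
    """Hex MD5 digest of a function body, hashed exactly as written."""
    return hashlib.md5(source.encode("utf-8")).hexdigest()


# Two hypothetical bodies that differ only in whitespace.
as_written = "\n    SELECT a + b;\n"
reformatted = " SELECT a + b; "

# The statements are equivalent, but the digests do not match.
print(body_md5(as_written) == body_md5(reformatted))
```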

This issue can be avoided by not using pg_dump schema dumps. For example, dumps are often used when upgrading PostgreSQL; an alternative is to initialize the schema on the new version from the migrations (using Codd) and to use pg_dump only to migrate the data. Schema dumps are the de facto standard way to save a snapshot of a database schema, however, so being able to compare them would be nice.

This issue could be resolved by normalizing the function body source code. This requires parsing the source code. (Doing this correctly requires parsing of nested dollar-quoted strings using identifiers, using an FSM augmented with an identifier stack.) I have not tried implementing this, but my current idea of how to do so is to change the representation to include the full source code. An option could then be used to determine how that source code is compared (equality by default, normalized when requested). Would including the full source code in the representation be problematic?
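As a rough illustration of the idea, here is a simplified Python sketch, not a proposed implementation for Codd. It collapses whitespace runs outside of string literals and copies single-quoted and dollar-quoted literals verbatim. It treats a dollar-quoted literal as opaque up to its matching closing tag, so it is simpler than the stack-based FSM described above. It also ignores comments, quoted identifiers, and escape-string syntax. The names `normalize_body` and `same_body` are hypothetical, and whether the normalized form matches what pg_dump actually emits is untested.

```python
import re

# Opening dollar-quote delimiter such as $$ or $body$ (ASCII tags only here).
DOLLAR_TAG = re.compile(r"\$(?:[A-Za-z_][A-Za-z_0-9]*)?\$")


def normalize_body(source: str) -> str:
    """Collapse whitespace runs outside string literals to one space."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            # Whitespace run outside a literal: emit a single space.
            while i < n and source[i].isspace():
                i += 1
            out.append(" ")
        elif ch == "'":
            # Single-quoted literal; a doubled '' is an escaped quote.
            j = i + 1
            while j < n:
                if source[j] == "'":
                    if j + 1 < n and source[j + 1] == "'":
                        j += 2
                        continue
                    break
                j += 1
            out.append(source[i:j + 1])
            i = j + 1
        elif ch == "$" and (m := DOLLAR_TAG.match(source, i)):
            # Dollar-quoted literal: copy verbatim through the matching tag.
            tag = m.group(0)
            end = source.find(tag, m.end())
            stop = n if end == -1 else end + len(tag)
            out.append(source[i:stop])
            i = stop
        else:
            out.append(ch)
            i += 1
    return "".join(out).strip()


def same_body(a: str, b: str, normalized: bool = False) -> bool:
    """Compare two bodies: exact equality by default, normalized on request."""
    if normalized:
        return normalize_body(a) == normalize_body(b)
    return a == b
```

With this sketch, `same_body("SELECT 1;", " SELECT  1; ")` is false under the default exact comparison and true with `normalized=True`, while whitespace inside a literal such as `'a  b'` is preserved in both modes.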
