8000 How to support partitioning using Postgres, atlas and ent · Issue #1579 · ariga/atlas · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

How to support partitioning using Postgres, atlas and ent #1579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comments

@manttila74
Copy link

Hi,

I have a problem supporting partitioned tables using Postgres, atlas and ent.

I haven't been able to model the partitioned table correctly in ent schema and therefore the atlas migrate lint and migration file generation produce this error:
Error: partition key cannot be added to "xxx" (drop and add is required)
exit status 1

Basically, I think it's because of the multicolumn PK not modeled accordingly. It should contain the partition key with some other fields, one of them being a timestamp.

Any ideas how to solve this?

Thanks,

-Manu

@a8m
Copy link
Member
a8m commented Apr 11, 2023

Hey @manttila74 👋
The partition option can be defined using HCL using the partition option or with SQL using the standard syntax. Can you elaborate on how you defined it with Ent?

@manttila74
Copy link
Author

Thanks for quick reply.

Here is a simplified version of it:

func (Anons) Fields() []ent.Field {
return []ent.Field{
field.Int16("partition_id").
Default(util.GetPartitionId(time.Now())).
Annotations(
entsql.DefaultExprs(map[string]string{
dialect.MySQL: "get_partition_id(CURRENT_TIMESTAMP)",
dialect.Postgres: "get_partition_id(CURRENT_TIMESTAMP)",
}),
).
Immutable(),
field.Time("event_at").
Default(time.Now).
Annotations(&entsql.Annotation{
Default: "CURRENT_TIMESTAMP",
}).
Immutable(),
field.UUID("x_id", uuid.UUID{}),
field.UUID("y_id", uuid.UUID{}),
field.UUID("z_id", uuid.UUID{}),
field.Uint64("id").
StorageKey("sequence_number").
SchemaType(map[string]string{
dialect.Postgres: "bigserial",
}),
...
}
}

func (Anons) Edges() []ent.Edge {
return nil
}

func (Anons) Indexes() []ent.Index {
return []ent.Index{
index.Fields("partition_id", "event_at", "x_id", "y_id", "z_id").
Unique(),
index.Fields("id").
Unique()}
}

In ent there is no way of defining multicolumn PKs unless it's an edge schema, see (ent/ent#1949).
The schema here is part of a different microservice and database, so the edge type of schema doesn't apply here. That's why the PK is defined as unique index above.

When the partitioning is removed from the migration file, the following diff will be produced. There the dropping of PK can be seen, which I suspect is the problem.

Basically there's two problems here:

  1. The sequence_number cannot be the only PK column, since partition key should be part of it. Ent defines it automatically, since we are overriding the "id" column.
  2. The multicolumn PK with (partition_id, event_at, x_id, y_id, z_id) is enough to uniquely differentiate the rows, but there is no way of defining it in ent.

-- drop index "anons_sequence_number" from table: "anons"
DROP INDEX "anons_sequence_number";
-- create sequence for serial column "sequence_number"
CREATE SEQUENCE IF NOT EXISTS "anons_sequence_number_seq" OWNED BY "anons"."sequence_number";
-- modify "anons" table
ALTER TABLE "anons" ALTER COLUMN "sequence_number" SET DEFAULT nextval('"anons_sequence_number_seq"'), ALTER COLUMN "sequence_number" DROP DEFAULT, DROP CONSTRAINT "anons_pkey1" , ADD PRIMARY KEY ("sequence_number");
-- create index "anons_sequence_number" to table: "anons"
CREATE UNIQUE INDEX "anons_sequence_number" ON "anons" ("sequence_number");
-- create index "anons_partition_id_event_at_x_id_y_id_z_id" to table: "anons"
CREATE UNIQUE INDEX "anons_partition_id_event_at_x_id_y_id_z_id" ON "anons" ("partition_id", "event_at", "x_id", "y_id", "z_id");

In HCL the reverse-engineer schema looks like this:

table "anons" {
schema = schema.pm
column "partition_id" {
null = false
type = smallint
default = sql("get_partition_id(CURRENT_TIMESTAMP)")
}
column "event_at" {
null = false
type = timestamptz
default = sql("CURRENT_TIMESTAMP")
}
column "x_id" {
null = false
type = uuid
}
column "z_id" {
null = false
type = uuid
}
column "link_id" {
null = false
type = uuid
}
column "sequence_number" {
null = false
type = bigint
default = sql("nextval('anons_sequence_number_seq1'::regclass)")
}
...
primary_key {
columns = [column.partition_id, column.event_at, column.x_id, column.z_id]
}
index "pmrfanalytics_sequence_number" {
columns = [column.sequence_number]
}
partition {
type = LIST
columns = [column.partition_id]
}
}

@liorkesten
Copy link
liorkesten commented Feb 3, 2025

Any updates on this? Looking for a solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
0