8000 Suboptimal query plan with tight filter constraint when limiting to 3 or fewer rows · Issue #270 · tensorchord/VectorChord · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Suboptimal query plan with tight filter constraint when limiting to 3 or fewer rows #270
Open
@mertalev

Description

@mertalev

Given this query, where assets contains 31805 rows, asset_faces and face_search (lists=256, probes=32) contain 166496 and the expected number of results is 1 (deletedAt is not null for most assets):

select
      "asset_faces"."id",
      "asset_faces"."personId",
      face_search.embedding <=> $1 as "distance"
    from
      "asset_faces"
      inner join "assets" on "assets"."id" = "asset_faces"."assetId"
      inner join "face_search" on "face_search"."faceId" = "asset_faces"."id"
    where
      "assets"."ownerId" = any (array[$2]::uuid[])
      and "assets"."deletedAt" is null
    order by
      "distance"
    limit
      $3;

The query plan when limit = 1:

Limit  (cost=21.79..869.86 rows=1 width=40) (actual time=95.844..95.845 rows=0 loops=1)
  ->  Nested Loop  (cost=21.79..89069.71 rows=105 width=40) (actual time=95.843..95.844 rows=0 loops=1)
        ->  Nested Loop  (cost=21.49..79586.99 rows=166496 width=66) (actual time=2.149..74.610 rows=19384 loops=1)
              ->  Index Scan using face_index on face_search  (cost=21.07..3352.63 rows=166496 width=34) (actual time=2.126..45.766 rows=19384 loops=1)
                    Order By: (embedding <=> <omitted>::vector)
"              ->  Index Scan using ""PK_6df76ab2eb6f5b57b7c2f1fc684"" on asset_faces  (cost=0.42..0.46 rows=1 width=48) (actual time=0.001..0.001 rows=1 loops=19384)"
"                    Index Cond: (id = face_search.""faceId"")"
        ->  Memoize  (cost=0.30..0.34 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=19384)
"              Cache Key: asset_faces.""assetId"""
              Cache Mode: logical
              Hits: 8894  Misses: 10490  Evictions: 0  Overflows: 0  Memory Usage: 820kB
"              ->  Index Scan using ""PK_da96729a8b113377cfb6a62439c"" on assets  (cost=0.29..0.33 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=10490)"
"                    Index Cond: (id = asset_faces.""assetId"")"
"                    Filter: ((""deletedAt"" IS NULL) AND (""ownerId"" = ANY ('{ef718196-74ab-43c8-835b-9852c1fdb389}'::uuid[])))"
                    Rows Removed by Filter: 1
Planning Time: 0.560 ms
Execution Time: 95.925 ms

The query plan when limit = 4:

Limit  (cost=3242.38..3242.39 rows=4 width=40) (actual time=8.470..8.471 rows=1 loops=1)
  ->  Sort  (cost=3242.38..3242.65 rows=105 width=40) (actual time=8.468..8.470 rows=1 loops=1)
        Sort Key: ((face_search.embedding <=> <omitted>::vector))
        Sort Method: quicksort  Memory: 25kB
        ->  Nested Loop  (cost=0.84..3240.81 rows=105 width=40) (actual time=1.331..8.462 rows=1 loops=1)
              ->  Nested Loop  (cost=0.42..3193.07 rows=105 width=32) (actual time=1.284..8.414 rows=1 loops=1)
                    ->  Seq Scan on assets  (cost=0.00..2924.81 rows=20 width=16) (actual time=0.009..8.263 rows=22 loops=1)
"                          Filter: ((""deletedAt"" IS NULL) AND (""ownerId"" = ANY ('{ef718196-74ab-43c8-835b-9852c1fdb389}'::uuid[])))"
                          Rows Removed by Filter: 31783
"                    ->  Index Scan using ""IDX_asset_faces_assetId_personId"" on asset_faces  (cost=0.42..13.31 rows=10 width=48) (actual time=0.006..0.006 rows=0 loops=22)"
"                          Index Cond: (""assetId"" = assets.id)"
              ->  Index Scan using face_search_pkey on face_search  (cost=0.42..0.45 rows=1 width=34) (actual time=0.014..0.014 rows=1 loops=1)
"                    Index Cond: (""faceId"" = asset_faces.id)"
Planning Time: 0.838 ms
Execution Time: 8.507 ms

The second query is significantly faster and returns the single result. This is using 0.4.1; prefilter doesn't seem to make a difference either way.

If I remove the deletedAt constraint, the query becomes fast with limit = 1:

Limit  (cost=21.79..22.32 rows=1 width=40) (actual time=1.477..1.478 rows=1 loops=1)
  ->  Nested Loop  (cost=21.79..89485.68 rows=166496 width=40) (actual time=1.476..1.476 rows=1 loops=1)
        ->  Nested Loop  (cost=21.49..79586.99 rows=166496 width=66) (actual time=1.440..1.441 rows=1 loops=1)
              ->  Index Scan using face_index on face_search  (cost=21.07..3352.63 rows=166496 width=34) (actual time=1.423..1.423 rows=1 loops=1)
                    Order By: (embedding <=> <omitted>::vector)
"              ->  Index Scan using ""PK_6df76ab2eb6f5b57b7c2f1fc684"" on asset_faces  (cost=0.42..0.46 rows=1 width=48) (actual time=0.013..0.013 rows=1 loops=1)"
"                    Index Cond: (id = face_search.""faceId"")"
        ->  Memoize  (cost=0.30..0.34 rows=1 width=16) (actual time=0.012..0.012 rows=1 loops=1)
"              Cache Key: asset_faces.""assetId"""
              Cache Mode: logical
              Hits: 0  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
"              ->  Index Scan using ""PK_da96729a8b113377cfb6a62439c"" on assets  (cost=0.29..0.33 rows=1 width=16) (actual time=0.010..0.010 rows=1 loops=1)"
"                    Index Cond: (id = asset_faces.""assetId"")"
"                    Filter: (""ownerId"" = ANY ('{ef718196-74ab-43c8-835b-9852c1fdb389}'::uuid[]))"
Planning Time: 0.553 ms
Execution Time: 1.534 ms

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0