feat(common): support input/output of new data type `vector(n)` by xiangjinwu · Pull Request #22019 · risingwavelabs/risingwave · GitHub

feat(common): support input/output of new data type vector(n) #22019


Draft
wants to merge 7 commits into
base: feat-common-type-vector-p0-dummy
from

Conversation

xiangjinwu
Contributor
@xiangjinwu xiangjinwu commented May 27, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

The new data type vector(n) is expected to match most of the behavior of vector from pgvector. Compared to existing real[]:

  • The length parameter in the data type is NOT ignored.
  • Elements can NOT be null, nan, inf or -inf.

Unlike pgvector, the length parameter is always required. In pgvector it is optional for simple operations but required when building an index.
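The constraints above can be sketched as a standalone check. This is only an illustration under stated assumptions: `VectorError` and `validate_vector` are hypothetical names and are not taken from the RisingWave codebase.

```rust
/// Reasons a candidate value is rejected (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
pub enum VectorError {
    WrongLength { expected: usize, actual: usize },
    NullElement { index: usize },
    NonFiniteElement { index: usize },
}

/// Checks `elems` against a declared `vector(n)` and returns the plain values.
/// Unlike `real[]`, the declared length is enforced, and null, NaN and
/// infinite elements are rejected.
pub fn validate_vector(n: usize, elems: &[Option<f32>]) -> Result<Vec<f32>, VectorError> {
    if elems.len() != n {
        return Err(VectorError::WrongLength { expected: n, actual: elems.len() });
    }
    let mut out = Vec::with_capacity(n);
    for (index, elem) in elems.iter().enumerate() {
        match elem {
            None => return Err(VectorError::NullElement { index }),
            // `is_finite` is false for NaN, +inf and -inf.
            Some(v) if !v.is_finite() => return Err(VectorError::NonFiniteElement { index }),
            Some(v) => out.push(*v),
        }
    }
    Ok(out)
}
```

For example, `validate_vector(3, &[Some(1.0), Some(2.0)])` is rejected with `WrongLength`, whereas the same elements would be accepted by a `real[]` column that ignores the length parameter.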

This PR fills in some todos left by #21380 to support basic usage:

  • convert to / from protobuf
  • convert to / from pgwire text format and varchar
    • literal in explain output
  • assign pg_type oids
  • convert to / from value encoding
  • parse and bind vector(n) from input SQL
    • convert back to ast
  • column name derivation for select null::vector(3);
  • debug format showing vector literal in explain output
  • generic operations on all data types (TODO: add tests)
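As a rough sketch of the text-format item in the list above, the following round-trips a vector through a pgvector-style literal such as `[1,2.5,-3]`. The function names are hypothetical, and the exact float formatting used by this PR is not shown in the description, so Rust's default `f32` display is assumed here.

```rust
/// Renders a vector as a bracketed, comma-separated literal.
pub fn vector_to_text(elems: &[f32]) -> String {
    let body: Vec<String> = elems.iter().map(|v| v.to_string()).collect();
    format!("[{}]", body.join(","))
}

/// Parses the same form and enforces the declared dimension `n`.
/// Returns a human-readable message on malformed input.
pub fn vector_from_text(s: &str, n: usize) -> Result<Vec<f32>, String> {
    let inner = s
        .trim()
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| format!("vector must be enclosed in brackets: {s:?}"))?;
    let mut out = Vec::new();
    for part in inner.split(',') {
        let v: f32 = part
            .trim()
            .parse()
            .map_err(|e| format!("invalid element {part:?}: {e}"))?;
        // `parse` accepts spellings like "NaN" and "inf", so reject them explicitly.
        if !v.is_finite() {
            return Err(format!("element must be finite: {part:?}"));
        }
        out.push(v);
    }
    if out.len() != n {
        return Err(format!("expected {n} dimensions, got {}", out.len()));
    }
    Ok(out)
}
```

The same pair of functions would serve both the pgwire text format and casts to and from varchar, since both go through a string representation.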

This implementation uses an inner ListArray to mimic the expected semantics. We may consider switching to a more direct implementation that could potentially BREAK existing value encoding (persisted data).
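The wrapping approach might look roughly like the sketch below. `InnerListArray` is a simplified stand-in (flat values plus row offsets), not RisingWave's actual `ListArray`, and `VectorArray` is a hypothetical wrapper that only adds the fixed-dimension and finiteness checks on top of it. Row-level nulls are left out for brevity.

```rust
/// Simplified stand-in for a list-of-f32 array (hypothetical layout).
pub struct InnerListArray {
    offsets: Vec<usize>, // row i spans values[offsets[i]..offsets[i + 1]]
    values: Vec<f32>,
}

impl InnerListArray {
    pub fn new() -> Self {
        Self { offsets: vec![0], values: Vec::new() }
    }
    pub fn push(&mut self, row: &[f32]) {
        self.values.extend_from_slice(row);
        self.offsets.push(self.values.len());
    }
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }
    pub fn get(&self, i: usize) -> &[f32] {
        &self.values[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Wrapper that reuses the list storage but enforces `vector(dim)` invariants.
pub struct VectorArray {
    dim: usize,
    inner: InnerListArray,
}

impl VectorArray {
    pub fn new(dim: usize) -> Self {
        Self { dim, inner: InnerListArray::new() }
    }
    /// Appends a row only if it has exactly `dim` finite elements.
    pub fn push(&mut self, row: &[f32]) -> Result<(), String> {
        if row.len() != self.dim {
            return Err(format!("expected {} dimensions, got {}", self.dim, row.len()));
        }
        if row.iter().any(|v| !v.is_finite()) {
            return Err("elements must be finite".to_string());
        }
        self.inner.push(row);
        Ok(())
    }
    pub fn len(&self) -> usize {
        self.inner.len()
    }
    pub fn get(&self, i: usize) -> &[f32] {
        self.inner.get(i)
    }
}
```

In this sketch the offsets are redundant because every row has the same length. A more direct layout could drop them, which is one plausible way a later implementation might change the stored representation.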

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • I have checked the Release Timeline and Currently Supported Versions to determine which release branches I need to cherry-pick this PR into.

Documentation

  • My PR needs documentation updates.
Release note

@github-actions github-actions bot added the type/feature Type: New feature. label May 27, 2025
- fn new(_capacity: usize) -> Self {
-     todo!("VECTOR_PLACEHOLDER")
+ fn new(capacity: usize) -> Self {
+     Self::with_type(capacity, DataType::Vector(3))

Comment on lines +187 to +191
// TODO(VECTOR_PLACEHOLDER): not in CAST_TABLE because
// * `DataType::try_from(DataTypeName::Vector).unwrap().to_oid()` panics
// * `DataTypeName::Vector.to_oid()` is better but `to_oid` does not work for `DataTypeName::List`
|| matches!((source, target), (DataType::Varchar, DataType::Vector(_)) if CastContext::Explicit <= allows)
|| matches!((source, target), (DataType::Vector(_), DataType::Varchar) if CastContext::Assign <= allows)
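The two guards above compare the minimum context a cast requires against what the caller allows. A self-contained sketch of that pattern follows, assuming `CastContext` is an ordered enum with `Implicit < Assign < Explicit`; the `Ty` enum and `vector_cast_ok` function are hypothetical simplifications for illustration.

```rust
/// Cast contexts ordered from most restrictive to most permissive.
/// The derived `PartialOrd` follows declaration order (assumed layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CastContext {
    Implicit,
    Assign,
    Explicit,
}

/// Minimal stand-in for the data types involved (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    Varchar,
    Vector(usize),
}

/// Returns true if a cast between varchar and vector is permitted in a
/// context where casts up to `allows` are accepted.
pub fn vector_cast_ok(source: Ty, target: Ty, allows: CastContext) -> bool {
    // varchar -> vector needs an explicit cast; vector -> varchar also
    // works in assignment context.
    matches!((source, target), (Ty::Varchar, Ty::Vector(_)) if CastContext::Explicit <= allows)
        || matches!((source, target), (Ty::Vector(_), Ty::Varchar) if CastContext::Assign <= allows)
}
```

Under these assumptions, `vector_cast_ok(Ty::Varchar, Ty::Vector(3), CastContext::Assign)` is false, while the reverse direction with the same context is true.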
