feat: WIP - Integrate Surrealdb as VectorDB by dsaad68 · Pull Request #3235 · agno-agi/agno · GitHub

feat: WIP - Integrate Surrealdb as VectorDB #3235


Open: dsaad68 wants to merge 40 commits into base: main

Conversation

dsaad68
Contributor
@dsaad68 dsaad68 commented May 18, 2025

🚀 SurrealDB Vector Database Integration

🔍 Overview

This PR adds support for SurrealDB as a vector database backend for Agno. SurrealDB is a scalable, distributed multi-model database with native vector embedding support and HNSW indexes for vector search, which makes it well suited to RAG applications.

🌟 About SurrealDB

SurrealDB is "the ultimate multi-model database for AI applications" that unifies vectors, graphs, documents, time-series and files in a single platform.

✨ Features

  • Added SurrealVectorDb implementation supporting both synchronous and asynchronous operations
  • Implemented all standard vector database operations including:
    • Document insertion/upsert
    • Vector similarity search with filtering
    • Collection management (create, drop, exists)
  • Added support for HNSW index configuration for performance tuning
  • Added both sync and async demo files showing integration with PDFUrlKnowledgeBase
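To make "vector similarity search with filtering" concrete, here is a minimal brute-force sketch in plain Python. It is not the PR's implementation: an HNSW index returns approximately the same ranking without scanning every record. The record layout (content, embedding, meta_data) mirrors the field names used in this PR's schema; the `search` function and its parameters are assumptions made for illustration.

```python
import math
from typing import Any, Optional


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def search(
    records: list[dict[str, Any]],
    query: list[float],
    limit: int = 5,
    filters: Optional[dict[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Hypothetical helper: filter on meta_data first, then rank by distance."""
    filters = filters or {}
    candidates = [
        r for r in records
        if all(r["meta_data"].get(key) == value for key, value in filters.items())
    ]
    # Smallest cosine distance first, i.e. most similar first.
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query))
    return candidates[:limit]


records = [
    {"content": "a", "embedding": [1.0, 0.0], "meta_data": {"lang": "en"}},
    {"content": "b", "embedding": [0.0, 1.0], "meta_data": {"lang": "en"}},
    {"content": "c", "embedding": [0.9, 0.1], "meta_data": {"lang": "de"}},
]

if __name__ == "__main__":
    for hit in search(records, [1.0, 0.0], limit=2, filters={"lang": "en"}):
        print(hit["content"])
```

Filtering before ranking keeps the sketch simple; a real index may combine the two steps differently.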

🛠️ Improvements

  • 🔌 Extended the AgentKnowledge async capabilities with async_filter_existing_documents
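The PR does not show how async_filter_existing_documents is implemented. As a rough sketch of the idea only, the snippet below checks all documents concurrently and keeps those not yet stored. The `Document` and `FakeVectorDb` classes, the `async_doc_exists` method, and the `filter_new_documents` function are all hypothetical stand-ins, not Agno's actual API.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical minimal document type for this sketch."""
    id: str
    content: str


class FakeVectorDb:
    """Hypothetical in-memory stand-in for an async vector database client."""

    def __init__(self, stored_ids: list[str]) -> None:
        self._stored_ids = set(stored_ids)

    async def async_doc_exists(self, document: Document) -> bool:
        await asyncio.sleep(0)  # Yield control, as a real network call would.
        return document.id in self._stored_ids


async def filter_new_documents(
    vector_db: FakeVectorDb, documents: list[Document]
) -> list[Document]:
    """Run the existence checks concurrently and keep unseen documents."""
    exists_flags = await asyncio.gather(
        *(vector_db.async_doc_exists(doc) for doc in documents)
    )
    return [doc for doc, exists in zip(documents, exists_flags) if not exists]


if __name__ == "__main__":
    db = FakeVectorDb(stored_ids=["a"])
    docs = [Document("a", "already stored"), Document("b", "new")]
    print(asyncio.run(filter_new_documents(db, docs)))
```

`asyncio.gather` preserves input order, so the result keeps the original document order.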

📝 Usage Example

from agno.agent import Agent
from agno.knowledge.pdf_url import PDFUrlKnowledgeBase
from agno.vectordb.surrealdb import SurrealVectorDb

# Initialize SurrealDB vector database
surrealdb = SurrealVectorDb(
    url="ws://localhost:8000",
    username="root",
    password="root",
    namespace="test",
    database="test",
    collection="documents",
    efc=150,  # HNSW construction parameter
    m=12,     # HNSW connections parameter
    search_ef=40  # HNSW search parameter
)

# Use with any knowledge base implementation
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://example.com/document.pdf"],
    vector_db=surrealdb
)

# Load documents into the vector database
knowledge_base.load(recreate=True)

# Create agent and query synchronously
agent = Agent(knowledge=knowledge_base, show_tool_calls=True, debug_mode=True)
agent.print_response("What are the 3 categories of Thai SELECT is given to restaurants overseas?", markdown=True)

🔗 References

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Improvement
  • Model update
  • Other:

Checklist

  • Code complies with style guidelines
  • Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
  • Self-review completed
  • Documentation updated (comments, docstrings)
  • Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
  • Tested in clean environment
  • Tests added/updated (if applicable)

Additional Notes

Add any important context (deployment instructions, screenshots, security considerations, etc.)

Contributor
@kausmeows kausmeows left a comment


Hey @dsaad68, thanks a lot for this contribution. Small comment.

Also make sure to fix the failing style and validation test checks...

@dsaad68
Contributor Author
dsaad68 commented May 20, 2025

I have added scripts to the cookbook and done the formatting.

@dsaad68
Contributor Author
dsaad68 commented May 20, 2025

@kausmeows could you please check if I added surrealdb correctly to pyproject.toml?

@kausmeows
Contributor

Yes, that looks fine. But @dsaad68, I was testing this and ran the cookbook script cookbook/agent_concepts/knowledge/vector_dbs/surrealdb/surrealdb.py, and it did not work for me. See below:

[image: screenshot of the error]

I ran the Docker command mentioned. Am I missing anything?

@dsaad68
Contributor Author
dsaad68 commented May 21, 2025

I have fixed this problem; the issue was with the version of the SurrealDB package. Can you run it again?
Also, I have changed the name as you requested.

@kausmeows
Contributor

@dsaad68 not sure if I'm doing something wrong, but I'm still getting the same error as in the screenshot above.
I also ran pip install -U surrealdb to upgrade to the latest version, and also tried version 1.0.3.
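When debugging a package-version problem like this one, it can help to confirm which version the running interpreter actually sees, since pip may have installed into a different environment. A small sketch using only the standard library; the `installed_version` helper is a name made up for this example:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None if surrealdb is not installed
    # in the environment this interpreter is running from.
    print("surrealdb:", installed_version("surrealdb"))
```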

@dsaad68
Contributor Author
dsaad68 commented May 22, 2025

Sorry for the trouble. I will look into it.

In the meantime, as a somewhat unconventional workaround, could you please move the script to another folder, such as the root directory, and test it there?

@dsaad68
Contributor Author
dsaad68 commented May 22, 2025

I have found the problem and will fix it.

@dsaad68
Contributor Author
dsaad68 commented May 26, 2025

@kausmeows I have fixed the problem. It should work now.

@dsaad68 dsaad68 marked this pull request as ready for review May 26, 2025 14:08
@dsaad68 dsaad68 requested a review from a team as a code owner May 26, 2025 14:08
DEFINE FIELD IF NOT EXISTS content ON {collection} TYPE string;
DEFINE FIELD IF NOT EXISTS embedding ON {collection} TYPE array<float>;
DEFINE FIELD IF NOT EXISTS meta_data ON {collection} TYPE object;
DEFINE INDEX IF NOT EXISTS vector_idx ON {collection} FIELDS embedding HNSW DIMENSION {dimensions} DIST COSINE;

This would use the following defaults. It would be nice to make them explicit, or add a link to the docs about vector indexes

TYPE F64  -- other options e.g. F32, I64, ...
M 12  -- Max Connections per Element
EFC 150  -- Exploration factor during construction

Also, there's the M-Tree index type, which may work for small data sets and dimensions. What do you think about making this customizable instead of following a convention of using only HNSW?

For example, this would be an m-tree index:

DEFINE INDEX IF NOT EXISTS vector_idx ON {collection} FIELDS embedding MTREE DIMENSION {dimensions} DIST COSINE TYPE F64;
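One way the suggestion above could look in Python is a small builder that renders the index statement with the defaults spelled out and the index type selectable. This is a sketch under assumptions, not the PR's code: the function name, parameter names, and allowed-value sets are made up for illustration, and the clause order follows the M-Tree example above with EFC and M appended for HNSW.

```python
import re

# Table names are interpolated into the statement, so restrict them to
# plain identifiers rather than trusting arbitrary input.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_INDEX_TYPES = {"HNSW", "MTREE"}


def build_vector_index_query(
    collection: str,
    dimensions: int,
    index_type: str = "HNSW",
    distance: str = "COSINE",
    vector_type: str = "F64",
    m: int = 12,
    efc: int = 150,
) -> str:
    """Render a DEFINE INDEX statement with explicit parameters (hypothetical helper)."""
    if not _IDENTIFIER.match(collection):
        raise ValueError(f"invalid collection name: {collection!r}")
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    index_type = index_type.upper()
    if index_type not in _INDEX_TYPES:
        raise ValueError(f"unsupported index type: {index_type!r}")
    if not _IDENTIFIER.match(distance) or not _IDENTIFIER.match(vector_type):
        raise ValueError("distance and vector_type must be plain keywords")

    query = (
        f"DEFINE INDEX IF NOT EXISTS vector_idx ON {collection} "
        f"FIELDS embedding {index_type} DIMENSION {dimensions} "
        f"DIST {distance} TYPE {vector_type}"
    )
    if index_type == "HNSW":
        # EFC and M apply only to HNSW; the values here are the defaults
        # quoted in the review comment.
        query += f" EFC {efc} M {m}"
    return query + ";"


if __name__ == "__main__":
    print(build_vector_index_query("documents", 1536))
    print(build_vector_index_query("documents", 1536, index_type="MTREE"))
```

Validating the collection name matters here because the statement is built by string formatting rather than bound parameters.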

Contributor Author

I will take care of it later.

@martinschaer martinschaer left a comment

The current UPSERT_QUERY does not update documents because doc.id is not used in the query.
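The PR's actual UPSERT_QUERY is not shown in this conversation, so the following is only a sketch of one possible fix: address the record by its id so that a second write to the same id updates the record instead of creating a new one. It assumes a SurrealDB version where the UPSERT statement and the type::thing function are available; the `build_upsert` helper and its parameter names are hypothetical.

```python
from typing import Any, Optional

# Targeting type::thing($collection, $id) ties the write to one record id.
# (Assumed query shape; not taken from the PR.)
UPSERT_QUERY = (
    "UPSERT type::thing($collection, $id) "
    "SET content = $content, embedding = $embedding, meta_data = $meta_data;"
)


def build_upsert(
    collection: str,
    doc_id: str,
    content: str,
    embedding: list[float],
    meta_data: Optional[dict[str, Any]] = None,
) -> tuple[str, dict[str, Any]]:
    """Return the query and its bound variables for one document."""
    if not doc_id:
        raise ValueError("doc_id is required so the upsert can target a record")
    variables = {
        "collection": collection,
        "id": doc_id,
        "content": content,
        "embedding": embedding,
        "meta_data": meta_data or {},
    }
    return UPSERT_QUERY, variables


if __name__ == "__main__":
    query, variables = build_upsert("documents", "doc-1", "hello", [0.1, 0.2])
    print(query)
    print(variables)
```

The returned pair would then be passed to whatever query method the database client exposes; binding values as variables avoids interpolating document content into the statement.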

@kausmeows
Contributor

Hi @martinschaer appreciate the reviews here. Thanks a lot for jumping in! 🚀

@dsaad68 let me know when I can do a final test, no rush!

3 participants