feat: Qdrant hybrid search by Anush008 · Pull Request #2787 · agno-agi/agno · GitHub

feat: Qdrant hybrid search #2787


Open: Anush008 wants to merge 16 commits into base: main

Anush008
@Anush008 Anush008 commented Apr 11, 2025

Summary

This PR adds support for hybrid searches when using Qdrant as the vector store.

qdrant/fastembed's BM25 is used for sparse embeddings by default (customizable).

Qdrant's hybrid search reference: https://qdrant.tech/documentation/concepts/hybrid-queries/#hybrid-search
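
For readers new to hybrid search, here is a minimal pure-Python sketch of reciprocal rank fusion (RRF), one of the fusion methods Qdrant's hybrid queries offer for combining dense and sparse result lists. It does not call Qdrant or reproduce this PR's code. The function name, the sample document IDs, and the constant k=60 are assumptions for illustration only, and Qdrant's internal scoring may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed constant for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from dense embeddings, one from sparse BM25.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the effect hybrid search is after.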

Type of change

  • New feature

Signed-off-by: Anush008 <anushshetty90@gmail.com>
Signed-off-by: Anush008 <anushshetty90@gmail.com>
@Anush008 Anush008 requested a review from a team as a code owner April 11, 2025 18:03
@Anush008 Anush008 changed the title Feat qdrant hybrid search feat: Qdrant hybrid search Apr 11, 2025
@kausmeows
Contributor
kausmeows commented Apr 12, 2025

Hey @Anush008, thanks a lot for this contribution. It was recently added in the community wishlist and I was about to get to it 😅. Really appreciate it.

I'll test it from my side soon and get back!!

Signed-off-by: Anush008 <anushshetty90@gmail.com>
@Anush008 Anush008 force-pushed the feat--Qdrant-Hybrid-search branch from a6bbc1d to 0bf5bb8 Compare April 13, 2025 05:20
Contributor
@kausmeows kausmeows left a comment

A few comments @Anush008

For reference on the design we follow, take a quick look at the LanceDB implementation: https://github.com/agno-agi/agno/blob/main/libs/agno/agno/vectordb/lancedb/lance_db.py

  • We ideally would want to keep keyword_search, vector_search and hybrid_search as their own functions based on the search type.
  • It'll be great if you could default to this approach and implement them also? Only if you get the time, else we're fine with hybrid_search for now :)
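
The layout suggested above (one method per search type, selected by the configured type) could look roughly like the following sketch. The class `SketchVectorDb`, its stub return values, and the local `SearchType` enum are hypothetical stand-ins so the example runs on its own; they are not agno's actual classes.

```python
from enum import Enum


class SearchType(str, Enum):
    # Local stand-in mirroring the members referenced in this thread.
    vector = "vector"
    keyword = "keyword"
    hybrid = "hybrid"


class SketchVectorDb:
    """Hypothetical vector DB wrapper with one method per search type."""

    def __init__(self, search_type=SearchType.vector):
        self.search_type = search_type

    def vector_search(self, query, limit, filters=None):
        return [f"vector:{query}"][:limit]  # stub result

    def keyword_search(self, query, limit, filters=None):
        return [f"keyword:{query}"][:limit]  # stub result

    def hybrid_search(self, query, limit, filters=None):
        return [f"hybrid:{query}"][:limit]  # stub result

    def search(self, query, limit=5, filters=None):
        # Map each search type to its own method, then dispatch.
        handlers = {
            SearchType.vector: self.vector_search,
            SearchType.keyword: self.keyword_search,
            SearchType.hybrid: self.hybrid_search,
        }
        handler = handlers.get(self.search_type)
        if handler is None:
            raise ValueError(f"Unsupported search type: {self.search_type}")
        return handler(query, limit, filters)


db = SketchVectorDb(search_type=SearchType.hybrid)
print(db.search("massaman"))
```

Keeping the three methods separate means each can be tested and called directly, while `search` stays a thin dispatcher.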

Signed-off-by: Anush008 <anushshetty90@gmail.com>
…_hybrid_search()

Signed-off-by: Anush008 <anushshetty90@gmail.com>
Signed-off-by: Anush008 <anushshetty90@gmail.com>
Signed-off-by: Anush008 <anushshetty90@gmail.com>
@Anush008
Author

Hey @kausmeows. I incorporated your suggestions.

@Anush008 Anush008 requested a review from kausmeows April 14, 2025 12:54
@Anush008
Author

Hey guys. Just bumping this PR. Please take a look when possible.

@KevinZhang19870314

Any updates?
I am trying to run hybrid search on Windows, and it throws an error: DLL load error: DLL load failed while importing onnxruntime_pybind11_state: A dynamic link library (DLL) initialization routine failed.

@kausmeows
Contributor

Hey @Anush008, sorry about the delay. We were working on a major feature, #3005, which adds knowledge-based filtering (manual + agentic).

There have been some major changes in the Qdrant db file which I think will conflict here. The next step is hybrid search. If you'd like to resolve the conflicts and take this to the finish line, please let me know.

Otherwise I can take over this PR and build on top of it. Thanks a lot for this amazing contribution!

…earch

Signed-off-by: Anush008 <anushshetty90@gmail.com>
@Anush008
Author

Hello @kausmeows.
The conflicts are resolved now.

Signed-off-by: Anush008 <anushshetty90@gmail.com>
@Anush008 Anush008 force-pushed the feat--Qdrant-Hybrid-search branch from e31a2c4 to adefab2 Compare May 14, 2025 03:53
@kausmeows
Contributor
kausmeows commented May 14, 2025

@Anush008 hybrid search is not working??
I ran this example-

NOTE: better to add this in the cookbooks in the path- cookbook/agent_concepts/knowledge/vector_dbs/qdrant_db/qdrant_db_hybrid_search.py

from agno.agent import Agent
from agno.knowledge.pdf_url import PDFUrlKnowledgeBase
from agno.vectordb.qdrant import Qdrant, SearchType

COLLECTION_NAME = "thai-recipes"

vector_db = Qdrant(collection=COLLECTION_NAME, url="http://localhost:6333", search_type=SearchType.hybrid)

knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    vector_db=vector_db,
)

knowledge_base.load(recreate=False)  # Comment out after first run

# Create and use the agent
agent = Agent(knowledge=knowledge_base, show_tool_calls=True)
agent.print_response(
    "List down the ingredients to make Massaman Gai", markdown=True)

Also change the libs/agno/agno/vectordb/qdrant/__init__.py

from agno.vectordb.qdrant.qdrant import Qdrant
from agno.vectordb.search import SearchType

__all__ = [
    "Qdrant", "SearchType"
]


key = f"meta_data.{key}"
filters = self._format_filters(filters)
if self.search_type == SearchType.vector:
    results = self._run_vector_search_sync(query, limit, filters)
Contributor

Also, let's rename the functions here:

        # sync search
        if self.search_type == SearchType.vector:
            results = self.vector_search(query, limit, filters)
        elif self.search_type == SearchType.keyword:
            results = self.keyword_search(query, limit, filters)
        elif self.search_type == SearchType.hybrid:
            results = self.hybrid_search(query, limit, filters)

        # async search
        if self.search_type == SearchType.vector:
            results = await self.async_vector_search(query, limit, filters)
        elif self.search_type == SearchType.keyword:
            results = await self.async_keyword_search(query, limit, filters)
        elif self.search_type == SearchType.hybrid:
            results = await self.async_hybrid_search(query, limit, filters)

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.yfinance import YFinanceTools
from langtrace_python_sdk import langtrace # Must precede other imports
Contributor

It has to be before the agno imports, right?

@kausmeows
Contributor

@Anush008 I also just noticed something: technically this is a breaking change, because it requires a new vector layout, so existing collections would have to be recreated. My existing Qdrant collection didn't store embeddings as dense: <vector> but rather as just the vector.
It works if I recreate, but users probably won't want to do that for no reason. So if the breaking change only applies when hybrid search is turned on, then it is OK.

What do you think, can we do that?

@Anush008
Author

> because my existing qdrant didn't store embeddings as dense: <vector> but rather just the vector.

Yeah. It defaults to an unnamed dense vector. But now, we're using 2 named vectors for dense and sparse.

> It works if I recreate, but perhaps users won't want to do that for no reason. So if the breaking change could only happen IF hybrid search is turned on, then it is ok.

This can be made to work, but the code becomes riddled with conditionals and will be hard to maintain.

@kausmeows
Contributor
kausmeows commented May 14, 2025

Hmm, I see, but how big a refactor would that be? I think a few conditionals won't hurt as long as they don't force users to do something they will surely be very uncomfortable doing.

@dirkbrnd @manuhortet how do you feel about this?
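
To make the trade-off under discussion concrete, here is a minimal sketch of the kind of conditional involved: keep the bare (unnamed) dense vector unless hybrid search is enabled, and only then switch to named dense and sparse vectors. The helper `build_point_vector`, the name "sparse", and the plain-dict sparse representation are assumptions for illustration; this is not the PR's code and it does not call qdrant-client.

```python
DENSE_NAME = "dense"    # named-vector key mentioned in this thread
SPARSE_NAME = "sparse"  # assumed key for the sparse vector


def build_point_vector(dense, sparse=None, hybrid=False):
    """Return the vector payload for a single point.

    hybrid=False: the bare dense vector, matching existing collections
    that use an unnamed vector.
    hybrid=True: a dict of named vectors (dense plus sparse).
    """
    if not hybrid:
        return list(dense)
    if sparse is None:
        raise ValueError("hybrid search needs a sparse vector as well")
    return {DENSE_NAME: list(dense), SPARSE_NAME: sparse}


# Existing (non-hybrid) layout stays untouched.
legacy = build_point_vector([0.1, 0.2])

# Hybrid layout; the sparse vector is shown as indices/values for illustration.
named = build_point_vector(
    [0.1, 0.2],
    sparse={"indices": [3], "values": [1.5]},
    hybrid=True,
)
print(legacy, named)
```

The same switch would have to appear wherever the collection is created, written to, and queried, which is the maintenance cost raised above.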

5 participants