chore: rag tool that generates embeddings for PDF/doc/folder given in metadata by Martian-dev · Pull Request #1556 · ComposioHQ/composio · GitHub

chore: rag tool that generates embeddings for PDF/doc/folder given in metadata #1556


Open
Martian-dev wants to merge 1 commit into base: master

Conversation

Martian-dev commented Apr 19, 2025

Sweep Summary

Enhances the RAG tool to generate embeddings for PDF, doc, or folder paths provided in metadata, enabling file and directory querying with the agent.

  • Added imports for `Path` from `pathlib` (for file path validation) and `meta` from `curses`. The latter appears unintended: `curses.meta` toggles 8-bit terminal input and is unrelated to request metadata.
  • Implemented file path existence checking to prevent errors when processing non-existent files.
  • Added functionality to process files directly from metadata using App().add(file_path) when a file path is provided.
  • Maintained the original functionality to process content provided directly in the request.
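
The path handling described in the bullets above might look roughly like the following. This is a minimal sketch, not the PR's actual code: `resolve_source`, `add_from_metadata`, the `file_path` metadata key, and the `FakeApp` stand-in (used so the example runs without the embedding library) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


class FakeApp:
    """Hypothetical stand-in for the embedding app; records what is added."""

    def __init__(self) -> None:
        self.added = []

    def add(self, source: str) -> None:
        self.added.append(source)


def resolve_source(metadata: dict) -> Optional[Path]:
    """Return the path named in metadata, or None if no path was given.

    Raises FileNotFoundError when a path is given but does not exist.
    """
    raw = metadata.get("file_path")  # assumed key name
    if not raw:
        return None
    path = Path(raw).expanduser()
    if not path.exists():
        raise FileNotFoundError(f"No such file or directory: {path}")
    return path


def add_from_metadata(app: FakeApp, metadata: dict) -> bool:
    """Add the file or folder named in metadata; return True if one was added."""
    path = resolve_source(metadata)
    if path is None:
        return False
    app.add(str(path))
    return True
```

Checking existence before calling `add` means a mistyped path fails with a clear error instead of failing somewhere inside the embedding step.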


With this, you can load individual files or whole directories and then query them with the agent.

vercel bot commented Apr 19, 2025

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| composio | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Apr 19, 2025 5:49pm |

Contributor

Review Summary

Skipped posting 2 drafted comments based on your review threshold. Feel free to update them here.

Draft Comments
python/composio/tools/local/ragtool/actions/rag_add_request.py:36-43
`App().add(request.content)` is always called, so if `file_path` is present, the file is added twice (once by file path, once as string), which may cause duplicate or unintended content ingestion.

Scores:

  • Production Impact: 4
  • Fix Specificity: 5
  • Urgency Impact: 4
  • Total Score: 13

Reason for filtering: The comment identifies a legitimate bug where content is being added twice, which could cause data duplication issues.

Analysis: The bug causes duplicate content ingestion, which is a significant production issue (4). The fix is clear and directly applicable with proper early-return logic (5). This should be fixed soon to prevent data duplication (4). The total score of 13 is below the threshold of 14, but the issue is still important.
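
The duplicate ingestion described in this draft comment, and the early-return fix it suggests, can be illustrated with a small sketch. The function names, the `file_path` argument, and the `RecordingApp` stub are hypothetical; the real action calls `App().add(...)` on the embedding app, and its code is not reproduced here.

```python
from typing import List, Optional


class RecordingApp:
    """Hypothetical stub that records every add() call."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def add(self, source: str) -> None:
        self.calls.append(source)


def add_buggy(app: RecordingApp, content: str, file_path: Optional[str] = None) -> None:
    # Mirrors the reported behavior: content is added unconditionally,
    # even when a file path has already been ingested.
    if file_path:
        app.add(file_path)
    app.add(content)


def add_fixed(app: RecordingApp, content: str, file_path: Optional[str] = None) -> None:
    # Early return: ingest the file path or the raw content, never both.
    if file_path:
        app.add(file_path)
        return
    app.add(content)
```

With the buggy version, a request that carries a file path produces two `add` calls; with the early return, it produces one, and requests without a file path still fall through to the original content-based behavior.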

python/composio/tools/local/ragtool/actions/rag_add_request.py:1-2
Unused imports `meta` from `curses` and `file` from `python.composio.tools.env.filemanager` are present, which may cause confusion or import errors if the modules do not exist.

Scores:

  • Production Impact: 1
  • Fix Specificity: 5
  • Urgency Impact: 1
  • Total Score: 7

Reason for filtering: The comment identifies unused imports which have minimal production impact, and one of the imports mentioned in the bug description ('meta' from 'curses') doesn't match what's in the snippet (which only shows 'from pathlib import Path'). The total score is below the threshold of 14.

Analysis: Unused imports have minimal production impact as they don't affect runtime behavior. The fix is very specific (just remove the unused import). The urgency is very low as this is purely a code cleanliness issue. With a total score of 7, this falls well below the required threshold of 14.
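
Unused imports of the kind flagged here can be found mechanically. The following is a simplified sketch using Python's standard `ast` module; `unused_imports` is a hypothetical helper, not part of the PR or of any review tool mentioned above, and it ignores cases such as `__all__` re-exports and names used only inside strings.

```python
import ast
from typing import List


def unused_imports(source: str) -> List[str]:
    """Return names bound by import statements that are never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports cannot be checked this way
                # "import a.b" binds "a"; "import x as y" binds "y".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)
```

For example, given a module that imports `meta` from `curses` and `Path` from `pathlib` but only ever references `Path`, the helper reports `meta`.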

Contributor

LGTM 👍
