Releases: DonTizi/rlama
Release v0.1.29
This pull request introduces a new directory watching feature for RAG systems in the rlama project. The key changes include updates to the `README.md` file to document the new commands, modifications to the command files to handle directory watching, and the addition of new services and methods to support the feature.

Documentation Updates:
- `README.md`: Documented the new directory watching commands.

Command Additions and Modifications:
- `cmd/root.go`: Imported the new `service` package and added a function to start the watcher daemon.
- `cmd/run.go`: Added a call to `checkWatchedDirectory` to check for new files before querying the RAG system.
- `cmd/watch.go`: Created new commands `watch`, `watch-off`, and `check-watched` to manage directory watching for RAG systems.

Service and Domain Changes:
- `internal/domain/rag.go`: Added fields to `RagSystem` to store directory watching settings and created a new `DocumentWatchOptions` struct.
- `internal/service/file_watcher.go`: Implemented the `FileWatcher` service to handle directory watching, checking for new files, and updating RAG systems.
- `internal/service/rag_service.go`: Added methods to `RagService` to set up, disable, and check directory watching for RAG systems.
Release v0.1.28
Merge pull request #35 from DonTizi/feature/auto-pull-snowflake-embed…
Release v0.1.27
This pull request includes several changes to improve the functionality and robustness of the `DocumentLoader` and `HNSWStore` classes. The most important changes include the addition of new methods for extracting content from various file types and enhancements to the cosine similarity computation to handle edge cases.

Improvements to DocumentLoader:
- Added a new method `extractCSVContent` to extract content from CSV files, including handling headers and rows.
- Added a new method `extractExcelContent` to extract content from Excel files using either the `xlsx2csv` command-line tool or a Python script as a fallback.
- Added a new method `extractContent` to determine the file type and call the appropriate extraction method based on the file extension.

Enhancements to HNSWStore:
- Enhanced the cosine similarity computation to handle edge cases.
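The cosine similarity edge cases mentioned above typically involve zero vectors and mismatched dimensions. A minimal sketch of such a guarded computation, assuming those are the cases handled (this is not the project's exact code):

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It guards against common edge cases: vectors of different lengths,
// empty vectors, and zero vectors all yield 0 instead of NaN or a panic.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0 // zero vector: similarity is undefined, treat as no match
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float64{1, 0}, []float64{1, 0})) // prints 1
	fmt.Println(cosineSimilarity([]float64{0, 0}, []float64{1, 2})) // prints 0
}
```

Returning 0 for degenerate inputs keeps a similarity search well-behaved: such vectors simply never rank as relevant.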
Release v0.1.26
Hybrid Store Integration:
- Replaced `VectorStore` with `HybridStore` in `RagSystem` to support combined vector and text search using the new `EnhancedHybridStore` class. (`internal/domain/rag.go`, `internal/repository/rag_repository.go`, `internal/service/rag_service.go`)
- Implemented the `EnhancedHybridStore` class, which combines HNSW vector search and BM25 text search, including methods for adding documents, removing documents, and performing hybrid searches. (`pkg/vector/hybrid_store.go`)

Metadata Handling:
- Added a `Metadata` field to the `Document` struct and updated related methods to handle the new field. (`internal/domain/document.go`)

Embedding Cache:
- Introduced the `EmbeddingCache` class to cache embeddings and avoid regenerating them for identical content, including methods for adding, retrieving, and cleaning up cached embeddings. (`internal/service/embedding_cache.go`)

Codebase Enhancements:
- Updated the Go module version and added several indirect dependencies in `go.mod` to support the new functionality.
- Simplified `main.go` by removing error handling for the root command execution.

New Vector Store Implementation:
- Added the `HNSWStore` class as a simpler approximation of the HNSW algorithm for vector storage and search, including methods for adding, removing, and searching vectors. (`pkg/vector/hnsw_vector_store.go`)
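A hybrid store of the kind described combines two rankings, typically by weighting a vector-similarity score against a BM25 text score. The following is a minimal sketch of such score fusion; the weight, function names, and normalization assumptions are illustrative, not the `EnhancedHybridStore` implementation:

```go
package main

import "fmt"

// hybridScore fuses a vector similarity score and a BM25 text score with
// a weight alpha in [0,1]: alpha=1 is pure vector search, alpha=0 pure
// text search. Both inputs are assumed normalized to [0,1].
func hybridScore(vectorScore, bm25Score, alpha float64) float64 {
	return alpha*vectorScore + (1-alpha)*bm25Score
}

// rankHybrid merges per-document scores from both searches and returns
// a combined score for every document ID seen by either search.
func rankHybrid(vec, bm25 map[string]float64, alpha float64) map[string]float64 {
	out := make(map[string]float64)
	for id, s := range vec {
		out[id] = hybridScore(s, bm25[id], alpha) // a missing BM25 score counts as 0
	}
	for id, s := range bm25 {
		if _, seen := vec[id]; !seen {
			out[id] = hybridScore(0, s, alpha)
		}
	}
	return out
}

func main() {
	vec := map[string]float64{"doc1": 0.9, "doc2": 0.2}
	bm25 := map[string]float64{"doc2": 0.8, "doc3": 0.5}
	fmt.Printf("%.2f\n", rankHybrid(vec, bm25, 0.7)["doc2"]) // prints 0.38
}
```

The appeal of this design is that BM25 catches exact keyword matches that embeddings miss, while the vector side catches paraphrases that keyword search misses.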
Release v0.1.25
This pull request includes several enhancements and new features for the RLAMA project, focusing on improving documentation, adding new commands, and refining the document processing capabilities. The most important changes include updates to the README file, new command implementations, and enhancements to the document loader service.
Documentation Updates:
- Added new sections and commands to the `README.md` to provide detailed usage instructions for listing documents, inspecting document chunks, viewing chunk details, adding documents, and updating models.

New Commands:
- Implemented the `list-chunks` command to inspect document chunks in a RAG system with filtering options.
- Enhanced the `add-docs` command to include options for excluding directories and file extensions, and for processing specific file extensions.
- Updated the `run` command to include a `--context-size` parameter for retrieving a specified number of context chunks.

Document Loader Enhancements:
- Added support for additional file formats such as `.org`, `.cxx`, `.ts`, `.f`, `.F`, `.F90`, `.el`, and `.svelte`.
- Introduced `DocumentLoaderOptions` to filter documents during loading based on directories, file extensions, and chunking parameters.

Codebase Improvements:
- Refactored the `add-docs` command to use the new `AddDocsWithOptions` method, simplifying the document loading and chunking process.
- Modified the `list-docs` command to display the document path instead of the name for better clarity.

These changes collectively enhance the functionality and usability of the RLAMA tool, providing users with more control and flexibility in managing their RAG systems.
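The filtering side of `DocumentLoaderOptions` can be sketched along these lines; the field names and matching rules here are assumptions based on the options described, not the exact struct:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// DocumentLoaderOptions mirrors the described filtering options:
// directories to skip, extensions to skip, and extensions to keep.
// Field names are illustrative.
type DocumentLoaderOptions struct {
	ExcludeDirs []string
	ExcludeExts []string
	ProcessExts []string // if non-empty, only these extensions are loaded
}

// shouldLoad reports whether a file path passes all of the filters.
func shouldLoad(path string, opts DocumentLoaderOptions) bool {
	slashed := filepath.ToSlash(path)
	for _, dir := range opts.ExcludeDirs {
		if strings.Contains(slashed, "/"+dir+"/") || strings.HasPrefix(slashed, dir+"/") {
			return false
		}
	}
	ext := filepath.Ext(path)
	for _, e := range opts.ExcludeExts {
		if ext == e {
			return false
		}
	}
	if len(opts.ProcessExts) > 0 {
		for _, e := range opts.ProcessExts {
			if ext == e {
				return true
			}
		}
		return false
	}
	return true
}

func main() {
	opts := DocumentLoaderOptions{
		ExcludeDirs: []string{"node_modules"},
		ProcessExts: []string{".md", ".go"},
	}
	fmt.Println(shouldLoad("docs/guide.md", opts))              // prints true
	fmt.Println(shouldLoad("node_modules/pkg/readme.md", opts)) // prints false
}
```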
Release v0.1.24
Added Windows integration.
Release v0.1.23
Added a chunking service. Larger documents can now be ingested and split into chunks. To ensure accurate context for larger documents, the system checks the 20 best retrieved chunks to answer the query.
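A chunking service of this kind typically splits text into fixed-size, possibly overlapping pieces so that context spanning a boundary is not lost. A minimal sketch, with illustrative sizes rather than rlama's actual defaults:

```go
package main

import "fmt"

// splitIntoChunks splits text into chunks of at most chunkSize runes,
// repeating overlap runes between consecutive chunks so that a sentence
// cut at a boundary still appears whole in one of the chunks.
func splitIntoChunks(text string, chunkSize, overlap int) []string {
	runes := []rune(text)
	step := chunkSize - overlap
	if step <= 0 {
		step = chunkSize
	}
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}

func main() {
	fmt.Println(splitIntoChunks("abcdefghij", 4, 1)) // prints [abcd defg ghij]
}
```

At query time, each chunk gets its own embedding, and the top-scoring chunks (the 20 best, per the note above) are passed to the model as context.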
Release v0.1.22
Update to version 0.1.22
Release v0.1.21
Add fallback for the embedding model when bge-m3 is unavailable
Problem Solved
Previously, the application would fail completely when the bge-m3 embedding model was not installed, displaying a blocking error to the user. This could occur in two cases:
- When Ollama was not accessible.
- When the bge-m3 model was not installed.
Implemented Solution
This PR introduces a fallback mechanism for embeddings:
- The system first tries to use the specialized bge-m3 model (optimal for embeddings).
- If that fails, it automatically falls back to the LLM model specified for the RAG.
- An informational message is displayed explaining how to improve performance (by installing bge-m3).
Benefits
- Better user experience: Users can create and use RAG even if bge-m3 is not pre-installed.
- No blocking issues: The process continues with a viable alternative.
- Guidance: Users receive clear instructions on how to enhance performance.
Tests Performed
- Tested scenarios with:
- bge-m3 installed (works as expected).
- bge-m3 not installed (falls back to the specified model).
- Ollama not accessible (appropriate error is now displayed).
This improvement makes RLAMA more robust and user-friendly, especially for new users who may not be aware of the recommended embedding models.
Release v0.1.2
Features Added
Model Update
- Ability to change the Ollama model used by an existing RAG system
Document Management
- Add documents to existing RAG systems
- Remove specific documents from RAG systems
- List all documents in a RAG with details
Size Reporting
- Show the total size of documents in each RAG system
Commands Added
- `rlama update-model [rag-name] [new-model]`: Change the model used by a RAG
- `rlama add-docs [rag-name] [folder-path]`: Add documents to an existing RAG
- `rlama remove-doc [rag-name] [doc-id]`: Remove a specific document from a RAG
- `rlama list-docs [rag-name]`: List all documents in a RAG with details