“Compact brains, sharp answers.”
A unified, efficient RAG pipeline optimized for edge & local deployments.
xRAG (eXtreme Retrieval Augmented Generation) is a token-compressed, quantized, and fully local RAG system. It’s engineered for low-latency, on-device question answering and document retrieval—perfect for applications that demand speed, privacy, and portability.
This project draws from cutting-edge RAG and efficient model deployment research:
- xRAG (2024): 1-token RAG with 17x compression, using an MLP bridge to project each retrieved document into a single token.
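For intuition, here is a minimal sketch of that bridge idea, assuming a frozen retriever that emits 768-dimensional document embeddings and an LLM with 4096-dimensional token embeddings. The class name, dimensions, and two-layer MLP shape are illustrative assumptions, not this repo's actual code:

```python
import torch
import torch.nn as nn

class ProjectionBridge(nn.Module):
    """Projects one document embedding from the retriever's space into the
    LLM's token-embedding space, so a whole document costs a single token.
    Dimensions here are assumptions for illustration."""

    def __init__(self, d_retr: int = 768, d_model: int = 4096):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(d_retr, d_model),
            nn.GELU(),
            nn.Linear(d_model, d_model),
        )

    def forward(self, doc_embedding: torch.Tensor) -> torch.Tensor:
        # (batch, d_retr) -> (batch, 1, d_model): one soft "document token"
        return self.mlp(doc_embedding).unsqueeze(1)

# The soft token is prepended to the prompt's token embeddings, so the
# generator sees the retrieved document as one extra sequence position.
bridge = ProjectionBridge()
doc_vec = torch.randn(1, 768)   # placeholder retriever output
doc_token = bridge(doc_vec)     # shape: (1, 1, 4096)
```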
Most RAG pipelines:
- Are too bulky for local use
- Depend on cloud services
- Struggle with low-latency, high-efficiency deployment
xRAG addresses these problems by running fully locally, compressing retrieved context, and minimizing memory & compute overhead.
Key features:
- ✅ Fully Local: Works offline, on-device, no cloud required.
- 🔻 Token Compression: Fewer tokens, faster inference.
- ⚡ Quantized Models: Lightweight, edge-ready transformers (see the 4-bit loading sketch below).
- 🧩 Modular Components: Swap retrievers, bridges, and generators (see the pipeline sketch after this list).
- 📚 Document-Aware: Pulls relevant context before answering.
- 🧠 Context Memory: Maintains logical conversation threads.
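As a rough illustration of the quantized-models feature, this is one way a 4-bit generator could be loaded with Hugging Face transformers and bitsandbytes. The model name is a placeholder, and the actual loading path used by xRAG may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization keeps the generator small enough for edge hardware.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=quant_config
)
```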
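And a minimal sketch of what the modular, document-aware flow could look like. The `Retriever`/`Generator` protocols, class names, and prompt format are assumptions for illustration, not this project's actual interfaces:

```python
from typing import Protocol

class Retriever(Protocol):
    def retrieve(self, query: str, k: int) -> list[str]: ...

class Generator(Protocol):
    def generate(self, prompt: str) -> str: ...

class RAGPipeline:
    """Document-aware QA: retrieve relevant context first, then answer.
    Retriever and generator are injected, so either can be swapped."""

    def __init__(self, retriever: Retriever, generator: Generator):
        self.retriever = retriever
        self.generator = generator
        self.history: list[str] = []  # lightweight context memory

    def ask(self, question: str, k: int = 3) -> str:
        docs = self.retriever.retrieve(question, k)
        prompt = "\n".join(self.history + docs + [f"Question: {question}"])
        answer = self.generator.generate(prompt)
        self.history.append(f"Q: {question}\nA: {answer}")
        return answer
```

Because both components sit behind small interfaces, swapping a dense retriever for BM25, or one quantized generator for another, leaves the pipeline logic untouched.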
Example use cases:
- 🧑‍🏫 AI tutors (offline, classroom-ready)
- 🔐 Private assistants (no cloud dependencies)
- 🧭 Search across personal documents
- 🧠 Lightweight research copilots