Spring AI RAG Tutorial - Lecture 5: Retrieval-Augmented Generation

🎓 Part of Spring AI Mastery Course: Building Intelligent Applications with OpenAI

📚 About This Project

This is an educational project accompanying Lecture 5 of the Spring AI Mastery course. It demonstrates how to build a production-ready Retrieval-Augmented Generation (RAG) system using Spring Boot and Spring AI.

🎯 Learning Objectives

By working through this project, you will:

  • Understand why RAG is essential for overcoming LLM limitations
  • Build a complete document Q&A system from scratch
  • Master vector stores, embeddings, and semantic search
  • Compare different RAG implementation approaches
  • Apply best practices for production RAG systems

🌳 Branch-Based Learning Structure

This project uses a progressive branch structure where each branch builds upon the previous one. Start from the beginning and work your way up!

Branch Progression

| Branch | Topic | What You'll Learn |
|--------|-------|-------------------|
| 01-rag-introduction | RAG Foundations | Project setup with Spring AI • Understanding RAG concepts • Basic configuration |
| 02-vector-store-setup | Vector Store Setup | PostgreSQL with PGVector • Vector store configuration • Embedding preparation |
| 03-document-ingestion-and-splitting | ETL Pipeline | Document loading strategies • Text chunking with TokenTextSplitter • Duplicate prevention |
| 04-add-controller-and-service | Core Architecture | Service layer pattern • REST endpoints • Prompt templates |
| 05-query-vector-store | RAG Implementation | Similarity search • Context injection • Comparing LLM vs RAG responses |
| 06-embedding-and-prompt-clarification | Production Features | Admin controls • Vector store management • Prompt optimization |
| 07-qa-service-with-questionansweradvisor | Spring AI Advisors | QuestionAnswerAdvisor pattern (see the sketch below the table) • Simplified RAG flow • ChatClient configuration |
| 08-retrieval-augmentation-advance | 🚧 Advanced RAG | Query transformation • Document post-processing • Multi-stage retrieval |
| 09-vectorstore-metadata-and-filtering | 🚧 Metadata & Filtering | Dynamic filtering • Metadata strategies • Advanced search patterns |

🚧 = Branches under development for Spring AI 1.0.0
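
Branch 07 replaces the hand-rolled retrieval flow with Spring AI's QuestionAnswerAdvisor. The sketch below shows the general shape of that pattern; the bean wiring is illustrative, and the exact import package and constructor/builder of QuestionAnswerAdvisor shifted between Spring AI milestones and 1.0.0, so treat the branch code as authoritative.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class RagChatClientConfig {

    // Every prompt sent through this ChatClient is augmented with documents
    // retrieved from the vector store before it reaches the model.
    @Bean
    ChatClient ragChatClient(ChatModel chatModel, VectorStore vectorStore) {
        return ChatClient.builder(chatModel)
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
    }
}
```

With this in place, a controller can answer a question with `ragChatClient.prompt().user(question).call().content()` and retrieval happens transparently.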

🚀 Getting Started

Prerequisites

  1. Java 21 or higher
  2. PostgreSQL 14+ with PGVector extension
  3. OpenAI API Key
  4. Maven 3.8+

Quick Start

  1. Clone the repository

    git clone <repository-url>
    cd L5RAG
  2. Start from the beginning

    git checkout 01-rag-introduction
  3. Set up PostgreSQL with PGVector

    # Using Docker (pgvector/pgvector images are tagged by Postgres major version)
    docker run -it --rm --name postgres \
      -p 5432:5432 \
      -e POSTGRES_USER=postgres \
      -e POSTGRES_PASSWORD=postgres \
      pgvector/pgvector:pg16
  4. Configure environment variables

    export SPRING_AI_OPENAI_API_KEY=your-api-key-here
  5. Run the application

    ./mvnw spring-boot:run

📖 How to Use This Tutorial

1. Follow the Branch Order

Each branch builds on the previous one. Don't skip ahead!

# See all branches
git branch -a

# Move to next branch
git checkout 02-vector-store-setup

2. Read the Documentation

Each major feature has detailed documentation:

| Document | Description |
|----------|-------------|
| RAG Concepts | Understanding RAG and its importance |
| Vector Stores Explained | How vector databases enable semantic search |
| Document Processing | ETL pipeline and chunking strategies |
| Implementation Patterns | Comparing different RAG approaches |
| Spring AI Advisors | Using Spring AI's built-in RAG patterns |
| Production Best Practices | Performance, security, and monitoring |

3. Try the Exercises

Each branch includes hands-on exercises in the exercises/ directory:

exercises/
├── 01-basic-rag/          # Simple RAG implementation
├── 02-custom-splitter/    # Build your own text splitter
├── 03-metadata-filtering/ # Advanced search with filters
└── 04-performance-tuning/ # Optimize your RAG system

🏗️ Architecture Overview

System Components

┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│   Client    │────▶│  Controller  │────▶│   Service    │
└─────────────┘     └──────────────┘     └──────────────┘
                                                  │
                                                  ▼
                    ┌──────────────────────────────────┐
                    │          Spring AI               │
                    ├──────────────┬───────────────────┤
                    │ Embedding    │  Chat Model       │
                    │   Model      │  (OpenAI)         │
                    └──────────────┴───────────────────┘
                            │              ▲
                            ▼              │
                    ┌─────────────┐        │
                    │Vector Store │────────┘
                    │ (PGVector)  │
                    └─────────────┘

Data Flow

  1. Document Ingestion: Load → Chunk → Embed → Store
  2. Query Processing: Question → Embed → Search → Retrieve
  3. Answer Generation: Context + Question → LLM → Response
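
These three flows map onto only a few Spring AI calls. Below is a minimal sketch of the manual path (branches 03–05), assuming Spring AI 1.0.0 APIs (SearchRequest.builder(), Document.getText()); the prompt wording, topK, and similarity threshold are illustrative, not the exact values used in the branches.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
class RagQueryService {

    private final VectorStore vectorStore;
    private final ChatClient chatClient;

    RagQueryService(VectorStore vectorStore, ChatClient.Builder chatClientBuilder) {
        this.vectorStore = vectorStore;
        this.chatClient = chatClientBuilder.build();
    }

    // 1. Document Ingestion: Load → Chunk → Embed → Store
    //    (embedding happens inside the vector store when documents are added)
    void ingest(List<Document> documents) {
        vectorStore.add(new TokenTextSplitter().apply(documents));
    }

    // 2 + 3. Query Processing and Answer Generation
    String ask(String question) {
        // Question → Embed → Search → Retrieve
        List<Document> hits = vectorStore.similaritySearch(
                SearchRequest.builder()
                        .query(question)
                        .topK(4)
                        .similarityThreshold(0.5)
                        .build());

        // Context injection: concatenate retrieved chunks into the prompt
        String context = hits.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n---\n"));

        String prompt = new PromptTemplate("""
                Answer the question using only the context below.

                Context:
                {context}

                Question: {question}
                """).render(Map.of("context", context, "question", question));

        // Context + Question → LLM → Response
        return chatClient.prompt().user(prompt).call().content();
    }
}
```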

🔧 Key Technologies

  • Spring Boot 3.4.4 - Application framework
  • Spring AI 1.0.0 - AI integration (migration in progress)
  • PostgreSQL + PGVector - Vector database
  • OpenAI API - Embeddings and completions
  • StringTemplate - Prompt templating

📊 API Endpoints

Query Endpoints

| Endpoint | Method | Description | Purpose |
|----------|--------|-------------|---------|
| /ask-llm | POST | Direct LLM query | Baseline without RAG |
| /ask-rag | POST | Manual RAG implementation | Full control approach |
| /ask-advisor | POST | Spring AI Advisor | Production-ready pattern |
| /ask-combined | POST | Compare all approaches | Educational comparison |
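
A compact sketch of how the first two endpoints could be mapped, assuming a request record with a single question field and the RagQueryService shape sketched in the architecture section; the actual branches may name the DTOs and beans differently.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
class AskController {

    // Matches the curl examples below: {"question": "..."}
    record AskRequest(String question) {}

    private final ChatClient chatClient;
    private final RagQueryService ragQueryService;

    AskController(ChatClient.Builder chatClientBuilder, RagQueryService ragQueryService) {
        this.chatClient = chatClientBuilder.build();
        this.ragQueryService = ragQueryService;
    }

    // Baseline: the question goes straight to the model, no retrieval
    @PostMapping("/ask-llm")
    String askLlm(@RequestBody AskRequest request) {
        return chatClient.prompt().user(request.question()).call().content();
    }

    // Manual RAG: retrieve context from the vector store before calling the model
    @PostMapping("/ask-rag")
    String askRag(@RequestBody AskRequest request) {
        return ragQueryService.ask(request.question());
    }
}
```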

Admin Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| /admin/reset-vector-store | POST | Clear and reload vector store |
| /admin/stats | GET | Vector store statistics |

🧪 Testing the Application

Example Queries

  1. Test without context (LLM only)

    curl -X POST http://localhost:8080/ask-llm \
      -H "Content-Type: application/json" \
      -d '{"question": "What is the plot of Inception?"}'
  2. Test with RAG

    curl -X POST http://localhost:8080/ask-rag \
      -H "Content-Type: application/json" \
      -d '{"question": "What is the plot of Inception?"}'
  3. Compare approaches

    curl -X POST http://localhost:8080/ask-combined \
      -H "Content-Type: application/json" \
      -d '{"question": "Who directed The Dark Knight?"}'

📈 Performance Considerations

Vector Store Optimization

  • HNSW Index: Better query performance, slower builds
  • Chunk Size: 800 tokens (balance context vs precision; see the splitter sketch after this list)
  • Overlap: 200 tokens (maintain context continuity)
  • Batch Size: 10,000 documents (configurable)
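
A sketch of turning the chunking values above into a TokenTextSplitter bean. Note that stock TokenTextSplitter exposes a target chunk size but no overlap parameter, so the 200-token overlap listed above would need a custom splitter; the minimum-size arguments below are illustrative defaults, and the exact constructor signature should be checked against your Spring AI version.

```java
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class ChunkingConfig {

    @Bean
    TokenTextSplitter tokenTextSplitter() {
        return new TokenTextSplitter(
                800,    // target chunk size in tokens, matching the value above
                350,    // minimum chunk size in characters
                5,      // minimum chunk length worth embedding
                10000,  // maximum number of chunks produced per document
                true);  // keep separators when splitting
    }
}
```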

Cost Management

  • Monitor embedding API calls
  • Cache frequently accessed embeddings
  • Optimize chunk sizes to reduce tokens
  • Use similarity thresholds wisely

🤝 Contributing

This is an educational project. To add new topics:

  1. Create a new branch following the naming pattern
  2. Build upon the previous branch
  3. Add comprehensive documentation
  4. Include practical exercises
  5. Update this README

📚 Additional Resources

Spring AI Documentation

Course Materials

❓ Troubleshooting

Common Issues

  1. Vector store not initializing

    spring.ai.vectorstore.pgvector.initialize-schema: true
  2. Embedding dimension mismatch

    • Ensure consistency between model and vector store (see the sanity-check sketch after this list)
    • Default: 1536 for OpenAI text-embedding-3-small
  3. Out of memory with large documents

    • Adjust chunk size and batch size
    • Increase JVM heap size
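
For the dimension-mismatch issue, a quick startup sanity check can be sketched as below, assuming an autowired EmbeddingModel; dimensions() reports what the configured model actually produces and should match the pgvector column size (1536 for text-embedding-3-small). Note that dimensions() may issue a single embedding request to determine the size.

```java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class EmbeddingDimensionCheck {

    // Logs the embedding dimension at startup so a mismatch with the
    // pgvector schema (e.g. 1536 for text-embedding-3-small) shows up early.
    @Bean
    CommandLineRunner reportEmbeddingDimensions(EmbeddingModel embeddingModel) {
        return args -> System.out.println(
                "Embedding dimensions: " + embeddingModel.dimensions());
    }
}
```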

📝 License

This educational project is part of the Spring AI Mastery course.


🎯 Next Steps

  1. Complete all branches in order
  2. Try the exercises to reinforce learning
  3. Build your own RAG system using these patterns
  4. Share your learnings with the community

Happy Learning! 🚀
