A modern educational database engine implementation in Rust, inspired by SQLite and DuckDB. Built to teach database internals while leveraging Rust's safety and performance features.
CadiaDB is a simple but powerful database engine that implements core database concepts:
- Page-based storage engine
- Buffer pool management
- Table and record management
- B-Tree indexing
- Basic query processing
- Transaction management
- Column-oriented storage support
- Initial project structure
- Basic error handling
- Design document for storage engine
- Test infrastructure setup
The foundation of our database. We'll implement:
- Project setup
- Page structure
- Fixed-size pages (4KB)
- Page header with metadata
- Data region
- Slotted page layout for records
- Disk manager
- Page allocation
- Read/write operations
- Free page management
- Page cache interface
- File format specification
Memory management layer that sits between storage and table operations:
- Buffer pool manager
- Page table (mapping page_id -> frame_id)
- Frame allocation strategy
- Page replacement (LRU)
- Clock algorithm implementation
- Victim selection
- Dirty page handling
- Write-back policy
- Background flusher
- Pin counting
- Reference counting
- Deadlock prevention
Data organization and access methods:
- Table metadata
- Schema definition
- Column types
- Constraints
- Record formats
- Fixed-length records
- Variable-length records
- NULL handling
- CRUD operations
- Record insertion
- Record deletion
- Record updates
- Sequential scan
- Page directory
- Free space management
- Record location tracking
B-Tree implementation for fast data access:
- B-Tree implementation
- Node structure
- Tree traversal
- Rebalancing
- Key operations
- Binary search
- Range scans
- Bulk loading
- Page splits and merges
- Split algorithms
- Merge strategies
- Redistribution
- Iterator implementation
- Forward scanning
- Backward scanning
- Range iteration
SQL parsing and execution:
- Query parser
- Lexer implementation
- Parser implementation
- AST definition
- Execution engine
- Plan nodes
- Expression evaluation
- Join algorithms
- Basic SQL support
- SELECT implementation
- WHERE clause handling
- JOIN support
- Aggregations
ACID guarantees and concurrency:
- Transaction manager
- Transaction states
- Savepoints
- Rollback support
- Lock-based concurrency
- Lock manager
- Deadlock detection
- Isolation levels
- Write-ahead logging
- Log records
- Checkpointing
- Recovery algorithms
- Recovery
- ARIES implementation
- Crash recovery
- Media recovery
- Query optimization
- Cost-based optimization
- Statistics collection
- Plan caching
- Column store support
- Column compression
- Vectorized execution
- Late materialization
- Parallel execution
- Parallel scan
- Parallel join
- Task scheduling
- Build the project:
# Clone the repository
git clone https://github.com/yourusername/cadiadb.git
cd cadiadb
# Build in debug mode
cargo build
# Build in release mode
cargo build --release
- Run the tests:
# Run all tests
cargo test
# Run tests with output
cargo test -- --nocapture
# Run specific test module
cargo test storage::page::tests
- Development Setup:
# Install helpful tools
cargo install cargo-watch # For automatic rebuilds
cargo install cargo-edit # For dependency management
cargo install cargo-audit # For security audits
# Set up git hooks (optional)
cargo install cargo-husky
src/
├── storage/ # Storage engine implementation
│ ├── page.rs # Page structure
│ └── disk_manager.rs # Disk I/O
├── buffer/ # Buffer pool management
├── table/ # Table and record management
├── index/ # B-Tree implementation
├── query/ # Query processing
├── transaction/ # Transaction handling
└── error.rs # Error types
- Pick a task from the Learning Roadmap
- Create a new branch for the feature
- Write tests first (TDD approach)
- Implement the feature
- Run tests and benchmarks
- Submit a PR for review
This is an educational project. Feel free to:
- Submit bug fixes
- Add new features
- Improve documentation
- Share learning resources
MIT License - See LICENSE file for details