- Import PDFs/Images: Drag & drop manga PDFs/images to add to your library
- Dialogue OCR: Recognize dialogue text in the background (using manga-ocr)
- Grammar Breakdown: Break down the dialogue into tokens with furigana and part-of-speech highlighting (using ichiran)
- AI-driven Q&A: Select words and ask Komi (our kanban girl) to explain them in the context of the dialogue
- Responsive UI: Works on both desktop and mobile
- 3 Reading Modes: RTL/LTR page navigation and vertical scrolling
- PWA Support: Can be added to home screen as an app on both iOS and Android
- Frontend: React 18 + Vite + shadcn/ui + framer-motion
- Backend: Elysia + Bun + SQLite + Prisma
- Japanese Tokenization: ichiran
- OCR Inference: FastAPI + manga-ocr + comic-text-detector
### Linux / WSL
- Install Docker following the instructions on the official website, then start it:

```bash
sudo systemctl start docker
# or
sudo /etc/init.d/docker start
```
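If you want to confirm the daemon is up before continuing, `docker info` fails with a clear error when it cannot reach the daemon:

```bash
docker info   # prints server details when the daemon is running
```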
- Install Node.js 18+ following the instructions on the official website.
- Install PyTorch following the instructions on the official website.
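A quick sanity check for both (the CUDA check printing `False` just means inference will run on CPU):

```bash
node --version   # should report v18 or newer
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```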
- Run the following commands to install Git LFS, Bun, and ImageMagick/GraphicsMagick:

```bash
# Install Git LFS
sudo apt install git-lfs
# Install Bun
curl -fsSL https://bun.sh/install | bash
source ~/.bashrc
# Install ImageMagick & GraphicsMagick
sudo apt install imagemagick graphicsmagick
```
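The apt package does not register the LFS filters for your user by itself, so it is worth running `git lfs install` once and confirming everything is on `PATH`:

```bash
git lfs install   # registers the Git LFS filters for the current user
git lfs version
bun --version
convert -version  # ImageMagick
```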
- Clone the repository and install dependencies:

```bash
# Set up the project (important: --recurse-submodules clones the submodules in the packages folder!)
git clone --recurse-submodules https://github.com/TigerHix/komu.git
cd komu

# Install frontend dependencies
cd frontend && bun install && cd ..

# Install backend dependencies and initialize the database
cd backend
cp .env.example .env
bun install && bunx prisma generate && bunx prisma db push
# Install poppler-utils (PDF processing tools)
sudo apt-get install -y poppler-utils
cd ..

# Create a Python virtual environment for the inference service
# (must be created at the project root!)
python -m venv .venv
source .venv/bin/activate
cd inference
cp .env.example .env
pip install -r requirements.txt
cd ..
```
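To sanity-check the inference environment, you can try importing manga-ocr from the activated venv (this assumes `requirements.txt` pulls in the `manga-ocr` package; the first real run will also download its model weights):

```bash
# Run from the project root with .venv activated
python -c "import manga_ocr; print('manga-ocr OK')"
```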
- To use the LLM-powered features, request an API key from OpenRouter and add it to the `backend/.env` file. You may also customize the LLM models (`CHAT_MODEL`, `SEARCH_MODEL`).
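As an illustration, the relevant lines in `backend/.env` might look like the following. `CHAT_MODEL` and `SEARCH_MODEL` are the variables named above; the `OPENROUTER_API_KEY` variable name and the model IDs are placeholders here, so check `backend/.env.example` for the exact keys:

```bash
# backend/.env (illustrative values only)
OPENROUTER_API_KEY=sk-or-...
CHAT_MODEL=openai/gpt-4o-mini
SEARCH_MODEL=openai/gpt-4o-mini
```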
### Windows & macOS

TODO. The steps should be very similar to the above.
### Linux / WSL
```bash
# Terminal 1: Frontend
cd frontend && bun run dev

# Terminal 2: Backend
cd backend && bun run dev

# Terminal 3: Inference service
cd inference && bash ./start_service.sh
```
In your browser, go to http://localhost:5847 to access Komu.
We recommend using a VPN such as Tailscale to access the application from your other devices.
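For example, with Tailscale installed on the machine hosting Komu, you can look up its tailnet address and open the same port from another device (standard Tailscale usage; the port is the one from the step above):

```bash
# On the host machine
tailscale ip -4        # prints the host's tailnet IP, e.g. 100.x.y.z
# Then, on another device in the same tailnet, browse to:
# http://<that-ip>:5847
```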
This project was pretty much vibe-coded in a few days, so beware of the code quality! We invite you to vibe together using Claude Code. Just don't spend too much money on it...
Everything except Komi's visual assets is licensed under GPL-3.0.
Character Designer & Illustrator: @凛星Rin