Serge - LLaMA made easy 🦙


Serge is a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the Python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest
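
Once the container is up, you can check that it started cleanly by following its logs (the container name serge matches the command above):

docker logs -f serge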

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
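
Save this as docker-compose.yml (the filename is an assumption; pass -f if you use a different one) and bring the stack up with:

docker compose up -d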

Then, just visit http://localhost:8008/. You can find the API documentation at http://localhost:8008/api/docs.
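
As a quick sanity check that both the web UI and the API are reachable, you can request the two URLs above and look for a 200 status (a simple curl sketch):

curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/api/docs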

🖥️ Windows Setup

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
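
Before starting the container, you can confirm from a Windows terminal that WSL2 is installed and set as the default version (standard wsl commands, shown as a rough check):

wsl --status
wsl -l -v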

☁️ Kubernetes & Docker Compose Setup

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

We currently support the following models:

  • Airoboros 🎈
    • Airoboros-7B
    • Airoboros-13B
    • Airoboros-30B
    • Airoboros-65B
  • Alpaca 🦙
    • Alpaca-LoRA-65B
    • GPT4-Alpaca-LoRA-30B
  • Chronos 🌑
    • Chronos-13B
    • Chronos-33B
    • Chronos-Hermes-13B
  • GPT4All 🌍
    • GPT4All-13B
  • Guanaco 🦙
    • Guanaco-7B
    • Guanaco-13B
    • Guanaco-33B
    • Guanaco-65B
  • Koala 🐨
    • Koala-7B
    • Koala-13B
  • Llama 🦙
    • FinLlama-33B
    • Llama-Supercot-30B
  • Lazarus 💀
    • Lazarus-30B
  • Nous 🧠
    • Nous-Hermes-13B
  • OpenAssistant 🎙️
    • OpenAssistant-30B
  • Samantha 👩
    • Samantha-7B
    • Samantha-13B
    • Samantha-33B
  • Tulu 🎚
    • Tulu-7B
    • Tulu-13B
    • Tulu-30B
  • Vicuna 🦙
    • Stable-Vicuna-13B
    • Vicuna-CoT-7B
    • Vicuna-CoT-13B
    • Vicuna-v1.1-7B
    • Vicuna-v1.1-13B
    • VicUnlocked-30B
    • VicUnlocked-65B
  • Wizard 🧙
    • Wizard-Mega-13B
    • Wizard-Vicuna-Uncensored-7B
    • Wizard-Vicuna-Uncensored-13B
    • Wizard-Vicuna-Uncensored-30B
    • WizardLM-30B
    • WizardLM-Uncensored-7B
    • WizardLM-Uncensored-13B
    • WizardLM-Uncensored-30B

Additional weights can be added to the serge_weights volume using docker cp:

docker cp ./my_weight.bin serge:/usr/src/app/weights/
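
To confirm the weight file landed in the mounted volume, list the weights directory inside the running container (assuming the container is named serge as in the examples above):

docker exec serge ls -lh /usr/src/app/weights/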

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

Model      RAM Required
7B         4.5GB
7B-q6_K    8.03GB
13B        12GB
13B-q6_K   13.18GB
30B        20GB
30B-q6_K   29.19GB
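
Before pulling a larger model, it can help to check how much memory is actually available on the host (a rough sketch; free is Linux-only, and Docker Desktop users should check their configured resource limits instead):

# Available RAM on a Linux host
free -h

# Total memory visible to the Docker daemon, in bytes
docker info --format '{{.MemTotal}}'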

💬 Support

Need help? Join our Discord

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge
DOCKER_BUILDKIT=1 docker compose -f docker-compose.dev.yml up -d --build
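
To follow the build output and application logs while developing, the same Compose file can be used (standard Compose commands, not specific to Serge):

docker compose -f docker-compose.dev.yml logs -f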
