Serge - LLaMA made easy 🦙


Serge is a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the Python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest
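
Once the container is up, you can check that it started cleanly by following its logs (the container name serge matches the command above):

docker logs -f serge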

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
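
Save this as docker-compose.yml (the filename is an assumption; pass -f if you use a different one) and bring the stack up with:

docker compose up -d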

Then, just visit http://localhost:8008/. You can find the API documentation at http://localhost:8008/api/docs.
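
As a quick sanity check that both the web UI and the API are reachable, you can request the two URLs above and look for a 200 status (a simple curl sketch):

curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/api/docs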

🖥️ Windows Setup

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
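
Before starting the container, you can confirm from a Windows terminal that WSL2 is installed and set as the default version (standard wsl commands, shown as a rough check):

wsl --status
wsl -l -v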

☁️ Kubernetes & Docker Compose Setup

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

We currently support the following models:

  • Airoboros 🎈
    • Airoboros-7B
    • Airoboros-13B
    • Airoboros-30B
    • Airoboros-65B
  • Alpaca 🦙
    • Alpaca-LoRA-65B
    • GPT4-Alpaca-LoRA-30B
  • Chronos 🌑
    • Chronos-13B
    • Chronos-33B
    • Chronos-Hermes-13B
  • GPT4All 🌍
    • GPT4All-13B
  • Guanaco 🦙
    • Guanaco-7B
    • Guanaco-13B
    • Guanaco-33B
    • Guanaco-65B
  • Koala 🐨
    • Koala-7B
    • Koala-13B
  • Llama 🦙
    • FinLlama-33B
    • Llama-Supercot-30B
  • Lazarus 💀
    • Lazarus-30B
  • Nous 🧠
    • Nous-Hermes-13B
  • OpenAssistant 🎙️
    • OpenAssistant-30B
  • Samantha 👩
    • Samantha-7B
    • Samantha-13B
    • Samantha-33B
  • Tulu 🎚
    • Tulu-7B
    • Tulu-13B
    • Tulu-30B
  • Vicuna 🦙
    • Stable-Vicuna-13B
    • Vicuna-CoT-7B
    • Vicuna-CoT-13B
    • Vicuna-v1.1-7B
    • Vicuna-v1.1-13B
    • VicUnlocked-30B
    • VicUnlocked-65B
  • Wizard 🧙
    • Wizard-Mega-13B
    • Wizard-Vicuna-Uncensored-7B
    • Wizard-Vicuna-Uncensored-13B
    • Wizard-Vicuna-Uncensored-30B
    • WizardLM-30B
    • WizardLM-Uncensored-7B
    • WizardLM-Uncensored-13B
    • WizardLM-Uncensored-30B

Additional weights can be added to the serge_weights volume using docker cp:

docker cp ./my_weight.bin serge:/usr/src/app/weights/
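
To confirm the weight file landed in the mounted volume, list the weights directory inside the running container (assuming the container is named serge as in the examples above):

docker exec serge ls -lh /usr/src/app/weights/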

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

Model      RAM Required
7B         4.5GB
7B-q6_K    8.03GB
13B        12GB
13B-q6_K   13.18GB
30B        20GB
30B-q6_K   29.19GB
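
Before pulling a larger model, it can help to check how much memory is actually available on the host (a rough sketch; free is Linux-only, and Docker Desktop users should check their configured resource limits instead):

# Available RAM on a Linux host
free -h

# Total memory visible to the Docker daemon, in bytes
docker info --format '{{.MemTotal}}'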

💬 Support

Need help? Join our Discord

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge
DOCKER_BUILDKIT=1 docker compose -f docker-compose.dev.yml up -d --build
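
To follow the build output and application logs while developing, the same Compose file can be used (standard Compose commands, not specific to Serge):

docker compose -f docker-compose.dev.yml logs -f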
