A wrapper for Hugging Face sentence transformer models with an OpenAI-compatible API.
This is a wrapper for Hugging Face sentence transformer models. It uses FastAPI to expose an HTTP API that is compatible with the OpenAI text embeddings API. A Dockerfile is included to build an image based on Uvicorn with the CPU-only version of PyTorch.
The models to load are given via the MODELS environment variable; multiple models can be listed in a comma-separated string.
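For example, a MODELS value for two models could look like this (the model names are only illustrative; any sentence transformer model ID from Hugging Face should follow the same format):

```shell
# Comma-separated list of Hugging Face model IDs, no spaces between entries.
MODELS="sentence-transformers/all-MiniLM-L6-v2,sentence-transformers/all-mpnet-base-v2"
```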
The following command will run the latest prebuilt image:
docker run -it --rm -p 8080:80 ghcr.io/bergos/embedding-server:latest
Open http://localhost:8080/docs in your browser to explore the API in the interactive documentation UI.
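Once the container is running, embeddings can also be requested directly over HTTP. The sketch below assumes the OpenAI-compatible endpoint is served at /v1/embeddings and that the model name matches one of the models listed in MODELS:

```shell
# Request an embedding for a single input string.
# Endpoint path and model name are assumptions, not confirmed by this README.
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "sentence-transformers/all-MiniLM-L6-v2", "input": "Hello, world!"}'
```

The response follows the OpenAI embeddings format, so existing OpenAI client libraries can be pointed at the local server by changing their base URL.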
The Dockerfile downloads the models at build time, so no persistence is required at runtime.
To build the image run:
docker build -t embedding-server .
And to spin up a local instance on port 8080:
docker run -it --rm -p 8080:80 embedding-server