usrbinbrain/kokoro-tts-container


Kokoro TTS Container

Docker | Python 3.12 | ONNX | Build and Push kokoro-tts-container

A Docker container for running the Kokoro Text-to-Speech engine v.1, providing high-quality speech synthesis with 54 voices across 9 languages.

Container Features

  • High-quality text-to-speech synthesis
  • Multiple voice and language options
  • Voice blending capabilities
  • Adjustable speech speed
  • Support for .mp3 and .wav output files

Quick Start Using Docker Hub Image

You can directly pull and run the pre-built container from Docker Hub without building locally:

# Pull the latest image
docker pull usrbinbrain/kokoro-tts-container:latest

# Run a basic example
docker run --rm -v "$(pwd)":/app/shared usrbinbrain/kokoro-tts-container \
    "Hello world!" \
    output.mp3 \
    --voice "af_sarah" \
    --speed 1.0 \
    --lang "en-us"

This way you can use Kokoro-TTS instantly without worrying about setup or build steps.
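For scripted use, the same invocation can be assembled from Python. This is a minimal sketch, not part of the project: `build_command` and its defaults are hypothetical names chosen for illustration, and the only things it relies on are the image name, the `/app/shared` mount, and the flags shown in the command above.

```python
import os
import shlex

# Image name taken from the Docker Hub command above.
IMAGE = "usrbinbrain/kokoro-tts-container"


def build_command(text, output_file, voice="af_sarah", speed=1.0,
                  lang="en-us", workdir=None):
    """Return the argv list for the `docker run` call (hypothetical helper)."""
    workdir = os.path.abspath(workdir or os.getcwd())
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/app/shared",  # output lands in the mounted directory
        IMAGE,
        text, output_file,
        "--voice", voice,
        "--speed", str(speed),
        "--lang", lang,
    ]


cmd = build_command("Hello world!", "output.mp3")
print(shlex.join(cmd))
# To actually synthesize (requires Docker):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or punctuation.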

Local Setup && Build

Building your kokoro-tts Docker image:

# Install requirements for setup
pip3 install -r requirements.txt

# Run setup to download the model and generate the voices bin file
python3 setup.py

# Build your kokoro-tts image
docker build -t kokoro-tts-container .

Usage

Basic Usage Examples

Run the container with a single voice.

The command below generates an output.mp3 file in which the af_sarah voice says "Hello my friend!" in English (US) at speed 1.2.

docker run --rm -v "$(pwd)":/app/shared kokoro-tts-container \
    "Hello my friend!" \
    output.mp3 \
    --voice "af_sarah" \
    --speed 1.2 \
    --lang "en-us"

Voice Blending

Kokoro-TTS supports voice blending, allowing you to mix multiple voices with different weights.

The command below generates an output.wav file with a blended voice, where af_sarah contributes 40% and am_adam contributes 60%, saying "Hasta la vista!" in Spanish at speed 0.8.

docker run --rm -v "$(pwd)":/app/shared kokoro-tts-container \
    "Hasta la vista!" \
    output.wav \
    --voice "af_sarah:40,am_adam:60" \
    --speed 0.8 \
    --lang "es"
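To make the blend format concrete, here is a small sketch that parses a `--voice` value into per-voice fractions. `parse_blend` is a hypothetical helper, not the container's own parser. Normalizing the weights so they sum to 1, and treating a bare voice name as weight 1, are assumptions made for illustration; the README only shows weights that add up to 100.

```python
def parse_blend(spec):
    """Parse "voice1:weight,voice2:weight" into [(voice, fraction), ...].

    Assumption: weights are relative and get normalized to sum to 1.
    A bare voice name without ":weight" is treated as weight 1.
    """
    parts = [p.strip() for p in spec.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty voice specification")

    pairs = []
    for part in parts:
        name, sep, weight = part.partition(":")
        pairs.append((name.strip(), float(weight) if sep else 1.0))

    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [(name, w / total) for name, w in pairs]


print(parse_blend("af_sarah:40,am_adam:60"))
```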

Container Parameters

Parameter    | Description                                              | Default
input_text   | The text to synthesize                                   | Required
output_file  | Output audio filename (.wav or .mp3)                     | Required
--voice      | Voice ID or blend (format: voice1:weight,voice2:weight)  | af_sarah
--speed      | Speech rate multiplier, from 0.5 to 2.0                  | 1.0
--lang       | Language code                                            | en-us
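The constraints in the table can be checked before starting a container. The sketch below is a hypothetical pre-flight check, not code from the project: `validate_args` is an illustrative name, and matching the extension case-sensitively is an assumption, since the README does not say how the container treats names such as `OUT.MP3`.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".wav", ".mp3"}  # output formats listed in the table
MIN_SPEED, MAX_SPEED = 0.5, 2.0      # --speed range listed in the table


def validate_args(output_file, speed=1.0):
    """Raise ValueError if the arguments fall outside the documented limits."""
    suffix = Path(output_file).suffix
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"output file must end in .wav or .mp3, got {output_file!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}")
    return suffix


print(validate_args("output.mp3", 1.2))
```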

Supported Languages and Codes

  • en-us: English (US)
  • en-gb: English (British)
  • fr-fr: French
  • ja: Japanese
  • hi: Hindi
  • cmn: Mandarin Chinese
  • es: Spanish
  • pt-br: Brazilian Portuguese
  • it: Italian
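When wrapping the container in a script, the list above can be kept as a lookup table so that an unsupported code fails early. This is an illustrative sketch: the mapping is copied from the list, while `check_lang` is a hypothetical helper name.

```python
# Language codes and names as listed above.
LANGUAGES = {
    "en-us": "English (US)",
    "en-gb": "English (British)",
    "fr-fr": "French",
    "ja": "Japanese",
    "hi": "Hindi",
    "cmn": "Mandarin Chinese",
    "es": "Spanish",
    "pt-br": "Brazilian Portuguese",
    "it": "Italian",
}


def check_lang(code):
    """Return the language name for a --lang code, or raise ValueError."""
    try:
        return LANGUAGES[code]
    except KeyError:
        supported = ", ".join(sorted(LANGUAGES))
        raise ValueError(f"unsupported language {code!r}; choose one of: {supported}") from None


print(check_lang("pt-br"))
```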

Available Voices

The container includes multiple voices for different languages. For a complete list of voices and other help, run:

docker run --rm kokoro-tts-container --help

Thanks

Built with ❤️ on top of Kokoro ONNX. Special thanks to thewh1teagle and hexgrad for providing the fast TTS engine that made this container project possible.
