8000 GitHub - marcusrprojects/audiobook-generator: Generate audiobooks from PDFs, EPUBs, and text files using local Piper TTS models.
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

marcusrprojects/audiobook-generator

Repository files navigation

📖 Audiobook Generator

Latest Release Python Version

Turn any PDF, EPUB, or TXT file into a natural-sounding audiobook using Piper TTS. Everything runs locally — no internet connection or server required after setup.


🚀 TL;DR Quick Start

For those familiar with Python development:

# 1. Clone the repository
git clone https://github.com/marcusrprojects/audiobook-generator.git
cd audiobook-generator

# 2. Create & activate virtual environment (macOS/Linux example)
python3 -m venv .venv && source .venv/bin/activate

# 3. Install Piper TTS packages manually first
pip install piper-tts --no-deps
pip install piper-phonemize-cross

# 4. Install the package and its other dependencies
pip install . # Installs audiobook-generator and requirements.txt deps

# 5. Download default English voice models (requires bash/curl)
bash download_voices.sh

# 6. Generate an audiobook!
audiobook-gen path/to/your_book.epub path/to/output/audio.mp3

Windows Users: Use PowerShell/CMD for venv activation (see detailed steps below). For download_voices.sh, use WSL2, Git Bash, or download models manually.


🚀 Features

  • 📚 Supports PDF, EPUB, and plain text formats
  • 🗣️ Multiple Piper TTS voices (American and British English)
  • ⏸️ Smart pauses after sentences, commas, and paragraphs
  • 🛠️ Text normalization (expand numbers, abbreviations)
  • 🎵 Outputs standard WAV or MP3 files
  • 🏷️ Add metadata (title, artist) to MP3 files
  • 💻 100% offline — no server, no data collection
  • ⚙️ Simple command-line interface

🛠 Installation

Follow these steps to set up the project and its dependencies.

  1. Clone the repository:

    git clone https://github.com/marcusrprojects/audiobook-generator.git
    cd audiobook-generator
  2. Create and activate a virtual environment (recommended):

    # macOS / Linux
    python3 -m venv .venv
    source .venv/bin/activate
    
    # Windows (Command Prompt)
    python -m venv .venv
    .venv\Scripts\activate.bat
    
    # Windows (PowerShell)
    python -m venv .venv
    .venv\Scripts\Activate.ps1
  3. Install Piper TTS Packages Manually: These specific packages often need manual installation first due to their dependencies or naming conventions.

    pip install piper-tts --no-deps
    pip install piper-phonemize-cross
  4. Install the remaining dependencies:

    pip install -r requirements.txt
  5. Install the audiobook-generator package: This step reads setup.py and makes the audiobook-gen command available in your active environment.

    pip install .
  6. Download voice models: See the "Downloading Voice Models" section below for details. The easiest way is using the provided script (requires bash and curl):

    bash download_voices.sh

(If bash is unavailable, see manual download instructions below).

Note: requirements.txt includes all non-Piper dependencies. The two Piper packages must be installed manually due to naming differences.


📥 Downloading Voice Models

This tool uses Piper TTS models (ONNX format) for speech synthesis.

You can download free voice models from HuggingFace Piper Voices.

Easy method (recommended)

Run the included script:

bash download_voices.sh

Manual method (example for Joe)

mkdir -p voices/en_US/joe-medium
cd voices/en_US/joe-medium
curl -L -o en_US-joe-medium.onnx \
  https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx
curl -L -o en_US-joe-medium.onnx.json \
  https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx.json
cd ../../../..

Recommended Voices:

Voice Path Notes
LibriTTS R (US, medium) voices/en_US/libritts_r-medium/ Neutral American narrator
Joe (US, medium) voices/en_US/joe-medium/ Clear American male
Cori (UK, high quality) voices/en_GB/cori-high/ High-quality British female
Jenny Dioco (UK, medium) voices/en_GB/jenny_dioco-medium/ Warm British voice

🌟 Usage (After Installation via pip install .)

Ensure your virtual environment is active (source .venv/bin/activate or .venv\Scripts\activate). You can then use the audiobook-gen command:

audiobook-gen input_file.pdf output_file.mp3

By default, the LibriTTS R (US, medium) voice is used. To specify a different model: (Use the path to the desired .onnx model file)

audiobook-gen input_file.epub output_file.wav \
  --model voices/en_GB/cori-high/en_GB-cori-high.onnx

Specify input format (EPUB) and output format (WAV):

audiobook-gen path/to/novel.epub final_audio.wav

Add metadata (MP3 only):

audiobook-gen book.txt audiobook.mp3 \
  --title "My Audiobook" --artist "Author Name"

List available bundled voice models: (Lists voices defined in the script's AVAILABLE_VOICES list)

audiobook-gen --list-models

Show help message: (Displays all available command-line options)

audiobook-gen --help

📚 Supported Formats

  • .pdf (scanned PDFs with selectable text)
  • .epub (standard e-book format)
  • .txt (plain text files, UTF-8 encoding preferred)

🧐 How It Works

  • Extracts and normalizes text from your document
  • Splits text intelligently by sentences and commas
  • Synthesizes natural-sounding speech with pauses
  • Outputs ready-to-listen WAV or MP3 files

📝 Requirements

  • Python 3.8 or higher
  • ONNX Runtime
  • Piper TTS models
  • Listed in requirements.txt (non-Piper dependencies)
  • Manual install for piper-tts and piper-phonemize-cross

📦 Creating a Standalone Executable (Optional Alternative)

If you want to distribute this tool to users who might not have Python installed, you can bundle it into a single executable file using PyInstaller. This is typically done by the developer for distribution.

Steps to Create the Executable:

  1. Make sure you have followed the Installation steps 1-4 (you need the dependencies installed).
  2. Install PyInstaller in your virtual environment:
pip install pyinstaller
  1. Run PyInstaller from the project's root directory (audiobook-generator/):
pyinstaller --onefile audiobook_generator.py

This process can take a significant amount of time and requires substantial disk space. It analyzes dependencies and bundles everything. The final executable will be placed inside a new dist/ folder.

Running the Standalone Executable:

Once built, the executable in the dist folder can be run directly without needing Python or the virtual environment.

# Example on macOS/Linux:
cd dist
./audiobook_generator ../path/to/book.pdf output.mp3 --model ../voices/en_US/joe-medium/en_US-joe-medium.onnx

# Example on Windows:
cd dist
.\audiobook_generator.exe ..\path\to\book.pdf output.mp3 --model ..\voices\en_US\joe-medium\en_US-joe-medium.onnx

Note: Paths to input files and models are relative to where you run the command.


✨ Future Plans

  • Batch conversion
  • Desktop GUI version (Tauri or Electron)
  • Additional language support

📩 Contact

Open an issue on GitHub.


🏁 Let's turn your books into audiobooks!

About

Generate audiobooks from PDFs, EPUBs, and text files using local Piper TTS models.

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published
0