Turn any PDF, EPUB, or TXT file into a natural-sounding audiobook using Piper TTS. Everything runs locally — no internet connection or server required after setup.
For those familiar with Python development:
# 1. Clone the repository
git clone https://github.com/marcusrprojects/audiobook-generator.git
cd audiobook-generator
# 2. Create & activate virtual environment (macOS/Linux example)
python3 -m venv .venv && source .venv/bin/activate
# 3. Install Piper TTS packages manually first
pip install piper-tts --no-deps
pip install piper-phonemize-cross
# 4. Install the package and its other dependencies
pip install . # Installs audiobook-generator and requirements.txt deps
# 5. Download default English voice models (requires bash/curl)
bash download_voices.sh
# 6. Generate an audiobook!
audiobook-gen path/to/your_book.epub path/to/output/audio.mp3
Windows users: Use PowerShell or CMD to activate the virtual environment (see the detailed steps below). To run download_voices.sh, use WSL2 or Git Bash, or download the models manually.
- 📚 Supports PDF, EPUB, and plain text formats
- 🗣️ Multiple Piper TTS voices (American and British English)
- ⏸️ Smart pauses after sentences, commas, and paragraphs
- 🛠️ Text normalization (expand numbers, abbreviations)
- 🎵 Outputs standard WAV or MP3 files
- 🏷️ Add metadata (title, artist) to MP3 files
- 💻 100% offline — no server, no data collection
- ⚙️ Simple command-line interface
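To make the text normalization feature concrete, here is a minimal sketch of expanding abbreviations and small numbers before synthesis. The abbreviation table, the 0-99 number range, and the function names are assumptions for illustration; the README does not show the tool's actual rules.

```python
import re

# Hypothetical abbreviation table; the tool's real list is not shown in this README.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister", "Mrs.": "Missus", "etc.": "et cetera"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0..99; larger values are returned unchanged as digits."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"
    return str(n)


def normalize(text: str) -> str:
    """Expand known abbreviations, then spell out standalone integers."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    return re.sub(r"\b\d+\b", lambda m: number_to_words(int(m.group())), text)


print(normalize("Dr. Smith read 42 pages."))
```

A real normalizer would also need to handle decimals, years, ordinals, and currency, which this sketch deliberately leaves out.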
Follow these steps to set up the project and its dependencies.
- Clone the repository:
  git clone https://github.com/marcusrprojects/audiobook-generator.git
  cd audiobook-generator
- Create and activate a virtual environment (recommended):
  # macOS / Linux
  python3 -m venv .venv
  source .venv/bin/activate
  # Windows (Command Prompt)
  python -m venv .venv
  .venv\Scripts\activate.bat
  # Windows (PowerShell)
  python -m venv .venv
  .venv\Scripts\Activate.ps1
- Install the Piper TTS packages manually: These specific packages often need manual installation first due to their dependencies or naming conventions.
  pip install piper-tts --no-deps
  pip install piper-phonemize-cross
- Install the remaining dependencies:
  pip install -r requirements.txt
- Install the audiobook-generator package: This step reads setup.py and makes the audiobook-gen command available in your active environment.
  pip install .
- Download voice models: See the "Downloading Voice Models" section below for details. The easiest way is to use the provided script (requires bash and curl):
  bash download_voices.sh
  (If bash is unavailable, see the manual download instructions below.)
Note: requirements.txt includes all non-Piper dependencies. The two Piper packages must be installed manually due to naming differences.
This tool uses Piper TTS models (ONNX format) for speech synthesis.
You can download free voice models from HuggingFace Piper Voices.
Run the included script:
bash download_voices.sh
Or download a voice manually. For example, for the Joe (US, medium) voice:
mkdir -p voices/en_US/joe-medium
cd voices/en_US/joe-medium
curl -L -o en_US-joe-medium.onnx \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx
curl -L -o en_US-joe-medium.onnx.json \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx.json
cd ../../..
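For systems without bash or curl, the same manual download can be sketched in Python. The URL layout and the local folder layout are inferred from the Joe example above and from the paths used in this README; the function names are hypothetical and are not part of audiobook-generator.

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/main"


def voice_urls(voice_id: str) -> list:
    """Return the .onnx and .onnx.json URLs for an ID like 'en_US-joe-medium'."""
    locale, name, quality = voice_id.split("-")
    language = locale.split("_")[0]  # 'en_US' -> 'en'
    prefix = f"{BASE_URL}/{language}/{locale}/{name}/{quality}/{voice_id}"
    return [f"{prefix}.onnx", f"{prefix}.onnx.json"]


def local_dir(voice_id: str) -> Path:
    """Mirror the folder layout used in this README: voices/<locale>/<name>-<quality>/."""
    locale, name, quality = voice_id.split("-")
    return Path("voices") / locale / f"{name}-{quality}"


def download_voice(voice_id: str) -> None:
    """Download both files for one voice (requires network access)."""
    target = local_dir(voice_id)
    target.mkdir(parents=True, exist_ok=True)
    for url in voice_urls(voice_id):
        urlretrieve(url, str(target / url.rsplit("/", 1)[-1]))
```

Calling download_voice("en_US-joe-medium") from the project root should fetch the same two files as the curl commands above.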
Recommended Voices:
Voice | Path | Notes |
---|---|---|
LibriTTS R (US, medium) | voices/en_US/libritts_r-medium/ | Neutral American narrator |
Joe (US, medium) | voices/en_US/joe-medium/ | Clear American male |
Cori (UK, high quality) | voices/en_GB/cori-high/ | High-quality British female |
Jenny Dioco (UK, medium) | voices/en_GB/jenny_dioco-medium/ | Warm British voice |
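Each Piper voice consists of an .onnx model plus a matching .onnx.json config file in the same folder. A quick way to confirm that a download is complete is sketched below; the helper name is hypothetical and is not part of the tool.

```python
from pathlib import Path


def missing_voice_files(model_path) -> list:
    """Return the paths (model and its .json config) that do not exist yet."""
    model = Path(model_path)
    config = model.with_name(model.name + ".json")  # e.g. en_US-joe-medium.onnx.json
    return [str(path) for path in (model, config) if not path.is_file()]


missing = missing_voice_files("voices/en_US/joe-medium/en_US-joe-medium.onnx")
if missing:
    print("Missing voice files:", ", ".join(missing))
else:
    print("Voice is ready to use.")
```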
Ensure your virtual environment is active (source .venv/bin/activate or .venv\Scripts\activate). You can then use the audiobook-gen command:
audiobook-gen input_file.pdf output_file.mp3
By default, the LibriTTS R (US, medium) voice is used. To specify a different model, pass the path to the desired .onnx model file:
audiobook-gen input_file.epub output_file.wav \
--model voices/en_GB/cori-high/en_GB-cori-high.onnx
Specify input format (EPUB) and output format (WAV):
audiobook-gen path/to/novel.epub final_audio.wav
Add metadata (MP3 only):
audiobook-gen book.txt audiobook.mp3 \
--title "My Audiobook" --artist "Author Name"
List available bundled voice models (the voices defined in the script's AVAILABLE_VOICES list):
audiobook-gen --list-models
Show help message: (Displays all available command-line options)
audiobook-gen --help
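When calling the tool from another Python script, building the argument list programmatically avoids shell quoting problems with titles that contain spaces. This sketch uses only the options shown above (--model, --title, --artist); the build_command helper itself is hypothetical.

```python
import shlex


def build_command(input_file, output_file, model=None, title=None, artist=None):
    """Assemble an audiobook-gen argument list from the options shown above."""
    command = ["audiobook-gen", input_file, output_file]
    if model:
        command += ["--model", model]
    if title:
        command += ["--title", title]
    if artist:
        command += ["--artist", artist]
    return command


command = build_command("book.txt", "audiobook.mp3",
                        title="My Audiobook", artist="Author Name")
# shlex.join (Python 3.8+) renders the list as a copy-pasteable shell line.
print(shlex.join(command))
```

With the virtual environment active, the resulting list can be passed to subprocess.run(command, check=True) to run the conversion.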
- .pdf (PDFs with selectable text, including scanned ones)
- .epub (standard e-book format)
- .txt (plain text files, UTF-8 encoding preferred)
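As an illustration of how an input file might be matched to one of the supported formats, the sketch below checks the file extension. This is an assumption about the approach; the README does not say how the tool detects the format.

```python
from pathlib import Path

# The three extensions listed above, mapped to a format name.
SUPPORTED_FORMATS = {".pdf": "pdf", ".epub": "epub", ".txt": "txt"}


def detect_format(path) -> str:
    """Return the format name for a supported file, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported input format: {suffix or '(no extension)'}")
    return SUPPORTED_FORMATS[suffix]


print(detect_format("path/to/your_book.epub"))
```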
- Extracts and normalizes text from your document
- Splits text intelligently by sentences and commas
- Synthesizes natural-sounding speech with pauses
- Outputs ready-to-listen WAV or MP3 files
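The splitting and pause steps above can be sketched as follows. The pause durations and the exact splitting rule are assumptions chosen for illustration; the tool's real values and logic are not documented in this README.

```python
import re

# Hypothetical pause lengths in seconds.
PAUSES = {"comma": 0.25, "sentence": 0.6, "paragraph": 1.0}


def split_with_pauses(text: str) -> list:
    """Split text into (chunk, pause_seconds) pairs for synthesis."""
    chunks = []
    # Blank lines separate paragraphs.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for paragraph in paragraphs:
        # Break after a comma or sentence-ending mark that is followed by whitespace.
        pieces = re.split(r"(?<=[.!?,])\s+", paragraph)
        for index, piece in enumerate(pieces):
            if index == len(pieces) - 1:
                pause = PAUSES["paragraph"]  # last chunk of the paragraph
            elif piece.endswith(","):
                pause = PAUSES["comma"]
            else:
                pause = PAUSES["sentence"]
            chunks.append((piece, pause))
    return chunks


for chunk, pause in split_with_pauses("Hello, world. How are you?\n\nFine."):
    print(f"{chunk!r} then {pause}s of silence")
```

Each chunk would then be synthesized separately, with the matching amount of silence appended before the next one. Note that this simple rule also splits after abbreviations such as "Dr.", so normalization should run first.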
- Python 3.8 or higher
- ONNX Runtime
- Piper TTS models
- Listed in requirements.txt (non-Piper dependencies)
- Manual install for piper-tts and piper-phonemize-cross
If you want to distribute this tool to users who might not have Python installed, you can bundle it into a single executable file using PyInstaller. This is typically done by the developer for distribution.
Steps to Create the Executable:
- Make sure you have followed the Installation steps 1-4 (you need the dependencies installed).
- Install PyInstaller in your virtual environment:
pip install pyinstaller
- Run PyInstaller from the project's root directory (audiobook-generator/):
pyinstaller --onefile audiobook_generator.py
PyInstaller analyzes the dependencies and bundles everything, so this process can take a significant amount of time and requires substantial disk space. The final executable will be placed inside a new dist/ folder.
Running the Standalone Executable:
Once built, the executable in the dist folder can be run directly without needing Python or the virtual environment.
# Example on macOS/Linux:
cd dist
./audiobook_generator ../path/to/book.pdf output.mp3 --model ../voices/en_US/joe-medium/en_US-joe-medium.onnx
# Example on Windows:
cd dist
.\audiobook_generator.exe ..\path\to\book.pdf output.mp3 --model ..\voices\en_US\joe-medium\en_US-joe-medium.onnx
Note: Paths to input files and models are relative to where you run the command.
- Batch conversion
- Desktop GUI version (Tauri or Electron)
- Additional language support
Open an issue on GitHub.