A self-contained, portable version of whisper.cpp optimized for Apple Silicon through Metal GPU acceleration. This project provides a simple, user-friendly way to run Whisper speech recognition on macOS without complex installation or configuration.
While whisper.cpp provides excellent performance as a C++ port of OpenAI's Whisper model, setting it up with proper GPU acceleration on macOS can be challenging. This project addresses several key issues:
- Metal GPU Acceleration: Pre-configured to use Apple's Metal framework, providing up to 5x faster inference on Apple Silicon Macs
- Self-Contained: All libraries and dependencies are packaged together, with no external installations required
- Easy Model Management: Simple model selection interface with automatic downloading
- Library Path Handling: Automatically fixes dynamic library paths for portability
- User-Friendly Interface: Simple menu-driven interface requiring no command line expertise
- Metal-Accelerated: Optimized for Apple Silicon through Metal GPU acceleration
- Interactive Model Selection: Choose from various Whisper models with different size/quality tradeoffs
- Smart Model Management: Models are stored in `~/.models/whisper` by default (configurable in the script)
- Automatic Library Management: Dynamic libraries are properly linked regardless of where you place the folder
- Microphone Support: Process live audio from your microphone in real-time
- File Processing: Process audio files with the same interface
This project is based on whisper.cpp, with these key differences:
| Feature | whisper.cpp | whisper-portable |
|---|---|---|
| Installation | Requires manual compilation | Pre-compiled binaries included |
| GPU Acceleration | Requires manual Metal setup | Pre-configured Metal acceleration |
| Library Paths | Fixed at build time | Automatically adjusted for portability |
| Model Management | Manual download | Integrated download manager |
| User Interface | Command-line only | Interactive menu system |
| Target Platform | Cross-platform | macOS-optimized |
Simply clone the repository and run the script:
```bash
# Clone the repository
git clone https://github.com/innersanctumtech/whisper-portable.git

# Enter the directory
cd whisper-portable

# Make the script executable
chmod +x portable-whisper.sh

# Run the script
./portable-whisper.sh
```
No compilation or additional installation is required.
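If you want to confirm that the bundled binary is actually linked against Metal and the packaged libraries, `otool` (part of the Xcode command line tools) can list its dynamic dependencies. The binary path below is an assumption about this project's layout; adjust it to wherever the executable lives in your checkout:

```bash
# List the dylibs and frameworks the bundled binary expects.
# "./bin/whisper-stream" is an assumed path - adjust to your layout.
otool -L ./bin/whisper-stream | grep -Ei "metal|ggml|whisper"
```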
```bash
# Basic usage - will prompt for model selection
./portable-whisper.sh

# Specify parameters (threads, step size in ms, audio context length in ms)
./portable-whisper.sh -t 4 --step 1000 --length 10000

# Pass additional arguments through to whisper-stream
./portable-whisper.sh --language en
```
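Arguments the wrapper does not consume are handed straight to the bundled `whisper-stream` binary, so any flag that binary understands should work here too. As a rough illustration of that pass-through pattern (the paths and variable names are hypothetical, not taken from the actual script):

```bash
# Hypothetical pass-through: the wrapper resolves the model, then forwards
# all remaining arguments unchanged to the bundled binary.
exec "./bin/whisper-stream" -m "$MODEL_PATH" "$@"
```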
When running the script, you'll be presented with a selection menu:
```
Select a Whisper model:
Smaller models are faster but less accurate. English-only models (.en) are faster for English speech.

1) tiny.en (75MB) - Fastest, English-only, lower accuracy
2) tiny (75MB) - Fast, multilingual, lower accuracy
3) base.en (142MB) - Good balance, English-only, decent accuracy
4) base (142MB) - Good balance, multilingual, decent accuracy
5) small.en (466MB) - Better accuracy, English-only
6) small (466MB) - Better accuracy, multilingual
7) medium.en (1.5GB) - High accuracy, English-only
8) medium (1.5GB) - High accuracy, multilingual
9) large-v3 (3GB) - Best accuracy, multilingual
```
By default, all models are stored in `~/.models/whisper/` to avoid duplicating large model files across multiple installations. You can modify the `MODELS_DIR` variable in the `portable-whisper.sh` script to store models within the project directory instead if you want truly portable operation without writing to the home directory.
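If you prefer to pre-fetch a model yourself (or want to see roughly what the integrated downloader does), the ggml model files used by whisper.cpp projects are published in the `ggerganov/whisper.cpp` repository on Hugging Face. The sketch below uses that standard URL pattern and the default model directory; it is not taken from this project's script:

```bash
# Pre-download a model into the default location used by the script.
mkdir -p ~/.models/whisper
curl -L -o ~/.models/whisper/ggml-base.en.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

# For a fully self-contained install, MODELS_DIR in portable-whisper.sh could
# instead point inside the project folder, e.g.:
# MODELS_DIR="$(dirname "$0")/models"
```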
You can customize behavior with environment variables and additional flags:
```bash
# Skip the interactive model selection and use a specific model
MODEL=medium.en ./portable-whisper.sh

# Set voice activation mode (step=0)
./portable-whisper.sh --step 0 --vad-threshold 0.6

# Use a specific microphone (if you have multiple)
./portable-whisper.sh -c 1
```
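Internally, honoring the `MODEL` variable only needs a small guard around the menu. The snippet below is a hypothetical sketch of that logic (names other than `MODEL` and `MODELS_DIR` are invented), not the actual contents of `portable-whisper.sh`:

```bash
# Hypothetical: use $MODEL if set, otherwise fall back to the interactive menu.
if [ -n "${MODEL:-}" ]; then
    MODEL_FILE="$MODELS_DIR/ggml-${MODEL}.bin"
else
    MODEL_FILE="$(select_model_interactively)"   # invented helper name
fi
```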
If you encounter library loading issues, the script will automatically run `fix-libraries.sh` to correct the problem. If you still have issues, try running it manually:

```bash
./fix-libraries.sh
```
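For context, this kind of fix is normally done on macOS with `install_name_tool`, rewriting the absolute library paths recorded at build time into paths relative to the executable. A rough sketch of the idea (the library names and paths are illustrative, not the actual script contents):

```bash
# Rewrite an absolute build-time dylib path into one relative to the executable.
# Paths and names here are illustrative; the real script may differ.
install_name_tool -change /old/absolute/path/libwhisper.dylib \
  "@executable_path/libwhisper.dylib" ./bin/whisper-stream
```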
On Apple Silicon Macs, you should see significant performance improvements over CPU-only inference:
- M1: ~2-5x faster inference
- M2: ~2-6x faster inference
- M3: ~3-7x faster inference
This project includes:
- Pre-compiled binaries: The whisper.cpp executables with Metal support
- Library management scripts: To ensure portable operation
- User interface scripts: For easy model selection and usage
This project is distributed under the same MIT license as whisper.cpp. See the LICENSE file for details.
- whisper.cpp for the excellent C++ implementation
- OpenAI for the original Whisper model