Whisper-Portable for macOS

A self-contained, portable version of whisper.cpp optimized for Apple Silicon through Metal GPU acceleration. This project provides a simple, user-friendly way to run Whisper speech recognition on macOS without complex installation or configuration.

Why This Project Exists

While whisper.cpp provides excellent performance as a C++ port of OpenAI's Whisper model, setting it up with proper GPU acceleration on macOS can be challenging. This project addresses several key issues:

  1. Metal GPU Acceleration: Pre-configured to use Apple's Metal framework, providing roughly 2-7x faster inference on Apple Silicon Macs (see Performance below)
  2. Self-Contained: All libraries and dependencies are packaged together, with no external installations required
  3. Easy Model Management: Simple model selection interface with automatic downloading
  4. Library Path Handling: Automatically fixes dynamic library paths for portability, as sketched just below this list
  5. User-Friendly Interface: Simple menu-driven interface requiring no command line expertise
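
For context, the library-path fix in item 4 relies on standard macOS tooling: a binary's dynamic library references can be inspected and rewritten to be relative to the executable. A minimal sketch of the general technique (the library path and name below are illustrative, not necessarily what fix-libraries.sh does):

# Inspect which dylibs a binary currently links against
otool -L ./whisper-stream

# Rewrite an absolute library path to one relative to the executable
# (paths here are illustrative)
install_name_tool -change /usr/local/lib/libwhisper.dylib \
    @executable_path/libwhisper.dylib ./whisper-stream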

Features

  • Metal-Accelerated: Optimized for Apple Silicon through Metal GPU acceleration
  • Interactive Model Selection: Choose from various Whisper models with different size/quality tradeoffs
  • Smart Model Management: Models are stored in ~/.models/whisper by default (configurable in the script)
  • Automatic Library Management: Dynamic libraries are properly linked regardless of where you place the folder
  • Microphone Support: Process live audio from your microphone in real-time
  • File Processing: Process audio files with the same interface

Comparison to whisper.cpp

This project is based on whisper.cpp, with these key differences:

| Feature | whisper.cpp | whisper-portable |
| --- | --- | --- |
| Installation | Requires manual compilation | Pre-compiled binaries included |
| GPU Acceleration | Requires manual Metal setup | Pre-configured Metal acceleration |
| Library Paths | Fixed at build time | Automatically adjusted for portability |
| Model Management | Manual download | Integrated download manager |
| User Interface | Command-line only | Interactive menu system |
| Target Platform | Cross-platform | macOS-optimized |

Installation

Clone the repository and run the launcher script:

# Clone the repository
git clone https://github.com/innersanctumtech/whisper-portable.git

# Enter the directory  
cd whisper-portable

# Make the script executable
chmod +x portable-whisper.sh

# Run the script
./portable-whisper.sh

No compilation or additional installation is required.

Usage

# Basic usage - will prompt for model selection
./portable-whisper.sh

# Specify parameters (threads, step size, length)
./portable-whisper.sh -t 4 --step 1000 --length 10000

# Pass additional arguments to whisper-stream
./portable-whisper.sh --language en
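
The file-processing mode mentioned under Features presumably takes a path to an audio file. Assuming the script forwards whisper.cpp's standard -f/--file flag to the underlying binary (an assumption; check the script's built-in help), a file transcription might look like:

# Transcribe a file instead of the microphone (flag forwarding is assumed)
./portable-whisper.sh -f recording.wav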

Available Models

When running the script, you'll be presented with a selection menu:

Select a Whisper model:
Smaller models are faster but less accurate. English-only models (.en) are faster for English speech.

1) tiny.en (75MB) - Fastest, English-only, lower accuracy
2) tiny (75MB) - Fast, multilingual, lower accuracy
3) base.en (142MB) - Good balance, English-only, decent accuracy
4) base (142MB) - Good balance, multilingual, decent accuracy
5) small.en (466MB) - Better accuracy, English-only
6) small (466MB) - Better accuracy, multilingual
7) medium.en (1.5GB) - High accuracy, English-only
8) medium (1.5GB) - High accuracy, multilingual
9) large-v3 (3GB) - Best accuracy, multilingual

Model Storage

By default, all models are stored in ~/.models/whisper/ to avoid duplicating large model files across multiple installations.

You can modify the MODELS_DIR variable in the portable-whisper.sh script to store models within the project directory instead if you want truly portable operation without writing to the home directory.
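
If you prefer to pre-fetch a model manually instead of using the built-in downloader, whisper.cpp's ggml models are published on Hugging Face. A hedged example, using the upstream ggml-<model>.bin naming convention and the default directory above (the exact filename this script expects may differ):

# Download base.en directly into the default models directory
mkdir -p ~/.models/whisper
curl -L -o ~/.models/whisper/ggml-base.en.bin \
    https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin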

Advanced Usage

You can set environment variables or pass extra flags to customize behavior:

# Skip the interactive model selection and use a specific model
MODEL=medium.en ./portable-whisper.sh

# Set voice activation mode (step=0)
./portable-whisper.sh --step 0 --vad-threshold 0.6

# Use a specific microphone (if you have multiple)
./portable-whisper.sh -c 1
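
To find the right index for -c, note that whisper.cpp's SDL-based stream tool normally prints the detected capture devices when it starts. If you have ffmpeg installed (it is not bundled with this project), you can also list macOS audio devices directly:

# List macOS capture devices via AVFoundation (requires ffmpeg)
ffmpeg -f avfoundation -list_devices true -i ""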

Troubleshooting

If you encounter library loading issues, the script automatically runs fix-libraries.sh to correct them. If issues persist, run it manually:

./fix-libraries.sh
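
Another issue worth ruling out on macOS: Gatekeeper quarantines downloaded pre-compiled binaries, which can surface as "cannot be opened" errors rather than library problems. Removing the quarantine attribute is a common general fix for downloaded binaries (not specific to this project):

# Remove the quarantine attribute from everything in the project folder
xattr -dr com.apple.quarantine .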

Performance

On Apple Silicon Macs, you should see significant performance improvements over CPU-only inference:

  • M1: ~2-5x faster inference
  • M2: ~2-6x faster inference
  • M3: ~3-7x faster inference
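
To verify the speedup on your own machine, time the same input with and without the GPU. This sketch assumes the script forwards whisper.cpp's standard -f and --no-gpu flags, and uses the MODEL variable from Advanced Usage to skip the interactive prompt (all three are assumptions about this script's argument handling):

# Rough benchmark: Metal vs. CPU-only on the same file
time MODEL=base.en ./portable-whisper.sh -f recording.wav
time MODEL=base.en ./portable-whisper.sh -f recording.wav --no-gpu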

How It Works

This project includes:

  1. Pre-compiled binaries: The whisper.cpp executables with Metal support
  2. Library management scripts: To ensure portable operation
  3. User interface scripts: For easy model selection and usage
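
An illustrative layout of how these pieces might fit together (the script and binary names come from this README; the library entry is an educated guess, not a verified listing):

whisper-portable/
├── portable-whisper.sh   # menu-driven launcher
├── fix-libraries.sh      # rewrites dylib paths for portability
├── whisper-stream        # pre-compiled binary with Metal support
└── *.dylib               # bundled whisper.cpp/ggml libraries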

License

This project is distributed under the same MIT license as whisper.cpp. See the LICENSE file for details.

Acknowledgments

  • whisper.cpp for the excellent C++ implementation
  • OpenAI for the original Whisper model
