A lightweight macOS menu‑bar utility that turns your voice into text and smart edits using the OpenAI API.
- 🔑 Bring Your Own Key: You can use your own keys for OpenAI or Azure OpenAI and configure them directly in the app.
- 🎤 Push‑to‑Talk Transcription: Start/stop recording with Option+S, auto‑paste the transcript.
- ✂️ Smart Text Transformations: Copy selection, speak an instruction with Option+Shift+S, and replace text via GPT‑4o.
- 📋 Clipboard Integration: Seamlessly saves and restores your clipboard.
- 🖼️ Visual Context: Optionally include a screenshot for richer prompts (macOS 13+). Configure via Transcription → Include screenshots in GPT requests.
- 📝 Proofread Transcripts: Automatically proofread and clean up your transcriptions with GPT-4o with image context. Yes, you can now dictate code!
- 🔐 Flexible Deployment: Supports App Store (sandboxed) or Developer ID (hardened runtime) builds.
- 🚀 Minimal Footprint: Runs in the menu bar, no Dock icon.
- 💬 Modal Chat: Press Option+A to record an audio instruction (optionally with selected text & screenshot), then view the AI’s response in a modal dialog with Copy/Close buttons.
- 🍎 Script Automation: Press Option+D to copy the current selection (if any) and include a screenshot, record an audio command, then have GPT‑4o generate and execute AppleScript to automate your Mac, with a preview of the script and its execution result.
- macOS 12.0 (Monterey) or later
- Xcode 14 or later (Swift 5.5+)
- An OpenAI or Azure OpenAI subscription
If you just want to try SpeechCraft, download the latest DMG from the Releases page on GitHub and install it directly. No build tools are required:
https://github.com/esawtooth/SpeechCraft/releases
Developers who wish to build from source can follow the instructions below.
- Clone the repo:
git clone https://github.com/yourorg/SpeechCraft.git cd SpeechCraft
- Open the Xcode project:
open speechcraft/speechcraft/SpeechCraft.xcodeproj
- Select the SpeechCraft scheme, configure your Team under Signing & Capabilities, then Build & Run.
- Entitlements
- App Store: Enable App Sandbox (allow network, microphone).
- Outside Store: Disable sandbox, enable Hardened Runtime.
- Info.plist
NSMicrophoneUsageDescription
: “Recording audio for transcription”NSCameraUsageDescription
: “Screen recording for rich context”LSUIElement
:YES
(hides Dock icon)
- Permissions (System Settings → Privacy & Security)
- Grant Accessibility & Microphone access to SpeechCraft.
- Environment Variables (Xcode Scheme → Run → Arguments → Env Vars)
AZURE_OPENAI_ENDPOINT = https://YOUR_RESOURCE.openai.azure.com/openai/deployments/YOUR_TRANSCRIBE_DEPLOYMENT/audio/transcriptions?api-version=2025-03-01-preview AZURE_OPENAI_CHAT_ENDPOINT = https://YOUR_RESOURCE.openai.azure.com/openai/deployments/YOUR_CHAT_DEPLOYMENT/chat/completions?api-version=2025-03-01-preview AZURE_OPENAI_KEY = <your_api_key>
- Option+S: Start/stop voice recording → automatic transcription & paste.
- Option+Shift+S: Copy selection → record instruction → GPT‑4o applies changes → replaces text.
- Option+A: Start/stop audio instruction recording (captures optional selected text & screenshot) → sends to AI chat → displays the response in a modal dialog with Copy/Close options.
- Option+D: Copy selection (if any) & screenshot → record an audio command → GPT‑4o generates and executes AppleScript to automate your Mac tasks → shows the generated script and execution result.
🟢 Ready | 🔴 Recording | 🔵 Processing
- Fork the repo and create a feature branch.
- Open in Xcode, implement your changes.
- If you have a custom icon (e.g.
icon.png
), you can embed it into your built.app
by running:python3 apply_icon.py /path/to/SpeechCraft.app icon.png
- Run & test locally.
- Submit a pull request with clear commit messages.
- Ensure SwiftLint and pre‑commit hooks pass.
This project is released under the MIT License. See LICENSE for details.