SpeechCraft

A lightweight macOS menu‑bar utility that turns your voice into text and smart edits using the OpenAI API.

Features

🔑 Bring Your Own Key: You can use your own keys for OpenAI or Azure OpenAI and configure them directly in the app.
🎤 Push‑to‑Talk Transcription: Start/stop recording with Option+S, auto‑paste the transcript.
✂️ Smart Text Transformations: Copy selection, speak an instruction with Option+Shift+S, and replace text via GPT‑4o.
📋 Clipboard Integration: Seamlessly saves and restores your clipboard.
🖼️ Visual Context: Optionally include a screenshot for richer prompts (macOS 13+). Configure via Transcription → Include screenshots in GPT requests.
📝 Proofread Transcripts: Automatically proofread and clean up your transcriptions with GPT-4o with image context. Yes, you can now dictate code!
🔐 Flexible Deployment: Supports App Store (sandboxed) or Developer ID (hardened runtime) builds.
🚀 Minimal Footprint: Runs in the menu bar, no Dock icon.
💬 Modal Chat: Press Option+A to record an audio instruction (optionally with selected text & screenshot), then view the AI’s response in a modal dialog with Copy/Close buttons.
🍎 Script Automation: Press Option+D to copy the current selection (if any) and include a screenshot, record an audio command, then have GPT‑4o generate and execute AppleScript to automate your Mac, with a preview of the script and its execution result.

Requirements

macOS 12.0 (Monterey) or later
Xcode 14 or later (Swift 5.5+)
An OpenAI or Azure OpenAI subscription

Getting the App

If you just want to try SpeechCraft, download the latest DMG from the Releases page on GitHub and install it directly. No build tools are required:

https://github.com/esawtooth/SpeechCraft/releases

Developers who wish to build from source can follow the instructions below.

Installation

Clone the repo:

git clone https://github.com/yourorg/SpeechCraft.git
cd SpeechCraft

Open the Xcode project:

open speechcraft/speechcraft/SpeechCraft.xcodeproj

Select the SpeechCraft scheme, configure your Team under Signing & Capabilities, then Build & Run.

Configuration

Entitlements
- App Store: Enable App Sandbox (allow network, microphone).
- Outside Store: Disable sandbox, enable Hardened Runtime.
Info.plist
- NSMicrophoneUsageDescription: “Recording audio for transcription”
- NSCameraUsageDescription: “Screen recording for rich context”
- LSUIElement: YES (hides Dock icon)
Permissions (System Settings → Privacy & Security)
- Grant Accessibility & Microphone access to SpeechCraft.

Environment Variables (Xcode Scheme → Run → Arguments → Env Vars)

AZURE_OPENAI_ENDPOINT            = https://YOUR_RESOURCE.openai.azure.com/openai/deployments/YOUR_TRANSCRIBE_DEPLOYMENT/audio/transcriptions?api-version=2025-03-01-preview
AZURE_OPENAI_CHAT_ENDPOINT       = https://YOUR_RESOURCE.openai.azure.com/openai/deployments/YOUR_CHAT_DEPLOYMENT/chat/completions?api-version=2025-03-01-preview
AZURE_OPENAI_KEY                 = <your_api_key>

Usage

Option+S: Start/stop voice recording → automatic transcription & paste.
Option+Shift+S: Copy selection → record instruction → GPT‑4o applies changes → replaces text.
Option+A: Start/stop audio instruction recording (captures optional selected text & screenshot) → sends to AI chat → displays the response in a modal dialog with Copy/Close options.
Option+D: Copy selection (if any) & screenshot → record an audio command → GPT‑4o generates and executes AppleScript to automate your Mac tasks → shows the generated script and execution result.

🟢 Ready | 🔴 Recording | 🔵 Processing

Development

Fork the repo and create a feature branch.
Open in Xcode, implement your changes.
If you have a custom icon (e.g. icon.png), you can embed it into your built .app by running:
```
python3 apply_icon.py /path/to/SpeechCraft.app icon.png
```
Run & test locally.
Submit a pull request with clear commit messages.
Ensure SwiftLint and pre‑commit hooks pass.

License

This project is released under the MIT License. See LICENSE for details.

Name	Name	Last commit date
Latest commit History 25 Commits
.github/workflows	.github/workflows
docs	docs
speechcraft/speechcraft	speechcraft/speechcraft	8000
.gitignore	.gitignore
CHANGELOG.md	CHANGELOG.md
LICENSE	LICENSE
README.md	README.md
icon.png	icon.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SpeechCraft

Table of Contents

Features

Requirements

Getting the App

Installation

Configuration

Usage

Development

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

esawtooth/speechcraft

Folders and files

Latest commit

History

Repository files navigation

SpeechCraft

Table of Contents

Features

Requirements

Getting the App

Installation

Configuration

Usage

Development

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages