SpeechDock
Speak and listen, from anywhere on your Mac.
What is SpeechDock?
Hear any text on your screen — Selected text, typed text, pasted content, or text captured via OCR from any screen region. If you can see it, SpeechDock can read it aloud.
Transcribe any audio on your Mac — Your voice through the microphone, system-wide audio, or sound from a specific app. If your Mac can hear it, SpeechDock can turn it into text in real time.
SpeechDock is a menu bar app that makes speech-to-text (STT) and text-to-speech (TTS) available from anywhere on your Mac through global hotkeys. It works immediately after installation, with no API keys or additional downloads required.
Key Features
Speech-to-Text (STT)
- Any audio source — Microphone, System Audio, or specific App Audio
- Real-time transcription — See text as you speak
- Subtitle mode — Floating overlay for presentations and meetings
- Quick transcription — Floating mic button for instant dictation
Text-to-Speech (TTS)
- Any text source — Type, paste, select in other apps, or OCR from screen
- Natural voices — Use macOS built-in or cloud provider voices
- Speed control — Adjust playback speed in real time (0.5x to 2.0x)
- Save audio — Export speech to audio files
Translation
- On-device translation — No API keys required (macOS 26+)
- 18+ languages — Translate between major languages
- TTS integration — Automatically read translated text
Cloud Providers (Optional)
- OpenAI — GPT-4o Transcribe, GPT-4o Mini TTS
- Google Gemini — Gemini 2.5 Flash (STT/TTS)
- ElevenLabs — Scribe v2 (STT), Eleven v3 (TTS)
- Grok (xAI) — Grok 2 (STT/TTS)
Requirements
- macOS 14.0 (Sonoma) or later
- Apple Silicon Mac (M1/M2/M3/M4)
Documentation
| Page | Description |
|---|---|
| Basic Features | Installation, STT, TTS, OCR, Subtitles, Shortcuts |
| Advanced Features | Cloud providers, API keys, File transcription |
| AppleScript | Automation and scripting |
License
SpeechDock is released under the Apache License 2.0.