Studio.
The full desk — an editable transcript, a live waveform, and a searchable history of your last 100 dictations.
Private, on-device dictation for Apple Vision Pro.
01 / Listen
A small glowing orb floats beside your work. It does one thing.
02 / Record · 00:12
Talk the way you think: pause mid-thought, switch languages mid-sentence. The model detects all 25 supported languages on its own.
Pause / Resume · Max 05:00
03 / Transcribe
NVIDIA’s Parakeet v3 runs entirely on the headset. No server, no round-trip — your words never queue behind someone else’s.
04 / Copy
The transcript lands on your clipboard the moment it’s ready. Tap into any app and paste.
Privacy · By construction
Audio is processed in memory and never written to disk. After the one-time model download, VisionSpeech works with the network off — and this page holds itself to the same standard.
A floating sphere for dictating into everything else. Tap, talk, tap — paste.
| Spec | Detail |
|---|---|
| Model | NVIDIA Parakeet TDT 0.6B v3 |
| Runtime | FluidAudio · CoreML |
| Languages | 25, auto-detected |
| Model download | ~500 MB, one time |
| Network after setup | None |
| Recording | Up to 5:00, pause / resume |
| History | Last 100, stored locally |
| Accounts · Analytics · Tracking | None |
| Requires | Apple Vision Pro · visionOS 26 |