Speak.
It’s written

Private, on-device dictation for Apple Vision Pro.

Coming soon to visionOS

01 / Listen

One tap to talk.

A small glowing orb floats beside your work. It does one thing.

02 / Record · 00:12

Speak naturally. In any of 25 languages.

Talk the way you think — pause mid-thought, switch languages mid-sentence. The model detects all 25 on its own.

Maltese · Estonian · Latvian · Lithuanian · Slovenian · Slovak · Croatian · Bulgarian · Danish · Finnish · Greek · Hungarian · Czech · Swedish · Romanian · Dutch · Ukrainian · Portuguese · Polish · Russian · Italian · German · French · Spanish · English

Pause / Resume · Max 05:00

03 / Transcribe

Transcribed by the headset itself.

NVIDIA’s Parakeet v3 runs entirely on the headset. No server, no round-trip — your words never queue behind someone else’s.

04 / Copy

Copied. Paste it anywhere.

The transcript lands on your clipboard the moment it’s ready. Tap into any app and paste.

Privacy · By construction

Nothing you say leaves the room.

Audio is processed in memory and never written to disk. After the one-time model download, VisionSpeech works with the network off — and this page holds itself to the same standard.

Audio	Memory only
Disk writes	None
Accounts	None
Analytics	None
History	100 · Local
This page	No cookies · No trackers

Two ways to interact.

Studio.

The full desk — an editable transcript, a live waveform, and a searchable history of your last 100 dictations.

[Image: VisionSpeech in Orb style on Apple Vision Pro. A small glowing blue orb with a microphone, labeled "Tap to talk", floats above a desk.]

Orb.

A floating sphere for dictating into everything else. Tap, talk, tap — paste.

Specifications

VisionSpeech specifications
Model	NVIDIA Parakeet TDT 0.6B v3
Runtime	FluidAudio · CoreML
Languages	25, auto-detected
Model download	~500 MB, one time
Network after setup	None
Recording	Up to 5:00, pause / resume
History	Last 100, stored locally
Accounts · Analytics · Tracking	None
Requires	Apple Vision Pro · visionOS 26
