You explained it perfectly on the call. The payment settlement flow, the edge cases, the spec. Your thinking was clear. Then you sat down to write it into a design doc and spent 45 minutes rewriting the same 800 words because talking is faster than typing when you're trying to articulate nuance. This is the moment most developers abandon dictation.
But something shifted in the last two years. Writing code stopped being the bottleneck. Explaining intent to an LLM became the bottleneck. The time you spend drafting a prompt, a design doc, a spec, a PR description is now longer than the time you spend actually coding. And you're running dictation tools built for a different workflow: people who talk to capture casual ideas, not people who talk to articulate complex systems thinking.
The moment you realize the tool was designed for someone else
Marcus is a backend engineer at a Series B fintech in Stockholm. He's spent the last eighteen months building payment settlement logic with Claude and Cursor. His workflow looks like this: design doc in Notion (voice), PR description (voice), code review comments (voice), incident postmortem (voice). He's not dictating transcripts. He's dictating technical explanation.
For the first few months, he used Wispr Flow on the free tier. It was fine until it wasn't. Halfway through a design doc explaining the reconciliation flow, he'd hit the 500-word cap. The thinking broke. The next morning, he'd have fragmented prose spread across three disconnected sections. He'd spend an hour rewriting what should have been thirty minutes of coherent thought. So he stopped dictating design docs and kept dictation for short Slack threads and code comments.
The word cap broke the use case.
Why is every free tool capped?
The answer is in the economics of cloud transcription. Services like Wispr Flow and Willow run transcription in the cloud. API calls cost money. The more words you dictate, the higher their compute bill. So they meter the free tier: 500 words per day, 1,000 words per day, whatever keeps their costs down. They're not selling you transcription. They're selling you a freemium funnel to paid plans. The math works if your users dictate casual voice memos. It breaks if they dictate technical documents.
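To make the metering logic concrete, here is a rough back-of-the-envelope sketch. The speaking rate and the per-minute price are placeholder assumptions, not figures from any vendor's price list, and the function names are made up for illustration. The point is only the shape of the curve: cloud cost grows with every word, local cost does not.

```python
# Illustrative numbers only. Neither constant comes from a real price list.
WORDS_PER_MINUTE = 150          # assumed average speaking rate
PRICE_PER_AUDIO_MINUTE = 0.006  # assumed cloud transcription price, in dollars


def cloud_cost(words: int,
               wpm: float = WORDS_PER_MINUTE,
               price: float = PRICE_PER_AUDIO_MINUTE) -> float:
    """Variable cost to the provider for transcribing `words` in the cloud."""
    minutes_of_audio = words / wpm
    return minutes_of_audio * price


def local_cost(words: int) -> float:
    """Variable cost to the provider when the model runs on the user's machine."""
    return 0.0


def monthly_free_tier_bill(users: int, words_per_day: int, days: int = 30) -> float:
    """What a cloud provider pays if every free user dictates up to the daily cap."""
    return users * days * cloud_cost(words_per_day)


# Under these assumptions, 10,000 free users at a 500-word daily cap cost
# the provider on the order of thousands of dollars a month:
# monthly_free_tier_bill(10_000, 500)
```

Raise the cap and the bill rises in proportion, which is why a cloud-backed free tier has to be metered somewhere.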
Whisper, OpenAI's open-source speech-to-text model, can run locally on your device. Whisper-large-v3 achieves 96.3% word accuracy on LibriSpeech. It's the same model family many cloud services build on. But when it runs on your machine, there's no API call, no audio streamed to a server, no variable cost per word. The only cost is the CPU or GPU cycles to process audio on your device.
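For a sense of what "runs locally" means in practice, here is a minimal sketch using the open-source openai-whisper Python package. It is not a description of how Recitey or any other product is implemented. The function names and the file name in the usage comment are hypothetical, and the package additionally needs ffmpeg installed to decode audio.

```python
def transcribe_locally(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file on this machine. No API call is made."""
    # Imported inside the function so the rest of the file loads even
    # when the package is not installed (pip install openai-whisper).
    import whisper

    # Weights are downloaded once on first use, then cached on disk.
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"].strip()


def word_count(text: str) -> int:
    """Count whitespace-separated words, e.g. to see how long a dictation ran."""
    return len(text.split())


# Hypothetical usage, assuming a recording named design_doc.wav exists:
# text = transcribe_locally("design_doc.wav", model_name="large-v3")
# print(word_count(text), "words, no cap")
```

Smaller models such as "base" trade accuracy for speed; "large-v3" is slower and benefits from a GPU. Either way, the length of the recording is bounded by your hardware and patience, not by a meter.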
This is why Recitey is structured differently. The free tier uses local Whisper. No metering, no word limit, no cap. You dictate as much as you need. The paid tier adds cloud-based rewrite and polish features (which do cost money), but the transcription itself is free and uncapped. The economics align with the use case: developers who need to capture long-form technical thinking without interruption.
What changes when the cap goes away
Marcus went back to dictating design docs. The first time he did, he dictated an 1,800-word specification about the payment lifecycle without hitting any limit. The thinking stayed coherent. He didn't wake up the next morning needing to stitch fragmented sections back together. The prose was raw (all dictated text is rough), but it was connected. The rewrite took 20 minutes, not 90.
The thing that changed wasn't speed. It was mental flow. When you know the cap is coming, you think differently. You edit as you speak. You break complex thoughts into smaller chunks. You hedge more. The prose becomes more cautious because you're managing the constraint in real time. When there's no cap, you think the way you think. The output is messier, but it's honest. It's closer to how you'd explain it on a call.
This is the specific moment dictation makes sense for technical work: when the tool gets out of the way of the thinking.
The trade-off you're actually making
The free tier is local transcription only. No cloud rewrite. No grammar correction. No punctuation cleanup. You get raw speech-to-text. If you want the text automatically polished into something closer to written prose, that's a paid feature. It requires cloud processing, so it's fair that it costs money.
This trade-off is correct for Marcus's use case. He cares about capturing the thinking without interruption. Cleanup happens the next morning, or during the code review where he's already reading it again. The value isn't in automated rewrite. The value is in uninterrupted capture.
The other trade-off: you need to be at your device. Local transcription means the audio is processed on your machine, so dictation happens wherever that machine is. If you want to dictate away from your desk, say while walking home on a spotty connection, a cloud service is the better fit. Wispr Flow can take audio recorded anywhere, sent over bad internet, and still transcribe it reliably. For Marcus, this hasn't mattered because his dictation happens at his desk, in Notion or Cursor or Slack.
The shape of this problem
The reason this matters right now is structural, not accidental. LLM-assisted development rewired how developers write. You're no longer writing code most of the time. You're writing intent: prompts, specs, design docs, code review comments that explain what you want the model to do. The bottleneck is now capturing your thinking. And the tool that speeds up that capture is one that doesn't interrupt it.
Most cloud-based dictation tools cap their free tiers because they were designed before this workflow existed, around per-word costs that long-form dictation strains. Recitey is built for it. Local Whisper, no metering, and it works across Slack, email, browsers, and every app that touches a text field on Windows. The technical constraint (local processing = zero variable cost) aligns with the product constraint (no word cap). This is what correct product design looks like: the technology matches the problem.
The moment Marcus stopped thinking about dictation as a dictation tool and started thinking about it as a technical-thinking capture tool, everything else followed.