Transcription for journalists and reporters
Interviews, press conferences, panel discussions, foreign-language sources, field recordings — speaker-labeled transcripts in minutes for $2 per hour. Verbatim word-for-word, not summary. Built for the writing job that comes after the recording.
Speaker labels for multi-source
A 3-person panel, a roundtable, a tense back-and-forth — diarisation separates voices automatically. Critical for accurate attribution when you're writing under deadline.
90+ languages, no English bias
Foreign-language interviews transcribe in the source language. No paraphrase, no auto-translation that flattens the meaning. Translate (and attest) separately if needed.
$2/hour, freelance-friendly
A 1-hour interview costs $2. A 4-hour day of recordings costs $8. No subscription that charges you between assignments. Expense it back to the publication if you can.
Verbatim with timestamps
Every word, every timestamp. Useful for fact-checking against the audio, marking quote moments, and citing exact phrasing in print or broadcast.
From recorded interview to publishable copy in 3 steps
Upload the recording
Drop the file from your recorder, phone, or Zoom call. MP3, WAV, M4A, MP4 — all supported. Up to 500 MB / 10 hours per file. Most 1-hour interviews come in well under 100 MB.
We transcribe with speaker labels
Auto-language detection, speaker diarisation, verbatim transcript with timestamps. A typical 1-hour interview is ready in 4–6 minutes.
Quote, fact-check, file
Search the transcript with Cmd-F to find the exact moment a quote happened. Click the timestamp to jump the audio there for verification. Copy clean quotes for your draft.
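If you prefer to work from an exported file rather than the editor, the same find-the-quote step can be sketched in a few lines of Python. This is a minimal sketch, not part of the product: it assumes an SRT export in the standard cue layout (index line, timing line, text lines), and the `find_quote` helper and the sample cues are hypothetical. Whether speaker labels appear inside SRT cue text is also an assumption.

```python
import re

# Matches the start time of a standard SRT timing line, e.g.
# "00:12:30,500 --> 00:12:35,000". Only HH:MM:SS is captured.
CUE_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),\d{3}\s*-->")


def find_quote(srt_text, phrase):
    """Return (start_time, cue_text) for every cue that contains phrase.

    Matching is case-insensitive and ignores line breaks inside a cue.
    A quote that spans two cues will not match; search for a shorter
    fragment in that case.
    """
    needle = " ".join(phrase.lower().split())
    hits = []
    # SRT cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIME.match(line.strip())
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(" ".join(lines[i + 1:]).split())
                if needle in text.lower():
                    hits.append((match.group(1), text))
                break
    return hits


# Hypothetical sample export, for illustration only.
sample = """1
00:00:01,000 --> 00:00:04,000
Speaker 1: Thanks for joining.

2
00:12:30,500 --> 00:12:35,000
Speaker 2: We never approved
that budget.
"""

for start, text in find_quote(sample, "never approved that budget"):
    print(start, text)
```

The returned start time tells you where to scrub in the audio, so the quote can still be verified against the recording before it goes into the draft.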
Multi-source interviews — and the on-the-record / off-the-record problem
Most journalist interviews are 1-on-1 — those are the easy case. The harder case is multi-source: a panel discussion, a press scrum, a roundtable, or a meeting where three sources are talking and you need to know who said what.
Speaker diarisation handles the basic separation, but two practical issues come up:
- Voice similarity: when two speakers have similar voices (same gender, similar pitch, similar accent), diarisation may merge them. The fix is to spot-check at the start of the interview — listen to who introduces themselves, label them in the editor, and the system propagates that labeling forward.
- Cross-talk and interruptions: aggressive overlap (interview subjects cutting each other off, mid-sentence interjections) is the hardest case for any transcription system. Verbatim accuracy drops a few points. Use the audio playback in the editor to verify any line that the transcript flags as low-confidence.
On-the-record / off-the-record marking is a journalist-specific workflow that we don't formalise in the editor today, but here's the pattern that works:
- Verbal markers at the moment of switch ("This is off the record", "Back on record") — these show up in the transcript.
- After transcription, search for the markers and bracket those sections in the editor with [OFF]...[ON].
- When exporting, you have a clearly labeled record of which sections were attributable.
Don't skip recording the verbal markers. Memory-based marking after the fact is unreliable, and the audio is the source of truth that protects you and the source.
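For long recordings, the search-and-bracket step can also be done on an exported plain-text transcript. The sketch below is one possible approach, not a feature of the editor: it assumes one transcript segment per line, and the `bracket_off_record` helper, the marker patterns, and the sample lines are all hypothetical. Adjust the patterns to the phrasing you actually use on tape.

```python
import re

# Spoken markers, following the examples in the text above.
OFF_MARKER = re.compile(r"\boff the record\b", re.IGNORECASE)
ON_MARKER = re.compile(r"\bback on (the )?record\b", re.IGNORECASE)


def bracket_off_record(lines):
    """Return the lines with [OFF] / [ON] inserted around off-record spans.

    The line carrying the "off the record" marker, and everything up to
    and including the "back on record" line, is treated as off the record.
    """
    out = []
    off = False
    for line in lines:
        if not off and OFF_MARKER.search(line):
            out.append("[OFF]")
            out.append(line)
            off = True
        elif off and ON_MARKER.search(line):
            out.append(line)
            out.append("[ON]")
            off = False
        else:
            out.append(line)
    if off:
        # No closing marker was spoken: conservatively keep the rest of
        # the transcript inside the off-record span and close the bracket.
        out.append("[ON]")
    return out


# Hypothetical transcript lines, for illustration only.
transcript = [
    "[00:10:02] Speaker 1: This is off the record.",
    "[00:10:05] Speaker 1: The vote was never scheduled.",
    "[00:11:40] Speaker 1: Back on record.",
    "[00:11:44] Speaker 2: So what happens next?",
]

for line in bracket_off_record(transcript):
    print(line)
```

Treat the output as a first pass only. Listen to each bracketed boundary in the audio before relying on it, since the recording remains the source of truth.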
Field recording, foreign-language interviews, and accuracy
Field journalism produces some of the worst recording conditions in audio: wind noise, traffic, crowds, sources who don't know how to hold a mic. Modern transcription handles these significantly better than tools from even 3 years ago, but expectations need to match reality.
- Wind noise: a foam windscreen on the mic helps more than any transcription tool can. Software noise reduction post-record helps further. Realistic accuracy on outdoor recordings: 80–88%.
- Crowd / café / restaurant: 80–90% on the foreground speaker. Background voices may bleed into the transcript briefly.
- Foreign-language source with English-language journalist: the transcript follows the audio. Code-switching is handled — if the journalist asks in English and the source replies in Spanish, both languages appear in the transcript correctly tagged.
- Accented English: Whisper-class models are robust to most English accents (Indian, Nigerian, Scottish, Australian, etc.). Some very thick regional accents may produce occasional misreads — verify against audio.
- Translation: we don't auto-translate. For cross-language quoting in print, get a verbatim transcript here, then translate as a separate step (and attest the translation, per your publication's standard).
What journalist transcription costs
$2 per hour, regardless of language or source count:
- $2: 1-hour 1-on-1 interview
- $4: 90-min panel discussion
- $8: 4-hour day of recordings
Per-minute services charge $0.25–$1.50/min. A 1-hour interview at $0.50/min = $30. At our $2/hr, the same interview costs $2, about what four minutes would cost at that per-minute rate.
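The arithmetic behind these figures can be checked with a short sketch. One assumption is made loudly here: billing is taken to round up to the next whole hour, which would be consistent with the $4 figure for a 90-minute panel but is not stated on this page. The function names are hypothetical.

```python
import math

RATE_PER_HOUR = 2.00  # flat rate quoted on this page


def flat_cost(minutes, rate_per_hour=RATE_PER_HOUR):
    """Cost at a flat hourly rate.

    Assumption (not confirmed by the page): partial hours are rounded
    up to the next whole hour.
    """
    return math.ceil(minutes / 60) * rate_per_hour


def per_minute_cost(minutes, rate_per_minute):
    """Cost at a per-minute rate, as charged by per-minute services."""
    return minutes * rate_per_minute


examples = [
    ("1-hour 1-on-1 interview", 60),
    ("90-min panel discussion", 90),
    ("4-hour day of recordings", 240),
]

for label, minutes in examples:
    flat = flat_cost(minutes)
    metered = per_minute_cost(minutes, 0.50)
    print(f"{label}: ${flat:.2f} flat vs ${metered:.2f} at $0.50/min")
```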
Frequently asked questions
Is this verbatim or summarised?
Verbatim. Word-for-word with timestamps. We don't paraphrase, summarise, or "clean up" the transcript. Filler words ("um", "you know") are preserved as recorded — useful for accurate attribution and for analysing how a source is hedging.
How accurate are transcripts of foreign-language interviews?
Comparable to English on the same audio quality. We support 90+ languages natively, including all major European languages (Spanish, French, German, Portuguese, Italian, Norwegian, Swedish, Danish, Dutch, Polish, etc.), Arabic, Mandarin, Japanese, Hindi, and many smaller languages. Transcript stays in the source language; translate separately and attest.
Can I rename "Speaker 1" to the source's actual name?
Yes. Open the transcript in the editor, click any speaker label, type the name. The change applies to all instances of that speaker in the transcript and in exports.
How do I jump from text to the moment in the audio?
Each segment in the editor has a timestamp link. Click it to jump the audio playback to that exact moment. Useful for spot-checking a quote against the recording before publication.
How private is this for sensitive sources?
Audio is processed in EU data centres and deleted from our servers 90 days after your last sign-in. We don't train models on your audio. OpenAI (transcription subprocessor) operates under a zero-retention agreement. Full disclosure at /trust. If you're working with at-risk sources, you may also want to scrub identifying metadata from the file before upload.
Can I export the transcript as a formatted Word document?
Yes. Word export includes speaker labels and timestamps. Plain text and SRT (for subtitles, useful for video features) are also available.
Does this work for podcast journalism?
Yes. Many of our customers are podcast producers using transcripts for show-note generation, blog repurposing, and editorial review. Speaker labels make panel-format podcasts much easier to repurpose.
Transcribe your next interview
Speaker-labeled, verbatim, $2 per hour. The reading-and-quoting tool, not a summary tool.
Start Transcribing ($2/hr)
Free to sign up · Pay only when you transcribe