Focus Group Transcription Guide: Recording to Coded Output

Practical playbook for moderators and analysts — recording setup, speaker key, anonymization, and how to deliver a transcript that drops straight into NVivo, Atlas.ti, or Excel.

Lessrec editorial · May 1, 2026 · 8 min read

Most focus group transcripts arrive at the analyst’s desk in worse shape than they need to be. Speakers labelled “Participant 1 / Participant 2 / Participant 3” with no demographics. Crosstalk merged into one speaker. Anonymization done sloppily so a regex can re-identify everyone. This guide is the playbook a research lead can hand to the moderator before the session, the recordist before the cameras roll, and the transcription vendor before they listen to the first minute.

Before the session — three things to write down

Before participants enter the room, the moderator should record three things on paper that will save the transcriber hours:

Speaker key. A table mapping each participant to a code (P1, P2, P3...) plus their demographic profile (age, role, segment) and seat position. Don’t share names with the transcriber unless the protocol allows it.
Glossary. Brand names, product names, technical terms, internal jargon. Pre-send this to the transcriber so spellings are consistent.
Discussion guide. The questions in the order you plan to ask them. Helps the transcriber follow context when the audio is unclear.
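If the speaker key is also kept electronically, a small script can hold it in a form that is easy to hand to the transcriber. The following is a minimal sketch, not a prescribed format: the field names (`age_band`, `role`, `segment`, `seat`) and the sample participants are illustrative assumptions, and the only rule it enforces is that each code is unique.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SpeakerKeyEntry:
    code: str      # e.g. "P1"; the only identifier the transcriber sees
    age_band: str  # hypothetical demographic fields; adapt to the study
    role: str
    segment: str
    seat: int      # seat position (assumed: counted from the moderator)


def speaker_key_csv(entries):
    """Render the speaker key as CSV text, rejecting duplicate codes."""
    codes = [entry.code for entry in entries]
    if len(codes) != len(set(codes)):
        raise ValueError("speaker codes must be unique")
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=[f.name for f in fields(SpeakerKeyEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buf.getvalue()


# Invented sample data; note that no participant names appear anywhere.
key = [
    SpeakerKeyEntry("P1", "25-34", "nurse", "new user", 1),
    SpeakerKeyEntry("P2", "45-54", "clinic manager", "lapsed user", 2),
]
print(speaker_key_csv(key))
```

Leaving names out of the structure altogether means the file can be shared with the transcriber without a separate redaction step.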

Recording setup that pays for itself

The single biggest predictor of transcription accuracy is the recording chain. Two recommendations beat everything else:

Lavalier per speaker when budget allows. Six people, six lavs, six channels — the transcriber separates speakers trivially.
Two backup recorders (Zoom H6, Tascam DR-40, or even iPhone Voice Memos in “Lossless” mode) at different points in the room. They catch speech the lavs miss and give the transcriber a second source to check when one recording is unclear.

If you can’t do per-speaker mics, place one tabletop omni in the center plus two boundary mics at the ends of the table. Avoid a single mic at the moderator’s end: the voices farthest from it become unintelligible. For a deeper dive into formats, see best audio format for accurate transcription.

Moderator script for the recorder

At the very start of the recording, before the first question, the moderator asks each participant in turn, pausing for every answer:

“Going around the room, please introduce yourself: your participant code, your role, and one sentence about your experience with [topic].”

These 60 seconds give the transcriber a voice sample for each speaker, keyed to a code. From that moment on, speaker ID is a matching problem the transcriber can solve rather than guess at.

Equally important: when the moderator calls on a participant, they should use the code, as in “P3, can you say more about that?” That way the audio itself carries the speaker ID through the whole session.

Transcription scope — what to ask for

Specify each of these in the order you place with the vendor:

Verbatim or smoothed? For coding, verbatim. For executive summaries, smoothed. Most research projects need both — verbatim master + smoothed extracts.
Crosstalk handling. Mark overlapping speech as [overlapping] or transcribe each voice separately if recording quality allows.
Inaudible markers. Require [inaudible HH:MM:SS] at every gap.
Timestamps. Every 30 seconds and at every speaker turn.
Speaker labels. Keep codes (P1, P2) in the deliverable; you map back to demographics in your analysis tool, not in the transcript itself.
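The requirements above can be checked mechanically when the transcript comes back. The sketch below assumes a plain-text deliverable with one line per timestamped turn, written as `[HH:MM:SS] P3: text`, with `MOD` as the moderator's label. That layout, the `MOD` label, and the function name `check_transcript` are assumptions for illustration, not a vendor standard. It also assumes long turns are split into a new line at each 30-second stamp.

```python
import re

# Assumed line layout: "[HH:MM:SS] P3: text" (or MOD for the moderator).
TURN = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] (MOD|P\d+): (.+)$")
# An inaudible marker with no timestamp inside the brackets.
BARE_INAUDIBLE = re.compile(r"\[inaudible\]")
MAX_GAP_SECONDS = 30


def check_transcript(lines):
    """Return a list of human-readable problems found in the transcript."""
    problems = []
    previous = None
    for number, line in enumerate(lines, start=1):
        match = TURN.match(line)
        if match is None:
            problems.append(f"line {number}: missing timestamp or speaker code")
            continue
        hours, minutes, seconds = (int(g) for g in match.group(1, 2, 3))
        now = hours * 3600 + minutes * 60 + seconds
        if previous is not None and now - previous > MAX_GAP_SECONDS:
            problems.append(
                f"line {number}: {now - previous}s since previous timestamp"
            )
        previous = now
        if BARE_INAUDIBLE.search(match.group(5)):
            problems.append(f"line {number}: [inaudible] marker without a timestamp")
    return problems


sample = [
    "[00:00:05] MOD: P3, can you say more about that?",
    "[00:00:12] P3: It was fine until [inaudible] happened.",
    "[00:01:10] P1: I agree.",
    "P2: Same here.",
]
for problem in check_transcript(sample):
    print(problem)
```

A check like this only confirms that the agreed markers are present. It says nothing about whether the words or the speaker attributions are correct.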

Lessrec’s focus group transcription service includes all of the above by default at the multi-speaker tier.

Anonymization scheme that survives audit

If the protocol promises participants anonymity, the transcript must support it. Common failure: substituting names with “[NAME]” but leaving employer, neighborhood, or job title intact — easy to re-identify in a small segment study.

A robust scheme strips:

Names of participants and people they mention
Employer and previous employers
Neighborhoods, street names, specific addresses
Specific dates that could pin a personal event
Distinctive medical conditions, religious affiliations, named family members

Replace each consistently across the transcript: “[employer]”, “[city]”, “[partner]”. Don’t use unique tags per instance — analysts need to track patterns across mentions.
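Consistent tagging is easy to sketch in code. The example below assumes the research team has already compiled a list of identifying terms and the category tag each should receive. The `anonymize` function and the sample terms are hypothetical. Literal matching like this only catches terms someone has listed, so it complements a human read-through and does not replace one.

```python
import re


def anonymize(text, replacements):
    """Replace each listed identifying term with its category tag.

    `replacements` maps a literal term to a tag such as "[employer]".
    Longer terms are tried first so that "Acme Health" is not
    half-replaced by a shorter entry for "Acme".
    """
    terms = sorted(replacements, key=len, reverse=True)
    if not terms:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): tag for term, tag in replacements.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Invented sample terms; every mention of an employer gets the same tag.
replacements = {
    "Acme Health": "[employer]",
    "Acme": "[employer]",
    "Springfield": "[city]",
    "Dana": "[partner]",
}
text = "I moved to Springfield when Acme Health hired me. Dana still works at Acme."
print(anonymize(text, replacements))
```

Because every employer mention maps to the same `[employer]` tag, an analyst can still count and compare mentions across the transcript.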

Output formats your analysis tool actually wants

Tool	Best format	Notes
NVivo	DOCX with speaker headers	Each `P3:` turn becomes auto-coded by speaker.
Atlas.ti	RTF or DOCX	Same as NVivo. Avoid PDF — locks the text.
Dedoose	DOCX or TXT	Quotation framework needs paragraph breaks at speaker turns.
Excel/Sheets	CSV one row per speaker turn	Columns: timestamp, speaker code, text, segment metadata.
LLM analysis	JSON	`{ timestamp, speaker, text, demographic_block }`
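If the vendor delivers only one structured format, the CSV and JSON rows in the table can be derived from the same list of speaker turns. The sketch below is a hedged illustration: the column names follow the table, while the `to_csv` and `to_json` helpers and the sample data are assumptions, and the demographic lookup is taken to be the speaker key the moderator prepared.

```python
import csv
import io
import json

FIELDS = ["timestamp", "speaker", "text", "segment"]


def to_csv(turns):
    """One row per speaker turn, suitable for Excel or Sheets."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(turns)
    return buf.getvalue()


def to_json(turns, demographics):
    """Attach each speaker's demographic block to every turn."""
    records = [
        {
            "timestamp": turn["timestamp"],
            "speaker": turn["speaker"],
            "text": turn["text"],
            "demographic_block": demographics.get(turn["speaker"], {}),
        }
        for turn in turns
    ]
    return json.dumps(records, indent=2, ensure_ascii=False)


# Invented sample turn and speaker key.
turns = [
    {"timestamp": "00:00:12", "speaker": "P3",
     "text": "It was fine, mostly.", "segment": "new user"},
]
demographics = {"P3": {"age_band": "25-34", "role": "nurse"}}

print(to_csv(turns))
print(to_json(turns, demographics))
```

Joining demographics at export time keeps the master transcript limited to speaker codes, so only the analysis copy carries the profile data.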

Order every format you might need at submission. Re-exporting later costs less than redoing the order, but most vendors charge for each additional format. Lessrec ships up to three formats per order at the multi-speaker tier; additional formats are $5 each.

FAQ

How many speakers can be reliably separated?

With a clean recording (one lavalier per speaker, or close-miking), eight participants can be separated reliably. With tabletop omnis at room scale, four is the practical cap before crosstalk overwhelms speaker ID.

Should I record video or just audio?

Record video when you can: it helps the transcriber resolve who’s speaking when voices are similar.

How long should the transcript take?

Standard turnaround for a 90-minute focus group is 24 to 36 hours.
