HIPAA-compliant telehealth transcription in 2026: Zoom for Healthcare, Doxy.me, VSee, and the Whisper path
Telehealth visits are 23% of all outpatient encounters in 2026 (KFF data) and that share is still growing in behavioral health, primary care, dermatology, and post-discharge follow-ups. Every one of those visits generates audio that someone has to convert into a clinical note. Most clinicians do it the slow way — review the recording (or rely on memory) and type the note manually after the visit. Some use the platform’s built-in transcription if it has one. A growing minority pipe the audio into Whisper or a brand-name AI scribe — and that is where HIPAA compliance gets interesting.
This post covers what HIPAA actually requires for telehealth transcripts (it’s less than most vendors imply), what each major telehealth platform offers, and the cheapest compliant DIY path using Whisper.
What HIPAA actually requires for telehealth transcripts
The Privacy and Security Rules (45 CFR Part 164) cover protected health information (PHI), including the audio of a clinical encounter, the transcript of that audio, and any structured note derived from it. To process PHI through any third-party vendor you need:
- A Business Associate Agreement (BAA) with that vendor (45 CFR §164.502(e)). The BAA legally binds them to safeguard the PHI per the Security Rule.
- Encryption in transit and at rest for the audio and transcript (Security Rule §164.312).
- Access control — only authorized clinicians can read the transcript (§164.312(a)).
- Audit logging — record who accessed what and when (§164.312(b)).
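- Retention policy consistent with state medical record laws (HIPAA sets no retention period for medical records and leaves that to the states; California, for example, generally requires 7 years for adult records, with longer, age-dependent periods for minors).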
What HIPAA does not require: that the AI vendor be US-based, that the transcript be reviewed by a human before signing, or that you use a vendor with HITRUST certification (HITRUST is a voluntary certification framework, not a legal requirement). Many vendors imply otherwise to upsell premium plans. Those extras can be useful, but they are not legally mandated.
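The required items above can be turned into a quick vendor checklist. This is a minimal sketch, not a compliance tool: `VendorProfile`, its field names, and `missing_safeguards` are hypothetical names made up for illustration, and the booleans stand for whatever the vendor has confirmed to you in writing. Retention is left out because it is your policy rather than a vendor attribute.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VendorProfile:
    """Hypothetical summary of what a vendor has confirmed in writing."""
    name: str
    baa_signed: bool
    encrypts_in_transit: bool
    encrypts_at_rest: bool
    role_based_access: bool
    audit_logging: bool


# Field name -> the safeguard it stands for in the list above.
REQUIRED_SAFEGUARDS = {
    "baa_signed": "Business Associate Agreement (§164.502(e))",
    "encrypts_in_transit": "Encryption in transit (§164.312)",
    "encrypts_at_rest": "Encryption at rest (§164.312)",
    "role_based_access": "Access control (§164.312(a))",
    "audit_logging": "Audit logging (§164.312(b))",
}


def missing_safeguards(vendor: VendorProfile) -> list[str]:
    """Return the required safeguards the vendor has not confirmed."""
    return [
        label
        for field, label in REQUIRED_SAFEGUARDS.items()
        if not getattr(vendor, field)
    ]


if __name__ == "__main__":
    vendor = VendorProfile("ExampleScribe", True, True, True, True, False)
    for gap in missing_safeguards(vendor):
        print(f"{vendor.name}: missing {gap}")
```

An empty list means every item on the required list is accounted for. It says nothing about the optional extras (HITRUST, US-only hosting) that some vendors bundle into premium plans.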
Telehealth platform built-in transcription matrix
| Platform | Built-in transcript | BAA | Pricing | Notes |
|---|---|---|---|---|
| Zoom for Healthcare | Yes (audio + transcript via Zoom AI Companion) | Yes | $199/mo per host (Healthcare tier) | Transcription via Zoom AI Companion as add-on; transcript stored in Zoom cloud |
| Doxy.me | No native transcript (record to local) | Yes (Pro+ tier) | $35-50/mo per provider | You handle the recording and transcript path yourself |
| VSee | Optional via integration | Yes | $49-149/mo per provider | Integrates with select EHRs; transcript via partner stack |
| Spruce Health | No transcript (messaging-first) | Yes | $50-200/mo per seat | Async messaging primary, video secondary |
| SimplePractice Telehealth | No native transcript | Yes (with Telehealth Plus) | $59-99/mo per clinician | Behavioral health focus; connect external scribe |
| Mend / Updox / eVisit | Varies; mostly no native transcript | Yes | $29-99/mo per provider | EHR-tied; transcription via integration |
| FaceTime / Google Meet / Teams | Recording only, no clinical transcript | Teams: yes; Google Meet: only under a Google Workspace BAA; FaceTime: no | $0-15/mo | Consumer versions carry no BAA; Teams and Meet are covered only under an enterprise agreement that includes one |
The honest summary: only Zoom for Healthcare offers usable built-in transcription out of the box at the consumer-friendly tier. Everyone else expects you to bring your own transcription stack and connect it via recording export. That is the gap a per-minute Whisper pipeline fills, starting at $0.05/min, with BAA-covered tiers priced higher.
The DIY telehealth transcription pipeline
For solo and small-practice telehealth where the platform doesn’t bundle transcription, a workable HIPAA-compliant stack:
- Capture: record the telehealth session locally on your laptop via the platform’s record button (Doxy.me Pro, Zoom Healthcare, and VSee all support local recording). Do not upload the file to consumer cloud storage (Dropbox, personal Google Drive, iCloud).
- Encrypt at rest: full-disk encryption (FileVault on Mac, BitLocker on Windows) generally satisfies the at-rest encryption specification in §164.312(a)(2)(iv). Don’t leave the .m4a sitting in an unencrypted Downloads folder.
- Transit to transcription: upload to your transcription vendor over HTTPS. LessRec uses TLS 1.3 in transit, S3-compatible encrypted storage at rest, and signs BAAs on the HIPAA tier.
- Transcript → structured note: the transcript comes back as .txt or .docx. For SOAP/HPI structuring, run it through the Claude API (HIPAA addendum available on the Anthropic API), GPT-4o (Enterprise BAA), or Gemini (Healthcare API BAA).
- Store + retain: the final signed note goes into your EHR (which already has BAA and retention covered). Keep the source audio and raw transcript encrypted for 90-365 days for revisions, then delete them per your retention policy.
- Audit log: your EHR logs sign and edit events. The transcription vendor logs access. Most BAAs require you to be able to produce both on request.
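The transit, retention, and audit steps above can be sketched with the Python standard library alone. Treat everything here as an assumption for illustration: the function names, the 90-day window (the low end of the range above), the file suffixes, and the JSON-lines log format are not taken from any vendor. The local log complements the EHR and vendor logs and does not replace them. Deletion here is a plain `unlink` on a disk that is assumed to be encrypted already, not a secure wipe.

```python
from __future__ import annotations

import json
import ssl
import time
from pathlib import Path

RETENTION_DAYS = 90  # assumption: low end of the 90-365 day window
PURGE_SUFFIXES = {".m4a", ".txt", ".docx"}  # assumption: audio + transcript files


def tls13_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def log_event(log_path: Path, actor: str, action: str, target: str) -> None:
    """Append one audit record (who, what, when) as a JSON line."""
    record = {"ts": time.time(), "actor": actor, "action": action, "target": target}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def purge_expired(folder: Path, log_path: Path, actor: str,
                  now: float | None = None) -> list[str]:
    """Delete audio/transcript files older than RETENTION_DAYS, logging each deletion."""
    now = time.time() if now is None else now
    cutoff = now - RETENTION_DAYS * 86400
    removed = []
    for path in sorted(folder.iterdir()):
        if path.suffix in PURGE_SUFFIXES and path.stat().st_mtime < cutoff:
            path.unlink()
            log_event(log_path, actor, "delete", path.name)
            removed.append(path.name)
    return removed
```

The context returned by `tls13_context()` can be passed to any standard-library HTTPS client (for example as the `context` argument of `urllib.request.urlopen`) so an upload fails rather than silently downgrading. `purge_expired` is meant to run on a schedule against the folder that holds source recordings.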
Cost comparison: telehealth transcription per visit
| Path | Per 20-min visit | Per provider/year (1,500 telehealth visits) | Notes |
|---|---|---|---|
| Zoom AI Companion (Healthcare) | ~$0 incremental | $2,388 (Zoom seat itself) | Bundled if you’re on Healthcare tier |
| Brand-name AI scribe (Suki/Heidi) | ~$1.50 | $1,320-3,588 | Per-provider seat regardless of volume |
| Human virtual scribe | $8-14 | $12,000-21,000 | Real human reviewing your recordings |
| DIY Whisper + Claude (LessRec) | $1.00-1.50 | $1,500-2,250 | $0.05/min × 20 min audio + $0.001 LLM call |
For solo and 2-3 provider practices, DIY Whisper is the cheapest compliant path. For 5+ providers on Zoom Healthcare, the bundled Zoom AI Companion is hard to beat — you’re already paying the seat cost. For high-volume hybrid practices (telehealth + in-person), brand-name AI scribes that work across both modalities make sense.
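The arithmetic behind the per-minute row is easy to re-run with your own numbers. This is a minimal sketch using the table's assumptions (20-minute visits, 1,500 visits a year, a $0.001 LLM call); the function names are made up for illustration. The rate is a parameter, so the $0.08-0.12/min HIPAA-tier range quoted in the LessRec section of this post can be substituted for the $0.05/min standard rate.

```python
def per_visit_cost(rate_per_min: float, minutes: float = 20.0,
                   llm_call: float = 0.001) -> float:
    """Transcription cost for one visit plus one LLM structuring call."""
    return rate_per_min * minutes + llm_call


def per_year_cost(rate_per_min: float, visits: int = 1500,
                  minutes: float = 20.0, llm_call: float = 0.001) -> float:
    """Annual cost for one provider at a given visit volume."""
    return per_visit_cost(rate_per_min, minutes, llm_call) * visits


def flat_seat_per_year(monthly_price: float) -> float:
    """Annual cost of a flat per-seat plan, independent of visit volume."""
    return monthly_price * 12


if __name__ == "__main__":
    for label, rate in [("standard", 0.05), ("HIPAA low", 0.08), ("HIPAA high", 0.12)]:
        print(f"{label}: ${per_visit_cost(rate):.2f}/visit, "
              f"${per_year_cost(rate):,.0f}/year")
    print(f"Zoom Healthcare seat: ${flat_seat_per_year(199):,.0f}/year")
```

Per-minute pricing scales with volume while seat pricing does not, so the break-even point moves with how many visits you actually run and which rate tier applies.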
State law gotchas worth knowing
HIPAA is a federal floor; state laws often go further. The ones that matter most for telehealth transcripts in 2026:
- California two-party consent (Penal Code §632, part of CIPA). You must obtain the patient’s consent before recording the audio of a telehealth visit. Most platforms display a recording disclosure when the host hits Record. If you process audio for transcription, obtain explicit consent (a verbal “May I record this for documentation?” with the yes/no answer captured).
- Florida, Maryland, Massachusetts, New Hampshire, Pennsylvania, and Washington, among others, also require two-party consent. The same recording disclosure rule applies.
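- California AB 3030 (2024). Requires a disclosure when generative AI is used to produce patient communications that contain clinical information, unless a licensed provider reads and reviews the communication first. An AI-drafted visit summary or message sent to the patient without that review falls under this. The disclosure language is brief but mandatory in CA.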
- 42 CFR Part 2 (federal) for substance-use disorder (SUD) treatment programs. Stricter than HIPAA — consent required to share even with other providers. SUD telehealth transcripts need extra-careful BAA chain auditing.
- Behavioral health state laws. Some states (NY, CA) have additional protections for psychotherapy notes that exceed HIPAA. Worth a 30-minute call with your malpractice carrier before piping behavioral telehealth into AI transcription.
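One way to keep these rules from being forgotten is a small pre-visit check keyed on the patient's state. This is a rough sketch under stated assumptions: the state set contains only the states named in this post and is not a complete legal list, keying on the patient's location is a simplification, and `pre_visit_requirements` is a hypothetical helper, not part of any platform.

```python
from __future__ import annotations

# Assumption: only the states named in this post; other states may apply too.
ALL_PARTY_CONSENT_STATES = {"CA", "FL", "MD", "MA", "NH", "PA", "WA"}
AI_DISCLOSURE_STATES = {"CA"}  # AB 3030


def pre_visit_requirements(patient_state: str, sud_program: bool = False) -> list[str]:
    """Return the extra documentation steps to complete before recording."""
    state = patient_state.strip().upper()
    steps = []
    if state in ALL_PARTY_CONSENT_STATES:
        steps.append("Capture explicit recording consent (yes/no) before recording")
    if state in AI_DISCLOSURE_STATES:
        steps.append("Include the AB 3030 generative-AI disclosure where it applies")
    if sud_program:
        steps.append("Confirm 42 CFR Part 2 consent before sharing the transcript")
    return steps


if __name__ == "__main__":
    for step in pre_visit_requirements("CA", sud_program=True):
        print("-", step)
```

An empty result does not mean recording is unrestricted. It only means none of the rules encoded here were triggered.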
Specialty fit for AI telehealth transcription in 2026
| Specialty | AI fit | Why |
|---|---|---|
| Dermatology (telederm) | Excellent | Visit largely visual; voice content is low-volume + structured |
| Primary care follow-up | Excellent | Routine visits, structured patterns, AI handles 90%+ |
| Post-discharge follow-up | Excellent | Clear structure (medication, symptom check, escalation) |
| Behavioral health (50-min) | Mixed | Long narrative, high stakes; AI captures the literal content but flattens nuance |
| Pediatrics | Good | Often involves multiple speakers (parent + child); diarization helps |
| Endocrinology | Good | Lab-driven visits that AI handles well; numeric precision matters, so verify dosing |
| Pain management | Caution | Documentation drives controlled-substance audit risk; manual review essential |
| Urgent care telehealth | Excellent | Fast, structured, high-volume; AI is the only economic path |
Common mistakes that break HIPAA compliance
- Recording into consumer apps. QuickTime to Dropbox, Voice Memos to iCloud, screen recording to personal Google Drive. None of these offer BAAs at the consumer tier, so the recording becomes an impermissible disclosure of PHI the moment it lands in the cloud.
- Pasting transcripts into ChatGPT. The consumer version of ChatGPT is not covered by a BAA. Even if you redact name and DOB, the surrounding context often re-identifies the patient. Use the Anthropic API with a HIPAA addendum, or OpenAI Enterprise with a BAA.
- Forgetting the BAA chain. If your transcript flows through Vendor A (transcription) into Vendor B (LLM structuring) and you only have a BAA with A, that’s not enough. Both vendors need BAAs.
- Storing audio forever. Source audio after the note is signed is legal exposure with limited clinical value. Set a retention policy and stick to it.
- Recording without consent. Especially in two-party-consent states, recording without explicit patient consent is a state-law violation independent of HIPAA.
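The BAA-chain mistake in particular is easy to check mechanically: list every vendor the audio or transcript touches, in order, and compare that against the BAAs you actually have on file. A minimal sketch, with the vendor labels and the `vendors_without_baa` name invented for illustration:

```python
from __future__ import annotations


def vendors_without_baa(pipeline: list[str], baa_on_file: set[str]) -> list[str]:
    """Return every vendor in the PHI path that lacks a signed BAA, in pipeline order."""
    return [vendor for vendor in pipeline if vendor not in baa_on_file]


if __name__ == "__main__":
    # Hypothetical pipeline: each hop that receives audio, transcript, or note.
    pipeline = ["telehealth platform", "transcription vendor", "LLM structuring API", "EHR"]
    baa_on_file = {"telehealth platform", "transcription vendor", "EHR"}
    gaps = vendors_without_baa(pipeline, baa_on_file)
    print("Missing BAAs:", gaps or "none")
```

Any non-empty result is a hop where PHI leaves BAA coverage, which is the Vendor A / Vendor B situation described above.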
What we offer at LessRec for telehealth
For solo and small-practice telehealth providers who want to get to a working pipeline this week without the legal heartburn:
- Standard tier ($0.05/min) — great for non-PHI audio (research interviews, podcasts, lectures, public-facing content). Not for telehealth PHI.
- HIPAA-compliant tier — signed BAA, encrypted storage, audit logs, US-only data residency, 90-day retention default with custom options. Pricing on request — typically $0.08-0.12/min depending on volume.
The HIPAA tier uses the same Whisper Large v3 + Claude pipeline as the standard tier; the additional cost covers the BAA, the audit infrastructure, and the customer-managed retention controls.