
FHIR Bulk Data Access 2026: population-level data for AI scribe analytics, quality reporting, and risk adjustment

May 8, 2026 · 6 min read

FHIR Bulk Data Access (also called Flat FHIR or "$export") lets a clinical organization or authorized analytics vendor pull large slices of patient data efficiently — not one patient at a time, but cohorts, panels, or entire patient populations in a single asynchronous job.

For AI scribe organizations and value-based care teams, this is the data-engineering layer that makes population-level intelligence work. Per-encounter scribe output is one half of the value; population analytics over those structured outputs is the other half.

What FHIR Bulk Data Access provides

The FHIR Bulk Data Access specification (HL7 FHIR R4 / R5) defines three operation patterns:

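$export at the System level ([base]/$export): the entire data set for the authorized scope, including resources not tied to a patient
$export at the Group level ([base]/Group/[id]/$export): a specific patient cohort (HCC at-risk panel, value-based contract members, etc.)
$export at the Patient level ([base]/Patient/$export): patient-linked resources for every patient the client is authorized to access. Despite the name, this is an all-patients export, not a single-patient one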

The output is typically NDJSON (newline-delimited JSON) over HTTPS, asynchronously generated and downloaded once ready. Standard FHIR resources are available: Patient, Encounter, Condition, Observation, MedicationStatement, AllergyIntolerance, Procedure, DocumentReference, etc.
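
As a rough illustration of the three patterns, here is a minimal sketch of building the kick-off request. The function name, the example base URL, and the group id are hypothetical. The `_type` and `_since` parameters and the `Prefer: respond-async` header come from the Bulk Data specification, but check which parameters your EHR's endpoint actually honors.

```python
from urllib.parse import urlencode


def build_export_request(base_url, level="system", group_id=None,
                         resource_types=None, since=None):
    """Return (url, headers) for a Bulk Data kick-off request (illustrative helper)."""
    base = base_url.rstrip("/")
    if level == "system":
        path = "/$export"
    elif level == "patient":
        path = "/Patient/$export"
    elif level == "group":
        if not group_id:
            raise ValueError("group-level export needs a group_id")
        path = f"/Group/{group_id}/$export"
    else:
        raise ValueError(f"unknown export level: {level}")

    params = {}
    if resource_types:
        # Comma-separated list of resource types to include
        params["_type"] = ",".join(resource_types)
    if since:
        # Only resources updated after this FHIR instant
        params["_since"] = since
    query = "?" + urlencode(params, safe=",:") if params else ""

    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # asks the server to run the job asynchronously
    }
    return base + path + query, headers


# Hypothetical endpoint and cohort id, for illustration only
url, headers = build_export_request(
    "https://ehr.example.org/fhir/", level="group", group_id="hcc-panel",
    resource_types=["Condition", "Observation"], since="2024-05-01T00:00:00Z",
)
print(url)
```

A successful kick-off returns HTTP 202 with a Content-Location header pointing at the status URL to poll.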

The AI scribe analytics layer

If your organization is running AI scribes at scale, here's what FHIR Bulk Data unlocks:

Use case	FHIR Bulk pull	Analytics output
HCC suspect list generation	Condition + Observation (last 24 mo)	For each patient: HCCs billed last year not yet billed this year, plus conditions implied by labs
Quality measure performance	Procedure + Observation + DocumentReference	Cancer screening, diabetes A1c, BP control rates by panel
Documentation gap analysis	Condition + DocumentReference	Which dx are billed but lack supporting note text? Which conditions are mentioned in notes but not coded?
SDoH coverage	Condition (Z-codes) + DocumentReference	Which patients have SDoH content in notes but no Z-code? Which Z-codes are stale?
AI scribe quality assurance	DocumentReference (AI-tagged) + Encounter	Note completion rate, edit-distance from AI draft to signed note, reviewer flag rate
Panel risk stratification	Condition + Observation + Procedure	RAF distribution, HCC capture rate, projected revenue lift from documentation improvement
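
To make the first row concrete, here is a minimal sketch of a recapture-gap check over exported Condition resources. It works on raw diagnosis codes, not HCCs: mapping ICD-10 codes to HCC categories needs the CMS mapping table, which is not included here. The function name is hypothetical, and the sketch assumes each Condition carries `subject.reference`, `code.coding`, and `recordedDate`, which not every EHR populates.

```python
from collections import defaultdict


def recapture_gaps(conditions, prior_year, current_year):
    """Map patient reference -> sorted codes recorded in prior_year but not current_year."""
    seen = defaultdict(lambda: defaultdict(set))  # patient -> year -> set of codes
    for cond in conditions:
        patient = cond.get("subject", {}).get("reference")
        recorded = cond.get("recordedDate", "")
        if not patient or len(recorded) < 4:
            continue  # skip resources we cannot attribute to a patient and year
        year = recorded[:4]
        for coding in cond.get("code", {}).get("coding", []):
            code = coding.get("code")
            if code:
                seen[patient][year].add(code)

    gaps = {}
    for patient, by_year in seen.items():
        missing = by_year.get(str(prior_year), set()) - by_year.get(str(current_year), set())
        if missing:
            gaps[patient] = sorted(missing)
    return gaps


# Tiny made-up sample, for illustration only
sample = [
    {"subject": {"reference": "Patient/1"}, "recordedDate": "2025-03-02",
     "code": {"coding": [{"code": "E11.9"}]}},
    {"subject": {"reference": "Patient/1"}, "recordedDate": "2025-06-10",
     "code": {"coding": [{"code": "N18.3"}]}},
    {"subject": {"reference": "Patient/1"}, "recordedDate": "2026-01-15",
     "code": {"coding": [{"code": "E11.9"}]}},
]
print(recapture_gaps(sample, 2025, 2026))
```

In a real pipeline the same logic would more likely run as a SQL query over the loaded tables; the Python version is only meant to show the set difference at the heart of the check.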

Why this matters more in 2026

Three reasons:

HCC v28 stricter specificity. RADV audits review a sample of charts per contract and extrapolate the error rate across the contract. Population-level documentation gap analysis means you find weak charts before the audit does.
Health Equity Index. The 2027 payment year HEI bonus is calculated on 2025-2026 data. Population SDoH coverage is a documentation analytics question, not a per-encounter question.
Value-based contracts. Most MA, ACO, and risk-bearing arrangements in 2026 require quarterly performance reporting. Without bulk FHIR, this is a manual extract job each cycle.

Authorization and security

FHIR Bulk Data Access uses the SMART on FHIR backend services authorization pattern:

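Client app registers its public key (or JWKS URL) with the EHR (one-time)
Client signs a short-lived JWT assertion with its private key and posts it to the EHR's token endpoint
No user-mediated OAuth dance: the signed assertion stands in for a user login
EHR validates the signature against the pre-registered key set and returns a short-lived access token, sent as a Bearer token on the bulk requests
Access scopes: system-level scopes for bulk operations (e.g., system/*.read or system/Condition.read); $export is an operation, not a scope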
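
The token exchange can be sketched roughly as below. This only builds the assertion claims and the form body; signing is left to a JWT library of your choice (SMART backend services calls for RS384 or ES384), and the placeholder string stands in for the signed token. The function names, client id, and token URL are hypothetical.

```python
import time
import uuid
from urllib.parse import urlencode

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_assertion_claims(client_id, token_url, now=None, lifetime_s=300):
    """Claims for the client assertion JWT; exp stays within five minutes."""
    now = int(time.time()) if now is None else int(now)
    return {
        "iss": client_id,          # issuer and subject are both the client id
        "sub": client_id,
        "aud": token_url,          # the EHR's token endpoint
        "exp": now + lifetime_s,
        "jti": str(uuid.uuid4()),  # unique id so the assertion cannot be replayed
    }


def build_token_request_body(signed_jwt, scope):
    """Form-encoded body for the POST to the token endpoint."""
    return urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
    })


claims = build_assertion_claims("my-analytics-client", "https://ehr.example.org/oauth2/token")
# A real client would sign `claims` with its private key here; this is a placeholder.
body = build_token_request_body("header.payload.signature", "system/*.read")
print(body)
```

The response to that POST carries the access token and its lifetime; the client requests a fresh token the same way when the old one expires.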

BAA chain: practice / organization + EHR vendor + analytics vendor (3-party) at minimum.

The DIY analytics stack on FHIR Bulk

For an organization wanting to run their own analytics on AI-scribed encounters:

FHIR Bulk client (Python / Node). Standard libraries exist for SMART backend services + bulk pull (e.g., python-fhirclient).
NDJSON download + parse. Straightforward streaming parse; resources by type into a relational store or columnar store.
Storage. Postgres or DuckDB for moderate scale; BigQuery or Snowflake for larger.
Analytics. SQL queries for HCC gap, quality measures, SDoH coverage. Dashboards in Metabase / Superset / Looker.
LLM-driven analysis. Periodic LLM passes over note text + structured data to generate clinician-facing reports.
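
The download-and-parse step can be sketched as follows. It uses SQLite from the Python standard library purely as a stand-in so the example runs anywhere; the same shape carries over to Postgres or DuckDB. The table layout and function name are assumptions for illustration, not a recommended schema.

```python
import io
import json
import sqlite3


def load_ndjson(lines, conn):
    """Insert each NDJSON line into a `resources` table; return the number loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources ("
        "resource_type TEXT, id TEXT, body TEXT, "
        "PRIMARY KEY (resource_type, id))"
    )
    count = 0
    for line in lines:  # works on any iterable of lines, so files stream without loading fully
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        resource = json.loads(line)
        conn.execute(
            "INSERT OR REPLACE INTO resources VALUES (?, ?, ?)",
            (resource["resourceType"], resource.get("id"), line),
        )
        count += 1
    conn.commit()
    return count


# In-memory demo with two made-up resources
conn = sqlite3.connect(":memory:")
demo = io.StringIO(
    '{"resourceType": "Patient", "id": "1"}\n'
    "\n"
    '{"resourceType": "Condition", "id": "c1", "subject": {"reference": "Patient/1"}}\n'
)
print(load_ndjson(demo, conn))
```

Keeping the raw JSON in a `body` column defers schema decisions: the analytics queries can extract fields with the database's JSON functions, and per-resource tables can be added later once the useful fields are known.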

For a 5-clinician practice, this stack is buildable in 4-6 weeks of one developer + analyst time. The output is panel-level intelligence the practice's payer contracts already require.

Vendor matrix — FHIR Bulk Data + AI scribe analytics 2026

Vendor	What they do	Cost model
Innovaccer	Population health data platform with FHIR Bulk ingestion	Enterprise
Health Catalyst	Healthcare analytics platform	Enterprise
Datavant	Cross-system data linking and quality	Per-record / enterprise
Navina	HCC-focused analytics with AI scribe overlay	Per-PMPM (panel size dependent)
Reveleer	Risk adjustment + quality gap closure	Enterprise / per-record
DIY (Python + Postgres + Metabase)	Build it yourself	Developer + analyst time + infrastructure

Limitations and gotchas

Not all EHRs support Bulk yet. Epic and Oracle Health (formerly Cerner) have it; Athena and eClinicalWorks have varying support; smaller EHRs may not yet implement the spec fully.
Async means waiting. Bulk export is async; large pulls can take hours. Plan for batch refresh windows, not real-time queries.
Resource coverage varies. Some EHRs export only US Core resources; others include extensions and custom resources. Test against the actual endpoint.
Network agreement. Bulk pulls of national data via TEFCA / Carequality have separate governance from your local EHR's bulk endpoint.
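
To show what "async means waiting" looks like in practice, here is a minimal polling sketch. The `fetch_status` callable is hypothetical: it is assumed to issue a GET against the status URL and return `(status_code, headers, parsed_body)`. Under the Bulk Data specification, 202 means the job is still running and 200 returns a manifest whose `output` array lists the NDJSON file URLs by resource type.

```python
import time


def poll_export_status(fetch_status, max_polls=100, default_wait_s=30, sleep=time.sleep):
    """Poll until the export completes, then return the manifest (a dict)."""
    for _ in range(max_polls):
        status, headers, body = fetch_status()
        if status == 200:
            return body  # export finished; body is the manifest
        if status != 202:
            raise RuntimeError(f"export failed with HTTP {status}")
        # Honor Retry-After when it is given in seconds; otherwise use the default
        retry_after = headers.get("Retry-After", "")
        wait_s = int(retry_after) if retry_after.isdigit() else default_wait_s
        sleep(wait_s)
    raise TimeoutError("export did not finish within max_polls")


def output_urls_by_type(manifest):
    """Group the manifest's download URLs by resource type."""
    urls = {}
    for item in manifest.get("output", []):
        urls.setdefault(item["type"], []).append(item["url"])
    return urls
```

Retry-After can also arrive as an HTTP date; this sketch falls back to the default wait in that case. Passing `sleep` as a parameter keeps the loop easy to test without real delays.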

Strategic angle for AI scribe organizations

If you're building or running AI scribe at any meaningful scale (more than 50 encounters/day per provider), the per-encounter scribe output is just the input layer. The population analytics layer over those outputs is where ROI is demonstrated to leadership and to payer contracts.

The FHIR Bulk path is the data plumbing for that analytics layer. Without it, you're stuck running per-patient queries that don't scale.

What's coming in 2026 H2

EHR vendor support for FHIR Bulk Data continues to expand through 2026 H2 as part of the ONC-mandated USCDI v3+ adoption. Expect broader resource coverage, more reliable Group-level exports, and tighter SMART backend services standardization. By 2027 most major EHRs should have production-grade Bulk endpoints — making this layer a baseline expectation rather than a differentiator.

Population analytics + AI scribe stack on LessRec

$0.05/min Whisper transcription as the encounter layer. Build your population analytics over FHIR Bulk Data exports for value-based contracts. First 10 minutes free.
