FHIR Bulk Data Access 2026: population-level data for AI scribe analytics, quality reporting, and risk adjustment
FHIR Bulk Data Access (also called Flat FHIR or "$export") lets a clinical organization or authorized analytics vendor pull large slices of patient data efficiently — not one patient at a time, but cohorts, panels, or entire patient populations in a single asynchronous job.
For AI scribe organizations and value-based care teams, this is the data-engineering layer that makes population-level intelligence work. Per-encounter scribe output is one half of the value; population analytics over those structured outputs is the other half.
What FHIR Bulk Data Access provides
The FHIR Bulk Data Access specification (HL7 FHIR R4 / R5) defines the $export operation at three levels:
- System level ([base]/$export): the entire data set for the authorized scope
- Group level ([base]/Group/[id]/$export): a specific patient cohort (HCC at-risk panel, value-based contract members, etc.)
- Patient level ([base]/Patient/$export): patient-linked resources for every patient the client is authorized to see. Despite the name, this is not a single-patient export.
The output is typically NDJSON (newline-delimited JSON) over HTTPS, asynchronously generated and downloaded once ready. Standard FHIR resources are available: Patient, Encounter, Condition, Observation, MedicationStatement, AllergyIntolerance, Procedure, DocumentReference, etc.
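As a rough illustration of how a kick-off request is addressed, the sketch below builds the URL and headers for each export level. The helper name `export_kickoff`, the base URL, and the group id are hypothetical. The `_type` and `_since` parameters and the `Prefer: respond-async` header come from the Bulk Data specification, but individual EHRs may support only a subset of them.

```python
from urllib.parse import urlencode


def export_kickoff(base_url, level="system", group_id=None, types=None, since=None):
    """Return (url, headers) for a Bulk Data kick-off GET request.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    base = base_url.rstrip("/")
    if level == "system":
        path = "/$export"
    elif level == "patient":
        path = "/Patient/$export"
    elif level == "group":
        if not group_id:
            raise ValueError("group_id is required for a Group-level export")
        path = f"/Group/{group_id}/$export"
    else:
        raise ValueError(f"unknown export level: {level}")

    params = {}
    if types:
        params["_type"] = ",".join(types)  # restrict output to these resource types
    if since:
        params["_since"] = since  # only resources updated after this instant
    query = urlencode(params, safe=",")  # keep commas readable in _type

    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # the export runs as an asynchronous job
    }
    url = base + path + ("?" + query if query else "")
    return url, headers
```

A server that accepts the request is expected to answer with 202 Accepted and a Content-Location header pointing at the job's status URL.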
The AI scribe analytics layer
If your organization is running AI scribes at scale, here's what FHIR Bulk Data unlocks:
| Use case | FHIR Bulk pull | Analytics output |
|---|---|---|
| HCC suspect list generation | Condition + Observation (last 24 mo) | For each patient: HCCs billed last year not yet billed this year, plus conditions implied by labs |
| Quality measure performance | Procedure + Observation + DocumentReference | Cancer screening, diabetes A1c, BP control rates by panel |
| Documentation gap analysis | Condition + DocumentReference | Which dx are billed but lack supporting note text? Which conditions are mentioned in notes but not coded? |
| SDoH coverage | Condition (Z-codes) + DocumentReference | Which patients have SDoH content in notes but no Z-code? Which Z-codes are stale? |
| AI scribe quality assurance | DocumentReference (AI-tagged) + Encounter | Note completion rate, edit-distance from AI draft to signed note, reviewer flag rate |
| Panel risk stratification | Condition + Observation + Procedure | RAF distribution, HCC capture rate, projected revenue lift from documentation improvement |
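To make the first row concrete, here is a minimal sketch of the "recorded last year, not yet this year" comparison over exported Condition resources. It assumes each Condition carries `subject.reference`, `recordedDate`, and an ICD-10 code in `code.coding`. The function name and the crosswalk passed in are placeholders: a real pipeline would load the published CMS ICD-10-to-HCC mapping, and would likely use claims rather than Condition resources to decide what was actually billed.

```python
from collections import defaultdict


def hcc_suspects(conditions, prior_year, current_year, crosswalk):
    """Map each patient to the categories seen in prior_year but not in current_year.

    `crosswalk` maps a diagnosis code to a category label (placeholder data here).
    """
    by_patient = defaultdict(lambda: defaultdict(set))
    for cond in conditions:
        patient = cond.get("subject", {}).get("reference")
        year_text = cond.get("recordedDate", "")[:4]
        if not patient or len(year_text) != 4 or not year_text.isdigit():
            continue  # skip resources we cannot attribute to a patient and year
        for coding in cond.get("code", {}).get("coding", []):
            category = crosswalk.get(coding.get("code"))
            if category:
                by_patient[patient][int(year_text)].add(category)

    suspects = {}
    for patient, years in by_patient.items():
        missing = years.get(prior_year, set()) - years.get(current_year, set())
        if missing:
            suspects[patient] = sorted(missing)
    return suspects
```

The same set-difference shape works for the documentation gap and SDoH rows: swap the two year buckets for "coded" versus "mentioned in note text".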
Why this matters more in 2026
Three reasons:
- HCC v28 stricter specificity. RADV audits review a sample of charts per contract and can extrapolate the findings across the contract. Population-level documentation gap analysis means you find weak charts before the audit does.
- Health Equity Index. The 2027 payment year HEI bonus is calculated on 2025-2026 data. Population SDoH coverage is a documentation analytics question, not a per-encounter question.
- Value-based contracts. Most MA, ACO, and risk-bearing arrangements in 2026 require quarterly performance reporting. Without bulk FHIR, this is a manual extract job each cycle.
Authorization and security
FHIR Bulk Data Access uses the SMART on FHIR backend services authorization pattern:
- Client app registers with the EHR (one-time)
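- Signed JWT client assertion, exchanged at the token endpoint for a short-lived access token that is then sent with each request
- No user-mediated OAuth dance: the client authenticates with its own private key
- EHR validates the assertion's signature against the client's pre-registered public key set
- Access scopes: system-level scopes authorize bulk operations (e.g., system/*.read, or system/Condition.read to narrow access to one resource type). Scopes are expressed per resource type, not per operation.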
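The token exchange behind these steps can be sketched as follows. This is a minimal outline, not a complete client: it builds the claims for the client assertion and the form body for the token request, and leaves the signing step to a JWT library and the registered private key. The function names, client id, and token URL are hypothetical; the claim names, the five-minute expiry cap, and the form fields follow the SMART backend services specification.

```python
import time
import uuid

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def assertion_claims(client_id, token_url, now=None, lifetime_s=300):
    """Build the JWT claims for a backend-services client assertion."""
    if lifetime_s > 300:
        raise ValueError("assertion lifetime must not exceed five minutes")
    issued = int(time.time() if now is None else now)
    return {
        "iss": client_id,           # issuer and subject are both the client id
        "sub": client_id,
        "aud": token_url,           # the EHR's token endpoint
        "exp": issued + lifetime_s,
        "jti": str(uuid.uuid4()),   # unique per assertion, to prevent replay
    }


def token_request_form(signed_jwt, scope="system/*.read"):
    """Form fields to POST (form-urlencoded) to the token endpoint.

    `signed_jwt` is the claims above, signed with the client's private key
    by a JWT library of your choice (signing is not shown here).
    """
    return {
        "grant_type": "client_credentials",
        "scope": scope,
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
    }
```

The access token returned by the EHR then goes into an `Authorization: Bearer` header on the kick-off, status, and file download requests.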
BAA chain: at minimum, business associate agreements need to cover three parties: the practice or organization, the EHR vendor, and the analytics vendor.
The DIY analytics stack on FHIR Bulk
For an organization wanting to run their own analytics on AI-scribed encounters:
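- FHIR Bulk client (Python / Node). Open-source clients exist for SMART backend services + bulk pull (e.g., the smart-on-fhir bulk-data-client for Node). For Python, check that the library you pick (e.g., fhirclient) actually covers backend services auth and the async export flow before relying on it.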
- NDJSON download + parse. Straightforward streaming parse; resources by type into a relational store or columnar store.
- Storage. Postgres or DuckDB for moderate scale; BigQuery or Snowflake for larger.
- Analytics. SQL queries for HCC gap, quality measures, SDoH coverage. Dashboards in Metabase / Superset / Looker.
- LLM-driven analysis. Periodic LLM passes over note text + structured data to generate clinician-facing reports.
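The parse and storage steps in this list can be sketched with the standard library alone. The example below streams NDJSON line by line into SQLite, which stands in here for Postgres or DuckDB. The table layout and function names are assumptions for illustration; a production schema would typically split resources into per-type tables with extracted columns.

```python
import json
import sqlite3


def load_ndjson(lines, conn):
    """Stream NDJSON lines into a single `resources` table; return rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources "
        "(resource_type TEXT, id TEXT, body TEXT)"
    )
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between resources
        resource = json.loads(line)  # one FHIR resource per line
        conn.execute(
            "INSERT INTO resources VALUES (?, ?, ?)",
            (resource["resourceType"], resource.get("id"), line),
        )
        count += 1
    conn.commit()
    return count


def counts_by_type(conn):
    """A first sanity check after a pull: how many resources of each type arrived."""
    return conn.execute(
        "SELECT resource_type, COUNT(*) FROM resources "
        "GROUP BY resource_type ORDER BY resource_type"
    ).fetchall()
```

Because `lines` can be any iterable of strings, the same function accepts an open file or a streamed HTTP response body, so a large export file never has to be held in memory at once.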
For a 5-clinician practice, this stack is buildable in 4-6 weeks of one developer + analyst time. The output is panel-level intelligence the practice's payer contracts already require.
Vendor matrix — FHIR Bulk Data + AI scribe analytics 2026
| Vendor | What they do | Cost model |
|---|---|---|
| Innovaccer | Population health data platform with FHIR Bulk ingestion | Enterprise |
| Health Catalyst | Healthcare analytics platform | Enterprise |
| Datavant | Cross-system data linking and quality | Per-record / enterprise |
| Navina | HCC-focused analytics with AI scribe overlay | Per-PMPM (panel size dependent) |
| Reveleer | Risk adjustment + quality gap closure | Enterprise / per-record |
| DIY (Python + Postgres + Metabase) | Build it yourself | Developer + analyst time + infrastructure |
Limitations and gotchas
- Not all EHRs support Bulk yet. Epic and Oracle Health (formerly Cerner) have it; Athena and eClinicalWorks have varying support; smaller EHRs may not yet implement the spec fully.
- Async means waiting. Bulk export is async; large pulls can take hours. Plan for batch refresh windows, not real-time queries.
- Resource coverage varies. Some EHRs export only US Core resources; others include extensions and custom resources. Test against the actual endpoint.
- Network agreement. Bulk pulls of national data via TEFCA / Carequality have separate governance from your local EHR's bulk endpoint.
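The waiting described in the async bullet above usually takes the form of a polling loop against the job's status URL. Here is a minimal sketch. It assumes a caller-supplied `fetch_status` function (hypothetical) that performs the authenticated GET and returns the status code, headers, and parsed body. It handles only the numeric form of `Retry-After`; servers may also send an HTTP date, which this sketch falls back to a default delay for.

```python
import time


def wait_for_manifest(fetch_status, sleep=time.sleep, default_delay=30, max_polls=240):
    """Poll until the export completes and return the completion manifest.

    fetch_status() -> (status_code, headers_dict, parsed_json_or_None)
    """
    for _ in range(max_polls):
        status, headers, body = fetch_status()
        if status == 200:
            return body  # manifest listing the NDJSON files in its "output" array
        if status != 202:
            raise RuntimeError(f"export failed with HTTP {status}")
        # 202 means still in progress; honor the server's pacing hint if numeric
        retry_after = str(headers.get("Retry-After", ""))
        delay = int(retry_after) if retry_after.isdigit() else default_delay
        sleep(delay)
    raise TimeoutError("export did not complete within the polling budget")
```

Injecting `sleep` keeps the loop testable and makes it easy to hand the wait over to a scheduler instead of blocking a worker for hours.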
Strategic angle for AI scribe organizations
If you're building or running AI scribe at any meaningful scale (more than 50 encounters/day per provider), the per-encounter scribe output is just the input layer. The population analytics layer over those outputs is where ROI is demonstrated to leadership and to payer contracts.
The FHIR Bulk path is the data plumbing for that analytics layer. Without it, you're stuck running per-patient queries that don't scale.
What's coming in 2026 H2
EHR vendor support for FHIR Bulk Data continues to expand through 2026 H2 as part of the ONC-mandated USCDI v3+ adoption. Expect broader resource coverage, more reliable Group-level exports, and tighter SMART backend services standardization. By 2027 most major EHRs should have production-grade Bulk endpoints — making this layer a baseline expectation rather than a differentiator.
Population analytics + AI scribe stack on LessRec
$0.05/min Whisper transcription as the encounter layer. Build your population analytics over FHIR Bulk Data exports for value-based contracts. First 10 minutes free.