Latest
H23 · published · NPD release 2026-05-08

High-risk provider cohort

A composite, transparent, audit-friendly score combining six independent NDH/NPPES quality signals into a per-NPI revalidation prioritization list. Aligned with 42 CFR § 455.436 federal database checks and § 455.450 risk-tier screening.

Headline

64,156 of 7,441,211 (0.86%) NDH practitioner NPIs score at or above the 1.0 composite threshold, including 8,002 at the critical 1.5 threshold (LEIE- or SAM-excluded). The cohort is anchored in the 42 CFR § 455.436 federal database checks. Reason-code counts: oig_excluded (7,887), sam_excluded (4,517), not_in_nppes (56,156), nppes_deactivated (260,534), luhn_fail (2).

64.2K / 7.4M = 0.86%

critical: 8.0K
high: 56.2K
medium: 259.6K
clean: 7.1M

unit: count

What this means

Regulators

42 CFR § 455.436 requires monthly NPPES + LEIE + SAM checks on all enrolled providers. AINPI covers the NPPES leg with audit-trail-ready output (commit SHA, methodology version, generated_at), and the v0.4 composite described in the Notes also scores LEIE and SAM exclusion matches. SSA-DMF ingestion remains a roadmap item; the high-risk cohort will become a 4-database composite once it is ingested.

Payer data teams

Composite scores are NOT fraud determinations. Each NPI in the cohort carries reason codes (e.g. `not_in_nppes`, `nppes_deactivated`, `luhn_fail`, `specialty_mismatch`). Use the reason codes to triage; do not treat the score as a substitute for investigation.
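
As a minimal sketch of reason-code triage, the snippet below groups cohort rows into one work queue per reason code. The row layout (`npi`, `reason_codes`) and the NPIs are hypothetical placeholders, not the published export schema.

```python
from collections import defaultdict

# Hypothetical cohort rows; the NPIs are made-up placeholders and the
# field names are assumptions about the export, not its documented schema.
cohort = [
    {"npi": "1000000001", "reason_codes": ["not_in_nppes"]},
    {"npi": "1000000002", "reason_codes": ["nppes_deactivated", "specialty_mismatch"]},
    {"npi": "1000000003", "reason_codes": ["luhn_fail", "not_in_nppes"]},
]


def queues_by_reason(rows):
    """Map each reason code to the sorted list of NPIs that carry it."""
    queues = defaultdict(list)
    for row in rows:
        for code in row["reason_codes"]:
            queues[code].append(row["npi"])
    return {code: sorted(npis) for code, npis in queues.items()}


queues = queues_by_reason(cohort)
for code in sorted(queues):
    print(code, queues[code])
```

An NPI with several reason codes appears in several queues, so each reviewer sees every record relevant to their check, and the score itself never decides an outcome.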

Researchers

The composite weights (1.0 / 0.8 / 1.0 / 0.4 / 0.3 / 0.2) are pre-registered and visible in the analysis script. Sensitivity analyses are welcome: file an issue with a reproducible alternative weighting and we will publish the comparison.

Everyone using NDH

The cohort is exported as CSV/JSON keyed by NPI with reason codes. State Medicaid PI teams can join this directly to their internal roster and produce an actionable revalidation queue inside one workday.
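
The roster join described above could look roughly like the following sketch, which uses only the standard library. The column names (`npi`, `score`, `reason_codes`, `provider_name`), the pipe separator for multiple reason codes, and the sample rows are all assumptions for illustration; check them against the actual export before use.

```python
import csv
import io

# Hypothetical sample data. Column names and the "|" separator are
# assumptions, not the documented export format.
COHORT_CSV = """npi,score,reason_codes
1000000001,1.0,not_in_nppes
1000000002,1.2,nppes_deactivated|specialty_mismatch
"""
ROSTER_CSV = """npi,provider_name
1000000001,Example Provider A
1000000009,Example Provider B
"""


def revalidation_queue(cohort_file, roster_file):
    """Return roster members found in the cohort, highest score first."""
    cohort = {row["npi"]: row for row in csv.DictReader(cohort_file)}
    queue = []
    for member in csv.DictReader(roster_file):
        hit = cohort.get(member["npi"])
        if hit is None:
            continue  # roster member is not in the high-risk cohort
        queue.append({
            **member,
            "score": float(hit["score"]),
            "reason_codes": hit["reason_codes"].split("|"),
        })
    queue.sort(key=lambda r: r["score"], reverse=True)
    return queue


queue = revalidation_queue(io.StringIO(COHORT_CSV), io.StringIO(ROSTER_CSV))
for entry in queue:
    print(entry["npi"], entry["score"], entry["reason_codes"])
```

NPIs are kept as strings throughout so that the join key matches exactly and is never reinterpreted as a number.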

Null hypothesis

Fewer than 1% of NDH practitioner NPIs accumulate a composite high-risk score at or above the 1.0 threshold, indicating that the federal directory population is broadly clean and revalidation can proceed on the standard 5-year cadence under 42 CFR § 455.414.
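
The arithmetic behind this comparison can be checked directly from the headline counts. The sketch below only restates published numbers; the variable and function names are illustrative.

```python
# Counts copied from the headline of this finding.
FLAGGED = 64_156          # NPIs at or above the 1.0 composite threshold
CRITICAL = 8_002          # of those, NPIs at the critical 1.5 threshold
DENOMINATOR = 7_441_211   # NDH practitioner NPIs in the denominator


def share(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total


flagged_pct = share(FLAGGED, DENOMINATOR)
high_only = FLAGGED - CRITICAL  # flagged NPIs below the critical threshold

print(f"flagged share: {flagged_pct:.2f}%")
print(f"high (non-critical): {high_only:,}")
print("below the 1% ceiling:", flagged_pct < 1.0)
```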

Denominator

All `Practitioner` resources in the NDH bulk export with a populated NPI.

Data source

NDH bulk export joined to NPPES `npi_raw` for match and deactivation status; the AINPI H9 Luhn check; the AINPI H13 NPPES↔NDH specialty agreement check; the `ainpi-probe` endpoint liveness L4+ score for any endpoints declared by the practitioner’s organization; and, as of the v0.4 composite, the OIG LEIE and SAM.gov exclusion lists per 42 CFR § 455.436. Roadmap: SSA-DMF.
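
For readers unfamiliar with the Luhn check on NPIs, here is a minimal sketch of the standard check-digit test: the 10-digit NPI is prefixed with `80840` and the resulting 15 digits must pass the Luhn algorithm. This is not the AINPI H9 code, which is not shown here and may differ in details; the function names are illustrative.

```python
NPI_PREFIX = "80840"  # standard prefix applied before the Luhn test on an NPI


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9       # same as summing the two digits of the product
        total += d
    return total % 10 == 0


def npi_luhn_ok(npi: str) -> bool:
    """Check format (10 ASCII digits) and the prefixed Luhn checksum."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    return luhn_valid(NPI_PREFIX + npi)


print(npi_luhn_ok("1234567893"))  # widely used sample NPI with a valid check digit
print(npi_luhn_ok("1234567890"))  # same base digits, wrong check digit
```

A record failing this test would presumably receive the `luhn_fail` reason code; the check only validates the identifier's form and says nothing about whether the NPI was ever issued.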

Notes

The v0.4 composite combines five signals: OIG LEIE active exclusion match (1.5), SAM.gov active exclusion match (1.5), NPPES match (1.0), NPPES deactivation (0.8), and Luhn validity (1.0). A sixth signal, H13 specialty mismatch (weight 0.4), is wired in via the `cohort_specialty_mismatch` derived table from `analysis/h10_h13_with_crosswalk.py`. LEIE and SAM are scored independently: the HHS slice of SAM overlaps LEIE by design, but they are distinct legal sources under 42 CFR § 455.436, and a doubly flagged NPI warrants higher triage confidence than a singly flagged one. SSA-DMF (weight 2.0) is the last roadmap leg; until it is ingested, state Medicaid agencies must run independent monthly SSA-DMF checks. The composite score is a data-quality flag, NOT a fraud determination, and each NPI carries reason codes for transparent triage.
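
To make the weighting concrete, the sketch below scores an NPI from its reason codes using the v0.4 weights listed above. Two things are assumptions rather than documented behavior: that the composite is a plain sum of the weights of the signals that fired, and that any nonzero score below 1.0 counts as "medium". The actual rule lives in the analysis script and may differ, for example in how it keeps the critical tier limited to exclusion matches.

```python
# Weights as listed in the v0.4 notes, keyed by reason code.
WEIGHTS = {
    "oig_excluded": 1.5,
    "sam_excluded": 1.5,
    "not_in_nppes": 1.0,
    "nppes_deactivated": 0.8,
    "luhn_fail": 1.0,
    "specialty_mismatch": 0.4,
}
CRITICAL_THRESHOLD = 1.5  # from the headline
HIGH_THRESHOLD = 1.0      # from the headline


def composite_score(reason_codes):
    """ASSUMPTION: additive combination of the weights of distinct codes."""
    return sum(WEIGHTS[code] for code in set(reason_codes))


def tier(score):
    """Map a score to a tier; the 'medium' cutoff (> 0) is an assumption."""
    if score >= CRITICAL_THRESHOLD:
        return "critical"
    if score >= HIGH_THRESHOLD:
        return "high"
    if score > 0:
        return "medium"
    return "clean"


for codes in (["oig_excluded"], ["not_in_nppes"], ["nppes_deactivated"], []):
    print(codes, tier(composite_score(codes)))
```

Under the additive assumption, an NPI flagged by both LEIE and SAM scores 3.0, which matches the note that a doubly flagged NPI carries higher triage confidence than a singly flagged one.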