H9 H10 H11 H12 H13 | published | NPD release 2026-05-08

NPI and taxonomy correctness

Do NDH NPIs pass the Luhn check, exist in NPPES, and agree with NPPES on name and primary specialty? Are NDH's NUCC taxonomy codes valid and current?
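The Luhn check on an NPI can be sketched as below. This is a minimal illustration, not the pipeline's own code: it assumes the standard NPI convention of prepending the constant prefix 80840 before running the Luhn algorithm, and the function name `is_valid_npi` is hypothetical.

```python
NPI_LUHN_PREFIX = "80840"  # constant prefix prepended to the NPI before the Luhn check


def is_valid_npi(npi: str) -> bool:
    """Return True if `npi` is 10 ASCII digits and passes the Luhn check."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(c) for c in NPI_LUHN_PREFIX + npi]
    total = 0
    # Walk right to left; double every second digit, starting with the one
    # just left of the check digit. Doubled values above 9 have 9 subtracted.
    for pos, digit in enumerate(reversed(digits)):
        if pos % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # structurally valid
print(is_valid_npi("1234567890"))  # wrong check digit
```

A structural pass says nothing about whether the NPI was ever issued; that is what the NPPES existence check covers.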

Headline

95.71% of 10.9M NDH NPIs clear NPPES (0.79% ghost, 3.49% deactivated).

Practitioner name agreement: 93.8% exact → 94.1% normalized → 96.7% Jaro-Winkler ≥0.85. Organization name agreement: 59.2% exact → 92.4% normalized → 99.6% Jaro-Winkler ≥0.85 (closes the 44-point exact-match gap to 0pp).

NDH carries NUCC on Practitioner.qualification (99.83% valid) AND Medicare Specialty codes on PractitionerRole.specialty (13.10% valid against the CMS-published crosswalk). Internal cross-system consistency: 9.1% of 2.3M Practitioner↔Role pairs agree via the crosswalk.

External NUCC agreement NDH↔NPPES: 91.8% match NPPES's switch='Y' TRUE primary, 99.0% match any of the 15 slots, 7.3% match only a secondary. Slot_1 is NOT always the true primary (14.93% of rows).

H13 internal crosswalk agreement: 206.6K / 2.3M = 9.10%

Metrics (unit: percent)

H10 NPPES match OK: 95.7%
H10 not in NPPES: 0.790%
H10 deactivated in NPPES: 3.50%
H11 Prac exact: 93.8%
H11 Prac normalized: 94.1%
H11 Prac JW ≥0.85: 96.7%
H11 Org exact: 59.2%
H11 Org normalized: 92.4%
H11 Org JW ≥0.85: 99.6%
H12 NUCC valid: 99.9%
H12 CMS code valid: 13.1%
H13 internal crosswalk: 9.10%
H13 NDH↔NPPES slot 1: 93.9%
H13 NDH↔NPPES true primary: 91.8%
H13 NDH↔NPPES any of 15: 99.0%

What this means

Payer data teams

When comparing NDH specialty to NPPES, match against all 15 NPPES taxonomy slots, NOT just slot 1. About 15% of NPPES records have their TRUE primary (switch=Y) in a non-slot-1 position, and about 7% of NDH Practitioners legitimately match only an NPPES secondary taxonomy (dual specialists).
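A switch-aware comparison along these lines might look like the sketch below. It assumes each NPPES record has already been unpacked into a list of (taxonomy_code, primary_switch) pairs in slot order; the function name and bucket labels are hypothetical, not the report's own.

```python
from typing import List, Optional, Tuple

Slot = Tuple[Optional[str], Optional[str]]  # (taxonomy_code, primary_switch)


def classify_taxonomy_match(ndh_code: str, nppes_slots: List[Slot]) -> str:
    """Bucket one NDH NUCC code against all NPPES taxonomy slots."""
    target = ndh_code.strip().upper()
    # Keep only populated slots; normalize case and whitespace on both fields.
    slots = [
        (code.strip().upper(), (switch or "").strip().upper())
        for code, switch in nppes_slots
        if code and code.strip()
    ]
    if any(code == target and switch == "Y" for code, switch in slots):
        return "true_primary"      # matches the switch='Y' slot, wherever it sits
    if any(code == target for code, _ in slots):
        return "secondary_only"    # matches only a switch='N' slot
    return "no_match"              # not present in any slot


# The true primary sits in slot 2 here, so a slot-1-only comparison would miss it.
record = [("207R00000X", "N"), ("207RC0000X", "Y"), (None, None)]
print(classify_taxonomy_match("207RC0000X", record))
print(classify_taxonomy_match("207R00000X", record))
print(classify_taxonomy_match("152W00000X", record))
```

Checking the switch rather than the slot position is the design point: slot order in NPPES reflects entry order, not primacy.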

FHIR implementers

NDH uses TWO specialty code systems on two resources — NUCC on Practitioner.qualification, CMS Medicare Types on PractitionerRole.specialty. A consumer filtering on one won’t interoperate with one using the other. Apply the CMS-published Medicare/NUCC crosswalk (updated quarterly) to bridge.
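Bridging the two code systems could be sketched as follows. The crosswalk rows here are illustrative placeholders rather than rows copied from the CMS file, the helper names are hypothetical, and the prefix pattern assumes the leading 'NN-' described in the Notes is always two digits plus a hyphen.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Placeholder (medicare_code, nucc_code) rows; the real crosswalk is 1-to-many.
CROSSWALK_ROWS = [
    ("08", "207Q00000X"),
    ("08", "207QA0505X"),
    ("11", "207R00000X"),
]


def build_crosswalk(rows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Index the crosswalk as Medicare code -> set of acceptable NUCC codes."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for medicare_code, nucc_code in rows:
        index[medicare_code.strip()].add(nucc_code.strip().upper())
    return dict(index)


def strip_ndh_prefix(role_code: str) -> str:
    """Drop an assumed two-digit 'NN-' prefix, e.g. '14-50' -> '50'."""
    return re.sub(r"^\d{2}-", "", role_code.strip())


def codes_agree(role_code: str, practitioner_nucc: str,
                crosswalk: Dict[str, Set[str]]) -> bool:
    """True if the practitioner's NUCC code is among the crosswalk targets."""
    targets = crosswalk.get(strip_ndh_prefix(role_code), set())
    return practitioner_nucc.strip().upper() in targets


crosswalk = build_crosswalk(CROSSWALK_ROWS)
print(strip_ndh_prefix("14-50"))
print(codes_agree("14-08", "207Q00000X", crosswalk))
print(codes_agree("14-08", "207R00000X", crosswalk))
```

Because the crosswalk is 1-to-many, agreement is a set-membership test rather than an equality test.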

Regulators

0.79% of NDH NPIs (86K) don’t exist in NPPES at all. 3.49% (379K) are deactivated in NPPES but still live in NDH. NDH’s update cadence lags NPPES by the gap window between releases.
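The three-way split behind these figures can be illustrated with a small sketch. It assumes NPPES has been reduced to a mapping from NPI to deactivation date (None when the NPI is active) and ignores reactivation dates; the names and sample values are hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Optional


def npi_status(npi: str, nppes: Dict[str, Optional[date]]) -> str:
    """Classify one NDH NPI as 'ok', 'ghost', or 'deactivated'."""
    if npi not in nppes:
        return "ghost"            # absent from NPPES entirely
    if nppes[npi] is not None:
        return "deactivated"      # present, but carries a deactivation date
    return "ok"


def status_shares(ndh_npis: Iterable[str],
                  nppes: Dict[str, Optional[date]]) -> Dict[str, float]:
    """Percentage of NDH NPIs falling into each status bucket."""
    counts = Counter(npi_status(npi, nppes) for npi in ndh_npis)
    total = sum(counts.values())
    return {status: 100.0 * n / total for status, n in counts.items()}


# Made-up sample: two active NPIs, one deactivated, one missing from NPPES.
nppes_sample = {"1000000001": None, "1000000002": None,
                "1000000003": date(2025, 11, 3)}
ndh_sample = ["1000000001", "1000000002", "1000000003", "1000000004"]
print(status_shares(ndh_sample, nppes_sample))
```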

Researchers

With 99.98% CMS structural validity and 99.83% NUCC validity, the underlying code quality is excellent. The interesting signal is inconsistency BETWEEN code systems for the same practitioner (14% fail the crosswalk check), not invalid codes themselves.

Null hypothesis

NPI structural validity is ≥99.9% and NDH-to-NPPES agreement on name and primary specialty is within documented drift thresholds.

Denominator

All `Practitioner` and `Organization` resources with an NPI identifier.

Data source

CMS NPD bulk export joined against the NPPES monthly full dissemination file (V.2) and the current NUCC quarterly code set.

Notes

Source: bigquery-public-data.nppes.npi_raw (updated 2026-02-09, 9.37M NPIs) + .healthcare_provider_taxonomy_code_set_170 + CMS Medicare Provider and Supplier Taxonomy Crosswalk (2025-10, 565 rows, 1-to-many).

H11 v2 methodology, three tiers:
(1) exact match on UPPER(TRIM);
(2) normalized match that strips business suffixes for Orgs (LLC/INC/CORP/PC/PA/PLLC/LLP/LTD/CO/COMPANY/THE) and name or credential suffixes for persons (JR/SR/II–V/MD/DO/PHD/RN/NP/PA-C/FNP-BC/DMD/DDS/DVM/PHARMD), drops non-alphanumeric characters, and collapses whitespace;
(3) Jaro-Winkler ≥0.85 via a BQ JS UDF.

Practitioner name: 6,907,334/7,139,252 family exact, 6,718,876 normalized full match, 6,904,517 at JW≥0.85, 6,748,894 at JW≥0.95.
Organization name: 1,935,660/3,267,281 exact, 3,020,262 normalized, 3,253,066 at JW≥0.85, 3,210,187 at JW≥0.95.

H12: NUCC codes on Practitioner.qualification (7,114,222/7,123,912 valid in NUCC v17.0); Medicare codes on PractitionerRole.specialty (298,166/2,276,748 valid in the crosswalk). NDH PractitionerRole._specialty_code carries a leading 'NN-' prefix (e.g. '14-50'); stripping it recovers the canonical Medicare code.

H13 internal: 2,270,482 Practitioner↔Role pairs, 206,603 agree via crosswalk.
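The first two H11 tiers can be sketched as below. This is an approximation under stated assumptions, not the BigQuery implementation: suffix tokens are dropped wherever they appear in the name, periods are removed before the suffix lookup so 'L.L.C.' matches 'LLC', and suffixes are removed before other punctuation so 'PA-C' is still recognizable. The function names are hypothetical.

```python
import re
from typing import Set

ORG_SUFFIXES = {"LLC", "INC", "CORP", "PC", "PA", "PLLC", "LLP", "LTD",
                "CO", "COMPANY", "THE"}
PERSON_SUFFIXES = {"JR", "SR", "II", "III", "IV", "V", "MD", "DO", "PHD", "RN",
                   "NP", "PA-C", "FNP-BC", "DMD", "DDS", "DVM", "PHARMD"}


def exact_key(name: str) -> str:
    """Tier 1: the equivalent of UPPER(TRIM(name))."""
    return name.strip().upper()


def normalized_key(name: str, suffixes: Set[str]) -> str:
    """Tier 2: strip suffix tokens, drop non-alphanumerics, collapse whitespace."""
    kept = []
    for token in re.split(r"[\s,]+", name.strip().upper()):
        bare = token.replace(".", "")
        if not bare or bare in suffixes:
            continue  # empty fragment or a listed suffix token
        # Note: this also drops non-ASCII letters; a fuller version might
        # transliterate them first.
        cleaned = re.sub(r"[^A-Z0-9]", "", bare)
        if cleaned:
            kept.append(cleaned)
    return " ".join(kept)


print(normalized_key("The  Smith Medical, L.L.C.", ORG_SUFFIXES))
print(normalized_key("Jane O'Neil, PA-C", PERSON_SUFFIXES))
```

One caveat of matching tokens anywhere: a single-letter middle initial such as 'V' would be dropped as if it were a generational suffix.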
H13 confusion matrix, top 10 inconsistent (Medicare → qualification-NUCC) pairs:
• 207R00000X (INTERNAL MEDICINE) ↔ 207R00000X (Internal Medicine /): 208,760
• 207Q00000X (FAMILY MEDICINE) ↔ 207Q00000X (Family Medicine /): 138,926
• 207P00000X (EMERGENCY MEDICINE) ↔ 207P00000X (Emergency Medicine /): 133,684
• 225100000X (PHYSICAL THERAPIST) ↔ 225100000X (Physical Therapist /): 111,838
• 2085R0202X (DIAGNOSTIC RADIOLOGY) ↔ 2085R0202X (Radiology / Diagnostic Radiology): 86,970
• 207L00000X (ANESTHESIOLOGY) ↔ 207L00000X (Anesthesiology /): 85,949
• 152W00000X (OPTOMETRIST) ↔ 152W00000X (Optometrist /): 75,278
• 367500000X (NURSE ANESTHETIST, CERTIFIED REGISTERED) ↔ 367500000X (Nurse Anesthetist, Certified Registered /): 72,140
• 193400000X (SINGLE SPECIALTY) ↔ 193400000X (Single Specialty /): 57,371
• 363A00000X (PHYSICIAN ASSISTANT) ↔ 363A00000X (Physician Assistant /): 56,545

H13 external (v3, switch-aware): NPPES stores 15 (taxonomy_code, primary_switch) pairs per NPI; exactly one should have switch='Y' (the TRUE primary). Breakdown:
• Match NPPES true primary (switch='Y' slot): 6,538,617 (91.78%)
• Match any slot: 7,055,325 (99.04%)
• Match slot_1 specifically: 6,688,650 (93.89%)
• Match only a secondary (switch='N'): 516,708 (7.25%)
• Disagree entirely (not in any slot): 68,587 (0.96%)

Slot-ordering observation: 1,063,860 rows (14.93%) have the NPPES TRUE primary in a slot other than slot_1, so the prior 'slot_1' proxy for 'primary' was slightly wrong. 0 rows (0.00%) have no switch='Y' at all (NPPES data-quality edge).

Known caveats: the NPPES vintage is 2026-02-09 and the NDH release is 2026-05-08, a gap of about 12.5 weeks (88 days), so taxonomy changes in that window show up as disagreement. Jaro-Winkler ≥0.85 is a permissive threshold that recovers common variations (whitespace, DBA suffixes, casing) but also accepts some false positives (e.g. 'Smith Medical' vs 'Smith Medicare'); the 0.95 column is the strict signal.
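To make the Jaro-Winkler caveat concrete, here is a plain-Python sketch of the standard metric (prefix scale 0.1, prefix capped at 4 characters). It is not the BigQuery JS UDF used for the figures above, whose exact parameters are not stated here, so scores may differ slightly.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    # Pair each character with the first unmatched equal character in the window.
    for i in range(len1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order; halve the count.
    out_of_order, k = 0, 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by the length of the shared prefix (max 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


score = jaro_winkler("SMITH MEDICAL", "SMITH MEDICARE")
print(score >= 0.85)  # the false-positive pair from the caveat clears the threshold
```

The shared-prefix boost is what makes names that differ only at the end score so highly, which is why near-identical organization names with different final words are a natural source of false positives.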
v2 upgrade candidates: pinned quarterly NUCC; NPPES secondary-taxonomy match; phonetic fallback (Soundex / Metaphone) for names where JW misses transpositions.
