H41 (pre-registered)

NPPES taxonomy vs Medicare-billed-specialty divergence (behavioral specialty drift)

H37 tests whether PECOS PROVIDER_TYPE matches the NPPES NUCC taxonomy, a record-vs-record test. H41 is the billing-behavior counterpart: for each NPI active in Medicare Part B, does the actual procedure mix match what the NPPES-registered taxonomy would predict? An NPPES "Family Medicine" (207Q) provider whose Part B billing is 80% pain-management J-codes and trigger-point injections is *behaviorally* practicing a different specialty than the directory says. State Medicaid systems trust NPPES taxonomy for prior-authorization rules, network-adequacy counts, and credentialing, so drift on the billing side means those downstream rules are operating on a stale signal. H41 pairs with H37 (registration drift) and H38 (behavioral-health subset) as the three legs of the PECOS-NPPES-billing audit triangle.

What this means

Payer ops teams

Prior-authorization rules and network-adequacy counts trust NPPES taxonomy. H41 surfaces providers whose actual practice profile diverges sharply from that taxonomy; these are the same providers your specialty-of-record PA decisions currently run against. Treat a hit as a credentialing-team flag, not a denial trigger.

State Medicaid PI offices

Specialty drift is the behavioral analog of H37 registration drift. A provider whose PECOS and NPPES taxonomies align but whose billing pattern has drifted is operating outside the lane the directory advertises. The per-state CSV gives your team the cohort; combine it with the H37 / H38 results for the same NPI to triage by signal strength.

Researchers

The HCPCS↔NUCC affinity table is data-derived, not normative: it reflects what providers in each taxonomy *actually* bill, not what they should bill. It is useful as an empirical anchor, not as a coding-correctness ground truth. Sensitivity windows (60% / 90%) are published as a sidecar so readers can pick their own falsification threshold.

Null hypothesis

Every NPI with both an NPPES NUCC taxonomy and Medicare Part B billing activity bills a procedure mix consistent with that taxonomy. No NPI shows ≥80% of billed services attributable to HCPCS codes whose modal NUCC (across the full population) differs from the NPI's registered NUCC.

Denominator

NPIs that are (a) present in the CMS Medicare Physician & Other Practitioners by Provider AND Service file with ≥1 service row AND (b) carry ≥1 NUCC code on NPPES (~1.86M expected, same denominator as H37). The drift threshold is operationalized as ≥80% of billed services attributable to HCPCS codes whose modal NUCC across the full population differs from the NPI's NPPES taxonomy. The 80% threshold is a publishable falsification line; the sensitivity analysis at 60% / 90% is published as a sidecar.
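The sensitivity window can be illustrated with a minimal Python sketch. It assumes each NPI's drift share (the fraction of billed services on HCPCS codes whose modal NUCC differs from the NPI's registered NUCC) has already been computed. The function name `flagged_counts`, the placeholder NPI labels, and the toy shares are hypothetical and are not part of the project's code.

```python
def flagged_counts(drift_share, thresholds=(0.60, 0.80, 0.90)):
    """Count NPIs whose drift share meets or exceeds each threshold."""
    return {
        t: sum(1 for share in drift_share.values() if share >= t)
        for t in thresholds
    }

# Toy drift shares keyed by placeholder NPI labels (not real data).
toy_shares = {"npi_a": 0.05, "npi_b": 0.65, "npi_c": 0.85, "npi_d": 0.95}
counts = flagged_counts(toy_shares)
```

Because the thresholds are nested, the flagged cohort can only shrink as the threshold rises, so a reader can substitute a stricter or looser line without recomputing the shares.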

Data source

NPPES (`bigquery-public-data.nppes.npi_optimized`) × CMS Medicare Physician & Other Practitioners by Provider AND Service (same file as H40). The HCPCS↔NUCC affinity table is built empirically from the full dataset: for each HCPCS code, it records the modal NPPES NUCC across all NPIs billing that code. Each NPI's billing share is then scored against that affinity table. The computation is a streaming-once partition at `analysis/h41_specialty_drift.py` (to be added). See methodology doc §10d for the affinity-table construction and the falsification window.
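The two steps described here, building the affinity table and then scoring each NPI against it, can be sketched in plain Python. This is an illustration under stated assumptions, not the contents of `analysis/h41_specialty_drift.py`: it assumes one primary NUCC per NPI, one vote per NPI per HCPCS code, alphabetical tie-breaking for the mode, and shares weighted by service count. The function names, the `FAMILY` / `PAIN` labels standing in for NUCC codes, and the toy rows are all hypothetical.

```python
from collections import Counter, defaultdict

def build_affinity(rows, taxonomy):
    """Map each HCPCS code to its modal NUCC across the NPIs billing it."""
    votes = defaultdict(Counter)
    seen = set()
    for npi, hcpcs, _services in rows:
        nucc = taxonomy.get(npi)
        if nucc is None or (npi, hcpcs) in seen:
            continue  # skip NPIs without a taxonomy and repeat rows
        seen.add((npi, hcpcs))
        votes[hcpcs][nucc] += 1  # assumption: one vote per NPI per code
    # Assumption: ties go to the alphabetically first NUCC label.
    return {
        hcpcs: min(c.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for hcpcs, c in votes.items()
    }

def drift_shares(rows, taxonomy, affinity):
    """Share of each NPI's services on codes whose modal NUCC differs."""
    total, off = Counter(), Counter()
    for npi, hcpcs, services in rows:
        nucc = taxonomy.get(npi)
        if nucc is None:
            continue
        total[npi] += services
        if affinity[hcpcs] != nucc:
            off[npi] += services
    return {npi: off[npi] / total[npi] for npi in total if total[npi] > 0}

# Toy inputs: (NPI label, HCPCS code, service count). Not real data.
taxonomy = {"A": "FAMILY", "B": "FAMILY", "C": "PAIN", "D": "PAIN", "E": "FAMILY"}
rows = [
    ("A", "99213", 100), ("B", "99213", 80),
    ("C", "20552", 50), ("D", "20552", 40),
    ("E", "99213", 10), ("E", "20552", 90),
]
affinity = build_affinity(rows, taxonomy)
shares = drift_shares(rows, taxonomy, affinity)
flagged = sorted(npi for npi, s in shares.items() if s >= 0.80)
```

In the toy data, NPI `E` is registered as `FAMILY` but bills 90 of its 100 services on a code whose modal taxonomy is `PAIN`, so it is the only NPI at or above the 80% line. The sketch makes two passes over an in-memory list; a streaming implementation would need to persist the affinity table between passes.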

Pre-registered — results not yet published.

This finding is listed here before results drop. That is the project's trust contract: the null hypothesis and the computation are public first, and numbers follow. Methodology: /methodology.
