Updated for 2026

HIPAA Safe Harbor De-Identification: Detecting Hospital-Specific MRN Formats Without Engineering

HIPAA Safe Harbor requires removing medical record numbers (MRNs), one of the 18 identifier types it lists. It sounds simple. The problem is that MRN formats are not standardized.

Epic uses one format. Cerner uses a different one. Meditech uses another. Each hospital adds its own codes. Regional health groups create even more formats. A standard PII tool cannot know your format unless you configure it, so it will likely miss your MRNs.

This is not a minor risk. Healthcare IT teams often find MRNs still in datasets that were meant to be de-identified. The tool was only set up for common PII types.

The MRN Format Problem

The US has no national standard for medical record numbers. Each hospital or EHR vendor defines its own format.

Common patterns observed:

Epic-style: 8–12 digit numeric (e.g., 123456789)
Cerner-style: Hospital code prefix + numeric (e.g., MGH-987654)
Regional networks: Facility code + year + sequence (e.g., HOSP-2023-456789)
Veterans Affairs: 9-digit with a check digit
Pediatric systems: Patient-type prefix + numeric (e.g., PED-12345678)

No single rule matches all of these. There is no universal MRN pattern.
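To make the point concrete, here is a minimal Python sketch. The regexes are hypothetical approximations of the example formats listed above, not vendor specifications; real formats have to come from each facility's own EHR configuration. Each sample matches only its own pattern, and no pattern covers all four samples.

```python
import re

# Hypothetical patterns approximating the example formats above.
# These are illustrations, not official Epic or Cerner formats.
FORMAT_PATTERNS = {
    "epic_style": re.compile(r"\d{8,12}"),
    "cerner_style": re.compile(r"[A-Z]{3}-\d{6}"),
    "regional": re.compile(r"[A-Z]{4}-\d{4}-\d{6}"),
    "pediatric": re.compile(r"PED-\d{8}"),
}

SAMPLES = ["123456789", "MGH-987654", "HOSP-2023-456789", "PED-12345678"]


def matching_formats(value: str) -> list[str]:
    """Return the name of every pattern that matches the whole value."""
    return [name for name, rx in FORMAT_PATTERNS.items() if rx.fullmatch(value)]


for sample in SAMPLES:
    print(sample, "->", matching_formats(sample))
```

A rule loose enough to match all four samples would also match invoice numbers, order IDs, and many other strings, which is why per-format patterns are the practical choice.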

What standard PII tools catch: Most HIPAA tools focus on fixed-format IDs. SSNs follow XXX-XX-XXXX. Phone numbers follow XXX-XXX-XXXX. Email addresses have a clear shape. These are easy to find.

MRNs, account numbers, and license numbers are HIPAA types 8, 10, and 11. These vary by hospital. They need custom setup. A generic tool will not catch them.

The Compliance Gap

A regional hospital wants to share patient data with a university research partner. Their EHR uses this MRN format: HOSP-YYYY-XXXXXX.

They run the data through their HIPAA tool. The tool strips names, dates, phone numbers, and SSNs. It does not remove MRNs. HOSP-2023-456789 matches no built-in rule.
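The failure can be reproduced in a few lines. This sketch assumes a generic tool that ships only fixed-format rules; the three patterns and the `redact` helper are illustrative stand-ins, not any real product's rule set.

```python
import re

# Illustrative fixed-format rules of the kind a generic tool ships with.
GENERIC_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str, patterns: dict[str, re.Pattern]) -> str:
    """Replace every match of every pattern with a bracketed label."""
    for label, rx in patterns.items():
        text = rx.sub(f"[{label}]", text)
    return text


note = "MRN HOSP-2023-456789, SSN 123-45-6789, call 555-867-5309."
print(redact(note, GENERIC_PATTERNS))
# The SSN and phone number are replaced; the MRN passes through untouched.
```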

The researcher gets the dataset. They join it against their own records. Those records include MRNs from past referrals at the same hospital. Many patients can now be re-identified. The hospital has a HIPAA breach.

This is a real failure mode. See also "HIPAA Safe Harbor de-identification for healthcare research" for more on where Safe Harbor breaks down.

The Fix: Custom Entity Creation

The fix is to define your MRN format as a custom entity. A compliance officer can do this. No engineer is needed.

Steps:

Write out the format: "Starts with HOSP, then a dash, a 4-digit year, a dash, and a 6-digit number"
Use an AI tool to build the regex: HOSP-\d{4}-\d{6}
Test it on 20 discharge summaries. Confirm it catches all MRNs.
Save it as a custom entity called "Hospital MRN"
Add it to your HIPAA preset alongside the other 17 standard ID types
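For readers who want to see what the tool does with such a definition, the steps above can be sketched in Python. The `CustomEntity` class and the sample texts are hypothetical, written only for this illustration; they are not the product's API. The word boundaries (`\b`) are an addition to the regex from step 2 so that the pattern does not fire inside longer tokens.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomEntity:
    """Hypothetical stand-in for a saved custom entity: a name plus a regex."""
    name: str
    pattern: re.Pattern

    def find(self, text: str) -> list[str]:
        return self.pattern.findall(text)

    def redact(self, text: str) -> str:
        return self.pattern.sub(f"[{self.name}]", text)


HOSPITAL_MRN = CustomEntity("Hospital MRN", re.compile(r"\bHOSP-\d{4}-\d{6}\b"))

# Invented sample texts paired with the MRNs a reviewer expects to be found.
SAMPLES = [
    ("Patient HOSP-2023-456789 discharged in stable condition.",
     ["HOSP-2023-456789"]),
    ("Transferred from HOSP-2021-000042; see also HOSP-2022-118830.",
     ["HOSP-2021-000042", "HOSP-2022-118830"]),
    ("No record number on this page. Invoice 2023-456789.", []),
]

failures = [text for text, expected in SAMPLES if HOSPITAL_MRN.find(text) != expected]
print("samples failing:", len(failures))
print(HOSPITAL_MRN.redact(SAMPLES[0][0]))
```

Pairing each sample with its expected MRNs checks for both misses and false positives, such as the invoice number in the third sample.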

A compliance officer can complete this process in hours to a few days, depending on how many sample documents need review. Building custom code can take about 3 months.

Example: 15-Facility Hospital Network

Organization: 15-facility regional hospital network

MRN format: HOSP-YYYY-XXXXXX (in thousands of discharge summary PDFs)

Goal: Share a research dataset with a university partner under a HIPAA data use agreement

Old approach: External de-identification vendor at $120,000 per year

Gap found: The vendor tool did not detect the institution-specific MRN format

New workflow:

Compliance officer defines the MRN pattern — 20 minutes
AI validates the regex — 5 minutes
Test on 50 sample summaries — 30 minutes
Confirm no MRNs remain, no false positives — 10 minutes
Add the custom entity to the HIPAA preset
Run the full 50,000-record dataset in batch

Total time to close the gap: one afternoon.
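The "confirm, then run" part of this workflow can be sketched as a gate in front of the batch job. The function names, the labeled samples, and the records below are invented for the illustration; the sketch assumes a reviewer has hand-labeled the true MRNs in each sample document.

```python
import re

MRN_RX = re.compile(r"\bHOSP-\d{4}-\d{6}\b")


def evaluate(samples):
    """samples: iterable of (text, set of true MRNs). Returns (missed, spurious)."""
    missed, spurious = set(), set()
    for text, truth in samples:
        found = set(MRN_RX.findall(text))
        missed |= truth - found      # false negatives: MRNs left in the text
        spurious |= found - truth    # false positives: non-MRNs that matched
    return missed, spurious


def run_batch(records, labeled_samples):
    """Redact the full dataset only if the labeled samples pass cleanly."""
    missed, spurious = evaluate(labeled_samples)
    if missed or spurious:
        raise ValueError(f"pattern not ready: missed={missed}, spurious={spurious}")
    return [MRN_RX.sub("[Hospital MRN]", record) for record in records]


LABELED = [
    ("MRN: HOSP-2023-456789. Follow up in 2 weeks.", {"HOSP-2023-456789"}),
    ("Phone consult only, no chart number recorded.", set()),
]
RECORDS = ["Chart HOSP-2020-000777 reviewed.", "Nothing to redact."]

print(run_batch(RECORDS, LABELED))
```

Failing loudly before the batch starts is a deliberate choice here: a pattern that misses one sample is likely to miss many of the 50,000 records.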

Multi-Facility Networks: Multiple MRN Formats

Hospital networks built through mergers often run several EHR systems. Each legacy system may use a different MRN format.

How to handle this:

Create a separate custom entity for each format:

"MRN Format A (Epic)" — 8-digit numeric
"MRN Format B (legacy Cerner)" — prefix + 7-digit numeric
"MRN Format C (acquired affiliate)" — state code + year + sequence

One preset holds all three custom entities plus the standard HIPAA ID types. MRNs in any of the three configured formats are then removed from documents from every facility.
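A preset with several MRN entities might behave roughly like the sketch below. The concrete prefixes (a three-letter prefix for Format B, a two-letter state code for Format C) are assumptions made for the example, since the descriptions above do not fix them. Per-entity counts are kept so that a format matching nothing at all stands out.

```python
import re
from collections import Counter

# Hypothetical preset. More specific formats run first so the bare
# 8-digit rule cannot consume part of a longer identifier.
PRESET = [
    ("MRN Format C (acquired affiliate)", re.compile(r"\b[A-Z]{2}-\d{4}-\d{6}\b")),
    ("MRN Format B (legacy Cerner)", re.compile(r"\b[A-Z]{3}-\d{7}\b")),
    ("MRN Format A (Epic)", re.compile(r"\b\d{8}\b")),
]


def redact_all(text: str):
    """Apply every entity in the preset; return redacted text and hit counts."""
    counts = Counter()
    for name, rx in PRESET:
        text, n = rx.subn("[MRN]", text)
        counts[name] += n
    return text, counts


doc = "Epic 12345678; legacy LGC-7654321; affiliate TX-2019-004512."
redacted, counts = redact_all(doc)
print(redacted)
print(dict(counts))
```

A bare 8-digit rule like Format A is prone to false positives (dates written as 20230115, for example), so it deserves the most careful sample review of the three.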

See "Custom MRN detection in HIPAA pipelines without code" for a step-by-step guide to this multi-format setup.

Beyond MRNs: Other Non-Standard Identifiers

The same approach works for other HIPAA Safe Harbor ID types.

Health plan member numbers (Category 9): Each insurer uses its own format. Aetna, Blue Cross, and United Healthcare all look different. A billing team needs a custom pattern for each payer.

Account numbers (Category 10): Hospital billing account numbers differ by hospital.

License numbers (Category 11): DEA numbers have a standard federal format. State medical license numbers do not. Each state board uses its own format.

Device identifiers (Category 13): Medical device serial numbers are set by each maker.

For each of these, a custom entity closes the gap. No engineers needed.
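Where an identifier does have a fixed structure, a pattern can be paired with a checksum to cut false positives. The sketch below uses the commonly documented DEA registration number layout (two leading characters followed by seven digits, the last being a check digit) as an example. Treat the exact character rules as an assumption and verify them against current DEA guidance before relying on them.

```python
import re

# Assumed layout: a letter, then a letter or 9, then seven digits.
DEA_RX = re.compile(r"\b[A-Z][A-Z9]\d{7}\b")


def dea_checksum_ok(candidate: str) -> bool:
    """Check digit rule: (d1 + d3 + d5) + 2 * (d2 + d4 + d6), last digit equals d7."""
    d = [int(c) for c in candidate[2:]]
    total = d[0] + d[2] + d[4] + 2 * (d[1] + d[3] + d[5])
    return total % 10 == d[6]


def find_dea_numbers(text: str) -> list[str]:
    """Return shape matches that also pass the checksum."""
    return [m for m in DEA_RX.findall(text) if dea_checksum_ok(m)]


print(find_dea_numbers("Prescriber DEA AB1234563, internal ref AB1234567."))
```

For de-identification it may be safer to redact every shape match and use the checksum only to prioritize review, because a mistyped DEA number still points at a prescriber.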

See "Custom PII identifiers for organizational anonymization" for more on non-standard ID types.

Validation: Proving Safe Harbor Compliance

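HIPAA Safe Harbor says the covered entity must have no "actual knowledge" that the remaining information could be used, alone or in combination with other information, to identify someone. (45 CFR §164.514(b)(2)(ii))

Validating your custom entities helps show that all 18 ID types are covered.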

Validation steps:

Process 50–100 sample documents from the research dataset
Review the output — does anything look like an ID?
Run a second detection pass to catch any missed items
Document what you did

Your custom entity setup, sample review, and processing logs form your Safe Harbor record.
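The second detection pass and the processing record can be sketched as follows. The loose rule (any run of six or more digits) and the record fields are assumptions chosen for the illustration. The record stores counts rather than the flagged strings, so the log does not become a second copy of the identifiers.

```python
import json
import re

PRIMARY = re.compile(r"\bHOSP-\d{4}-\d{6}\b")
# Deliberately loose second pass: any run of 6+ digits left after redaction.
LOOSE = re.compile(r"\d{6,}")


def second_pass(doc_id: str, text: str):
    """Redact with the primary rule, then flag leftovers for human review."""
    redacted, hits = PRIMARY.subn("[Hospital MRN]", text)
    flagged = LOOSE.findall(redacted)
    record = {"doc_id": doc_id, "primary_hits": hits, "flagged_count": len(flagged)}
    return redacted, flagged, record


text = "MRN HOSP-2023-456789. Prior chart HOSP2022 123456 per referral."
redacted, flagged, record = second_pass("doc-001", text)
print(redacted)
print(json.dumps(record))
```

Here the malformed mention "HOSP2022 123456" slips past the primary rule, and the loose pass surfaces it for a reviewer.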

Conclusion

Standard PII tools on default settings do not complete HIPAA Safe Harbor de-identification. Medical record numbers are hospital-specific. They need custom detection.

Custom entity creation closes this gap in hours. Compliance officers can define the pattern, test it, and process data. No engineering work is needed.

The gap between "we ran a HIPAA tool" and "we removed all 18 Safe Harbor identifiers" is often just one missing custom entity.

When This Approach Has Limits

Defining your hospital's MRN format as a custom entity is the right way to close the gap a generic tool leaves open — but three limits apply.

Detection accuracy bounds the result. A custom pattern only removes what it matches exactly. If a clinician typed an MRN with an unusual prefix, an extra hyphen, or inside running prose rather than a header field, the regex can miss it. Free-text mentions and OCR artifacts in scanned discharge summaries are the usual sources of residual false negatives. The test on 20 or 50 sample documents tells you the pattern works on those documents, not that it generalizes to every record across years and departments. Hold back a separate validation set and review output before declaring a dataset de-identified.
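One way to estimate how much a strict pattern misses is to run a more tolerant variant next to it and review what only the tolerant one finds. The tolerant regex below is an assumption about likely variants (case differences, spaces or repeated hyphens as separators, and the letter O read in place of zero by OCR); it is not a complete list of what can go wrong.

```python
import re

STRICT = re.compile(r"\bHOSP-\d{4}-\d{6}\b")
# Tolerant variant: any case, 0-3 spaces or hyphens as separators,
# and O accepted where OCR may have misread a zero.
TOLERANT = re.compile(
    r"\bH[O0]SP[\s-]{0,3}[\dO]{4}[\s-]{0,3}[\dO]{6}\b", re.IGNORECASE
)


def near_misses(text: str) -> list[str]:
    """Return strings the tolerant rule finds that the strict rule does not."""
    strict_spans = {m.span() for m in STRICT.finditer(text)}
    return [m.group() for m in TOLERANT.finditer(text) if m.span() not in strict_spans]


text = (
    "Header HOSP-2023-456789. Note: pt hosp 2023 456789; "
    "scan H0SP-2O23-456789; HOSP--2024-000123."
)
print(near_misses(text))
```

A wider pattern also raises the false-positive rate, so near misses are better treated as a review queue than as an automatic redaction rule.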

Removing MRNs does not, by itself, achieve Safe Harbor. Safe Harbor requires all 18 identifier categories absent and no actual knowledge that re-identification is possible. Even with MRNs gone, the combination of admission date, sub-state location, age, and a rare diagnosis can re-identify a patient. That makes the output pseudonymized rather than truly anonymous, and the residual-risk judgment is a legal determination. The tool supports that determination and produces the processing record for it, but a qualified person still has to make and document the call.

Batch scale magnifies a systematic miss. Running 50,000 records through one preset is efficient, but if a facility used a second MRN format you did not configure, the same gap repeats across every file in the batch. A single mismatched legacy format becomes thousands of leaked identifiers, not one. After mergers especially, inventory every EHR system's format before the full run, and sample output from each facility rather than trusting that one validated pattern covers all of them.
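Sampling output per facility can be sketched like this. The facility names, the records, and the loose residual check are invented for the example; the idea is only that every facility contributes to the review sample, so an unconfigured legacy format cannot hide behind a large, clean majority.

```python
import random
import re
from collections import defaultdict

# Loose residual check: any run of 6+ digits in already-redacted output.
LOOSE = re.compile(r"\d{6,}")


def sample_by_facility(records, per_facility: int, seed: int = 0):
    """records: iterable of (facility, redacted_text). Returns facility -> sample."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for facility, text in records:
        groups[facility].append(text)
    return {
        facility: rng.sample(texts, min(per_facility, len(texts)))
        for facility, texts in groups.items()
    }


def facilities_with_residuals(samples) -> list[str]:
    """Return facilities whose sampled output still contains long digit runs."""
    return sorted(f for f, texts in samples.items() if any(LOOSE.search(t) for t in texts))


records = [
    ("Facility A", "Chart [Hospital MRN] reviewed."),
    ("Facility A", "No identifiers."),
    ("Facility B", "Legacy chart LGC-7654321 reviewed."),
]
samples = sample_by_facility(records, per_facility=5)
print(facilities_with_residuals(samples))
```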


