Presidio Misses 220+ GDPR Entities: The EU Coverage Gap
Updated for 2026
Microsoft Presidio ships with about 40 default entity recognizers. For US deployments, this works. It covers SSNs, US passports, driver's licenses, credit cards, and emails.
For EU deployments, the gap is large. GDPR protects the personal data of people in the EU, regardless of their nationality. EU teams need recognizers for identifiers that Presidio's defaults do not cover.
What Presidio Includes
Presidio's defaults fall into four groups.
US-centric IDs:
- US Social Security Number (SSN)
- US Passport Number
- US Driver's License Number
- US Bank Account Number
- US ITIN
- US Medical License Number
Universal IDs:
- Email Address
- Phone Number
- IP Address
- Credit Card Number
- Crypto Wallet Address
- URL
Text entities (NER-based):
- PERSON
- LOCATION
- ORGANIZATION
- DATE_TIME
Limited international:
- UK NHS Number
- UK National Insurance Number (NINO)
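- IBAN (IBAN_CODE), plus a few country-specific identifiers (for example Spanish NIF, Italian fiscal code, Polish PESEL) that are registered only for their own language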
Total: about 40 recognizers.
What EU Teams Need
Financial IDs
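IBAN shows up in most EU business files. It appears in payments, invoices, and payroll. Its structure is defined by ISO 13616. Presidio's predefined recognizers do include an IBAN recognizer (entity type IBAN_CODE), so the risk here is configuration, not absence. Confirm that it is enabled and that IBAN_CODE is among the entities you request.
Take a German fintech. Every payment file has an IBAN. If IBAN detection is switched off or left out of the entity list, the tool only looks for credit card numbers. The main EU payment ID is missed. That means a key piece of GDPR-covered data is never found.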
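The check that any IBAN detector relies on is the ISO 13616 mod-97 test. The following is a minimal standard-library sketch of that test, not Presidio code. The function name `is_valid_iban` and the small `IBAN_LENGTHS` table are illustrative assumptions, and the table covers only a handful of countries.

```python
import re

# Total IBAN length for a few countries (illustrative subset, not complete).
IBAN_LENGTHS = {"DE": 22, "FR": 27, "NL": 18, "ES": 24, "IT": 27}


def is_valid_iban(candidate: str) -> bool:
    """Return True if the candidate passes the shape, length, and mod-97 checks."""
    iban = re.sub(r"\s+", "", candidate).upper()
    # Country code, two check digits, then 11 to 30 alphanumeric characters.
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", iban):
        return False
    expected = IBAN_LENGTHS.get(iban[:2])
    if expected is not None and len(iban) != expected:
        return False
    # Move the first four characters to the end, then map A..Z to 10..35.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


print(is_valid_iban("DE89 3704 0044 0532 0130 00"))
print(is_valid_iban("DE89 3704 0044 0532 0130 01"))
```

A regex alone would flag many random strings that merely look like IBANs. Running the checksum on each candidate is what keeps false positives down.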
National Tax IDs
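Most of these are not covered by a default Presidio setup. Recognizers for the Spanish and Italian identifiers exist in the predefined set, but they are registered for Spanish and Italian text, so an English-only configuration does not run them. Check the supported-entities list for your Presidio version: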
- German Steueridentifikationsnummer: 11 digits
- French NIR: 13 characters plus a 2-digit check key (15 in total)
- Italian Codice Fiscale: 16 characters with checksum
- Spanish NIF/NIE: 9 characters with a letter
- Dutch BSN: 9 digits with elfproef validation
An EU payroll team handles files from many member states. Without these, it misses the most sensitive IDs in those records.
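Two of the validation rules named in the list can be sketched briefly: the Dutch elfproef and the Spanish control letter. This is a minimal standard-library illustration, not Presidio code. The function names are assumptions, and the sketch skips edge cases such as 8-digit BSNs written without a leading zero.

```python
import re

# Control letters for Spanish NIF/NIE, indexed by (number mod 23).
NIF_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def is_valid_bsn(value: str) -> bool:
    """Dutch BSN: 9 digits that pass the elfproef (weights 9..2, then -1)."""
    if not re.fullmatch(r"\d{9}", value) or value == "000000000":
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(w * int(d) for w, d in zip(weights, value))
    return total % 11 == 0


def is_valid_nif_nie(value: str) -> bool:
    """Spanish NIF (8 digits + letter) or NIE (X/Y/Z + 7 digits + letter)."""
    match = re.fullmatch(r"([XYZ]\d{7}|\d{8})([A-Z])", value.upper())
    if not match:
        return False
    number, letter = match.groups()
    # The NIE prefixes X, Y, Z stand for the digits 0, 1, 2.
    number = number.translate(str.maketrans("XYZ", "012"))
    return NIF_LETTERS[int(number) % 23] == letter


print(is_valid_bsn("111222333"), is_valid_nif_nie("12345678Z"))
```

Each national ID needs its own rule like these, which is why coverage across many member states adds up quickly.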
National Health IDs
The UK NHS Number is covered. These are not covered in a default English-language setup:
- French NIR (also a health ID)
- German Krankenversichertennummer (KVNR)
- Italian Codice Fiscale (also a health ID)
- Dutch BSN (used for health insurance)
EU health teams need these for GDPR-level data protection.
EU Driving Licenses
EU driving licenses follow the common model set by Directive 2006/126/EC, but each member state issues its own number format, and the alphanumeric structure differs by country. Presidio's driver's license recognizers are largely US-focused. Beyond a country-specific recognizer for Italy, it has no general EU driving license support. This means most EU driving license numbers pass through undetected.
VAT Numbers
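EU VAT numbers appear on most B2B invoices. The format is a 2-letter country code followed by 2 to 12 characters: digits only in most member states, a mix of digits and letters in some. Presidio has no general EU VAT recognizer. A VAT number that identifies a sole trader or another natural person is personal data under GDPR. One that identifies only a company is not.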
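VAT layouts differ by member state, so a detector usually picks its pattern from the country prefix. The sketch below shows that idea with the standard library only. The `VAT_PATTERNS` table is an illustrative subset, the function name is an assumption, and the code checks shape only. It does not run the national checksums.

```python
import re

# Illustrative subset of national VAT layouts (the part after the prefix).
VAT_PATTERNS = {
    "DE": r"\d{9}",
    "FR": r"[0-9A-HJ-NP-Z]{2}\d{9}",
    "IT": r"\d{11}",
    "NL": r"\d{9}B\d{2}",
    "ES": r"[0-9A-Z]\d{7}[0-9A-Z]",
}


def match_vat_shape(candidate: str) -> bool:
    """Return True if the candidate has a known prefix and the matching shape."""
    vat = re.sub(r"[\s.\-]", "", candidate).upper()
    pattern = VAT_PATTERNS.get(vat[:2])
    return bool(pattern) and re.fullmatch(pattern, vat[2:]) is not None


print(match_vat_shape("DE 123 456 789"))
print(match_vat_shape("NL123456789B01"))
```

A shape match is a weak signal on its own. A production recognizer would add the per-country check digits and context words such as "USt-IdNr" or "VAT".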
For more on GDPR obligations, see GDPR compliance resources.
The Cost of Custom Recognizers
When EU teams find the gap, they build custom recognizers. This takes real time.
Time per recognizer (rough estimate):
- Research the format: 1–2 hours
- Write the Python class: 2–4 hours
- Build the regex and validation: 2–4 hours
- Add context words: 1–2 hours
- Write tests: 2–3 hours
- Deploy and check: 1–2 hours
That is 9–17 hours per recognizer. These are rough estimates only.
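To make the steps above concrete, here is a rough plain-Python sketch of the three core pieces for a German Steuer-ID: a candidate regex, checksum validation, and a context-word boost. It does not use Presidio's API. In Presidio these pieces would typically live in a custom pattern recognizer. The names, the scores 0.5 and 0.85, and the 40-character window are all illustrative assumptions, and the sketch omits the Steuer-ID rule about repeated digits.

```python
import re
from dataclasses import dataclass

# An 11-digit run that is not part of a longer number.
CANDIDATE = re.compile(r"(?<!\d)\d{11}(?!\d)")
CONTEXT_WORDS = ("steuer-id", "steueridentifikationsnummer", "idnr")
CONTEXT_WINDOW = 40  # characters before the match to scan for context words


@dataclass
class Finding:
    start: int
    end: int
    score: float


def steuer_id_checksum_ok(digits: str) -> bool:
    """ISO 7064 MOD 11,10 check over the first ten digits."""
    product = 10
    for ch in digits[:10]:
        total = (int(ch) + product) % 10
        if total == 0:
            total = 10
        product = (total * 2) % 11
    return (11 - product) % 10 == int(digits[10])


def find_steuer_ids(text: str) -> list[Finding]:
    findings = []
    for m in CANDIDATE.finditer(text):
        digits = m.group()
        # A Steuer-ID never starts with 0 and must pass the checksum.
        if digits[0] == "0" or not steuer_id_checksum_ok(digits):
            continue
        before = text[max(0, m.start() - CONTEXT_WINDOW):m.start()].lower()
        # Assumed scores: higher confidence when a context word is nearby.
        score = 0.85 if any(w in before for w in CONTEXT_WORDS) else 0.5
        findings.append(Finding(m.start(), m.end(), score))
    return findings


sample = "Steuer-ID: 65929970489, Bestellnummer 12345678901"
for f in find_steuer_ids(sample):
    print(sample[f.start:f.end], f.score)
```

Even this small sketch touches most of the steps in the estimate: format research, regex, validation, context words, and a test input.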
Example: a German fintech needs four recognizers: IBAN, Steuer-ID, EU driving license, and German VAT.
- 4 recognizers at 13 hours each (the midpoint of the 9–17 hour range) = 52 hours of work
- At €100 per hour: about €5,200
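The arithmetic behind these figures can be reproduced in a few lines. The step names in `STEP_HOURS` are shorthand labels chosen for this sketch. The hour ranges, the recognizer count, and the hourly rate come from the estimates above.

```python
# (low, high) hour estimates per step, copied from the list above.
STEP_HOURS = {
    "research": (1, 2),
    "python_class": (2, 4),
    "regex_validation": (2, 4),
    "context_words": (1, 2),
    "tests": (2, 3),
    "deploy": (1, 2),
}


def build_cost(recognizers: int, hours_each: float, rate_eur: float) -> float:
    """First-build cost only; upkeep is not included."""
    return recognizers * hours_each * rate_eur


low = sum(lo for lo, _ in STEP_HOURS.values())    # 9 hours
high = sum(hi for _, hi in STEP_HOURS.values())   # 17 hours
midpoint = (low + high) / 2                       # 13 hours
total_cost = build_cost(4, midpoint, 100)         # EUR 5,200

print(f"{low}-{high} h per recognizer, midpoint {midpoint:g} h")
print(f"4 recognizers: EUR {total_cost:,.0f}")
```

Changing the recognizer count or the hourly rate shows how sensitive the total is to each assumption.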
That covers only the first build. Formats change over time. New edge cases come up. Presidio API updates can break things. Each change needs a developer to review and fix it. Ongoing work adds to the cost year after year.
The Managed Library
anonym.legal extends Presidio with 285+ entity types. The team keeps the library up to date. EU identifiers are included from day one.
What goes beyond Presidio defaults:
- IBAN in all EU member state formats
- Member state tax IDs: Steuer-ID, NIR, Codice Fiscale, NIF/NIE, BSN, PESEL, and more
- EU national health IDs
- VAT numbers (EU format)
- EU driving license formats
- European passport formats
- Entity variants across 48 supported languages
When Germany updates a tax ID format, the update ships with the service. No pull request needed from your team.
For IDs not in the library, the custom entity builder lets you add patterns. No Python code is needed.
See security and compliance details for how updates and audit trails work.
German Fintech Example
A German fintech needs to detect IBANs, BICs, Steuer-IDs, and Handelsregisternummern in customer files.
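Presidio default detection rate for BICs, Steuer-IDs, and Handelsregisternummern: 0%. Of the four types, only IBAN has a predefined recognizer (IBAN_CODE).
The other three are not in the default library. This is not a case of poor recall. It is zero detections. The tool does not partly miss them. It does not see them at all.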
Cost comparison:
| Approach | Year-one cost |
|---|---|
| Custom recognizers (4 × 13 hrs at €100/hr) | ~€5,200 plus ongoing upkeep |
| Managed entity library (Pro plan) | €180/year, all four covered |
The gap is about 29x in year one. Each year after that, custom upkeep adds more cost. The managed service price stays flat.
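The ratio can be checked directly from the table. The sketch below also shows how the comparison moves over several years. The `upkeep_hours_per_year` parameter is a hypothetical input for this sketch, since the article gives no upkeep figure.

```python
BUILD_EUR = 4 * 13 * 100   # 4 recognizers x 13 hours x EUR 100/hour
MANAGED_EUR_PER_YEAR = 180  # Pro plan price quoted in the table
RATE_EUR = 100


def cumulative_costs(years: int, upkeep_hours_per_year: float) -> tuple[float, float]:
    """Return (custom, managed) totals; the upkeep hours are a hypothetical input."""
    custom = BUILD_EUR + years * upkeep_hours_per_year * RATE_EUR
    managed = years * MANAGED_EUR_PER_YEAR
    return custom, managed


ratio = BUILD_EUR / MANAGED_EUR_PER_YEAR
print(f"Year-one ratio, build cost only: {ratio:.1f}x")  # 28.9x
```

With zero upkeep the year-one ratio is about 29x. Any positive upkeep value widens it.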
Conclusion
Presidio's defaults serve US use cases well. For EU deployments under GDPR, they fall short. The gap requires either custom recognizer work or a managed service.
For EU teams where compliance is required and engineering time is tight, a pre-built EU entity library removes a 50+ hour build project. Files can be processed from day one. No custom code is needed first.