Beyond SSNs: Anonymizing Your Organization's Internal IDs

Your GDPR tool removes email addresses. It removes phone numbers. It removes names. You run support exports through it. Then you share the output with your analytics team.

Your customer account numbers are still in every ticket. Your order IDs are still there. Your internal user IDs are still there too.

These IDs look harmless on their own. Without a lookup table, they do not name a person. But your analytics team has that table. Your CRM has it. Your support database has it. Anyone with access can find the person in seconds.

This is a GDPR failure. The tool did not break. It was never told to look for your IDs.

What Standard PII Tools Detect

Standard PII tools cover universal formats. They catch what every organization uses.

Standard tools detect:

National identification numbers (US SSNs, UK NINOs, EU national ID formats)
Email addresses
Phone numbers
Credit card numbers
Names
Passport and driver's license numbers

Standard tools do not detect:

Employee IDs in your EMP-XXXXX format
Customer account numbers in your ACC-XXXXXXXX-XX format
Order IDs in your ORD-XXXXXXX format
Internal user IDs in UUID or custom formats
Partner-specific reference codes

Standard tools find universal patterns. Your internal IDs are not universal. They need custom setup to be found.

The Re-Identification Risk

A firm exports support tickets for quality review. Standard PII removal strips names, emails, and phone numbers. Account numbers in ACC-XXXXXXXX-XX format are not touched.

The export goes to the analytics team. An analyst joins the ticket table with the customer database on account number. The person is found at once. No special trick is needed. It is a routine SQL join.
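The join described above takes only a few lines. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are invented for illustration and do not come from any real system.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (ticket_id INTEGER, account_no TEXT, body TEXT);
    CREATE TABLE customers (account_no TEXT, full_name TEXT, email TEXT);
    INSERT INTO tickets VALUES
        (1, 'ACC-00123456-AB', '[NAME] reported a billing error.');
    INSERT INTO customers VALUES
        ('ACC-00123456-AB', 'Jane Example', 'jane@example.com');
""")

# The "anonymized" ticket still carries the account number, so one join
# re-attaches the name and email that the PII tool removed.
reidentified = conn.execute("""
    SELECT t.ticket_id, c.full_name, c.email
    FROM tickets AS t
    JOIN customers AS c ON c.account_no = t.account_no
""").fetchall()

print(reidentified)
```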

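GDPR Article 4(5) defines pseudonymization as processing after which data "can no longer be attributed to a specific data subject without the use of additional information," provided that the additional information is kept separately and protected by technical and organisational measures. Raw account numbers fail that test. The additional information, your customer database, sits in the same organization and is open to the same analysts.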

The "anonymized" export was not anonymous.

Building Custom Entity Patterns

Custom entity setup is fast. Compliance teams can do it with no engineering help.

Step 1: List your ID formats.

Write down each one. For example: account ACC-XXXXXXXX-XX, order ID ORD-XXXXXXX, employee ID EMP-XXXXX.

Step 2: Describe the format in plain language.

"Account numbers start with ACC, then a dash, then 8 digits, then a dash, then 2 uppercase letters."

AI-assisted pattern generation returns: ACC-\d{8}-[A-Z]{2}
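As a rough check of what such a pattern does, here is a minimal Python sketch using the standard re module. The ACC pattern is the one above with word boundaries added. The ORD and EMP patterns are assumptions: they treat each X in the example formats as one digit, which may not match your real formats. The find_ids helper is hypothetical.

```python
import re

# Patterns for the three example formats named in Step 1. The ORD and EMP
# digit counts assume each X stands for one digit; adjust to your formats.
ID_PATTERNS = {
    "ACCOUNT": re.compile(r"\bACC-\d{8}-[A-Z]{2}\b"),
    "ORDER": re.compile(r"\bORD-\d{7}\b"),
    "EMPLOYEE": re.compile(r"\bEMP-\d{5}\b"),
}

def find_ids(text):
    """Return (entity_type, matched_value) pairs, ordered by position in the text."""
    hits = []
    for entity, pattern in ID_PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((m.start(), entity, m.group()))
    return [(entity, value) for _, entity, value in sorted(hits)]

sample = "Customer ACC-00123456-AB asked about ORD-0045678 (agent EMP-00042)."
print(find_ids(sample))
```

The word boundaries (\b) keep the pattern from matching inside a longer token, such as an ID followed by extra letters or digits.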

Step 3: Test on sample data.

Upload 20 to 30 documents. Confirm all instances are found. Confirm no false hits appear.
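The same check can be scripted if you prefer to verify a pattern outside the tool. This is a sketch, not the tool's own test procedure: the evaluate helper and the three sample texts are invented, and the expected IDs are assumed to come from a reviewer who labeled each document by hand.

```python
import re

ACCOUNT = re.compile(r"\bACC-\d{8}-[A-Z]{2}\b")

def evaluate(pattern, labeled_samples):
    """Compare pattern matches against hand-labeled IDs for each sample text."""
    missed, false_hits = [], []
    for text, expected in labeled_samples:
        found = set(pattern.findall(text))
        missed.extend(sorted(expected - found))      # labeled but not matched
        false_hits.extend(sorted(found - expected))  # matched but not labeled
    return missed, false_hits

# Hand-labeled samples, invented for illustration. In practice use the
# 20 to 30 real documents and have a reviewer list the IDs in each one.
samples = [
    ("Refund issued to ACC-00123456-AB.", {"ACC-00123456-AB"}),
    ("Typo in ticket: acc-00123456-ab was quoted.", {"acc-00123456-ab"}),
    ("Part number ACC-1234 is out of stock.", set()),
]

missed, false_hits = evaluate(ACCOUNT, samples)
print("missed:", missed)
print("false hits:", false_hits)
```

In this invented sample the lowercase variant is reported as missed. That is the kind of gap a test pass is meant to surface before the pattern goes into production.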

Step 4: Choose a method.

For IDs used as join keys, where analysis needs to link records:

Pseudonymize. Replace ACC-00123456-AB with ACC-99876543-XY each time. The same input always gives the same output. Joins still work. The original value cannot be found without the key.
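One common way to get a stable, keyed replacement is an HMAC over the original value. The sketch below is an assumption about how this could be done, not a description of any particular tool's method. The function names and the key are hypothetical, and the output is squeezed back into the ACC format so downstream code that expects that shape keeps working.

```python
import hashlib
import hmac
import re

ACCOUNT = re.compile(r"\bACC-\d{8}-[A-Z]{2}\b")

def pseudonymize_account(account_no, key):
    """Map an account number to a stable stand-in in the same ACC format."""
    digest = hmac.new(key, account_no.encode("utf-8"), hashlib.sha256).digest()
    # First 8 bytes become 8 digits; the next 2 bytes become 2 uppercase letters.
    number = int.from_bytes(digest[:8], "big") % 10**8
    letters = "".join(chr(ord("A") + b % 26) for b in digest[8:10])
    return f"ACC-{number:08d}-{letters}"

def pseudonymize_text(text, key):
    """Replace every account number in the text with its stand-in."""
    return ACCOUNT.sub(lambda m: pseudonymize_account(m.group(), key), text)

# Placeholder key for the example; a real key belongs in a secrets manager.
key = b"example-key-do-not-use"
a = pseudonymize_text("Ticket 1: ACC-00123456-AB called.", key)
b = pseudonymize_text("Ticket 2: ACC-00123456-AB called again.", key)
print(a)
print(b)
```

Two caveats apply to this sketch. Squeezing the digest into 8 digits and 2 letters means two different accounts can occasionally receive the same stand-in, so large datasets need a collision check or a longer token. And anyone who holds the key and a list of real account numbers can recompute the mapping, so the key needs the same protection as the lookup table itself.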

For IDs not needed in analysis:

Redact. Replace with [REDACTED]. Simple. Permanent.
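Redaction is the simpler of the two methods. A minimal sketch, assuming the ORD format uses seven digits (one per X); the redact helper is hypothetical.

```python
import re

# Assumes each X in ORD-XXXXXXX stands for one digit.
ORDER = re.compile(r"\bORD-\d{7}\b")

def redact(text, pattern, placeholder="[REDACTED]"):
    """Replace every match with a fixed placeholder; nothing links back to the original."""
    return pattern.sub(placeholder, text)

print(redact("Order ORD-0045678 shipped late; ORD-0045679 was on time.", ORDER))
```

Because every ID collapses to the same placeholder, records can no longer be linked to each other. That is the trade-off against pseudonymization.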

Step 5: Save as a shared preset.

Save the custom entity, or a set of them, to a shared preset. The setup then applies everywhere the tool is used: batch uploads, API calls, and the browser interface. New team members get the full config at once.
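To make the idea of a preset concrete, here is a sketch of what one could look like when stored as JSON. The layout, field names, and load_preset helper are invented for this example; a real tool will have its own schema. The point is that one shared definition feeds every mode of use.

```python
import json
import re

# Hypothetical preset layout, invented for this sketch.
preset = {
    "name": "internal-ids",
    "entities": [
        {"type": "ACCOUNT", "regex": r"\bACC-\d{8}-[A-Z]{2}\b", "method": "pseudonymize"},
        {"type": "ORDER", "regex": r"\bORD-\d{7}\b", "method": "redact"},
        {"type": "EMPLOYEE", "regex": r"\bEMP-\d{5}\b", "method": "redact"},
    ],
}

def load_preset(serialized):
    """Parse a serialized preset and compile each pattern, failing fast on a bad regex."""
    data = json.loads(serialized)
    return {e["type"]: (re.compile(e["regex"]), e["method"]) for e in data["entities"]}

serialized = json.dumps(preset, indent=2)  # what would be stored and shared
entities = load_preset(serialized)
print(sorted(entities))
```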

Case Study: 180,000 Support Tickets

A firm found 180,000 support tickets in their analytics warehouse. Names and emails had been removed. Account numbers had not. Each ticket still held a live ACC-XXXXXXXX-XX value.

Resolution timeline:

Compliance officer defines the ACC pattern — 15 minutes
Tests it on 30 sample tickets — 20 minutes
Confirms accuracy — 10 minutes
Processes 180,000 tickets in an overnight batch
Replaces warehouse tables with the clean versions

Total time for the compliance officer: 45 minutes. Without custom entity support, the fix would need an engineering ticket, code review, and a deploy. That takes weeks, not hours.

For a closer look at how custom IDs create risk in AI support tools, see the GDPR and support AI guide.

Where Custom IDs Spread

Internal IDs appear in more places than most teams expect.

Internal documents:

Meeting notes with account or order ID references
Email threads about customer cases
Presentations with case study data

Shared with third parties:

Reports to regulators with case reference numbers
Audit files with customer references
Vendor files that carry customer IDs

Research and analytics:

Customer journey datasets
Support quality review exports
Training data for internal ML models

Each context needs the same custom entity setup. Without it, internal IDs survive in the output and the data stays re-identifiable.

Pseudonymization vs. Anonymization

GDPR draws a clear line.

Pseudonymization replaces IDs with stand-ins. The original person can be found again if someone has the lookup table. This data is still personal data. It reduces risk. It does not remove your GDPR duties.

Anonymization removes the ability to re-identify. Anonymous data is not personal data. GDPR does not apply to it.

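Account numbers and order IDs are pseudonymous at best while lookup tables exist. Replacing them with consistent, keyed stand-ins lowers risk, but GDPR still applies. Replacing them with random tokens and deleting the key cuts the link back to your lookup tables, so joins against other tables no longer work. The result falls outside GDPR only if nothing else in the data can identify a person.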
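The random-token approach can be sketched as follows. This is an illustration under stated assumptions, not a guarantee of anonymity: the tokenize_export function and the TOK- prefix are hypothetical, and the mapping from real IDs to tokens exists only in memory during the call.

```python
import re
import secrets

ACCOUNT = re.compile(r"\bACC-\d{8}-[A-Z]{2}\b")

def tokenize_export(texts):
    """Replace each account number with a random token, then drop the mapping."""
    mapping = {}  # lives only for the duration of this call

    def swap(match):
        original = match.group()
        if original not in mapping:
            mapping[original] = "TOK-" + secrets.token_hex(8)
        return mapping[original]

    # Only the rewritten texts are returned; the mapping is never persisted.
    return [ACCOUNT.sub(swap, text) for text in texts]

export = tokenize_export([
    "ACC-00123456-AB reported a billing error.",
    "Follow-up from ACC-00123456-AB.",
    "ACC-00999999-ZZ asked for a refund.",
])
print(export)
```

Within one export, tickets from the same account still share a token. Nothing connects that token to the customer database or to tokens from a later run, so cross-table joins are gone. Free text in the tickets can still identify a person on its own, which this sketch does not address.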

For sharing with third parties who lack your lookup tables, pseudonymization may be enough. For internal analytics, full anonymization or strict access controls are needed. The legal compliance guide covers how to document each approach for your ROPA.

Conclusion

The gap is not a tool failure. It is a setup gap. No tool can know your account number format unless you tell it.

Custom entity setup closes the gap in hours. Compliance teams define the formats, test them on sample data, and apply them across all use modes. No engineering help is needed.

The 180,000 unredacted account numbers were not there because the tool failed. They were there because the tool was never told to look for them.
