Last updated 2026-05-29


GDPR in App Logs: JSON PII Compliance

Application logs often contain customer email addresses, IP addresses, and account numbers. All of these fall under the storage limitation principle in GDPR Article 5(1)(e), so they need a defined retention period.

May 29, 2026 · 6 minute read
API logs, GDPR compliance, JSON anonymization, observability, storage limitation

The Silent GDPR Risk in Your Log Stack


Most teams check their database for personal info. Fewer do the same for their log system.

GDPR Article 5(1)(e), the storage limitation principle, says personal info may be kept in identifiable form only for as long as its purpose requires. For databases, teams set policies and run deletion jobs. For log files, the usual rule is cruder: keep everything for 90 days for debugging.

The problem? Those records hold personal info. Request entries hold user emails. Error captures hold raw input values. Access entries hold IP addresses. Each of these counts as personal info under GDPR. Your team needs a lawful basis and a retention plan for each one.

What Ends Up in Your Log Files

Standard web app logging pulls in a wide range of PII.

Access records (nginx/Apache):

  • IP addresses — personal info per EDPB guidance
  • User-agent strings — may enable device fingerprinting
  • Session tokens — if written to output

App records (structured JSON):

  • User IDs and email addresses
  • Input errors — often include the raw invalid value, which may be real user info
  • Business events — order IDs linked to customer accounts
  • Search queries — may contain names or addresses

API gateway records:

  • Auth headers — partly captured in some setups
  • Query params — may carry user IDs, names, or emails
  • Request and response bodies — present in debug-level setups

Database audit entries:

  • SQL queries with WHERE clauses like email = 'user@example.com'
  • Literal personal values in query params

This is not done on purpose. It is a side effect of logging built for debugging, not GDPR.
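To see how much of this ends up in your own output, a quick audit pass over structured logs can help. The sketch below walks a parsed JSON log line and reports where email addresses and IPv4 addresses appear. The two regular expressions and the function names `find_pii` and `scan_line` are illustrative assumptions, not a complete detector; names, phone numbers, and account numbers would need their own rules.

```python
import json
import re

# Illustrative patterns only; real detection needs much broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_pii(value, path=""):
    """Walk a parsed JSON value and yield (path, kind, match) for each hit."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_pii(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_pii(child, f"{path}[{index}]")
    elif isinstance(value, str):
        for match in EMAIL_RE.findall(value):
            yield (path, "email", match)
        for match in IPV4_RE.findall(value):
            yield (path, "ipv4", match)


def scan_line(line):
    """Parse one JSON log line and return its PII hits as a list."""
    return list(find_pii(json.loads(line)))


sample = '{"level": "error", "msg": "invalid email bob@example.org", "client": {"ip": "203.0.113.7"}}'
for hit in scan_line(sample):
    print(hit)
```

Running a pass like this over a day of logs gives a rough field-by-field inventory, which is a useful starting point for deciding what to mask.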

EDPB Guidance on IP Addresses

The European Data Protection Board says IP addresses are personal info. ISPs can link them to subscribers. Within an org, they can identify specific users.

The impact is direct. Access records with IP addresses are personal records. Keeping nginx output for 12 months means keeping personal info for 12 months. That needs a lawful basis under Article 6. It also needs the retention period to match your stated purpose.

Most teams skip this step. "We keep entries for 90 days because security says so" is a rule of thumb. It is not a GDPR Article 5(1)(e) review. See our Legal Compliance overview for how this fits a broader program.

How to Reach Compliance

The practical route for most teams is not to cut retention windows. Operational and security reasons for longer windows are real. The better path is to mask records before long-term storage.

A tiered model works well.

0–7 days: Full raw records for active debugging. Seven days covers most debugging needs while keeping the raw window short.

7–90 days: Masked records for trend analysis and security review. IP addresses are swapped out. User emails become stable tokens. Account numbers are masked. Key fields — timestamps, error codes, latency, endpoints — are kept as-is.

90+ days (if needed): Aggregated output only. Event counts, error rates, latency ranges. No user-level records remain.

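Raw, directly identifying personal info stops at seven days. Keep in mind that records masked with stable tokens are pseudonymized rather than anonymized. GDPR Recital 26 still treats pseudonymized data as personal info when it can be linked back to a person, so the 7–90 day tier needs its own documented purpose. Aggregated output can carry forward without exposing anyone. See Security & Compliance for more detail.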
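The tiered model can be expressed as a small age check plus an aggregation step. This is a minimal sketch using the 7 and 90 day boundaries from the tiers above; the function names `retention_tier` and `aggregate` and the record fields `path` and `status` are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

RAW_DAYS = 7      # tier 1: full raw records
MASKED_DAYS = 90  # tier 2: masked records


def retention_tier(record_time, now=None):
    """Return 'raw', 'masked', or 'aggregate' based on the record's age."""
    now = now or datetime.now(timezone.utc)
    age = now - record_time
    if age < timedelta(days=RAW_DAYS):
        return "raw"
    if age < timedelta(days=MASKED_DAYS):
        return "masked"
    return "aggregate"


def aggregate(records):
    """Reduce records to counts per (path, status); user-level fields are dropped."""
    return Counter((record["path"], record["status"]) for record in records)
```

A retention job could call `retention_tier` on each stored batch and then mask, aggregate, or leave it alone depending on the result.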

Keep Structure Intact for Monitoring

Good masking keeps the JSON structure intact. It only swaps out content. This keeps output useful for debugging and alerts.

Kept as-is:

  • JSON keys and nesting
  • Timestamps and time order
  • Error types and HTTP status codes
  • HTTP methods, paths, and latency values
  • Business event types

Swapped out:

  • Email addresses → stable token per original (e.g. user1@example.com)
  • IP addresses → RFC 5737 ranges (192.0.2.x)
  • Account numbers → ACCT_XXXXX
  • Phone numbers → +XX XXX XXX XXXX
  • Names in error text → [PERSON]

Stable tokens keep traces useful. A trace for user1@example.com across 40 entries works the same as the original. Aggregated metrics — error rates, latency, throughput — need no personal info at all. See the Glossary for the terms pseudonymization and anonymization.
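Here is one way structure-preserving masking with stable tokens might look. The sketch covers only emails and IPv4 addresses, and the `Masker` class is a hypothetical helper, not a specific product API. It keeps an in-memory mapping so the same original always gets the same token. That mapping is what makes this pseudonymization: anyone holding it can reverse the masking.

```python
import re

# Illustrative patterns; version strings such as 1.2.3.4 would also match IPV4_RE.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Masker:
    """Swap emails and IPv4 addresses for stable tokens; keep structure intact."""

    def __init__(self):
        self._emails = {}  # original email (lowercased) -> token
        self._ips = {}     # original IP -> token

    def _email_token(self, match):
        original = match.group(0).lower()
        if original not in self._emails:
            self._emails[original] = f"user{len(self._emails) + 1}@example.com"
        return self._emails[original]

    def _ip_token(self, match):
        original = match.group(0)
        if original not in self._ips:
            # RFC 5737 documentation range; the host part wraps after 254 addresses.
            host = len(self._ips) % 254 + 1
            self._ips[original] = f"192.0.2.{host}"
        return self._ips[original]

    def mask(self, value):
        """Return a masked copy of a parsed JSON value; keys and types are unchanged."""
        if isinstance(value, dict):
            return {key: self.mask(child) for key, child in value.items()}
        if isinstance(value, list):
            return [self.mask(child) for child in value]
        if isinstance(value, str):
            value = EMAIL_RE.sub(self._email_token, value)
            return IPV4_RE.sub(self._ip_token, value)
        return value
```

Because only string contents change, timestamps, status codes, and nesting pass through untouched, and a trace that follows one user still lines up after masking.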

Three Ways to Integrate This

Three patterns cover most engineering teams.

Option 1 — Pipeline masking: Fluentd or Logstash intercepts each line before sending it on. A masking step runs inline. Elastic or Datadog gets only cleaned records. No app code changes are needed.
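As a rough illustration of an inline masking step, the sketch below reads log lines from one stream and writes masked lines to another. It assumes your pipeline can pass lines through an external process or a custom filter; how that is wired up depends on the tool and is not shown here. Tokens are derived with a keyed hash (HMAC) instead of a lookup table, so several pipeline workers produce the same token without sharing state. The `SECRET` value and all function names are placeholders.

```python
import hashlib
import hmac
import json
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = b"replace-me"  # placeholder key; load from a secret store in practice


def token(original):
    """Derive a stable, keyed token for an email address."""
    digest = hmac.new(SECRET, original.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user-{digest[:12]}@example.com"


def mask_text(text):
    """Replace every email address in a string with its token."""
    return EMAIL_RE.sub(lambda match: token(match.group(0)), text)


def mask_value(value):
    """Mask strings inside a parsed JSON value; structure is preserved."""
    if isinstance(value, dict):
        return {key: mask_value(child) for key, child in value.items()}
    if isinstance(value, list):
        return [mask_value(child) for child in value]
    if isinstance(value, str):
        return mask_text(value)
    return value


def mask_line(line):
    """Mask one log line; lines that are not valid JSON are masked as plain text."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return mask_text(line)
    return json.dumps(mask_value(record))


def run(stream_in, stream_out):
    """Copy lines from stream_in to stream_out, masking each one on the way."""
    for line in stream_in:
        stream_out.write(mask_line(line.rstrip("\n")) + "\n")
```

Calling `run(sys.stdin, sys.stdout)` from a script entry point turns this into a line filter. Whoever holds the key can recompute tokens for a known address, so the key needs the same protection as the raw logs.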

Option 2 — Nightly batch: Raw records land in local storage. A nightly job masks the prior day's output and sends the masked copy to long-term storage. The same job deletes raw files once they are older than seven days, so raw output never outlives the debugging window.
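A nightly batch job along these lines could be sketched as follows. It assumes one raw file per day named `app-YYYY-MM-DD.log`, which is a hypothetical naming scheme, and it assumes raw files are removed once they pass the seven-day window. The `mask_text` step here only redacts emails and stands in for a fuller masking pass.

```python
import re
from datetime import date, timedelta
from pathlib import Path

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
RAW_RETENTION_DAYS = 7


def mask_text(text):
    """Placeholder masking step: redact email addresses only."""
    return EMAIL_RE.sub("[EMAIL]", text)


def nightly_run(raw_dir, masked_dir, today):
    """Mask yesterday's raw file, then delete raw files past the retention window.

    Returns the names of the raw files that were deleted.
    """
    raw_dir, masked_dir = Path(raw_dir), Path(masked_dir)
    masked_dir.mkdir(parents=True, exist_ok=True)

    # 1. Write a masked copy of yesterday's file to long-term storage.
    yesterday = today - timedelta(days=1)
    source = raw_dir / f"app-{yesterday.isoformat()}.log"
    if source.exists():
        target = masked_dir / source.name
        with source.open(encoding="utf-8") as src, target.open("w", encoding="utf-8") as dst:
            for line in src:
                dst.write(mask_text(line))

    # 2. Delete raw files dated more than RAW_RETENTION_DAYS before today.
    cutoff = today - timedelta(days=RAW_RETENTION_DAYS)
    deleted = []
    for path in sorted(raw_dir.glob("app-*.log")):
        try:
            file_day = date.fromisoformat(path.stem[len("app-"):])
        except ValueError:
            continue  # skip files that do not follow the assumed naming scheme
        if file_day < cutoff:
            path.unlink()
            deleted.append(path.name)
    return deleted
```

Returning the list of deleted files gives the job something to log, which helps when you need to show that the retention policy actually runs.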

Option 3 — Pre-share masking: Raw records stay internal with strict access controls. Before sharing with pen testers or outside contractors, run a masking pass. External parties always get clean versions.

For GDPR docs, masking is a "technical measure" under Article 32. Record the tool, its setup, and your retention policy in your Records of Processing Activities (RoPA) under Article 30. See our FAQ for common RoPA questions.

Want a real-world example? Check the case studies for concrete implementation details. You can also review our pricing to see which plan includes built-in masking pipelines.
