Last updated 2026-03-03


Air-Gapped PII: Offline-First for Defense

41% of enterprise security policies prohibit cloud processing of classified documents.

March 3, 2026 · 8 minute read
Tags: offline, air-gap, desktop, ITAR, GDPR, government, defense, local processing

When the Network Has No Exit

A data scientist works at a defense firm. She has 3,000 personnel records. She needs to remove names, Social Security Numbers, and clearance levels. Then she can share the data with a research partner under a CUI agreement.

Her network has no internet. By design.

She tests every web-based tool she can find. Each one sends data to an outside server. Every cloud platform needs an account and a live link. Even "on-premises" tools often call a remote license server.

This is the air-gapped deployment problem. It hits far more teams than most people expect.

Who Needs Offline PII Removal

Defense firms and government agencies face this most often. FedRAMP and DISA's Cloud Computing Security Requirements Guide require data to stay within approved authorization boundaries. ITAR limits technical data to US-controlled systems. Networks like JWICS and SIPRNet are physically cut off from the public internet by design.

But the offline need goes well beyond classified sites:

Hospitals with segmented networks. PACS imaging systems, EHR platforms, and research databases often sit on networks with no internet by policy.

Trading floors and clearing houses. Proprietary trading systems and SWIFT-connected systems use strict network cutoffs.

Industrial control systems. SCADA networks and critical infrastructure run with air gaps as a core security measure. Post-Stuxnet hardening made this the norm.

European data rules. Germany's Landesdatenschutzgesetze (state data protection acts) and similar EU laws require local data processing for sensitive government and health records. The €530M GDPR fine against TikTok arrived in May 2025 and covered data transfers to China. That fine pushed more teams toward local tools. See our compliance overview for the GDPR transfer rules that apply.

Why Cloud Tools Fail in Air-Gapped Networks

Most data removal tools follow a SaaS model:

User Device → HTTPS → Vendor API → NLP Models → Response → User Device

This design needs internet access at the processing device. It needs trust in the vendor's servers. It means data crosses outside networks.

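On an air-gapped network, the first hop in that chain, the outbound HTTPS call, is a physical impossibility. In regulated environments, each later hop (the vendor API, the vendor-hosted NLP models, and the response path) may break compliance rules on its own.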

Self-hosted Presidio is the common fallback. But it needs Docker skills and Python setup. It also needs spaCy model downloads, which require internet access. And it needs ongoing IT support. Many teams lack at least one of these.

The gap between cloud ease and self-hosted complexity is exactly what local desktop tools fill.

How Local PII Removal Works

A good offline tool ships with everything it needs:

Bundled NLP models. spaCy models (40–80 MB each) and transformer models for named entity detection are part of the installer. No download is needed at run time.

Local detection pipeline. Regex, NLP, and ML all run on the local CPU — or GPU if available. The Presidio-based engine inside anonym.legal makes no network calls during a run.

Encrypted local vault. Configs, presets, and keys are stored locally. The vault uses AES-256-GCM encryption and Argon2id key derivation. No cloud sync. No remote backup. The vault stays on the device.

Local file I/O. Input files come from local storage. Output files go back to local storage. No data crosses any network interface.

Small attack surface. The Desktop App uses Tauri 2.0 (Rust-based). Tauri renders through the operating system's own WebView instead of bundling Chromium, so it has a much smaller attack surface than Electron-based tools. Its binary is roughly one-tenth the size. It also exposes fewer OS APIs to the front end by default.
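The regex layer of such a pipeline can be sketched in a few lines. This is a minimal illustration, not the engine inside anonym.legal: the pattern, the `redact_ssns` helper, and the `<US_SSN>` placeholder are all assumptions made for the example, and real detection would combine patterns like this with the NLP and ML layers described above.

```python
import re

# Hypothetical pattern for US Social Security Numbers in AAA-GG-SSSS form.
# It rejects area numbers 000, 666, and 900-999, group 00, and serial 0000.
SSN_PATTERN = re.compile(
    r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b"
)


def redact_ssns(text: str, placeholder: str = "<US_SSN>") -> tuple[str, int]:
    """Replace every SSN-shaped match; return the new text and the hit count."""
    return SSN_PATTERN.subn(placeholder, text)


cleaned, hits = redact_ssns("Badge 4471, SSN 123-45-6789, alt 000-12-3456.")
print(cleaned)  # Badge 4471, SSN <US_SSN>, alt 000-12-3456.
print(hits)     # 1
```

Everything here runs in-process with the standard library, which is the property that matters on an air-gapped workstation: no model download and no network call.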

Three Real Compliance Scenarios

ITAR Documents — 500 Files

A defense firm needs to share technical docs with a foreign partner under an ITAR exemption. The files contain US person names and personnel data. Both must be removed first.

Key needs: processing on cleared workstations only. No data sent outside the cleared network. An audit trail showing the work was done. Batch support for 500+ files.

The Desktop App handles all 500+ DOCX files locally in batch mode. No network call is made during the run. The audit log stays in the local vault. The output meets the needs of the ITAR exemption.
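A local batch run with an audit trail might look roughly like the sketch below. It is not the Desktop App's implementation: `run_batch`, the JSON-lines log format, and the caller-supplied `redact` function are hypothetical, and files are treated as opaque bytes, whereas real DOCX handling would need a document parser.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large documents are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_batch(in_dir: Path, out_dir: Path, log_path: Path, redact) -> int:
    """Apply `redact` (bytes -> bytes) to each file and append one log entry per file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    with log_path.open("a", encoding="utf-8") as log:
        for src in sorted(p for p in in_dir.iterdir() if p.is_file()):
            dst = out_dir / src.name
            dst.write_bytes(redact(src.read_bytes()))
            entry = {
                "time": datetime.now(timezone.utc).isoformat(),
                "file": src.name,
                "input_sha256": sha256_of(src),
                "output_sha256": sha256_of(dst),
            }
            log.write(json.dumps(entry) + "\n")
            count += 1
    return count
```

Recording a hash of both input and output lets a reviewer later confirm which file was processed and that the shared copy is the one the run produced, without the log itself containing any personal data.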

German Federal Agency — Complaint Records

A German federal agency must remove personal data from citizen complaint records. It then sends the records to a research institute. BfDI guidance bars processing on non-government systems.

The Desktop App runs on agency Windows 11 workstations. All processing is local. The IT security team confirms this with traffic monitoring — zero outside connections during the run.
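Traffic monitoring checks from the outside. For Python-based tooling, a team could add a complementary in-process check, sketched here under clear assumptions: `no_network` and `NetworkAccessBlocked` are hypothetical names, and the guard only intercepts `socket.socket.connect`, so it does not cover every path (for example, connectionless UDP sends). It is a test aid, not a replacement for network-level monitoring.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkAccessBlocked(RuntimeError):
    """Raised when code inside the guard tries to open a connection."""


@contextmanager
def no_network():
    """Make every socket.connect() call fail loudly while the block runs."""

    def blocked_connect(self, address):
        raise NetworkAccessBlocked(f"attempted connection to {address!r}")

    # patch.object restores the original method when the block exits.
    with mock.patch.object(socket.socket, "connect", blocked_connect):
        yield


# Usage: wrap a processing step; any outbound connection attempt raises.
with no_network():
    cleaned = "Complaint filed by Jane Doe".replace("Jane Doe", "<PERSON>")
```

If the wrapped step finishes without raising, it made no connection attempt through the standard socket API during that run.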

Hospital Research — EHR De-ID

A hospital research team needs to strip patient records for a clinical trial. HIPAA Safe Harbor requires removing 18 identifier types. The clinical network has no internet access.

The Desktop App handles batch processing of EHR exports in CSV and JSON format. The Privacy Officer reviews the output against Safe Harbor rules before the dataset goes to research partners.
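For structured exports, part of the Safe Harbor work is column-level. The sketch below assumes a hypothetical CSV layout (`patient_name`, `ssn`, `mrn`, `phone`, `email`, `birth_date`); a real EHR export will use different names. It covers only a few of the 18 identifier types, and free-text fields would still need NLP-based detection and the Privacy Officer's review.

```python
import csv
import io

# Hypothetical column names for direct identifiers in an EHR export.
DROP_COLUMNS = {"patient_name", "ssn", "mrn", "phone", "email"}


def deidentify_csv(raw: str) -> str:
    """Drop direct-identifier columns and reduce dates of birth to the year."""
    reader = csv.DictReader(io.StringIO(raw))
    kept = [name for name in reader.fieldnames if name not in DROP_COLUMNS]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if "birth_date" in row:
            # Assumes ISO dates (YYYY-MM-DD). Safe Harbor permits keeping the
            # year only; ages over 89 need extra handling not shown here.
            row["birth_date"] = row["birth_date"][:4]
        writer.writerow({name: row[name] for name in kept})
    return out.getvalue()
```

Given the input `patient_name,ssn,birth_date,diagnosis` with a row `Jane Doe,123-45-6789,1980-04-12,J45`, the function returns a two-column file containing `1980,J45`.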

What to Look for in an Offline Tool

| Capability | Why It Matters |
|---|---|
| Fully offline after install | No internet dependency during processing |
| Bundled NLP models | No download step needed |
| Batch processing | Handle large volumes without manual work |
| Local encrypted vault | Secure storage of configs and keys |
| Audit log | Records needed for compliance reviews |
| Windows, macOS, Linux support | Covers classified workstation types |
| No telemetry option | Stop data leaving via telemetry |
| File format support | DOCX, PDF, TXT, CSV, JSON, Excel |

Data Rules Push Teams Toward Local Tools

The TikTok €530M fine triggered a wider wave of fines. EU teams that used cloud tools now ask a new question. Does processing on a vendor's servers satisfy GDPR Chapter V and national data laws?

The cleanest answer to "where does your data go?" is this: nowhere — it never leaves the device. Local processing removes the GDPR transfer question fully.

For German teams, the strict reading that regulators give Articles 44–46 of the GDPR (DSGVO) makes local processing a smart choice. This applies even without strict network restrictions. Our security overview explains how local processing cuts out the third-party data chain.

Practical Deployment Notes

Install on air-gapped systems. The installer — Windows .exe or .msi, macOS .dmg, Linux .AppImage or .deb — transfers to the air-gapped network via USB or secure file transfer. Internet is not needed after install.

Language support. All 48 supported languages ship with the app (25 spaCy + 7 Stanza + 16 Transformer models). The full set is available offline with no extra download.

Hardware needs. The NLP pipeline runs on modern workstations without a GPU. Batch processing of 1,000 documents typically takes 5–15 minutes. Speed depends on document size and CPU speed.

Offline license setup. For networks where a license server is out of reach, offline license setup is available.
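When an installer crosses the air gap on removable media, it is common practice to confirm that the file arriving on the isolated side is the one that left the connected side. A minimal sketch follows, assuming a SHA-256 checksum was recorded before the transfer; `installer_matches` and the checksum workflow are illustrative assumptions, not a documented feature of the product.

```python
import hashlib
import hmac
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installer_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a checksum recorded before transfer."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())
```

On the isolated workstation, calling `installer_matches(Path("installer.msi"), recorded_checksum)` with a hypothetical file name and the recorded value returns `True` only if the file is byte-for-byte identical to the original.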

When Air-Gapping Is Not the Right Fit

Air-gapped systems solve specific problems. They also add real burden.

Update friction. Keeping models and software current takes manual steps. Teams that fall behind may miss new PII patterns.

Integration overhead. Air-gapped systems cannot connect to cloud SIEM tools or remote audit dashboards. Custom data-diode solutions are needed. This raises costs.

Accuracy trade-offs. Cloud tools update training data on an ongoing basis. Offline models are a snapshot. They can fall behind on new language patterns over time.

Not needed for every threat model. Teams without government, health, or legal mandates may find cloud tools more practical. Strong encryption, SOC 2 Type II audits, and data processing agreements cover most cases. Air-gapping only pays off when the threat model truly includes network-based data theft by a skilled adversary.

For most SMBs and standard enterprise teams, strong encryption in transit and at rest gives adequate protection. Add solid contractual controls and you cover most use cases — without the overhead of full air-gapping. See our FAQ for more on choosing the right deployment model.


anonym.legal's Desktop App (Windows, macOS, Linux) processes PII entirely locally with bundled NLP models. No internet connection is required after installation. Batch processing supports 1–5,000 files per run depending on plan tier.

