NAIH Hungary: TAJ-Szám and GDPR Technical Requirements
Updated for 2026
Hungary's data protection authority is the NAIH (Nemzeti Adatvédelmi és Információszabadság Hatóság). Its 2024 report found that NER (named entity recognition) accuracy for Hungarian is only 67%. The EU average is 82%. That gap creates real risk. Tools built for English or German miss Hungarian identifiers at high rates.
Why Hungarian NER Scores Low
Three features of Hungarian break standard NLP models.
Agglutination: Hungarian adds suffixes to root words. The same name takes many forms in a sentence. "Kovács Péter" in subject position becomes "Kovács Péternek" in the dative case ("to Péter" or "for Péter"). NER models must link all those forms to one person.
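A minimal sketch of that linking step. The suffix list and the `known_names` lexicon are illustrative assumptions, not part of any NAIH guidance. The list covers only a handful of case endings and ignores stem changes (for example "Anna" becoming "Annának") and assimilated endings (for example "Péterrel"). A production system would more likely rely on a morphological analyzer.

```python
# Illustrative subset of Hungarian case suffixes (assumption: hand-picked,
# far from complete). Longer suffixes are tried first.
CASE_SUFFIXES = (
    "nak", "nek", "tól", "től", "ról", "ről", "hoz", "hez", "höz",
    "ban", "ben", "nál", "nél", "ért", "ra", "re", "ba", "be", "t",
)


def base_form(token, known_names):
    """Return the lexicon entry a suffixed token maps to, else the token itself."""
    if token in known_names:
        return token
    for suffix in sorted(CASE_SUFFIXES, key=len, reverse=True):
        stem = token[: -len(suffix)]
        if token.endswith(suffix) and stem in known_names:
            return stem
    return token


def normalize_name(mention, known_names):
    """Strip a case suffix from the last token of a name mention.

    Hungarian marks case on the final word of the name, so only the
    last token is inspected.
    """
    parts = mention.split()
    if not parts:
        return mention
    parts[-1] = base_form(parts[-1], known_names)
    return " ".join(parts)


names = {"Péter"}  # hypothetical given-name lexicon
print(normalize_name("Kovács Péternek", names))  # Kovács Péter
print(normalize_name("Kovács Pétertől", names))  # Kovács Péter
```

Requiring the stem to appear in a lexicon keeps the sketch from truncating words that merely happen to end in one of the suffixes.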
Name order: Hungarian puts the family name first. Most NLP models expect given name first. That reversal causes missed detections.
Special characters: Hungarian uses ő and ű, written with a double acute accent. These are not the same as the German umlauts ö and ü, which Hungarian also uses. Mixed encoding (Windows-1250 vs UTF-8) also causes failures.
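One way to reduce encoding failures is to normalize text before it reaches the NER model. The sketch below is an assumption about how such a step could look, not a prescribed method: it tries UTF-8 first, falls back to Windows-1250 (Python codec `cp1250`), and then applies Unicode NFC normalization so that "o" followed by a combining double acute becomes the single character ő. The function name `decode_hungarian` is hypothetical.

```python
import unicodedata


def decode_hungarian(raw: bytes) -> str:
    """Decode bytes that may be UTF-8 or Windows-1250, then normalize to NFC."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback for legacy files. Bytes undefined in cp1250 are replaced
        # with U+FFFD rather than raising.
        text = raw.decode("cp1250", errors="replace")
    # Compose decomposed sequences such as "o" + U+030B into "ő".
    return unicodedata.normalize("NFC", text)


legacy = "Erdős Győző".encode("cp1250")
print(decode_hungarian(legacy))                      # Erdős Győző
print(decode_hungarian("Erdős".encode("utf-8")))     # Erdős
```

The fallback is a heuristic. A Windows-1250 byte sequence can occasionally be valid UTF-8 by coincidence, so files with a declared encoding should be decoded with that encoding instead.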
These three factors explain most of the accuracy gap in NAIH's 2024 report.
TAJ-Szám: Hungary's Social Security Number
The TAJ-szám (Társadalombiztosítási Azonosító Jel, the social security identification number) is a 9-digit number. It appears in healthcare, payroll, social benefits, and pension records.
Checksum: Multiply digits 1 to 8 by weights 3, 7, 3, 7, 3, 7, 3, 7. Add the results. Take modulo 10. The result must equal the ninth digit, which is the check digit.
This weighting is specific to the TAJ-szám. It is not the Luhn algorithm that some other identifiers use, so a Luhn validator will not work here.
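The checksum steps can be sketched as a small validator. The function name `is_valid_taj` and the choice to accept spaces and hyphens as separators are assumptions made for illustration. The sample number is made up and only chosen because it satisfies the checksum.

```python
import re

# Weights for digits 1 to 8, as described in the text.
TAJ_WEIGHTS = (3, 7, 3, 7, 3, 7, 3, 7)


def is_valid_taj(value: str) -> bool:
    """Check a TAJ-szám candidate against the weighted modulo-10 checksum."""
    digits = re.sub(r"[\s-]", "", value)  # tolerate "123 456 788" style input
    if not re.fullmatch(r"[0-9]{9}", digits):
        return False
    total = sum(int(d) * w for d, w in zip(digits[:8], TAJ_WEIGHTS))
    return total % 10 == int(digits[8])


# Worked example: 1*3 + 2*7 + 3*3 + 4*7 + 5*3 + 6*7 + 7*3 + 8*7 = 188,
# and 188 mod 10 = 8, so the ninth digit must be 8.
print(is_valid_taj("123 456 788"))  # True
print(is_valid_taj("123456789"))    # False
```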
Generic tools detect TAJ-szám at only 61% accuracy, per the NAIH 2024 report. The 9-digit format looks like many other numbers in Hungarian documents. Without the checksum step, tools flag false positives and miss real ones.
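To illustrate why pattern matching alone over-flags, the sketch below scans text for 9-digit candidates and keeps only those that pass the checksum. The regular expression, the function name `find_taj_numbers`, and the sample text are assumptions for illustration. A random 9-digit number still passes a modulo-10 check about one time in ten, so a real detector would likely also weigh context such as a nearby "TAJ" label.

```python
import re

# Nine digits, optionally grouped in threes by a space or hyphen, and not
# embedded in a longer digit run.
CANDIDATE = re.compile(r"(?<![0-9])[0-9]{3}[ -]?[0-9]{3}[ -]?[0-9]{3}(?![0-9])")
WEIGHTS = (3, 7, 3, 7, 3, 7, 3, 7)


def passes_checksum(digits: str) -> bool:
    """Weighted modulo-10 check over the first eight digits."""
    total = sum(int(d) * w for d, w in zip(digits[:8], WEIGHTS))
    return total % 10 == int(digits[8])


def find_taj_numbers(text: str) -> list:
    """Return checksum-valid 9-digit candidates, with separators removed."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if passes_checksum(digits):
            found.append(digits)
    return found


sample = "TAJ: 123 456 788, rendelés 123456789, tel 061234567"
print(find_taj_numbers(sample))  # ['123456788']
```

All three numbers in the sample match the 9-digit pattern, but only the first survives the checksum filter.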
Adóazonosító Jel: Hungary's Tax ID
The adóazonosító jel is a 10-digit personal tax number. The first digit is always 8. It appears in employment records, tax filings, and financial documents.
Checksum: Multiply each of digits 1 to 9 by its position (1 to 9). Add the results. Take modulo 11. That is the check digit. Numbers that would give a remainder of 10 are not issued.
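A minimal validator sketch. It assumes the position-weighted rule laid down for the adóazonosító jel in Hungarian law (Act XX of 1996): each of digits 1 to 9 is multiplied by its position, the products are summed, and the remainder modulo 11 must equal digit 10. The function name `is_valid_tax_id` is hypothetical, and the sample number is made up and only chosen because it satisfies the rule.

```python
import re


def is_valid_tax_id(value: str) -> bool:
    """Check a 10-digit adóazonosító jel candidate.

    Assumed rule: sum(digit_i * i for i in 1..9) mod 11 == digit 10.
    A remainder of 10 can never match a single digit, so such numbers
    are rejected, consistent with their not being issued.
    """
    digits = value.strip()
    if not re.fullmatch(r"8[0-9]{9}", digits):  # first digit is always 8
        return False
    total = sum(int(d) * pos for pos, d in enumerate(digits[:9], start=1))
    return total % 11 == int(digits[9])


# Worked example: 8*1 + 0*2 + 7*3 + 1*4 + 5*5 + 9*6 + 2*7 + 1*8 + 5*9 = 179,
# and 179 mod 11 = 3, so the tenth digit must be 3.
print(is_valid_tax_id("8071592153"))  # True
print(is_valid_tax_id("8071592154"))  # False
```

The sketch checks only the leading 8 and the check digit. It does not validate the remaining digits, which carry their own structure.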
NAIH enforcement cases show this number is often missed in HR documents when tools are set up for other languages.
See our EU national tax ID guide for how these numbers compare across member states.
NAIH's DPIA Requirement for AI Systems
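NAIH's 2024 guidance requires a completed data protection impact assessment (DPIA) before any AI system processes personal data. This is stricter than the general GDPR test, which requires a DPIA only where processing is likely to result in a high risk (Article 35). The DPIA must cover: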
- Data flows — training data, inputs, and outputs
- Legal basis — documented for each activity
- Language accuracy — required for languages below the EU average
- Human review — a way to check automated decisions
The DPIA must be updated each year when the system is retrained.
For teams deploying AI tools on Hungarian data, the order is fixed: DPIA first, then deployment.
Minimum Technical Controls
Three controls form the baseline for NAIH compliance:
- TAJ-szám detection with modulo-10 checksum — pattern matching alone is not enough
- Adóazonosító jel detection with checksum validation — critical for HR and finance
- Hungarian NER with agglutination support — must handle ő, ű, and encoding variants
See our BfDI Germany guide to compare how Central European DPAs set technical requirements. For a similar language gap in Central Europe, see our Czech ÚOOÚ guide.