The distinction matters legally. Under GDPR (Recital 26), anonymized data is no longer personal data and falls outside the regulation's scope entirely. No consent, no data subject rights, no processing restrictions. Pseudonymized data remains personal data because re-identification is possible, meaning GDPR and FADP obligations continue to apply in full. Choosing the wrong technique has direct compliance consequences.
Anonymization techniques include generalization (replacing specific values with ranges, e.g., exact age becomes age bracket), suppression (removing data entirely), perturbation (modifying values while preserving statistical properties), and k-anonymity approaches (ensuring each record is indistinguishable from at least k-1 others). The goal is to ensure that no combination of remaining data points can identify an individual, even when cross-referenced with external datasets. For legal documents, this means removing not just names but also contextual identifiers: case-specific facts, unique transaction details, or combinations of dates and locations that could identify parties.
Pseudonymization replaces identifiers with codes (e.g., "John Smith" becomes "Patient-001") while maintaining a separate key linking codes to real identities. This supports use cases where re-identification may be necessary (clinical trials, follow-up studies, internal audit) but adds re-identification risk, particularly if the mapping key is compromised or if the pseudonymized data contains enough contextual detail to enable inference attacks.
DocIQ Shield performs anonymization, not pseudonymization. Shield replaces personal data with standardized designations (A.____, B.____) following the conventions used by the Swiss Federal Supreme Court for published BGE decisions. No re-identification key is created or stored. The process is irreversible by design. Once a document is anonymized, the original personal data cannot be recovered from the output. Combined with zero-persistence processing, this means Shield never retains either the original or the mapping between original and anonymized identifiers.