GDPR Pseudonymization and Privacy
Source Type
Regulatory guidance. EU GDPR (Regulation (EU) 2016/679), Article 4(5) and Recital 26; EDPB guidance on identifiability, anonymization, and data subject rights.
Domain
Privacy regulation, pseudonymization, identifiability, data minimization, and lawful basis for identity processing.
Why This Source Matters
GDPR pseudonymization and identifiability concepts affect how canonical models should represent privacy-limited links, scoped identifiers, and correlation risk.
Key Concepts
- Personal data: any information relating to an identified or identifiable natural person.
- Identifiable person: a natural person who can be identified, directly or indirectly, by means reasonably likely to be used (Recital 26).
- Pseudonymization (Art. 4(5)): processing personal data so that it can no longer be attributed to a specific data subject without additional information, which is kept separately and protected by technical and organizational measures.
- Anonymization: irreversible de-identification, after which the data is no longer personal data.
- Data subject: the identified or identifiable natural person to whom personal data relates.
- Controller / Processor: roles responsible for processing personal data.
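- Purpose limitation: personal data is collected for specified, explicit, and legitimate purposes and not further processed in ways incompatible with them.
- Data minimization: personal data must be adequate, relevant, and limited to what is necessary for the purpose.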
- Right of access / erasure: data subject rights affecting linked records.
- Additional information: the key or mapping, held separately, that is needed to re-identify pseudonymous data.
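
The relationship between pseudonymized data and the separately held additional information can be illustrated with a minimal sketch. It assumes a random-token lookup table as the pseudonymization technique; `ReidentificationVault`, `pseudonymize`, and the record fields are hypothetical names chosen for illustration, not part of GDPR or of any canon artifact.

```python
import secrets


class ReidentificationVault:
    """Hypothetical holder of the 'additional information' (token -> identity).

    In a real deployment this mapping would live in a separate,
    access-controlled store, apart from the pseudonymized dataset.
    """

    def __init__(self):
        self._token_to_identity = {}
        self._identity_to_token = {}

    def token_for(self, identity: str) -> str:
        # Reuse an existing token so one person always maps to one pseudonym.
        if identity not in self._identity_to_token:
            token = secrets.token_hex(16)
            self._identity_to_token[identity] = token
            self._token_to_identity[token] = identity
        return self._identity_to_token[identity]

    def reidentify(self, token: str) -> str:
        # Only a holder of the vault can reverse the pseudonym.
        return self._token_to_identity[token]


def pseudonymize(records, vault):
    """Replace the direct identifier in each record with a random token."""
    return [
        {"subject": vault.token_for(r["email"]), "plan": r["plan"]}
        for r in records
    ]


records = [
    {"email": "ada@example.org", "plan": "pro"},
    {"email": "ada@example.org", "plan": "trial"},
    {"email": "bob@example.org", "plan": "free"},
]
vault = ReidentificationVault()
pseudonymous = pseudonymize(records, vault)
```

The `pseudonymous` list carries no direct identifier, yet it remains personal data as long as the vault exists, because `vault.reidentify` can reverse each token.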
Relevant Terminology
| Term | Source meaning |
|---|---|
| Personal data | Data about an identified or identifiable natural person. |
| Pseudonymization | De-identification that is reversible only with separately held additional information. |
| Anonymization | Irreversible; no longer personal data (if effective). |
| Data subject | Natural person the data relates to. |
| Identifiable | Reasonably linkable to a person. |
| Additional information | Re-identification key stored separately. |
| Controller | Determines purposes and means of processing. |
| Processing | Any operation on personal data. |
| Erasure | Delete personal data (right to be forgotten). |
| Profiling | Automated evaluation of personal aspects. |
Modeling Assumptions
- Pseudonymization is not anonymization; data may remain personal.
- Separate storage of additional information is required for pseudonymization.
- Scope and access control on keys determine correlation risk.
- Linking pseudonymous records across purposes may increase identifiability.
- Legal basis and purpose govern whether linking is permissible.
- Erasure requests may require breaking links or deleting assertions.
- Regulatory role (controller) is organizational, not purely technical.
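
The assumptions about key scope and cross-purpose linking can be sketched as follows. This is an illustration only: it assumes keyed hashing (HMAC-SHA256) with one key per scope as the pseudonymization technique, and `scoped_pseudonym` and the two demo keys are hypothetical. Real keys would come from a managed secret store, not from source code.

```python
import hashlib
import hmac


def scoped_pseudonym(scope_key: bytes, identifier: str) -> str:
    """Derive a pseudonym that is stable within one scope (one key)."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(scope_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical per-scope keys, hard-coded only for the sake of the example.
billing_key = b"billing-scope-demo-key"
analytics_key = b"analytics-scope-demo-key"

billing_id = scoped_pseudonym(billing_key, "Ada@Example.org")
analytics_id = scoped_pseudonym(analytics_key, "ada@example.org")

# The same person yields different pseudonyms in different scopes, so the
# two datasets cannot be joined on the pseudonym alone.
joinable_without_keys = billing_id == analytics_id

# A party that holds the billing key and the raw identifier can recompute
# the billing pseudonym, which is where correlation risk comes from.
recomputed = scoped_pseudonym(billing_key, "ada@example.org")
correlatable_with_key = recomputed == billing_id
```

Under these assumptions, correlation risk follows the keys: whoever can read more than one scope key can link records across purposes.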
Identity-Canon Implications
- Pseudonymous Identifier and Scoped Identifier map to pseudonymization techniques (pairwise OIDC `sub` values, hashed email, internal IDs).
- Privacy-limited Synonymity Assertion must record privacy classification and scope (S14).
- Additional information (re-identification key) maps to separately secured Evidence Source or Credential with strict Scope access.
- Data subject maps to Natural Person with privacy rights overlay (downstream policy, not canon legal advice).
- Erasure maps to Lifecycle State transitions: revoke assertions, sever bindings, archive with legal exceptions noted downstream.
- Pairwise OIDC, tenant-local subjects, and restricted persona links are technical pseudonymization patterns aligned with GDPR concepts.
- Reinforces visibility of privacy constraints on relationships (P8, S14 checks).
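
One possible shape for an assertion that records privacy classification and scope is sketched below. Everything here is an assumption made for illustration: the `PrivacyClassification` values, the `SynonymityAssertion` fields, and the rules in `validation_errors` are hypothetical and are not the canon's definition of any check.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class PrivacyClassification(Enum):
    # Hypothetical values; the canon may define different ones or none.
    UNRESTRICTED = "unrestricted"
    PSEUDONYMOUS = "pseudonymous"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class SynonymityAssertion:
    left_identifier: str
    right_identifier: str
    privacy_classification: Optional[PrivacyClassification] = None
    scope: Optional[str] = None


def validation_errors(assertion: SynonymityAssertion) -> List[str]:
    """Return the problems that keep the privacy constraints from being visible."""
    errors = []
    if assertion.privacy_classification is None:
        errors.append("missing privacy classification")
    elif (
        assertion.privacy_classification is not PrivacyClassification.UNRESTRICTED
        and not assertion.scope
    ):
        errors.append("privacy-limited assertion needs a scope")
    return errors


unclassified = SynonymityAssertion("acct:1", "acct:2")
unscoped = SynonymityAssertion(
    "acct:1", "acct:2", PrivacyClassification.PSEUDONYMOUS
)
complete = SynonymityAssertion(
    "acct:1", "acct:2", PrivacyClassification.PSEUDONYMOUS, scope="tenant-a"
)
```

Here `validation_errors(complete)` returns an empty list, while the other two assertions each report one problem.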
Terminology Conflicts
- Pseudonym vs. Pseudonymization: a pseudonym is an identifier; pseudonymization is a processing technique.
- Anonymous vs. Pseudonymous: often conflated in product marketing.
- Identity vs. Personal data: not all identifiers are personal data in all contexts.
- Deletion vs. Revocation: erasure may require more than assertion revocation.
- Subject: the GDPR data subject (a natural person) vs. the OIDC/SAML subject (the principal that a token or assertion is about).
Candidate Canonical Mappings
| GDPR concept | Candidate canonical concept |
|---|---|
| Data subject | Natural Person (privacy overlay) |
| Pseudonymization | Processing pattern on Identifier / Profile |
| Pseudonymous identifier | Scoped Identifier / Pseudonymous Identifier |
| Additional information | Separately secured Evidence Source / key |
| Purpose limitation | Scope + policy metadata on processing |
| Cross-system link | Synonymity Assertion (privacy classification required) |
| Erasure request | Lifecycle State + assertion revocation |
| Identifiability risk | Privacy classification on links |
| Controller | Organization actor (downstream legal role) |
| Anonymized dataset | Out of scope for personal identity linking |
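
The mapping of an erasure request to lifecycle transitions can be sketched as below. This is a simplified illustration: `LifecycleState`, `Assertion`, and `apply_erasure` are hypothetical names, and the sketch covers only assertion revocation, not deletion of the underlying personal data or any legal exceptions.

```python
from dataclasses import dataclass
from enum import Enum


class LifecycleState(Enum):
    ACTIVE = "active"
    REVOKED = "revoked"


@dataclass
class Assertion:
    left: str
    right: str
    state: LifecycleState = LifecycleState.ACTIVE


def apply_erasure(account_id, assertions):
    """Revoke every active assertion touching account_id; return those revoked."""
    revoked = []
    for assertion in assertions:
        touches = account_id in (assertion.left, assertion.right)
        if touches and assertion.state is LifecycleState.ACTIVE:
            assertion.state = LifecycleState.REVOKED
            revoked.append(assertion)
    return revoked


assertions = [
    Assertion("acct:1", "acct:2"),
    Assertion("acct:2", "acct:3"),
    Assertion("acct:3", "acct:4"),
]
revoked = apply_erasure("acct:2", assertions)
```

After the call, both assertions touching `acct:2` are revoked and the link between `acct:3` and `acct:4` stays active. Whether `acct:1` and `acct:3` should also lose any indirect link through `acct:2` is a policy decision this sketch leaves open.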
Open Questions
- Should canon include a standard `privacy_classification` enum for assertions?
- How should erasure of one account affect Synonymity Assertions touching other accounts (S02)?
- Does pseudonymization key storage warrant a canonical secured Scope type?
- Should identifiability review be documented as operator workflow in downstream recommendations only?
References
- GDPR Article 4(5), pseudonymization: https://gdpr-info.eu/art-4-gdpr/
- GDPR Recital 26, identifiability: https://gdpr-info.eu/recitals/no-26/
- EDPB Guidelines on identifiability (various): https://edpb.europa.eu/
- ISO/IEC 20889:2018, Privacy enhancing data de-identification terminology and classification of techniques