# GDPR Pseudonymization and Privacy

## Source Type

Regulatory guidance. EU GDPR (Regulation 2016/679) Article 4(5) and Recital 26;
EDPB guidance on identifiability, anonymization, and data subject rights.

## Domain

Privacy regulation, pseudonymization, identifiability, data minimization, and
lawful basis for identity processing.

## Why This Source Matters

GDPR pseudonymization and identifiability concepts affect how canonical models
should represent privacy-limited links, scoped identifiers, and correlation risk.

## Key Concepts

- **Personal data**: any information relating to an identified or identifiable
  natural person.
- **Identifiable person**: a person who can be identified, directly or
  indirectly, by means reasonably likely to be used (Recital 26).
- **Pseudonymization (Art. 4(5))**: processing personal data so that it can no
  longer be attributed to a specific data subject without additional
  information, which is kept separately.
- **Anonymization**: irreversible de-identification; the data is no longer personal.
- **Data subject**: an identified or identifiable natural person.
- **Controller / Processor**: the controller determines the purposes and means
  of processing; the processor processes personal data on the controller's behalf.
- **Purpose limitation**: data is used only for specified, explicit, and legitimate purposes.
- **Data minimization**: data is adequate, relevant, and limited to what is necessary.
- **Right of access / erasure**: data subject rights that affect linked records.
- **Additional information**: the key, held separately, that allows pseudonymous data to be re-identified.
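
As an illustration of the pseudonymization and additional-information concepts above, here is a minimal Python sketch that derives pseudonyms with a keyed hash (HMAC-SHA256). The names `pseudonymize` and `ReidentificationVault` are hypothetical and appear in neither GDPR nor the canon. The sketch also assumes a single process holds the key and lookup table only for brevity; in practice they would sit in a separately secured system.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional


def pseudonymize(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    # Normalize so that trivially different spellings map to the same pseudonym.
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


class ReidentificationVault:
    """Hypothetical holder of the 'additional information' (key and lookup table).

    Assumption: in a real deployment this lives apart from the pseudonymized
    records, behind its own access controls.
    """

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)
        self._lookup: Dict[str, str] = {}

    def issue(self, identifier: str) -> str:
        pseudonym = pseudonymize(identifier, self._key)
        self._lookup[pseudonym] = identifier
        return pseudonym

    def reidentify(self, pseudonym: str) -> Optional[str]:
        # Only a party with vault access can map a pseudonym back to a person.
        return self._lookup.get(pseudonym)


vault = ReidentificationVault()
# The working record carries only the pseudonym, never the email address.
record = {"subject": vault.issue("Alice@example.com"), "plan": "premium"}
original = vault.reidentify(record["subject"])
```

Because the vault can always reverse the mapping, the record stays personal data under this sketch: it is pseudonymized, not anonymized.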

## Relevant Terminology

| Term | Source meaning |
| --- | --- |
| Personal data | Data about an identified or identifiable natural person. |
| Pseudonymization | Reversible de-identification with a separately held key. |
| Anonymization | Irreversible; no longer personal data (if effective). |
| Data subject | Natural person the data relates to. |
| Identifiable | Reasonably linkable to a person. |
| Additional information | Re-identification key stored separately. |
| Controller | Determines purposes and means of processing. |
| Processing | Any operation on personal data. |
| Erasure | Delete personal data (right to be forgotten). |
| Profiling | Automated evaluation of personal aspects. |

## Modeling Assumptions

- **Pseudonymization is not anonymization**; data may remain personal.
- **Separate storage of additional information** is required for pseudonymization.
- **Scope and access control on keys** determine correlation risk.
- **Linking pseudonymous records across purposes** may increase identifiability.
- **Legal basis and purpose** govern whether linking is permissible.
- **Erasure requests** may require breaking links or deleting assertions.
- **Regulatory role (controller)** is organizational, not purely technical.
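
The assumptions about key scope and cross-purpose linking can be sketched in Python as follows. This is one possible arrangement, not a prescribed design: `ScopedPseudonymizer` and the scope names `billing` and `analytics` are hypothetical, and keeping one random key per scope is an assumption made for the example.

```python
import hashlib
import hmac
import secrets
from typing import Dict


class ScopedPseudonymizer:
    """Hypothetical issuer that keeps one secret key per scope (purpose or tenant)."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def _key_for(self, scope: str) -> bytes:
        # A fresh random key per scope: pseudonyms from different scopes
        # cannot be matched by anyone who lacks the keys.
        if scope not in self._keys:
            self._keys[scope] = secrets.token_bytes(32)
        return self._keys[scope]

    def pseudonym(self, scope: str, identifier: str) -> str:
        key = self._key_for(scope)
        return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

    def same_person(self, scope_a: str, pseudonym_a: str,
                    scope_b: str, pseudonym_b: str, candidate: str) -> bool:
        """Check a cross-scope link; requires access to both scope keys."""
        return (
            hmac.compare_digest(self.pseudonym(scope_a, candidate), pseudonym_a)
            and hmac.compare_digest(self.pseudonym(scope_b, candidate), pseudonym_b)
        )


issuer = ScopedPseudonymizer()
billing_id = issuer.pseudonym("billing", "alice@example.com")
analytics_id = issuer.pseudonym("analytics", "alice@example.com")
# The two values differ, so the datasets cannot be joined on them directly.
linked = issuer.same_person("billing", billing_id, "analytics", analytics_id,
                            "alice@example.com")
```

Whoever can call `same_person` holds both keys and can correlate records across purposes, which is why access control on the keys, rather than the hashing itself, bounds the correlation risk.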

## Identity-Canon Implications

- **Pseudonymous Identifier** and **Scoped Identifier** map to pseudonymization
  techniques (pairwise sub, hashed email, internal IDs).
- **Privacy-limited Synonymity Assertion** must record privacy classification
  and scope (S14).
- **Additional information** (re-identification key) maps to separately secured
  **Evidence Source** or **Credential** with strict Scope access.
- **Data subject** maps to **Natural Person** with privacy rights overlay
  (downstream policy, not canon legal advice).
- **Erasure** maps to Lifecycle State transitions: revoke assertions, sever
  bindings, archive with legal exceptions noted downstream.
- Pairwise OIDC, tenant-local subjects, and restricted persona links are
  technical pseudonymization patterns aligned with GDPR concepts.
- Reinforces visibility of privacy constraints on relationships (**P8**, S14 checks).
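
To make the assertion and erasure mappings above concrete, here is a minimal Python sketch of a Synonymity Assertion that records privacy classification and scope, plus an erasure step that revokes affected assertions. Everything named here is hypothetical: the canon has not defined a `PrivacyClassification` enum, its values, the `LifecycleState` values, or the `apply_erasure` helper, and the identifier strings are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PrivacyClassification(Enum):
    # Hypothetical values; the canon has not standardized this enum.
    PUBLIC = "public"
    PSEUDONYMOUS = "pseudonymous"
    RESTRICTED = "restricted"


class LifecycleState(Enum):
    # Hypothetical subset of lifecycle states, enough for the example.
    ACTIVE = "active"
    REVOKED = "revoked"


@dataclass
class SynonymityAssertion:
    left: str
    right: str
    scope: str
    privacy_classification: PrivacyClassification
    state: LifecycleState = LifecycleState.ACTIVE

    def touches(self, identifier: str) -> bool:
        return identifier in (self.left, self.right)


def apply_erasure(assertions: List[SynonymityAssertion], identifier: str) -> int:
    """Revoke every active assertion touching the identifier; return the count."""
    revoked = 0
    for assertion in assertions:
        if assertion.touches(identifier) and assertion.state is LifecycleState.ACTIVE:
            assertion.state = LifecycleState.REVOKED
            revoked += 1
    return revoked


assertions = [
    SynonymityAssertion("acct:crm:123", "acct:billing:9f2", "tenant-a",
                        PrivacyClassification.PSEUDONYMOUS),
    SynonymityAssertion("acct:billing:9f2", "acct:support:77", "tenant-a",
                        PrivacyClassification.RESTRICTED),
    SynonymityAssertion("acct:crm:456", "acct:support:78", "tenant-a",
                        PrivacyClassification.PSEUDONYMOUS),
]
revoked_count = apply_erasure(assertions, "acct:billing:9f2")
```

Revocation only changes lifecycle state. Whether an erasure request also requires deleting the underlying records is a downstream legal decision that this sketch does not settle.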

## Terminology Conflicts

- **Pseudonym vs. Pseudonymization**: a pseudonym is an identifier;
  pseudonymization is a processing technique.
- **Anonymous vs. Pseudonymous**: often conflated in product marketing.
- **Identity vs. Personal data**: not all identifiers are personal data in all
  contexts.
- **Deletion vs. Revocation**: erasure may require more than assertion revocation.
- **Subject**: GDPR data subject vs. OIDC/SAML subject.

## Candidate Canonical Mappings

| GDPR concept | Candidate canonical concept |
| --- | --- |
| Data subject | Natural Person (privacy overlay) |
| Pseudonymization | Processing pattern on Identifier / Profile |
| Pseudonymous identifier | Scoped Identifier / Pseudonymous Identifier |
| Additional information | Separately secured Evidence Source / key |
| Purpose limitation | Scope + policy metadata on processing |
| Cross-system link | Synonymity Assertion (privacy classification required) |
| Erasure request | Lifecycle State + assertion revocation |
| Identifiability risk | Privacy classification on links |
| Controller | Organization actor (downstream legal role) |
| Anonymized dataset | Out of scope for personal identity linking |

## Open Questions

- Should canon include a standard `privacy_classification` enum for assertions?
- How should erasure of one account affect Synonymity Assertions touching other
  accounts (S02)?
- Does pseudonymization key storage warrant a canonical secured Scope type?
- Should identifiability review be documented as operator workflow in downstream
  recommendations only?

## References

- GDPR Article 4(5), pseudonymization: https://gdpr-info.eu/art-4-gdpr/
- GDPR Recital 26, identifiability: https://gdpr-info.eu/recitals/no-26/
- EDPB Guidelines on identifiability (various): https://edpb.europa.eu/
- ISO/IEC 20889:2018, Privacy enhancing data de-identification terminology and classification of techniques