feat(tpsc): Third-Party Services Catalog (CUST-WP-0023)

Introduces TPSC for tracking external service dependencies with GDPR
compliance maturity (CNIL/IAPP CMMI scale), pricing model, ToS, and
data retention information across all repos.

Primary data:
- canon/tpsc/{openai,anthropic,gemini,openrouter}-api.yaml — service definitions
- tpsc.yaml in each repo (llm-connect seeded with 4 services)

State-hub additions:
- Migration j7e8f9a0b1c2: tpsc_catalog + tpsc_snapshots + tpsc_entries
- api/models/tpsc.py, api/schemas/tpsc.py, api/routers/tpsc.py
- /tpsc/catalog/, /tpsc/ingest/, /tpsc/snapshots/, /tpsc/report/gdpr endpoints
- 4 MCP tools: register_service, list_services, ingest_tpsc_tool, get_gdpr_report
- scripts/ingest_tpsc.py + make ingest-tpsc[/-all] targets
- Dashboard: tpsc.md page + docs/tpsc.md

GDPR maturity scale: unknown | non_compliant | initial | developing | defined | managed | certified
Warnings triggered at: unknown, non_compliant, initial

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:15:26 +01:00
parent 4e28cab297
commit 60beb1ff35
14 changed files with 1126 additions and 1 deletions

dashboard/src/docs/tpsc.md Normal file

@@ -0,0 +1,136 @@
---
title: Third-Party Services Catalog (TPSC)
---
# Third-Party Services Catalog (TPSC)
The TPSC tracks external service dependencies (APIs, SaaS, CLIs) across all
registered repos — complementing the SBOM for package dependencies.
---
## Why TPSC?
Package lockfiles capture Python/JS/Rust dependencies but miss the external
HTTP services your code calls. These carry compliance, cost, and privacy
implications that are invisible to standard SBOM tooling.
TPSC provides:
- A registry of which repos use which external services
- GDPR compliance maturity ratings per service
- Pricing model tracking (paid/usage-based costs)
- Data processing region and retention information
- GDPR warnings for services that may be unsuitable for regulated environments
---
## Primary Data Locations
Following ADR-001 (workplans as repo artefacts), TPSC data lives in two places:
| Location | Purpose |
|---|---|
| `<repo>/tpsc.yaml` | Declares which services the repo uses |
| `the-custodian/canon/tpsc/<slug>.yaml` | Canonical service metadata (ToS, GDPR, pricing) |
The state-hub is a collector — it can be rebuilt from scratch by re-ingesting
all `tpsc.yaml` files and re-seeding the catalog from canon files.
---
## tpsc.yaml Format
```yaml
# tpsc.yaml — Third-Party Services Catalog declarations
# Ingest: cd state-hub && make ingest-tpsc REPO=<slug>
services:
  - slug: openai-api        # Must match a slug in canon/tpsc/
    purpose: LLM inference via OpenAI-compatible API
    auth: api_key           # api_key | oauth | cli | none | unknown
  - slug: stripe
    purpose: Payment processing
    auth: api_key
    endpoint: https://api.stripe.com   # Optional override if non-standard
    notes: Only used in production tier
```
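A minimal sketch of how the declarations above could be checked before ingestion. It works on an already-parsed `tpsc.yaml` (a plain dict), so it needs no YAML library. The names `ServiceDeclaration` and `parse_declarations`, the choice of required fields, and the `unknown` default for `auth` are assumptions for illustration, not the behaviour of `scripts/ingest_tpsc.py`.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Allowed values copied from the `auth` comment in the example above.
ALLOWED_AUTH = {"api_key", "oauth", "cli", "none", "unknown"}


@dataclass(frozen=True)
class ServiceDeclaration:
    """One entry under `services:` (hypothetical helper type)."""
    slug: str
    purpose: str
    auth: str = "unknown"
    endpoint: Optional[str] = None
    notes: Optional[str] = None


def parse_declarations(doc: dict[str, Any], known_slugs: set[str]):
    """Return (valid declarations, error messages) for a parsed tpsc.yaml."""
    valid: list[ServiceDeclaration] = []
    errors: list[str] = []
    for index, raw in enumerate(doc.get("services") or []):
        slug = raw.get("slug")
        if not slug:
            errors.append(f"services[{index}]: missing slug")
            continue
        # Each slug must match a service file in canon/tpsc/.
        if slug not in known_slugs:
            errors.append(f"{slug}: no matching file in canon/tpsc/")
            continue
        # Assumption: a missing `auth` falls back to "unknown".
        auth = raw.get("auth", "unknown")
        if auth not in ALLOWED_AUTH:
            errors.append(f"{slug}: unsupported auth {auth!r}")
            continue
        valid.append(
            ServiceDeclaration(
                slug=slug,
                purpose=raw.get("purpose", ""),
                auth=auth,
                endpoint=raw.get("endpoint"),
                notes=raw.get("notes"),
            )
        )
    return valid, errors
```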
---
## Canon Service File Format
```yaml
# canon/tpsc/openai-api.yaml
slug: openai-api
name: OpenAI API
provider: OpenAI, Inc.
category: llm_inference # llm_inference | storage | payments | search | etc.
website_url: https://openai.com
pricing_model: usage_based # free | paid | freemium | usage_based | unknown
gdpr_maturity: developing # See scale below
gdpr_notes: >
  DPA available. SCCs for EU→US transfer. 30-day retention for safety.
dpa_available: true
tos_url: https://openai.com/policies/terms-of-use
privacy_policy_url: https://openai.com/policies/privacy-policy
data_processing_regions:
  - us
data_retention_notes: >
  30 days default; zero-retention available on eligible endpoints.
status: active
```
---
## GDPR Maturity Scale
Based on the **CNIL / IAPP CMMI Privacy Maturity Model**, adapted for
third-party service assessment:
| Level | Name | Description | Dashboard |
|---|---|---|---|
| 0 | `unknown` | No information about GDPR stance | 🔴 Warning |
| 1 | `non_compliant` | Known GDPR issues, no remediation | 🔴 Warning |
| 2 | `initial` | Basic privacy policy only, ad hoc approach | 🟠 Warning |
| 3 | `developing` | DPA available, some controls, SCCs provided | 🟡 |
| 4 | `defined` | Formal DPA, SCCs documented, clear retention policy | 🟢 |
| 5 | `managed` | Independently audited, metrics tracked | 🟢 |
| 6 | `certified` | ISO 27701 / SOC 2 privacy certified | 🟢 |
Services at levels 0 to 2 (**Warning**) may be unsuitable for GDPR-regulated or
corporate environments. Routine processing of personal data with an API
provider needs at least `developing`.
Reference: [CNIL GDPR maturity model](https://iapp.org/news/b/cnil-publishes-data-protection-management-maturity-model), [IAPP Privacy Maturity Model](https://iapp.org/news/a/achieving-privacy-excellence-understanding-the-privacy-maturity-model)
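
The scale is ordered, so the warning rule reduces to a threshold comparison. The sketch below models it with an `IntEnum`. The names `GdprMaturity`, `from_name`, and `triggers_warning` are hypothetical, and treating an unrecognised level name as `unknown` is an assumption, not something the state-hub is documented to do.

```python
from enum import IntEnum


class GdprMaturity(IntEnum):
    """Levels 0 to 6 from the table above."""
    UNKNOWN = 0
    NON_COMPLIANT = 1
    INITIAL = 2
    DEVELOPING = 3
    DEFINED = 4
    MANAGED = 5
    CERTIFIED = 6

    @classmethod
    def from_name(cls, name: str) -> "GdprMaturity":
        # Assumption: names that are not on the scale count as UNKNOWN.
        return cls.__members__.get(name.strip().upper(), cls.UNKNOWN)


# Highest level that still produces a dashboard warning.
WARNING_THRESHOLD = GdprMaturity.INITIAL


def triggers_warning(name: str) -> bool:
    """True for unknown, non_compliant and initial."""
    return GdprMaturity.from_name(name) <= WARNING_THRESHOLD
```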
---
## Adding a New Service
1. Create `the-custodian/canon/tpsc/<slug>.yaml` following the format above
2. Seed it into the state-hub: run `cd state-hub && make api`, then POST the service definition to `/tpsc/catalog/`
   (or use the MCP tool: `register_service(slug=..., ...)`)
3. Add it to your repo's `tpsc.yaml`
4. Ingest: `make ingest-tpsc REPO=<slug>`
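
Step 2 could be scripted roughly as follows, using only the standard library. The base URL `http://localhost:8000` is a placeholder (this page does not say where `make api` serves), and the payload simply reuses a few field names from the canon file format; the actual request schema lives in `api/schemas/tpsc.py` and may differ. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL; adjust to wherever `make api` serves the state-hub.
BASE_URL = "http://localhost:8000"


def build_catalog_request(service: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to /tpsc/catalog/."""
    body = json.dumps(service).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tpsc/catalog/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Values taken from the canon/tpsc/openai-api.yaml example above.
service = {
    "slug": "openai-api",
    "name": "OpenAI API",
    "category": "llm_inference",
    "pricing_model": "usage_based",
    "gdpr_maturity": "developing",
}
request = build_catalog_request(service)

# To actually send it once the API is running:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```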
---
## MCP Tools
| Tool | Purpose |
|---|---|
| `register_service(slug, ...)` | Add/update a service in the catalog |
| `list_services(gdpr_maturity?, category?, pricing_model?)` | Browse catalog |
| `ingest_tpsc_tool(repo_slug)` | Parse tpsc.yaml and ingest snapshot |
| `get_gdpr_report()` | GDPR warning summary across all repos |
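
As a rough illustration of what a cross-repo GDPR warning summary involves, the sketch below joins catalog maturity ratings with per-repo service lists. It is not the implementation of `get_gdpr_report()`: the function name `gdpr_warnings`, the input shapes, and the rule that a slug missing from the catalog counts as `unknown` are all assumptions.

```python
from collections import defaultdict

# Levels that trigger a warning, per the maturity scale.
WARNING_LEVELS = {"unknown", "non_compliant", "initial"}


def gdpr_warnings(
    catalog: dict[str, str],
    repo_services: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Map each repo to the sorted service slugs that trigger a warning.

    catalog:       service slug -> gdpr_maturity
    repo_services: repo slug -> service slugs declared in its tpsc.yaml
    """
    report: dict[str, list[str]] = defaultdict(list)
    for repo, slugs in repo_services.items():
        for slug in slugs:
            # Assumption: a slug absent from the catalog is rated "unknown".
            if catalog.get(slug, "unknown") in WARNING_LEVELS:
                report[repo].append(slug)
    # Repos without warnings are omitted from the result.
    return {repo: sorted(slugs) for repo, slugs in report.items()}
```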
---
## Makefile Targets
```bash
make ingest-tpsc REPO=llm-connect # Ingest single repo
make ingest-tpsc-all # Ingest all repos
make ingest-tpsc REPO=llm-connect DRY_RUN=1 # Preview only
```