feat(sbom): scan mode, domain grouping dashboard, SBOM convention doc

- ingest_sbom.py: add --scan flag (recursive lockfile discovery) +
  --lockfile repeatable for explicit multi-file ingestion; skip
  .venv/node_modules/.git/dist/etc; Makefile gains SCAN= and REPO_PATH= vars
- sbom.md: add /domains/ fetch; domain-level summary table; per-repo
  accordion with details/summary; domain filter on package table;
  dual-licence false-positive note; +1 KPI card (Domains Covered)
- canon/standards/sbom-convention_v0.1.md: authoritative lockfile table,
  ingest workflow (single/scan/explicit), snapshot semantics,
  direct-vs-transitive caveats, licence governance + copyleft escalation,
  update cadence, multi-repo domain pattern, planned enhancements

First ingest: the-custodian, 420 pkgs (88 python + 332 node), 13 licence
groups, 1 copyleft flag (jszip dual-licensed MIT OR GPL-3.0-or-later)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
253
canon/standards/sbom-convention_v0.1.md
Normal file
@@ -0,0 +1,253 @@
---
id: SBOM-CONV-001
type: standard
title: "SBOM Convention v0.1 — Dependency Tracking & Licence Governance"
domain: custodian
status: active
version: "0.1"
created: "2026-03-01"
updated: "2026-03-01"
---

# SBOM Convention v0.1 — Dependency Tracking & Licence Governance

## Purpose

This convention defines how every Custodian-registered project captures,
stores, and reports its software supply-chain inventory to the State Hub SBOM
store. It establishes:

- Which lockfiles are authoritative per ecosystem
- How to run SBOM ingestion (single-ecosystem and multi-ecosystem repos)
- How to keep the data current
- Licence governance rules and escalation thresholds

The State Hub SBOM store aggregates across all registered repos. The
dashboard (`/sbom`) provides domain-level and repo-level drill-down.

---

## 1. Authoritative Lockfiles per Ecosystem

| Ecosystem | Authoritative file | Notes |
|-----------|-------------------|-------|
| Python | `uv.lock` | Preferred. `requirements.txt` accepted as fallback |
| Node / npm | `package-lock.json` | Preferred. `yarn.lock` accepted as fallback |
| Rust | `Cargo.lock` | Auto-detected |
| Go | `go.sum` | *Not yet parsed (planned)* |
| Java / JVM | `gradle.lockfile` / `pom.xml` | *Not yet parsed (planned)* |

**Principle:** commit lockfiles to the repo. Lockfiles are the SBOM source
of truth; do not generate them at ingest time.

---

## 2. Repo Registration Prerequisite

Before SBOM data can be reported, the repo must be registered in the State Hub:

```bash
cd ~/the-custodian/state-hub
make add-repo DOMAIN=<domain-slug> SLUG=<repo-slug> NAME="<Display Name>" PATH=/absolute/path/to/repo
```

Check registered repos:

```bash
make list-repos
# or
curl -s http://127.0.0.1:8000/repos/ | python3 -m json.tool
```

---

## 3. SBOM Ingestion

### 3.1 Standard ingest (single lockfile at repo root)

```bash
cd ~/the-custodian/state-hub
make ingest-sbom REPO=<slug> REPO_PATH=/path/to/repo
```

The script auto-detects the first recognised lockfile at `REPO_PATH`.
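
The auto-detection step can be pictured as a priority-ordered lookup at the repo root. The following is a minimal sketch and not the actual `ingest_sbom.py` code: the `LOCKFILES` ordering, the ecosystem labels, and the `detect_lockfile` helper are assumptions for illustration, derived from the table in section 1.

```python
from pathlib import Path
from typing import Optional

# Hypothetical priority order, derived from the table in section 1.
# The real script may check files in a different order.
LOCKFILES = [
    ("uv.lock", "python"),
    ("requirements.txt", "python"),
    ("package-lock.json", "node"),
    ("yarn.lock", "node"),
    ("Cargo.lock", "rust"),
]


def detect_lockfile(repo_path: str) -> Optional[tuple[Path, str]]:
    """Return (path, ecosystem) for the first recognised lockfile at the repo root."""
    root = Path(repo_path)
    for filename, ecosystem in LOCKFILES:
        candidate = root / filename
        if candidate.is_file():
            return candidate, ecosystem
    return None  # nothing recognised at the root; scan mode may still find files
```

Because only the root is inspected and only the first match is returned, a repo with lockfiles in subdirectories, or with more than one ecosystem, is better served by the scan mode in 3.2.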

### 3.2 Multi-ecosystem repos (recommended for complex repos)

Use `SCAN=1` to walk the repo tree and combine **all** lockfiles into a single
snapshot. Non-dep directories (`.venv`, `node_modules`, `.git`, `dist`, etc.)
are automatically skipped.

```bash
make ingest-sbom REPO=the-custodian SCAN=1 REPO_PATH=/home/worsch/the-custodian
```

This is the correct approach for repos that contain both a backend and a
frontend (e.g., a Python API + Node/Observable dashboard).
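
The recursive discovery described above can be sketched with `os.walk`, pruning skipped directories before descending into them. This is an illustrative sketch under stated assumptions, not the script's implementation: `SKIP_DIRS` lists only the directories named in this section (the full "etc." list is not reproduced), and `find_lockfiles` is a hypothetical helper name.

```python
import os
from pathlib import Path

# Only the directories named in this section; the script's full skip list is unknown here.
SKIP_DIRS = {".venv", "node_modules", ".git", "dist"}
# Lockfiles listed as currently parsed in section 1.
LOCKFILE_NAMES = {"uv.lock", "requirements.txt", "package-lock.json", "yarn.lock", "Cargo.lock"}


def find_lockfiles(repo_path: str) -> list[Path]:
    """Walk the repo tree and return every recognised lockfile, sorted by path."""
    found = []
    for dirpath, dirnames, filenames in os.walk(repo_path):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name in LOCKFILE_NAMES:
                found.append(Path(dirpath) / name)
    return sorted(found)
```

Pruning `dirnames` in place matters for speed as well as correctness: a `node_modules` tree can contain thousands of nested `package-lock.json` files that belong to dependencies, not to the repo itself.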

### 3.3 Explicit lockfile path

```bash
make ingest-sbom REPO=<slug> LOCKFILE=/path/to/specific/uv.lock
```

Multiple lockfiles can be passed by calling the script directly with repeated
`--lockfile` flags:

```bash
cd ~/the-custodian/state-hub
.venv/bin/python scripts/ingest_sbom.py \
  --repo <slug> \
  --lockfile /path/to/uv.lock \
  --lockfile /path/to/package-lock.json
```

### 3.4 Dry run (inspect without submitting)

```bash
cd ~/the-custodian/state-hub
# `make ingest-sbom REPO=<slug> SCAN=1 REPO_PATH=/path/to/repo` submits the snapshot.
# To inspect the result without submitting, run the script directly with --dry-run:
.venv/bin/python scripts/ingest_sbom.py --repo <slug> --scan --repo-path /path/to/repo --dry-run
```

---

## 4. Snapshot Semantics

Each `POST /sbom/ingest/` call **replaces** the entire previous snapshot for
that repo. This means:

- Each repo has at most one snapshot (the most recent ingest)
- Re-running ingest is idempotent: ingesting unchanged lockfiles yields the
  same snapshot, and ingesting after a dependency update simply refreshes
  the data
- Historical snapshots are **not** retained (v0.1 scope; versioned history is
  a planned extension)

The `last_sbom_at` timestamp on the `managed_repo` record indicates when the
last ingest ran.
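
The replace-on-ingest behaviour can be illustrated with a toy in-memory store. This is a sketch of the semantics only: `SnapshotStore` and its attributes are hypothetical stand-ins, not the State Hub's actual storage layer or API handler.

```python
from datetime import datetime, timezone


class SnapshotStore:
    """Toy in-memory stand-in for the State Hub SBOM store (hypothetical)."""

    def __init__(self) -> None:
        self.packages: dict[str, list[dict]] = {}
        self.last_sbom_at: dict[str, datetime] = {}

    def ingest(self, repo: str, packages: list[dict]) -> None:
        # Replace, never merge: the previous snapshot for this repo is discarded.
        self.packages[repo] = list(packages)
        self.last_sbom_at[repo] = datetime.now(timezone.utc)


store = SnapshotStore()
store.ingest("the-custodian", [{"name": "jszip"}, {"name": "left-pad"}])
store.ingest("the-custodian", [{"name": "jszip"}])
# Only the second snapshot remains; "left-pad" is gone because nothing is merged.
```

A consequence of this design is that a partial ingest (for example, one lockfile passed explicitly) also replaces the whole snapshot, so packages from lockfiles left out of that run disappear until the next full ingest.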

---

## 5. Direct vs Transitive Dependencies

| Source | `is_direct` | Notes |
|--------|-------------|-------|
| `package-lock.json` | Accurate (npm `indirect` flag used) | Dev packages also detected via `dev` flag |
| `yarn.lock` | `false` for all | `yarn.lock` does not distinguish direct from transitive; treat output as transitive |
| `uv.lock` | `false` for all | `uv.lock` does not distinguish direct from transitive |
| `requirements.txt` | `true` for all | Every line is treated as a direct dep |
| `Cargo.lock` | `false` for all | Workspace member packages not yet distinguished |

**Governance implication:** `is_direct=true` entries receive stricter licence
scrutiny. Copyleft risk is reported specifically for `is_direct=true AND is_dev=false`.

---

## 6. Licence Governance

### 6.1 Copyleft detection

The following SPDX identifier substrings trigger a copyleft flag:
`GPL`, `AGPL`, `LGPL`, `EUPL`, `CDDL`, `MPL`

A copyleft flag on a **direct prod dependency** (`is_direct=true`, `is_dev=false`)
increments the `licence_risk_count` in the State Hub summary and triggers a
warning on the SBOM dashboard.
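
The substring rule and the risk count can be sketched as follows. This is a minimal illustration, assuming package records are plain dicts carrying the `license_spdx`, `is_direct`, and `is_dev` fields named in this document, and that matching is case-sensitive; the function names are hypothetical and the State Hub's real implementation may differ.

```python
COPYLEFT_MARKERS = ("GPL", "AGPL", "LGPL", "EUPL", "CDDL", "MPL")


def is_copyleft(license_spdx):
    """True if any copyleft marker occurs as a substring of the SPDX string."""
    if not license_spdx:
        return False  # unknown licences do not trigger the flag (section 6.3)
    return any(marker in license_spdx for marker in COPYLEFT_MARKERS)


def licence_risk_count(packages):
    """Count copyleft packages that are direct, non-dev dependencies."""
    return sum(
        1
        for pkg in packages
        if pkg["is_direct"] and not pkg["is_dev"] and is_copyleft(pkg.get("license_spdx"))
    )
```

Because the check is a plain substring match, a transitive or dev-only copyleft package is still flagged by `is_copyleft` but does not contribute to `licence_risk_count`.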

### 6.2 Dual-licensed packages

Packages with SPDX expressions like `(MIT OR GPL-3.0-or-later)` are flagged
**conservatively**: the presence of a copyleft identifier anywhere in the SPDX
string is sufficient to trigger the flag, regardless of the OR clause.

**Action required:** review flagged packages. If the non-copyleft licence is
the one relied on in practice, document this decision in a `contrib/` BR or FR
artifact and note it in the repo's CLAUDE.md.
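
When reviewing a flagged dual-licensed package, it can help to list the alternatives in the expression and see which of them are not copyleft. The sketch below is a reviewer aid only, not part of the ingest script: the helper names are hypothetical, and it handles only flat `A OR B` expressions (no nesting, `AND`, or `WITH`).

```python
COPYLEFT_MARKERS = ("GPL", "AGPL", "LGPL", "EUPL", "CDDL", "MPL")


def or_alternatives(expression: str) -> list[str]:
    """Split a flat SPDX expression such as '(MIT OR GPL-3.0-or-later)' on OR."""
    inner = expression.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    return [part.strip() for part in inner.split(" OR ")]


def permissive_alternatives(expression: str) -> list[str]:
    """Return the alternatives that contain no copyleft marker."""
    return [
        alt
        for alt in or_alternatives(expression)
        if not any(marker in alt for marker in COPYLEFT_MARKERS)
    ]
```

For the `jszip` case, `permissive_alternatives("(MIT OR GPL-3.0-or-later)")` yields the MIT option, which is the licence a reviewer would record as the one relied on. The flag itself stays in place; this helper only supports the documented review.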

### 6.3 Unknown licences

Packages with `license_spdx = null` come from lockfiles that carry no licence
metadata (`uv.lock`, `yarn.lock`, and `Cargo.lock` do not embed licence info).
They are listed in the dashboard but do not trigger risk flags.

To resolve unknowns, consult the package's registry page (PyPI, npm, crates.io),
then either accept the unknown status or enhance the ingest script.

### 6.4 Escalation

Per the Custodian Constitution, a copyleft direct prod dep **must be reviewed**
before the next production deployment. Record the decision via:

```
register_contribution(type="br", title="Licence review: <package>", ...)
```

or directly in `contrib/bug-reports/` using the BR template.

---

## 7. Keeping Data Current

### 7.1 When to re-run ingest

Re-run `make ingest-sbom` after any of the following:
- `uv add` / `uv remove` (Python)
- `npm install` / `npm update` (Node)
- `cargo add` / `cargo update` (Rust)
- Any lockfile regeneration
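
One way to notice that a lockfile has been regenerated is to compare a content hash of each lockfile against the hash recorded at the last ingest. This convention does not prescribe such a check; the sketch below is an assumption-laden illustration, and both helper names and the `recorded` mapping are hypothetical.

```python
import hashlib
from pathlib import Path


def lockfile_digest(path: Path) -> str:
    """SHA-256 hex digest of the lockfile's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_reingest(lockfiles: list[Path], recorded: dict[str, str]) -> bool:
    """True if any lockfile is new, changed, or removed relative to `recorded`.

    `recorded` maps a lockfile path (as str) to the digest seen at the last ingest.
    """
    current = {str(p): lockfile_digest(p) for p in lockfiles}
    return current != recorded
```

Comparing the whole mapping, not individual digests, also catches lockfiles that were added or deleted since the last ingest.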

### 7.2 Recommended workflow integration

Add to your repo's CLAUDE.md (or developer runbook):

> After updating dependencies, run:
> ```bash
> cd ~/the-custodian/state-hub
> make ingest-sbom REPO=<your-slug> SCAN=1 REPO_PATH=<your-repo-path>
> ```

### 7.3 Verification

After ingest:
```bash
curl -s http://127.0.0.1:8000/sbom/<your-slug>/ | python3 -m json.tool | head -30
curl -s http://127.0.0.1:8000/sbom/report/licences/ | python3 -m json.tool
```

Or visit the State Hub dashboard → SBOM → By Repo to see the updated snapshot.

---

## 8. Multi-Repo Domains

When a domain has multiple repos (e.g., `api` + `frontend` + `infra`), each
repo should be registered separately and ingested separately:

```bash
make ingest-sbom REPO=myapp-api SCAN=1 REPO_PATH=/home/worsch/myapp
make ingest-sbom REPO=myapp-frontend SCAN=1 REPO_PATH=/home/worsch/myapp-frontend
```

The SBOM dashboard aggregates across all repos within a domain in the
**By Domain** table.
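
The per-domain roll-up can be pictured as a simple group-and-sum over per-repo summaries. This is a sketch of the idea only: the input record shape (`domain`, `packages`) and the `summarise_by_domain` helper are assumptions, with `licence_risk_count` being the one field name taken from this document.

```python
from collections import defaultdict


def summarise_by_domain(repos: list[dict]) -> dict[str, dict[str, int]]:
    """Roll per-repo counts up to one summary row per domain."""
    summary: dict[str, dict[str, int]] = defaultdict(
        lambda: {"repos": 0, "packages": 0, "licence_risk_count": 0}
    )
    for repo in repos:
        row = summary[repo["domain"]]
        row["repos"] += 1
        row["packages"] += repo["packages"]
        row["licence_risk_count"] += repo["licence_risk_count"]
    return dict(summary)
```

Note that summing package counts across repos counts a package once per repo that uses it; deduplicating shared dependencies within a domain would need the package lists, not just the counts.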

---

## 9. Current Registered Repos & Status

| Repo | Domain | Ecosystems | Last Ingest |
|------|--------|------------|-------------|
| `the-custodian` | custodian | python, node | 2026-03-01 |

*(This table is informational. The live view is at the SBOM dashboard.)*

---

## 10. Planned Enhancements

- **Go / Java parsers**: add to `ingest_sbom.py`
- **Versioned snapshots**: retain history per repo for trend analysis
- **Licence override file**: allow repos to document known-acceptable
  copyleft exceptions (`.sbom-overrides.yaml`)
- **CI integration**: GitHub Actions step to run ingest on lockfile change
- **Direct-dep detection for uv.lock**: parse the `dependencies` array in the
  `[project]` table of `pyproject.toml` to mark direct deps accurately