---
id: SBOM-CONV-001
type: standard
title: "SBOM Convention v0.1 — Dependency Tracking & Licence Governance"
domain: custodian
status: active
version: "0.1"
created: "2026-03-01"
updated: "2026-03-12"
---

# SBOM Convention v0.1 — Dependency Tracking & Licence Governance

## Purpose

This convention defines how every Custodian-registered project captures, stores, and reports its software supply-chain inventory to the State Hub SBOM store.

It establishes:

- Which lockfiles are authoritative per ecosystem
- How to run SBOM ingestion (single-ecosystem and multi-ecosystem repos)
- How to keep the data current
- Licence governance rules and escalation thresholds

The State Hub SBOM store aggregates across all registered repos. The dashboard (`/sbom`) provides domain-level and repo-level drill-down.

---

## 1. Capture Mechanisms

`ingest_sbom.py` runs all four mechanisms in a single scan when given `--repo-path`. No flags needed — comprehensive detection is the default.

| Mechanism | File(s) | Ecosystem | Detection scope |
|-----------|---------|-----------|-----------------|
| **Package manager lockfiles** | `uv.lock`, `requirements.txt`, `package-lock.json`, `yarn.lock`, `Cargo.lock` | `python`, `node`, `rust` | Anywhere in tree |
| **Terraform provider lock** | `.terraform.lock.hcl` | `terraform` | Anywhere in tree |
| **Ansible Galaxy manifest** | `ansible/requirements.yml` or `.yaml` | `ansible` | Under directories named `ansible/` |
| **Tool manifest** | `sbom-tools.yaml` (repo root) | `tool`, `ansible`, `terraform`, etc. | Repo root only |

**Go / Java parsers** (`go.sum`, `pom.xml`, `gradle.lockfile`) are *not yet implemented* — planned for a future workplan.

**Principle:** commit lockfiles and `sbom-tools.yaml` to the repo. These are the SBOM source of truth; do not generate them at ingest time.

---

## 2. Repo Registration Prerequisite

Before SBOM data can be reported, the repo must be registered in the State Hub:

```bash
cd ~/the-custodian/state-hub
make add-repo DOMAIN= SLUG= NAME="" PATH=/absolute/path/to/repo
```

Check registered repos:

```bash
make list-repos
# or
curl -s http://127.0.0.1:8000/repos/ | python3 -m json.tool
```

---

## 3. SBOM Ingestion

### 3.1 Standard ingest (all mechanisms, recommended)

```bash
cd ~/the-custodian/state-hub
make ingest-sbom REPO= REPO_PATH=/path/to/repo
```

`ingest_sbom.py` automatically runs all four mechanisms in one scan — lockfiles, Terraform provider locks, Ansible Galaxy manifests, and `sbom-tools.yaml`. All results are merged into a single snapshot. Non-dep directories (`.venv`, `node_modules`, `.git`, `dist`, etc.) are automatically skipped.

### 3.2 Repos with system-level tools: capture first, then ingest

For repos that use system-level tools not tracked by any lockfile (Terraform binary, Helm, kubectl, k3s, goss, etc.):

```bash
# Step 1: generate sbom-tools.yaml via agent
make capture-tools REPO= REPO_PATH=/path/to/repo

# Step 2: review sbom-tools.yaml — correct any confidence: low entries

# Step 3: commit sbom-tools.yaml
git -C /path/to/repo add sbom-tools.yaml && git -C /path/to/repo commit -m "chore(sbom): add tool manifest"

# Step 4: ingest everything
make ingest-sbom REPO= REPO_PATH=/path/to/repo
```

### 3.3 Explicit lockfile path

```bash
make ingest-sbom REPO= LOCKFILE=/path/to/specific/uv.lock
```

Multiple lockfiles can be passed by calling the script directly with repeated `--lockfile` flags:

```bash
uv run python scripts/ingest_sbom.py \
  --repo \
  --lockfile /path/to/uv.lock \
  --lockfile /path/to/package-lock.json
```

### 3.4 Dry run (inspect without submitting)

```bash
make ingest-sbom REPO= REPO_PATH=/path/to/repo DRY_RUN=1
```

### 3.5 sbom-tools.yaml: the tool manifest

Create `sbom-tools.yaml` at the repo root for any system-level tools not covered by lockfiles.
Schema:

```yaml
# sbom-tools.yaml
tools:
  - name: terraform
    version: "1.9.5"  # confidence: medium
    ecosystem: terraform
    license_spdx: BUSL-1.1
    is_direct: true
    is_dev: false

  - name: helm
    version: null  # confidence: low (no version pin found)
    ecosystem: tool
    license_spdx: Apache-2.0
    is_direct: true
    is_dev: false
```

**Valid ecosystem values:** `python`, `node`, `rust`, `go`, `java`, `terraform`, `ansible`, `tool`, `other`

Annotate each version with a `# confidence: high/medium/low` comment. Entries with `confidence: low` need human verification before committing.

The `make capture-tools` command generates this file automatically using the SBOM capture agent prompt (`state-hub/prompts/sbom-capture-agent.md`).

---

## 4. Snapshot Semantics

Each `POST /sbom/ingest/` call **replaces** the entire previous snapshot for that repo. This means:

- There is always exactly one snapshot per repo (the most recent ingest)
- Re-running ingest after a dependency update is idempotent — it simply refreshes the data
- Historical snapshots are **not** retained (v0.1 scope; versioned history is a planned extension)

The `last_sbom_at` timestamp on the managed_repo record indicates when the last ingest ran.

---

## 5. Direct vs Transitive Dependencies

| Source | `is_direct` | Notes |
|--------|-------------|-------|
| `package-lock.json` | Accurate — npm `indirect` flag used | Dev packages also detected via `dev` flag |
| `yarn.lock` | `false` for all (yarn.lock doesn't distinguish) | Treat output as transitive |
| `uv.lock` | `false` for all (uv.lock doesn't distinguish direct from transitive) | |
| `requirements.txt` | `true` for all (every line is a direct dep) | |
| `Cargo.lock` | `false` for all (workspace member packages not yet distinguished) | |

**Governance implication:** `is_direct=true` entries receive stricter licence scrutiny. Copyleft risk is reported specifically for `is_direct=true AND is_dev=false`.

---

## 6. Licence Governance

### 6.1 Copyleft detection

The following SPDX identifier substrings trigger a copyleft flag:

`GPL`, `AGPL`, `LGPL`, `EUPL`, `CDDL`, `MPL`

A copyleft flag on a **direct prod dependency** (`is_direct=true`, `is_dev=false`) increments the `licence_risk_count` in the State Hub summary and triggers a warning on the SBOM dashboard.

### 6.2 Dual-licensed packages

Packages with SPDX expressions like `(MIT OR GPL-3.0-or-later)` are flagged **conservatively** — the presence of a copyleft identifier in the SPDX string is sufficient to trigger the flag, regardless of the OR clause.

**Action required:** review flagged packages. If the non-copyleft licence is used in practice, document this decision in a `contrib/` BR or FR artifact and note it in the repo's CLAUDE.md.

### 6.3 Unknown licences

Packages with `license_spdx = null` are those whose lockfile did not contain licence metadata (`uv.lock`, `yarn.lock`, `Cargo.lock` do not embed licence info). These are listed in the dashboard but do not trigger risk flags.

To resolve unknowns, consult the package's registry page (PyPI, npm, crates.io) and either accept the unknown status or enhance the ingest script.

### 6.4 Escalation

Per the Custodian Constitution, a copyleft direct prod dep **must be reviewed** before the next production deployment. Record the decision via:

```
register_contribution(type="br", title="Licence review: ", ...)
```

or directly in `contrib/bug-reports/` using the BR template.

---

## 7. Keeping Data Current

### 7.1 When to re-run ingest

Re-run `make ingest-sbom` after any of the following:

- `uv add` / `uv remove` (Python)
- `npm install` / `npm update` (Node)
- `cargo add` / `cargo update` (Rust)
- Any lockfile regeneration

### 7.2 Recommended workflow integration

Add to your repo's CLAUDE.md (or developer runbook):

> After updating dependencies, run:
> ```bash
> cd ~/the-custodian/state-hub
> make ingest-sbom REPO= SCAN=1 REPO_PATH=
> ```

### 7.3 Verification

After ingest:

```bash
curl -s http://127.0.0.1:8000/sbom// | python3 -m json.tool | head -30
curl -s http://127.0.0.1:8000/sbom/report/licences/ | python3 -m json.tool
```

Or visit the State Hub dashboard → SBOM → By Repo to see the updated snapshot.

---

## 8. Multi-Repo Domains

When a domain has multiple repos (e.g., `api` + `frontend` + `infra`), each repo should be registered separately and ingested separately:

```bash
make ingest-sbom REPO=myapp-api SCAN=1 REPO_PATH=/home/worsch/myapp
make ingest-sbom REPO=myapp-frontend SCAN=1 REPO_PATH=/home/worsch/myapp-frontend
```

The SBOM dashboard aggregates across all repos within a domain in the **By Domain** table.

---

## 9. Current Registered Repos & Status

| Repo | Domain | Ecosystems | Last Ingest |
|------|--------|------------|-------------|
| `the-custodian` | custodian | python, node | 2026-03-01 |
| `railiance-bootstrap` | railiance | — (Ansible + shell, no lockfile) | — |
| `railiance-hosts` | railiance | terraform (2 providers) | 2026-03-01 |

*(This table is informational. The live view is at the SBOM dashboard.)*

---

## 10. Planned Enhancements

- **Go / Java parsers** — add `go.sum`, `pom.xml`, `gradle.lockfile` support to `ingest_sbom.py`
- **Versioned snapshots** — retain history per repo for trend analysis
- **Licence override file** — allow repos to document known-acceptable copyleft exceptions (`.sbom-overrides.yaml`)
- **CI integration** — GitHub Actions step to run ingest on lockfile change
- **Direct-dep detection for uv.lock** — parse the `dependencies` array in the `[project]` table of `pyproject.toml` to mark direct deps accurately
- **Galaxy API licence lookup** — resolve `license_spdx` for Ansible collections via the Galaxy API at ingest time
- **Tool version pinning guidance** — tooling to detect `confidence: low` entries across all registered repos and flag them for resolution
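
The last enhancement in the list (detecting `confidence: low` entries) could be prototyped along the following lines. This is only a sketch under stated assumptions: the function names `low_confidence_tools` and `scan_repos` are hypothetical and not part of `ingest_sbom.py`, and the scan assumes the annotation format from section 3.5, where each entry starts with `- name:` and carries its `# confidence:` comment on a later line of the same entry. It reads the manifest as plain text, because YAML parsers normally discard comments.

```python
import re
from pathlib import Path

# Hypothetical helpers for illustration; not part of ingest_sbom.py.
# Assumed format (section 3.5): each entry begins with "- name: <tool>" and
# carries a "# confidence: high|medium|low" comment on one of its lines.
NAME_RE = re.compile(r"^\s*-\s*name:\s*(?P<name>[^#\n]+)")
CONFIDENCE_RE = re.compile(r"#\s*confidence:\s*(?P<level>high|medium|low)\b")


def low_confidence_tools(manifest_text: str) -> list[str]:
    """Return the names of tools annotated with '# confidence: low'."""
    flagged: list[str] = []
    current = None  # name of the entry currently being read
    for line in manifest_text.splitlines():
        name_match = NAME_RE.match(line)
        if name_match:
            current = name_match.group("name").strip().strip("\"'")
        conf_match = CONFIDENCE_RE.search(line)
        if conf_match and conf_match.group("level") == "low" and current is not None:
            if current not in flagged:
                flagged.append(current)
    return flagged


def scan_repos(repo_paths: list[str]) -> dict[str, list[str]]:
    """Map each repo path to its low-confidence tools.

    Repos without a root-level sbom-tools.yaml, or without any
    low-confidence entries, are left out of the result.
    """
    report: dict[str, list[str]] = {}
    for repo in repo_paths:
        manifest = Path(repo) / "sbom-tools.yaml"
        if manifest.is_file():
            flagged = low_confidence_tools(manifest.read_text(encoding="utf-8"))
            if flagged:
                report[repo] = flagged
    return report
```

Run against the example manifest in section 3.5, `low_confidence_tools` would flag only `helm`, since its version line carries `# confidence: low`. How the list of registered repo paths is obtained (for instance from the State Hub `/repos/` endpoint) is left open here.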