| id | type | title | domain | status | version | created | updated |
|---|---|---|---|---|---|---|---|
| SBOM-CONV-001 | standard | SBOM Convention v0.1 — Dependency Tracking & Licence Governance | custodian | active | 0.1 | 2026-03-01 | 2026-03-12 |
SBOM Convention v0.1 — Dependency Tracking & Licence Governance
Purpose
This convention defines how every Custodian-registered project captures, stores, and reports its software supply-chain inventory to the State Hub SBOM store. It establishes:
- Which lockfiles are authoritative per ecosystem
- How to run SBOM ingestion (single-ecosystem and multi-ecosystem repos)
- How to keep the data current
- Licence governance rules and escalation thresholds
The State Hub SBOM store aggregates across all registered repos. The
dashboard (/sbom) provides domain-level and repo-level drill-down.
1. Capture Mechanisms
ingest_sbom.py runs all four mechanisms in a single scan when given --repo-path.
No flags needed — comprehensive detection is the default.
| Mechanism | File(s) | Ecosystem | Detection scope |
|---|---|---|---|
| Package manager lockfiles | uv.lock, requirements.txt, package-lock.json, yarn.lock, Cargo.lock | python, node, rust | Anywhere in tree |
| Terraform provider lock | .terraform.lock.hcl | terraform | Anywhere in tree |
| Ansible Galaxy manifest | ansible/requirements.yml or .yaml | ansible | Under directories named ansible/ |
| Tool manifest | sbom-tools.yaml (repo root) | tool, ansible, terraform, etc. | Repo root only |
Go / Java parsers (go.sum, pom.xml, gradle.lockfile) are not yet
implemented — planned for a future workplan.
Principle: commit lockfiles and sbom-tools.yaml to the repo. These are
the SBOM source of truth; do not generate them at ingest time.
2. Repo Registration Prerequisite
Before SBOM data can be reported, the repo must be registered in the State Hub:
cd ~/the-custodian/state-hub
make add-repo DOMAIN=<domain-slug> SLUG=<repo-slug> NAME="<Display Name>" PATH=/absolute/path/to/repo
Check registered repos:
make list-repos
# or
curl -s http://127.0.0.1:8000/repos/ | python3 -m json.tool
3. SBOM Ingestion
3.1 Standard ingest (all mechanisms, recommended)
cd ~/the-custodian/state-hub
make ingest-sbom REPO=<slug> REPO_PATH=/path/to/repo
ingest_sbom.py automatically runs all four mechanisms in one scan — lockfiles,
Terraform provider locks, Ansible Galaxy manifests, and sbom-tools.yaml. All
results are merged into a single snapshot. Non-dep directories (.venv,
node_modules, .git, dist, etc.) are automatically skipped.
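The scan described above can be sketched roughly as follows. This is an illustration, not the actual `detect_all()` implementation: the names `classify`, `find_sources`, and the mechanism labels are hypothetical, the skip list contains only the directories named in the text (the real list is longer), and "under directories named ansible/" is interpreted here as "directly inside a directory called ansible".

```python
import os
from pathlib import PurePosixPath

SKIP_DIRS = {".venv", "node_modules", ".git", "dist"}  # partial list, per the text
LOCKFILES = {"uv.lock", "requirements.txt", "package-lock.json", "yarn.lock", "Cargo.lock"}
GALAXY_FILES = {"requirements.yml", "requirements.yaml"}


def classify(rel_path: str):
    """Map a repo-relative POSIX path to a mechanism label, or None."""
    path = PurePosixPath(rel_path)
    if path.name in LOCKFILES:
        return "lockfile"                      # anywhere in tree
    if path.name == ".terraform.lock.hcl":
        return "terraform-lock"                # anywhere in tree
    if path.name in GALAXY_FILES and path.parent.name == "ansible":
        return "galaxy"                        # only inside an ansible/ directory
    if rel_path == "sbom-tools.yaml":
        return "tool-manifest"                 # repo root only
    return None


def find_sources(repo_path: str):
    """Walk the repo once, pruning non-dependency directories in place."""
    found = []
    for dirpath, dirnames, filenames in os.walk(repo_path):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), repo_path)
            rel = rel.replace(os.sep, "/")
            mechanism = classify(rel)
            if mechanism:
                found.append((rel, mechanism))
    return sorted(found)
```

Pruning `dirnames` during the walk, rather than filtering results afterwards, keeps the scan from reading large vendored trees such as node_modules at all.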
3.2 Repos with system-level tools: capture first, then ingest
For repos that use system-level tools not tracked by any lockfile (Terraform binary, Helm, kubectl, k3s, goss, etc.):
# Step 1: generate sbom-tools.yaml via agent
make capture-tools REPO=<slug> REPO_PATH=/path/to/repo
# Step 2: review sbom-tools.yaml — correct any confidence: low entries
# Step 3: commit sbom-tools.yaml
git -C /path/to/repo add sbom-tools.yaml && git -C /path/to/repo commit -m "chore(sbom): add tool manifest"
# Step 4: ingest everything
make ingest-sbom REPO=<slug> REPO_PATH=/path/to/repo
3.3 Explicit lockfile path
make ingest-sbom REPO=<slug> LOCKFILE=/path/to/specific/uv.lock
Multiple lockfiles can be passed by calling the script directly with repeated
--lockfile flags:
uv run python scripts/ingest_sbom.py \
  --repo <slug> \
  --lockfile /path/to/uv.lock \
  --lockfile /path/to/package-lock.json
3.4 Dry run (inspect without submitting)
make ingest-sbom REPO=<slug> REPO_PATH=/path/to/repo DRY_RUN=1
3.5 sbom-tools.yaml: the tool manifest
Create sbom-tools.yaml at the repo root for any system-level tools not
covered by lockfiles. Schema:
# sbom-tools.yaml
tools:
  - name: terraform
    version: "1.9.5"        # confidence: medium
    ecosystem: terraform
    license_spdx: BSL-1.1
    is_direct: true
    is_dev: false
  - name: helm
    version: null           # confidence: low (no version pin found)
    ecosystem: tool
    license_spdx: Apache-2.0
    is_direct: true
    is_dev: false
Valid ecosystem values: python, node, rust, go, java, terraform,
ansible, tool, other
Annotate each version with a # confidence: high/medium/low comment.
Entries with confidence: low need human verification before committing.
The make capture-tools command generates this file automatically using the
SBOM capture agent prompt (state-hub/prompts/sbom-capture-agent.md).
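Because the confidence annotation lives in a YAML comment, most YAML loaders discard it when parsing, so a review helper has to read the raw text. The sketch below shows one way to list the entries that still need human verification. The function name `low_confidence_tools` is hypothetical, and it assumes the layout shown in the schema above (one `- name:` line per entry, with the confidence comment on a later line of the same entry).

```python
import re

NAME_RE = re.compile(r"^\s*-\s*name:\s*(?P<name>\S+)")
CONFIDENCE_RE = re.compile(r"#\s*confidence:\s*(?P<level>high|medium|low)\b")


def low_confidence_tools(manifest_text: str) -> list[str]:
    """Return the names of tool entries annotated with '# confidence: low'."""
    current = None   # name of the entry we are currently inside
    flagged = []
    for line in manifest_text.splitlines():
        name_match = NAME_RE.match(line)
        if name_match:
            current = name_match.group("name")
        confidence = CONFIDENCE_RE.search(line)
        if confidence and confidence.group("level") == "low" and current:
            flagged.append(current)
    return flagged
```

Run against the example manifest above, this would report `helm` and leave `terraform` alone.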
4. Snapshot Semantics
Each POST /sbom/ingest/ call replaces the entire previous snapshot for
that repo. This means:
- There is always exactly one snapshot per repo (the most recent ingest)
- Re-running ingest after a dependency update is idempotent — it simply refreshes the data
- Historical snapshots are not retained (v0.1 scope; versioned history is a planned extension)
The last_sbom_at timestamp on the managed_repo record indicates when the
last ingest ran.
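The replace-on-ingest behaviour can be modelled in a few lines. This is an in-memory illustration only: the class `SnapshotStore` and its methods are hypothetical and say nothing about how the State Hub actually persists data.

```python
from datetime import datetime, timezone


class SnapshotStore:
    """Keeps exactly one snapshot per repo; each ingest overwrites the last."""

    def __init__(self):
        self._snapshots = {}
        self.last_sbom_at = {}

    def ingest(self, repo_slug, packages):
        # Replace, never merge: packages missing from the new payload disappear.
        self._snapshots[repo_slug] = list(packages)
        self.last_sbom_at[repo_slug] = datetime.now(timezone.utc)

    def snapshot(self, repo_slug):
        return list(self._snapshots.get(repo_slug, []))
```

Ingesting the same package list twice leaves the stored snapshot unchanged and only moves the timestamp forward, which is the sense in which re-running ingest is idempotent.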
5. Direct vs Transitive Dependencies
| Source | is_direct | Notes |
|---|---|---|
| package-lock.json | Accurate (npm indirect flag used) | Dev packages also detected via dev flag |
| yarn.lock | false for all (yarn.lock doesn't distinguish) | Treat output as transitive |
| uv.lock | false for all (uv.lock doesn't distinguish direct from transitive) | |
| requirements.txt | true for all (every line is a direct dep) | |
| Cargo.lock | false for all (workspace member packages not yet distinguished) | |
Governance implication: is_direct=true entries receive stricter licence
scrutiny. Copyleft risk is reported specifically for is_direct=true AND is_dev=false.
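As an example of how one row of the table translates into records, the following sketch parses requirements.txt and marks every entry as direct. It is a simplified illustration, not the parser in ingest_sbom.py: the function name and record fields are assumptions, only plain `name` and `name==version` lines are handled, and `is_dev` is assumed to be false because requirements.txt carries no dev marker.

```python
import re

REQ_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"   # package name
    r"(?:\[[^\]]*\])?\s*"                      # optional extras, e.g. [standard]
    r"(?:==\s*(?P<version>[^\s;,]+))?"         # optional exact pin
)


def parse_requirements(text: str) -> list[dict]:
    deps = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments
        # Skip blanks, pip options (-r, -e, --hash) and direct URLs.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = REQ_RE.match(line)
        if not match:
            continue
        deps.append({
            "name": match.group("name"),
            "version": match.group("version"),  # None when not pinned with ==
            "ecosystem": "python",
            "is_direct": True,                   # every line is a direct dep
            "is_dev": False,                     # assumption: no dev marker exists
        })
    return deps
```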
6. Licence Governance
6.1 Copyleft detection
The following SPDX identifier substrings trigger a copyleft flag:
GPL, AGPL, LGPL, EUPL, CDDL, MPL
A copyleft flag on a direct prod dependency (is_direct=true, is_dev=false)
increments the licence_risk_count in the State Hub summary and triggers a
warning on the SBOM dashboard.
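The substring rule and the direct-prod filter can be expressed compactly. The sketch below is illustrative: the function names are hypothetical, packages are assumed to be dicts with the field names used in this document, and matching is assumed to be case-sensitive.

```python
COPYLEFT_MARKERS = ("GPL", "AGPL", "LGPL", "EUPL", "CDDL", "MPL")


def is_copyleft(license_spdx) -> bool:
    """True if any copyleft marker appears anywhere in the SPDX string."""
    if not license_spdx:
        return False  # unknown licences do not trigger the flag
    return any(marker in license_spdx for marker in COPYLEFT_MARKERS)


def licence_risk_count(packages) -> int:
    """Count copyleft packages that are direct, non-dev dependencies."""
    return sum(
        1
        for pkg in packages
        if pkg.get("is_direct")
        and not pkg.get("is_dev")
        and is_copyleft(pkg.get("license_spdx"))
    )
```

Note that `GPL` as a substring already matches `AGPL` and `LGPL` identifiers, so the longer markers are listed for readability rather than necessity.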
6.2 Dual-licensed packages
Packages with SPDX expressions like (MIT OR GPL-3.0-or-later) are flagged
conservatively — the presence of a copyleft identifier in the SPDX string
is sufficient to trigger the flag, regardless of the OR clause.
Action required: review flagged packages. If the non-copyleft licence is
used in practice, document this decision in a contrib/ BR or FR artifact and
note it in the repo's CLAUDE.md.
6.3 Unknown licences
Packages with license_spdx = null are those whose lockfile did not contain
licence metadata (uv.lock, yarn.lock, Cargo.lock do not embed licence
info). These are listed in the dashboard but do not trigger risk flags.
To resolve unknowns, consult the package's registry page (PyPI, npm, crates.io) and either accept the unknown status or enhance the ingest script.
6.4 Escalation
Per the Custodian Constitution, a copyleft direct prod dep must be reviewed before the next production deployment. Record the decision via:
register_contribution(type="br", title="Licence review: <package>", ...)
or directly in contrib/bug-reports/ using the BR template.
7. Keeping Data Current
7.1 When to re-run ingest
Re-run make ingest-sbom after any of the following:
- uv add / uv remove (Python)
- npm install / npm update (Node)
- cargo add / cargo update (Rust)
- Any lockfile regeneration
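One way to notice that a re-ingest is due is to compare lockfile modification times with the repo's last_sbom_at timestamp. The helper below is a hypothetical sketch, not part of the State Hub tooling; it assumes last_sbom_at is available as a timezone-aware datetime, and file mtimes are only a heuristic (a fresh git checkout resets them).

```python
from datetime import datetime, timezone
from pathlib import Path


def stale_lockfiles(lockfiles, last_sbom_at):
    """Return the lockfiles modified after the last ingest.

    last_sbom_at is None when the repo has never been ingested, in which
    case every lockfile counts as stale.
    """
    stale = []
    for path in lockfiles:
        mtime = Path(path).stat().st_mtime
        modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
        if last_sbom_at is None or modified > last_sbom_at:
            stale.append(str(path))
    return stale
```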
7.2 Recommended workflow integration
Add to your repo's CLAUDE.md (or developer runbook):
After updating dependencies, run:
cd ~/the-custodian/state-hub
make ingest-sbom REPO=<your-slug> SCAN=1 REPO_PATH=<your-repo-path>
7.3 Verification
After ingest:
curl -s http://127.0.0.1:8000/sbom/<your-slug>/ | python3 -m json.tool | head -30
curl -s http://127.0.0.1:8000/sbom/report/licences/ | python3 -m json.tool
Or visit the State Hub dashboard → SBOM → By Repo to see the updated snapshot.
8. Multi-Repo Domains
When a domain has multiple repos (e.g., api + frontend + infra), each
repo should be registered separately and ingested separately:
make ingest-sbom REPO=myapp-api SCAN=1 REPO_PATH=/home/worsch/myapp
make ingest-sbom REPO=myapp-frontend SCAN=1 REPO_PATH=/home/worsch/myapp-frontend
The SBOM dashboard aggregates across all repos within a domain in the By Domain table.
9. Current Registered Repos & Status
| Repo | Domain | Ecosystems | Last Ingest |
|---|---|---|---|
| the-custodian | custodian | python, node | 2026-03-01 |
| railiance-bootstrap | railiance | - (Ansible + shell, no lockfile) | - |
| railiance-hosts | railiance | terraform (2 providers) | 2026-03-01 |
(This table is informational. The live view is at the SBOM dashboard.)
10. Planned Enhancements
- Go / Java parsers: add go.sum, pom.xml, gradle.lockfile support to ingest_sbom.py
- Versioned snapshots: retain history per repo for trend analysis
- Licence override file: allow repos to document known-acceptable copyleft exceptions (.sbom-overrides.yaml)
- CI integration: GitHub Actions step to run ingest on lockfile change
- Direct-dep detection for uv.lock: parse pyproject.toml [project.dependencies] to mark direct deps accurately
- Galaxy API licence lookup: resolve license_spdx for Ansible collections via the Galaxy API at ingest time
- Tool version pinning guidance: tooling to detect confidence: low entries across all registered repos and flag them for resolution