| id | type | title | domain | repo | status | owner | topic_slug | state_hub_workstream_id | created | updated |
|---|---|---|---|---|---|---|---|---|---|---|
| CUST-WP-0013 | workplan | SBOM Infrastructure Expansion | custodian | the-custodian | completed | custodian | custodian | f4ba84c8-4d47-492d-b65e-73b157271a2b | 2026-03-12 | 2026-03-12 |
CUST-WP-0013 — SBOM Infrastructure Expansion
Scope: Extend SBOM capture beyond Python packages to cover Terraform providers,
Ansible Galaxy collections, and system-level tools (Ansible, Terraform, Helm, k3s,
cloud-init, etc.). Introduces an agent-assisted tool manifest capture workflow,
new ecosystem enum values, comprehensive auto-detection in ingest_sbom.py, and
delivers full SBOM coverage for railiance-infra and railiance-cluster.
Drives: Licence risk visibility across the full dependency graph, not just language-level packages.
Design Decisions
Tool manifest: agent-generated, not hand-written
System tools (Ansible, Terraform, Helm, k3s, etc.) live outside any lockfile —
they are provisioned, not installed by a package manager. Rather than asking
operators to maintain a hand-written manifest, the SBOM capture agent inspects
the repo and generates/updates sbom-tools.yaml automatically.
The agent prompt (state-hub/prompts/sbom-capture-agent.md) is parameterised
per repo. It reads the repo's CLAUDE.md, Makefile, README, CI configs, version
pins, and provisioning files, then emits a structured sbom-tools.yaml with
tool name, version, ecosystem, SPDX licence, and directness flags.
A thin wrapper script (state-hub/scripts/capture_sbom_tools.py) invokes the
agent prompt via claude -p (or prints it for manual use) and writes the result
to <repo-root>/sbom-tools.yaml.
Comprehensive ingest: all mechanisms per repo
make ingest-sbom REPO=<slug> must run all applicable parsers, not just
whichever lockfile happens to be auto-detected first. The updated auto-detection
in ingest_sbom.py scans:
- Package manager lockfiles (uv.lock, requirements.txt, package-lock.json, yarn.lock, Cargo.lock, go.sum)
- Terraform provider locks (.terraform.lock.hcl, anywhere in the tree)
- Ansible Galaxy manifests (requirements.yml / requirements.yaml, anywhere in the tree under ansible/)
- Agent-generated tool manifest (sbom-tools.yaml at repo root)
All parsers run and their results are merged into a single snapshot.
Phase 1 — Schema: Ecosystem Enum Extension
Acceptance: terraform, ansible, and tool are valid ecosystem values; existing
entries labelled other are unaffected; migration applies cleanly.
T01 — Alembic migration: add terraform and ansible enum values
id: CUST-WP-0013-T01
state_hub_task_id: c0b6edc4-86ab-4cee-88a8-6c66fb81adee
status: done
priority: high
Add terraform and ansible to the Ecosystem enum in the DB. Check whether
the column uses a native PostgreSQL ENUM type (requiring ALTER TYPE) or a
String column (requiring no migration). Write the migration accordingly.
Also add tool as a catch-all for tool-manifest entries that don't fit a
named ecosystem.
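For the native ENUM case, the migration could look roughly like the sketch below. The enum type name `ecosystem` is an assumption (check the actual type name in the database), and the helper `add_value_statements` is hypothetical, introduced only to keep the SQL in one place.

```python
"""Sketch of an Alembic migration adding values to a native PostgreSQL enum."""

# Assumed name of the PostgreSQL enum type; verify against the real schema.
ENUM_TYPE = "ecosystem"
NEW_VALUES = ("terraform", "ansible", "tool")


def add_value_statements(type_name: str, values: tuple) -> list:
    """Build one idempotent ALTER TYPE statement per new enum value."""
    return [
        f"ALTER TYPE {type_name} ADD VALUE IF NOT EXISTS '{value}'"
        for value in values
    ]


def upgrade() -> None:
    # Imported here so the helper above can be exercised without Alembic installed.
    from alembic import op

    # ADD VALUE cannot run inside a transaction block on older PostgreSQL
    # versions, so step outside Alembic's transaction for these statements.
    with op.get_context().autocommit_block():
        for statement in add_value_statements(ENUM_TYPE, NEW_VALUES):
            op.execute(statement)


def downgrade() -> None:
    # PostgreSQL has no ALTER TYPE ... DROP VALUE; removing a value means
    # recreating the type, so this sketch leaves the new values in place.
    pass
```

If the column turns out to be a plain String, none of this is needed and only the Python-side enum changes.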
Phase 2 — Parser Improvements in ingest_sbom.py
Acceptance: --dry-run on railiance-infra shows terraform providers and
ansible collections correctly labelled; tool manifest entries appear with the
declared ecosystem.
T02 — Promote Terraform parser: other → terraform ecosystem
id: CUST-WP-0013-T02
state_hub_task_id: 7686bccd-022c-4e30-8081-c8487eb82253
status: done
priority: high
The .terraform.lock.hcl parser already exists in ingest_sbom.py but stores
entries as ecosystem="other". Change this to ecosystem="terraform" once the T01
migration lands. Re-ingest any repos whose Terraform entries were previously
stored as other so the label is corrected.
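The existing parser is not reproduced here. As an illustration of where the ecosystem label is applied, a regex-based reader for the provider blocks in a lock file might look like this. The function name `parse_terraform_lock`, the entry fields, and the sample versions and hash are all placeholders, not taken from ingest_sbom.py.

```python
import re

# A provider block starts with: provider "<source address>" {
# and ends with a closing brace at the start of a line.
PROVIDER_RE = re.compile(
    r'provider\s+"(?P<source>[^"]+)"\s*\{(?P<body>.*?)^\}',
    re.DOTALL | re.MULTILINE,
)
VERSION_RE = re.compile(r'^\s*version\s*=\s*"(?P<version>[^"]+)"', re.MULTILINE)


def parse_terraform_lock(text: str) -> list:
    """Return one entry per provider block, labelled ecosystem="terraform"."""
    entries = []
    for block in PROVIDER_RE.finditer(text):
        version = VERSION_RE.search(block.group("body"))
        entries.append({
            "name": block.group("source"),
            "version": version.group("version") if version else None,
            "ecosystem": "terraform",  # previously "other"
        })
    return entries


# Illustrative input only; versions and hash are placeholders.
SAMPLE = '''\
provider "registry.terraform.io/hashicorp/aws" {
  version     = "5.31.0"
  constraints = "~> 5.0"
  hashes = [
    "h1:placeholder",
  ]
}

provider "registry.terraform.io/hashicorp/random" {
  version = "3.6.0"
}
'''

if __name__ == "__main__":
    for entry in parse_terraform_lock(SAMPLE):
        print(entry)
```

A regex is enough for the flat structure of lock files; a full HCL parser would be the safer choice if the format ever nests further.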
T03 — Implement Ansible Galaxy requirements.yml parser
id: CUST-WP-0013-T03
state_hub_task_id: 48658bdd-4d16-4be0-a87e-45df4f4901b0
status: done
priority: high
Parse requirements.yml / requirements.yaml files found in ansible/
subdirectories. Standard format:
collections:
  - name: community.general
    version: "9.5.0"
roles:
  - name: geerlingguy.docker
    version: "6.x"
Store as ecosystem="ansible", is_direct=True. Licence left null (Galaxy
API lookup is deferred). Handle both collections: and roles: blocks.
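A minimal sketch of the mapping step, assuming the file has already been loaded into Python objects (for example with PyYAML's `yaml.safe_load`). The function names are hypothetical. Beyond the standard format above, the sketch also tolerates bare-string entries, role entries keyed by `src`, and the older role-only layout where the file is a plain list.

```python
from typing import Any, Optional


def _name_and_version(item: Any) -> Optional[tuple]:
    """Accept either a bare string or a mapping with name/src and version."""
    if isinstance(item, str):
        return item, None
    if isinstance(item, dict):
        name = item.get("name") or item.get("src")
        if name:
            version = item.get("version")
            return str(name), None if version is None else str(version)
    return None


def parse_galaxy_requirements(data: Any) -> list:
    """Turn a loaded requirements.yml document into package entries."""
    if isinstance(data, list):
        # Older role-only files are a bare list of roles.
        data = {"roles": data}
    if not isinstance(data, dict):
        return []

    entries = []
    for section in ("collections", "roles"):
        for item in data.get(section) or []:
            parsed = _name_and_version(item)
            if parsed is None:
                continue
            name, version = parsed
            entries.append({
                "name": name,
                "version": version,       # None when unpinned
                "ecosystem": "ansible",
                "is_direct": True,
                "license_spdx": None,     # Galaxy API lookup is deferred
            })
    return entries
```

Keeping the YAML loading outside the function means the unit tests in T06 can feed it plain dicts.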
T04 — Implement sbom-tools.yaml manifest parser
id: CUST-WP-0013-T04
state_hub_task_id: 4522ea04-134b-40ee-a7a2-ea0e4c1c061d
status: done
priority: high
Parse sbom-tools.yaml at the repo root (written by the capture agent). Schema:
# Generated by sbom-capture-agent — review before committing
tools:
  - name: ansible
    version: "12.3.0"
    ecosystem: ansible   # or: terraform, other, python, etc.
    license_spdx: GPL-3.0-only
    is_direct: true
    is_dev: false
  - name: helm
    version: "3.17.x"
    ecosystem: other
    license_spdx: Apache-2.0
    is_direct: true
    is_dev: false
Supports all existing ecosystem values plus tool. Entries pass through the
same normalisation as lockfile entries. Entries with version: unknown are
skipped with a warning, because the agent could not determine the version.
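The skip-and-warn rule can be sketched as follows, again assuming the YAML has already been loaded into a dict. `parse_tools_manifest` and `KNOWN_ECOSYSTEMS` are hypothetical names. The fallback of unrecognised ecosystems to `tool` is an assumption, since this workplan does not say how they should be handled.

```python
import logging
from typing import Any

log = logging.getLogger("ingest_sbom")

# Ecosystems named in this workplan; the real enum may hold more values.
KNOWN_ECOSYSTEMS = {"python", "terraform", "ansible", "tool", "other"}


def parse_tools_manifest(data: Any) -> list:
    """Convert a loaded sbom-tools.yaml document into package entries."""
    tools = data.get("tools") if isinstance(data, dict) else None
    entries = []
    for tool in tools or []:
        name = tool.get("name")
        version = str(tool.get("version", "unknown")).strip()
        if not name:
            log.warning("sbom-tools.yaml: entry without a name skipped")
            continue
        if version.lower() in ("unknown", "none", ""):
            log.warning("sbom-tools.yaml: %s skipped, version unknown", name)
            continue
        ecosystem = tool.get("ecosystem", "tool")
        if ecosystem not in KNOWN_ECOSYSTEMS:
            # Assumption: unrecognised ecosystems fall back to the catch-all.
            log.warning("sbom-tools.yaml: %s has unknown ecosystem %r", name, ecosystem)
            ecosystem = "tool"
        entries.append({
            "name": name,
            "version": version,
            "ecosystem": ecosystem,
            "license_spdx": tool.get("license_spdx"),
            "is_direct": bool(tool.get("is_direct", True)),
            "is_dev": bool(tool.get("is_dev", False)),
        })
    return entries
```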
T05 — Comprehensive auto-detection: all formats in one scan
id: CUST-WP-0013-T05
state_hub_task_id: cdda6bf2-2a44-4444-a04a-ac2fe2314923
status: done
priority: high
Refactor the --repo-path scan to discover and run all applicable parsers,
not just the first match. Scan order:
- Walk tree for all uv.lock, requirements.txt, package-lock.json, yarn.lock, Cargo.lock
- Walk tree for all .terraform.lock.hcl
- Walk tree for ansible/requirements.yml and ansible/requirements.yaml
- Check repo root for sbom-tools.yaml
Merge all results into a single batch for the snapshot ingest call. Log a
summary line per parser: <parser>: N packages from <path>.
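One possible shape for the scan, shown as a sketch rather than the code in ingest_sbom.py. The `parsers` argument (a dict of callables keyed by parser name), the parser names, and the `SKIP_DIRS` set are assumptions made for illustration.

```python
import logging
import os
from pathlib import Path

log = logging.getLogger("ingest_sbom")

LOCKFILES = {"uv.lock", "requirements.txt", "package-lock.json", "yarn.lock", "Cargo.lock"}
ANSIBLE_FILES = {"requirements.yml", "requirements.yaml"}
SKIP_DIRS = {".git", "node_modules", ".venv"}  # assumed exclusions
PARSER_ORDER = ("lockfile", "terraform", "ansible", "tools")


def discover(repo_root: Path) -> list:
    """Find every parseable file, grouped in the documented scan order."""
    found = {name: [] for name in PARSER_ORDER}
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune excluded directories in place and keep traversal deterministic.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        here = Path(dirpath)
        for filename in sorted(filenames):
            if filename in LOCKFILES:
                found["lockfile"].append(here / filename)
            elif filename == ".terraform.lock.hcl":
                found["terraform"].append(here / filename)
            elif filename in ANSIBLE_FILES and "ansible" in here.relative_to(repo_root).parts:
                found["ansible"].append(here / filename)
    manifest = repo_root / "sbom-tools.yaml"
    if manifest.is_file():
        found["tools"].append(manifest)
    return [(name, path) for name in PARSER_ORDER for path in found[name]]


def detect_all(repo_root: Path, parsers: dict) -> list:
    """Run every applicable parser and merge the results into one batch."""
    merged = []
    for name, path in discover(repo_root):
        packages = parsers[name](path)
        log.info("%s: %d packages from %s", name, len(packages), path)
        merged.extend(packages)
    return merged
```

Separating discovery from parsing keeps the multi-parser test in T06 simple: the test can pass stub parsers and assert only on which files were picked up.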
T06 — Unit tests for new parsers
id: CUST-WP-0013-T06
state_hub_task_id: fee37e66-8f41-4dba-995b-97fc66493caf
status: done
priority: medium
Add test fixtures and unit tests for:
- Ansible Galaxy requirements.yml (collections + roles, version pinned and unpinned)
- sbom-tools.yaml (valid, missing version, unknown ecosystem)
- Multi-parser scan: repo root with uv.lock + .terraform.lock.hcl + sbom-tools.yaml produces merged results
Phase 3 — SBOM Capture Agent
Acceptance: make capture-tools REPO=railiance-infra produces a reviewed
sbom-tools.yaml that correctly identifies Ansible, Terraform, Helm, and other
declared tools with versions and SPDX licences.
T07 — Write SBOM capture agent prompt
id: CUST-WP-0013-T07
state_hub_task_id: a3b919b5-63b0-44f7-a048-ebfae603ef7b
status: done
priority: high
Write state-hub/prompts/sbom-capture-agent.md — a Claude agent prompt
parameterised with {repo_slug} and {repo_path}. The prompt instructs the
agent to:
- Read CLAUDE.md, Makefile, README.md, pyproject.toml, .tool-versions, CI configs, Dockerfiles, and provisioning files in {repo_path}
- Identify all system-level tools: name, version (from version pins, Makefile vars, or documented prerequisites), ecosystem, SPDX licence
- Identify indirect/transitive tool deps (e.g. Ansible → Python; Terraform → provider plugins already captured by .terraform.lock.hcl)
- Emit a well-formed sbom-tools.yaml with a comment header noting generation date and confidence level per entry (# confidence: high/medium/low)
- Flag any tools where version could not be determined (version: unknown) for human review
The prompt must instruct the agent not to hallucinate versions: every version must be derived from evidence in the repo or marked unknown.
T08 — Implement capture_sbom_tools.py
id: CUST-WP-0013-T08
state_hub_task_id: 9593dca7-e713-4d7a-b4f2-c5333ae0b3d2
status: done
priority: high
Write state-hub/scripts/capture_sbom_tools.py:
- Accepts --repo SLUG and --repo-path PATH
- Resolves repo path from slug via the state-hub API if --repo-path is omitted
- Loads the agent prompt from prompts/sbom-capture-agent.md, substitutes {repo_slug} and {repo_path}
- Invokes claude -p "<prompt>" (non-interactive) and captures stdout
- Parses the YAML block from the response
- Writes or updates <repo-path>/sbom-tools.yaml
- Prints a diff of changes if the file already exists
- --dry-run flag: print the prompt and diff without writing
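The core of the steps listed above might be sketched like this. All function names are hypothetical. The sketch assumes the agent wraps its YAML in a fenced block and falls back to treating the whole response as YAML if it does not. Slug resolution via the state-hub API and argument parsing are left out.

```python
import difflib
import re
import subprocess

FENCE = "`" * 3  # built at runtime so this listing contains no literal fence
FENCE_RE = re.compile(FENCE + r"(?:yaml|yml)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def render_prompt(template: str, repo_slug: str, repo_path: str) -> str:
    # str.replace rather than str.format: the prompt may contain other braces.
    return template.replace("{repo_slug}", repo_slug).replace("{repo_path}", repo_path)


def run_agent(prompt: str) -> str:
    """Invoke the agent non-interactively and return its stdout."""
    result = subprocess.run(
        ["claude", "-p", prompt], capture_output=True, text=True, check=True
    )
    return result.stdout


def extract_yaml_block(response: str) -> str:
    """Pull the first fenced YAML block out of the agent's response."""
    match = FENCE_RE.search(response)
    return (match.group(1) if match else response).strip() + "\n"


def manifest_diff(old: str, new: str) -> str:
    """Unified diff between the existing manifest and the proposed one."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="sbom-tools.yaml (current)",
        tofile="sbom-tools.yaml (proposed)",
    ))
```

With --dry-run, the script would print the rendered prompt and the diff and stop before writing the file.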
T09 — Add make capture-tools target
id: CUST-WP-0013-T09
state_hub_task_id: 6948e1d2-9c97-4709-bdb0-4b6ded700a22
status: done
priority: medium
Add to state-hub/Makefile:
capture-tools: ## Run SBOM capture agent for a repo (REPO=slug, REPO_PATH=path)
	uv run python scripts/capture_sbom_tools.py --repo $(REPO) $(if $(REPO_PATH),--repo-path $(REPO_PATH),)
Also update make ingest-sbom to note that capture-tools should be run first
for repos that have system-level tool dependencies.
Phase 4 — Ingest railiance-infra
Acceptance: make ingest-sbom REPO=railiance-infra shows terraform providers,
ansible collections, and tool manifest entries in one snapshot.
T10 — Capture tools manifest for railiance-infra
id: CUST-WP-0013-T10
state_hub_task_id: 99b23998-5129-4777-9d42-7bee5981cdbb
status: done
priority: medium
Run make capture-tools REPO=railiance-infra. Review the generated
railiance-infra/sbom-tools.yaml — verify Ansible, Terraform, cloud-init, goss,
and any other tools with their versions and licences. Correct any unknown
versions by consulting the repo. Commit the file.
T11 — Ingest railiance-infra
id: CUST-WP-0013-T11
state_hub_task_id: bb516909-f903-48ce-b60b-a24245e7382e
status: done
priority: medium
Run make ingest-sbom REPO=railiance-infra REPO_PATH=~/railiance-infra. Verify
the snapshot contains:
- Terraform providers (from .terraform.lock.hcl)
- Ansible Galaxy collections (from ansible/requirements.yaml)
- System tools (from sbom-tools.yaml)
Check the licence report for any copyleft or BSL flags.
Phase 5 — Ingest railiance-cluster
Acceptance: railiance-cluster SBOM covers both Python packages (uv.lock) and system tools in a single snapshot.
T12 — Capture tools manifest for railiance-cluster
id: CUST-WP-0013-T12
state_hub_task_id: 7a890f1a-da9f-4e6d-86a7-4fd1aefd5b3f
status: done
priority: medium
Run make capture-tools REPO=railiance-cluster. Review the generated
railiance-cluster/sbom-tools.yaml — verify Helm, kubectl, k3s, and any other
operational tools. Commit the file.
T13 — Re-ingest railiance-cluster
id: CUST-WP-0013-T13
state_hub_task_id: 789dbe93-011a-4470-9fec-ebf249cd7134
status: done
priority: medium
Run make ingest-sbom REPO=railiance-cluster REPO_PATH=~/railiance-cluster.
Verify the snapshot merges uv.lock (Python packages including ansible-core) and
sbom-tools.yaml entries into one coherent snapshot. Confirm ansible-core GPL-3.0
flag appears in the licence report.
Phase 6 — Convention Documentation
Acceptance: A developer reading the SBOM convention doc knows exactly how to add a new repo to SBOM coverage.
T14 — Document SBOM capture convention in canon/standards
id: CUST-WP-0013-T14
state_hub_task_id: dc3bb2a3-882e-4dd7-ab7c-8b1e88279a7d
status: done
priority: low
Write canon/standards/sbom-convention_v0.1.md documenting:
- The four capture mechanisms and when each applies
- The sbom-tools.yaml schema (with confidence annotation convention)
- The make capture-tools → review → commit → make ingest-sbom workflow
- Licence risk thresholds: copyleft = flag for review; BSL = flag for review; null licence = acceptable for infra tools if well-known open source
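The licence thresholds could be expressed as a small classifier. This is a sketch under assumptions: the function name, the return labels, and the deliberately narrow copyleft prefix list are illustrative and not part of the convention. The SPDX identifier for the Business Source License 1.1 is BUSL-1.1, so the sketch accepts that alongside the BSL-1.1 spelling used in this workplan.

```python
import re
from typing import Optional

# Narrow, illustrative list; extend to match the convention document.
COPYLEFT_PREFIXES = ("GPL-", "AGPL-", "LGPL-")
# "BSL-1.1" is the spelling used in this workplan; SPDX uses "BUSL-1.1".
SOURCE_AVAILABLE = {"BSL-1.1", "BUSL-1.1"}


def licence_risk(license_spdx: Optional[str]) -> str:
    """Map a licence string to one of the convention's risk buckets."""
    if not license_spdx:
        # Acceptable for well-known open-source infra tools, per the thresholds.
        return "unknown"
    # Split expressions such as "Apache-2.0 / GPL-3.0" into individual ids.
    tokens = re.findall(r"[A-Za-z0-9.+-]+", license_spdx)
    if any(token in SOURCE_AVAILABLE for token in tokens):
        return "flag: non-OSI"
    if any(token.startswith(COPYLEFT_PREFIXES) for token in tokens):
        return "flag: copyleft"
    return "clean"
```

For example, `licence_risk("GPL-3.0-only")` falls in the copyleft bucket, while a mixed entry such as `"Apache-2.0 / GPL-3.0"` is flagged because one of its alternatives is copyleft.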
Licence Risk Preview
Based on known tool licences, expect these flags once ingested:
| Tool / Package | Licence | Risk level |
|---|---|---|
| ansible-core | GPL-3.0-only | Copyleft — flag (ops toolchain, not shipped) |
| terraform ≥ 1.5.6 | BSL-1.1 | Non-OSI — flag for review |
| hashicorp providers | BSL-1.1 | Same |
| community.general | GPL-3.0 | Copyleft — flag (ops toolchain) |
| Helm | Apache-2.0 | Clean |
| k3s | Apache-2.0 | Clean |
| cloud-init | Apache-2.0 / GPL-3.0 | Mixed — check version |
| goss | Apache-2.0 | Clean |
All copyleft/BSL entries here are operational toolchain dependencies, not shipped code — risk is low but worth tracking for compliance awareness.