Capture sbom_status resolver bug as ADHOC-2026-06-01

Surfaced while bringing up the dev worker for the CUST-WP-0045 T06 cutover.
weekly-sbom-staleness fires its state-hub resolver with query
repo_sbom_status, which hits GET /sbom/status?repo=. State Hub does not
expose that route, so _fetch_json returns {} and the rule
context.repos.sbom_age_days > 30 silently no-ops. The weekly SBOM check has
been a no-op for as long as the route mismatch has existed. Logged as a
low-priority adhoc rather than promoting to a workplan because the resolver
and definition both need a one-line decision (single-repo vs fan-out), not
multi-phase design.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 03:16:12 +02:00
parent ca6d80ec07
commit 5d3fb33c6b

@@ -0,0 +1,82 @@
---
id: ADHOC-2026-06-01
type: workplan
title: "Ad hoc — activity-core opportunistic fixes 2026-06-01"
domain: custodian
repo: activity-core
status: ready
owner: custodian
topic_slug: custodian
created: "2026-06-01"
updated: "2026-06-01"
state_hub_workstream_id: "36162ff0-9b47-47c4-8602-56767f9b7a1c"
---
# ADHOC-2026-06-01 — activity-core opportunistic fixes
Captured during the CUST-WP-0045 T06 cutover prep session. The dev worker was
brought up and surfaced an unrelated, pre-existing bug in the state-hub
context resolver that is independent of the daily triage canary.
## Tasks
### T01 - Fix repo_sbom_status resolver route and params
```task
id: ADHOC-2026-06-01-T01
status: todo
priority: low
state_hub_task_id: "87b56da9-e692-4350-9aff-47080414ec06"
```
`src/activity_core/context_resolvers/state_hub.py` resolves
`query: repo_sbom_status` by calling `GET /sbom/status?repo={repo_slug}`, but
State Hub does not expose `/sbom/status` at all. Actual SBOM routes are
`/sbom/`, `/sbom/{repo_slug}`, `/sbom/snapshots/`, `/sbom/snapshots/{id}`,
`/sbom/ingest/`, `/sbom/report/licences/`.
Compounding bug: the only ActivityDefinition using this query is
`activity-definitions/weekly-sbom-staleness.md`, which passes
`params: { repos: all }`. The resolver reads `params.get("repo_slug", "")`,
so the lookup URL collapses to `/sbom/status?repo=` regardless of the
ActivityDefinition value.
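
The key mismatch can be reproduced in isolation. This is a minimal sketch, not the resolver's actual code: `build_sbom_status_url` is a hypothetical helper that only mirrors the `params.get("repo_slug", "")` lookup and the URL shape described above.

```python
def build_sbom_status_url(params: dict) -> str:
    """Hypothetical stand-in for the URL construction described in this task."""
    # The resolver looks for "repo_slug" and falls back to "" when it is absent.
    repo_slug = params.get("repo_slug", "")
    return f"/sbom/status?repo={repo_slug}"


# weekly-sbom-staleness passes `repos: all`, so "repo_slug" is never set.
definition_params = {"repos": "all"}
print(build_sbom_status_url(definition_params))  # prints: /sbom/status?repo=
```

Because the definition and the resolver disagree on the key name, the default empty string always wins, whatever the definition passes.
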
Symptom: every Monday at 09:00 Europe/Berlin (and on worker startup after a
missed Monday tick), the `weekly-sbom-staleness` workflow runs and the
resolver logs `HTTP/1.1 404 Not Found` for `GET /sbom/status?repo=`. The
`_fetch_json` helper swallows the error and returns `{}`, so the workflow
continues but the downstream rule evaluates
`context.repos.sbom_age_days > 30` against an empty dict and never spawns
the intended SBOM rescan tasks. The weekly SBOM staleness check has been a
silent no-op for as long as this route mismatch has existed.
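
The failure chain can be sketched as follows. Both functions are hypothetical stand-ins: `fetch_json_swallowing` imitates the described behaviour of `_fetch_json` (any error becomes `{}`), and `rule_fires` assumes the rule engine treats a missing `sbom_age_days` as "condition not met", which is consistent with the silent no-op but is not confirmed from the engine's code.

```python
def fetch_json_swallowing(status_code: int, body: dict) -> dict:
    """Stand-in for _fetch_json: a non-2xx response is swallowed and becomes {}."""
    if not 200 <= status_code < 300:
        return {}
    return body


def rule_fires(context: dict) -> bool:
    """Stand-in for `context.repos.sbom_age_days > 30`.

    Assumption: a missing field evaluates to False instead of raising.
    """
    repos = context.get("repos") or {}
    age = repos.get("sbom_age_days")
    return age is not None and age > 30


# Today: the 404 is swallowed, the context is empty, and the rule never fires.
broken_context = {"repos": fetch_json_swallowing(404, {"detail": "Not Found"})}
print(rule_fires(broken_context))  # prints: False

# With a served route and a stale SBOM, the same rule would fire.
healthy_context = {"repos": fetch_json_swallowing(200, {"sbom_age_days": 45})}
print(rule_fires(healthy_context))  # prints: True
```

The 404 and a genuinely fresh SBOM produce the same `False`, which is why the workflow run looks healthy even though nothing was checked.
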
Fix scope:
1. Decide the contract: single-repo lookup (the resolver's `repo_slug`
   parameter suggests this) versus multi-repo bulk lookup (the definition's
   `repos: all` suggests this).
2. Update the resolver to call the actual State Hub route(s):
   - single repo: `GET /sbom/{repo_slug}` (or `/sbom/{repo_slug}/status` if a
     status-shaped projection is preferred and exists).
   - bulk: iterate the State Hub `/repos/` list and call `/sbom/{repo_slug}`
     per repo, returning a list bound to `context.repos`.
3. Update `activity-definitions/weekly-sbom-staleness.md` to match: either pass
   a real `repo_slug` per definition (multiple definitions, one per repo) or
   keep `repos: all` and let the resolver fan out.
4. Update the rule expression to traverse the resulting shape. It currently
   reads `context.repos.sbom_age_days`, which assumes a single object; if the
   resolver returns a list, the rule needs
   `any(repo.sbom_age_days > 30 for repo in context.repos)` or an equivalent
   per-repo evaluation.
5. Add a resolver unit test that asserts the resolver hits a route State Hub
   actually serves, and an integration test against a fixture State Hub
   response so this regression cannot repeat.
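
One possible shape for the fan-out option, sketched under stated assumptions rather than as the agreed contract. The names `resolve_repo_sbom_status`, `any_sbom_stale`, and `fake_fetch` are hypothetical. The sketch also assumes that `/repos/` returns a list of objects with a `slug` field and that `/sbom/{repo_slug}` returns an object carrying `sbom_age_days`; neither response shape is confirmed here, and the fixture data is invented for illustration.

```python
from typing import Callable

FetchJson = Callable[[str], object]


def resolve_repo_sbom_status(params: dict, fetch_json: FetchJson) -> list:
    """Return one SBOM record per repo via /repos/ and /sbom/{repo_slug}."""
    if params.get("repos") == "all":
        # Assumption: /repos/ yields a list of objects with a "slug" field.
        slugs = [repo["slug"] for repo in (fetch_json("/repos/") or [])]
    else:
        # A missing "repo_slug" raises KeyError instead of defaulting to "".
        slugs = [params["repo_slug"]]

    records = []
    for slug in slugs:
        payload = fetch_json(f"/sbom/{slug}")
        if payload:
            records.append({"repo_slug": slug, **payload})
    return records


def any_sbom_stale(repos: list, max_age_days: int = 30) -> bool:
    """List-shaped equivalent of `context.repos.sbom_age_days > 30`."""
    return any(repo.get("sbom_age_days", 0) > max_age_days for repo in repos)


# Invented fixture standing in for State Hub responses.
FIXTURE = {
    "/repos/": [{"slug": "activity-core"}, {"slug": "state-hub"}],
    "/sbom/activity-core": {"sbom_age_days": 45},
    "/sbom/state-hub": {"sbom_age_days": 3},
}
calls = []


def fake_fetch(path: str):
    """Record each requested path so a test can assert on the routes used."""
    calls.append(path)
    return FIXTURE.get(path, {})


records = resolve_repo_sbom_status({"repos": "all"}, fake_fetch)

# Regression guard: the resolver must never request the unserved route.
assert all(not path.startswith("/sbom/status") for path in calls)
print(calls)  # prints: ['/repos/', '/sbom/activity-core', '/sbom/state-hub']
print(any_sbom_stale(records))  # prints: True
```

Injecting the fetch function keeps the route assertion cheap to run in a unit test. The integration test from step 5 would still need a fixture that matches real State Hub responses.
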
Out of scope for this adhoc:
- Decoupling SBOM staleness rules from the state hub resolver.
- Rewriting the SBOM ingestion pipeline or `sbom_source` policy.
- Promoting this to a full workplan unless the multi-repo decision turns out
to need design discussion.
Done when `weekly-sbom-staleness` runs cleanly against a live State Hub on
Monday and either spawns SBOM rescan tasks for stale repos or leaves a clear
"all SBOMs fresh" audit row — not a 404 log line and a silent no-op.