The state-hub resolver was calling GET /sbom/status?repo={slug}, which State
Hub does not expose. Real SBOM routes are /sbom/, /sbom/{slug},
/sbom/snapshots/, /sbom/snapshots/{id}, /sbom/ingest/, /sbom/report/licences/.
The weekly-sbom-staleness ActivityDefinition was passing params {repos: all}
and the resolver was reading params.get("repo_slug", ""), so the URL
collapsed to /sbom/status?repo= and 404'd. _fetch_json swallowed the error,
the rule context.repos.sbom_age_days > 30 evaluated against {} and never
matched, and the weekly SBOM check has been a silent no-op for as long as
the route mismatch has existed.
Resolver now supports two modes selected by params:
- single-repo: {repo_slug: foo} → GET /sbom/{foo}, returns
{repo_slug, last_sbom_at, sbom_age_days, has_sbom}
- bulk: {repos: all} → GET /repos/, computes per-repo age, returns the
worst repo's fields hoisted to the top of the result alongside
stale_count, total_count, worst_* fields, and the full per-repo list
Never-scanned repos get a 99999 sentinel age so threshold rules treat
them as very stale without forcing the rule to special-case None.
Hoisting the worst entry to the top preserves the existing rule
expression context.repos.sbom_age_days > 30 (and target_repo:
context.repos.repo_slug, though that field is a separate interpolation
gap tracked as ADHOC-2026-06-01-T02). The integration tests'
aspirational per-repo iteration model is left intact.
Live validation against State Hub on 2026-06-01:
- single: activity-core → 36 days since 2026-04-26 ingest
- bulk: 48 repos total, 46 stale (>30d), worst is info-tech-canon (never
scanned), rule expression evaluates True
Tests: 120 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
7.4 KiB
id, type, title, domain, repo, status, owner, topic_slug, created, updated, state_hub_workstream_id
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| ADHOC-2026-06-01 | workplan | Ad hoc — activity-core opportunistic fixes 2026-06-01 | custodian | activity-core | ready | custodian | custodian | 2026-06-01 | 2026-06-01 | 36162ff0-9b47-47c4-8602-56767f9b7a1c |
ADHOC-2026-06-01 — activity-core opportunistic fixes
Captured during the CUST-WP-0045 T06 cutover prep session. The dev worker was brought up and surfaced an unrelated, pre-existing bug in the state-hub context resolver that is independent of the daily triage canary.
Tasks
T01 - Fix repo_sbom_status resolver route and params
id: ADHOC-2026-06-01-T01
status: done
priority: low
state_hub_task_id: "87b56da9-e692-4350-9aff-47080414ec06"
src/activity_core/context_resolvers/state_hub.py resolves
query: repo_sbom_status by calling GET /sbom/status?repo={repo_slug}, but
State Hub does not expose /sbom/status at all. Actual SBOM routes are
/sbom/, /sbom/{repo_slug}, /sbom/snapshots/, /sbom/snapshots/{id},
/sbom/ingest/, /sbom/report/licences/.
Compounding bug: the only ActivityDefinition using this query is
activity-definitions/weekly-sbom-staleness.md, which passes
params: { repos: all }. The resolver reads params.get("repo_slug", ""),
so the lookup URL collapses to /sbom/status?repo= regardless of the
ActivityDefinition value.
Symptom: every Monday at 09:00 Europe/Berlin (and on worker startup after a
missed Monday tick), the weekly-sbom-staleness workflow runs and the
resolver logs HTTP/1.1 404 Not Found for GET /sbom/status?repo=. The
_fetch_json helper swallows the error and returns {}, so the workflow
continues but the downstream rule evaluates
context.repos.sbom_age_days > 30 against an empty dict and never spawns
the intended SBOM rescan tasks. The weekly SBOM staleness check has been
silently no-op for as long as this route mismatch has existed.
Fix scope:
- Decide the contract — single-repo lookup (current parameter shape suggests
this) versus multi-repo bulk lookup (
repos: allsuggests this). - Update the resolver to call the actual State Hub route(s):
- single repo:
GET /sbom/{repo_slug}(or/sbom/{repo_slug}/statusif a status-shaped projection is preferred and exists). - bulk: iterate the State Hub
/repos/list and call/sbom/{repo_slug}per repo, returning a list bound tocontext.repos.
- single repo:
- Update
activity-definitions/weekly-sbom-staleness.mdto match: either pass a realrepo_slugper definition (multiple definitions, one per repo) or keeprepos: alland let the resolver fan out. - Update the rule expression to traverse the resulting shape — currently
context.repos.sbom_age_daysassumes a single object; if the resolver returns a list, the rule needsany(repo.sbom_age_days > 30 for repo in context.repos)or an equivalent per-repo evaluation. - Add a resolver unit test that asserts the resolver hits a route State Hub actually serves, and an integration test against a fixture State Hub response so this regression cannot repeat.
Out of scope for this adhoc:
- Decoupling SBOM staleness rules from the state hub resolver.
- Rewriting the SBOM ingestion pipeline or
sbom_sourcepolicy. - Promoting this to a full workplan unless the multi-repo decision turns out to need design discussion.
Done when weekly-sbom-staleness runs cleanly against a live State Hub on
Monday and either spawns SBOM rescan tasks for stale repos or leaves a clear
"all SBOMs fresh" audit row — not a 404 log line and a silent no-op.
Completion — 2026-06-01:
Resolver now supports two modes selected by params:
- single-repo:
params: {repo_slug: foo}→GET /sbom/{foo} - bulk:
params: {repos: all}→GET /repos/, computes per-repo age, returns the worst-repo fields hoisted to the top of the result alongsidestale_count,total_count,worst_*fields, and the full per-repo list
Never-scanned repos use a 99999 sentinel age so threshold rules treat them
as very stale without forcing the rule expression to special-case None.
activity-definitions/weekly-sbom-staleness.md kept its existing rule
expression context.repos.sbom_age_days > 30 (the resolver hoists the worst
repo's age to that path). The definition now documents that the rule fires
at most once per workflow run, not once per stale repo, and that the
aspirational per-stale-repo fan-out exercised by the integration tests is
not delivered by the current workflow.
Live validation against the running State Hub on 2026-06-01:
- single:
activity-core→ 36 days since SBOM ingest at 2026-04-26 - bulk: 48 repos total, 46 stale (>30d); worst is
info-tech-canon(last_sbom_at: null→ 99999d sentinel); rule expression evaluates True
Tests: uv run pytest -q → 120 passed, 1 skipped (previously 116 passed +
4 broken integration tests; broken-on-my-change reverted by hoisting the
worst-repo fields to the top of context.repos).
T02 - Rule action context interpolation and per-iteration binding
id: ADHOC-2026-06-01-T02
status: todo
priority: low
state_hub_task_id: "6b3a185e-cbea-454c-82fb-8b4c16cefef0"
Discovered while completing T01: RunActivityWorkflow builds each
TaskSpec by lifting raw YAML fields out of the rule action without ever
interpolating context.* references:
# src/activity_core/workflows.py
task_spec_dicts.append({
"title": action.get("task_template", rule.get("id", "")),
"target_repo": action.get("target_repo"),
...
})
So target_repo: context.repos.repo_slug in an ActivityDefinition rule is
emitted to the spawn log as the literal string "context.repos.repo_slug",
not the actual stale repo slug. The aspirational per-stale-repo fan-out
exercised by test_pipeline_emits_one_task_for_stale_repo_only and friends
in tests/test_integration_event_bridge.py is not delivered by the
workflow — those tests simulate a per-repo iteration the real workflow
does not perform.
Two pieces of work, likely related:
-
Action field interpolation. Define and implement a safe template grammar for
action.target_repo,action.task_template,action.priority,action.labels, etc. Reuse the rule-condition AST walker (noexec, no comprehensions) or a constrained string{context.foo.bar}substitution. Decide on grammar — instruction prompt rendering uses{...}placeholders today (rules/executor.py::_render_prompt); consistent with that is probably right. -
Per-iteration context binding. Decide whether the workflow should evaluate a rule once per element of a list-valued context field (the integration-test contract), or whether the spawn-once semantics is actually desired and the tests should be relaxed. If iteration is the answer, the resolver shape from T01 already gives a clean
reposlist to iterate over; the workflow would need an explicitfor_each:directive on the rule, or implicit iteration whenconditionreferences a list element.
This is borderline workplan-grade work (design decision + security review of the interpolation grammar + workflow change + test updates). Promote to a full workplan if anyone decides to actually do it; the adhoc T02 is just to make sure the gap doesn't get forgotten.
Done when either: (a) rule action fields interpolate context.*
expressions and a stale-repo workflow run emits a TaskSpec with the actual
repo slug, or (b) a recorded decision explicitly defers/declines the change
with reasoning.