Add ops inventory probe runner workplan

2026-06-05 22:46:11 +02:00
parent 42e373aba1
commit 3b8bac26da

@@ -0,0 +1,218 @@
---
id: ACTIVITY-WP-0007
type: workplan
title: "Ops Inventory Probe Runner"
domain: custodian
repo: activity-core
status: ready
owner: codex
topic_slug: custodian
created: "2026-06-05"
updated: "2026-06-05"
---
# ACTIVITY-WP-0007 - Ops Inventory Probe Runner
## Context
Custodian `CUST-WP-0047` introduced an inventory-first ops-hub slice: a
non-secret service inventory file, a generated service catalog view, and a
disabled draft ActivityDefinition for repeatable service inventory probes.
State Hub message `d543c39c-1f04-4f1e-a1e4-0d5b40503525` handed the
activity-core portion to this repo.
The request fits activity-core only if it stays narrow. activity-core should
provide scheduled policy, bounded context resolution, deterministic lightweight
HTTP/HTTPS checks, and non-secret evidence emission. It must not become an ops
executor, secret handler, Inter-Hub operator, k8s/ssh/tunnel runner, or service
inventory authority.
Existing activity-core capabilities that make this feasible:
- cron/manual ActivityDefinitions and Temporal orchestration
- external definition scanning via `ACTIVITY_DEFINITION_DIRS`
- static context sources and pluggable context resolvers
- State Hub context resolver and State Hub progress report sink pattern
- working-memory sink, NATS/EventEnvelope routing, and event type registry
Known gaps this workplan closes:
- no `ops-inventory` resolver for `service-inventory.yml`
- no deterministic endpoint/access-path probe result model
- no non-LLM evidence sink for ops probe summaries
- no ops evidence event definitions owned by activity-core
- no Railiance projection for the Custodian probe definition or inventory input
## Add Ops Inventory Context Resolver
```task
id: ACTIVITY-WP-0007-T01
status: todo
priority: high
```
Add a registered context resolver:
- source type: `ops-inventory`
- query: `probe_services`
- params: `inventory_path`, `timeout_seconds`, `include_kinds`,
`allow_network`, `required`
The resolver reads and validates a non-secret service inventory YAML file,
initially `/home/worsch/the-custodian/ops/service-inventory.yml` when present.
It produces compact structured output:
```json
{
"services": [],
"endpoints": [],
"summary": {"ok": 0, "degraded": 0, "down": 0, "skipped": 0},
"generated_at": "..."
}
```
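As a minimal sketch of how the compact output above might be assembled, assuming each endpoint result is a dict with a `status` field holding one of the four summary values. The function name `build_probe_output` and the per-endpoint shape are hypothetical and not part of activity-core:

```python
from collections import Counter
from datetime import datetime, timezone

# The four summary buckets named in the output contract.
STATUSES = ("ok", "degraded", "down", "skipped")


def build_probe_output(services, endpoints, now=None):
    """Assemble the compact resolver output from per-endpoint results.

    Assumption: every endpoint dict has a "status" key holding one of STATUSES.
    """
    counts = Counter(e["status"] for e in endpoints)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected status values: {sorted(unknown)}")
    now = now or datetime.now(timezone.utc)
    return {
        "services": list(services),
        "endpoints": list(endpoints),
        # Always emit all four counters so consumers see a stable shape.
        "summary": {s: counts.get(s, 0) for s in STATUSES},
        "generated_at": now.isoformat(),
    }
```

Emitting every counter, including zeros, keeps the summary shape identical across runs, which makes the evidence easier to compare.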
First implementation scope:
- HTTP/HTTPS endpoint probes only
- expected status and expected signal checks only
- non-HTTP, k8s, ssh, tunnel, and authenticated access paths return
`skipped` / `unsupported`, not failed
- a missing inventory file returns `{}` or a skipped summary when the context
source is optional, and fails resolution when the source is `required`
- no response bodies, cookies, authorization headers, tokens, or command output
are stored
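The scope rules above could be expressed as a pure classification step that runs after the HTTP request has been made. This is a sketch under stated assumptions: `ProbeResult`, `classify`, and the reason strings are hypothetical names, and mapping a status or signal mismatch to `degraded` is an assumption that the workplan does not fix.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ProbeResult:
    status: str                        # "ok", "degraded", "down", or "skipped"
    reason: str
    http_status: Optional[int] = None  # status code only; no body or headers kept


def classify(url, expected_status, http_status, signal_found,
             requires_auth=False, error=None):
    """Map one probe observation to a compact, non-secret result."""
    scheme = urlsplit(url).scheme.lower()
    # Non-HTTP and authenticated access paths are skipped, never failed.
    if scheme not in ("http", "https") or requires_auth:
        return ProbeResult("skipped", "unsupported")
    if error is not None or http_status is None:
        return ProbeResult("down", "network_error")
    # Assumption: mismatches count as "degraded" rather than "down".
    if http_status != expected_status:
        return ProbeResult("degraded", "status_mismatch", http_status)
    if not signal_found:
        return ProbeResult("degraded", "signal_mismatch", http_status)
    return ProbeResult("ok", "matched", http_status)
```

Keeping classification separate from network I/O lets the fixture-based tests exercise every outcome without opening a socket.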
Done when fixture-based resolver tests cover `ok`, expected-status mismatch,
expected-signal mismatch, network/down, unsupported, and optional/required
inventory failure behavior.
## Add Ops Evidence Sink
```task
id: ACTIVITY-WP-0007-T02
status: todo
priority: high
```
Add a deterministic non-LLM evidence sink for compact probe results.
Initial sink behavior:
- sink type: `state-hub-progress`
- State Hub event type: `ops_inventory_probe`
- idempotency key: `activity_core_run_id + service_id + endpoint_id/access_path_id + event_type`
- detail contains compact non-secret results only
- one summary progress event per run is acceptable for the first version
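One way to derive the idempotency key from the four fields listed above is sketched here. Hashing a JSON-encoded list is an assumption chosen to avoid delimiter ambiguity; the workplan only names the fields, not the encoding, and `idempotency_key` is a hypothetical helper name.

```python
import hashlib
import json


def idempotency_key(run_id, service_id, target_id, event_type):
    """Build a stable key from the run, service, target, and event type.

    target_id stands for either endpoint_id or access_path_id.
    """
    parts = [run_id, service_id, target_id, event_type]
    if not all(isinstance(p, str) and p for p in parts):
        raise ValueError("all idempotency key parts must be non-empty strings")
    # JSON encoding keeps ("a:b", "c") distinct from ("a", "b:c").
    encoded = json.dumps(parts, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Because the key depends only on its inputs, a retried post for the same run and target yields the same key and can be deduplicated by the receiver.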
Prepare the contract for later Inter-Hub submission without making it mandatory:
- event names: `ops-service-observed`, `ops-endpoint-verified`,
`ops-access-path-checked`, `ops-backup-verified`, `ops-inventory-drift`
- Inter-Hub mode requires `INTER_HUB_URL`, `OPS_HUB_KEY` supplied from a Secret,
and widget/capability mapping config
- missing Inter-Hub config skips cleanly with an explicit sink result
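The clean-skip behavior for missing Inter-Hub config might look like the following sketch. It assumes both values reach the worker as environment variables (with `OPS_HUB_KEY` projected from the Secret); the result shape and the function name `inter_hub_sink_result` are hypothetical. The widget/capability mapping check is left out because its format is not yet defined.

```python
import os

# Settings named in the workplan; how they are delivered is an assumption.
REQUIRED_INTER_HUB_ENV = ("INTER_HUB_URL", "OPS_HUB_KEY")


def inter_hub_sink_result(env=None):
    """Report whether Inter-Hub submission can run, without raising."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_INTER_HUB_ENV if not env.get(name)]
    if missing:
        # Only variable names are reported, never their values.
        return {
            "sink": "inter-hub",
            "status": "skipped",
            "reason": "missing_config",
            "missing": missing,
        }
    return {"sink": "inter-hub", "status": "ready", "missing": []}
```

Returning an explicit result instead of raising lets the State Hub sink proceed while the run still records why Inter-Hub was skipped.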
Done when sink idempotency, State Hub fallback posting, missing Inter-Hub
config, and no-secret-leak behavior are covered by tests.
## Register Ops Evidence Event Definitions
```task
id: ACTIVITY-WP-0007-T03
status: todo
priority: medium
```
Add activity-core-owned event type definitions for ops evidence so producers
and future widgets have a stable contract:
- `ops-service-observed`
- `ops-endpoint-verified`
- `ops-access-path-checked`
- `ops-backup-verified`
- `ops-inventory-drift`
Each definition must document:
- publisher intent
- non-secret attribute schema
- idempotency fields
- examples for success, degraded, down, skipped, and drift where applicable
- explicit forbidden payload material: secrets, auth headers, cookies, raw
response bodies, command output, and token-like values
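A payload check for the forbidden material listed above could be sketched as follows. The key names and the two token patterns (bearer strings and JWT-shaped values) are illustrative assumptions and deliberately narrow; `find_forbidden` is a hypothetical helper, not an existing activity-core function.

```python
import re

# Illustrative key names only; a real schema would define its own denylist.
FORBIDDEN_KEYS = {
    "authorization", "cookie", "set-cookie", "token", "password", "secret",
    "body", "response_body", "command_output", "stdout", "stderr",
}
# Heuristic: bearer strings and JWT-shaped values.
TOKEN_LIKE = re.compile(
    r"\bbearer\s+\S+|\beyJ[\w-]{5,}\.[\w-]{5,}\.[\w-]{5,}",
    re.IGNORECASE,
)


def find_forbidden(payload, path=""):
    """Return the paths of forbidden keys or token-like strings in a payload."""
    findings = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else str(key)
            if str(key).lower() in FORBIDDEN_KEYS:
                findings.append(here)  # flag the key; do not inspect its value
            else:
                findings.extend(find_forbidden(value, here))
    elif isinstance(payload, (list, tuple)):
        for i, item in enumerate(payload):
            findings.extend(find_forbidden(item, f"{path}[{i}]"))
    elif isinstance(payload, str) and TOKEN_LIKE.search(payload):
        findings.append(path or "<root>")
    return findings
```

A check like this reports paths rather than values, so a failing test or rejected event does not itself leak the material it found.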
Done when event registry tests or parser coverage prove the definitions are
valid and reviewable.
## Wire Custodian Definition Safely
```task
id: ACTIVITY-WP-0007-T04
status: todo
priority: medium
```
Accept the Custodian-owned disabled draft definition:
`/home/worsch/the-custodian/activity-definitions/ops-service-inventory-probes.md`
Requirements:
- support sync/parse with
`ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`
- keep the definition disabled until resolver, sink, and deployment wiring pass
- add parser/sync tests for external definition directories
- ensure manual trigger of a disabled definition in test/dev can produce fixture
evidence without enabling the production schedule
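The last requirement amounts to a small gate in front of run execution, sketched here. The trigger and environment labels (`"schedule"`, `"manual"`, `"test"`, `"dev"`, `"production"`) and the function name `may_run` are assumptions for illustration.

```python
def may_run(enabled, trigger, environment):
    """Decide whether a definition may execute.

    Assumptions: trigger is "schedule" or "manual"; environment is
    "test", "dev", or "production".
    """
    if enabled:
        return True
    # A disabled definition never runs on schedule and never in production;
    # it may only be triggered manually in test/dev.
    return trigger == "manual" and environment in ("test", "dev")
```

For example, `may_run(False, "manual", "dev")` allows a fixture run, while `may_run(False, "schedule", "production")` keeps the production schedule inactive.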
Done when activity-core can scan the Custodian definition path without enabling
it prematurely.
## Wire Railiance Runtime Inputs
```task
id: ACTIVITY-WP-0007-T05
status: todo
priority: medium
```
Wire the production deployment only after the local resolver/sink tests pass.
Scope:
- project the new disabled Custodian definition into
`actcore-external-activity-definitions`
- decide how the worker sees the inventory input:
  - generated ConfigMap from `service-inventory.yml`
  - mounted repo snapshot
  - State Hub endpoint if Custodian later exposes the inventory
- add config placeholders for `OPS_INVENTORY_PATH`, `INTER_HUB_URL`, and widget
mapping
- keep `OPS_HUB_KEY` in Secret only
Done when the Railiance worker can see the disabled definition and inventory
input without leaking secrets or activating the schedule early.
## Close Safety And Handoff Gates
```task
id: ACTIVITY-WP-0007-T06
status: wait
priority: medium
```
Complete the operational handoff only after the safety gates are satisfied.
Acceptance criteria:
- a manual trigger of the disabled definition runs the ops inventory probe
against fixture or non-production inventory data and produces compact
non-secret evidence
- State Hub progress receives one `ops_inventory_probe` summary per run
- Inter-Hub submission is either implemented behind config/secret gates or
explicitly deferred with a clean sink result
- the activity-core worker can sync the Custodian definition without enabling it
prematurely
- no k8s, ssh, tunnel, or authenticated command execution is required for the
first version
- `CUST-WP-0047-T07` has enough evidence to move from `progress` toward done
This task waits on the implementation tasks above and, for final Inter-Hub
activation, the operator-gated ops-hub widget/API-key path in `CUST-WP-0047`.
## Review Verdict
activity-core should provide this as a bounded probe-and-evidence capability.
It should not provide a general operational execution engine. The first useful
slice is safe and valuable if it remains HTTP/HTTPS-only, non-secret, disabled
until explicitly wired, and idempotent in its evidence output.