activity-core/workplans/ACTIVITY-WP-0007-ops-inventory-probe-runner.md

id: ACTIVITY-WP-0007
type: workplan
title: Ops Inventory Probe Runner
domain: custodian
repo: activity-core
status: active
owner: codex
topic_slug: custodian
created: 2026-06-05
updated: 2026-06-07
state_hub_workstream_id: "c91a0946-92f9-4b41-8a92-005b29952916"

ACTIVITY-WP-0007 - Ops Inventory Probe Runner

Context

Custodian CUST-WP-0047 introduced an inventory-first ops-hub slice: a non-secret service inventory file, a generated service catalog view, and a disabled draft ActivityDefinition for repeatable service inventory probes. State Hub message d543c39c-1f04-4f1e-a1e4-0d5b40503525 handed the activity-core portion to this repo.

The request fits activity-core only if it stays narrow. activity-core should provide scheduled policy, bounded context resolution, deterministic lightweight HTTP/HTTPS checks, and non-secret evidence emission. It must not become an ops executor, secret handler, Inter-Hub operator, k8s/ssh/tunnel runner, or service inventory authority.

Existing activity-core capabilities that make this feasible:

  • cron/manual ActivityDefinitions and Temporal orchestration
  • external definition scanning via ACTIVITY_DEFINITION_DIRS
  • static context sources and pluggable context resolvers
  • State Hub context resolver and State Hub progress report sink pattern
  • working-memory sink, NATS/EventEnvelope routing, and event type registry

Known gaps this workplan closes:

  • no ops-inventory resolver for service-inventory.yml
  • no deterministic endpoint/access-path probe result model
  • no non-LLM evidence sink for ops probe summaries
  • no ops evidence event definitions owned by activity-core
  • no Railiance projection for the Custodian probe definition or inventory input

Add Ops Inventory Context Resolver

id: ACTIVITY-WP-0007-T01
status: done
priority: high
state_hub_task_id: "dbe49dfb-f073-4245-8e86-d0355a6bb8bb"

Add a registered context resolver:

  • source type: ops-inventory
  • query: probe_services
  • params: inventory_path, timeout_seconds, include_kinds, allow_network, required

The resolver reads and validates a non-secret service inventory YAML file, initially /home/worsch/the-custodian/ops/service-inventory.yml when present. It produces compact structured output:

{
  "services": [],
  "endpoints": [],
  "summary": {"ok": 0, "degraded": 0, "down": 0, "skipped": 0},
  "generated_at": "..."
}
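
A minimal sketch of how the compact output above could be assembled from per-endpoint results. The function name build_probe_output and the per-endpoint "status" key are assumptions for illustration; the actual shape lives in src/activity_core/context_resolvers/ops_inventory.py and may differ.

```python
from collections import Counter
from datetime import datetime, timezone

# The four summary buckets named in the resolver output.
STATUSES = ("ok", "degraded", "down", "skipped")


def build_probe_output(services, endpoint_results):
    """Assemble the compact resolver output (hypothetical helper).

    Each entry in endpoint_results is assumed to be a dict carrying
    at least a "status" key with one of the STATUSES values.
    """
    counts = Counter(result["status"] for result in endpoint_results)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected probe status: {sorted(unknown)}")
    return {
        "services": list(services),
        "endpoints": list(endpoint_results),
        # Always emit every bucket so consumers see a stable shape.
        "summary": {status: counts.get(status, 0) for status in STATUSES},
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```

Emitting all four summary keys even when a count is zero keeps the output shape stable for downstream sinks.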

First implementation scope:

  • HTTP/HTTPS endpoint probes only
  • expected status and expected signal checks only
  • non-HTTP, k8s, ssh, tunnel, and authenticated access paths return skipped / unsupported, not failed
  • a missing optional inventory returns {} or a skipped summary; it is treated as a failure only when the context source is marked required
  • no response bodies, cookies, authorization headers, tokens, or command output are stored

Done when fixture-based resolver tests cover ok, expected-status mismatch, expected-signal mismatch, network/down, unsupported, and optional/required inventory failure behavior.
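
The scope rules above can be sketched as a pure classification step that takes an already-observed result and never touches the response body. The field names url, expected_status, expected_signal, and auth_required are hypothetical inventory keys, and mapping both mismatch cases to degraded is an assumption; the resolver's real mapping may differ.

```python
SUPPORTED_SCHEMES = ("http://", "https://")


def classify_endpoint(endpoint, status_code=None, signal_found=None, error=None):
    """Map one probe observation onto ok / degraded / down / skipped.

    Only the outcome is returned; no body, headers, or cookies are kept.
    """
    result = {"id": endpoint["id"]}
    url = endpoint.get("url", "")
    # Non-HTTP and authenticated paths are skipped, not failed.
    if not url.startswith(SUPPORTED_SCHEMES) or endpoint.get("auth_required"):
        return {**result, "status": "skipped", "reason": "unsupported"}
    # A transport error or missing status means the endpoint is unreachable.
    if error is not None or status_code is None:
        return {**result, "status": "down", "reason": "network"}
    if status_code != endpoint.get("expected_status", 200):
        return {**result, "status": "degraded", "reason": "status_mismatch"}
    if endpoint.get("expected_signal") and not signal_found:
        return {**result, "status": "degraded", "reason": "signal_mismatch"}
    return {**result, "status": "ok"}
```

Keeping classification separate from the network call makes the fixture-based tests described above straightforward, since each case is just a different set of arguments.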

2026-06-05: Completed the first resolver slice. Added src/activity_core/context_resolvers/ops_inventory.py, registered source type ops-inventory, and covered ok/degraded/down/skipped results plus required vs optional inventory failure and no-secret output behavior.

Add Ops Evidence Sink

id: ACTIVITY-WP-0007-T02
status: done
priority: high
state_hub_task_id: "c6b5f49d-6f05-4be9-a968-de42195170cb"

Add a deterministic non-LLM evidence sink for compact probe results.

Initial sink behavior:

  • sink type: state-hub-progress
  • State Hub event type: ops_inventory_probe
  • idempotency key: activity_core_run_id + service_id + endpoint_id/access_path_id + event_type
  • detail contains compact non-secret results only
  • one summary progress event per run is acceptable for the first version
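
As an illustration of the idempotency key above, the four fields can be joined with an unambiguous separator and hashed. The hashing step and the function name idempotency_key are assumptions for this sketch; the source only specifies which fields make up the key.

```python
import hashlib

# ASCII unit separator: unlikely to appear inside any of the id fields,
# so ("a", "bc") and ("ab", "c") cannot collapse into the same string.
_SEP = "\x1f"


def idempotency_key(run_id, service_id, target_id, event_type):
    """Derive a stable key from run, service, endpoint/access path, and event type."""
    raw = _SEP.join((run_id, service_id, target_id, event_type))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Re-posting the same evidence for the same run then yields the same key, so the receiving side can deduplicate.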

Prepare the contract for later Inter-Hub submission without making it mandatory:

  • event names: ops-service-observed, ops-endpoint-verified, ops-access-path-checked, ops-backup-verified, ops-inventory-drift
  • Inter-Hub mode requires INTER_HUB_URL, OPS_HUB_KEY supplied from a Secret, and widget/capability mapping config
  • missing Inter-Hub config skips cleanly with an explicit sink result

Done when sink idempotency, State Hub fallback posting, missing Inter-Hub config, and no-secret-leak behavior are covered by tests.

2026-06-05: Completed the State Hub fallback sink slice. Added src/activity_core/ops_evidence_sinks.py, a persist_ops_evidence Temporal activity, workflow/worker wiring, idempotent ops_inventory_probe progress posting, missing-Inter-Hub-config skip behavior, and no-secret compaction tests.

Register Ops Evidence Event Definitions

id: ACTIVITY-WP-0007-T03
status: done
priority: medium
state_hub_task_id: "70eb470e-9b0a-448f-ae3a-f5b1bed49e04"

Add activity-core-owned event type definitions for ops evidence so producers and future widgets have a stable contract:

  • ops-service-observed
  • ops-endpoint-verified
  • ops-access-path-checked
  • ops-backup-verified
  • ops-inventory-drift

Each definition must document:

  • publisher intent
  • non-secret attribute schema
  • idempotency fields
  • examples for success, degraded, down, skipped, and drift where applicable
  • explicit forbidden payload material: secrets, auth headers, cookies, raw response bodies, command output, and token-like values
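
One way to make the forbidden-material rule testable is a recursive scan over a candidate payload. This is a sketch under stated assumptions: the key list and the token pattern are illustrative heuristics, not the registry's actual rules, and find_forbidden is a hypothetical helper name.

```python
import re

# Illustrative key names; a real list would come from the event definitions.
FORBIDDEN_KEYS = {
    "authorization", "cookie", "set-cookie", "body", "response_body",
    "command_output", "stdout", "stderr", "token", "password", "secret",
}
# Crude heuristic: bearer credentials or JWT-shaped strings.
TOKEN_LIKE = re.compile(
    r"\bbearer\s+\S+|\beyJ[\w-]+\.[\w-]+\.[\w-]+", re.IGNORECASE
)


def find_forbidden(payload, path=""):
    """Return the paths of forbidden keys or token-like string values."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else str(key)
            if str(key).lower() in FORBIDDEN_KEYS:
                hits.append(here)  # report the path, never the value
            else:
                hits.extend(find_forbidden(value, here))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_forbidden(item, f"{path}[{index}]"))
    elif isinstance(payload, str) and TOKEN_LIKE.search(payload):
        hits.append(path)
    return hits
```

A parser test can then assert that every documented example payload yields an empty result.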

Done when event registry tests or parser coverage prove the definitions are valid and reviewable.

2026-06-05: Completed. Added the five ops evidence event definitions under event-types/ and parser tests covering required fields and safety language.

Wire Custodian Definition Safely

id: ACTIVITY-WP-0007-T04
status: done
priority: medium
state_hub_task_id: "45132f9f-da3c-44f1-a488-195aa0e46428"

Accept the Custodian-owned disabled draft definition:

/home/worsch/the-custodian/activity-definitions/ops-service-inventory-probes.md

Requirements:

  • support sync/parse with ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian
  • keep the definition disabled until resolver, sink, and deployment wiring pass
  • add parser/sync tests for external definition directories
  • ensure manual trigger of a disabled definition in test/dev can produce fixture evidence without enabling the production schedule

Done when activity-core can scan the Custodian definition path without enabling it prematurely.
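
The gating described in these requirements can be pictured as two separate checks: one for schedule registration and one for manual triggers. The names DefinitionState, should_register_schedule, and may_trigger_manually are hypothetical and do not reflect activity-core's actual definition model.

```python
from dataclasses import dataclass
from typing import Optional

# Environments where a disabled definition may still be triggered by hand.
NON_PRODUCTION = ("test", "dev")


@dataclass(frozen=True)
class DefinitionState:
    name: str
    enabled: bool
    cron: Optional[str] = None


def should_register_schedule(definition):
    """A disabled definition never gets a production schedule."""
    return definition.enabled and definition.cron is not None


def may_trigger_manually(definition, environment):
    """Allow manual runs of a disabled definition only outside production."""
    return definition.enabled or environment in NON_PRODUCTION
```

Keeping the two decisions apart lets a test/dev manual run produce fixture evidence while the cron schedule stays unregistered.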

2026-06-05: Started. Added test coverage showing that an external root configured as in ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian can be scanned for a disabled ops-service-inventory-probes.md definition that carries an ops-inventory context source and an explicit state-hub-progress evidence sink.

2026-06-05: Completed. The Railiance-projected disabled definition now uses the ops-inventory resolver and explicit state-hub-progress evidence sink. Tests prove the disabled definition can resolve fixture inventory data and emit one compact ops_inventory_probe State Hub progress event without enabling the production schedule.

Wire Railiance Runtime Inputs

id: ACTIVITY-WP-0007-T05
status: done
priority: medium
state_hub_task_id: "474564be-a447-4bdf-b995-168f7a93e515"

Wire the production deployment only after the local resolver/sink tests pass.

Scope:

  • project the new disabled Custodian definition into actcore-external-activity-definitions
  • decide how the worker sees the inventory input:
    • generated ConfigMap from service-inventory.yml
    • mounted repo snapshot
    • State Hub endpoint if Custodian later exposes the inventory
  • add config placeholders for OPS_INVENTORY_PATH, INTER_HUB_URL, and widget mapping
  • keep OPS_HUB_KEY in Secret only

Done when the Railiance worker can see the disabled definition and inventory input without leaking secrets or activating the schedule early.

2026-06-05: Completed the first production wiring slice. 20-runtime.yaml projects the disabled ops probe definition, runtime config placeholders (OPS_INVENTORY_PATH, INTER_HUB_URL, OPS_HUB_WIDGET_MAPPING), and a non-secret actcore-ops-service-inventory ConfigMap snapshot. The worker mounts the inventory at /etc/activity-core/ops, and bootstrap-secrets.sh keeps OPS_HUB_KEY as an empty Secret-only placeholder until operator provisioning.

Close Safety And Handoff Gates

id: ACTIVITY-WP-0007-T06
status: wait
priority: medium
state_hub_task_id: "d15fc947-3fbe-4269-93c6-d98577352149"

Complete the operational handoff only after the safety gates are satisfied.

Acceptance criteria:

  • a disabled manual trigger runs the ops inventory probe against fixture or non-production inventory data and produces compact non-secret evidence
  • State Hub progress receives one ops_inventory_probe summary per run
  • Inter-Hub submission is either implemented behind config/secret gates or explicitly deferred with a clean sink result
  • the activity-core worker can sync the Custodian definition without enabling it prematurely
  • no k8s, ssh, tunnel, or authenticated command execution is required for the first version
  • CUST-WP-0047-T07 has enough evidence to move from progress toward done

This task waits on the implementation tasks above and, for final Inter-Hub activation, the operator-gated ops-hub widget/API-key path in CUST-WP-0047.

2026-06-05: The local implementation gates are now satisfied and tested. Live closure remains waiting on applying the updated Railiance manifests and on the operator-gated Inter-Hub ops-hub widget/API-key path.

2026-06-07: Added the remaining deployment handoff for this gate while investigating the missing daily WSJF run. The Railiance runtime projection now includes the daily WSJF definition alongside the disabled ops probe definition, schema/config support needed by the shared worker, and a working-memory PVC. No live ops_inventory_probe event exists yet, and the cluster currently lacks an activity-core namespace. Cross-repo closure tasks were posted via State Hub to railiance-cluster (53e78702), inter-hub (f3ec4a36), the-custodian (7a5d4e62), state-hub (dc10704f), and activity-core (28d11021). This task remains waiting on live manifest application, actcore-sync, a disabled manual probe trigger, State Hub ops_inventory_probe evidence, and an Inter-Hub activation or explicit defer decision.

Review Verdict

activity-core should provide this as a bounded probe-and-evidence capability. It should not provide a general operational execution engine. The first useful slice is safe and valuable if it remains HTTP/HTTPS-only, non-secret, disabled until explicitly wired, and idempotent in its evidence output.