generated from coulomb/repo-seed
new workplan rescan
313
workplans/RAIL-FAB-WP-0011-operational-rescan-loops.md
Normal file
@@ -0,0 +1,313 @@
---
id: RAIL-FAB-WP-0011
type: workplan
title: "Operational Rescan Loops"
domain: railiance
repo: railiance-fabric
status: ready
owner: codex
topic_slug: railiance
planning_priority: high
planning_order: 11
created: "2026-05-19"
updated: "2026-05-19"
state_hub_workstream_id: "b6eb92ee-1aba-49b4-8580-ab15782cb970"
---

# RAIL-FAB-WP-0011 - Operational Rescan Loops

## Goal

Turn the repo reality scanner into a regular operational loop that can rescan
the local Fabric estate, compare each repo against its latest known discovery
state, store useful baselines, surface changes for review, and update the
registry without requiring manual JSON handoffs between runs.

The desired outcome is a boring, repeatable command path that can be run by a
human, cron, Codex automation, or a later State Hub operator. A run should answer
three practical questions:

- what changed in the observed repo reality?
- what needs review before acceptance?
- which repos failed, were skipped, or are becoming stale?

## Background

`RAIL-FAB-WP-0010` established the repo reality scanner, deterministic and
LLM-assisted extraction, reconciliation, registry discovery snapshot storage,
multi-repo `registry scan-manifest`, and the first small rollout dry-run.

The scanner can already:

- scan one repo or a manifest of repos
- write discovery snapshots to files
- reconcile against a previous snapshot directory
- ingest discovery snapshots into the Fabric registry
- accept candidates that are already review-safe
- produce concise per-repo summaries

The remaining operational gap is that repeated rescans still require too much
manual setup: choosing a snapshot directory, exporting previous snapshots,
remembering when to ingest, and turning run summaries into a persistent review
backlog.

This workplan closes that gap by making the registry and CLI cooperate around
baselines, previous-from-registry diffs, run reports, stale/failure health, and
automation-safe modes.

## Design Principles

- Default to safe dry-runs and explicit ingest/accept actions.
- Prefer the registry as the durable source of prior discovery state.
- Keep local snapshot caches useful but optional.
- Make unchanged runs cheap and quiet.
- Treat conflicts, tombstones, LLM failures, and missing repos as review
  signals, not as silent noise.
- Preserve per-repo failure isolation in every operational mode.
- Keep the loop automation-friendly: stable exit codes, machine-readable
  reports, lock/overlap behavior, and concise human summaries.
- Avoid accepting or projecting discovery data unless review state and policy
  allow it.

## Proposed Operational Loop

1. Read `registry/local-repos.yaml` or another onboarding manifest.
2. For each selected repo, determine the previous discovery snapshot from:
   - the latest registry snapshot for the same repo/profile, or
   - a configured local snapshot cache, or
   - no previous snapshot on first baseline.
3. Run the scanner with deterministic rules and explicitly enabled connectors
   or LLM profile.
4. Reconcile current evidence against previous evidence.
5. Write an operational run report with per-repo results, diffs, failures,
   skipped LLM state, review artifact counts, and accepted/ingested ids.
6. Optionally ingest changed or baseline snapshots into the registry.
7. Optionally project candidates only when policy says they are acceptable.
8. Expose the run result through registry/status endpoints and State Hub
   progress notes.

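The core of steps 2 to 4 can be sketched as a loop that isolates failures per repo. This is a minimal illustration, not the scanner's actual code: `run_rescan_loop`, `RepoResult`, the status strings, and the three injected callables are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RepoResult:
    """Per-repo outcome of one loop iteration (hypothetical shape)."""
    repo: str
    status: str           # "baseline", "unchanged", "changed", or "failed"
    previous_source: str  # "registry", "cache", or "none"
    error: Optional[str] = None


def run_rescan_loop(
    repos: List[str],
    find_previous: Callable[[str], Tuple[str, Optional[dict]]],
    scan: Callable[[str], dict],
    has_changes: Callable[[dict, dict], bool],
) -> List[RepoResult]:
    """Scan each repo and classify it against its previous snapshot.

    An exception raised for one repo is recorded on that repo's result
    and does not stop the remaining repos from being scanned.
    """
    results = []
    for repo in repos:
        try:
            source, previous = find_previous(repo)
            current = scan(repo)
            if previous is None:
                status = "baseline"
            elif has_changes(previous, current):
                status = "changed"
            else:
                status = "unchanged"
            results.append(RepoResult(repo, status, source))
        except Exception as exc:  # failure isolation: keep going
            results.append(RepoResult(repo, "failed", "none", str(exc)))
    return results
```

Passing the lookup, scan, and reconcile steps in as callables keeps the sketch independent of how the real CLI wires them together, and makes the first-baseline and failure paths easy to exercise in tests.
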
## Scope

In scope:

- CLI and registry support for previous-from-registry rescans.
- Standard local snapshot/report directory conventions.
- Run reports that can be consumed by humans and automation.
- Idempotent ingest behavior for unchanged runs.
- Review-oriented summary output and health status.
- Documentation and tests for recurring use.

Out of scope for this workplan:

- A full review UI for discovery conflicts and tombstones.
- Live server/deployment inventory connectors beyond existing local connector
  mechanics.
- Auto-generating repo-owned Fabric declaration patches from accepted
  discoveries.
- Enabling external LLM providers by default.

Those are likely follow-up workplans once the operational loop produces steady
baseline data.

## Tasks

### T01 - Snapshot Cache And Baseline Conventions

```task
id: RAIL-FAB-WP-0011-T01
status: todo
priority: high
state_hub_task_id: "cb6f05b6-ae8c-47b1-aead-4505276b089f"
```

Define and implement the local baseline conventions for repeated discovery
scans.

Acceptance notes:

- Define a standard local directory, likely `.fabric-discovery/`, for snapshot
  caches and run reports.
- Decide whether the directory is ignored, partially checked in, or fully local
  operational state; document the reason.
- Add CLI defaults or manifest configuration so `scan-manifest` can write and
  read this directory without repeated flags.
- Preserve explicit `--output-dir` and `--previous-dir` overrides.
- Ensure output filenames remain stable by repo slug and scan profile.
- Add tests that prove first-baseline runs create predictable snapshot/report
  paths without affecting registry state in dry-run mode.

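One way the "stable by repo slug and scan profile" requirement could look is sketched below. The `snapshots/` and `reports/` subdirectories, the `<slug>.<profile>.json` filename pattern, and the helper names are assumptions for illustration; only the `.fabric-discovery/` directory name comes from the notes above, and even that is still marked as "likely".

```python
import re
from pathlib import Path

# Directory name taken from the acceptance notes; the layout below it is assumed.
DISCOVERY_DIR = Path(".fabric-discovery")


def slugify(name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a slug from {name!r}")
    return slug


def snapshot_path(repo: str, profile: str, root: Path = DISCOVERY_DIR) -> Path:
    """Deterministic snapshot location for one repo and scan profile."""
    return root / "snapshots" / f"{slugify(repo)}.{slugify(profile)}.json"


def report_path(run_id: str, root: Path = DISCOVERY_DIR) -> Path:
    """Deterministic run report location for one run identifier."""
    return root / "reports" / f"{slugify(run_id)}.json"
```

Because the paths depend only on the repo name and profile, a second run overwrites the same file instead of accumulating timestamped copies, which keeps the cache predictable for `--previous-dir` style lookups.
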
### T02 - Previous-From-Registry Reconciliation

```task
id: RAIL-FAB-WP-0011-T02
status: todo
priority: high
state_hub_task_id: "ee8e5437-3c87-473c-99a0-84d947d09249"
```

Allow manifest rescans to diff against the latest stored discovery snapshot in
the registry, so operators do not need to export JSON before every run.

Acceptance notes:

- Add a `scan-manifest` option such as `--previous-source registry|dir|none`
  or `--previous-from-registry`.
- Fetch the latest discovery snapshot for each repo/profile through existing
  registry discovery APIs.
- Fall back cleanly when a repo has no previous registry snapshot and mark the
  run as a first baseline for that repo.
- Keep local `--previous-dir` support for offline or file-based workflows.
- Include previous snapshot id/source in per-repo results and run reports.
- Add tests for registry previous found, registry previous missing, registry
  unavailable, and file-directory fallback.

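The source selection described in these notes might be resolved along the following lines. This is a sketch under stated assumptions: `resolve_previous`, `RegistryUnavailable`, the `fetch_latest` callable, and the `<repo>.<profile>.json` filename are hypothetical, and the real registry discovery API is not shown.

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple


class RegistryUnavailable(Exception):
    """Raised by the (hypothetical) registry client when it cannot be reached."""


def resolve_previous(
    repo: str,
    profile: str,
    source: str,
    fetch_latest: Callable[[str, str], Optional[dict]],
    previous_dir: Optional[str] = None,
) -> Tuple[str, Optional[dict]]:
    """Return (source_used, snapshot); a snapshot of None means first baseline."""
    if source == "none":
        return "none", None
    if source == "registry":
        # RegistryUnavailable is allowed to propagate so the caller can
        # record it as a failure instead of mistaking it for a baseline.
        snapshot = fetch_latest(repo, profile)
        return ("registry", snapshot) if snapshot is not None else ("none", None)
    if source == "dir":
        if previous_dir is None:
            raise ValueError("source 'dir' requires previous_dir")
        path = Path(previous_dir) / f"{repo}.{profile}.json"
        if not path.exists():
            return "none", None
        return "dir", json.loads(path.read_text(encoding="utf-8"))
    raise ValueError(f"unknown previous source: {source!r}")
```

The distinction between "registry answered with nothing" and "registry did not answer" matters here: the first is a legitimate first baseline, while the second should surface as a failure so an outage never produces a wave of spurious baselines.
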
### T03 - Operational Run Reports

```task
id: RAIL-FAB-WP-0011-T03
status: todo
priority: high
state_hub_task_id: "d11621f2-7610-4060-863d-dbf86858a3e6"
```

Persist each rescan loop as a report that can drive review, State Hub notes,
and future automation.

Acceptance notes:

- Add a report schema or documented JSON shape for manifest rescan runs.
- Record command profile, manifest path, selected repos, generated timestamp,
  scanner version, registry URL, dry-run/ingest/accept flags, and LLM budget
  policy.
- For each repo, record commit, previous source/id, current output path,
  discovery snapshot id, accepted graph snapshot id, candidate counts, diff
  counts, review artifact counts, connector run summaries, and errors.
- Add `--report-output` and a default report path under the standard
  operational directory.
- Keep console output concise while making the JSON report complete.
- Add tests for report content in success, partial failure, and no-change runs.

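A possible JSON shape for the report, expressed as dataclasses, is sketched here. The field list mirrors the acceptance notes, but every field name, type, and the `RunReport`/`RepoReport` class names are assumptions, since the schema itself is still to be defined by this task.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class RepoReport:
    """Per-repo section of a run report (field names are illustrative)."""
    repo: str
    commit: Optional[str]
    previous_source: str
    previous_snapshot_id: Optional[str]
    output_path: Optional[str]
    discovery_snapshot_id: Optional[str] = None
    accepted_graph_snapshot_id: Optional[str] = None
    candidate_counts: Dict[str, int] = field(default_factory=dict)
    diff_counts: Dict[str, int] = field(default_factory=dict)
    review_artifact_counts: Dict[str, int] = field(default_factory=dict)
    connector_runs: List[dict] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


@dataclass
class RunReport:
    """Top-level report for one manifest rescan run."""
    profile: str
    manifest_path: str
    selected_repos: List[str]
    generated_at: str  # ISO 8601 timestamp, supplied by the caller
    scanner_version: str
    registry_url: Optional[str]
    dry_run: bool
    ingest: bool
    accept: bool
    llm_budget_policy: Optional[str]
    repos: List[RepoReport] = field(default_factory=list)


def to_json(report: RunReport) -> str:
    """Serialize with sorted keys so reports diff cleanly between runs."""
    return json.dumps(asdict(report), indent=2, sort_keys=True)
```

Sorting keys and keeping the timestamp as an explicit input makes two reports from unchanged runs differ only in `generated_at`, which is convenient for the no-change test case.
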
### T04 - Idempotent Ingest And Acceptance Policies

```task
id: RAIL-FAB-WP-0011-T04
status: todo
priority: high
state_hub_task_id: "c64daf3b-a5ec-4ea9-82b6-f8f352eb9283"
```

Make registry writes safe for recurring execution by avoiding unnecessary
snapshot churn and by separating ingest from acceptance policy.

Acceptance notes:

- Add a mode to skip ingesting unchanged discovery snapshots unless explicitly
  requested.
- Detect unchanged snapshots by reconciliation diff and/or normalized snapshot
  fingerprint.
- Keep an explicit first-baseline ingest mode for repos with no prior discovery
  snapshot.
- Add acceptance policy controls such as accepted-only, no-conflicts,
  no-tombstones, selected keys, or selected review states.
- Prevent `--accept` from projecting conflicted, needs-review, or low-confidence
  candidates unless an explicit override is supplied.
- Report why a repo was ingested, skipped unchanged, blocked for review, or
  accepted.
- Add tests covering unchanged skip, baseline ingest, changed ingest, blocked
  acceptance, and explicit acceptance override.

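The "normalized snapshot fingerprint" and the acceptance gate could be approached as in the sketch below. It assumes snapshots are JSON-compatible dicts, that volatile fields live at the top level under the names in `VOLATILE_KEYS`, and that candidates carry a `review_state` key; all of these, along with the function names and decision strings, are hypothetical.

```python
import hashlib
import json
from typing import List, Optional

# Assumed top-level fields that change on every run and must not affect
# the fingerprint; the real snapshot schema may use different names.
VOLATILE_KEYS = {"generated_at", "snapshot_id"}

# Review states that block acceptance unless explicitly overridden.
BLOCKING_STATES = {"conflicted", "needs-review", "low-confidence"}


def fingerprint(snapshot: dict) -> str:
    """SHA-256 over a canonical JSON form with volatile top-level keys removed."""
    stable = {k: v for k, v in snapshot.items() if k not in VOLATILE_KEYS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest_decision(previous: Optional[dict], current: dict, force: bool = False) -> str:
    """Return the reason string that would be recorded in the run report."""
    if previous is None:
        return "baseline"
    if force:
        return "forced"
    if fingerprint(previous) == fingerprint(current):
        return "skipped-unchanged"
    return "changed"


def acceptable(candidates: List[dict], override: bool = False) -> List[dict]:
    """Filter out candidates in a blocking review state unless overridden."""
    if override:
        return list(candidates)
    return [c for c in candidates if c.get("review_state") not in BLOCKING_STATES]
```

Hashing a key-sorted serialization means dict ordering never causes a false "changed" result. Ordering inside lists still affects the hash, so the scanner would need to emit lists in a stable order for this approach to hold.
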
### T05 - Rescan Health And Registry Surfaces

```task
id: RAIL-FAB-WP-0011-T05
status: todo
priority: medium
state_hub_task_id: "b3440439-b9c4-4753-98bc-618d1934ed4e"
```

Expose operational rescan state through the registry so humans and tools can
see freshness, failures, and review load.

Acceptance notes:

- Store or derive latest rescan metadata per repo/profile.
- Add registry inventory/status fields for latest discovery run, latest diff
  counts, latest failure, stale age, and review artifact counts.
- Add an endpoint or CLI view for repos needing review.
- Add an endpoint or CLI view for repos stale beyond a configurable age.
- Keep existing graph and discovery snapshot APIs backward compatible.
- Add tests for inventory/status output after baseline, changed, failed, and
  stale runs.

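The "stale beyond a configurable age" view reduces to a small comparison once the latest scan time per repo is known. The sketch below assumes that data arrives as a mapping from repo name to a timezone-aware `datetime` (or `None` for never scanned); `stale_repos` and its parameters are hypothetical names, and treating never-scanned repos as stale is a design assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_repos(
    last_scanned: Dict[str, Optional[datetime]],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return repos never scanned or last scanned more than max_age ago."""
    now = now or datetime.now(timezone.utc)
    stale = [
        repo
        for repo, scanned_at in last_scanned.items()
        if scanned_at is None or now - scanned_at > max_age
    ]
    return sorted(stale)
```

Accepting `now` as a parameter keeps the function deterministic in tests, which matters for the "stale runs" test case listed above.
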
### T06 - Automation-Safe Command Mode

```task
id: RAIL-FAB-WP-0011-T06
status: todo
priority: medium
state_hub_task_id: "7461e6f1-7ef0-4947-9cfa-67f463e9aa00"
```

Make the rescan loop safe to run from cron, Codex automations, or a State Hub
operator without bespoke shell glue.

Acceptance notes:

- Add a documented command recipe, script, or subcommand for the standard local
  rescan loop.
- Define stable exit codes for success, changes found, review required,
  partial repo failures, and infrastructure failure.
- Add lock/overlap protection so two local rescan loops do not run against the
  same manifest concurrently.
- Keep dry-run as the safe default unless ingest/accept flags are explicit.
- Emit concise human output and machine-readable JSON consistently.
- Add tests for exit-code policy and lock behavior where practical.

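An exit-code policy and a simple lock could be prototyped as below. The numeric values, the precedence order, the status strings, and the names `ExitCode`, `exit_code`, and `manifest_lock` are placeholders for illustration; in particular, the meaning of exit code `0` is still listed as an open question in this workplan.

```python
import os
from contextlib import contextmanager
from enum import IntEnum
from typing import Iterable


class ExitCode(IntEnum):
    """Placeholder values; the real policy is still to be decided."""
    OK = 0
    CHANGES_FOUND = 10
    REVIEW_REQUIRED = 11
    PARTIAL_FAILURE = 20
    INFRA_FAILURE = 30


def exit_code(statuses: Iterable[str], infra_failed: bool = False) -> ExitCode:
    """Map per-repo statuses to one process exit code, most severe first."""
    seen = set(statuses)
    if infra_failed:
        return ExitCode.INFRA_FAILURE
    if "failed" in seen:
        return ExitCode.PARTIAL_FAILURE
    if "review-required" in seen:
        return ExitCode.REVIEW_REQUIRED
    if seen & {"changed", "baseline"}:
        return ExitCode.CHANGES_FOUND
    return ExitCode.OK


@contextmanager
def manifest_lock(lock_path: str):
    """Hold an exclusive lock file; raises FileExistsError if already held."""
    # O_CREAT | O_EXCL makes creation atomic: only one process can succeed.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

This lock does not recover from a crashed run that leaves the file behind; a production version would likely need a staleness check based on the recorded PID or file age.
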
### T07 - Documentation, Rollout, And First Baseline

```task
id: RAIL-FAB-WP-0011-T07
status: todo
priority: medium
state_hub_task_id: "9c6f8e33-dd13-48ca-815d-73fd09b25423"
```

Document the operational loop and run a first controlled baseline against a
small local repo set before broad adoption.

Acceptance notes:

- Document the standard local rescan workflow, registry-backed workflow,
  report format, exit codes, and failure handling.
- Document how to use deterministic-only mode, connector mode, and LLM-capped
  mode safely.
- Document the manual review steps before acceptance.
- Run a first baseline loop against a small allowlist such as
  `repo-scoping`, `llm-connect`, and `railiance-fabric`.
- Record the resulting report summary and follow-up backlog in docs and State
  Hub progress.
- Mark this workplan ready for broader all-local-repo rollout only after the
  small baseline loop is repeatable.

## Open Questions

- Should local discovery caches be committed, ignored, or treated as operator
  runtime state only?
- Should the registry store every run report or only latest run metadata?
- What is the right default stale age for local repos: daily, weekly, or based
  on commit changes?
- Should exit code `0` mean "no infrastructure failure" or "no changes found"?
- Which acceptance policies are safe enough for unattended operation?
- Should State Hub receive one progress note per run or only when changes,
  failures, or review-required conditions appear?

## Close Criteria

- A single documented command can perform a safe repeated rescan loop across a
  manifest.
- The command can diff against registry-stored previous discovery snapshots.
- First-baseline, unchanged, changed, failed, and review-required repos are
  distinguishable in console output, JSON reports, and registry status.
- Repeated runs do not create noisy duplicate registry snapshots by default.
- Acceptance remains explicit and policy-gated.
- Tests cover the recurring loop behavior well enough to trust automation.