new workplan rescan

2026-05-20 00:57:26 +02:00
parent 6746943c0b
commit 50810ffd54

@@ -0,0 +1,313 @@
---
id: RAIL-FAB-WP-0011
type: workplan
title: "Operational Rescan Loops"
domain: railiance
repo: railiance-fabric
status: ready
owner: codex
topic_slug: railiance
planning_priority: high
planning_order: 11
created: "2026-05-19"
updated: "2026-05-19"
state_hub_workstream_id: "b6eb92ee-1aba-49b4-8580-ab15782cb970"
---
# RAIL-FAB-WP-0011 - Operational Rescan Loops
## Goal
Turn the repo reality scanner into a regular operational loop that can rescan
the local Fabric estate, compare each repo against its latest known discovery
state, store useful baselines, surface changes for review, and update the
registry without requiring manual JSON handoffs between runs.

The desired outcome is a boring, repeatable command path that can be run by a
human, cron, Codex automation, or a later State Hub operator. A run should answer
three practical questions:

- what changed in the observed repo reality?
- what needs review before acceptance?
- which repos failed, were skipped, or are becoming stale?
## Background
`RAIL-FAB-WP-0010` established the repo reality scanner, deterministic and
LLM-assisted extraction, reconciliation, registry discovery snapshot storage,
multi-repo `registry scan-manifest`, and the first small rollout dry-run.

The scanner can already:

- scan one repo or a manifest of repos
- write discovery snapshots to files
- reconcile against a previous snapshot directory
- ingest discovery snapshots into the Fabric registry
- accept candidates that are already review-safe
- produce concise per-repo summaries

The remaining operational gap is that repeated rescans still require too much
manual setup: choosing a snapshot directory, exporting previous snapshots,
remembering when to ingest, and turning run summaries into a persistent review
backlog.

This workplan closes that gap by making the registry and CLI cooperate around
baselines, previous-from-registry diffs, run reports, stale/failure health, and
automation-safe modes.
## Design Principles
- Default to safe dry-runs and explicit ingest/accept actions.
- Prefer the registry as the durable source of prior discovery state.
- Keep local snapshot caches useful but optional.
- Make unchanged runs cheap and quiet.
- Treat conflicts, tombstones, LLM failures, and missing repos as review
signals, not as silent noise.
- Preserve per-repo failure isolation in every operational mode.
- Keep the loop automation-friendly: stable exit codes, machine-readable
reports, lock/overlap behavior, and concise human summaries.
- Avoid accepting or projecting discovery data unless review state and policy
allow it.
## Proposed Operational Loop
1. Read `registry/local-repos.yaml` or another onboarding manifest.
2. For each selected repo, determine the previous discovery snapshot from:
   - the latest registry snapshot for the same repo/profile, or
   - a configured local snapshot cache, or
   - nothing, when this is the first baseline for the repo.
3. Run the scanner with deterministic rules, plus any explicitly enabled
   connectors or LLM profile.
4. Reconcile current evidence against previous evidence.
5. Write an operational run report with per-repo results, diffs, failures,
   skipped LLM state, review artifact counts, and accepted/ingested ids.
6. Optionally ingest changed or baseline snapshots into the registry.
7. Optionally project candidates only when policy says they are acceptable.
8. Expose the run result through registry/status endpoints and State Hub
   progress notes.
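
The per-repo failure isolation that this loop depends on can be sketched as follows. This is a minimal illustration, not the scanner's actual code: `run_rescan`, `scan_repo`, and `fake_scan` are hypothetical names, and the real scanner's call signature is not specified in this workplan.

```python
from typing import Callable, Dict, Iterable


def run_rescan(repos: Iterable[str], scan_repo: Callable[[str], dict]) -> Dict[str, dict]:
    """Scan each repo independently and collect one result entry per repo."""
    results: Dict[str, dict] = {}
    for repo in repos:
        try:
            results[repo] = {**scan_repo(repo), "status": "ok"}
        except Exception as exc:
            # Failure isolation: record the error and continue with the next repo.
            results[repo] = {"status": "failed", "error": f"{type(exc).__name__}: {exc}"}
    return results


def fake_scan(repo: str) -> dict:
    """Stand-in scanner used only for this example (hypothetical)."""
    if repo == "broken-repo":
        raise FileNotFoundError("checkout missing")
    return {"diff_count": 0}


results = run_rescan(["railiance-fabric", "broken-repo", "llm-connect"], fake_scan)
for name, entry in results.items():
    print(name, entry["status"])
```

One failing repo produces a `failed` entry in the results while the remaining repos are still scanned, which is what lets a run report list failures alongside successes.
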
## Scope
In scope:

- CLI and registry support for previous-from-registry rescans.
- Standard local snapshot/report directory conventions.
- Run reports that can be consumed by humans and automation.
- Idempotent ingest behavior for unchanged runs.
- Review-oriented summary output and health status.
- Documentation and tests for recurring use.

Out of scope for this workplan:

- A full review UI for discovery conflicts and tombstones.
- Live server/deployment inventory connectors beyond existing local connector
  mechanics.
- Auto-generating repo-owned Fabric declaration patches from accepted
  discoveries.
- Enabling external LLM providers by default.

Those are likely follow-up workplans once the operational loop produces steady
baseline data.
## Tasks
### T01 - Snapshot Cache And Baseline Conventions
```task
id: RAIL-FAB-WP-0011-T01
status: todo
priority: high
state_hub_task_id: "cb6f05b6-ae8c-47b1-aead-4505276b089f"
```
Define and implement the local baseline conventions for repeated discovery
scans.

Acceptance notes:
- Define a standard local directory, likely `.fabric-discovery/`, for snapshot
caches and run reports.
- Decide whether the directory is ignored, partially checked in, or fully local
operational state; document the reason.
- Add CLI defaults or manifest configuration so `scan-manifest` can write and
read this directory without repeated flags.
- Preserve explicit `--output-dir` and `--previous-dir` overrides.
- Ensure output filenames remain stable by repo slug and scan profile.
- Add tests that prove first-baseline runs create predictable snapshot/report
paths without affecting registry state in dry-run mode.
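
One way to keep output filenames stable by repo slug and scan profile is sketched below. The `snapshots/` and `reports/` subdirectories, the `slugify` rule, and the function names are assumptions made for illustration; the workplan only proposes `.fabric-discovery/` as a likely root.

```python
import re
from pathlib import Path

# Assumed root; the workplan says "likely `.fabric-discovery/`".
DISCOVERY_ROOT = Path(".fabric-discovery")


def slugify(name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a slug from {name!r}")
    return slug


def snapshot_path(repo: str, profile: str, root: Path = DISCOVERY_ROOT) -> Path:
    """Stable per-profile, per-repo snapshot location (hypothetical layout)."""
    return root / "snapshots" / slugify(profile) / f"{slugify(repo)}.json"


def report_path(run_id: str, root: Path = DISCOVERY_ROOT) -> Path:
    """Run reports are kept per run, so they are keyed by a run identifier."""
    return root / "reports" / f"{slugify(run_id)}.json"


print(snapshot_path("railiance-fabric", "deterministic").as_posix())
```

Because the snapshot path depends only on the repo and profile, a later run can find the previous snapshot without extra flags, while an explicit `--output-dir` can still be honored by passing a different `root`.
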
### T02 - Previous-From-Registry Reconciliation
```task
id: RAIL-FAB-WP-0011-T02
status: todo
priority: high
state_hub_task_id: "ee8e5437-3c87-473c-99a0-84d947d09249"
```
Allow manifest rescans to diff against the latest stored discovery snapshot in
the registry, so operators do not need to export JSON before every run.

Acceptance notes:
- Add a `scan-manifest` option such as `--previous-source registry|dir|none`
or `--previous-from-registry`.
- Fetch the latest discovery snapshot for each repo/profile through existing
registry discovery APIs.
- Fall back cleanly when a repo has no previous registry snapshot and mark the
run as a first baseline for that repo.
- Keep local `--previous-dir` support for offline or file-based workflows.
- Include previous snapshot id/source in per-repo results and run reports.
- Add tests for registry previous found, registry previous missing, registry
unavailable, and file-directory fallback.
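
The four cases named in the tests (registry previous found, missing, registry unavailable, and file-directory fallback) could be resolved roughly as below. This is a sketch under stated assumptions: `resolve_previous`, the `Previous` record, the injected `fetch_latest` callable, and the `<repo>.<profile>.json` filename are all hypothetical, and the existing registry discovery API is not reproduced here.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Previous:
    source: str                # "registry", "dir", or "none"
    snapshot: Optional[dict]   # previous discovery snapshot, if one was found
    first_baseline: bool       # True only when no prior snapshot is known to exist
    note: str = ""             # fallback reason, suitable for the run report


def resolve_previous(
    repo: str,
    profile: str,
    mode: str,
    fetch_latest: Optional[Callable[[str, str], Optional[dict]]] = None,
    previous_dir: Optional[Path] = None,
) -> Previous:
    if mode not in ("registry", "dir", "none"):
        raise ValueError(f"unknown previous source: {mode!r}")
    if mode == "none":
        return Previous("none", None, False, "previous lookup disabled")
    note = ""
    if mode == "registry":
        if fetch_latest is None:
            raise ValueError("registry mode needs a fetch_latest callable")
        try:
            snapshot = fetch_latest(repo, profile)
        except OSError as exc:
            # An unreachable registry is not the same as "no previous snapshot".
            note = f"registry unavailable: {exc}"
        else:
            if snapshot is not None:
                return Previous("registry", snapshot, False)
            return Previous("none", None, True, "no registry snapshot yet")
    # "dir" mode, or fallback after a registry failure.
    if previous_dir is not None:
        path = Path(previous_dir) / f"{repo}.{profile}.json"
        if path.is_file():
            return Previous("dir", json.loads(path.read_text(encoding="utf-8")), False, note)
    # Only call it a first baseline when the lookup itself did not fail.
    return Previous("none", None, not note, note)
```

The sketch deliberately does not mark a repo as a first baseline when the registry could not be reached, so an outage is reported as a lookup problem instead of silently producing a fresh baseline.
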
### T03 - Operational Run Reports
```task
id: RAIL-FAB-WP-0011-T03
status: todo
priority: high
state_hub_task_id: "d11621f2-7610-4060-863d-dbf86858a3e6"
```
Persist each rescan loop as a report that can drive review, State Hub notes,
and future automation.

Acceptance notes:
- Add a report schema or documented JSON shape for manifest rescan runs.
- Record command profile, manifest path, selected repos, generated timestamp,
scanner version, registry URL, dry-run/ingest/accept flags, and LLM budget
policy.
- For each repo, record commit, previous source/id, current output path,
discovery snapshot id, accepted graph snapshot id, candidate counts, diff
counts, review artifact counts, connector run summaries, and errors.
- Add `--report-output` and a default report path under the standard
operational directory.
- Keep console output concise while making the JSON report complete.
- Add tests for report content in success, partial failure, and no-change runs.
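
A possible JSON shape for the run report, expressed as dataclasses, is sketched below. The field names are assumptions derived from the acceptance notes above; no schema has been decided, and `RunReport` and `RepoResult` are hypothetical types.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RepoResult:
    repo: str
    commit: Optional[str] = None
    previous_source: str = "none"
    previous_id: Optional[str] = None
    output_path: Optional[str] = None
    discovery_snapshot_id: Optional[str] = None
    accepted_graph_snapshot_id: Optional[str] = None
    candidate_counts: dict = field(default_factory=dict)
    diff_counts: dict = field(default_factory=dict)
    review_artifact_counts: dict = field(default_factory=dict)
    connector_runs: list = field(default_factory=list)
    errors: list = field(default_factory=list)


@dataclass
class RunReport:
    profile: str
    manifest_path: str
    selected_repos: list
    scanner_version: str
    registry_url: Optional[str] = None
    dry_run: bool = True          # safe default, matching the design principles
    ingest: bool = False
    accept: bool = False
    llm_budget_policy: Optional[str] = None
    generated_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    repos: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested RepoResult entries.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


report = RunReport(
    profile="deterministic",
    manifest_path="registry/local-repos.yaml",
    selected_repos=["railiance-fabric"],
    scanner_version="0.0.0",  # placeholder value for the example
)
report.repos.append(RepoResult(repo="railiance-fabric", diff_counts={"added": 1}))
print(report.to_json())
```

Keeping every field present with an empty default means a no-change run and a partial-failure run serialize to the same keys, which simplifies consumers that read the report.
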
### T04 - Idempotent Ingest And Acceptance Policies
```task
id: RAIL-FAB-WP-0011-T04
status: todo
priority: high
state_hub_task_id: "c64daf3b-a5ec-4ea9-82b6-f8f352eb9283"
```
Make registry writes safe for recurring execution by avoiding unnecessary
snapshot churn and by separating ingest from acceptance policy.

Acceptance notes:
- Add a mode to skip ingesting unchanged discovery snapshots unless explicitly
requested.
- Detect unchanged snapshots by reconciliation diff and/or normalized snapshot
fingerprint.
- Keep an explicit first-baseline ingest mode for repos with no prior discovery
snapshot.
- Add acceptance policy controls such as accepted-only, no-conflicts,
no-tombstones, selected keys, or selected review states.
- Prevent `--accept` from projecting conflicted, needs-review, or low-confidence
candidates unless an explicit override is supplied.
- Report why a repo was ingested, skipped unchanged, blocked for review, or
accepted.
- Add tests covering unchanged skip, baseline ingest, changed ingest, blocked
acceptance, and explicit acceptance override.
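
Detecting unchanged snapshots by a normalized fingerprint might look like the sketch below. The set of volatile keys, the reason strings, and the function names are assumptions for illustration; the real snapshot schema may need different normalization, for example sorting candidate lists whose order is not meaningful.

```python
import hashlib
import json
from typing import Optional, Tuple

# Assumed run-specific keys that should not count as a change.
VOLATILE_KEYS = frozenset({"generated_at", "snapshot_id", "run_id"})


def normalize(value):
    """Drop volatile keys at every nesting level; list order is preserved."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value


def fingerprint(snapshot: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the normalized snapshot."""
    canonical = json.dumps(normalize(snapshot), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest_decision(
    current: dict, previous: Optional[dict], force: bool = False
) -> Tuple[bool, str]:
    """Return (should_ingest, reason) so the report can explain the outcome."""
    if previous is None:
        return True, "first-baseline"
    if fingerprint(current) != fingerprint(previous):
        return True, "changed"
    if force:
        return True, "forced"
    return False, "unchanged"
```

Returning the reason alongside the decision supports the requirement to report why a repo was ingested or skipped, and the `force` flag stands in for an explicit request to ingest an unchanged snapshot.
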
### T05 - Rescan Health And Registry Surfaces
```task
id: RAIL-FAB-WP-0011-T05
status: todo
priority: medium
state_hub_task_id: "b3440439-b9c4-4753-98bc-618d1934ed4e"
```
Expose operational rescan state through the registry so humans and tools can
see freshness, failures, and review load.

Acceptance notes:
- Store or derive latest rescan metadata per repo/profile.
- Add registry inventory/status fields for latest discovery run, latest diff
counts, latest failure, stale age, and review artifact counts.
- Add an endpoint or CLI view for repos needing review.
- Add an endpoint or CLI view for repos stale beyond a configurable age.
- Keep existing graph and discovery snapshot APIs backward compatible.
- Add tests for inventory/status output after baseline, changed, failed, and
stale runs.
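
A view of repos that are stale beyond a configurable age could be derived as in this sketch. The `stale_repos` function and its input mapping of repo name to last successful scan time are hypothetical; where that metadata is stored is still an open design choice in this task.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_repos(
    last_scanned: Dict[str, Optional[datetime]],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return repos never scanned, or last scanned longer ago than max_age."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for repo, scanned_at in last_scanned.items():
        if scanned_at is None or now - scanned_at > max_age:
            stale.append(repo)
    return sorted(stale)


now = datetime(2026, 5, 20, tzinfo=timezone.utc)
seen = {
    "fresh": now - timedelta(days=1),
    "old": now - timedelta(days=8),
    "never": None,
}
print(stale_repos(seen, timedelta(days=7), now=now))
```

Passing `now` explicitly keeps the check deterministic in tests, and treating a repo with no recorded scan as stale makes missing baselines visible in the same view.
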
### T06 - Automation-Safe Command Mode
```task
id: RAIL-FAB-WP-0011-T06
status: todo
priority: medium
state_hub_task_id: "7461e6f1-7ef0-4947-9cfa-67f463e9aa00"
```
Make the rescan loop safe to run from cron, Codex automations, or a State Hub
operator without bespoke shell glue.

Acceptance notes:
- Add a documented command recipe, script, or subcommand for the standard local
rescan loop.
- Define stable exit codes for success, changes found, review required,
partial repo failures, and infrastructure failure.
- Add lock/overlap protection so two local rescan loops do not run against the
same manifest concurrently.
- Keep dry-run as the safe default unless ingest/accept flags are explicit.
- Emit concise human output and machine-readable JSON consistently.
- Add tests for exit-code policy and lock behavior where practical.
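
One candidate exit-code policy is sketched below, with the most severe per-repo outcome deciding the process exit code. The numeric values, the outcome names, and the choice that `0` means "nothing changed and nothing failed" are placeholders only; this workplan lists the meaning of exit code `0` as an open question.

```python
from enum import IntEnum
from typing import Iterable


class ExitCode(IntEnum):
    # Placeholder values, ordered so that a higher number is more severe.
    OK = 0
    CHANGES_FOUND = 10
    REVIEW_REQUIRED = 11
    PARTIAL_FAILURE = 20
    INFRA_FAILURE = 30


# Hypothetical per-repo outcome names mapped to the code each one implies.
_OUTCOME_CODES = {
    "unchanged": ExitCode.OK,
    "baseline": ExitCode.CHANGES_FOUND,
    "changed": ExitCode.CHANGES_FOUND,
    "review-required": ExitCode.REVIEW_REQUIRED,
    "failed": ExitCode.PARTIAL_FAILURE,
}


def exit_code_for(outcomes: Iterable[str], infra_ok: bool = True) -> ExitCode:
    """Pick the most severe code across all repos in the run."""
    if not infra_ok:
        return ExitCode.INFRA_FAILURE
    codes = [_OUTCOME_CODES[outcome] for outcome in outcomes]
    return max(codes, default=ExitCode.OK)


print(int(exit_code_for(["unchanged", "changed", "review-required"])))
```

A wrapper script would pass the result to `sys.exit`, and cron or automation callers could then branch on the code without parsing console output.
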
### T07 - Documentation, Rollout, And First Baseline
```task
id: RAIL-FAB-WP-0011-T07
status: todo
priority: medium
state_hub_task_id: "9c6f8e33-dd13-48ca-815d-73fd09b25423"
```
Document the operational loop and run a first controlled baseline against a
small local repo set before broad adoption.

Acceptance notes:
- Document the standard local rescan workflow, registry-backed workflow,
report format, exit codes, and failure handling.
- Document how to use deterministic-only mode, connector mode, and LLM-capped
mode safely.
- Document the manual review steps before acceptance.
- Run a first baseline loop against a small allowlist such as
`repo-scoping`, `llm-connect`, and `railiance-fabric`.
- Record the resulting report summary and follow-up backlog in docs and State
Hub progress.
- Mark this workplan ready for broader all-local-repo rollout only after the
small baseline loop is repeatable.
## Open Questions
- Should local discovery caches be committed, ignored, or treated as operator
runtime state only?
- Should the registry store every run report or only latest run metadata?
- What is the right default stale age for local repos: daily, weekly, or based
on commit changes?
- Should exit code `0` mean "no infrastructure failure" or "no changes found"?
- Which acceptance policies are safe enough for unattended operation?
- Should State Hub receive one progress note per run or only when changes,
failures, or review-required conditions appear?
## Close Criteria
- A single documented command can perform a safe repeated rescan loop across a
manifest.
- The command can diff against registry-stored previous discovery snapshots.
- First-baseline, unchanged, changed, failed, and review-required repos are
distinguishable in console output, JSON reports, and registry status.
- Repeated runs do not create noisy duplicate registry snapshots by default.
- Acceptance remains explicit and policy-gated.
- Tests cover the recurring loop behavior well enough to trust automation.