tegwick 4fdf552f73 Implement operational discovery rescan loops

2026-05-20 22:52:26 +02:00

Repo Reality Scanner

The repo reality scanner discovers Fabric entities from repository evidence and turns them into candidate graph facts. It is a discovery layer, not a new authoring surface. Repo-owned declarations remain the highest-trust source for accepted Fabric graph data.

Contract

A scanner run emits a FabricDiscoverySnapshot. The snapshot is scoped to one repository, one commit, and one scan profile. It contains:

replacement scopes, which define the evidence sets that may be replaced on a rescan
candidate nodes, edges, and attributes
source anchors for every candidate
extractor provenance for every candidate
tombstones for candidates that vanished inside a replacement scope
reconciliation policy metadata

The JSON schema lives at schemas/discovery-snapshot.schema.yaml.
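
As a rough orientation, the snapshot contents listed above could be modeled like this. This is only a sketch: the class name and every field name are assumptions chosen for illustration, the commit value is a placeholder, and the schema file remains the authoritative definition.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class DiscoverySnapshotSketch:
    """Illustrative stand-in for FabricDiscoverySnapshot (field names assumed)."""

    repo_slug: str
    commit: str
    scan_profile: str
    replacement_scopes: list[dict[str, Any]] = field(default_factory=list)
    nodes: list[dict[str, Any]] = field(default_factory=list)
    edges: list[dict[str, Any]] = field(default_factory=list)
    attributes: list[dict[str, Any]] = field(default_factory=list)
    tombstones: list[dict[str, Any]] = field(default_factory=list)
    reconciliation: dict[str, Any] = field(default_factory=dict)


snapshot = DiscoverySnapshotSketch(
    repo_slug="railiance-fabric",
    commit="0000000",  # placeholder commit id
    scan_profile="deterministic",
)
# Every candidate carries source anchors and extractor provenance.
snapshot.nodes.append(
    {
        "stable_key": "discovery:railiance-fabric:service:example",
        "source_anchors": [{"kind": "file", "path": "fabric/example.yaml"}],
        "provenance": {"extractor": "fabric-declarations"},
    }
)
payload = asdict(snapshot)  # plain dict, ready for JSON serialization
```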

Deterministic Scanner CLI

The first implementation slice adds an offline deterministic scan command:

railiance-fabric scan . \
  --repo-slug railiance-fabric \
  --commit "$(git rev-parse HEAD)" \
  --dry-run \
  --output discovery-snapshot.json

Use --json to print the full FabricDiscoverySnapshot to stdout. Without --json, the command prints a concise summary of node, edge, attribute, and replacement-scope counts. The scanner does not call registries, catalogs, or LLMs in this mode; --output is the only write side effect.

The deterministic extractor framework currently covers:

repository metadata from local git/path evidence
README, INTENT, and SCOPE document presence and headings
repo-owned Fabric declarations under fabric/
Python pyproject.toml package metadata and dependencies
Node package.json package metadata and dependencies
common lockfiles such as package-lock.json, poetry.lock, and uv.lock
Dockerfiles and Docker Compose services
OpenAPI and AsyncAPI contract files
Score workload files
Kubernetes-style deployment manifests
common service config files such as application.yaml and appsettings.json

Each extractor emits candidates through the same accumulator so stable-key duplicates merge inside a scan before the snapshot is returned.
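
The shared-accumulator idea can be sketched as follows. The class and field names here are hypothetical and do not reflect the real implementation; the sketch only shows how two extractors that report the same stable key end up as one candidate with merged source anchors.

```python
class CandidateAccumulator:
    """Hypothetical accumulator that merges candidates sharing a stable key."""

    def __init__(self):
        self._by_key = {}

    def add(self, candidate):
        key = candidate["stable_key"]
        existing = self._by_key.get(key)
        if existing is None:
            # Copy the anchor list so later merges do not mutate caller data.
            self._by_key[key] = {
                **candidate,
                "source_anchors": list(candidate.get("source_anchors", [])),
            }
            return
        # Same stable key seen again: keep one candidate, union the anchors.
        for anchor in candidate.get("source_anchors", []):
            if anchor not in existing["source_anchors"]:
                existing["source_anchors"].append(anchor)

    def candidates(self):
        return list(self._by_key.values())


acc = CandidateAccumulator()
acc.add({"stable_key": "discovery:demo:library:requests",
         "source_anchors": [{"path": "pyproject.toml"}]})
acc.add({"stable_key": "discovery:demo:library:requests",
         "source_anchors": [{"path": "uv.lock"}]})
merged = acc.candidates()
```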

LLM-Assisted Extraction

LLM extraction is optional and explicit:

railiance-fabric scan . \
  --repo-slug railiance-fabric \
  --llm \
  --llm-provider openai \
  --llm-model gpt-4.1-mini \
  --dry-run \
  --output discovery-with-llm.json

The implementation integrates through llm-connect with create_adapter and RunConfig. Tests use a MockLLMAdapter-compatible boundary so CI stays offline. If llm-connect is unavailable, the provider call fails, or the model returns malformed JSON, the scanner records a review_artifacts entry and keeps the discovery snapshot schema-valid.

The LLM never receives the whole repository. The scanner first builds a compact evidence bundle from deterministic candidates, prioritizing repo-owned Fabric declarations, services, capabilities, interfaces, libraries, deployments, and small README/INTENT/SCOPE signals. The prompt asks for strict JSON:

{"nodes": [], "edges": [], "attributes": []}

Projected LLM candidates are always origin: llm and review_state: needs_review. Candidates below the configured confidence threshold become llm_low_confidence review artifacts instead of graph candidates. Unresolved edge endpoints or attribute targets also become review artifacts. Accepted graph data still requires deterministic evidence, repo-owned declarations, or a later human review/acceptance path.
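
The projection rules above might look roughly like this. The function name, the default threshold, the `confidence`, `source`, and `target` field names, and the `llm_unresolved_reference` artifact kind are all assumptions made for the sketch; only `origin: llm`, `review_state: needs_review`, and `llm_low_confidence` come from the text.

```python
def project_llm_output(raw, known_keys, threshold=0.6):
    """Split parsed LLM output into graph candidates and review artifacts."""
    candidates, artifacts = [], []
    for kind in ("nodes", "edges", "attributes"):
        for item in raw.get(kind, []):
            if item.get("confidence", 0.0) < threshold:
                artifacts.append({"kind": "llm_low_confidence", "item": item})
                continue
            # Edges reference a source and target; attributes a target.
            refs = [item[name] for name in ("source", "target") if name in item]
            if any(ref not in known_keys for ref in refs):
                # Assumed artifact kind for unresolved endpoints or targets.
                artifacts.append({"kind": "llm_unresolved_reference", "item": item})
                continue
            candidates.append(
                {**item, "origin": "llm", "review_state": "needs_review"}
            )
    return candidates, artifacts


raw = {
    "nodes": [
        {"stable_key": "discovery:demo:service:api", "confidence": 0.9},
        {"stable_key": "discovery:demo:service:maybe", "confidence": 0.2},
    ],
    "edges": [
        {"source": "discovery:demo:service:api",
         "target": "discovery:demo:service:ghost",
         "confidence": 0.8},
    ],
    "attributes": [],
}
candidates, artifacts = project_llm_output(
    raw, known_keys={"discovery:demo:service:api"}
)
```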

Reconciliation And Dry-Run Diffs

Scans can be reconciled against a previous discovery snapshot:

railiance-fabric scan . \
  --repo-slug railiance-fabric \
  --previous-snapshot previous-discovery.json \
  --dry-run \
  --output current-discovery.json

The reconciler writes reconciliation.diff with explicit stable-key sets:

added
changed
retired
conflicted

It deduplicates candidates by stable key, merges source anchors and provenance, and applies source-aware precedence when duplicate candidates disagree. The current precedence is:

repo_declaration
deterministic
catalog
registry
llm
manual

Possible duplicates found through matching aliases, normalized labels, relationship endpoints, or attribute targets are not silently merged. They are marked status: conflicted, moved to review_state: needs_review, and listed under reconciliation.conflicts.

Missing previous candidates become tombstones only when their replacement scope is present in the current scan and has mode: replacement. Missing candidates from additive scopes, such as broad LLM evidence bundles, are left alone. Existing tombstones are preserved so repeated scans can explain graph drift.
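
A minimal sketch of the diff and tombstone rule, assuming candidates are held in dictionaries keyed by stable key and that each candidate records the id of its replacement scope under a `scope` field. Those shapes and the function name are illustrative, and conflict detection is left out.

```python
def reconcile_diff(previous, current, scopes):
    """Compare two {stable_key: candidate} maps.

    `scopes` maps scope id -> mode for scopes present in the current scan.
    """
    added = sorted(set(current) - set(previous))
    changed = sorted(
        key for key in set(current) & set(previous)
        if current[key].get("attributes") != previous[key].get("attributes")
    )
    retired, tombstones = [], []
    for key in sorted(set(previous) - set(current)):
        scope = previous[key]["scope"]
        # Only a replacement scope that ran again may retire a candidate.
        if scopes.get(scope) == "replacement":
            retired.append(key)
            tombstones.append({"stable_key": key, "scope": scope})
    return {"added": added, "changed": changed, "retired": retired}, tombstones


previous = {
    "k:a": {"scope": "s1", "attributes": {"v": 1}},
    "k:b": {"scope": "s2", "attributes": {}},
    "k:e": {"scope": "s1", "attributes": {}},
}
current = {
    "k:a": {"scope": "s1", "attributes": {"v": 2}},
    "k:d": {"scope": "s1", "attributes": {}},
}
diff, tombstones = reconcile_diff(
    previous, current, scopes={"s1": "replacement", "s2": "additive"}
)
```

In this example `k:b` is missing from the current scan but belongs to an additive scope, so it is neither retired nor tombstoned.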

Registry Review And Acceptance

Discovery snapshots can be stored in the Fabric registry for review:

railiance-fabric scan . \
  --repo-slug railiance-fabric \
  --previous-snapshot previous-discovery.json \
  --output discovery.json

railiance-fabric registry ingest-discovery discovery.json \
  --repo-slug railiance-fabric

The registry keeps discovery snapshots separately from accepted graph snapshots by repo, commit, and scan profile. It exposes latest/list/diff API routes so a dry run can be reviewed without changing the accepted graph.

Accepted discovery can be projected into a normal graph snapshot:

railiance-fabric registry accept-discovery railiance-fabric 12 \
  --accepted-key discovery:railiance-fabric:service-declaration:example

By default, the accept path only projects candidates already marked review_state: accepted. Passing --accepted-key explicitly includes selected candidate stable keys. Existing accepted graph nodes win over discovery nodes with the same graph id, so repo-owned declarations are preserved. Projected nodes carry discovery stable key, origin, review state, confidence, provenance, and source anchors in graph attributes; the graph explorer payload exposes those fields for review.
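
The accept rules can be illustrated with a small sketch. The `graph_id` field, the function name, and the projected attribute names are assumptions for this example; the behavior shown is the one described above: only accepted or explicitly selected candidates project, and existing graph nodes win.

```python
def project_accepted(candidates, graph_nodes, accepted_keys=()):
    """Return a new graph-node map; existing graph nodes are never overwritten."""
    projected = dict(graph_nodes)
    for cand in candidates:
        selected = (
            cand.get("review_state") == "accepted"
            or cand["stable_key"] in accepted_keys
        )
        if not selected:
            continue
        graph_id = cand["graph_id"]
        if graph_id in projected:
            continue  # the already accepted graph node wins
        projected[graph_id] = {
            "discovery_stable_key": cand["stable_key"],
            "origin": cand.get("origin"),
            "review_state": cand.get("review_state"),
        }
    return projected


existing = {"svc:a": {"origin": "repo_declaration"}}
found = [
    {"stable_key": "k:a", "graph_id": "svc:a", "review_state": "accepted"},
    {"stable_key": "k:b", "graph_id": "svc:b", "review_state": "needs_review",
     "origin": "deterministic"},
    {"stable_key": "k:c", "graph_id": "svc:c", "review_state": "needs_review"},
]
graph = project_accepted(found, existing, accepted_keys={"k:b"})
```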

Connector Follow-Up

Connector follow-up is explicit and separated from repo-local extraction:

railiance-fabric scan . \
  --repo-slug railiance-fabric \
  --connector local-fabric-registry \
  --connector-manifest registry/local-repos.yaml \
  --dry-run

The connector interface has slots for:

package registries
container registries
API catalogs
service catalogs
deployment inventories
existing Fabric registry data

The first implementation is local-fabric-registry, an offline-safe connector that reads a local onboarding manifest such as registry/local-repos.yaml. It adds a FabricRegistryEntry candidate, a cataloged_as edge from the repository node, and registry-sourced attributes such as domain, remote URL, default branch, State Hub repo id, and declaration paths.

Connector evidence uses its own replacement scope with source kind fabric_registry, so rescans can replace catalog facts without retiring repo-local evidence. Connector run metadata is recorded under connector_runs with status, source, message, and candidate counts.

Connector-derived facts should be treated this way:

accepted: only when the connector reads explicit repo-owned declarations or a catalog already governed as authoritative for that field
candidate: stable local registry facts such as onboarding manifest entries, declared remote URLs, State Hub ids, and declaration paths
review-only: missing catalogs, rate limits, connector failures, ambiguous matches, or facts from catalogs with unclear ownership

Failures do not corrupt the scan. Missing catalogs become connector_unavailable review artifacts, malformed catalogs become connector_failed artifacts, and future remote connectors should use connector_rate_limited when backoff is required.
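
One way to keep connector failures from corrupting a scan is to convert exceptions into review artifacts at the connector boundary. The sketch below is hypothetical: the function name, the `load` callable, the status strings, and the mapping from exception types to artifact kinds are assumptions. Only the artifact kinds `connector_unavailable` and `connector_failed` are taken from the text.

```python
def run_connector(name, load):
    """Run a connector loader and translate failures into review artifacts."""
    run = {"connector": name, "status": "ok", "message": "", "candidate_count": 0}
    artifacts = []
    try:
        entries = load()
    except FileNotFoundError as exc:
        # Missing catalog: record it and continue the scan without entries.
        run.update(status="unavailable", message=str(exc))
        artifacts.append({"kind": "connector_unavailable", "connector": name})
        return [], artifacts, run
    except ValueError as exc:
        # Malformed catalog content.
        run.update(status="failed", message=str(exc))
        artifacts.append({"kind": "connector_failed", "connector": name})
        return [], artifacts, run
    run["candidate_count"] = len(entries)
    return entries, artifacts, run


def missing_manifest():
    raise FileNotFoundError("registry/local-repos.yaml")


entries, artifacts, run = run_connector("local-fabric-registry", missing_manifest)
```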

Multi-Repo Orchestration

Known local repos can be scanned from the same onboarding manifest used by registry sync-manifest:

railiance-fabric registry scan-manifest registry/local-repos.yaml \
  --dry-run \
  --output-dir .fabric-discovery

The command isolates each repo. A missing path, invalid previous snapshot, or registry write failure is reported for that repo without aborting the rest of the run. The summary includes repo counts for scanned, changed, retired, conflicted, LLM skipped, LLM failed, ingested, accepted, and errors so it can be copied into State Hub progress notes or future automation output.
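
Per-repo isolation can be pictured as a loop that catches failures for one repo and keeps going. The function and the `scan_one` callable below are illustrative assumptions, and the sketch tracks only two of the summary counters.

```python
from collections import Counter


def scan_manifest(repos, scan_one):
    """Scan each repo independently and collect a summary of outcomes."""
    summary = Counter()
    results = {}
    for slug in repos:
        try:
            results[slug] = {"status": "ok", "snapshot": scan_one(slug)}
            summary["scanned"] += 1
        except Exception as exc:  # isolate any per-repo failure
            results[slug] = {"status": "error", "message": str(exc)}
            summary["errors"] += 1
    return results, dict(summary)


def fake_scan(slug):
    # Stand-in for a real scan; one repo path is missing.
    if slug == "missing-repo":
        raise FileNotFoundError(f"no checkout for {slug}")
    return {"repo_slug": slug}


results, summary = scan_manifest(
    ["repo-scoping", "missing-repo", "llm-connect"], fake_scan
)
```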

Useful controls:

--repo-slug <slug> can be repeated to scan an allowlist.
--profile <name> tags the scan profile and output filename.
--previous-dir <dir> reconciles each repo against <slug>-<profile>.discovery.json from an earlier run.
--llm enables LLM-assisted extraction; --deterministic-only forces the offline rule path.
--llm-max-runs <n> caps how many repos may attempt LLM extraction in one orchestration run, while --llm-max-tokens remains the per-repo request cap.
--connector local-fabric-registry attaches manifest-derived registry facts to every repo scan.
--ingest stores discovery snapshots in the registry; --accept then projects accepted candidates into graph snapshots. --dry-run suppresses registry writes even when those flags are present.

Example review cycle:

railiance-fabric registry scan-manifest registry/local-repos.yaml \
  --repo-slug railiance-fabric \
  --previous-dir .fabric-discovery \
  --output-dir .fabric-discovery \
  --connector local-fabric-registry \
  --dry-run

After review, rerun with --ingest to store the snapshots. Add --accept only when candidates marked review_state: accepted should be projected into the registry graph.

For repeated operational loops, including default cache paths, registry-backed previous snapshots, run reports, exit codes, and rescan health views, see docs/operational-rescan-loops.md.

Scan Profiles And Review Workflow

The initial profile is deterministic, which means repo-local extraction plus any explicitly enabled offline connectors. Additional profiles should be named for the evidence policy they represent, for example deterministic-llm-draft or catalog-followup. Keep profile names stable because per-repo previous snapshots use <slug>-<profile>.discovery.json.
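
The filename convention explains why renaming a profile silently breaks reconciliation: the previous snapshot is looked up by slug and profile. A small sketch, with a hypothetical helper name:

```python
from pathlib import Path


def previous_snapshot_path(previous_dir, slug, profile="deterministic"):
    """Build the <slug>-<profile>.discovery.json path for a prior snapshot."""
    return Path(previous_dir) / f"{slug}-{profile}.discovery.json"


path = previous_snapshot_path(".fabric-discovery", "railiance-fabric")
renamed = previous_snapshot_path(".fabric-discovery", "railiance-fabric", "catalog-followup")
```

A scan under `catalog-followup` would look for a different file than one under `deterministic`, so it would not find the earlier baseline.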

Recommended workflow:

Run scan or registry scan-manifest with --dry-run.
Reconcile with --previous-snapshot or --previous-dir when a prior snapshot exists.
Review candidates with review_state: needs_review, status: conflicted, tombstones, and review artifacts before accepting anything.
Store reviewed output with registry ingest-discovery.
Use registry accept-discovery or registry scan-manifest --ingest --accept only for candidates whose review state is acceptable for projection.

Failure Modes

Failures are captured close to the evidence source:

Missing repo paths, invalid manifest entries, unreadable previous snapshots, and registry request failures mark that repo as status: error in scan-manifest without stopping other repos.
Connector failures become review artifacts such as connector_unavailable or connector_failed.
LLM provider failures and malformed model output become llm_execution_error or llm_output_invalid review artifacts.
Low-confidence LLM candidates become llm_low_confidence artifacts instead of graph candidates.
Possible duplicates are marked as conflicts and left for review instead of being silently merged.

Rollout Dry Run

The first small local rollout ran on 2026-05-19:

railiance-fabric registry scan-manifest registry/local-repos.yaml \
  --repo-slug repo-scoping \
  --repo-slug llm-connect \
  --repo-slug railiance-fabric \
  --dry-run \
  --connector local-fabric-registry

Result:

repo-scoping: 18 nodes, 17 edges, 13 attributes
llm-connect: 5 nodes, 4 edges, 13 attributes
railiance-fabric: 55 nodes, 63 edges, 13 attributes
summary: 3 scanned, 0 changed, 0 retired, 0 conflicted, 3 LLM skipped, 0 LLM failed, 0 accepted, 0 errors

Follow-up backlog from this first pass:

Add a standard discovery snapshot directory, likely .fabric-discovery/, so repeated dry-runs can reconcile by default.
Add a previous-from-registry option so manifest scans can diff against the latest stored discovery snapshot without exporting JSON first.
Expand runtime/deployment extraction beyond local manifests to cover live server and deployment inventory connectors.
Add review UI affordances for conflicts, tombstones, and bulk acceptance once enough repos have baseline snapshots.
Define privacy and budget defaults before enabling non-mock LLM providers in multi-repo scans.

Identity

Identity is the main safety boundary. The scanner must not append guesses on every run. It needs to produce stable keys that are repeatable for the same observed entity.

Candidate node keys use this shape:

discovery:{repo_slug}:{entity_kind}:{normalized_name}[:source_fingerprint]

Use the optional source fingerprint when a name is too generic or when multiple entities of the same kind can share a display name. Examples include HTTP routes, generated clients, deployment manifests, and catalog records.
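
A sketch of how a node key of this shape could be built. The normalization shown here (ASCII folding, lowercasing, hyphenating other characters) is an assumption for illustration; the actual rules are the helper functions in railiance_fabric.discovery, and the function names below are not taken from that module.

```python
import re
import unicodedata


def normalize_segment(value):
    """Fold to ASCII, lowercase, and collapse other characters to hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def node_stable_key(repo_slug, entity_kind, name, source_fingerprint=None):
    """Build discovery:{repo_slug}:{entity_kind}:{normalized_name}[:fingerprint]."""
    parts = [
        "discovery",
        normalize_segment(repo_slug),
        normalize_segment(entity_kind),
        normalize_segment(name),
    ]
    if source_fingerprint:
        parts.append(source_fingerprint)
    return ":".join(parts)


plain = node_stable_key("railiance-fabric", "service", "Billing API")
route = node_stable_key("railiance-fabric", "http-route", "GET /health", "ab12cd")
```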

Candidate edge keys use a relationship fingerprint over:

source stable key
edge type
target stable key
optional evidence scope

Candidate attribute keys use the entity stable key plus the normalized attribute name and, where needed, a source fingerprint.
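
The relationship fingerprint for edges could be computed along these lines. The hash function, separator, truncation length, and `discovery-edge:` prefix are assumptions for the sketch, not the documented key format.

```python
import hashlib


def edge_stable_key(source_key, edge_type, target_key, evidence_scope=None):
    """Hash the ordered edge parts so direction and edge type affect the key."""
    parts = [source_key, edge_type, target_key]
    if evidence_scope:
        parts.append(evidence_scope)
    # A unit separator avoids ambiguity between adjacent parts.
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
    return f"discovery-edge:{digest[:16]}"


forward = edge_stable_key("discovery:demo:service:api", "consumes",
                          "discovery:demo:library:requests")
backward = edge_stable_key("discovery:demo:library:requests", "consumes",
                           "discovery:demo:service:api")
```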

Stable-key parts are lowercased and normalized to ASCII-like identity segments. The helper functions in railiance_fabric.discovery define the initial rules.

Source Anchors

Every candidate must carry one or more source anchors. A source anchor identifies why the scanner believes the fact exists. Anchors can point to files, package manifests, lockfiles, API contracts, deployment manifests, service catalogs, registries, LLM evidence bundles, or manual review notes.

Source anchors include a fingerprint. The fingerprint should cover stable location fields such as path, URL, ref, line range, or JSON pointer. Snippets are useful for review but should not be the only identity anchor because formatting noise can churn snippets.
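
To illustrate why snippets are kept out of the fingerprint, the sketch below hashes only the location fields. The field names and the function are assumptions for the example.

```python
import hashlib
import json

# Assumed names for the stable location fields.
LOCATION_FIELDS = ("path", "url", "ref", "line_range", "json_pointer")


def anchor_fingerprint(anchor):
    """Hash stable location fields only; snippets are deliberately ignored."""
    location = {name: anchor[name] for name in LOCATION_FIELDS if name in anchor}
    canonical = json.dumps(location, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = {"path": "pyproject.toml", "line_range": [10, 14],
            "snippet": "dependencies = ['requests']"}
reformatted = {"path": "pyproject.toml", "line_range": [10, 14],
               "snippet": 'dependencies = [\n    "requests",\n]'}
moved = {"path": "packages/core/pyproject.toml", "line_range": [10, 14]}
```

Here `original` and `reformatted` share a fingerprint despite the snippet change, while `moved` gets a new one because its path differs.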

Replacement Scopes

A replacement scope says which extractor owns which set of candidates. Rescans may retire missing candidates only inside the same scope.

Examples:

scope:repo-scoping:python-package:package_manifest:<hash>
scope:state-hub:fabric-declarations:declaration
scope:llm-connect:readme-summary:file:<hash>
scope:railiance-fabric:local-registry:fabric_registry

Scopes have a mode:

replacement: candidates missing from the next run in the same scope become tombstones.
additive: candidates are added or updated, but absence does not retire old candidates.

LLM extractors should use replacement mode only for tightly bounded evidence bundles. Broad repo summaries are safer as additive or review-only until the extraction prompts are proven stable.

Merge Precedence

When multiple sources describe the same entity, reconciliation uses this precedence:

repo_declaration
deterministic
catalog
registry
llm
manual

Manual review can override local candidate state, but it should not silently rewrite repo-owned declarations. If accepted discoveries should become authoritative, the safer next step is to generate a repo-owned declaration patch for human review.
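
The precedence order listed in this section can be applied with a simple rank lookup. This is a minimal sketch with hypothetical names; it only picks a winner among disagreeing candidates and does not merge anchors or provenance.

```python
# Order copied from the precedence list in this section; lower index wins.
PRECEDENCE = ["repo_declaration", "deterministic", "catalog", "registry", "llm", "manual"]
RANK = {origin: index for index, origin in enumerate(PRECEDENCE)}


def pick_winner(candidates):
    """Return the candidate whose origin ranks highest in the precedence list."""
    return min(candidates, key=lambda cand: RANK[cand["origin"]])


winner = pick_winner([
    {"origin": "llm", "label": "Billing service"},
    {"origin": "deterministic", "label": "billing-api"},
    {"origin": "registry", "label": "billing"},
])
```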

Duplicate Handling

The reconciler should merge candidates with the same stable key automatically. It should also look for possible duplicates using:

alias overlap
identical source anchors
identical evidence fingerprints
normalized label similarity within the same entity kind
relationship fingerprints with the same endpoints and edge type
declaration ids that match discovery aliases

Only exact stable-key matches qualify for automatic merging. Alias-only or similarity-only matches should become needs_review conflicts unless an extractor has a source-specific rule that makes the match deterministic.
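
A sketch of alias-overlap detection under these rules. It covers only one of the signals listed above (alias and label overlap within the same entity kind), and the field names, function names, and the `alias_overlap` reason string are assumptions.

```python
def _names(candidate):
    """Collect the lowercased label and aliases of a candidate."""
    names = [candidate["label"], *candidate.get("aliases", [])]
    return {name.strip().lower() for name in names}


def find_possible_duplicates(candidates):
    """Flag same-kind candidates with different keys but overlapping names."""
    conflicts = []
    for index, left in enumerate(candidates):
        for right in candidates[index + 1:]:
            if left["stable_key"] == right["stable_key"]:
                continue  # exact matches are merged, not flagged
            if left["entity_kind"] != right["entity_kind"]:
                continue
            shared = _names(left) & _names(right)
            if shared:
                conflicts.append({
                    "stable_keys": sorted([left["stable_key"], right["stable_key"]]),
                    "reason": "alias_overlap",
                    "shared_names": sorted(shared),
                    "status": "conflicted",
                    "review_state": "needs_review",
                })
    return conflicts


conflicts = find_possible_duplicates([
    {"stable_key": "k:billing-api", "entity_kind": "service",
     "label": "Billing API", "aliases": ["billing"]},
    {"stable_key": "k:billing", "entity_kind": "service", "label": "Billing"},
    {"stable_key": "k:lib-billing", "entity_kind": "library", "label": "billing"},
])
```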

Rescan And Tombstones

On a rescan, the scanner compares the previous accepted discovery snapshot with the newly produced snapshot for the same repo/profile.

Same stable key: update in place.
Same source anchor but changed attributes: update with changed evidence.
Missing from same replacement scope: create a tombstone.
Missing from a different scope: leave untouched.
Reappears after tombstone: reactivate if the stable key and scope match.
Reappears with a new key but same alias/source anchor: flag as possible duplicate resurrection.

Tombstones explain graph drift and prevent immediate re-creation loops. They should be retained long enough to compare several scan cycles and can later be compacted by repo, extractor, or entity kind.

Mapping To Fabric Graphs

Discovery candidates can project into the existing graph model when accepted:

candidate service nodes map to ServiceDeclaration-like graph nodes
candidate capabilities and interfaces map to provider surface nodes
candidate dependencies map to dependency nodes and consumes edges
candidate deployment/runtime entities map to graph explorer infrastructure nodes until declarations gain first-class runtime support
candidate libraries map to library inventory records and graph explorer nodes

If a repo-owned declaration already exists for the same entity, discovery output should attach as supporting evidence instead of creating another node.

LLM Boundary

LLM extraction through llm-connect is optional and schema-gated. The scanner should use deterministic preselection to build small evidence bundles, ask for structured JSON, validate the JSON against the discovery schema, and record:

extractor id and version
prompt version
provider and model
usage metadata
confidence and uncertainty
rationale

Malformed, low-confidence, or conflicting LLM output becomes review material, not accepted graph data.
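
The first gate for model output can be sketched as a parse-and-shape check that yields a review artifact when the response is unusable. The function name and message text are assumptions, and a real implementation would validate against the discovery schema rather than only checking for the three lists.

```python
import json

REQUIRED_LISTS = ("nodes", "edges", "attributes")


def parse_llm_response(text):
    """Return (data, None) on success or (None, review_artifact) on failure."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, {"kind": "llm_output_invalid", "message": str(exc)}
    if not isinstance(data, dict) or any(
        not isinstance(data.get(name), list) for name in REQUIRED_LISTS
    ):
        return None, {
            "kind": "llm_output_invalid",
            "message": "expected nodes, edges, and attributes lists",
        }
    return data, None


good, good_artifact = parse_llm_response('{"nodes": [], "edges": [], "attributes": []}')
bad, bad_artifact = parse_llm_response("Sure! Here is the JSON you asked for")
partial, partial_artifact = parse_llm_response('{"nodes": []}')
```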
