
tegwick 3e906c1dd4 Fix rerun assessment and candidate extraction

2026-05-16 00:57:44 +02:00


id: RREG-WP-0018
type: workplan
title: Agentic Hierarchy And Intent/Scope Review
domain: capabilities
repo: repo-scoping
status: active
owner: codex
topic_slug: foerster-capabilities
created: 2026-05-15
updated: 2026-05-15
state_hub_workstream_id: "83df7082-789f-440e-b7a8-d1f8ecd01cc6"

Agentic Hierarchy And Intent/Scope Review

The Railiance and related repository datasets exposed a gap in the current generation pipeline. Deterministic scanning produces useful facts and content chunks, but most repositories without an INTENT.md stop at a single candidate ability and do not produce candidate capabilities, features, or evidence. The dependency graph then appears empty because it only renders edges from approved characteristics; it does not yet render fact-only, candidate, or partial hierarchies.

This workplan shifts the next improvement from deterministic acceptance toward a reviewable agentic support layer. Deterministic scanners should continue to produce transparent facts and formal rejection signals. Agentic generation should stand in for the human abstraction step: deriving features, capabilities, abilities, and draft scope from facts, source-linked text, and existing SCOPE.md content when present, while keeping every result reviewable.

Dataset Assessment

The initial var/repo-scoping.sqlite3 dataset contained eight repositories. The new non-repo-scoping repositories all completed analysis, but only ops-warden produced a candidate capability and feature. Railiance repos mostly produced one candidate ability, zero candidate capabilities, zero candidate features, and zero candidate evidence.

Observed patterns:

repo-scoping: complete approved hierarchy and dependency graph (92 nodes / 241 edges).
railiance-cluster, railiance-infra, railiance-apps, railiance-platform, railiance-enablement, and vergabe-teilnahme: facts and content chunks exist, but no lower candidate hierarchy is produced.
ops-warden: interface facts produce one generic capability and feature, but the ability name is taken from template README text rather than from the repo-specific SCOPE.md one-liner.
Repositories with rich SCOPE.md files already contain useful one-liners, relevant/not-relevant boundaries, related repos, entry points, and Provided Capabilities blocks, but those are currently treated as derived_scope facts and not promoted into candidate capabilities.
Repositories without INTENT.md need a proposed intent draft, not an automatic source-file mutation. The draft should be an ambitious design-intent version of the abstracted current scope and should require review before it is written.

T01: Capture Sparse-Hierarchy Dataset Baseline

id: RREG-WP-0018-T01
status: done
priority: high
state_hub_task_id: "dd00a642-7c69-4ae2-b7ac-954c31a1c72a"

Create a repeatable local assessment command or artifact that summarizes each repository's latest run across facts, content chunks, candidate layers, approved layers, document presence, and dependency graph element counts.

Acceptance criteria:

The Railiance/ops/vergabe dataset can be compared before and after generation changes without relying on screenshots or manual DB inspection.
The report distinguishes approved, candidate, draft, and fact-only graph coverage.
The report flags suspicious abstraction quality issues such as template README contamination and empty lower layers despite rich SCOPE.md content.
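
The coverage distinction and the sparse-hierarchy flag described above can be sketched as follows. This is a minimal illustration, not the actual assessment command: the RepoSummary fields, the coverage labels, and the issue strings are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RepoSummary:
    """Hypothetical per-repository row of an assessment report."""
    repo: str
    facts: int
    approved: int          # approved hierarchy entries
    candidates: int        # candidate entries at any level
    drafts: int            # draft scope/intent artifacts
    lower_candidates: int  # candidate capabilities + features only
    has_scope_md: bool


def graph_coverage(s: RepoSummary) -> str:
    """Return the strongest hierarchy level available for graph rendering."""
    if s.approved:
        return "approved"
    if s.candidates:
        return "candidate"
    if s.drafts:
        return "draft"
    if s.facts:
        return "fact-only"
    return "empty"


def sparse_hierarchy_issues(s: RepoSummary) -> list[str]:
    """Flag repositories whose facts never reached the lower candidate layers."""
    issues = []
    if s.facts and s.approved == 0 and s.lower_candidates == 0:
        if s.has_scope_md:
            issues.append("empty lower layers despite SCOPE.md")
        else:
            issues.append("empty lower layers")
    return issues


def render_row(s: RepoSummary) -> str:
    """One tab-separated report line that can be diffed between runs."""
    issues = "; ".join(sparse_hierarchy_issues(s)) or "-"
    return f"{s.repo}\t{graph_coverage(s)}\t{issues}"
```

Emitting one plain-text line per repository keeps the before/after comparison a simple text diff, which is the property the first acceptance criterion asks for.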

T02: Generate Candidate Hierarchies From Facts And Scope Text

id: RREG-WP-0018-T02
status: done
priority: high
state_hub_task_id: "01eb03da-7a0e-4e22-ae2d-7596752d178e"

Extend generation so a repository can get candidate features, capabilities, abilities, and draft scope even when no INTENT.md exists. Existing SCOPE.md content may be used as review input for current-state candidates, but it should remain labeled as derived/current scope rather than design intent.

Acceptance criteria:

SCOPE.md Provided Capabilities blocks become source-linked candidate capabilities/features when present.
One-liner, Core Idea, Relevant When, Not Relevant When, related repo, and entry point sections contribute to candidate ability/scope drafts without being auto-approved.
Configuration, manifest, Makefile, Helm/Kubernetes, Terraform, Ansible, CLI, and test facts can produce concrete feature candidates rather than stopping at a generic ability.
Candidate naming prefers repo-specific scope/readme evidence over template boilerplate such as repo-seed.
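
As an illustration of the first criterion, the sketch below turns a Provided Capabilities block into source-linked candidates. It assumes the block is a Markdown heading followed by bullet items; the real SCOPE.md layout may differ, and the function name and the dictionary keys are hypothetical.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")
BULLET = re.compile(r"^\s*[-*]\s+(.*\S)\s*$")


def extract_provided_capabilities(scope_text: str, source_path: str = "SCOPE.md") -> list[dict]:
    """Collect bullets under a 'Provided Capabilities' heading as candidates.

    Each candidate keeps a path:line reference and stays labeled as derived
    scope, so nothing here is approved or treated as design intent.
    """
    candidates = []
    in_block = False
    for lineno, line in enumerate(scope_text.splitlines(), start=1):
        heading = HEADING.match(line)
        if heading:
            # Any new heading either opens or closes the block of interest.
            in_block = heading.group(1).lower() == "provided capabilities"
            continue
        bullet = BULLET.match(line)
        if in_block and bullet:
            candidates.append({
                "name": bullet.group(1),
                "status": "candidate",
                "origin": "derived_scope",
                "source": f"{source_path}:{lineno}",
            })
    return candidates
```

Keeping the line reference on every candidate is what makes the result reviewable: a reviewer can jump from the candidate back to the exact SCOPE.md line it came from.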

T03: Add Agentic Draft Generation Layer

id: RREG-WP-0018-T03
status: done
priority: high
state_hub_task_id: "fd572f4d-d2f6-4c85-bbf5-f77829fd6e6a"

Introduce an agentic generation step after deterministic facts and before human approval. The agent should receive facts, source-role metadata, content chunks, and deterministic candidate seeds, then propose a grounded draft hierarchy.

Acceptance criteria:

Agentic generation can fill missing abstraction levels from facts and source text, including scope when no reviewed scope exists.
Every agentic draft carries source references and a rationale explaining how the abstraction was derived.
The agentic step does not write INTENT.md, does not auto-approve registry truth, and does not bypass quality gates.
Failures or unavailable agent configuration leave deterministic facts and candidates intact with an explicit review decision.
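
The failure behavior in the last criterion can be sketched as a thin wrapper around the agent call. The agent is modeled here as an optional plain callable, and the result keys and decision strings are assumptions made for the example, not the project's actual schema.

```python
from typing import Callable, Optional


def run_agentic_step(facts: list, seeds: list, agent: Optional[Callable] = None) -> dict:
    """Run the (hypothetical) agent and never lose deterministic output.

    Proposals are kept only if they carry source references and a rationale,
    and every kept proposal is labeled 'draft', never approved.
    """
    result = {
        "facts": facts,
        "candidates": list(seeds),  # deterministic seeds stay untouched
        "drafts": [],
        "review_decision": None,
    }
    if agent is None:
        result["review_decision"] = "agent_unavailable"
        return result
    try:
        proposals = agent(facts, seeds)
    except Exception as exc:  # any agent failure degrades to deterministic output
        result["review_decision"] = f"agent_failed: {exc}"
        return result
    for proposal in proposals:
        if proposal.get("sources") and proposal.get("rationale"):
            result["drafts"].append({**proposal, "status": "draft"})
    result["review_decision"] = "needs_review"
    return result
```

The wrapper has no code path that writes files or marks anything approved, so the restrictions in the third criterion hold by construction in this sketch.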

T04: Review And Edit INTENT.md / SCOPE.md Drafts

id: RREG-WP-0018-T04
status: done
priority: high
state_hub_task_id: "286d96e0-ec5a-4a55-bb50-62d20ab25830"

Add review surfaces for repository INTENT.md and SCOPE.md without mutating source files automatically. If INTENT.md is missing, produce a proposed intent draft as an ambitious design-intent version of the abstracted current scope.

Acceptance criteria:

Users can view existing INTENT.md and SCOPE.md content from the checkout.
Users can review, edit, diff, and explicitly apply generated drafts.
Missing INTENT.md produces a draft artifact with provenance, not an automatic file write.
Draft intent is clearly separated from current scope: intent describes desired utility; scope describes current understood behavior.
SCOPE updates remain reviewable and do not overwrite user files without an explicit write action.
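
A minimal sketch of the diff and explicit-apply steps, using the standard library difflib module. The function names and the confirmed flag are assumptions for illustration; the actual review surface is the API and UI described in this workplan.

```python
import difflib
from pathlib import Path


def diff_draft(current: str, draft: str, name: str) -> str:
    """Return a unified diff between the checkout content and a draft.

    Pass an empty string as current when the file is missing, so the
    whole draft shows up as added lines.
    """
    return "".join(difflib.unified_diff(
        current.splitlines(keepends=True),
        draft.splitlines(keepends=True),
        fromfile=f"{name} (checkout)",
        tofile=f"{name} (draft)",
    ))


def apply_draft(path: Path, draft: str, confirmed: bool = False) -> bool:
    """Write the draft only after an explicit confirmation.

    Returns True if the file was written, False if nothing was touched.
    """
    if not confirmed:
        return False
    path.write_text(draft, encoding="utf-8")
    return True
```

Defaulting confirmed to False means that a caller who forgets the flag gets a no-op rather than an overwritten INTENT.md or SCOPE.md.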

T05: Make Dependency Graph Work For Partial Hierarchies

id: RREG-WP-0018-T05
status: done
priority: high
state_hub_task_id: "80bc671c-2361-47e5-8135-7c945de66437"

Dependency graph generation should degrade gracefully when approved abilities, capabilities, or features are absent. It should be able to show facts, candidate entries, draft scope/intent nodes, and partial hierarchy edges.

Acceptance criteria:

Repositories with facts never render as an empty dependency graph solely because approved characteristics are missing.
Graph nodes are visibly labeled as approved, candidate, draft, or fact-only.
Candidate and draft edges use distinct dependency types from approved truth.
The graph supports partial chains such as fact -> candidate feature, fact -> candidate capability, candidate capability -> candidate ability, and candidate ability -> draft scope.
Existing approved dependency graph behavior remains stable for repo-scoping.
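
The labeling rules above can be sketched with a small graph builder. The status values follow the criteria; the dependency type names and the dictionary layout are hypothetical.

```python
def edge_type(src_status: str, dst_status: str) -> str:
    """Pick a dependency type that never passes for approved truth by accident."""
    statuses = {src_status, dst_status}
    if statuses == {"approved"}:
        return "approved_dependency"
    if "draft" in statuses:
        return "draft_dependency"
    # Remaining mixes (fact-only, candidate, or approved with a weaker endpoint).
    return "candidate_dependency"


def build_graph(nodes: dict[str, str], links: list[tuple[str, str]]) -> dict:
    """Build labeled nodes and edges from whatever hierarchy levels exist.

    nodes maps a node id to its status (approved, candidate, draft, fact-only);
    links lists (source_id, target_id) pairs. Links with unknown endpoints are
    skipped, so a partial chain still renders.
    """
    graph_nodes = [{"id": node_id, "status": status} for node_id, status in nodes.items()]
    graph_edges = [
        {"source": src, "target": dst, "type": edge_type(nodes[src], nodes[dst])}
        for src, dst in links
        if src in nodes and dst in nodes
    ]
    return {"nodes": graph_nodes, "edges": graph_edges}
```

An edge counts as approved only when both endpoints are approved, which keeps the existing approved graph unchanged while weaker chains get their own types.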

T06: Transparent Quality Criteria For Generated Abstractions

id: RREG-WP-0018-T06
status: done
priority: medium
state_hub_task_id: "4b74a058-b759-42d2-a243-7134dd907093"

Add reviewable quality criteria that apply to generated features, capabilities, abilities, scope drafts, and intent drafts.

Acceptance criteria:

Criteria cover source grounding, native utility, abstraction coherence, sibling-repo boundary awareness, template contamination, and scope-vs-intent separation.
Criteria can invalidate or downgrade generated items before review, but do not deterministically accept them as truth.
Criteria outcomes are exposed in API/UI reports and assessment artifacts.
Railiance layer boundaries are treated as evidence for review, not as automatically accepted architecture claims.
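
The rule that criteria may invalidate or downgrade but never accept can be sketched as follows. Only two of the listed criteria are shown; the check names, outcome labels, decision strings, and the marker list are assumptions (repo-seed is the one template marker this workplan names).

```python
SEVERITY = {"pass": 0, "downgrade": 1, "invalidate": 2}

# Best possible decision is still a review request, never acceptance.
DECISIONS = {
    "pass": "needs_review",
    "downgrade": "needs_review_downgraded",
    "invalidate": "rejected_before_review",
}

TEMPLATE_MARKERS = ("repo-seed",)  # assumed list, extend as templates are found


def check_source_grounding(item: dict) -> str:
    """An item without any source reference cannot be reviewed meaningfully."""
    return "pass" if item.get("sources") else "invalidate"


def check_template_contamination(item: dict) -> str:
    """Downgrade names that still carry template boilerplate."""
    name = item.get("name", "").lower()
    return "downgrade" if any(marker in name for marker in TEMPLATE_MARKERS) else "pass"


CRITERIA = {
    "source_grounding": check_source_grounding,
    "template_contamination": check_template_contamination,
}


def evaluate(item: dict) -> dict:
    """Run all criteria and derive a decision from the worst outcome."""
    outcomes = {name: check(item) for name, check in CRITERIA.items()}
    worst = max(outcomes.values(), key=lambda outcome: SEVERITY[outcome])
    return {"outcomes": outcomes, "decision": DECISIONS[worst]}
```

Returning the per-criterion outcomes alongside the decision is what lets reports and assessment artifacts show why an item was downgraded.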

T07: Re-Run And Compare The Dataset

id: RREG-WP-0018-T07
status: in_progress
priority: medium
state_hub_task_id: "cd1a3c14-076b-42da-8319-48310a964611"

After implementation, rerun the current repository dataset and compare the new results against the sparse baseline.

Acceptance criteria:

Each Railiance repo has source-linked candidate capabilities/features or a documented reason why generation withheld them.
ops-warden and vergabe-teilnahme no longer prefer template README text over repo-specific evidence.
Dependency graph element counts are non-zero for repositories with facts.
The comparison report makes it easy to judge whether the new result is better than the previous sparse output.
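
One way to make that judgment easy is a per-repository verdict derived from count deltas. The sketch below assumes each report is a mapping from repository name to layer counts; the layer names and verdict labels are illustrative, not the format of the actual assessment output.

```python
LAYERS = ("capabilities", "features")


def compare_reports(baseline: dict, rerun: dict) -> list[dict]:
    """Compare candidate counts per repository between two assessment reports."""
    rows = []
    for repo in sorted(set(baseline) | set(rerun)):
        before = baseline.get(repo, {})
        after = rerun.get(repo, {})
        deltas = {layer: after.get(layer, 0) - before.get(layer, 0) for layer in LAYERS}
        if repo not in baseline:
            verdict = "new"
        elif repo not in rerun:
            verdict = "missing"
        elif any(delta < 0 for delta in deltas.values()):
            verdict = "regressed"
        elif any(delta > 0 for delta in deltas.values()):
            verdict = "improved"
        else:
            verdict = "unchanged"
        rows.append({"repo": repo, "deltas": deltas, "verdict": verdict})
    return rows
```

Counts alone do not prove better abstractions, so a verdict of improved is a prompt for review rather than a replacement for it.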

Implementation Update

Implemented the comparison and generation infrastructure needed to rerun the dataset:

Added repo-scoping assess-dataset to summarize latest runs by facts, chunks, candidate/approved hierarchy counts, graph coverage, document presence, and sparse-hierarchy quality issues.
Updated candidate generation so SCOPE.md one-liners and Provided Capabilities blocks seed reviewable current-state abilities/capabilities. The deterministic fact fallback now requires stronger configuration facts and does not promote dependency-only repositories.
Added review-only INTENT.md/SCOPE.md API and UI draft views. Missing INTENT.md now produces an ambitious draft derived from scope/candidates without writing the file.
Added dependency graph fallback nodes/edges for candidate and draft hierarchies so repos with facts no longer render empty just because approved characteristics are absent.
Added transparent quality criteria for template contamination and scope-vs-intent separation; deterministic gates can require review but do not accept registry truth.

The latest local assessment command initially saw nine repositories because vantage-point had been added. It still reported old sparse Railiance candidate counts because those stored analysis runs predated this implementation. T07 stays open until the affected repositories are rerun and compared against the sparse baseline.

Rerun Review 2026-05-16

The local dataset now contains ten repositories and several post-implementation reruns. A review found that assess-dataset and the dependency graph fallback were incorrectly selecting the oldest completed analysis run: list_analysis_runs returns runs newest-first, and the selection logic did not account for that ordering. That has been corrected.
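
This class of bug can be avoided by selecting on an explicit timestamp instead of on list position. The sketch below works on a plain list of run records; the status and finished_at field names are assumptions, and it does not reproduce the actual signature or return type of list_analysis_runs.

```python
from datetime import datetime
from typing import Optional


def latest_completed_run(runs: list[dict]) -> Optional[dict]:
    """Return the most recently finished completed run, whatever the list order.

    Assumes each completed run carries an ISO 8601 finished_at string.
    """
    completed = [run for run in runs if run["status"] == "completed"]
    if not completed:
        return None
    return max(completed, key=lambda run: datetime.fromisoformat(run["finished_at"]))
```

With a newest-first list, indexing the last element returns the oldest run, which matches the symptom described above; comparing timestamps gives the same answer for either sort order.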

Corrected assessment results:

Dataset total: 10 repos, 430 facts, candidate hierarchy 10/26/36/44, graph 210 nodes / 387 edges.
Improved: railiance-cluster now has 3 capabilities / 3 features; railiance-platform has 3 / 3; railiance-enablement has 2 / 2; ops-warden has repo-specific scope naming and 1 / 2; vergabe-teilnahme has 1 / 4.
Still sparse because they were not rerun after the implementation: railiance-infra and railiance-apps. Read-only generator preview shows they would now produce 3 and 1 scope-derived capabilities respectively.
New sparse repo: helix-forge. Its INTENT.md uses numbered/escaped Primary outcomes sections rather than bullet-based intended capabilities; generator support was added for this shape and preview now yields five outcome-derived capabilities.
Naming polish added for reviewability: preserve non-ASCII letters, normalize nominalized capability names such as Analysis of... to Analyze..., and trim explanatory dash clauses from scope one-liners.
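
The naming rules in the last item can be sketched as two small helpers. The lookup table holds only the one pattern named above; how the real generator maps nominalized forms to verbs is not shown in this workplan, so the table and both function names are hypothetical.

```python
import re

# Hypothetical lookup; only "Analysis of" -> "Analyze" comes from the workplan.
NOMINAL_TO_VERB = {"Analysis of": "Analyze"}

# Hyphen, en dash, or em dash surrounded by whitespace introduces a clause.
DASH_CLAUSE = re.compile(r"\s+[-\u2013\u2014]\s+")


def trim_dash_clause(one_liner: str) -> str:
    """Keep only the part of a scope one-liner before an explanatory dash clause."""
    return DASH_CLAUSE.split(one_liner, maxsplit=1)[0].strip()


def normalize_capability_name(name: str) -> str:
    """Clean a capability name without dropping non-ASCII letters.

    For str patterns, \\w is Unicode-aware in Python 3, so letters such as
    ü or é survive the punctuation cleanup.
    """
    cleaned = re.sub(r"[^\w\s/&-]", "", name)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    for nominal, verb in NOMINAL_TO_VERB.items():
        if cleaned.startswith(nominal + " "):
            return verb + cleaned[len(nominal):]
    return cleaned
```

Hyphens inside words, as in repo-scoping, are left alone because the dash pattern requires whitespace on both sides.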

T07 remains in progress until railiance-infra, railiance-apps, and helix-forge are rerun and the corrected assessment report is captured as the comparison artifact.
