Add stats, establish (scaffold, publish-check, discover), and update CLI commands with optional llm-connect bridge, validate --root for sibling repos, pytest coverage, and documentation for sibling registry onboarding.
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| REUSE-WP-0013 | workplan | Registry establish, update, and stats with optional llm-connect assist | helix_forge | reuse-surface | finished | codex | helix-forge | 2026-06-16 | 2026-06-17 | 239a0077-8593-4dc7-918d-4c23895275f6 |
Registry establish, update, and stats with optional llm-connect assist
Follow-up to operator feedback and REUSE-WP-0012-T01 blocks
(history/2026-06-16-hub-registration-blocks.md). Sibling repos cannot federate
until each publishes registry/indexes/capabilities.yaml; today that requires
manual layout copy and authoring. This workplan adds deterministic bootstrap
and analytics CLI plus optional llm-connect-backed discovery and refresh
so domain repos can establish and maintain registries with low ceremony.
Baseline vector: D5 / A4 / C5 / R3
Target vector after completion: D5 / A4 / C5 / R3 (tooling depth; no
product-vector promotion unless dogfood evidence warrants)
Problem statement
| Pain | Today | Target |
|---|---|---|
| Bootstrap registry in sibling repo | Manual copy from reuse-surface | reuse-surface establish |
| Keep entries aligned with repo reality | Manual edits + validate | reuse-surface update |
| Portfolio / federation readiness view | report cohorts, manual curls | reuse-surface stats |
| Draft capability metadata from repo context | Agent improvisation | llm-connect structured extract |
Design principles
- Deterministic first: `stats`, `establish --scaffold`, and `establish --publish-check` work without llm-connect or API keys.
- LLM optional: `establish --discover` and `update --suggest` call llm-connect over HTTP (`POST /execute`) or through a library adapter when configured.
- Dry-run default: LLM and write paths require explicit `--apply`.
- Validate gate: every `--apply` path ends with `reuse-surface validate` (or fails with schema errors).
- Boundary: reuse-surface does not embed provider keys; llm-connect owns routing and credentials.
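The dry-run default and the validate gate can be pictured as one small wrapper around every write path. The following is a minimal sketch, not the shipped implementation: `run_write_path` and its callback parameters are hypothetical names introduced for illustration.

```python
from typing import Callable, List


def run_write_path(
    plan: List[str],
    apply_changes: Callable[[], None],
    validate: Callable[[], List[str]],
    apply: bool = False,
) -> int:
    """Return a process exit code for a write path (hypothetical helper).

    Without apply=True the plan is only printed (dry-run default).
    With apply=True the changes are written and validation decides the result.
    """
    if not apply:
        for line in plan:
            print(f"[dry-run] {line}")
        return 0

    apply_changes()
    errors = validate()  # stands in for running `reuse-surface validate`
    for error in errors:
        print(f"[validate] {error}")
    return 1 if errors else 0
```

Keeping the gate in one place means a new command cannot write files without also passing through validation.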
Suggested execution order
T01 stats (deterministic, unblocks observability)
→ T02 establish --scaffold (unblocks sibling publish path)
→ T03 establish --publish-check (verifies raw URL / federation readiness)
→ T04 llm-connect bridge + draft JSON schema
→ T05 establish --discover
→ T06 update (deterministic signals + optional LLM suggest)
→ T07 docs, tests, gap-analysis note
Dependencies
| Dependency | Owner | Notes |
|---|---|---|
| llm-connect server or package | llm-connect | LLM_CONNECT_URL or editable install |
| Capability JSON schema | reuse-surface | schemas/capability.schema.yaml unchanged |
| Sibling repo apply | Domain owners | establish run in target repo checkout |
| Hub token | Operator | hub register remains separate post-publish |
Proposed CLI surface
# Deterministic
reuse-surface stats
reuse-surface stats --format json --federation-ready
reuse-surface establish --scaffold --domain helix_forge [--path .]
reuse-surface establish --publish-check --raw-url <gitea-raw-url>
# LLM-assisted (optional backend)
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface establish --discover [--path .] --dry-run
reuse-surface establish --discover --apply
reuse-surface update --capability <id> --dry-run
reuse-surface update --all --suggest-maturity --dry-run
reuse-surface update --from-git-since HEAD~5 --apply
Add Registry Stats Command
id: REUSE-WP-0013-T01
status: done
priority: high
state_hub_task_id: "98e65330-bfc7-4282-b372-d35542b899ce"
Deliver reuse-surface stats with deterministic aggregates:
- Capability count; maturity histogram (D/A/C/R bands)
- Entries at R0–R2 vs R3+; consumption mode counts
- Index vs entry vector drift count
- Federation readiness: local `sources.yaml` / index presence; optional `--federation-ready` raw URL probe (HTTP status)
- Hub summary when `REUSE_SURFACE_URL` is set (registration count)
Output: Markdown default, --format json. Pytest coverage. Document in
tools/README.md.
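As an illustration of the maturity aggregates listed above, here is a minimal sketch that buckets vectors written in the `D5 / A4 / C5 / R3` form used in this workplan. The function names and the output keys are assumptions, not the actual `stats.py` API.

```python
import re
from collections import Counter
from typing import Dict, List

# Matches one axis letter followed by its level, e.g. "R3".
VECTOR_RE = re.compile(r"([DACR])(\d+)")


def parse_vector(text: str) -> Dict[str, int]:
    """Parse 'D5 / A4 / C5 / R3' into {'D': 5, 'A': 4, 'C': 5, 'R': 3}."""
    return {axis: int(level) for axis, level in VECTOR_RE.findall(text)}


def maturity_stats(vectors: List[str]) -> dict:
    """Aggregate per-axis histograms and the R0-R2 vs R3+ split (hypothetical shape)."""
    histogram = {axis: Counter() for axis in "DACR"}
    low_r = high_r = 0
    for text in vectors:
        parsed = parse_vector(text)
        for axis, level in parsed.items():
            histogram[axis][level] += 1
        if "R" in parsed:
            if parsed["R"] <= 2:
                low_r += 1
            else:
                high_r += 1
    return {
        "capability_count": len(vectors),
        "histogram": {axis: dict(counts) for axis, counts in histogram.items()},
        "r0_r2": low_r,
        "r3_plus": high_r,
    }
```

Because the result is a plain dictionary, the same aggregate could feed both the Markdown output and `--format json`.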
Implement establish --scaffold
id: REUSE-WP-0013-T02
status: done
priority: high
state_hub_task_id: "b8fedd87-d0d3-41b4-9af8-e36d52bfe1c5"
Bootstrap registry/ in target repo (--path, default cwd):
- `registry/README.md` (pointer to reuse-surface schema and validate)
- `registry/capabilities/.gitkeep`
- `registry/indexes/capabilities.yaml` with `version`, `domain`, `capabilities: []`
- Refuse overwrite unless `--force`
- Print next steps: add entry, validate, merge to main, publish-check
No llm-connect dependency. Pytest with temp directory.
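The scaffold step could look roughly like the sketch below. It is an illustration under stated assumptions: `scaffold_registry` is a hypothetical name, the README text is a placeholder, and `version: 1` is an assumed value, since the workplan only says the index carries a `version` key.

```python
from pathlib import Path
from typing import List


def scaffold_registry(root: Path, domain: str, force: bool = False) -> List[Path]:
    """Create the empty registry layout under root; refuse to overwrite unless force."""
    registry = root / "registry"
    files = {
        registry / "README.md": (
            "# Registry\n\n"
            "See the reuse-surface schema and run `reuse-surface validate`.\n"
        ),
        registry / "capabilities" / ".gitkeep": "",
        # Assumption: version 1; the real index version may differ.
        registry / "indexes" / "capabilities.yaml": (
            f"version: 1\ndomain: {domain}\ncapabilities: []\n"
        ),
    }
    existing = [path for path in files if path.exists()]
    if existing and not force:
        names = ", ".join(str(path) for path in existing)
        raise FileExistsError(f"refusing to overwrite: {names} (pass force=True)")

    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
    return sorted(files)
```

Checking all target files before writing any of them keeps a refused run from leaving a half-written layout behind.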
Implement establish --publish-check
id: REUSE-WP-0013-T03
status: done
priority: medium
state_hub_task_id: "2924d685-709f-4e28-886f-b363cd9c40b4"
Federation publish helper for sibling repo operators:
- curl-equivalent probe of `--raw-url` (status, content-type hint)
- Validate local index YAML if `--path` has `registry/indexes/capabilities.yaml`
- Report pass/fail with remediation tied to `docs/RegistryFederation.md`
- Exit non-zero on hard failures (non-200, invalid YAML)
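The pass/fail decision for the raw URL probe can be kept separate from the HTTP call, which makes it easy to test. The sketch below shows one possible classification. `ProbeResult` and `classify_probe` are hypothetical names, and treating an HTML content type as a warning rather than a failure is an assumption about what "content-type hint" means here.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    ok: bool
    message: str


def classify_probe(status: int, content_type: str = "") -> ProbeResult:
    """Map an HTTP status and content type to a pass/fail result (hypothetical helper)."""
    if status == 200:
        if "html" in content_type.lower():
            # Hint only: a raw YAML file is not normally served as HTML.
            return ProbeResult(True, "200, but content type looks like HTML; check the raw URL")
        return ProbeResult(True, "200: raw index is reachable")
    if 300 <= status < 400:
        return ProbeResult(
            False,
            f"{status}: redirect instead of raw content; see docs/RegistryFederation.md",
        )
    return ProbeResult(False, f"{status}: raw URL not reachable; see docs/RegistryFederation.md")


def exit_code(result: ProbeResult) -> int:
    """Hard failures exit non-zero."""
    return 0 if result.ok else 1
```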
Add llm-connect Bridge And Draft Schema
id: REUSE-WP-0013-T04
status: done
priority: high
state_hub_task_id: "650ebee5-b34b-4ed8-891d-d93aacebadd7"
Thin client boundary:
- `reuse_surface/llm_bridge.py`: `POST {LLM_CONNECT_URL}/execute`, parse JSON from response content
- `schemas/registry-draft.schema.json`: structured draft shape (capability list with id, name, summary, vector, tags, consumption_modes, discovery intent)
- Env: `LLM_CONNECT_URL` (default none); graceful error when missing on LLM paths
- Optional `pyproject.toml` extra: `llm` → `llm-connect` dependency
- Pytest with mocked HTTP (no live LLM in CI)
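A thin client along these lines might look like the sketch below, using only the standard library. It is not the actual `llm_bridge.py`: the request payload and the response envelope (a JSON object whose `content` field holds a JSON string) are assumptions, as are the names `execute` and `LLMBridgeError`. The `opener` parameter exists so tests can inject a fake instead of making a live call.

```python
import json
import os
import urllib.request
from typing import Any, Callable, Optional


class LLMBridgeError(RuntimeError):
    """Raised when the bridge is unconfigured or the response is unusable."""


def execute(
    payload: dict,
    *,
    base_url: Optional[str] = None,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 30.0,
) -> Any:
    """POST payload to {LLM_CONNECT_URL}/execute and parse JSON from the content."""
    url = base_url or os.environ.get("LLM_CONNECT_URL")
    if not url:
        raise LLMBridgeError("LLM_CONNECT_URL is not set; LLM-assisted paths are unavailable")

    request = urllib.request.Request(
        url.rstrip("/") + "/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))

    # Assumption: the model output arrives as a JSON string under "content".
    content = body.get("content") if isinstance(body, dict) else None
    if not isinstance(content, str):
        raise LLMBridgeError("response has no string 'content' field")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        raise LLMBridgeError(f"response content is not valid JSON: {exc}") from exc
```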
Implement establish --discover
id: REUSE-WP-0013-T05
status: done
priority: medium
state_hub_task_id: "b9154889-f538-4266-9918-b277f9a297be"
LLM-assisted bootstrap after --scaffold or on empty registry:
- Collect context files: `INTENT.md`, `SCOPE.md`, `README*`, `pyproject.toml`, `AGENTS.md`, top-level package dirs (configurable `--context-max-files`)
- Prompt template + schema-constrained JSON via llm-connect
- `--dry-run`: print proposed entries and index rows
- `--apply`: write `registry/capabilities/*.md` from template merge + index update; run validate before success exit
- Document prompt assumptions and review checklist in `registry/README.md`
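The context-collection step can be sketched as a bounded file walk. This is an illustration only: `collect_context_files` is a hypothetical name, the default limit of 20 is an assumed value for `--context-max-files`, and representing each top-level package by its `__init__.py` is one possible reading of "top-level package dirs".

```python
from pathlib import Path
from typing import List

# File patterns named in the workplan, in priority order.
CONTEXT_PATTERNS = ("INTENT.md", "SCOPE.md", "README*", "pyproject.toml", "AGENTS.md")


def collect_context_files(root: Path, max_files: int = 20) -> List[Path]:
    """Return up to max_files context files found directly under root."""
    found: List[Path] = []
    for pattern in CONTEXT_PATTERNS:
        for path in sorted(root.glob(pattern)):
            if path.is_file() and path not in found:
                found.append(path)
    # Assumption: a top-level package is a directory containing __init__.py.
    for init_file in sorted(root.glob("*/__init__.py")):
        found.append(init_file)
    return found[:max_files]
```

Ordering the patterns by priority means that a tight limit drops package files before it drops intent and scope documents.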
Implement update Command
id: REUSE-WP-0013-T06
status: done
priority: medium
state_hub_task_id: "b79558da-54b2-4712-91d2-b298c7cf2c40"
Refresh existing entries from repo signals:
Deterministic (no LLM):
- Index/entry vector mismatch detection
- New paths under `tests/`, CLI modules → suggest `evidence.tests` / `availability.current_artifacts` additions
- `--apply` for safe deterministic patches only (explicit list in code)
Optional LLM (--suggest, --suggest-maturity):
- Git diff or file snapshot → proposed `promotion_history`, evidence notes
- Always `--dry-run` unless `--apply`; validate after apply
Targets: single --capability, --all, --from-git-since <ref>.
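The deterministic index/entry mismatch check could be as simple as the sketch below. It assumes both sides have already been loaded into dictionaries that map a capability id to its vector string. `find_vector_drift` and the shape of the report are hypothetical.

```python
from typing import Dict, List


def normalize_vector(text: str) -> str:
    """Make 'D5 / A4 / C5 / R3' and 'd5/a4/c5/r3' compare equal."""
    return "/".join(part.strip().upper() for part in text.split("/"))


def find_vector_drift(
    index_vectors: Dict[str, str], entry_vectors: Dict[str, str]
) -> List[dict]:
    """Report ids whose vectors differ or that exist on only one side."""
    drift = []
    for cap_id in sorted(set(index_vectors) | set(entry_vectors)):
        in_index = index_vectors.get(cap_id)
        in_entry = entry_vectors.get(cap_id)
        if in_index is None or in_entry is None:
            issue = "missing"
        elif normalize_vector(in_index) != normalize_vector(in_entry):
            issue = "mismatch"
        else:
            continue
        drift.append({"id": cap_id, "issue": issue, "index": in_index, "entry": in_entry})
    return drift
```

A report like this is enough for `--dry-run`; whether a given mismatch is patched on `--apply` would still depend on the explicit safe-patch list.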
Documentation, Tests, And Gap Note
id: REUSE-WP-0013-T07
status: done
priority: low
state_hub_task_id: "a55a2f26-004e-4c20-90cb-49bed64a1291"
- `tools/README.md`: full command reference
- `docs/RegistryFederation.md`: link establish/publish-check to sibling onboarding
- `docs/IntentScopeGapAnalysis.md`: add priority 24 (registry bootstrap tooling); mark open
- `SCOPE.md`: "What Is Possible Now" when T01–T03 ship (incremental)
- CI: `stats` informational step; no llm-connect in CI
- Total pytest increase; `reuse-surface validate` unchanged
Acceptance
- `reuse-surface stats` reports maturity and federation-readiness aggregates
- `establish --scaffold` creates valid empty registry layout without overwrite accidents
- `establish --publish-check` detects 303 vs 200 raw URL outcomes
- llm-connect bridge works with mocked HTTP; fails clearly when URL unset
- `establish --discover --dry-run` produces schema-valid draft JSON from fixture context
- `update --dry-run` reports deterministic drift on sample repo
- All new commands documented; gap priority 24 recorded
Completion notes (2026-06-17)
- Modules: `stats.py`, `establish.py`, `registry_update.py`, `llm_bridge.py`
- Schema: `schemas/registry-draft.schema.json`
- `validate --root` for sibling repo validation after `establish --apply`
- 43 pytest tests; optional `pip install -e ".[llm]"` extra
Out of scope
- Auto `hub register` (still operator step with token)
- Auto-merge LLM output without human review path
- Embedding similarity or overlap ML (keep `overlaps` token heuristic)
- llm-connect hosting or provider configuration inside reuse-surface
Dogfood target
Run establish --scaffold and establish --publish-check against state-hub
checkout when available; optional establish --discover to seed
capability.statehub.workstream-coordinate from existing docs.