Proposes deterministic bootstrap and analytics CLI plus optional llm-connect assist for sibling repo capability index publishing and registry maintenance.
| id | type | title | domain | repo | status | owner | topic_slug | created | updated |
|---|---|---|---|---|---|---|---|---|---|
| REUSE-WP-0013 | workplan | Registry establish, update, and stats with optional llm-connect assist | helix_forge | reuse-surface | ready | codex | helix-forge | 2026-06-16 | 2026-06-16 |
Registry establish, update, and stats with optional llm-connect assist
Follow-up to operator feedback and REUSE-WP-0012-T01 blocks
(history/2026-06-16-hub-registration-blocks.md). Sibling repos cannot federate
until each publishes registry/indexes/capabilities.yaml; today that requires
manual layout copy and authoring. This workplan adds deterministic bootstrap
and analytics CLI plus optional llm-connect-backed discovery and refresh
so domain repos can establish and maintain registries with low ceremony.
Baseline vector: D5 / A4 / C5 / R3
Target vector after completion: D5 / A4 / C5 / R3 (tooling depth; no
product-vector promotion unless dogfood evidence warrants)
Problem statement
| Pain | Today | Target |
|---|---|---|
| Bootstrap registry in sibling repo | Manual copy from reuse-surface | reuse-surface establish |
| Keep entries aligned with repo reality | Manual edits + validate | reuse-surface update |
| Portfolio / federation readiness view | report cohorts, manual curls | reuse-surface stats |
| Draft capability metadata from repo context | Agent improvisation | llm-connect structured extract |
Design principles
- Deterministic first: stats, establish --scaffold, and establish --publish-check work without llm-connect or API keys.
- LLM optional: establish --discover and update --suggest call llm-connect over HTTP (POST /execute) or through a library adapter when configured.
- Dry-run default: LLM and write paths require explicit --apply.
- Validate gate: every --apply path ends with reuse-surface validate (or fails with schema errors).
- Boundary: reuse-surface does not embed provider keys; llm-connect owns routing and credentials.
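
The dry-run default and the validate gate can be sketched together as one small wrapper. This is only an illustration under stated assumptions: run_write_path and its write and validate callbacks are hypothetical names, not part of reuse-surface.

```python
from typing import Callable, List


def run_write_path(
    planned_changes: List[str],
    apply: bool,
    write: Callable[[List[str]], None],
    validate: Callable[[], List[str]],
) -> int:
    """Return a process exit code for a write-capable command (hypothetical helper)."""
    if not apply:
        # Dry-run default: report the plan and touch nothing.
        for change in planned_changes:
            print(f"would apply: {change}")
        return 0
    write(planned_changes)
    # Validate gate: validation always runs after a write.
    errors = validate()
    for error in errors:
        print(f"schema error: {error}")
    return 1 if errors else 0
```

Passing the writer and validator in as callables keeps the gate testable without touching a real registry.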
Suggested execution order
T01 stats (deterministic, unblocks observability)
→ T02 establish --scaffold (unblocks sibling publish path)
→ T03 establish --publish-check (verifies raw URL / federation readiness)
→ T04 llm-connect bridge + draft JSON schema
→ T05 establish --discover
→ T06 update (deterministic signals + optional LLM suggest)
→ T07 docs, tests, gap-analysis note
Dependencies
| Dependency | Owner | Notes |
|---|---|---|
| llm-connect server or package | llm-connect | LLM_CONNECT_URL or editable install |
| Capability JSON schema | reuse-surface | schemas/capability.schema.yaml unchanged |
| Sibling repo apply | Domain owners | establish run in target repo checkout |
| Hub token | Operator | hub register remains separate post-publish |
Proposed CLI surface
# Deterministic
reuse-surface stats
reuse-surface stats --format json --federation-ready
reuse-surface establish --scaffold --domain helix_forge [--path .]
reuse-surface establish --publish-check --raw-url <gitea-raw-url>
# LLM-assisted (optional backend)
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface establish --discover [--path .] --dry-run
reuse-surface establish --discover --apply
reuse-surface update --capability <id> --dry-run
reuse-surface update --all --suggest-maturity --dry-run
reuse-surface update --from-git-since HEAD~5 --apply
Add Registry Stats Command
id: REUSE-WP-0013-T01
status: todo
priority: high
Deliver reuse-surface stats with deterministic aggregates:
- Capability count; maturity histogram (D/A/C/R bands)
- Entries at R0–R2 vs R3+; consumption mode counts
- Index vs entry vector drift count
- Federation readiness: local sources.yaml / index presence; optional --federation-ready raw URL probe (HTTP status)
- Hub summary when REUSE_SURFACE_URL set (registration count)
Output: Markdown by default; JSON with --format json. Add pytest coverage and document the command in tools/README.md.
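
A minimal sketch of the maturity aggregates, assuming each capability carries a vector string in the D5 / A4 / C5 / R3 form used in this workplan. The function names are hypothetical, and treating a missing R band as R0 is an assumption.

```python
import re
from collections import Counter
from typing import Dict, List

# Matches one band such as "D5" or "R3" inside a vector string.
_BAND = re.compile(r"([DACR])(\d+)")


def parse_vector(vector: str) -> Dict[str, int]:
    """Turn 'D5 / A4 / C5 / R3' into {'D': 5, 'A': 4, 'C': 5, 'R': 3}."""
    return {axis: int(level) for axis, level in _BAND.findall(vector)}


def maturity_histogram(vectors: List[str]) -> Dict[str, Counter]:
    """Count how many entries sit at each level, per axis."""
    histogram = {axis: Counter() for axis in "DACR"}
    for vector in vectors:
        for axis, level in parse_vector(vector).items():
            histogram[axis][level] += 1
    return histogram


def readiness_split(vectors: List[str]) -> Dict[str, int]:
    """Split entries into R0-R2 vs R3+ (a missing R band counts as R0: assumption)."""
    below = sum(1 for v in vectors if parse_vector(v).get("R", 0) <= 2)
    return {"R0-R2": below, "R3+": len(vectors) - below}
```

Both functions are pure, so the same aggregates can feed the Markdown and JSON renderers.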
Implement establish --scaffold
id: REUSE-WP-0013-T02
status: todo
priority: high
Bootstrap registry/ in target repo (--path, default cwd):
- registry/README.md (pointer to reuse-surface schema and validate)
- registry/capabilities/.gitkeep
- registry/indexes/capabilities.yaml with version, domain, capabilities: []
- Refuse overwrite unless --force
- Print next steps: add entry, validate, merge to main, publish-check
No llm-connect dependency. Pytest with temp directory.
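
The scaffold step could look roughly like this. It is a sketch, not the planned implementation: the scaffold function name, the README wording, and the version: 1 value are assumptions; only the three paths and the refuse-unless-forced rule come from the task description.

```python
from pathlib import Path
from typing import List

# "version: 1" is an assumed starting value; the workplan only names the key.
INDEX_TEMPLATE = "version: 1\ndomain: {domain}\ncapabilities: []\n"
README_TEXT = (
    "Registry for this repo. Schema and validation are owned by reuse-surface\n"
    "(schemas/capability.schema.yaml, reuse-surface validate).\n"
)


def scaffold(root: Path, domain: str, force: bool = False) -> List[Path]:
    """Create the empty registry layout under root; refuse to overwrite unless force."""
    registry = root / "registry"
    files = {
        registry / "README.md": README_TEXT,
        registry / "capabilities" / ".gitkeep": "",
        registry / "indexes" / "capabilities.yaml": INDEX_TEMPLATE.format(domain=domain),
    }
    existing = [path for path in files if path.exists()]
    if existing and not force:
        names = ", ".join(str(path) for path in existing)
        raise FileExistsError(f"refusing to overwrite: {names} (pass force to replace)")
    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
    return sorted(files)
```

Checking for existing files before writing anything means a refused run leaves the target repo untouched.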
Implement establish --publish-check
id: REUSE-WP-0013-T03
status: todo
priority: medium
Federation publish helper for sibling repo operators:
- curl-equivalent probe of --raw-url (status, content-type hint)
- Validate local index YAML if --path has registry/indexes/capabilities.yaml
- Report pass/fail with remediation tied to docs/RegistryFederation.md
- Exit non-zero on hard failures (non-200, invalid YAML)
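
The pass/fail decision for the raw URL probe can be kept separate from the HTTP call so it is testable offline. The sketch below assumes the probe has already produced a status code and a content-type header; classify_probe is a hypothetical name, and the remediation wording is illustrative only.

```python
from typing import List, Tuple


def classify_probe(status: int, content_type: str = "") -> Tuple[bool, List[str]]:
    """Return (ok, notes) for one raw URL probe result."""
    notes: List[str] = []
    if status != 200:
        if 300 <= status < 400:
            # A redirect may mean the raw URL needs a login or points at the wrong ref.
            notes.append(f"HTTP {status}: redirect instead of raw content")
        else:
            notes.append(f"HTTP {status}: raw URL did not return the index")
        notes.append("see docs/RegistryFederation.md for remediation")
        return False, notes
    if "html" in content_type.lower():
        # Hint only: a 200 with an HTML body is suspicious but not a hard failure here.
        notes.append("content-type looks like HTML; expected YAML or plain text")
    return True, notes


def exit_code(ok: bool) -> int:
    """Hard failures map to a non-zero exit status."""
    return 0 if ok else 1
```

With this split, the 303 vs 200 acceptance case becomes a plain unit test with no network access.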
Add llm-connect Bridge And Draft Schema
id: REUSE-WP-0013-T04
status: todo
priority: high
Thin client boundary:
- reuse_surface/llm_bridge.py: POST {LLM_CONNECT_URL}/execute, parse JSON from response content
- schemas/registry-draft.schema.json: structured draft shape (capability list with id, name, summary, vector, tags, consumption_modes, discovery intent)
- Env: LLM_CONNECT_URL (default none); graceful error when missing on LLM paths
- Optional pyproject.toml extra: llm → llm-connect dependency
- Pytest with mocked HTTP (no live LLM in CI)
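
A thin bridge along these lines might be enough for the HTTP path. Treat it as a sketch under assumptions: the request payload shape and the "content" field in the response envelope are guesses about llm-connect, not its documented API, and execute, LLMBridgeError, and the injectable post argument are hypothetical names.

```python
import json
import os
import urllib.request
from typing import Any, Callable, Dict, Mapping, Optional


class LLMBridgeError(RuntimeError):
    """Raised when the LLM path cannot run or returns unusable output."""


def _http_post(url: str, body: bytes) -> bytes:
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()


def execute(
    payload: Dict[str, Any],
    post: Callable[[str, bytes], bytes] = _http_post,
    env: Optional[Mapping[str, str]] = None,
) -> Any:
    """POST payload to {LLM_CONNECT_URL}/execute and parse JSON from the response content."""
    env = os.environ if env is None else env
    base_url = env.get("LLM_CONNECT_URL", "").strip()
    if not base_url:
        raise LLMBridgeError("LLM_CONNECT_URL is not set; LLM-assisted commands need it")
    raw = post(base_url.rstrip("/") + "/execute", json.dumps(payload).encode("utf-8"))
    try:
        envelope = json.loads(raw)
        # Assumed envelope: {"content": "<JSON text produced by the model>"}.
        return json.loads(envelope["content"])
    except (KeyError, TypeError, json.JSONDecodeError) as exc:
        raise LLMBridgeError(f"unexpected llm-connect response: {exc}") from exc
```

Injecting post and env lets pytest replace the transport with a fake, which matches the no-live-LLM-in-CI constraint.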
Implement establish --discover
id: REUSE-WP-0013-T05
status: todo
priority: medium
LLM-assisted bootstrap after --scaffold or on empty registry:
- Collect context files: INTENT.md, SCOPE.md, README*, pyproject.toml, AGENTS.md, top-level package dirs (configurable --context-max-files)
- Prompt template + schema-constrained JSON via llm-connect
- --dry-run: print proposed entries and index rows
- --apply: write registry/capabilities/*.md from template merge + index update; run validate before success exit
- Document prompt assumptions and review checklist in registry/README.md
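
Context collection for the discover step could be as simple as the sketch below. The file patterns come from the list above; the default cap of 20, the collect_context name, and the rule that a top-level package dir is one containing __init__.py are assumptions.

```python
from pathlib import Path
from typing import List

# Patterns named in the task; checked in this order so intent and scope come first.
CONTEXT_PATTERNS = ("INTENT.md", "SCOPE.md", "README*", "pyproject.toml", "AGENTS.md")


def collect_context(root: Path, max_files: int = 20) -> List[Path]:
    """Return at most max_files context paths from a repo checkout."""
    found: List[Path] = []
    for pattern in CONTEXT_PATTERNS:
        for path in sorted(root.glob(pattern)):
            if path.is_file() and path not in found:
                found.append(path)
    # Assumption: a top-level package dir is any direct child holding __init__.py.
    for init_file in sorted(root.glob("*/__init__.py")):
        found.append(init_file)
    return found[:max_files]
```

Sorting within each pattern keeps the prompt input stable across runs, which helps when reviewing dry-run output.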
Implement update Command
id: REUSE-WP-0013-T06
status: todo
priority: medium
Refresh existing entries from repo signals:
Deterministic (no LLM):
- Index/entry vector mismatch detection
- New paths under tests/, CLI modules → suggest evidence.tests / availability.current_artifacts additions
- --apply for safe deterministic patches only (explicit list in code)

Optional LLM (--suggest, --suggest-maturity):
- Git diff or file snapshot → proposed promotion_history, evidence notes
- Always --dry-run unless --apply; validate after apply
Targets: single --capability, --all, --from-git-since <ref>.
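
The deterministic drift check can be sketched as a comparison of two mappings from capability id to vector string. How the index and entry files are parsed is left out; vector_drift and the whitespace/case normalization are assumptions for illustration.

```python
from typing import Dict, List


def _normalize(vector: str) -> str:
    """Ignore spacing and case so 'D5 / A4' and 'd5/a4' compare equal."""
    return "".join(vector.split()).upper()


def vector_drift(index: Dict[str, str], entries: Dict[str, str]) -> List[str]:
    """Report ids whose index row and entry disagree, or that exist on one side only."""
    findings: List[str] = []
    for cap_id in sorted(set(index) | set(entries)):
        if cap_id not in entries:
            findings.append(f"{cap_id}: index row has no matching entry")
        elif cap_id not in index:
            findings.append(f"{cap_id}: entry has no index row")
        elif _normalize(index[cap_id]) != _normalize(entries[cap_id]):
            findings.append(
                f"{cap_id}: index {index[cap_id]!r} != entry {entries[cap_id]!r}"
            )
    return findings
```

The length of the returned list could double as the drift count reported by stats.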
Documentation, Tests, And Gap Note
id: REUSE-WP-0013-T07
status: todo
priority: low
- tools/README.md: full command reference
- docs/RegistryFederation.md: link establish/publish-check to sibling onboarding
- docs/IntentScopeGapAnalysis.md: add priority 24 (registry bootstrap tooling); mark open
- SCOPE.md: "What Is Possible Now" when T01–T03 ship (incremental)
- CI: stats informational step; no llm-connect in CI
- Total pytest increase; reuse-surface validate unchanged
Acceptance
- reuse-surface stats reports maturity and federation-readiness aggregates
- establish --scaffold creates valid empty registry layout without overwrite accidents
- establish --publish-check detects 303 vs 200 raw URL outcomes
- llm-connect bridge works with mocked HTTP; fails clearly when URL unset
- establish --discover --dry-run produces schema-valid draft JSON from fixture context
- update --dry-run reports deterministic drift on sample repo
- All new commands documented; gap priority 24 recorded
Out of scope
- Auto hub register (still operator step with token)
- Auto-merge LLM output without human review path
- Embedding similarity or overlap ML (keep overlaps token heuristic)
- llm-connect hosting or provider configuration inside reuse-surface
Dogfood target
Run establish --scaffold and establish --publish-check against state-hub
checkout when available; optional establish --discover to seed
capability.statehub.workstream-coordinate from existing docs.