reuse-surface/workplans/REUSE-WP-0013-registry-establish-and-llm-assist.md
tegwick c81c18c607
Add REUSE-WP-0013 workplan for registry establish, update, and stats
Proposes deterministic bootstrap and analytics CLI plus optional llm-connect
assist for sibling repo capability index publishing and registry maintenance.
2026-06-16 01:12:34 +02:00

id: REUSE-WP-0013
type: workplan
title: Registry establish, update, and stats with optional llm-connect assist
domain: helix_forge
repo: reuse-surface
status: ready
owner: codex
topic_slug: helix-forge
created: 2026-06-16
updated: 2026-06-16

Registry establish, update, and stats with optional llm-connect assist

Follow-up to operator feedback and REUSE-WP-0012-T01 blocks (history/2026-06-16-hub-registration-blocks.md). Sibling repos cannot federate until each publishes registry/indexes/capabilities.yaml; today that requires manual layout copy and authoring. This workplan adds deterministic bootstrap and analytics CLI plus optional llm-connect-backed discovery and refresh so domain repos can establish and maintain registries with low ceremony.

Baseline vector: D5 / A4 / C5 / R3
Target vector after completion: D5 / A4 / C5 / R3 (tooling depth; no product-vector promotion unless dogfood evidence warrants)

Problem statement

| Pain | Today | Target |
| --- | --- | --- |
| Bootstrap registry in sibling repo | Manual copy from reuse-surface | reuse-surface establish |
| Keep entries aligned with repo reality | Manual edits + validate | reuse-surface update |
| Portfolio / federation readiness view | report cohorts, manual curls | reuse-surface stats |
| Draft capability metadata from repo context | Agent improvisation | llm-connect structured extract |

Design principles

  1. Deterministic first: stats, establish --scaffold, and establish --publish-check work without llm-connect or API keys.
  2. LLM optional: establish --discover and update --suggest call llm-connect over HTTP (POST /execute) or through a library adapter when configured.
  3. Dry-run default: LLM and write paths require explicit --apply.
  4. Validate gate: every --apply path ends with reuse-surface validate (or fails with schema errors).
  5. Boundary: reuse-surface does not embed provider keys; llm-connect owns routing and credentials.
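
The dry-run default and the validate gate can be pictured as one small wrapper around any write-capable command. This is a minimal sketch, not the planned implementation: `run_write_path`, its parameters, and the printed messages are hypothetical names chosen for illustration.

```python
from typing import Callable, List


def run_write_path(
    planned_changes: List[str],
    apply: bool,
    write: Callable[[List[str]], None],
    validate: Callable[[], List[str]],
) -> int:
    """Return a process exit code for a write-capable command (hypothetical helper)."""
    if not apply:
        # Dry-run default: report what would change and touch nothing.
        for change in planned_changes:
            print(f"would apply: {change}")
        return 0

    write(planned_changes)

    # Validate gate: an apply only succeeds when validation reports no errors.
    errors = validate()
    for error in errors:
        print(f"schema error: {error}")
    return 1 if errors else 0
```

Passing `write` and `validate` as callables keeps the gate testable without touching a real registry.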

Suggested execution order

T01 stats (deterministic, unblocks observability)
  → T02 establish --scaffold (unblocks sibling publish path)
  → T03 establish --publish-check (verifies raw URL / federation readiness)
  → T04 llm-connect bridge + draft JSON schema
  → T05 establish --discover
  → T06 update (deterministic signals + optional LLM suggest)
  → T07 docs, tests, gap-analysis note

Dependencies

| Dependency | Owner | Notes |
| --- | --- | --- |
| llm-connect server or package | llm-connect | LLM_CONNECT_URL or editable install |
| Capability JSON schema | reuse-surface | schemas/capability.schema.yaml unchanged |
| Sibling repo apply | Domain owners | establish run in target repo checkout |
| Hub token | Operator | hub register remains separate post-publish |

Proposed CLI surface

# Deterministic
reuse-surface stats
reuse-surface stats --format json --federation-ready
reuse-surface establish --scaffold --domain helix_forge [--path .]
reuse-surface establish --publish-check --raw-url <gitea-raw-url>

# LLM-assisted (optional backend)
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface establish --discover [--path .] --dry-run
reuse-surface establish --discover --apply
reuse-surface update --capability <id> --dry-run
reuse-surface update --all --suggest-maturity --dry-run
reuse-surface update --from-git-since HEAD~5 --apply
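
One way the proposed surface could map onto `argparse` is sketched below. The flag names come from the listing above; the grouping into mutually exclusive sets, the `markdown` choice value, and the helper names `build_parser` and `_add_write_flags` are assumptions, not a committed design.

```python
import argparse


def _add_write_flags(parser: argparse.ArgumentParser) -> None:
    # Dry-run is the default; --apply must be passed explicitly.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--dry-run", action="store_true")
    group.add_argument("--apply", action="store_true")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="reuse-surface")
    sub = parser.add_subparsers(dest="command", required=True)

    stats = sub.add_parser("stats")
    stats.add_argument("--format", choices=["markdown", "json"], default="markdown")
    stats.add_argument("--federation-ready", action="store_true")

    establish = sub.add_parser("establish")
    mode = establish.add_mutually_exclusive_group(required=True)
    mode.add_argument("--scaffold", action="store_true")
    mode.add_argument("--publish-check", action="store_true")
    mode.add_argument("--discover", action="store_true")
    establish.add_argument("--domain")
    establish.add_argument("--path", default=".")
    establish.add_argument("--raw-url")
    establish.add_argument("--force", action="store_true")
    _add_write_flags(establish)

    update = sub.add_parser("update")
    target = update.add_mutually_exclusive_group(required=True)
    target.add_argument("--capability")
    target.add_argument("--all", action="store_true")
    target.add_argument("--from-git-since")
    update.add_argument("--suggest", action="store_true")
    update.add_argument("--suggest-maturity", action="store_true")
    _add_write_flags(update)

    return parser
```

For example, `build_parser().parse_args(["update", "--all", "--suggest-maturity", "--dry-run"])` yields a namespace with `all`, `suggest_maturity`, and `dry_run` set and `apply` left false.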

Add Registry Stats Command

id: REUSE-WP-0013-T01
status: todo
priority: high

Deliver reuse-surface stats with deterministic aggregates:

  • Capability count; maturity histogram (D/A/C/R bands)
  • Entries at R0-R2 vs R3+; consumption mode counts
  • Index vs entry vector drift count
  • Federation readiness: local sources.yaml / index presence; optional --federation-ready raw URL probe (HTTP status)
  • Hub summary when REUSE_SURFACE_URL set (registration count)

Output: Markdown default, --format json. Pytest coverage. Document in tools/README.md.
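
As a rough illustration of the maturity aggregates, the sketch below builds a per-axis histogram and the R0-R2 vs R3+ split. It assumes vectors are stored as strings in the same shape as the baseline vector line (for example `D5 / A4 / C5 / R3`); the function names and the output keys are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable

VECTOR_RE = re.compile(r"([DACR])(\d+)")


def parse_vector(vector: str) -> Dict[str, int]:
    """Parse a string such as 'D5 / A4 / C5 / R3' into axis -> level."""
    return {axis: int(level) for axis, level in VECTOR_RE.findall(vector)}


def maturity_stats(vectors: Iterable[str]) -> dict:
    histogram = {axis: Counter() for axis in "DACR"}
    count = low_r = high_r = 0
    for vector in vectors:
        parsed = parse_vector(vector)
        count += 1
        for axis, level in parsed.items():
            histogram[axis][level] += 1
        # Assumption: an entry with no R component counts as below R3.
        if parsed.get("R", 0) >= 3:
            high_r += 1
        else:
            low_r += 1
    return {
        "capability_count": count,
        "histogram": {axis: dict(levels) for axis, levels in histogram.items()},
        "r0_r2": low_r,
        "r3_plus": high_r,
    }
```

The returned dictionary contains only plain types, so it can be handed to `json.dumps` for a `--format json` style output or rendered as Markdown.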

Implement establish --scaffold

id: REUSE-WP-0013-T02
status: todo
priority: high

Bootstrap registry/ in target repo (--path, default cwd):

  • registry/README.md (pointer to reuse-surface schema and validate)
  • registry/capabilities/.gitkeep
  • registry/indexes/capabilities.yaml with version, domain, capabilities: []
  • Refuse overwrite unless --force
  • Print next steps: add entry, validate, merge to main, publish-check

No llm-connect dependency. Pytest with temp directory.
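
A minimal sketch of the scaffold step, using only the standard library. The three file paths follow the list above; the README text, the `version: 1` value, and the `scaffold` function itself are placeholders assumed for illustration.

```python
from pathlib import Path
from typing import List

# Assumption: the real index may carry a different version value or extra keys.
INDEX_TEMPLATE = "version: 1\ndomain: {domain}\ncapabilities: []\n"
README_TEXT = "Registry entries follow the reuse-surface schema; run reuse-surface validate.\n"


def scaffold(root: Path, domain: str, force: bool = False) -> List[Path]:
    """Create an empty registry layout under root, refusing to overwrite by default."""
    registry = root / "registry"
    files = {
        registry / "README.md": README_TEXT,
        registry / "capabilities" / ".gitkeep": "",
        registry / "indexes" / "capabilities.yaml": INDEX_TEMPLATE.format(domain=domain),
    }
    existing = [path for path in files if path.exists()]
    if existing and not force:
        names = ", ".join(str(path) for path in existing)
        raise FileExistsError(f"refusing to overwrite without --force: {names}")
    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
    return list(files)
```

Checking for existing files before writing anything means a refused run leaves the target checkout untouched.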

Implement establish --publish-check

id: REUSE-WP-0013-T03
status: todo
priority: medium

Federation publish helper for sibling repo operators:

  • curl-equivalent probe of --raw-url (status, content-type hint)
  • Validate local index YAML if --path has registry/indexes/capabilities.yaml
  • Report pass/fail with remediation tied to docs/RegistryFederation.md
  • Exit non-zero on hard failures (non-200, invalid YAML)
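
The raw URL probe could look roughly like this with `urllib`. Redirects are deliberately not followed so that a 3xx response is reported as its own status instead of silently resolving to another page. The `probe` and `classify` names and the message wording are assumptions.

```python
import urllib.error
import urllib.request
from typing import Tuple


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def probe(raw_url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Return (HTTP status, Content-Type header) without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(raw_url, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Type", "")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Content-Type", "")


def classify(status: int, content_type: str) -> Tuple[bool, str]:
    """Turn a probe result into (passed, message)."""
    if status == 200:
        if "html" in content_type.lower():
            return True, "pass (HTML content type: confirm this is a raw file URL)"
        return True, "pass"
    kind = "redirect" if 300 <= status < 400 else "error"
    return False, f"fail: HTTP {status} {kind}; see docs/RegistryFederation.md"
```

A caller would exit non-zero whenever `classify(*probe(url))` reports a failure.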

Add llm-connect Bridge And Draft Schema

id: REUSE-WP-0013-T04
status: todo
priority: high

Thin client boundary:

  • reuse_surface/llm_bridge.py: POST {LLM_CONNECT_URL}/execute, parse JSON from response content
  • schemas/registry-draft.schema.json: structured draft shape (capability list with id, name, summary, vector, tags, consumption_modes, discovery intent)
  • Env: LLM_CONNECT_URL (default none); graceful error when missing on LLM paths
  • Optional pyproject.toml extra: llm (llm-connect dependency)
  • Pytest with mocked HTTP (no live LLM in CI)
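
The thin client boundary might be sketched as follows. The request payload and the response envelope are assumptions: this workplan does not specify what llm-connect returns from `/execute`, so the sketch supposes a JSON object whose `content` field holds the model output as a JSON string. The `execute` function, the `post` parameter, and the exception class are hypothetical names.

```python
import json
import os
import urllib.request
from typing import Callable


class LLMConnectUnavailable(RuntimeError):
    """Raised when an LLM path is requested but no backend is configured."""


def _http_post(url: str, body: bytes) -> bytes:
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()


def execute(payload: dict, post: Callable[[str, bytes], bytes] = _http_post) -> dict:
    base_url = os.environ.get("LLM_CONNECT_URL")
    if not base_url:
        raise LLMConnectUnavailable(
            "LLM_CONNECT_URL is not set; LLM-assisted paths are disabled"
        )
    raw = post(base_url.rstrip("/") + "/execute", json.dumps(payload).encode("utf-8"))
    envelope = json.loads(raw)
    # Assumption: model output arrives as a JSON string under "content".
    return json.loads(envelope["content"])
```

Injecting `post` lets tests substitute a fake transport, which matches the requirement that CI never calls a live LLM.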

Implement establish --discover

id: REUSE-WP-0013-T05
status: todo
priority: medium

LLM-assisted bootstrap after --scaffold or on empty registry:

  • Collect context files: INTENT.md, SCOPE.md, README*, pyproject.toml, AGENTS.md, top-level package dirs (configurable --context-max-files)
  • Prompt template + schema-constrained JSON via llm-connect
  • --dry-run: print proposed entries and index rows
  • --apply: write registry/capabilities/*.md from template merge + index update; run validate before success exit
  • Document prompt assumptions and review checklist in registry/README.md
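
Context collection for the discover prompt could be as simple as the sketch below. The file names and their order follow the list above; treating a top-level package dir as any directory containing `__init__.py`, the default cap of 20, and the `collect_context` name are assumptions.

```python
from pathlib import Path
from typing import List


def collect_context(root: Path, max_files: int = 20) -> List[Path]:
    """Return existing context paths in priority order, capped at max_files."""
    candidates = [root / "INTENT.md", root / "SCOPE.md"]
    candidates += sorted(root.glob("README*"))
    candidates += [root / "pyproject.toml", root / "AGENTS.md"]
    # Assumption: a top-level package dir is a directory holding __init__.py.
    candidates += sorted(
        path for path in root.iterdir()
        if path.is_dir() and (path / "__init__.py").exists()
    )
    found = [path for path in candidates if path.exists()]
    return found[:max_files]
```

Ordering by priority before truncating means a small `--context-max-files` value still keeps the intent and scope documents.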

Implement update Command

id: REUSE-WP-0013-T06
status: todo
priority: medium

Refresh existing entries from repo signals:

Deterministic (no LLM):

  • Index/entry vector mismatch detection
  • New paths under tests/, CLI modules → suggest evidence.tests / availability.current_artifacts additions
  • --apply for safe deterministic patches only (explicit list in code)

Optional LLM (--suggest, --suggest-maturity):

  • Git diff or file snapshot → proposed promotion_history, evidence notes
  • Always --dry-run unless --apply; validate after apply

Targets: single --capability, --all, --from-git-since <ref>.
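
The deterministic index/entry mismatch check can be sketched as a comparison of two mappings from capability id to vector string. How those mappings are loaded from the index and the entry files is left out; `normalize` and `vector_drift` are hypothetical helpers, and ignoring whitespace and case differences is an assumption about what should count as drift.

```python
from typing import Dict, List, Tuple


def normalize(vector: str) -> str:
    """Strip whitespace and case so 'D5 / A4' and 'd5/a4' compare equal."""
    return "".join(vector.split()).upper()


def vector_drift(
    index_vectors: Dict[str, str], entry_vectors: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (capability id, index vector, entry vector) for each mismatch."""
    drift = []
    for cap_id in sorted(set(index_vectors) & set(entry_vectors)):
        index_value = index_vectors[cap_id]
        entry_value = entry_vectors[cap_id]
        if normalize(index_value) != normalize(entry_value):
            drift.append((cap_id, index_value, entry_value))
    return drift
```

Only ids present on both sides are compared here; entries missing from the index, or the reverse, would need a separate check.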

Documentation, Tests, And Gap Note

id: REUSE-WP-0013-T07
status: todo
priority: low

  • tools/README.md: full command reference
  • docs/RegistryFederation.md: link establish/publish-check to sibling onboarding
  • docs/IntentScopeGapAnalysis.md: add priority 24 (registry bootstrap tooling); mark open
  • SCOPE.md: update "What Is Possible Now" when T01-T03 ship (incremental)
  • CI: stats informational step; no llm-connect in CI
  • Total pytest count increases; reuse-surface validate behavior unchanged

Acceptance

  • reuse-surface stats reports maturity and federation-readiness aggregates
  • establish --scaffold creates valid empty registry layout without overwrite accidents
  • establish --publish-check detects 303 vs 200 raw URL outcomes
  • llm-connect bridge works with mocked HTTP; fails clearly when URL unset
  • establish --discover --dry-run produces schema-valid draft JSON from fixture context
  • update --dry-run reports deterministic drift on sample repo
  • All new commands documented; gap priority 24 recorded

Out of scope

  • Auto hub register (still operator step with token)
  • Auto-merge LLM output without human review path
  • Embedding similarity or overlap ML (keep overlaps token heuristic)
  • llm-connect hosting or provider configuration inside reuse-surface

Dogfood target

Run establish --scaffold and establish --publish-check against state-hub checkout when available; optional establish --discover to seed capability.statehub.workstream-coordinate from existing docs.