Infospace Tooling — Stage 3 Close-out
Context
Stages 1 and 2 of the infospace tooling roadmap are complete. Stage 3 used the Wealth of Nations / VSM example to validate the tooling end-to-end. Most of S3 is done; this workstream finishes the remaining tasks, addresses deferred cleanup, and formally closes the roadmap.
Parent roadmap: roadmap/infospace-tooling/PLAN.md
Example location: examples/infospace-with-history/
State at workstream open (2026-02-26)
| Item | Status |
|---|---|
| S3.1 Migrate example to infospace config | ✅ Done |
| S3.3 Per-entity eval batch | ✅ 985/988 complete; metrics.yaml updated |
| S3.4 Tutorial rewrite | ✅ Done |
| S3.5 Supply-chain-vsm composition demo | ✅ Done |
| S3.2 Clean per-chapter git history | ⏳ Deferred — included here |
| 3 missing evaluations | ⏳ Outstanding |
| 4 follow-up items (commit b055c8d7) | ⏳ Outstanding |
Tasks
C.1 — Complete the 3 missing entity evaluations
985 of 988 entities have evaluation files. Identify and evaluate the remaining 3.
cd examples/infospace-with-history
# Identify missing slugs
comm -23 \
<(ls output/entities/*.md | xargs -I{} basename {} .md | sort) \
<(ls output/evaluations/*.md | xargs -I{} basename {} .md | sort)
# Evaluate each missing entity individually
markitect infospace evaluate --entity <slug> --provider openrouter
Acceptance: ls output/evaluations/*.md | wc -l returns 988.
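The identify-then-evaluate steps above can be sketched as a small set of POSIX shell helpers. This is a minimal sketch, not part of the markitect tooling: the function names (missing_slugs, count_md, evaluate_missing) are hypothetical, and the only markitect call is the evaluate command already shown above.

```shell
# Print the slug of every entity that has no evaluation file yet.
# Usage: missing_slugs <entities_dir> <evaluations_dir>
missing_slugs() {
    for entity in "$1"/*.md; do
        [ -e "$entity" ] || continue              # glob matched nothing
        slug=$(basename "$entity" .md)
        [ -e "$2/$slug.md" ] || printf '%s\n' "$slug"
    done
}

# Count the .md files in a directory (prints 0 when there are none).
count_md() {
    set -- "$1"/*.md
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

# Evaluate each missing entity, then compare the file count with the target.
# Defined but not called here: it needs the markitect CLI and provider access.
# Assumption: slugs contain no whitespace.
evaluate_missing() {
    for slug in $(missing_slugs output/entities output/evaluations); do
        markitect infospace evaluate --entity "$slug" --provider openrouter || return 1
    done
    [ "$(count_md output/evaluations)" -eq 988 ]
}
```

Run evaluate_missing from examples/infospace-with-history; a non-zero exit means either an evaluation failed or the count is still short of 988.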
C.2 — Run eval-summary and verify viability
Run the aggregation command to update per_entity_mean from all 988 evaluations, then check all 6 viability gates pass.
cd examples/infospace-with-history
unset OPENROUTER_API_KEY # stale env var guard
markitect infospace eval-summary --update-metrics
markitect infospace viability
Current sample reading (985 entities): per_entity_mean = 3.956 against threshold 3.5.
Expected: all 6 metrics pass.
Acceptance: markitect infospace viability exits 0 and shows 6/6 PASS.
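The acceptance check can be scripted so that it fails loudly in a close-out run. The helper below is a sketch under one assumption: that the viability output contains the literal text "6/6 PASS", as the acceptance line suggests. The function name gate_passes is hypothetical, and the match pattern should be adjusted if the real output differs.

```shell
# Run the given command; succeed only if it exits 0 AND its output
# contains "6/6 PASS" (assumed output format, see the note above).
gate_passes() {
    output=$("$@" 2>&1) || return 1    # non-zero exit fails the gate
    printf '%s\n' "$output"            # keep the report visible
    case $output in
        *"6/6 PASS"*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Intended use, from examples/infospace-with-history:
#   gate_passes markitect infospace viability
```

Checking both the exit code and the text guards against a run that exits 0 but reports fewer than six gates.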
C.3 — Refresh the metrics report
The metrics report was generated from chapters 1–4 only. Regenerate it from the full 988-entity set.
cd examples/infospace-with-history
markitect infospace check --provider openrouter # or reuse existing check outputs
markitect infospace history # confirm snapshot recorded
Acceptance: output/metrics/metrics.yaml reflects all 988 entities; a dated
snapshot exists in the metrics history.
C.4 — Document advanced usage patterns
Write examples/infospace-with-history/docs/advanced-usage.md covering:
- Incremental evaluation (adding entities after initial run, skip-if-exists behaviour)
- Re-evaluating after guideline changes (--force flag)
- Interpreting per-entity score distributions and identifying outliers
- Using markitect infospace entities --sort-by score to triage low scorers
- Reading and acting on collection check outputs (redundancy pairs, coverage gaps)
Acceptance: File exists with ≥ 4 documented patterns, each with a worked command example.
C.5 — Add composition examples to documentation
Document how the supply-chain-vsm example (examples/supply-chain-vsm/) demonstrates
composition. Add a docs/composition-guide.md covering:
- What composition means (discipline binding)
- How supply-chain-vsm binds WoN as a discipline
- How to create a new infospace that uses an existing one as a discipline
- Viability requirement: the discipline must pass its own thresholds before binding
Reference examples/supply-chain-vsm/ throughout.
Acceptance: docs/composition-guide.md exists and links to supply-chain-vsm.
C.6 — Performance benchmarking note
Rather than a full benchmarking guide (out of scope for a 988-entity example),
record observed timings in a docs/performance-notes.md:
- Eval batch duration (~4 hrs for 988 entities via OpenRouter)
- Tokens per entity (rough estimate from usage logs)
- Embedding cache hit rate after first run
- Recommendation: provider choice (OpenRouter vs Gemini) for different dataset sizes
Acceptance: File exists with at least 4 concrete measurements or estimates.
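As a worked example of turning the batch duration into a per-entity figure for the notes file, the approximate 4-hour run over 988 entities comes to roughly 14.6 seconds per entity. The helper name secs_per_entity is hypothetical, and the 4-hour input is the rough estimate quoted above, not a measured value.

```shell
# Average seconds per entity for a batch of <entities> taking <hours>.
# Usage: secs_per_entity <hours> <entities>
secs_per_entity() {
    awk -v h="$1" -v n="$2" 'BEGIN { printf "%.1f\n", h * 3600 / n }'
}

secs_per_entity 4 988    # 14400 s / 988 entities, prints 14.6
```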
C.7 — S3.2: Clean per-chapter git history (deferred cleanup)
Create a clean branch where each of the 35 processed chapters has its own commit.
Chapters 1–8 are already done on branch clean-example-history; 27 remain.
This is a cosmetic/archival task — it does not change output files.
git checkout clean-example-history
# For each remaining chapter (9–35):
# cherry-pick or re-commit the chapter output files with a per-chapter message
git log --oneline clean-example-history # verify 35 chapter commits
Acceptance: Branch clean-example-history has exactly 35 chapter commits
(one per chapter), rebased onto current main.
Note: This task can be done independently of C.1–C.6. Low urgency — do last.
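Verifying "exactly 35 chapter commits" by eye is error-prone, so the check can be sketched as a filter over git log --oneline output. This assumes a hypothetical commit-message convention in which each per-chapter subject contains "chapter N" (case-insensitive); the plan does not specify the message format, so adapt the pattern to whatever is actually used. The function name check_chapter_commits is also hypothetical.

```shell
# Read `git log --oneline` output on stdin and report every chapter in
# 1..<total> that does not have exactly one commit. Exits 0 only when
# each chapter appears exactly once. Lines without "chapter N" are ignored.
check_chapter_commits() {
    awk -v total="$1" '
        {
            line = tolower($0)
            if (match(line, /chapter [0-9]+/)) {
                # skip the 8 characters of "chapter " to reach the number
                n = substr(line, RSTART + 8, RLENGTH - 8) + 0
                seen[n]++
            }
        }
        END {
            bad = 0
            for (i = 1; i <= total; i++) {
                if (seen[i] != 1) {
                    printf "chapter %d: %d commits\n", i, seen[i]
                    bad = 1
                }
            }
            exit bad
        }'
}

# Intended use:
#   git log --oneline clean-example-history | check_chapter_commits 35
```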
C.8 — Formally close the S3 roadmap
Update roadmap/infospace-tooling/PLAN.md to mark all S3 tasks as complete.
Add a close-out summary at the top of the file with final metrics and date.
Commit with a docs(roadmap) message.
Acceptance: PLAN.md header shows all stages complete; committed to main.
Task order
C.1 → C.2 → C.3
↓
C.4, C.5, C.6 (parallel)
↓
C.8
C.7 (independent, do last)
Out of scope
- Adding new entities or chapters (the WoN example is complete at 988 entities)
- Re-running collection checks from scratch (existing results are valid)
- Publishing the example as a standalone dataset