markitect-main/roadmap/infospace-s3-closeout/PLAN.md
tegwick 9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap
All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 07:08:43 +02:00

# Infospace Tooling — Stage 3 Close-out
## Context
Stages 1 and 2 of the infospace tooling roadmap are complete. Stage 3 used the
Wealth of Nations / VSM example to validate the tooling end-to-end. Most of S3
is done; this workstream finishes the remaining tasks, addresses deferred cleanup,
and formally closes the roadmap.
**Parent roadmap:** `roadmap/infospace-tooling/PLAN.md`
**Example location:** `examples/infospace-with-history/`
**Status: CLOSED (2026-04-22).** All acceptance criteria except the cosmetic
per-chapter history (C.7) are met. Final metrics: 988 entities, 988 evaluations,
6/6 viability thresholds PASS (`per_entity_mean = 3.957`). Tooling work that
came out of this close-out landed as commits `c0615c2d` (gemini retry,
unified skip-existing, non-destructive metrics I/O) and `d44a4cd3`
(`infospace entity` lookup, `evaluate --model-fallback`, `llm-check`
stale-key advisory, `build_state` type guard).
### State at workstream open (2026-02-26)
| Item | Status |
|------|--------|
| S3.1 Migrate example to infospace config | ✅ Done |
| S3.3 Per-entity eval batch | ✅ 985/988 complete; metrics.yaml updated |
| S3.4 Tutorial rewrite | ✅ Done |
| S3.5 Supply-chain-vsm composition demo | ✅ Done |
| S3.2 Clean per-chapter git history | ⏳ Deferred — included here |
| 3 missing evaluations | ⏳ Outstanding |
| 4 follow-up items (commit b055c8d7) | ⏳ Outstanding |
### State at workstream close (2026-04-22)
| Task | Status |
|------|--------|
| C.1 Complete 3 missing entity evaluations | ✅ Done (commit f325f89d) |
| C.2 Run eval-summary and verify viability | ✅ Done — 6/6 PASS |
| C.3 Refresh metrics report (988 entities) | ✅ Done — snapshot `090bb961` |
| C.4 Document advanced usage patterns | ✅ Done — `examples/infospace-with-history/docs/advanced-usage.md` |
| C.5 Composition-examples documentation | ✅ Done — `docs/composition-guide.md` |
| C.6 Performance benchmarking note | ✅ Done — `examples/infospace-with-history/docs/performance-notes.md` |
| C.7 Clean per-chapter git history | ⏭️ Deferred indefinitely — see note below |
| C.8 Formally close S3 roadmap | ✅ This commit |
**C.7 disposition.** The task assumed a pre-existing `clean-example-history`
branch with chapters 1–8 already committed; that branch no longer exists in
the repo. The task is explicitly cosmetic ("does not change output files"),
and the output files themselves are canonical. Reconstructing a 35-commit
per-chapter history from scratch would be archaeological rather than useful.
Closing as "won't do" unless a specific archival need surfaces. If revisited,
entities can be grouped by their `## Source Chapter` markdown section to
reconstruct chapter membership.
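
If the task is ever revisited, the grouping step could look roughly like the sketch below. It assumes that the first non-blank line under each entity's `## Source Chapter` heading names the chapter, which this plan does not confirm. `list_entity_chapters` is a hypothetical helper, not part of markitect.

```bash
# Sketch: print "<chapter><TAB><slug>" for every entity file in a directory.
# Assumption: the first non-blank line after "## Source Chapter" is the chapter name.
list_entity_chapters() {
  for file in "$1"/*.md; do
    [ -e "$file" ] || continue   # skip the unexpanded glob when the directory is empty
    awk -v slug="$(basename "$file" .md)" '
      found && NF { printf "%s\t%s\n", $0, slug; exit }
      /^## Source Chapter[[:space:]]*$/ { found = 1 }
    ' "$file"
  done | LC_ALL=C sort
}

# Usage (from examples/infospace-with-history): entity count per chapter
#   list_entity_chapters output/entities | cut -f1 | uniq -c
```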
---
## Tasks
### C.1 — Complete the 3 missing entity evaluations
985 of 988 entities have evaluation files. Identify and evaluate the remaining 3.
```bash
cd examples/infospace-with-history
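# Identify entity slugs that have no matching evaluation file
missing=$(comm -23 \
  <(ls output/entities/*.md | xargs -I{} basename {} .md | sort) \
  <(ls output/evaluations/*.md | xargs -I{} basename {} .md | sort))
echo "$missing"
# Evaluate each missing entity individually (slugs are assumed to contain no whitespace)
for slug in $missing; do
  markitect infospace evaluate --entity "$slug" --provider openrouter
done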
```
**Acceptance:** `ls output/evaluations/*.md | wc -l` returns 988.
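
The acceptance check above can be wrapped so that it fails loudly in a script. This is only a sketch: `count_md` and `check_eval_count` are hypothetical helper names, and `find -maxdepth` is assumed to be available (GNU and BSD `find` both support it).

```bash
# Sketch: count the .md files directly inside a directory.
count_md() {
  find "$1" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
}

# Succeed only when the directory holds exactly the expected number of files.
check_eval_count() {
  actual=$(count_md "$1")
  if [ "$actual" -eq "$2" ]; then
    echo "OK: $actual evaluation files"
  else
    echo "MISMATCH: expected $2, found $actual" >&2
    return 1
  fi
}

# Usage (from examples/infospace-with-history):
#   check_eval_count output/evaluations 988
```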
---
### C.2 — Run eval-summary and verify viability
Run the aggregation command to update `per_entity_mean` from all 988 evaluations,
then check that all 6 viability gates pass.
```bash
cd examples/infospace-with-history
unset OPENROUTER_API_KEY # stale env var guard
markitect infospace eval-summary --update-metrics
markitect infospace viability
```
Current sample reading (985 entities): `per_entity_mean = 3.956` against threshold 3.5.
Expected: all 6 metrics pass.
**Acceptance:** `markitect infospace viability` exits 0 and shows 6/6 PASS.
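
For a quick manual sanity check of one gate, the comparison can be sketched as below. It assumes a metric passes when its value is at least the threshold; the actual rule applied by `markitect infospace viability` is not stated in this plan, and `meets_threshold` is a hypothetical helper.

```bash
# Sketch: exit 0 when a metric value meets or exceeds its threshold.
# Assumption: "pass" means value >= threshold.
meets_threshold() {
  awk -v value="$1" -v threshold="$2" \
    'BEGIN { exit !(value + 0 >= threshold + 0) }'
}

# Usage, with the sample reading quoted above:
#   meets_threshold 3.956 3.5 && echo "per_entity_mean: PASS"
```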
---
### C.3 — Refresh the metrics report
The metrics report was generated from chapters 1–4 only. Regenerate it from
the full 988-entity set.
```bash
cd examples/infospace-with-history
markitect infospace check --provider openrouter # or reuse existing check outputs
markitect infospace history # confirm snapshot recorded
```
**Acceptance:** `output/metrics/metrics.yaml` reflects all 988 entities; a dated
snapshot exists in the metrics history.
---
### C.4 — Document advanced usage patterns
Write `examples/infospace-with-history/docs/advanced-usage.md` covering:
- Incremental evaluation (adding entities after initial run, skip-if-exists behaviour)
- Re-evaluating after guideline changes (`--force` flag)
- Interpreting per-entity score distributions and identifying outliers
- Using `markitect infospace entities --sort-by score` to triage low scorers
- Reading and acting on collection check outputs (redundancy pairs, coverage gaps)
**Acceptance:** File exists with ≥ 4 documented patterns, each with a worked command example.
---
### C.5 — Add composition examples to documentation
Document how the supply-chain-vsm example (`examples/supply-chain-vsm/`) demonstrates
composition. Add a `docs/composition-guide.md` covering:
- What composition means (discipline binding)
- How supply-chain-vsm binds WoN as a discipline
- How to create a new infospace that uses an existing one as a discipline
- Viability requirement: the discipline must pass its own thresholds before binding
Reference `examples/supply-chain-vsm/` throughout.
**Acceptance:** `docs/composition-guide.md` exists and links to supply-chain-vsm.
---
### C.6 — Performance benchmarking note
Rather than a full benchmarking guide (out of scope for a 988-entity example),
record observed timings in `docs/performance-notes.md`:
- Eval batch duration (~4 hrs for 988 entities via OpenRouter)
- Tokens per entity (rough estimate from usage logs)
- Embedding cache hit rate after first run
- Recommendation: provider choice (OpenRouter vs Gemini) for different dataset sizes
**Acceptance:** File exists with at least 4 concrete measurements or estimates.
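
As a worked example of turning the batch duration into a per-entity figure, the sketch below divides wall-clock time by entity count. The 4-hour input is the approximate figure quoted above, so the result is a rough estimate only; `seconds_per_entity` is a hypothetical helper.

```bash
# Sketch: average wall-clock seconds per entity for an evaluation batch.
seconds_per_entity() {
  awk -v hours="$1" -v entities="$2" \
    'BEGIN { printf "%.1f\n", hours * 3600 / entities }'
}

# Usage: roughly 4 hours for 988 entities
#   seconds_per_entity 4 988   # prints 14.6
```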
---
### C.7 — S3.2: Clean per-chapter git history (deferred cleanup)
Create a clean branch where each of the 35 processed chapters has its own commit.
Chapters 1–8 are already done on branch `clean-example-history`; 27 remain.
This is a cosmetic/archival task — it does not change output files.
```bash
git checkout clean-example-history
# For each remaining chapter (9–35):
# cherry-pick or re-commit the chapter output files with a per-chapter message
git log --oneline clean-example-history # verify 35 chapter commits
```
**Acceptance:** Branch `clean-example-history` has exactly 35 chapter commits
(one per chapter), rebased onto current main.
**Note:** This task can be done independently of C.1–C.6. Low urgency; do last.
---
### C.8 — Formally close the S3 roadmap
Update `roadmap/infospace-tooling/PLAN.md` to mark all S3 tasks as complete.
Add a close-out summary at the top of the file with final metrics and date.
Commit with a `docs(roadmap)` message.
**Acceptance:** PLAN.md header shows all stages complete; committed to main.
---
## Task order
```
C.1 → C.2 → C.3
C.4, C.5, C.6 (parallel)
C.8
C.7 (independent, do last)
```
## Out of scope
- Adding new entities or chapters (the WoN example is complete at 988 entities)
- Re-running collection checks from scratch (existing results are valid)
- Publishing the example as a standalone dataset