markitect-main/roadmap/infospace-s3-closeout/PLAN.md
tegwick 9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap
All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 07:08:43 +02:00


Infospace Tooling — Stage 3 Close-out

Context

Stages 1 and 2 of the infospace tooling roadmap are complete. Stage 3 used the Wealth of Nations / VSM example to validate the tooling end-to-end. Most of S3 is done; this workstream finishes the remaining tasks, addresses deferred cleanup, and formally closes the roadmap.

Parent roadmap: roadmap/infospace-tooling/PLAN.md
Example location: examples/infospace-with-history/

Status: CLOSED (2026-04-22). All acceptance criteria except the cosmetic per-chapter history (C.7) are met. Final metrics: 988 entities, 988 evaluations, 6/6 viability thresholds PASS (per_entity_mean = 3.957). Tooling work that came out of this close-out landed as commits c0615c2d (gemini retry, unified skip-existing, non-destructive metrics I/O) and d44a4cd3 (infospace entity lookup, evaluate --model-fallback, llm-check stale-key advisory, build_state type guard).

State at workstream open (2026-02-26)

| Item | Status |
| --- | --- |
| S3.1 Migrate example to infospace config | Done |
| S3.3 Per-entity eval batch | 985/988 complete; metrics.yaml updated |
| S3.4 Tutorial rewrite | Done |
| S3.5 Supply-chain-vsm composition demo | Done |
| S3.2 Clean per-chapter git history | Deferred; included here |
| 3 missing evaluations | Outstanding |
| 4 follow-up items (commit b055c8d7) | Outstanding |

State at workstream close (2026-04-22)

| Task | Status |
| --- | --- |
| C.1 Complete 3 missing entity evaluations | Done (commit f325f89d) |
| C.2 Run eval-summary and verify viability | Done (6/6 PASS) |
| C.3 Refresh metrics report (988 entities) | Done (snapshot 090bb961) |
| C.4 Document advanced usage patterns | Done (examples/infospace-with-history/docs/advanced-usage.md) |
| C.5 Composition-examples documentation | Done (docs/composition-guide.md) |
| C.6 Performance benchmarking note | Done (examples/infospace-with-history/docs/performance-notes.md) |
| C.7 Clean per-chapter git history | ⏭️ Deferred indefinitely (see note below) |
| C.8 Formally close S3 roadmap | This commit |

C.7 disposition. The task assumed a pre-existing clean-example-history branch with chapters 1–8 already committed; that branch no longer exists in the repo. The task is explicitly cosmetic ("does not change output files"), and the output files themselves are canonical. Reconstructing a 35-commit per-chapter history from scratch would be archaeological rather than useful. Closing as "won't do" unless a specific archival need surfaces. If revisited, entities can be grouped by their ## Source Chapter markdown section to reconstruct chapter membership.
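If the grouping is ever attempted, a sketch along these lines could recover chapter membership. It assumes each entity file has a `## Source Chapter` heading whose first non-blank body line names the chapter. That layout, and the helper names `source_chapter` and `list_by_chapter`, are assumptions for illustration; this plan does not specify them.

```shell
# Print the first non-blank line after the "## Source Chapter" heading.
source_chapter() {
  awk '
    /^## Source Chapter[[:space:]]*$/ { in_section = 1; next }
    in_section && /^#/ { exit }          # next heading reached: section is empty
    in_section && NF   { print; exit }   # first non-blank body line
  ' "$1"
}

# Print "chapter<TAB>slug" pairs, sorted so that entities group by chapter.
list_by_chapter() {
  for f in "$1"/*.md; do
    [ -e "$f" ] || continue
    printf '%s\t%s\n' "$(source_chapter "$f")" "$(basename "$f" .md)"
  done | sort
}

# Usage, from examples/infospace-with-history:
#   list_by_chapter output/entities
```

Entities whose section is missing or empty come out with a blank chapter field, so they sort first and are easy to spot.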


Tasks

C.1 — Complete the 3 missing entity evaluations

985 of 988 entities have evaluation files. Identify and evaluate the remaining 3.

cd examples/infospace-with-history
# Identify missing slugs
comm -23 \
  <(ls output/entities/*.md | xargs -I{} basename {} .md | sort) \
  <(ls output/evaluations/*.md | xargs -I{} basename {} .md | sort)
# Evaluate each missing entity individually
markitect infospace evaluate --entity <slug> --provider openrouter

Acceptance: ls output/evaluations/*.md | wc -l returns 988.
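The acceptance check can also be phrased as a direct comparison of the two directories rather than a hard-coded 988. The following is a minimal sketch, assuming the output/entities and output/evaluations layout used above; `count_md` and `check_eval_coverage` are illustrative helper names, not part of markitect.

```shell
# Count the .md files directly inside a directory (non-recursive).
count_md() {
  find "$1" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
}

# Succeed only when both directories hold the same number of .md files.
check_eval_coverage() {
  local entities evaluations
  entities=$(count_md "$1")
  evaluations=$(count_md "$2")
  echo "entities=$entities evaluations=$evaluations"
  [ "$entities" -eq "$evaluations" ]
}

# Usage, from examples/infospace-with-history:
#   check_eval_coverage output/entities output/evaluations
```

Equal counts do not prove that the slugs match one-to-one; the comm comparison above remains the way to find which entities are missing.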


C.2 — Run eval-summary and verify viability

Run the aggregation command to update per_entity_mean from all 988 evaluations, then check all 6 viability gates pass.

cd examples/infospace-with-history
unset OPENROUTER_API_KEY  # stale env var guard
markitect infospace eval-summary --update-metrics
markitect infospace viability

Current sample reading (985 entities): per_entity_mean = 3.956 against threshold 3.5. Expected: all 6 metrics pass.

Acceptance: markitect infospace viability exits 0 and shows 6/6 PASS.
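To illustrate the per_entity_mean gate with the values quoted above: shell arithmetic is integer-only, so comparing a fractional mean with its threshold needs a helper such as awk. `meets_threshold` below is a hypothetical helper for illustration; the authoritative check is whatever `markitect infospace viability` implements.

```shell
# Return success when value >= threshold, using awk for the float comparison.
meets_threshold() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 >= t + 0) }'
}

if meets_threshold 3.956 3.5; then
  echo "per_entity_mean: PASS"
else
  echo "per_entity_mean: FAIL"
fi
```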


C.3 — Refresh the metrics report

The metrics report was generated from chapters 1–4 only. Regenerate it from the full 988-entity set.

cd examples/infospace-with-history
markitect infospace check --provider openrouter   # or reuse existing check outputs
markitect infospace history                        # confirm snapshot recorded

Acceptance: output/metrics/metrics.yaml reflects all 988 entities; a dated snapshot exists in the metrics history.


C.4 — Document advanced usage patterns

Write examples/infospace-with-history/docs/advanced-usage.md covering:

  • Incremental evaluation (adding entities after initial run, skip-if-exists behaviour)
  • Re-evaluating after guideline changes (--force flag)
  • Interpreting per-entity score distributions and identifying outliers
  • Using markitect infospace entities --sort-by score to triage low scorers
  • Reading and acting on collection check outputs (redundancy pairs, coverage gaps)

Acceptance: File exists with ≥ 4 documented patterns, each with a worked command example.


C.5 — Add composition examples to documentation

Document how the supply-chain-vsm example (examples/supply-chain-vsm/) demonstrates composition. Add docs/composition-guide.md, covering:

  • What composition means (discipline binding)
  • How supply-chain-vsm binds WoN as a discipline
  • How to create a new infospace that uses an existing one as a discipline
  • Viability requirement: the discipline must pass its own thresholds before binding

Reference examples/supply-chain-vsm/ throughout.

Acceptance: docs/composition-guide.md exists and links to supply-chain-vsm.


C.6 — Performance benchmarking note

Rather than a full benchmarking guide (out of scope for a 988-entity example), record observed timings in docs/performance-notes.md:

  • Eval batch duration (~4 hrs for 988 entities via OpenRouter)
  • Tokens per entity (rough estimate from usage logs)
  • Embedding cache hit rate after first run
  • Recommendation: provider choice (OpenRouter vs Gemini) for different dataset sizes

Acceptance: File exists with at least 4 concrete measurements or estimates.
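As a worked example of turning the batch duration into a per-entity figure: taking the ~4 hr estimate above at face value, and assuming evaluations ran one after another (an assumption; the plan does not say), the average cost per entity works out as follows. `seconds_per_entity` is an illustrative helper name.

```shell
# Average seconds per entity for a batch that took `hours` to process `n` entities.
seconds_per_entity() {
  awk -v h="$1" -v n="$2" 'BEGIN { printf "%.1f\n", h * 3600 / n }'
}

seconds_per_entity 4 988   # prints 14.6
```

The same helper gives a rough projection for other dataset sizes when comparing providers, provided a measured duration for each provider is substituted.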


C.7 — S3.2: Clean per-chapter git history (deferred cleanup)

Create a clean branch where each of the 35 processed chapters has its own commit. Chapters 1–8 are already done on branch clean-example-history; 27 remain.

This is a cosmetic/archival task — it does not change output files.

git checkout clean-example-history
# For each remaining chapter (9–35):
#   cherry-pick or re-commit the chapter output files with a per-chapter message
git log --oneline clean-example-history  # verify 35 chapter commits

Acceptance: Branch clean-example-history has exactly 35 chapter commits (one per chapter), rebased onto current main.

Note: This task can be done independently of C.1–C.6. Low urgency; do last.


C.8 — Formally close the S3 roadmap

Update roadmap/infospace-tooling/PLAN.md to mark all S3 tasks as complete. Add a close-out summary at the top of the file with final metrics and date. Commit with a docs(roadmap) message.

Acceptance: PLAN.md header shows all stages complete; committed to main.


Task order

C.1 → C.2 → C.3
              ↓
         C.4, C.5, C.6 (parallel)
              ↓
             C.8
C.7 (independent, do last)

Out of scope

  • Adding new entities or chapters (the WoN example is complete at 988 entities)
  • Re-running collection checks from scratch (existing results are valid)
  • Publishing the example as a standalone dataset