IB-WP-0016: refresh validation baseline after T01/T02 smoke run

Run a fixture-backed end-to-end smoke against the real Lefevre EPUB (max-chunks 3) and capture the result in the validation note and the workplan. The pipeline produces a complete infospace with stable chapter-01-part-NNN source IDs, full chapter/book/anchor provenance on every source artifact, viable metrics, and exact-title entity dedupe.

Refresh the workplan validation baseline to reflect the post-T01/T02 state, and add a remaining-gaps section that maps the open issues to the right follow-on tasks: cost/scope controls and plan preview to T03, the trading-literature profile to T04, chunk-level resume to T06, and a richer generation-summary report (entity titles, chapter coverage, anchor links) to T07.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:13:39 +02:00
parent 745edc8b81
commit 001b64d67b
2 changed files with 65 additions and 17 deletions
--- a/docs/lefevre-epub3-validation.md
+++ b/docs/lefevre-epub3-validation.md
@@ -112,3 +112,26 @@ now produces:
  `Page_1..Page_14` distributed across its three parts)
 - Optional `overlap_words` parameter supports evidence-window context
  between adjacent parts of the same chapter without duplicating headings
+
+## Fixture Smoke Run (2026-05-17)
+
+`generate from-source ... --fixture-responses ... --max-chunks 3 --apply`
+against the real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`) and 3 entities/relations/
+  evaluations plus the generation-summary report
+- `artifacts/index.yaml` carries full T01/T02 provenance on every source
+  artifact (`chapter_label`, `chapter_number`, `page_anchors`, OPF
+  `book_metadata`)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`, `granularity_entropy
+  ≈ 1.79`; viability gates pass
+- Repeated same-title entities upserted to single artifact files — basic
+  exact-title dedupe works; near-duplicate dedupe is still open
+
+Gaps the smoke surfaced for follow-on tasks:
+
+- `generation-summary.md` is just counts + metrics; needs entity titles,
+  chapter coverage, page-anchor links for review (T07)
+- No `plan` cost preview, no chapter/cost cap selection — running the full
+  book at default `max_words` is ~335 provider calls (T03)
+- Generic profile shaped the output, not Lefevre's trading vocabulary (T04)
--- a/workplans/IB-WP-0016-lefevre-ebook-infospace-readiness.md
+++ b/workplans/IB-WP-0016-lefevre-ebook-infospace-readiness.md
@@ -45,27 +45,49 @@ provenance, reviewability, or cost control.

 ## Validation Baseline

-Validation note: `docs/lefevre-epub3-validation.md`.
+Validation note: `docs/lefevre-epub3-validation.md` (includes T01 and T02
+result sections).

-Current WP-0015 infrastructure can initialize the local EPUB and run
-source-only metrics in a disposable workspace:
+After T01 and T02, the local Lefevre EPUB is intake-ready:

- source chunks: 155
- entity count: 0
- relation count: 0
- evaluation count: 0
- source-only metrics history can be written without provider calls
+- 67 body chunks at default `max_words=800`, all 24 roman-numeral chapters
+  detected, stable IDs `chapter-01..chapter-24` with `-part-NNN` suffix
+- Cover, PG header/footer, Contents, Transcriber's Notes, and license
+  sections classified out of the body stream by default
+- Per-chunk provenance carries full OPF book metadata, chapter label and
+  number, page anchors, and spine index

-The run proves the basic intake path works, but also shows why a live all-book
-run should wait:
+### Smoke Run (2026-05-17)

- most generated chunk titles collapse to the same Gutenberg page title
- EPUB spine/chapter metadata is not yet honored deeply enough
- archive-order sorting risks confusing reading order
- non-body sections such as cover/header/footer/license need explicit policy
- plan output is too prompt-heavy for cost review on a 155-chunk book
- long-book resume needs chunk-level state, not only whole-run skip
- generated entities need cross-chunk dedupe/merge policy
+A fixture-backed end-to-end smoke run with `--max-chunks 3` against the
+real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`), 3 entities, 3 relations,
+  3 evaluations, 1 generation-summary report
+- All chapter/book/anchor provenance fields land in `artifacts/index.yaml`
+  (verified: `chapter_label=I`, `chapter_number=1`,
+  `page_anchors=[Page_1, Page_2, Page_3]` on the first chunk)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`,
+  `granularity_entropy=1.79`, viability gates pass
+- Same-title entities returned by repeated stages were upserted to single
+  artifact files — basic dedupe works for exact-title matches
+
+### Remaining Gaps
+
+These are the gaps a serious full-book run still hits:
+
+- No compact `plan` output for cost/call preview on a 67-chunk run
+  (~5 stages per chunk = ~335 provider calls at default `max_words`) — T03
+- No `--chapter`, `--from-chapter`, `--to-chapter`, `--cost-cap`, or
+  `--max-calls` selection — T03
+- Generic profile produces sensible structure but does not push concepts
+  toward traders, markets, lessons, or strategies — T04
+- The generation-summary report only shows counts and metrics; it should
+  surface entity titles, chapter coverage, page-anchor links, and unmapped
+  chunks for human review — T07
+- Long-book resume is still whole-run-skip, not chunk-level — T06
+- Near-duplicate entities across chunks (e.g. "Larry Livingston" vs "the
+  narrator") need cross-chunk merge/dedupe policy before a 24-chapter run

 ## Non-Goals

@@ -197,6 +219,9 @@ state_hub_task_id: "5ff1f11e-49ad-4c2d-bd4c-b8cc261309bc"
 - Add a review checklist for duplicate entities, relation endpoints, weak
  evidence, and over-broad trading lessons
 - Add a final readiness report before generating the full book
+- Enrich `reports/generation-summary.md` beyond counts and metrics: list
+  entity titles, per-chapter coverage, page-anchor links, and any unmapped
+  source chunks (gap found in the 2026-05-17 smoke run)

 ## Acceptance