IB-WP-0016: refresh validation baseline after T01/T02 smoke run
Run a fixture-backed end-to-end smoke against the real Lefevre EPUB (max-chunks 3) and capture the result in the validation note and the workplan. The pipeline produces a complete infospace with stable chapter-01-part-NNN source IDs, full chapter/book/anchor provenance on every source artifact, viable metrics, and exact-title entity dedupe.

Refresh the workplan validation baseline to reflect the post-T01/T02 state, and add a remaining-gaps section that maps the open issues to the right follow-on tasks: cost/scope controls and plan preview to T03, the trading-literature profile to T04, chunk-level resume to T06, and a richer generation-summary report (entity titles, chapter coverage, anchor links) to T07.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -112,3 +112,26 @@ now produces:
 `Page_1..Page_14` distributed across its three parts)
 - Optional `overlap_words` parameter supports evidence-window context
 between adjacent parts of the same chapter without duplicating headings
+
+## Fixture Smoke Run (2026-05-17)
+
+`generate from-source ... --fixture-responses ... --max-chunks 3 --apply`
+against the real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`) and 3 entities/relations/
+evaluations plus the generation-summary report
+- `artifacts/index.yaml` carries full T01/T02 provenance on every source
+artifact (`chapter_label`, `chapter_number`, `page_anchors`, OPF
+`book_metadata`)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`, `granularity_entropy
+≈ 1.79`; viability gates pass
+- Repeated same-title entities upserted to single artifact files — basic
+exact-title dedupe works; near-duplicate dedupe is still open
+
+Gaps the smoke surfaced for follow-on tasks:
+
+- `generation-summary.md` is just counts + metrics; needs entity titles,
+chapter coverage, page-anchor links for review (T07)
+- No `plan` cost preview, no chapter/cost cap selection — running the full
+book at default `max_words` is ~335 provider calls (T03)
+- Generic profile shaped the output, not Lefevre's trading vocabulary (T04)
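Review note: the provenance claim in the smoke-run section can be spot-checked mechanically. The sketch below assumes the source entries of `artifacts/index.yaml` have already been loaded into a list of dicts. That record shape, the `id` key, the placeholder `book_metadata` value, and the `missing_provenance` helper are all hypothetical; only the four field names come from the validation note.

```python
# Field names are taken from the validation note; everything else is assumed.
REQUIRED_FIELDS = ("chapter_label", "chapter_number", "page_anchors", "book_metadata")


def missing_provenance(records):
    """Map each record id to the required fields it lacks or leaves empty."""
    problems = {}
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "", [], {})]
        if missing:
            problems[record.get("id", "<no id>")] = missing
    return problems


# Hypothetical records: the first is complete, the second is deliberately
# incomplete to show what a failing entry would look like.
sample = [
    {
        "id": "chapter-01-part-001",
        "chapter_label": "I",
        "chapter_number": 1,
        "page_anchors": ["Page_1", "Page_2", "Page_3"],
        "book_metadata": {"title": "placeholder"},
    },
    {"id": "chapter-01-part-002", "chapter_label": "I", "chapter_number": 1},
]

print(missing_provenance(sample))
```

An empty result would correspond to the "every source artifact" claim holding for the loaded records.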
@@ -45,27 +45,49 @@ provenance, reviewability, or cost control.
 
 ## Validation Baseline
 
-Validation note: `docs/lefevre-epub3-validation.md`.
+Validation note: `docs/lefevre-epub3-validation.md` (includes T01 and T02
+result sections).
 
-Current WP-0015 infrastructure can initialize the local EPUB and run
-source-only metrics in a disposable workspace:
+After T01 and T02, the local Lefevre EPUB is intake-ready:
 
-- source chunks: 155
-- entity count: 0
-- relation count: 0
-- evaluation count: 0
-- source-only metrics history can be written without provider calls
+- 67 body chunks at default `max_words=800`, all 24 roman-numeral chapters
+detected, stable IDs `chapter-01..chapter-24` with `-part-NNN` suffix
+- Cover, PG header/footer, Contents, Transcriber's Notes, and license
+sections classified out of the body stream by default
+- Per-chunk provenance carries full OPF book metadata, chapter label and
+number, page anchors, and spine index
 
-The run proves the basic intake path works, but also shows why a live all-book
-run should wait:
+### Smoke Run (2026-05-17)
 
-- most generated chunk titles collapse to the same Gutenberg page title
-- EPUB spine/chapter metadata is not yet honored deeply enough
-- archive-order sorting risks confusing reading order
-- non-body sections such as cover/header/footer/license need explicit policy
-- plan output is too prompt-heavy for cost review on a 155-chunk book
-- long-book resume needs chunk-level state, not only whole-run skip
-- generated entities need cross-chunk dedupe/merge policy
+A fixture-backed end-to-end smoke run with `--max-chunks 3` against the
+real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`), 3 entities, 3 relations,
+3 evaluations, 1 generation-summary report
+- All chapter/book/anchor provenance fields land in `artifacts/index.yaml`
+(verified: `chapter_label=I`, `chapter_number=1`,
+`page_anchors=[Page_1, Page_2, Page_3]` on the first chunk)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`,
+`granularity_entropy=1.79`, viability gates pass
+- Same-title entities returned by repeated stages were upserted to single
+artifact files — basic dedupe works for exact-title matches
+
+### Remaining Gaps
+
+These are the gaps a serious full-book run still hits:
+
+- No compact `plan` output for cost/call preview on a 67-chunk run
+(~5 stages per chunk = ~335 provider calls at default `max_words`) — T03
+- No `--chapter`, `--from-chapter`, `--to-chapter`, `--cost-cap`, or
+`--max-calls` selection — T03
+- Generic profile produces sensible structure but does not push concepts
+toward traders, markets, lessons, or strategies — T04
+- The generation-summary report only shows counts and metrics; it should
+surface entity titles, chapter coverage, page-anchor links, and unmapped
+chunks for human review — T07
+- Long-book resume is still whole-run-skip, not chunk-level — T06
+- Near-duplicate entities across chunks (e.g. "Larry Livingston" vs "the
+narrator") need cross-chunk merge/dedupe policy before a 24-chapter run
 
 ## Non-Goals
 
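Review note: the ~335 figure in the gaps list is chunks times stages. Below is a minimal sketch of the arithmetic a compact `plan` preview could print, assuming a flat 5 stages per chunk as the workplan approximates. `estimate_calls` is a hypothetical helper, not part of the existing CLI, and the `max_chunks` argument only imitates the effect of a chunk cap.

```python
def estimate_calls(chunks, stages_per_chunk=5, max_chunks=None):
    """Estimate provider calls as chunks x stages, optionally capping chunks."""
    if max_chunks is not None:
        chunks = min(chunks, max_chunks)
    return chunks * stages_per_chunk


# 67 body chunks at the default max_words, per the refreshed baseline.
full_book = estimate_calls(67)
# The same estimate under a 3-chunk cap (assumed to apply before staging).
capped = estimate_calls(67, max_chunks=3)

print(f"full book: ~{full_book} calls, capped at 3 chunks: ~{capped} calls")
```

The real stage count may vary per chunk, so a preview built this way should be read as an upper-level estimate rather than a bill.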
@@ -197,6 +219,9 @@ state_hub_task_id: "5ff1f11e-49ad-4c2d-bd4c-b8cc261309bc"
 - Add a review checklist for duplicate entities, relation endpoints, weak
 evidence, and over-broad trading lessons
 - Add a final readiness report before generating the full book
+- Enrich `reports/generation-summary.md` beyond counts and metrics: list
+entity titles, per-chapter coverage, page-anchor links, and any unmapped
+source chunks (gap found in the 2026-05-17 smoke run)
 
 ## Acceptance
 
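Review note: the per-chapter coverage and unmapped-chunk listing requested for `reports/generation-summary.md` could be derived from the stable source IDs alone. The sketch assumes IDs follow the `chapter-NN-part-NNN` pattern seen in the smoke run and that the set of source IDs cited by entities is available; the `mapped_ids` input and the `chapter_coverage` helper are hypothetical.

```python
import re
from collections import defaultdict

# Pattern assumed from the smoke-run IDs, e.g. chapter-01-part-003.
ID_PATTERN = re.compile(r"^chapter-(\d+)-part-(\d+)$")


def chapter_coverage(source_ids, mapped_ids):
    """Return ({chapter: (mapped, total)}, sorted list of unmapped source ids)."""
    mapped = set(mapped_ids)
    counts = defaultdict(lambda: [0, 0])
    unmapped = []
    for source_id in source_ids:
        match = ID_PATTERN.match(source_id)
        if match is None:
            continue  # ids outside the assumed pattern are skipped in this sketch
        chapter = int(match.group(1))
        counts[chapter][1] += 1
        if source_id in mapped:
            counts[chapter][0] += 1
        else:
            unmapped.append(source_id)
    coverage = {ch: tuple(pair) for ch, pair in sorted(counts.items())}
    return coverage, sorted(unmapped)


# Hypothetical input: three chapter-01 chunks, only two cited by any entity.
sources = ["chapter-01-part-001", "chapter-01-part-002", "chapter-01-part-003"]
coverage, unmapped = chapter_coverage(sources, mapped_ids=sources[:2])
for chapter, (hit, total) in coverage.items():
    print(f"chapter {chapter:02d}: {hit}/{total} chunks mapped")
print("unmapped:", unmapped)
```

Keying on the ID pattern keeps the report independent of how titles are generated, at the cost of silently skipping any source whose ID does not match.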