generated from coulomb/repo-seed
IB-WP-0016: refresh validation baseline after T01/T02 smoke run

Run a fixture-backed end-to-end smoke against the real Lefevre EPUB (max-chunks 3) and capture the result in the validation note and the workplan. The pipeline produces a complete infospace with stable chapter-01-part-NNN source IDs, full chapter/book/anchor provenance on every source artifact, viable metrics, and exact-title entity dedupe. Refresh the workplan validation baseline to reflect the post-T01/T02 state, and add a remaining-gaps section that maps the open issues to the right follow-on tasks: cost/scope controls and plan preview to T03, the trading-literature profile to T04, chunk-level resume to T06, and a richer generation-summary report (entity titles, chapter coverage, anchor links) to T07.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
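
The stable `chapter-01-part-NNN` source IDs mentioned in the commit message can be sketched as below. This is a minimal illustration, not the pipeline's own code: `make_source_id` and `parse_source_id` are hypothetical names, and the two-digit chapter and three-digit part padding is inferred only from the `chapter-01-part-NNN` pattern quoted in the message.

```python
import re

# Assumed layout: "chapter-" + 2-digit chapter + "-part-" + 3-digit part.
# Inferred from the commit message; the real pipeline may differ.
_SOURCE_ID_RE = re.compile(r"^chapter-(\d{2})-part-(\d{3})$")


def make_source_id(chapter_number: int, part_number: int) -> str:
    """Build a zero-padded source ID such as 'chapter-01-part-003'."""
    if chapter_number < 1 or part_number < 1:
        raise ValueError("chapter and part numbers are 1-based")
    return f"chapter-{chapter_number:02d}-part-{part_number:03d}"


def parse_source_id(source_id: str) -> tuple[int, int]:
    """Split a source ID back into (chapter_number, part_number)."""
    match = _SOURCE_ID_RE.match(source_id)
    if match is None:
        raise ValueError(f"not a chapter/part source ID: {source_id!r}")
    return int(match.group(1)), int(match.group(2))


# The three chunks produced by a --max-chunks 3 run over chapter 1.
smoke_ids = [make_source_id(1, part) for part in range(1, 4)]
```

Zero-padding keeps lexical order and reading order in agreement, so a plain string sort of the IDs would not reorder chapters 2 and 10.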
@@ -112,3 +112,26 @@ now produces:
 `Page_1..Page_14` distributed across its three parts)
 - Optional `overlap_words` parameter supports evidence-window context
 between adjacent parts of the same chapter without duplicating headings
+
+## Fixture Smoke Run (2026-05-17)
+
+`generate from-source ... --fixture-responses ... --max-chunks 3 --apply`
+against the real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`) and 3 entities/relations/
+evaluations plus the generation-summary report
+- `artifacts/index.yaml` carries full T01/T02 provenance on every source
+artifact (`chapter_label`, `chapter_number`, `page_anchors`, OPF
+`book_metadata`)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`, `granularity_entropy
+≈ 1.79`; viability gates pass
+- Repeated same-title entities upserted to single artifact files — basic
+exact-title dedupe works; near-duplicate dedupe is still open
+
+Gaps the smoke surfaced for follow-on tasks:
+
+- `generation-summary.md` is just counts + metrics; needs entity titles,
+chapter coverage, page-anchor links for review (T07)
+- No `plan` cost preview, no chapter/cost cap selection — running the full
+book at default `max_words` is ~335 provider calls (T03)
+- Generic profile shaped the output, not Lefevre's trading vocabulary (T04)
@@ -45,27 +45,49 @@ provenance, reviewability, or cost control.
 
 ## Validation Baseline
 
-Validation note: `docs/lefevre-epub3-validation.md`.
+Validation note: `docs/lefevre-epub3-validation.md` (includes T01 and T02
+result sections).
 
-Current WP-0015 infrastructure can initialize the local EPUB and run
-source-only metrics in a disposable workspace:
+After T01 and T02, the local Lefevre EPUB is intake-ready:
 
-- source chunks: 155
-- entity count: 0
-- relation count: 0
-- evaluation count: 0
-- source-only metrics history can be written without provider calls
+- 67 body chunks at default `max_words=800`, all 24 roman-numeral chapters
+detected, stable IDs `chapter-01..chapter-24` with `-part-NNN` suffix
+- Cover, PG header/footer, Contents, Transcriber's Notes, and license
+sections classified out of the body stream by default
+- Per-chunk provenance carries full OPF book metadata, chapter label and
+number, page anchors, and spine index
 
-The run proves the basic intake path works, but also shows why a live all-book
-run should wait:
+### Smoke Run (2026-05-17)
 
-- most generated chunk titles collapse to the same Gutenberg page title
-- EPUB spine/chapter metadata is not yet honored deeply enough
-- archive-order sorting risks confusing reading order
-- non-body sections such as cover/header/footer/license need explicit policy
-- plan output is too prompt-heavy for cost review on a 155-chunk book
-- long-book resume needs chunk-level state, not only whole-run skip
-- generated entities need cross-chunk dedupe/merge policy
+A fixture-backed end-to-end smoke run with `--max-chunks 3` against the
+real EPUB produced a complete infospace:
+
+- 3 source chunks (`chapter-01-part-001..003`), 3 entities, 3 relations,
+3 evaluations, 1 generation-summary report
+- All chapter/book/anchor provenance fields land in `artifacts/index.yaml`
+(verified: `chapter_label=I`, `chapter_number=1`,
+`page_anchors=[Page_1, Page_2, Page_3]` on the first chunk)
+- Metrics viable: `coverage=1.0`, `redundancy=0.0`,
+`granularity_entropy=1.79`, viability gates pass
+- Same-title entities returned by repeated stages were upserted to single
+artifact files — basic dedupe works for exact-title matches
+
+### Remaining Gaps
+
+These are the gaps a serious full-book run still hits:
+
+- No compact `plan` output for cost/call preview on a 67-chunk run
+(~5 stages per chunk = ~335 provider calls at default `max_words`) — T03
+- No `--chapter`, `--from-chapter`, `--to-chapter`, `--cost-cap`, or
+`--max-calls` selection — T03
+- Generic profile produces sensible structure but does not push concepts
+toward traders, markets, lessons, or strategies — T04
+- The generation-summary report only shows counts and metrics; it should
+surface entity titles, chapter coverage, page-anchor links, and unmapped
+chunks for human review — T07
+- Long-book resume is still whole-run-skip, not chunk-level — T06
+- Near-duplicate entities across chunks (e.g. "Larry Livingston" vs "the
+narrator") need cross-chunk merge/dedupe policy before a 24-chapter run
 
 ## Non-Goals
 
@@ -197,6 +219,9 @@ state_hub_task_id: "5ff1f11e-49ad-4c2d-bd4c-b8cc261309bc"
 - Add a review checklist for duplicate entities, relation endpoints, weak
 evidence, and over-broad trading lessons
 - Add a final readiness report before generating the full book
+- Enrich `reports/generation-summary.md` beyond counts and metrics: list
+entity titles, per-chapter coverage, page-anchor links, and any unmapped
+source chunks (gap found in the 2026-05-17 smoke run)
 
 ## Acceptance
 
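
The ~335 provider-call figure quoted in the workplan follows from 67 body chunks at roughly 5 stages per chunk. A minimal sketch of that arithmetic, assuming a flat stage count per chunk: `estimate_provider_calls` and its `max_chunks` argument are hypothetical names for illustration and are not the `plan` command, which the commit lists as still missing (T03).

```python
from typing import Optional


def estimate_provider_calls(
    chunk_count: int,
    stages_per_chunk: int = 5,  # "~5 stages per chunk" per the workplan; approximate
    max_chunks: Optional[int] = None,
) -> int:
    """Estimate provider calls as planned chunks times stages per chunk."""
    if chunk_count < 0 or stages_per_chunk < 0:
        raise ValueError("counts must be non-negative")
    if max_chunks is not None and max_chunks < 0:
        raise ValueError("max_chunks must be non-negative")
    # A chunk cap (as with --max-chunks) limits how many chunks are processed.
    planned = chunk_count if max_chunks is None else min(chunk_count, max_chunks)
    return planned * stages_per_chunk


full_book_calls = estimate_provider_calls(67)            # 67 * 5
capped_calls = estimate_provider_calls(67, max_chunks=3)  # 3 * 5
```

Under the same assumption, a run capped at 3 chunks would need about 15 calls against a live provider, which is why a cap or cost preview matters before a full-book run.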