IB-WP-0016: refresh validation baseline after T01/T02 smoke run
Run a fixture-backed end-to-end smoke against the real Lefevre EPUB (max-chunks 3) and capture the result in the validation note and the workplan. The pipeline produces a complete infospace with stable chapter-01-part-NNN source IDs, full chapter/book/anchor provenance on every source artifact, viable metrics, and exact-title entity dedupe. Refresh the workplan validation baseline to reflect the post-T01/T02 state, and add a remaining-gaps section that maps the open issues to the right follow-on tasks: cost/scope controls and plan preview to T03, the trading-literature profile to T04, chunk-level resume to T06, and a richer generation-summary report (entity titles, chapter coverage, anchor links) to T07.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -112,3 +112,26 @@ now produces:
  `Page_1..Page_14` distributed across its three parts)
- Optional `overlap_words` parameter supports evidence-window context
  between adjacent parts of the same chapter without duplicating headings

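As an illustration of the `overlap_words` idea, here is a minimal sketch of splitting a chapter body into parts where each later part is prefixed with the tail of the previous one. The function name, its signature, and the choice to count overlap on top of `max_words` are assumptions made for this example, not the pipeline's actual implementation. The sketch works on body text only, which is one simple way to avoid repeating a heading in every part.

```python
def split_with_overlap(body: str, max_words: int, overlap_words: int = 0) -> list[str]:
    """Split body text into parts of up to max_words new words each.

    Hypothetical helper: every part after the first is prefixed with the
    last overlap_words words of the preceding text as context.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if overlap_words < 0:
        raise ValueError("overlap_words must not be negative")

    words = body.split()
    parts = []
    for start in range(0, len(words), max_words):
        # Reach back into the previous part for context, never below index 0.
        context_start = max(0, start - overlap_words)
        parts.append(" ".join(words[context_start:start + max_words]))
    return parts
```

With `overlap_words=0` the parts are disjoint; a non-zero value repeats only trailing body words, so each part carries a little of the preceding evidence window.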
## Fixture Smoke Run (2026-05-17)

`generate from-source ... --fixture-responses ... --max-chunks 3 --apply`
against the real EPUB produced a complete infospace:

- 3 source chunks (`chapter-01-part-001..003`) and 3
  entities/relations/evaluations plus the generation-summary report
- `artifacts/index.yaml` carries full T01/T02 provenance on every source
  artifact (`chapter_label`, `chapter_number`, `page_anchors`, OPF
  `book_metadata`)
- Metrics viable: `coverage=1.0`, `redundancy=0.0`,
  `granularity_entropy ≈ 1.79`; viability gates pass
- Repeated same-title entities upserted to single artifact files: basic
  exact-title dedupe works; near-duplicate dedupe is still open

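The provenance claim above could be checked mechanically. The following is a minimal sketch that assumes the index has already been parsed into a list of dictionaries; the `kind` and `id` keys and the function name are hypothetical, and only the four provenance field names come from the note itself.

```python
# Provenance fields named in the validation note.
REQUIRED_PROVENANCE = ("chapter_label", "chapter_number", "page_anchors", "book_metadata")


def missing_provenance(artifacts: list[dict]) -> dict[str, list[str]]:
    """Return {artifact id: [missing fields]} for source artifacts.

    Assumes (hypothetically) that each entry has a "kind" and an "id" key.
    Entries of any other kind are ignored.
    """
    problems = {}
    for entry in artifacts:
        if entry.get("kind") != "source":
            continue
        # A field counts as missing when absent or empty.
        missing = [
            key for key in REQUIRED_PROVENANCE
            if entry.get(key) in (None, "", [], {})
        ]
        if missing:
            problems[entry.get("id", "<unknown>")] = missing
    return problems
```

An empty result would correspond to "full provenance on every source artifact"; any entry in the result names the artifact and the fields it lacks.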
Gaps the smoke surfaced for follow-on tasks:

- `generation-summary.md` is just counts + metrics; needs entity titles,
  chapter coverage, page-anchor links for review (T07)
- No `plan` cost preview, no chapter/cost cap selection; running the full
  book at default `max_words` is ~335 provider calls (T03)
- Generic profile shaped the output, not Lefevre's trading vocabulary (T04)
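To make the missing cost preview concrete, here is a rough sketch of how a `plan`-style estimate might count provider calls before a run. Everything in it is an assumption for illustration: the function name, the one-call-per-chunk default, and the example word counts are hypothetical, and it does not attempt to reproduce the ~335 figure.

```python
import math


def estimate_provider_calls(chapter_word_counts, max_words, calls_per_chunk=1, max_chunks=None):
    """Estimate provider calls for a run (hypothetical helper).

    Each chapter is split into ceil(words / max_words) chunks; an optional
    max_chunks cap limits how many chunks are processed in total.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    chunks = sum(math.ceil(words / max_words) for words in chapter_word_counts)
    if max_chunks is not None:
        chunks = min(chunks, max_chunks)
    return chunks * calls_per_chunk


# Illustrative word counts only, not taken from the Lefevre EPUB.
uncapped = estimate_provider_calls([2500, 900], max_words=1000)
capped = estimate_provider_calls([2500, 900], max_words=1000, max_chunks=3)
```

Printing both numbers side by side is the kind of preview that would let a reviewer choose a chapter or chunk cap before committing to a full-book run.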