tegwick 13f9c1895c IB-WP-0016-T03: scale-aware planning

Replace generate plan's full-prompt dump with a compact summary that
reports selected-chunk counts, selected chapter numbers, per-workflow
call counts, prompt-word and token estimates, and a rough USD cost when
--cost-per-1k is supplied. Selection filters --chapter (label or number,
repeatable), --from-chapter / --to-chapter (numeric range), and --chunk
(repeatable id) shape the estimate. Budget caps --max-calls and
--cost-cap are reported as exceeds_* booleans so callers can fail fast
before run.

The old full per-workflow plan with prompts remains available behind
--full so deep inspection is opt-in instead of the default.

Whole-Lefevre estimate at default max_words=800: 146 chunks, 730 calls,
~518k prompt tokens, ~$155 at $0.30/1k. Chapters 3-5 only: 19 chunks,
95 calls, ~64k tokens. 87 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-17 18:18:09 +02:00

Generic Source Generator

Date: 2026-05-14

Purpose

infospace-bench generate turns a local article, ebook-like file, or folder of knowledge sources into a manifest-backed infospace. It generalizes the Wealth/VSM pilot into an explicit workflow path with deterministic fixture support and an optional OpenRouter provider.

Deterministic Run

Use fixture responses for repeatable tests and demos:

infospace-bench generate from-source ./examples/article.md \
  --workspace . \
  --slug article-space \
  --name "Article Space" \
  --profile general-knowledge \
  --fixture-responses ./examples/responses.yaml \
  --apply

The command creates normalized source chunks, installs the selected profile, runs the declared workflows, writes entities, relations, evaluations, metrics, history, and a generation report, then registers artifacts in artifacts/index.yaml.

Stepwise Workflow

infospace-bench generate init ./book.epub \
  --workspace . \
  --slug book-space \
  --name "Book Space" \
  --profile general-knowledge \
  --max-chunks 3

infospace-bench generate plan ./infospaces/book-space --stage all
infospace-bench generate run ./infospaces/book-space \
  --fixture-responses ./responses.yaml
infospace-bench generate status ./infospaces/book-space

--max-chunks caps early experiments and provider cost. generate status shows chunk counts, generated artifact counts, evaluations, metrics, history, and stale source/profile inputs.

Scale-aware plan

generate plan returns a compact estimate by default: counts of selected chunks, selected chapter numbers, calls per workflow, prompt-word and token estimates, and a rough USD cost when --cost-per-1k (the price per 1,000 tokens) is supplied. Long corpora no longer dump hundreds of full prompts unless --full is set.

infospace-bench generate plan ./infospaces/book-space \
  --from-chapter 1 --to-chapter 3 \
  --cost-per-1k 0.30 \
  --max-calls 50 \
  --cost-cap 2.00
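
The arithmetic behind the estimate can be sketched as below. This is an illustration only, not the tool's code: the function names are hypothetical, the rate is assumed to be USD per 1,000 prompt tokens, and the constant of five calls per chunk is an assumption inferred from the two estimates quoted in the commit message (146 chunks for 730 calls, 19 chunks for 95 calls).

```python
def estimate_calls(selected_chunks: int, calls_per_chunk: int = 5) -> int:
    """Total provider calls, assuming a fixed number of calls per chunk."""
    return selected_chunks * calls_per_chunk


def estimate_cost_usd(prompt_tokens: int, cost_per_1k: float) -> float:
    """Rough USD cost: tokens divided by 1,000, times the per-1k rate."""
    return prompt_tokens / 1000 * cost_per_1k


# Figures quoted for the whole corpus and for chapters 3-5.
print(estimate_calls(146))  # 730
print(estimate_calls(19))   # 95
print(round(estimate_cost_usd(518_000, 0.30), 2))  # 155.4
```

Because the estimate covers prompt tokens only, a real run that also bills completion tokens would likely cost more than this figure.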

Selection filters:

--chapter LABEL (repeatable) — match a chapter by roman/arabic label or numeric value (e.g. --chapter I or --chapter 2)
--from-chapter N / --to-chapter N — numeric chapter range
--chunk ID (repeatable) — exact source chunk id (e.g. chapter-01-part-002)
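
A minimal sketch of how a chapter label might be normalized before matching, so that --chapter I and --chapter 1 select the same chapter. The helper names are hypothetical, the Roman-numeral parsing is a generic routine rather than the tool's own, and inclusive range bounds are an assumption.

```python
import re
from typing import Optional

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def chapter_number(label: str) -> Optional[int]:
    """Convert an Arabic ("2") or Roman ("IV") label to an int, else None."""
    text = label.strip().upper()
    if re.fullmatch(r"[0-9]+", text):
        return int(text)
    if not text or any(ch not in ROMAN_VALUES for ch in text):
        return None
    total = 0
    for ch, nxt in zip(text, text[1:] + " "):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller numeral before a larger one.
        smaller = nxt in ROMAN_VALUES and ROMAN_VALUES[nxt] > value
        total += -value if smaller else value
    return total


def in_chapter_range(number: int,
                     from_chapter: Optional[int] = None,
                     to_chapter: Optional[int] = None) -> bool:
    """Check a chapter against optional bounds (assumed inclusive)."""
    if from_chapter is not None and number < from_chapter:
        return False
    if to_chapter is not None and number > to_chapter:
        return False
    return True


chapters = [chapter_number(x) for x in ["I", "2", "iv", "IX"]]
print(chapters)  # [1, 2, 4, 9]
print([n for n in chapters if in_chapter_range(n, 1, 3)])  # [1, 2]
```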

Budget flags --max-calls and --cost-cap are reported as exceeds_max_calls / exceeds_cost_cap booleans in the summary, so a caller can fail fast before invoking run. Use --full to opt back into the full per-workflow plan with prompts for deep inspection.
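
A wrapper script could gate run on those booleans roughly as follows. This is a sketch under assumptions: it presumes the plan summary has been captured as JSON text, which this document does not confirm, and the function names are hypothetical. Only the two key names come from the text above.

```python
import json
import sys


def budget_violations(summary: dict) -> list:
    """Return the names of the budget flags that are set to true."""
    flags = ("exceeds_max_calls", "exceeds_cost_cap")
    return [name for name in flags if summary.get(name)]


def gate(raw_summary: str) -> int:
    """Return a shell-style exit code: 0 to proceed, 1 to stop."""
    violated = budget_violations(json.loads(raw_summary))
    if violated:
        print("plan over budget: " + ", ".join(violated), file=sys.stderr)
        return 1
    return 0


# Hypothetical summary text; the real output format may differ.
sample = '{"exceeds_max_calls": true, "exceeds_cost_cap": false}'
print(gate(sample))  # 1
```

Returning an exit code instead of calling sys.exit directly keeps the check easy to reuse from a larger script or a test.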

OpenRouter

Live model calls are explicit:

export OPENROUTER_API_KEY=...

infospace-bench generate run ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini \
  --stage all

Choose the --model value from OpenRouter model IDs. The API key is read from OPENROUTER_API_KEY; it is not written to infospace.yaml. Default tests never make live provider calls.
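
A script that wraps live runs may want to confirm the key is present before starting. A minimal pre-flight sketch; the function name is hypothetical and the check is not part of infospace-bench itself:

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenRouter key from the environment or raise if missing."""
    env = os.environ if env is None else env
    key = env.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return key


# Demonstrate the failure path with an empty environment mapping.
try:
    require_api_key({})
except RuntimeError as exc:
    print(exc)
```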

Resume

Use resume for interrupted or reviewed runs:

infospace-bench generate resume ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini

Unchanged completed runs are skipped. Use --force when you intentionally want to rerun completed work. Stale status is reported when source artifact digests or installed profile/template files change.
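
Digest-based staleness checks generally work by comparing a recorded hash with a freshly computed one. The sketch below illustrates the idea only: SHA-256, the recorded mapping of relative path to digest, and both function names are assumptions, not the tool's actual storage format.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hex SHA-256 of a file's bytes (the algorithm is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_inputs(recorded: dict, root: Path) -> list:
    """Return relative paths that are missing or whose digest changed."""
    stale = []
    for rel_path, old_digest in sorted(recorded.items()):
        path = root / rel_path
        if not path.is_file() or file_digest(path) != old_digest:
            stale.append(rel_path)
    return stale


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "chunk.md"
    source.write_text("original text", encoding="utf-8")
    recorded = {"chunk.md": file_digest(source)}
    print(stale_inputs(recorded, root))  # []
    source.write_text("edited text", encoding="utf-8")
    print(stale_inputs(recorded, root))  # ['chunk.md']
```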

Review Path

After generation:

inspect artifacts/sources/ for normalized input chunks
inspect artifacts/entities/ and artifacts/relations/ for generated claims
inspect output/evaluations/ for rubric output
run infospace-bench validate <root> and infospace-bench graph <root>
review reports/generation-summary.md

Move from the generic profile to a specialized profile when the source domain needs stricter terminology, narrower extraction granularity, or a discipline lens such as VSM.
