generic source-to-infospace generator
94
docs/generic-source-generator.md
Normal file
@@ -0,0 +1,94 @@
# Generic Source Generator

Date: 2026-05-14

## Purpose

`infospace-bench generate` turns a local article, ebook-like file, or folder of
knowledge sources into a manifest-backed infospace. It generalizes the
Wealth/VSM pilot into an explicit workflow path with deterministic fixture
support and an optional OpenRouter provider.

## Deterministic Run

Use fixture responses for repeatable tests and demos:

```bash
infospace-bench generate from-source ./examples/article.md \
  --workspace . \
  --slug article-space \
  --name "Article Space" \
  --profile general-knowledge \
  --fixture-responses ./examples/responses.yaml \
  --apply
```

The command creates normalized source chunks, installs the selected profile,
runs the declared workflows, writes entities, relations, evaluations, metrics,
history, and a generation report, then registers artifacts in
`artifacts/index.yaml`.

## Stepwise Workflow

```bash
infospace-bench generate init ./book.epub \
  --workspace . \
  --slug book-space \
  --name "Book Space" \
  --profile general-knowledge \
  --max-chunks 3

infospace-bench generate plan ./infospaces/book-space --stage all
infospace-bench generate run ./infospaces/book-space \
  --fixture-responses ./responses.yaml
infospace-bench generate status ./infospaces/book-space
```

`--max-chunks` caps early experiments and provider cost. `generate status`
shows chunk counts, generated artifact counts, evaluations, metrics, history,
and stale source/profile inputs.

## OpenRouter

Live model calls are explicit:

```bash
export OPENROUTER_API_KEY=...

infospace-bench generate run ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini \
  --stage all
```

Choose the `--model` value from OpenRouter model IDs. The API key is read from
`OPENROUTER_API_KEY`; it is not written to `infospace.yaml`. Default tests never
make live provider calls.
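
Because the key lives only in the environment, a wrapper script may want to confirm it is present before starting a live run, and confirm afterwards that its value did not end up in a file. The following is a minimal sketch under those assumptions; the helper names `require_openrouter_key` and `key_absent_from` are hypothetical and not part of `infospace-bench`.

```bash
# Hypothetical helper: fail fast when the key is missing or empty.
require_openrouter_key() {
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "OPENROUTER_API_KEY is not set; refusing to start a live run" >&2
    return 1
  fi
}

# Hypothetical helper: succeed when the key's value does not appear in file $1.
key_absent_from() {
  require_openrouter_key || return 1
  # -F treats the key as a fixed string; -- guards keys that start with "-".
  ! grep -qF -- "$OPENROUTER_API_KEY" "$1"
}
```

A script could call `require_openrouter_key` before the documented `generate run --provider openrouter` command and `key_absent_from ./infospaces/book-space/infospace.yaml` after it, assuming the manifest sits at the root of the infospace directory.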

## Resume

Use resume for interrupted or reviewed runs:

```bash
infospace-bench generate resume ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini
```

Unchanged completed runs are skipped. Use `--force` when you intentionally want
to rerun completed work. Stale status is reported when source artifact digests
or installed profile/template files change.
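
The staleness rule above can be pictured as a digest comparison: record a hash per input file, then compare against the current hashes later. The following is a minimal sketch, assuming GNU `find`, `sort`, `xargs`, and `sha256sum`. The helper names `snapshot_digests` and `is_stale` are hypothetical, and the tool's own digest format and storage location are not described in this document.

```bash
# Hypothetical sketch of digest-based staleness detection.
# Not the tool's actual mechanism or file layout.

snapshot_digests() {
  # $1: directory to scan. Prints "digest  path" lines in a stable order.
  (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

is_stale() {
  # $1: directory, $2: file holding an earlier snapshot.
  # Returns 0 (stale) when the current digests differ from the recorded ones.
  [ "$(snapshot_digests "$1")" != "$(cat "$2")" ]
}
```

Sorting the file list keeps the snapshot independent of directory traversal order, so only a content change, an added file, or a removed file makes the comparison fail.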

## Review Path

After generation:

- inspect `artifacts/sources/` for normalized input chunks
- inspect `artifacts/entities/` and `artifacts/relations/` for generated claims
- inspect `output/evaluations/` for rubric output
- run `infospace-bench validate <root>` and `infospace-bench graph <root>`
- review `reports/generation-summary.md`
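
Before opening the files by hand, a quick existence check over the paths in the list above can catch an incomplete run. This is a small sketch only; `check_review_paths` is a hypothetical helper, and it assumes the listed paths are relative to the infospace root.

```bash
# Hypothetical helper: report which review paths are missing under root $1.
# Returns 0 when every path exists, 1 otherwise.
check_review_paths() {
  root=$1
  missing=0
  for rel in artifacts/sources artifacts/entities artifacts/relations \
             output/evaluations reports/generation-summary.md; do
    if [ ! -e "$root/$rel" ]; then
      echo "missing: $rel" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_review_paths ./infospaces/book-space` could run first, followed by the documented `infospace-bench validate` and `infospace-bench graph` commands on the same root.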

Move from the generic profile to a specialized profile when the source domain
needs stricter terminology, narrower extraction granularity, or a discipline
lens such as VSM.