generated from coulomb/repo-seed
Workplans to actually create infospaces
158
workplans/IB-WP-0013-wealth-vsm-generation-pipeline-parity.md
Normal file
@@ -0,0 +1,158 @@
---
id: IB-WP-0013
type: workplan
title: "Wealth VSM Generation Pipeline Parity"
domain: markitect
repo: infospace-bench
status: planned
owner: markitect
topic_slug: markitect
created: "2026-05-14"
updated: "2026-05-14"
state_hub_workstream_slug: "ib-wp-0013-wealth-vsm-generation-pipeline-parity"
state_hub_workstream_id: "74dc579e-9b03-4a00-b739-84b1007cfb94"
---

# IB-WP-0013 - Wealth VSM Generation Pipeline Parity

## Goal

Make `infospace-bench` capable of regenerating the Adam Smith
`Wealth of Nations` / VSM infospace through explicit, auditable workflows.

This should replace the old `markitect-project` generation path without
copying its hidden provider calls, implicit output conventions, or monolithic
`process` command shape.

## Intent

The legacy implementation could run a chapter corpus through:

- entity extraction
- VSM mapping
- chapter-level analysis synthesis
- entity evaluation
- classification and relation enrichment
- collection metrics

The successor should express those stages as declared infospace workflows with
deterministic planning, fake-adapter tests, explicit assisted-generation
requests, stable manifest registration, and clear provenance.
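
As a rough illustration of "declared workflows with deterministic planning",
the sketch below lists the legacy stages as data and derives a plan from them
without calling any provider. The `Stage` type, the stage names not listed in
T01, the `plan` helper, and the assisted/deterministic split are all
assumptions made for illustration, not the repo's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """Hypothetical declared stage; `kind` separates assisted from deterministic."""
    name: str
    kind: str  # "assisted" or "deterministic" (assumed vocabulary)


# Assumed stage list mirroring the legacy pipeline order described above.
# Which stages are assisted is a guess, apart from those named in T04.
WEALTH_VSM_STAGES = (
    Stage("extract-entities", "assisted"),
    Stage("map-to-vsm", "assisted"),
    Stage("synthesize-analysis", "assisted"),
    Stage("evaluate-entity", "assisted"),
    Stage("classify-and-relate", "assisted"),
    Stage("collection-metrics", "deterministic"),
)


def plan(stages, chapters):
    """Build a plan whose order depends only on the inputs, never on a provider."""
    return [
        {"chapter": chapter, "order": order, "stage": stage.name, "kind": stage.kind}
        for chapter in sorted(chapters)
        for order, stage in enumerate(stages)
    ]
```

Sorting the chapters keeps the plan stable regardless of the order in which
the caller discovers them, which is one simple way to make planning
reproducible in tests.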

## Non-Goals

- Recreate the old `process_chapters.py` script as-is.
- Hide provider-specific LLM calls behind a generic command.
- Require a live provider or network access for default tests.
- Commit the full regenerated Wealth/VSM output before a one-chapter pilot is
  proven.
- Move durable runtime, retrieval, or audit responsibilities into
  `infospace-bench`; those remain `kontextual-engine` concerns.

## Tasks

### T01 - Legacy pipeline decomposition and corpus map

```task
id: IB-WP-0013-T01
status: todo
priority: high
state_hub_task_id: "2c558d1e-290f-4e0e-abe6-37302cc31ac4"
```

- Map legacy `examples/infospace-with-history/process_chapters.py`
- Inventory old templates: `extract-entities`, `map-to-vsm`,
  `synthesize-analysis`, `evaluate-entity`, and `assess-metrics`
- Inventory source corpus, guidelines, VSM reference artifacts, generated
  outputs, processing logs, and metrics files
- Record what must be migrated, reframed, delegated, deferred, or retired
- Pick the first one-chapter golden target, preferably Book I Chapter III so it
  aligns with the current pruned legacy slice

### T02 - Assisted generation adapter and CLI boundary

```task
id: IB-WP-0013-T02
status: todo
priority: high
state_hub_task_id: "70beb49c-49a3-49f4-9b3a-a4c5bdb88485"
```

- Extend workflow execution so assisted stages can be executed through an
  explicit adapter selected by the caller
- Keep dry-run planning as the default safe path
- Add a deterministic fake adapter for tests
- Persist assisted requests, provider metadata, generated outputs, and run
  records
- Expose CLI/API behavior without embedding provider-specific code in core
  workflow logic
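
One possible shape for this boundary is sketched below: the caller passes an
adapter explicitly, and omitting it yields a dry-run record. Every name here
(`AssistedRequest`, `AssistedResult`, `AssistedAdapter`, `FakeAdapter`,
`run_stage`) is hypothetical and chosen for illustration; the real interface
is still to be designed under this task.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class AssistedRequest:
    """Hypothetical request record for one assisted stage."""
    workflow: str
    stage: str
    prompt: str


@dataclass(frozen=True)
class AssistedResult:
    """Hypothetical result record, carrying provider metadata for provenance."""
    output: str
    provider: str
    metadata: dict = field(default_factory=dict)


class AssistedAdapter(Protocol):
    """Anything with a matching `generate` method can serve as an adapter."""
    def generate(self, request: AssistedRequest) -> AssistedResult: ...


class FakeAdapter:
    """Deterministic test double: the output depends only on the request."""

    def generate(self, request: AssistedRequest) -> AssistedResult:
        payload = json.dumps([request.workflow, request.stage, request.prompt])
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        return AssistedResult(
            output=f"fake-output-{digest}",
            provider="fake",
            metadata={"request_digest": digest},
        )


def run_stage(request, adapter=None):
    """Return a run record; without an adapter this only plans (dry run)."""
    if adapter is None:
        return {"mode": "dry-run", "request": request, "result": None}
    return {"mode": "executed", "request": request,
            "result": adapter.generate(request)}
```

Because the core function only sees the `generate` method, provider-specific
code can live in separate adapter modules, and the default test suite can use
`FakeAdapter` without network access.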

### T03 - Entity bundle splitting and manifest registration

```task
id: IB-WP-0013-T03
status: todo
priority: high
state_hub_task_id: "4a340077-f0ab-40fe-a0bc-0fa94a325774"
```

- Parse generated chapter-level entity bundles into individual entity artifacts
- Normalize stable artifact IDs and filenames
- Register each artifact in `artifacts/index.yaml`
- Preserve source chapter, workflow, stage, provider, and input provenance
- Make reruns idempotent: unchanged artifacts should not duplicate manifest
  entries
- Add tests for malformed bundles, duplicate entities, and manifest updates
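
The splitting and idempotent registration described above could look roughly
like the following. The bundle format (`## Entity: <name>` headings), the ID
scheme, and the in-memory manifest dict are assumptions for illustration only;
the actual bundle syntax and the `artifacts/index.yaml` schema are not
specified in this workplan.

```python
import hashlib
import re

# Assumed bundle format: each entity starts with a "## Entity: <name>" heading.
ENTITY_HEADING = re.compile(r"^## Entity: (.+)$", re.MULTILINE)


def slugify(name):
    """Normalize a display name into a stable, filename-safe slug."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def split_bundle(bundle, chapter):
    """Split one chapter-level bundle into per-entity artifact records."""
    matches = list(ENTITY_HEADING.finditer(bundle))
    if not matches:
        raise ValueError("malformed bundle: no entity headings found")
    artifacts, seen = [], set()
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(bundle)
        body = bundle[match.end():end].strip()
        artifact_id = f"{chapter}--{slugify(match.group(1))}"  # hypothetical ID scheme
        if artifact_id in seen:
            raise ValueError(f"duplicate entity: {artifact_id}")
        seen.add(artifact_id)
        artifacts.append({
            "artifact_id": artifact_id,
            "filename": f"{artifact_id}.md",
            "source_chapter": chapter,
            "digest": hashlib.sha256(body.encode("utf-8")).hexdigest(),
            "body": body,
        })
    return artifacts


def register(manifest, artifacts):
    """Upsert entries by artifact_id; return how many were added or changed."""
    changed = 0
    for artifact in artifacts:
        entry = {k: v for k, v in artifact.items() if k != "body"}
        if manifest.get(artifact["artifact_id"]) != entry:
            manifest[artifact["artifact_id"]] = entry
            changed += 1
    return changed
```

Keying the manifest by `artifact_id` and comparing content digests means a
rerun over unchanged output reports zero changes instead of appending
duplicate entries.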

### T04 - VSM mapping analysis and evaluation workflows

```task
id: IB-WP-0013-T04
status: todo
priority: high
state_hub_task_id: "62696191-d6fa-4d34-bf18-97f390a31b61"
```

- Recreate `map-to-vsm` as an explicit assisted workflow
- Recreate `synthesize-analysis` as an explicit assisted workflow
- Recreate entity evaluation as an explicit assisted workflow that writes
  successor `artifact_id` evaluation files
- Ensure generated mappings and relations can be parsed by current semantic
  models or clearly identify required model extensions
- Connect generated evaluations to metrics/history and viability checks

### T05 - Wealth VSM pilot scale-up acceptance

```task
id: IB-WP-0013-T05
status: todo
priority: medium
state_hub_task_id: "fe8dd175-9630-4fe1-99aa-2f3e58172a52"
```

- Prove one-chapter regeneration end to end with deterministic tests
- Add a committed pilot report comparing regenerated successor output with the
  legacy generated output shape
- Add docs for running a live provider-backed generation outside the default
  test suite
- Document cost, rate-limit, resume, and reproducibility guidance
- Define the acceptance path for scaling from one chapter to the full corpus

## Acceptance

- A user can inspect, plan, and run the Wealth/VSM generation workflow over a
  one-chapter pilot without using the old `markitect-project` process script
- Default tests use fake adapters and are deterministic
- Generated entities are split into stable files and registered in the manifest
- Evaluation outputs use successor `artifact_id` semantics and feed metrics
  history
- The workflow clearly distinguishes deterministic template stages from
  assisted provider-backed stages
- Remaining full-corpus risks are documented before any large generation run

## Relationship To IB-WP-0014

This workplan can start on the current local-folder backend. It should avoid
hard-coding storage assumptions where reasonable, but it is not blocked by the
backend abstraction workplan.

157
workplans/IB-WP-0014-infospace-backend-abstraction.md
Normal file
@@ -0,0 +1,157 @@
---
id: IB-WP-0014
type: workplan
title: "Infospace Backend Abstraction"
domain: markitect
repo: infospace-bench
status: planned
owner: markitect
topic_slug: markitect
created: "2026-05-14"
updated: "2026-05-14"
state_hub_workstream_slug: "ib-wp-0014-infospace-backend-abstraction"
state_hub_workstream_id: "c2d23ee7-6b2b-4db0-b660-a9e295c94956"
---

# IB-WP-0014 - Infospace Backend Abstraction

## Goal

Allow an infospace to live behind a selectable backend instead of assuming only
a local filesystem directory.

Target backends:

- local folder
- remote or mounted folder
- S3-compatible bucket/prefix
- git repository

This is a new successor capability, not legacy parity. It should be designed so
generation, validation, evaluation, and inspection logic do not care where the
infospace is physically stored.

## Intent

The current repo is intentionally file-backed. That should remain the default.
The improvement is to formalize the storage boundary so the same lifecycle and
workflow APIs can operate on other backing stores through explicit adapters.

The design should keep `infospace-bench` as an application workspace, not a
durable storage engine. Credentials, remote locking, rich audit, and runtime
orchestration should be delegated or integrated carefully rather than invented
inside core application logic.

## Non-Goals

- Replace the existing local folder behavior.
- Require S3 or git dependencies for ordinary local use.
- Store secrets in `infospace.yaml`.
- Build a general database, sync server, or object storage service inside this
  repo.
- Solve multi-writer conflict resolution beyond clear detection and reporting
  in the first pass.

## Tasks

### T01 - Backend contract and URI model

```task
id: IB-WP-0014-T01
status: todo
priority: high
state_hub_task_id: "75b7df31-066a-47ac-bb94-a4ae908569fd"
```

- Define a backend-neutral infospace location model
- Support local paths without changing current user flows
- Define URI examples for local, mounted folder, S3-compatible, and git-backed
  infospaces
- Define backend capabilities: read, write, list, exists, atomic write,
  digest, version, sync, lock, and credentials-required
- Document where credentials and remote configuration are allowed to live
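
To make the location model and capability list more concrete, here is a
minimal sketch. The URI forms (`file://`, `s3://bucket/prefix`, `git+<url>`),
the `InfospaceLocation` and `Capability` types, and `parse_location` are
hypothetical examples rather than a decided contract; only
`urllib.parse.urlsplit` is an existing standard-library call.

```python
from dataclasses import dataclass
from enum import Flag, auto
from urllib.parse import urlsplit


class Capability(Flag):
    """Capability flags taken from the list above (names are illustrative)."""
    READ = auto()
    WRITE = auto()
    LIST = auto()
    EXISTS = auto()
    ATOMIC_WRITE = auto()
    DIGEST = auto()
    VERSION = auto()
    SYNC = auto()
    LOCK = auto()
    CREDENTIALS_REQUIRED = auto()


@dataclass(frozen=True)
class InfospaceLocation:
    """Hypothetical backend-neutral location."""
    backend: str      # "folder", "s3", or "git"
    target: str       # path, bucket name, or repository URL
    prefix: str = ""  # key prefix inside a bucket, if any


def parse_location(value):
    """Map a plain path or an assumed URI form onto a location."""
    parts = urlsplit(value)
    if parts.scheme == "s3":
        return InfospaceLocation("s3", parts.netloc, parts.path.lstrip("/"))
    if parts.scheme.startswith("git+"):
        return InfospaceLocation("git", value[len("git+"):])
    if parts.scheme == "file":
        return InfospaceLocation("folder", parts.path)
    if "://" in value:
        raise ValueError(f"unsupported infospace URI scheme: {parts.scheme!r}")
    # Plain paths (including mounted folders) keep working unchanged.
    return InfospaceLocation("folder", value)
```

Note that no credentials appear in the location itself, in line with the
non-goal of storing secrets in `infospace.yaml`; a backend that needs them
would advertise `Capability.CREDENTIALS_REQUIRED` and obtain them elsewhere.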

### T02 - Local and remote folder backend baseline

```task
id: IB-WP-0014-T02
status: todo
priority: high
state_hub_task_id: "2e33d98a-0cd0-4608-b7a1-76c5a7bb26ca"
```

- Refactor lifecycle reads and writes behind a backend adapter while preserving
  current `Path`-based behavior
- Keep local folders as the default backend
- Treat mounted or remote folders as folder backends when the OS exposes them
  as paths
- Add tests proving current pilots and CLI commands still work unchanged
- Add tests for backend errors such as missing files, write failures, and
  unsafe paths
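
A folder backend that keeps `Path`-based behavior while rejecting unsafe paths
might be sketched as follows. `FolderBackend` and `UnsafePathError` are
hypothetical names, and the method set is a small assumed subset of the
eventual contract.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a relative path would escape the infospace root."""


class FolderBackend:
    """Hypothetical folder adapter; works for local and OS-mounted folders."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, relative):
        # Resolve "..", symlinks, and absolute inputs, then check containment.
        candidate = (self.root / relative).resolve()
        if candidate != self.root and self.root not in candidate.parents:
            raise UnsafePathError(f"path escapes infospace root: {relative}")
        return candidate

    def exists(self, relative):
        return self._resolve(relative).exists()

    def read_text(self, relative):
        # A missing file surfaces as the usual FileNotFoundError.
        return self._resolve(relative).read_text(encoding="utf-8")

    def write_text(self, relative, text):
        path = self._resolve(relative)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
```

Resolving before the containment check means that both `../` traversal and
absolute paths such as `/etc/passwd` are rejected, since joining an absolute
path onto the root discards the root entirely.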

### T03 - S3 object-store backend adapter

```task
id: IB-WP-0014-T03
status: todo
priority: high
state_hub_task_id: "e2ee9497-0a6c-419f-a045-fb994bf73b05"
```

- Design an optional S3-compatible backend adapter
- Use a fake in-memory or local test double for default tests
- Keep real credentials and network calls out of the default test suite
- Define object key layout for manifests, artifacts, reports, exports, and run
  records
- Decide how digests, optimistic concurrency, and partial writes are reported
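
One way to exercise digest reporting and optimistic concurrency without a
network is an in-memory test double like the one below. It does not model the
real S3 API; `InMemoryObjectStore`, `PreconditionFailed`, and the
`expected_digest` semantics are assumptions chosen only to illustrate the
idea.

```python
import hashlib


class PreconditionFailed(Exception):
    """Raised when the stored object no longer matches the caller's view."""


class InMemoryObjectStore:
    """Hypothetical test double for an object-store backend."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def digest(data):
        return hashlib.sha256(data).hexdigest()

    def get(self, key):
        """Return the bytes together with their digest."""
        data = self._objects[key]
        return data, self.digest(data)

    def put(self, key, data, expected_digest=None):
        """Write only if the current digest matches `expected_digest`.

        `expected_digest=None` means "the key must not exist yet".
        """
        current = self._objects.get(key)
        current_digest = self.digest(current) if current is not None else None
        if current_digest != expected_digest:
            raise PreconditionFailed(
                f"{key}: expected {expected_digest}, found {current_digest}"
            )
        self._objects[key] = data
        return self.digest(data)

    def list_keys(self, prefix=""):
        return sorted(k for k in self._objects if k.startswith(prefix))
```

A stale writer gets an explicit `PreconditionFailed` instead of silently
overwriting newer data, which matches the first-pass aim of detecting and
reporting conflicts rather than resolving them.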

### T04 - Git repository backend adapter

```task
id: IB-WP-0014-T04
status: todo
priority: high
state_hub_task_id: "e2938c5b-e6c2-468a-b782-b39962e5a81b"
```

- Support opening or initializing an infospace backed by a git repository
- Prove behavior against local test repositories before any remote network
  workflow
- Define when commits are created, when they are only suggested, and how dirty
  trees are reported
- Keep automatic commits opt-in
- Preserve compatibility with the existing State Hub and workplan workflow

### T05 - Backend CLI docs and migration path

```task
id: IB-WP-0014-T05
status: todo
priority: medium
state_hub_task_id: "20d75d49-f62a-4236-a895-698cd2fae45a"
```

- Expose backend selection in CLI/API docs
- Add examples for local, mounted folder, S3-compatible, and git-backed
  infospaces
- Document backend capabilities and limitations
- Add a migration guide for moving a local infospace to another backend
- Update acceptance docs so backend support is distinct from Wealth/VSM
  generation parity

## Acceptance

- Existing local-folder behavior remains backward compatible
- Lifecycle, validation, inspection, workflow, metrics, history, and graph
  commands can operate through the backend contract
- Default tests remain deterministic and do not require network credentials
- Backend-specific capabilities and failure modes are visible to callers
- S3 and git support are optional and clearly documented
- Storage backend concerns stay separate from generation workflow semantics

## Relationship To IB-WP-0013

`IB-WP-0013` should prove generation parity on the default local backend first.
This workplan then makes the same infospace operations portable across storage
backends.