---
id: IB-WP-0003
type: workplan
title: "Evaluation And Inspection Framework"
domain: markitect
repo: infospace-bench
status: done
owner: markitect
topic_slug: markitect
created: "2026-05-03"
updated: "2026-05-14"
state_hub_workstream_slug: "ib-wp-0003-evaluation-and-inspection"
state_hub_workstream_id: "bc368ba0-9fd7-4821-a5d7-e5c301faa80a"
---
# IB-WP-0003 — Evaluation And Inspection Framework
## Goal
Reestablish infospace quality evaluation and inspection as first-class
application behavior.
## FRS Coverage
- FR-030 to FR-032: evaluation and quality assessment
- FR-040 to FR-042: inspection, exploration, and visualization
- FR-080 to FR-081: optional AI-assisted operations and context provision
## Tasks
### T01 — Port evaluation result concepts
```task
id: IB-WP-0003-T01
status: done
priority: high
state_hub_task_id: "9bab4b20-3fef-469e-9ce2-f0db3e05e26a"
```
- Reimplement score entries, entity evaluations, metric values, snapshots, and
  diffs from `markitect/infospace/evaluation.py`
- Keep serialization simple and inspectable
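
A minimal sketch of what "simple and inspectable" serialization could look like for snapshots and diffs. The names `MetricValue`, `Snapshot`, and `diff_snapshots` are hypothetical and chosen for illustration; they are not taken from `markitect/infospace/evaluation.py`, whose actual shapes may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class MetricValue:
    # Hypothetical shape: one named measurement.
    name: str
    value: float


@dataclass
class Snapshot:
    # Hypothetical shape: all metric values captured for one evaluation run.
    label: str
    metrics: list[MetricValue] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys and fixed indentation keep the text stable across runs,
        # so two snapshots can be compared with an ordinary text diff.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict[str, float]:
    """Return new-minus-old deltas for metrics present in both snapshots."""
    old_values = {m.name: m.value for m in old.metrics}
    return {
        m.name: m.value - old_values[m.name]
        for m in new.metrics
        if m.name in old_values
    }


before = Snapshot("run-1", [MetricValue("coverage", 0.5), MetricValue("redundancy", 0.25)])
after = Snapshot("run-2", [MetricValue("coverage", 0.75), MetricValue("redundancy", 0.25)])
delta = diff_snapshots(before, after)
```

Plain dataclasses plus `json.dumps` keep the on-disk form readable without custom tooling. Metrics that appear in only one snapshot are ignored here; a fuller version would likely report them as added or removed.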
### T02 — Rebuild collection checks
```task
id: IB-WP-0003-T02
status: done
priority: high
state_hub_task_id: "ee335d74-5be3-4b94-91e3-509486909f93"
```
- Recreate redundancy, coverage, coherence, consistency, and granularity checks
- Keep dependencies explicit for embeddings and relationship graphs
- Write results to reusable structured outputs
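
One possible shape for a collection check, using redundancy as the example. This is a sketch under stated assumptions: redundancy is taken to mean the share of entity pairs whose embeddings are near-duplicates by cosine similarity, and `CheckResult`, `redundancy_check`, and the `0.95` default are hypothetical. The embeddings arrive as an explicit argument, which is one way to keep that dependency visible.

```python
import math
from dataclasses import dataclass


@dataclass
class CheckResult:
    # Hypothetical structured output that every collection check could share.
    check: str
    score: float
    details: list[tuple[str, str]]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def redundancy_check(
    embeddings: dict[str, list[float]], threshold: float = 0.95
) -> CheckResult:
    """Flag entity pairs whose embeddings are near-duplicates.

    Embeddings are supplied by the caller; the check never computes them.
    """
    ids = sorted(embeddings)
    pairs = [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if cosine(embeddings[a], embeddings[b]) >= threshold
    ]
    total = len(ids) * (len(ids) - 1) // 2
    # Score is the fraction of all pairs that were flagged as redundant.
    score = len(pairs) / total if total else 0.0
    return CheckResult("redundancy", score, pairs)


result = redundancy_check({
    "doc-a": [1.0, 0.0],
    "doc-b": [1.0, 0.0],
    "doc-c": [0.0, 1.0],
})
```

The pairwise loop is quadratic in the number of entities, which is acceptable for a sample infospace but would need an index or batching for large collections.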
### T03 — Add viability evaluation
```task
id: IB-WP-0003-T03
status: done
priority: high
state_hub_task_id: "d46b3429-37ef-4375-96e1-304eabf2cc13"
```
- Compare latest metrics to `infospace.yaml` thresholds
- Report pass/fail per threshold and overall viability
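
The comparison could be sketched as below. The threshold layout (a mapping from metric name to optional `min` and `max` bounds) is an assumption; the real `infospace.yaml` schema is not shown in this workplan. The dictionary literal stands in for already-parsed YAML, and `ThresholdResult` and `evaluate_viability` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdResult:
    # Hypothetical per-threshold report entry.
    metric: str
    value: Optional[float]
    passed: bool


def evaluate_viability(
    metrics: dict[str, float],
    thresholds: dict[str, dict[str, float]],
) -> tuple[list[ThresholdResult], bool]:
    """Check each declared threshold; overall viability requires all to pass."""
    results = []
    for name, bounds in sorted(thresholds.items()):
        value = metrics.get(name)
        lower = bounds.get("min", float("-inf"))
        upper = bounds.get("max", float("inf"))
        # A metric that was never measured fails its threshold.
        passed = value is not None and lower <= value <= upper
        results.append(ThresholdResult(name, value, passed))
    return results, all(r.passed for r in results)


# Stand-in for thresholds parsed from infospace.yaml (assumed layout).
thresholds = {"coverage": {"min": 0.7}, "redundancy": {"max": 0.2}}
latest_metrics = {"coverage": 0.8, "redundancy": 0.3}

results, viable = evaluate_viability(latest_metrics, thresholds)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: a declared threshold that cannot be checked should not silently count as passing.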
### T04 — Add relationship inspection output
```task
id: IB-WP-0003-T04
status: done
priority: medium
state_hub_task_id: "de4f45e4-81a1-4ddb-98de-15e99ed5605a"
```
- Export relationship summaries and graph representations
- Support at least one textual output and one graph-friendly output
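
A sketch of the two output kinds, assuming relationships are available as `(source, relation, target)` triples. That triple shape and the function names are hypothetical. Graphviz DOT is used here as one example of a graph-friendly format; the workplan does not name a specific one.

```python
from collections import Counter

# Hypothetical edge shape: (source, relation, target).
Edge = tuple[str, str, str]


def text_summary(edges: list[Edge]) -> str:
    """Plain-text summary: total count plus a count per relation type."""
    counts = Counter(rel for _, rel, _ in edges)
    lines = [f"{len(edges)} relationships"]
    lines += [f"  {rel}: {n}" for rel, n in sorted(counts.items())]
    return "\n".join(lines)


def to_dot(edges: list[Edge]) -> str:
    """Graphviz DOT digraph; edges are sorted so the output is diffable.

    Identifiers are assumed not to contain double quotes; this sketch
    does no escaping.
    """
    body = [f'  "{s}" -> "{t}" [label="{r}"];' for s, r, t in sorted(edges)]
    return "\n".join(["digraph relationships {", *body, "}"])


edges = [("a", "cites", "b"), ("b", "refines", "c"), ("a", "cites", "c")]
summary = text_summary(edges)
dot = to_dot(edges)
```

Both functions are pure and take the edge list as input, so the output depends only on what is passed in.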
## Acceptance
- Evaluation outputs are structured and diffable
- Collection-level metrics can be produced for a sample infospace
- Viability can be computed from declared thresholds
- Relationship structure is inspectable without hidden state