---
id: IB-WP-0003
type: workplan
title: "Evaluation And Inspection Framework"
domain: markitect
repo: infospace-bench
status: done
owner: markitect
topic_slug: markitect
created: "2026-05-03"
updated: "2026-05-14"
state_hub_workstream_slug: "ib-wp-0003-evaluation-and-inspection"
state_hub_workstream_id: "bc368ba0-9fd7-4821-a5d7-e5c301faa80a"
---

# IB-WP-0003 — Evaluation And Inspection Framework

## Goal

Reestablish infospace quality evaluation and inspection as first-class application behavior.

## FRS Coverage

- FR-030 to FR-032: evaluation and quality assessment
- FR-040 to FR-042: inspection, exploration, and visualization
- FR-080 to FR-081: optional AI-assisted operations and context provision

## Tasks

### T01 — Port evaluation result concepts

```task
id: IB-WP-0003-T01
status: done
priority: high
state_hub_task_id: "9bab4b20-3fef-469e-9ce2-f0db3e05e26a"
```

- Reimplement score entries, entity evaluations, metric values, snapshots, and diffs from `markitect/infospace/evaluation.py`
- Keep serialization simple and inspectable

### T02 — Rebuild collection checks

```task
id: IB-WP-0003-T02
status: done
priority: high
state_hub_task_id: "ee335d74-5be3-4b94-91e3-509486909f93"
```

- Recreate redundancy, coverage, coherence, consistency, and granularity checks
- Keep dependencies explicit for embeddings and relationship graphs
- Write results to reusable structured outputs

### T03 — Add viability evaluation

```task
id: IB-WP-0003-T03
status: done
priority: high
state_hub_task_id: "d46b3429-37ef-4375-96e1-304eabf2cc13"
```

- Compare latest metrics to `infospace.yaml` thresholds
- Report pass/fail per threshold and overall viability

### T04 — Add relationship inspection output

```task
id: IB-WP-0003-T04
status: done
priority: medium
state_hub_task_id: "de4f45e4-81a1-4ddb-98de-15e99ed5605a"
```

- Export relationship summaries and graph representations
- Support at least one textual output and one graph-friendly output

## Acceptance

- Evaluation outputs are structured and diffable
- Collection-level metrics can be produced for a sample infospace
- Viability can be computed from declared thresholds
- Relationship structure is inspectable without hidden state
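
The viability criterion (T03: compare the latest metrics to declared thresholds and report pass/fail per threshold plus an overall verdict) could be sketched roughly as below. This is an illustration only, not the code in `infospace-bench`: the function name `evaluate_viability`, the `ThresholdResult` record, and the `{"min": ..., "max": ...}` threshold shape are all assumptions, since the actual `infospace.yaml` schema is not shown here. The thresholds are passed in as an already-parsed dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ThresholdResult:
    """Outcome of one threshold check (hypothetical record shape)."""
    metric: str
    value: Optional[float]  # None when the metric was never measured
    kind: str               # "min" or "max"
    bound: float
    passed: bool


def evaluate_viability(
    metrics: Dict[str, float],
    thresholds: Dict[str, Dict[str, float]],
) -> dict:
    """Compare the latest metric values to declared thresholds.

    `thresholds` maps a metric name to optional "min"/"max" bounds.
    This layout is an assumption, not the real infospace.yaml schema.
    """
    results: List[ThresholdResult] = []
    # Sort by metric name so the report is stable and diffable.
    for metric, bounds in sorted(thresholds.items()):
        value = metrics.get(metric)
        for kind in ("min", "max"):
            if kind not in bounds:
                continue
            bound = bounds[kind]
            if value is None:
                passed = False  # a missing metric fails its threshold
            elif kind == "min":
                passed = value >= bound
            else:
                passed = value <= bound
            results.append(ThresholdResult(metric, value, kind, bound, passed))
    # Overall viability requires every declared threshold to pass.
    return {"viable": all(r.passed for r in results), "results": results}


# Illustrative values only; these are not measurements from a real infospace.
latest_metrics = {"coverage": 0.82, "redundancy": 0.31}
declared = {
    "coverage": {"min": 0.75},
    "redundancy": {"max": 0.25},
    "coherence": {"min": 0.6},
}
report = evaluate_viability(latest_metrics, declared)
for r in report["results"]:
    print(f"{r.metric} {r.kind}={r.bound}: {'pass' if r.passed else 'fail'}")
print("viable:", report["viable"])
```

Treating an unmeasured metric as a failure is one possible design choice. It keeps a missing check from silently passing, which fits the "no hidden state" acceptance criterion, but a real implementation might report it as a separate "unknown" status instead.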