infospace-bench/workplans/IB-WP-0003-evaluation-and-inspection.md
2026-05-14 11:32:25 +02:00

id: IB-WP-0003
type: workplan
title: Evaluation And Inspection Framework
domain: markitect
repo: infospace-bench
status: done
owner: markitect
topic_slug: markitect
created: 2026-05-03
updated: 2026-05-14
state_hub_workstream_slug: ib-wp-0003-evaluation-and-inspection
state_hub_workstream_id: bc368ba0-9fd7-4821-a5d7-e5c301faa80a

IB-WP-0003 — Evaluation And Inspection Framework

Goal

Reestablish infospace quality evaluation and inspection as first-class application behavior.

FRS Coverage

  • FR-030 to FR-032: evaluation and quality assessment
  • FR-040 to FR-042: inspection, exploration, and visualization
  • FR-080 to FR-081: optional AI-assisted operations and context provision

Tasks

T01 — Port evaluation result concepts

id: IB-WP-0003-T01
status: done
priority: high
state_hub_task_id: "9bab4b20-3fef-469e-9ce2-f0db3e05e26a"
  • Reimplement score entries, entity evaluations, metric values, snapshots, and diffs from markitect/infospace/evaluation.py
  • Keep serialization simple and inspectable
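
The two bullets above could be sketched roughly as follows. This is a minimal illustration, not the contents of markitect/infospace/evaluation.py: the names `MetricValue`, `Snapshot`, and `diff_snapshots` are hypothetical, and only metric values, snapshots, and diffs are covered (score entries and entity evaluations would follow the same pattern). JSON with sorted keys is assumed here as one way to keep serialization simple and inspectable.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class MetricValue:
    """One named metric reading (hypothetical shape)."""
    name: str
    value: float


@dataclass
class Snapshot:
    """A labelled set of metric values captured at one point in time."""
    label: str
    metrics: list[MetricValue] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys and fixed indentation keep the text stable under diffs.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> Snapshot:
        raw = json.loads(text)
        return cls(raw["label"], [MetricValue(**m) for m in raw["metrics"]])


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict[str, dict[str, float | None]]:
    """Return old/new values for every metric that was added, removed, or changed."""
    before = {m.name: m.value for m in old.metrics}
    after = {m.name: m.value for m in new.metrics}
    changed: dict[str, dict[str, float | None]] = {}
    for name in sorted(before.keys() | after.keys()):
        if before.get(name) != after.get(name):
            changed[name] = {"old": before.get(name), "new": after.get(name)}
    return changed
```

A metric missing from one side shows up in the diff with `None` for that side, so additions and removals stay visible without a separate code path.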

T02 — Rebuild collection checks

id: IB-WP-0003-T02
status: done
priority: high
state_hub_task_id: "ee335d74-5be3-4b94-91e3-509486909f93"
  • Recreate redundancy, coverage, coherence, consistency, and granularity checks
  • Keep dependencies explicit for embeddings and relationship graphs
  • Write results to reusable structured outputs
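
As an illustration of the check shape described above, here is a hedged sketch of a redundancy check only. The function name, the `CheckResult` structure, the cosine-similarity approach, and the 0.95 default threshold are all assumptions made for the example, not the project's actual design. The embeddings are passed in as an argument so the dependency stays explicit, and the result is a plain structure that other tools could reuse.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class CheckResult:
    """Structured output of one collection check (hypothetical shape)."""
    check: str
    score: float
    details: list[dict]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def redundancy_check(
    embeddings: dict[str, list[float]],  # explicit dependency: entity id -> vector
    threshold: float = 0.95,             # assumed default, not from the source
) -> CheckResult:
    """Flag entity pairs whose embeddings are at least `threshold` similar."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        similarity = cosine(vec_a, vec_b)
        if similarity >= threshold:
            pairs.append({"a": id_a, "b": id_b, "similarity": round(similarity, 4)})
    total_pairs = len(embeddings) * (len(embeddings) - 1) // 2
    # Score is the fraction of all pairs that were flagged as near-duplicates.
    score = len(pairs) / total_pairs if total_pairs else 0.0
    return CheckResult("redundancy", score, pairs)
```

The coverage, coherence, consistency, and granularity checks could return the same `CheckResult` shape, with relationship graphs passed in the same explicit way where they are needed.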

T03 — Add viability evaluation

id: IB-WP-0003-T03
status: done
priority: high
state_hub_task_id: "d46b3429-37ef-4375-96e1-304eabf2cc13"
  • Compare latest metrics to infospace.yaml thresholds
  • Report pass/fail per threshold and overall viability
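
A minimal sketch of the comparison described above, under stated assumptions: the layout of the thresholds section in infospace.yaml is not given in this workplan, so the example assumes it has already been parsed into a mapping of metric name to `min` and/or `max` bounds. `ThresholdResult` and `evaluate_viability` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdResult:
    """Pass/fail outcome for one declared threshold (hypothetical shape)."""
    metric: str
    value: float | None
    bound: str
    limit: float
    passed: bool


def evaluate_viability(
    metrics: dict[str, float],
    thresholds: dict[str, dict[str, float]],  # assumed shape: {"coverage": {"min": 0.7}}
) -> tuple[bool, list[ThresholdResult]]:
    """Compare the latest metrics to thresholds; viable only if every check passes."""
    results = []
    for metric in sorted(thresholds):
        value = metrics.get(metric)
        for bound, limit in sorted(thresholds[metric].items()):
            if bound not in ("min", "max"):
                raise ValueError(f"unknown bound {bound!r} for metric {metric!r}")
            if value is None:
                passed = False  # a missing metric cannot satisfy a threshold
            elif bound == "min":
                passed = value >= limit
            else:
                passed = value <= limit
            results.append(ThresholdResult(metric, value, bound, limit, passed))
    return all(r.passed for r in results), results
```

Treating a missing metric as a failure is a design choice made for this sketch; reporting it as a separate "not evaluated" state would be an equally reasonable alternative.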

T04 — Add relationship inspection output

id: IB-WP-0003-T04
status: done
priority: medium
state_hub_task_id: "de4f45e4-81a1-4ddb-98de-15e99ed5605a"
  • Export relationship summaries and graph representations
  • Support at least one textual output and one graph-friendly output
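
The two output kinds mentioned above might look like the following sketch. It assumes relationships are available as (source, relation type, target) triples and picks Graphviz DOT as the graph-friendly format; both are illustrative choices, and `summarize` and `to_dot` are hypothetical names rather than the project's actual exporters.

```python
from __future__ import annotations

from collections import Counter


def summarize(relationships: list[tuple[str, str, str]]) -> str:
    """Textual output: total count plus a per-type breakdown."""
    by_type = Counter(rel for _, rel, _ in relationships)
    lines = [f"{len(relationships)} relationships"]
    lines += [f"  {rel}: {count}" for rel, count in sorted(by_type.items())]
    return "\n".join(lines)


def _quote(text: str) -> str:
    """Wrap text in a DOT quoted string, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(relationships: list[tuple[str, str, str]]) -> str:
    """Graph-friendly output: a Graphviz DOT digraph with labelled edges."""
    lines = ["digraph infospace {"]
    # Sorting makes the output deterministic, so two exports can be diffed.
    for source, rel, target in sorted(relationships):
        lines.append(f"  {_quote(source)} -> {_quote(target)} [label={_quote(rel)}];")
    lines.append("}")
    return "\n".join(lines)
```

Both functions take the relationship list as their only input, so the exported views depend on nothing beyond the data passed in.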

Acceptance

  • Evaluation outputs are structured and diffable
  • Collection-level metrics can be produced for a sample infospace
  • Viability can be computed from declared thresholds
  • Relationship structure is inspectable without hidden state