# Evaluation And Inspection

`infospace-bench` provides a deterministic baseline for evaluation and
inspection. The baseline is intentionally small: the repo can produce
structured quality objects and relationship summaries without depending on any
LLM or engine integration, both of which may be introduced later.

## Evaluation Objects

The baseline defines five evaluation objects:

- `ScoreEntry`
- `EntityEvaluation`
- `MetricValue`
- `EvaluationSnapshot`
- `SnapshotDiff`

Snapshots are serializable through `to_dict()` / `from_dict()` and can be
compared with `diff_snapshots()`.
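
The round trip and comparison described above can be sketched as follows. This is a minimal illustration, not the repo's implementation: `SnapshotSketch` and `diff_sketch` are hypothetical stand-ins for `EvaluationSnapshot` and `diff_snapshots()`, and the `label` and `metrics` fields are assumptions about what a snapshot might hold.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SnapshotSketch:
    """Hypothetical stand-in for EvaluationSnapshot; field names are assumed."""
    label: str
    metrics: dict = field(default_factory=dict)  # metric name -> float

    def to_dict(self) -> dict:
        # Emit plain data only, so the result can be written as JSON.
        return {"label": self.label, "metrics": dict(self.metrics)}

    @classmethod
    def from_dict(cls, data: dict) -> "SnapshotSketch":
        return cls(label=data["label"], metrics=dict(data["metrics"]))


def diff_sketch(before: SnapshotSketch, after: SnapshotSketch) -> dict:
    """Classify metrics as added, removed, or changed (with numeric delta)."""
    added = sorted(set(after.metrics) - set(before.metrics))
    removed = sorted(set(before.metrics) - set(after.metrics))
    changed = {
        name: after.metrics[name] - before.metrics[name]
        for name in sorted(set(before.metrics) & set(after.metrics))
        if after.metrics[name] != before.metrics[name]
    }
    return {"added": added, "removed": removed, "changed": changed}


old = SnapshotSketch("run-1", {"coverage_ratio": 0.5, "redundancy_ratio": 0.25})
new = SnapshotSketch("run-2", {"coverage_ratio": 0.75, "granularity_entropy": 1.0})

# Round-trip through JSON to confirm the dict form is plain data.
restored = SnapshotSketch.from_dict(json.loads(json.dumps(old.to_dict())))
delta = diff_sketch(restored, new)
```

Keeping the serialized form to plain dictionaries, strings, and numbers is what makes a snapshot easy to store on disk and compare across runs.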

## Collection Checks

`run_collection_checks()` produces five baseline metrics:

- `redundancy_ratio`
- `coverage_ratio`
- `coherence_components`
- `consistency_cycles`
- `granularity_entropy`

These metrics are deliberately deterministic and file-backed. Later work can
replace or extend their internals with embeddings, richer graph analysis, or
agent-assisted evaluation without changing the result contract.
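
To illustrate what a deterministic check can look like, here is a sketch of two of the five metrics. The definitions are assumptions, since this document does not specify them: `coherence_components` is read as the number of connected components in the relationship graph, and `granularity_entropy` as the Shannon entropy (in bits) of a per-artifact label such as its kind. The helper names `count_components` and `shannon_entropy` are hypothetical and not part of `run_collection_checks()`.

```python
import math
from collections import Counter


def count_components(nodes, edges):
    """Count connected components, treating edges as undirected (union-find).

    Assumes every edge endpoint also appears in `nodes`.
    """
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def shannon_entropy(labels):
    """Shannon entropy in bits of the label distribution; 0 for no labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical collection: four artifacts, two relationships, two kinds.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c")]
kinds = ["section", "section", "paragraph", "paragraph"]

metrics = {
    "coherence_components": count_components(nodes, edges),
    "granularity_entropy": shannon_entropy(kinds),
}
```

Both functions depend only on their inputs, so repeated runs over the same files yield the same values. A later embedding-based or agent-assisted version could replace the function bodies while returning the same metric names.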

## Viability

`evaluate_viability()` compares metric values against declared
`ViabilityThreshold` values. A metric that has a threshold but no value is
reported as a failure rather than skipped silently.
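
The threshold comparison can be sketched as below. `ThresholdSketch` and `check_viability` are hypothetical stand-ins for `ViabilityThreshold` and `evaluate_viability()`; the `minimum` and `maximum` fields and the shape of the result are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThresholdSketch:
    """Hypothetical threshold: optional lower and upper bound for one metric."""
    metric: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def check_viability(metrics, thresholds):
    """Return a verdict plus one message per violated or missing metric."""
    failures = []
    for t in thresholds:
        if t.metric not in metrics:
            # A missing metric is a failure, never a silent pass.
            failures.append(f"{t.metric}: missing")
            continue
        value = metrics[t.metric]
        if t.minimum is not None and value < t.minimum:
            failures.append(f"{t.metric}: {value} < min {t.minimum}")
        if t.maximum is not None and value > t.maximum:
            failures.append(f"{t.metric}: {value} > max {t.maximum}")
    return {"viable": not failures, "failures": failures}


measured = {"coverage_ratio": 0.9, "redundancy_ratio": 0.4}
declared = [
    ThresholdSketch("coverage_ratio", minimum=0.8),
    ThresholdSketch("redundancy_ratio", maximum=0.2),
    ThresholdSketch("coherence_components", maximum=1),
]
report = check_viability(measured, declared)
```

Collecting every failure instead of stopping at the first one gives a complete picture of a collection in a single pass.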

## Relationship Inspection

`relationship_summary()` extracts nodes, edges, and relationship type counts
from artifact manifests. `export_mermaid()` provides the first graph-friendly
representation.
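
A rough sketch of the extraction and export steps follows. The manifest shape (an `id` plus an optional `relationships` list of `type` and `target` entries) is an assumption, and `summarize_sketch` and `to_mermaid_sketch` are hypothetical stand-ins for `relationship_summary()` and `export_mermaid()`. Node ids are assumed to be plain alphanumeric strings that Mermaid accepts without quoting.

```python
from collections import Counter


def summarize_sketch(manifests):
    """Collect nodes, typed edges, and per-type edge counts from manifests."""
    edges = [
        (m["id"], rel["type"], rel["target"])
        for m in manifests
        for rel in m.get("relationships", [])
    ]
    # Include edge targets so nodes without their own manifest still appear.
    nodes = sorted({m["id"] for m in manifests} | {t for _, _, t in edges})
    type_counts = dict(Counter(kind for _, kind, _ in edges))
    return {"nodes": nodes, "edges": edges, "type_counts": type_counts}


def to_mermaid_sketch(summary):
    """Render the summary as a top-down Mermaid flowchart with edge labels."""
    lines = ["graph TD"]
    lines += [f"    {n}" for n in summary["nodes"]]
    lines += [f"    {s} -->|{k}| {t}" for s, k, t in summary["edges"]]
    return "\n".join(lines)


manifests = [
    {"id": "a", "relationships": [
        {"type": "depends_on", "target": "b"},
        {"type": "refines", "target": "c"},
    ]},
    {"id": "b", "relationships": [{"type": "depends_on", "target": "c"}]},
    {"id": "c"},
]
summary = summarize_sketch(manifests)
diagram = to_mermaid_sketch(summary)
```

Sorting the node list and preserving manifest order for edges keeps the exported text stable between runs, which makes the diagram easy to diff in version control.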