Initial implementation

2026-05-14 11:32:25 +02:00
parent 6fd1ff7581
commit 916a895a85
31 changed files with 1461 additions and 21 deletions
--- a/docs/evaluation-and-inspection.md
+++ b/docs/evaluation-and-inspection.md
@@ -0,0 +1,42 @@
+# Evaluation And Inspection
+
+`infospace-bench` now has a deterministic baseline for evaluation and
+inspection. It is intentionally small: the repo can produce structured quality
+objects and relationship summaries before any LLM or engine integration is
+introduced.
+
+## Evaluation Objects
+
+- `ScoreEntry`
+- `EntityEvaluation`
+- `MetricValue`
+- `EvaluationSnapshot`
+- `SnapshotDiff`
+
+Snapshots are serializable through `to_dict()` / `from_dict()` and can be
+compared with `diff_snapshots()`.
+
+## Collection Checks
+
+`run_collection_checks()` produces five baseline metrics:
+
+- `redundancy_ratio`
+- `coverage_ratio`
+- `coherence_components`
+- `consistency_cycles`
+- `granularity_entropy`
+
+These metrics are deliberately deterministic and file-backed. Later work can
+replace or extend their internals with embeddings, richer graph analysis, or
+agent-assisted evaluation without changing the result contract.
+
+## Viability
+
+`evaluate_viability()` compares metric values against declared
+`ViabilityThreshold` values. A missing metric is reported as a failure.
+
+## Relationship Inspection
+
+`relationship_summary()` extracts nodes, edges, and relationship type counts
+from artifact manifests. `export_mermaid()` provides the first graph-friendly
+representation.
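Note on "Evaluation Objects": the doc names `EvaluationSnapshot`, `SnapshotDiff`, `to_dict()` / `from_dict()` and `diff_snapshots()` but not their fields or signatures. The sketch below is one possible shape, not the repo's code. The `label` and `metrics` fields and the `added` / `removed` / `changed` layout are assumptions made for illustration. It shows a JSON round trip followed by a comparison of two snapshots.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class EvaluationSnapshot:
    # Hypothetical fields; the real class may carry more structure.
    label: str
    metrics: dict[str, float] = field(default_factory=dict)

    def to_dict(self) -> dict:
        return {"label": self.label, "metrics": dict(self.metrics)}

    @classmethod
    def from_dict(cls, data: dict) -> "EvaluationSnapshot":
        return cls(label=data["label"], metrics=dict(data["metrics"]))


@dataclass
class SnapshotDiff:
    # Hypothetical layout: metrics only in `after`, only in `before`,
    # and metrics present in both whose values differ (before, after).
    added: dict[str, float]
    removed: dict[str, float]
    changed: dict[str, tuple[float, float]]


def diff_snapshots(before: EvaluationSnapshot, after: EvaluationSnapshot) -> SnapshotDiff:
    old, new = before.metrics, after.metrics
    added = {k: v for k, v in new.items() if k not in old}
    removed = {k: v for k, v in old.items() if k not in new}
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    return SnapshotDiff(added=added, removed=removed, changed=changed)


old = EvaluationSnapshot("run-1", {"coverage_ratio": 0.5, "redundancy_ratio": 0.2})
new = EvaluationSnapshot("run-2", {"coverage_ratio": 0.75, "granularity_entropy": 1.0})

# Round trip through JSON text, as a file-backed store would.
restored = EvaluationSnapshot.from_dict(json.loads(json.dumps(new.to_dict())))
delta = diff_snapshots(old, restored)
```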
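Note on "Collection Checks": the doc lists the metric names but not their definitions. As a guess at what two of them could mean, the sketch below computes a connected-component count (a possible reading of `coherence_components`) and a Shannon entropy over category labels (a possible reading of `granularity_entropy`). Both readings, the function names and the input shapes are assumptions. The point is only that such metrics can be computed deterministically from plain data.

```python
from __future__ import annotations

import math
from collections import Counter


def count_components(nodes: list[str], edges: list[tuple[str, str]]) -> int:
    """Count connected components with a small union-find.

    Assumes every edge endpoint also appears in `nodes`.
    """
    parent = {n: n for n in nodes}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def shannon_entropy(labels: list[str]) -> float:
    """Shannon entropy in bits of the label distribution; 0.0 for no labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


components = count_components(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
entropy = shannon_entropy(["note", "note", "summary", "summary"])
```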
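Note on "Viability": a minimal sketch of threshold checking in which a missing metric shows up as its own failure. The `minimum` / `maximum` fields, the function signature and the result strings are assumptions. The doc gives only the names `evaluate_viability()` and `ViabilityThreshold`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ViabilityThreshold:
    # Hypothetical fields: an optional lower and upper bound per metric.
    metric: str
    minimum: float | None = None
    maximum: float | None = None


def evaluate_viability(
    metrics: dict[str, float], thresholds: list[ViabilityThreshold]
) -> dict[str, str]:
    results: dict[str, str] = {}
    for t in thresholds:
        if t.metric not in metrics:
            # A missing metric is recorded as a failure rather than dropped.
            results[t.metric] = "fail: metric missing"
            continue
        value = metrics[t.metric]
        too_low = t.minimum is not None and value < t.minimum
        too_high = t.maximum is not None and value > t.maximum
        results[t.metric] = "fail" if (too_low or too_high) else "pass"
    return results


report = evaluate_viability(
    {"coverage_ratio": 0.8, "redundancy_ratio": 0.4},
    [
        ViabilityThreshold("coverage_ratio", minimum=0.7),
        ViabilityThreshold("redundancy_ratio", maximum=0.3),
        ViabilityThreshold("granularity_entropy", minimum=0.5),
    ],
)
```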
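Note on "Relationship Inspection": a sketch of how a summary and a Mermaid export could fit together. The manifest shape (`id`, `relationships`, `type`, `target`) and the return structure are assumptions, and node ids are assumed to be Mermaid-safe identifiers. Only the Mermaid flowchart syntax (`graph TD`, `A -->|label| B`) is taken as given.

```python
from __future__ import annotations

from collections import Counter


def relationship_summary(manifests: list[dict]) -> dict:
    # Collect (source, type, target) triples from the hypothetical manifest shape.
    edges = [
        (m["id"], r["type"], r["target"])
        for m in manifests
        for r in m.get("relationships", [])
    ]
    # Nodes are every manifest id plus every edge target, sorted for stable output.
    nodes = sorted({m["id"] for m in manifests} | {target for _, _, target in edges})
    type_counts = dict(Counter(rel_type for _, rel_type, _ in edges))
    return {"nodes": nodes, "edges": edges, "type_counts": type_counts}


def export_mermaid(summary: dict) -> str:
    lines = ["graph TD"]
    for node in summary["nodes"]:
        lines.append(f'    {node}["{node}"]')
    for source, rel_type, target in summary["edges"]:
        lines.append(f"    {source} -->|{rel_type}| {target}")
    return "\n".join(lines)


manifests = [
    {"id": "a", "relationships": [{"type": "depends_on", "target": "b"}]},
    {"id": "b"},
]
summary = relationship_summary(manifests)
diagram = export_mermaid(summary)
```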