Evaluation And Inspection
infospace-bench now has a deterministic baseline for evaluation and
inspection. It is intentionally small: the repo can produce structured quality
objects and relationship summaries before any LLM or engine integration is
introduced.
Evaluation Objects
- ScoreEntry
- EntityEvaluation
- MetricValue
- EvaluationSnapshot
- SnapshotDiff
Snapshots are serializable through to_dict() / from_dict() and can be
compared with diff_snapshots().
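The round trip and comparison described above can be sketched as follows. This is a minimal illustration, not the repo's implementation: `SketchSnapshot`, its fields (`label`, `metrics`), and `sketch_diff_snapshots` are hypothetical stand-ins, and the real `EvaluationSnapshot` and `diff_snapshots()` may carry more structure.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SketchSnapshot:
    """Hypothetical stand-in for EvaluationSnapshot (field names are assumptions)."""

    label: str
    metrics: dict = field(default_factory=dict)  # metric name -> float value

    def to_dict(self) -> dict:
        # Only plain JSON-compatible types, so the result can be written to a file.
        return {"label": self.label, "metrics": dict(self.metrics)}

    @classmethod
    def from_dict(cls, data: dict) -> "SketchSnapshot":
        return cls(label=data["label"], metrics=dict(data["metrics"]))


def sketch_diff_snapshots(before: SketchSnapshot, after: SketchSnapshot) -> dict:
    """Return per-metric deltas (after - before) for metrics present in both."""
    shared = before.metrics.keys() & after.metrics.keys()
    return {name: after.metrics[name] - before.metrics[name] for name in sorted(shared)}


# Serialize, reload, and compare two snapshots.
first = SketchSnapshot("run-1", {"coverage_ratio": 0.25, "redundancy_ratio": 0.5})
reloaded = SketchSnapshot.from_dict(json.loads(json.dumps(first.to_dict())))
second = SketchSnapshot("run-2", {"coverage_ratio": 0.75, "redundancy_ratio": 0.5})
deltas = sketch_diff_snapshots(reloaded, second)
```

Keeping the serialized form to plain dictionaries is what makes the snapshots easy to store next to the artifacts they describe and to compare across runs.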
Collection Checks
run_collection_checks() produces five baseline metrics:
- redundancy_ratio
- coverage_ratio
- coherence_components
- consistency_cycles
- granularity_entropy
These metrics are deliberately deterministic and file-backed. Later work can replace or extend their internals with embeddings, richer graph analysis, or agent-assisted evaluation without changing the result contract.
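To illustrate what "deterministic" can mean for one of these metrics, the sketch below computes a redundancy ratio from content hashes. The definition used here (share of documents whose normalized text duplicates another document) is an assumption made for illustration; the source does not state how `run_collection_checks()` computes `redundancy_ratio`, and `sketch_redundancy_ratio` is a hypothetical name.

```python
import hashlib


def sketch_redundancy_ratio(documents: dict) -> float:
    """Assumed definition: 1 - (unique normalized documents / total documents).

    `documents` maps a file name to its text. The same input always yields
    the same output, with no model or network call involved.
    """
    if not documents:
        return 0.0
    digests = set()
    for text in documents.values():
        # Collapse whitespace and case so trivially different copies hash alike.
        normalized = " ".join(text.split()).lower()
        digests.add(hashlib.sha256(normalized.encode("utf-8")).hexdigest())
    return 1.0 - len(digests) / len(documents)


docs = {
    "a.md": "Alpha entity description",
    "b.md": "alpha   entity description",  # duplicate of a.md after normalization
    "c.md": "Beta entity description",
    "d.md": "Gamma entity description",
}
ratio = sketch_redundancy_ratio(docs)
```

A later version could swap the hash comparison for embedding similarity while still returning a single float under the same metric name, which is the kind of internal change the result contract is meant to absorb.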
Viability
evaluate_viability() compares metric values against declared
ViabilityThreshold values. Missing metrics fail visibly.
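The threshold comparison can be sketched as below. This is an illustration under assumptions: `SketchThreshold`, its `minimum` and `maximum` fields, and the shape of the returned dictionary are hypothetical, and the real `ViabilityThreshold` and `evaluate_viability()` may differ. The point it shows is that a metric with no value is reported as a failure rather than silently skipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchThreshold:
    """Hypothetical stand-in for ViabilityThreshold (bounds are inclusive)."""

    metric: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def sketch_evaluate_viability(metrics: dict, thresholds: list) -> dict:
    """Check each declared threshold; a missing metric counts as a failure."""
    results = []
    for threshold in thresholds:
        value = metrics.get(threshold.metric)
        if value is None:
            results.append(
                {"metric": threshold.metric, "passed": False, "reason": "missing metric"}
            )
            continue
        above_min = threshold.minimum is None or value >= threshold.minimum
        below_max = threshold.maximum is None or value <= threshold.maximum
        passed = above_min and below_max
        results.append(
            {
                "metric": threshold.metric,
                "passed": passed,
                "reason": "ok" if passed else "out of range",
            }
        )
    return {"viable": all(r["passed"] for r in results), "results": results}


report = sketch_evaluate_viability(
    {"coverage_ratio": 0.8, "redundancy_ratio": 0.4},
    [
        SketchThreshold("coverage_ratio", minimum=0.5),
        SketchThreshold("redundancy_ratio", maximum=0.3),
        SketchThreshold("granularity_entropy", minimum=1.0),  # not measured
    ],
)
```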
Relationship Inspection
relationship_summary() extracts nodes, edges, and relationship type counts
from artifact manifests. export_mermaid() provides the first graph-friendly
representation.
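A rough sketch of this extraction and export is shown below. The manifest shape (an `id` plus a `relationships` list of `type` and `target` entries) and the function names `sketch_relationship_summary` and `sketch_export_mermaid` are assumptions for illustration; the actual manifest schema and the output of `export_mermaid()` are not specified here. The sketch also assumes node ids are already valid Mermaid identifiers.

```python
from collections import Counter


def sketch_relationship_summary(manifests: list) -> dict:
    """Collect nodes, edges, and per-type counts from manifest dictionaries."""
    nodes = set()
    edges = []
    for manifest in manifests:
        nodes.add(manifest["id"])
        for rel in manifest.get("relationships", []):
            nodes.add(rel["target"])
            edges.append((manifest["id"], rel["type"], rel["target"]))
    edges.sort()  # stable ordering keeps the output deterministic
    type_counts = Counter(rel_type for _, rel_type, _ in edges)
    return {"nodes": sorted(nodes), "edges": edges, "type_counts": dict(type_counts)}


def sketch_export_mermaid(summary: dict) -> str:
    """Render the summary as a Mermaid flowchart with labeled edges."""
    lines = ["graph TD"]
    for node in summary["nodes"]:
        lines.append(f"    {node}")
    for source, rel_type, target in summary["edges"]:
        lines.append(f"    {source} -->|{rel_type}| {target}")
    return "\n".join(lines)


manifests = [
    {"id": "a", "relationships": [{"type": "depends_on", "target": "b"}]},
    {"id": "b", "relationships": [{"type": "refines", "target": "c"}]},
    {"id": "c", "relationships": [{"type": "depends_on", "target": "a"}]},
]
summary = sketch_relationship_summary(manifests)
mermaid_text = sketch_export_mermaid(summary)
```

Sorting nodes and edges before rendering means two runs over the same manifests produce identical text, so exported graphs can be diffed like any other file.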