Evaluation And Inspection

infospace-bench now has a deterministic baseline for evaluation and inspection. It is intentionally small: the repo can produce structured quality objects and relationship summaries before any LLM or engine integration is introduced.

Evaluation Objects

ScoreEntry
EntityEvaluation
MetricValue
EvaluationSnapshot
SnapshotDiff

Snapshots are serializable through to_dict() / from_dict() and can be compared with diff_snapshots().
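
As a rough illustration of the round trip and diff described above, here is a minimal sketch. The names `SnapshotSketch` and `diff_snapshot_sketches` are stand-ins, and the field layout (a label plus a name-to-value metric mapping) and the shape of the diff result are assumptions; the real `EvaluationSnapshot`, `SnapshotDiff`, and `diff_snapshots()` may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SnapshotSketch:
    """Hypothetical stand-in for EvaluationSnapshot (field layout is assumed)."""

    label: str
    metrics: Dict[str, float] = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Copy the mapping so callers cannot mutate the snapshot through the dict.
        return {"label": self.label, "metrics": dict(self.metrics)}

    @classmethod
    def from_dict(cls, data: dict) -> "SnapshotSketch":
        return cls(label=data["label"], metrics=dict(data.get("metrics", {})))


def diff_snapshot_sketches(
    before: SnapshotSketch, after: SnapshotSketch
) -> Dict[str, Dict[str, Optional[float]]]:
    """Return only the metrics whose values differ (assumed diff shape).

    A metric present in one snapshot but not the other appears with None
    on the side where it is absent.
    """
    changed: Dict[str, Dict[str, Optional[float]]] = {}
    # Sorted iteration keeps the output order deterministic.
    for name in sorted(set(before.metrics) | set(after.metrics)):
        old = before.metrics.get(name)
        new = after.metrics.get(name)
        if old != new:
            changed[name] = {"before": old, "after": new}
    return changed
```

Because the dict form holds only plain strings and floats, it can be written to JSON and read back without custom encoders, which is one reason to keep a file-backed snapshot flat.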

Collection Checks

run_collection_checks() produces five baseline metrics:

redundancy_ratio
coverage_ratio
coherence_components
consistency_cycles
granularity_entropy

These metrics are deliberately deterministic and file-backed. Later work can replace or extend their internals with embeddings, richer graph analysis, or agent-assisted evaluation without changing the result contract.
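
To make "deterministic" concrete, the sketch below shows how two metrics of this kind could be computed from plain data with no model calls. It assumes, purely for illustration, that a components metric counts connected components of a relationship graph and that an entropy metric is the Shannon entropy of a label distribution. The function names are hypothetical, and the actual definitions inside `run_collection_checks()` may be different.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def count_components(
    nodes: Iterable[Hashable], edges: Iterable[Tuple[Hashable, Hashable]]
) -> int:
    """Count connected components of an undirected graph with union-find."""
    parent: Dict[Hashable, Hashable] = {n: n for n in nodes}

    def find(x: Hashable) -> Hashable:
        # Path halving: point each visited node at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        # Endpoints not listed in `nodes` are added as new nodes.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(a)] = find(b)

    return len({find(n) for n in parent})


def shannon_entropy(labels: Iterable[Hashable]) -> float:
    """Shannon entropy in bits of the label distribution; 0.0 when empty."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

Both functions depend only on their inputs, so the same collection yields the same values on every run. That property is what lets the internals be swapped later while the result contract stays fixed.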

Viability

evaluate_viability() compares metric values against declared ViabilityThreshold values. A threshold whose metric is missing from the results is reported as an explicit failure.
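
The threshold comparison can be sketched as follows. `ThresholdSketch`, `CheckResult`, and `check_viability` are hypothetical names, and the optional minimum/maximum bounds are an assumed threshold shape; the real `ViabilityThreshold` and the return type of `evaluate_viability()` are not shown in this document.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ThresholdSketch:
    """Hypothetical threshold: inclusive bounds, either of which may be unset."""

    metric: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


@dataclass(frozen=True)
class CheckResult:
    metric: str
    passed: bool
    reason: str


def check_viability(
    metrics: Dict[str, float], thresholds: Iterable[ThresholdSketch]
) -> List[CheckResult]:
    """Compare each threshold with its metric; a missing metric is a failure."""
    results: List[CheckResult] = []
    for t in thresholds:
        if t.metric not in metrics:
            # Record the gap explicitly instead of skipping the threshold.
            results.append(CheckResult(t.metric, False, "metric missing"))
            continue
        value = metrics[t.metric]
        if t.minimum is not None and value < t.minimum:
            results.append(CheckResult(t.metric, False, "below minimum"))
        elif t.maximum is not None and value > t.maximum:
            results.append(CheckResult(t.metric, False, "above maximum"))
        else:
            results.append(CheckResult(t.metric, True, "within bounds"))
    return results
```

Returning one result per declared threshold, including the missing ones, means a caller can never mistake an unevaluated threshold for a passing one.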

Relationship Inspection

relationship_summary() extracts nodes, edges, and relationship type counts from artifact manifests. export_mermaid() provides the first graph-friendly representation.
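
A minimal sketch of this kind of extraction and export is shown below. The manifest shape (a dict with an `id` and an optional `relationships` list of `type`/`target` pairs) is an assumption made for the example, and `summarize` and `to_mermaid` are hypothetical stand-ins for `relationship_summary()` and `export_mermaid()`. Node ids are assumed to be valid Mermaid identifiers.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize(manifests: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collect nodes, typed edges, and per-type counts from manifests."""
    edges = sorted(
        (m["id"], rel["type"], rel["target"])
        for m in manifests
        for rel in m.get("relationships", [])
    )
    # Targets count as nodes even if they have no manifest of their own.
    nodes = sorted({m["id"] for m in manifests} | {dst for _, _, dst in edges})
    type_counts = dict(Counter(rel_type for _, rel_type, _ in edges))
    return {"nodes": nodes, "edges": edges, "type_counts": type_counts}


def to_mermaid(summary: Dict[str, Any]) -> str:
    """Render the summary as a Mermaid flowchart with labeled edges."""
    lines = ["graph TD"]
    for node in summary["nodes"]:
        lines.append(f"    {node}")
    for src, rel_type, dst in summary["edges"]:
        lines.append(f"    {src} -->|{rel_type}| {dst}")
    return "\n".join(lines)
```

Sorting nodes and edges before rendering keeps the exported text stable across runs, so the Mermaid output can be diffed or committed alongside snapshots.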
