generated from coulomb/repo-seed
Initial implementation
10
README.md
@@ -14,5 +14,15 @@ Start with:
- `wiki/ProductRequirementsDocument.md`
- `wiki/FunctionalRequirementsSpecification.md`
- `SCOPE.md`
- `docs/infospace-layout.md`
- `docs/evaluation-and-inspection.md`
- `docs/reference-pilot-decision.md`
- `docs/markitect-main-scope-assessment.md`
- `infospaces/bootstrap-pilot/`
- `workplans/`

Current development command:

```bash
python3 -m pytest
```
42
docs/evaluation-and-inspection.md
Normal file
@@ -0,0 +1,42 @@
# Evaluation And Inspection

`infospace-bench` now has a deterministic baseline for evaluation and
inspection. It is intentionally small: the repo can produce structured quality
objects and relationship summaries before any LLM or engine integration is
introduced.

## Evaluation Objects

- `ScoreEntry`
- `EntityEvaluation`
- `MetricValue`
- `EvaluationSnapshot`
- `SnapshotDiff`

Snapshots are serializable through `to_dict()` / `from_dict()` and can be
compared with `diff_snapshots()`.

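The round trip and comparison described above can be illustrated with a minimal, self-contained sketch. `MiniSnapshot` and `diff_metrics` are hypothetical stand-ins reduced to collection metrics only; they are not the repo's `EvaluationSnapshot` or `diff_snapshots()`, which also track per-artifact scores.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class MiniSnapshot:
    """Hypothetical stand-in for EvaluationSnapshot, reduced to metrics only."""

    snapshot_id: str
    metrics: dict[str, float]

    def to_dict(self) -> dict[str, Any]:
        return {"snapshot_id": self.snapshot_id, "metrics": dict(self.metrics)}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "MiniSnapshot":
        return cls(
            snapshot_id=str(data["snapshot_id"]),
            metrics={str(k): float(v) for k, v in data["metrics"].items()},
        )


def diff_metrics(before: MiniSnapshot, after: MiniSnapshot) -> dict[str, float]:
    """Return after-minus-before deltas for metrics that exist in both and changed."""
    shared = sorted(before.metrics.keys() & after.metrics.keys())
    return {
        name: after.metrics[name] - before.metrics[name]
        for name in shared
        if before.metrics[name] != after.metrics[name]
    }


before = MiniSnapshot("snap-1", {"coverage_ratio": 0.5, "consistency_cycles": 0.0})
after = MiniSnapshot("snap-2", {"coverage_ratio": 1.0, "consistency_cycles": 0.0})

# Serializing and deserializing should give back an equal object.
restored = MiniSnapshot.from_dict(before.to_dict())
print(restored == before)           # → True
print(diff_metrics(before, after))  # → {'coverage_ratio': 0.5}
```

Only changed values appear in the diff, which keeps a comparison between two mostly identical snapshots short.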
## Collection Checks

`run_collection_checks()` produces five baseline metrics:

- `redundancy_ratio`
- `coverage_ratio`
- `coherence_components`
- `consistency_cycles`
- `granularity_entropy`

These metrics are deliberately deterministic and file-backed. Later work can
replace or extend their internals with embeddings, richer graph analysis, or
agent-assisted evaluation without changing the result contract.

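As a worked example of one of these metrics: in this commit `granularity_entropy` is computed in `checks.py` as the Shannon entropy of the artifact `kind` distribution. The sketch below restates that calculation in isolation; `kind_entropy` is an illustrative name, not a function exported by the package.

```python
from collections import Counter
from math import log2


def kind_entropy(kinds: list[str]) -> float:
    """Shannon entropy, in bits, of the artifact-kind distribution."""
    if not kinds:
        return 0.0
    total = len(kinds)
    counts = Counter(kinds)
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Two sources and two generated artifacts: an even split over two kinds.
balanced = kind_entropy(["source", "source", "generated", "generated"])
print(balanced)  # → 1.0

# Three sources and one generated artifact: a skewed split scores lower.
skewed = kind_entropy(["source", "source", "source", "generated"])
print(skewed < balanced)  # → True
```

An even split over two kinds gives exactly one bit, which is the value the bootstrap pilot reports for its four artifacts.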
## Viability

`evaluate_viability()` compares metric values against declared
`ViabilityThreshold` values. Missing metrics fail visibly.

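The threshold comparison can be sketched as follows. This is an assumption-laden illustration rather than the repo's implementation: `Threshold` and `check_viability` are hypothetical names, and the `min` / `max` fields are modelled on the `viability` block in the pilot's `infospace.yaml`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Hypothetical stand-in for ViabilityThreshold: optional lower and upper bounds."""

    min: Optional[float] = None
    max: Optional[float] = None


def check_viability(
    metrics: dict[str, float],
    thresholds: dict[str, Threshold],
) -> dict[str, str]:
    """Return a verdict per declared threshold: 'pass', 'fail', or 'missing'."""
    verdicts: dict[str, str] = {}
    for name, threshold in thresholds.items():
        if name not in metrics:
            # A declared threshold without a metric value is reported, not skipped.
            verdicts[name] = "missing"
            continue
        value = metrics[name]
        too_low = threshold.min is not None and value < threshold.min
        too_high = threshold.max is not None and value > threshold.max
        verdicts[name] = "fail" if (too_low or too_high) else "pass"
    return verdicts


thresholds = {
    "redundancy_ratio": Threshold(max=0),
    "coverage_ratio": Threshold(min=1),
    "consistency_cycles": Threshold(max=0),
}
# consistency_cycles is deliberately absent from the metrics.
verdicts = check_viability({"redundancy_ratio": 0.0, "coverage_ratio": 1.0}, thresholds)
viable = all(verdict == "pass" for verdict in verdicts.values())
print(verdicts["consistency_cycles"])  # → missing
print(viable)                          # → False
```

Treating a missing metric as its own verdict keeps the failure visible instead of letting an absent value pass silently.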
## Relationship Inspection

`relationship_summary()` extracts nodes, edges, and relationship type counts
from artifact manifests. `export_mermaid()` provides the first graph-friendly
representation.
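A small sketch of a Mermaid export follows. It is a variation, not the commit's `export_mermaid()`: it assumes that artifact IDs containing `/` or `.` may not be accepted as bare Mermaid node identifiers, so it assigns each ID a short alias and shows the ID as a quoted label. The function name `to_mermaid` and the alias scheme are illustrative.

```python
def to_mermaid(nodes: list[str], edges: list[tuple[str, str, str]]) -> str:
    """Render a flowchart with a safe alias and a quoted label per artifact ID."""
    # Sorting first makes the alias assignment deterministic.
    alias = {node: f"n{index}" for index, node in enumerate(sorted(nodes))}
    lines = ["graph TD"]
    for node, name in alias.items():
        lines.append(f'    {name}["{node}"]')
    for source, target, kind in edges:
        lines.append(f"    {alias[source]} -->|{kind}| {alias[target]}")
    return "\n".join(lines) + "\n"


diagram = to_mermaid(
    ["source/prd-scope.md", "generated/lifecycle-baseline.md"],
    [("source/prd-scope.md", "generated/lifecycle-baseline.md", "supports")],
)
print(diagram)
```

Sorting the nodes before assigning aliases keeps the output stable across runs, in line with the deterministic baseline described above.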
55
docs/infospace-layout.md
Normal file
@@ -0,0 +1,55 @@
# Infospace Layout

An infospace is a file-backed project rooted at:

```text
infospaces/<slug>/
```

## Required Files

```text
infospace.yaml
artifacts/index.yaml
```

`infospace.yaml` declares identity, topic, schema references, workflow
references, discipline bindings, and viability thresholds. `artifacts/index.yaml`
is the deterministic manifest of artifacts that have been added to the
infospace.

## Required Directories

```text
artifacts/sources/
artifacts/generated/
output/evaluations/
output/metrics/
reports/
exports/
```

## Artifact Manifest

Artifacts are represented by stable IDs such as `source/chapter-01.md`.

Each manifest entry records:

- `id`
- `path`
- `kind`
- `title`
- `provenance`
- `relationships`

The manifest is intentionally plain YAML so it can be inspected, diffed, and
regenerated by tools or agents.

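To make the ID scheme concrete, the sketch below builds the fields of one manifest entry without touching the filesystem. It follows the pattern visible in this commit's `add_artifact()` (ID of the form `<kind>/<file name>`, files stored under `artifacts/<kind directory>/`), but `manifest_entry` itself is a hypothetical helper and the real function also copies the file and rewrites `artifacts/index.yaml`.

```python
from pathlib import PurePosixPath

# Mapping from artifact kind to its directory under artifacts/, as in lifecycle.py.
KIND_DIRS = {"source": "sources", "generated": "generated"}


def manifest_entry(source: str, kind: str, title: str = "") -> dict[str, object]:
    """Build the manifest fields for a file; performs no I/O."""
    if kind not in KIND_DIRS:
        raise ValueError(f"Unsupported artifact kind: {kind}")
    name = PurePosixPath(source).name
    return {
        "id": f"{kind}/{name}",
        "path": str(PurePosixPath("artifacts") / KIND_DIRS[kind] / name),
        "kind": kind,
        "title": title,
        "provenance": {"source_path": source},
        "relationships": [],
    }


entry = manifest_entry("drafts/chapter-01.md", "source", title="Chapter 1")
print(entry["id"])    # → source/chapter-01.md
print(entry["path"])  # → artifacts/sources/chapter-01.md
```

Because the ID depends only on the kind and the file name, two source files with the same name in different directories would collide, which is why the commit's `add_artifact()` rejects duplicates.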
## Commands

```bash
python3 -m infospace_bench create . pilot --name "Pilot Infospace"
python3 -m infospace_bench add-artifact infospaces/pilot ./source.md --kind source
python3 -m infospace_bench inspect infospaces/pilot
python3 -m infospace_bench export infospaces/pilot
```
24
docs/reference-pilot-decision.md
Normal file
@@ -0,0 +1,24 @@
# Reference Pilot Decision

Date: 2026-05-14

## Decision

Use a small purpose-built corpus as the first maintained reference infospace.

## Rationale

`markitect-main/examples/infospace-with-history/` remains the primary migration
candidate for a larger pilot, but it contains a large public-domain book corpus
and substantial generated output. Pulling it in before the lifecycle and
evaluation baseline exists would make the new repo noisy before it is useful.

The bootstrap pilot uses this repo's own PRD/FRS intent as a compact corpus. It
proves the expected file layout, artifact manifest, relationship inspection,
collection metrics, and viability thresholding with minimal bulk.

## Follow-up

After the baseline is stable, migrate a pruned Wealth of Nations/VSM fixture or
a similarly representative slice from `markitect-main` under a separate
workplan.
@@ -0,0 +1,5 @@
# Evaluation Baseline

The evaluation baseline implements serializable score objects, evaluation
snapshots, snapshot diffs, deterministic collection checks, viability
thresholding, and relationship inspection with Mermaid export.
@@ -0,0 +1,5 @@
# Lifecycle Baseline

The lifecycle baseline implements a file-backed infospace layout with
`infospace.yaml`, an artifact manifest, deterministic artifact loading, and a
small JSON CLI for create, inspect, add-artifact, and export operations.
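The JSON CLI mentioned above reports failures as structured JSON as well. The sketch below illustrates that pattern under stated assumptions: `AppError` and `run` are hypothetical stand-ins, shaped after the `{"error": {"code", "message", "detail"}}` envelope and the exit code 2 that appear in this commit's `errors.py` and `cli.py`.

```python
import json
import sys
from typing import Any, Callable, Optional


class AppError(Exception):
    """Hypothetical stand-in for InfospaceError: a code, a message, and detail."""

    def __init__(self, code: str, message: str, detail: Optional[dict] = None) -> None:
        super().__init__(message)
        self.code = code
        self.message = message
        self.detail = detail or {}

    def to_dict(self) -> dict[str, Any]:
        return {
            "error": {"code": self.code, "message": self.message, "detail": self.detail}
        }


def run(action: Callable[[], dict]) -> int:
    """Print the action's JSON result to stdout, or its error to stderr."""
    try:
        print(json.dumps(action(), indent=2))
    except AppError as exc:
        print(json.dumps(exc.to_dict(), indent=2), file=sys.stderr)
        return 2
    return 0


def failing_action() -> dict:
    # Simulates inspecting a path that does not exist.
    raise AppError(
        "missing_infospace",
        "Infospace path does not exist: infospaces/nope",
        {"root": "infospaces/nope"},
    )


exit_code = run(failing_action)
print(exit_code)  # → 2
```

Keeping results on stdout and errors on stderr lets a calling script parse either stream as JSON and branch on the exit code.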
41
infospaces/bootstrap-pilot/artifacts/index.yaml
Normal file
@@ -0,0 +1,41 @@
artifacts:
  - id: source/prd-scope.md
    path: artifacts/sources/prd-scope.md
    kind: source
    title: PRD Scope
    provenance:
      source_path: wiki/ProductRequirementsDocument.md
      source_section: Scope Definition
    relationships:
      - type: supports
        target: generated/lifecycle-baseline.md
      - type: supports
        target: generated/evaluation-baseline.md
  - id: source/frs-requirements.md
    path: artifacts/sources/frs-requirements.md
    kind: source
    title: FRS Requirements
    provenance:
      source_path: wiki/FunctionalRequirementsSpecification.md
      source_section: Functional Requirements
    relationships:
      - type: supports
        target: generated/lifecycle-baseline.md
      - type: supports
        target: generated/evaluation-baseline.md
  - id: generated/lifecycle-baseline.md
    path: artifacts/generated/lifecycle-baseline.md
    kind: generated
    title: Lifecycle Baseline
    provenance:
      source_path: workplans/IB-WP-0002-infospace-lifecycle-baseline.md
    relationships:
      - type: precedes
        target: generated/evaluation-baseline.md
  - id: generated/evaluation-baseline.md
    path: artifacts/generated/evaluation-baseline.md
    kind: generated
    title: Evaluation Baseline
    provenance:
      source_path: workplans/IB-WP-0003-evaluation-and-inspection.md
    relationships: []
@@ -0,0 +1,6 @@
# FRS Requirements

The functional requirements describe externally observable behavior for
infospace lifecycle management, knowledge population, structure and
relationships, evaluation, inspection, workflow execution, export, AI
assistance, and explicit error handling.
@@ -0,0 +1,7 @@
# PRD Scope

The product requirements define `infospace-bench` as the application layer for
concrete structured knowledge spaces. The repo should support creation,
population, evaluation, inspection, transformation, generation, and export of
infospaces while avoiding low-level markdown tooling and runtime platform
ownership.
24
infospaces/bootstrap-pilot/infospace.yaml
Normal file
@@ -0,0 +1,24 @@
slug: bootstrap-pilot
name: Infospace Bench Bootstrap Pilot
topic:
  name: Infospace Bench Bootstrap Pilot
  domain: Knowledge Engineering Tooling
  sources: artifacts/sources
disciplines:
  - name: Infospace Lifecycle
    path: artifacts/generated/lifecycle-baseline.md
schemas: {}
workflows:
  - name: baseline-inspection
    report: reports/baseline-inspection.md
viability:
  redundancy_ratio:
    max: 0
  coverage_ratio:
    min: 1
  coherence_components:
    max: 1
  consistency_cycles:
    max: 0
  granularity_entropy:
    min: 1
9
infospaces/bootstrap-pilot/output/metrics/baseline.yaml
Normal file
@@ -0,0 +1,9 @@
metrics:
  redundancy_ratio: 0
  coverage_ratio: 1
  coherence_components: 1
  consistency_cycles: 0
  granularity_entropy: 1
details:
  artifact_count: 4
  relationship_count: 5
24
infospaces/bootstrap-pilot/reports/baseline-inspection.md
Normal file
@@ -0,0 +1,24 @@
# Bootstrap Pilot Baseline Inspection

## Scope

This pilot exercises the new file-backed lifecycle and evaluation baseline
against a compact corpus derived from the `infospace-bench` PRD, FRS, and
initial workplans.

## Result

The pilot is viable under its declared thresholds:

- `redundancy_ratio`: 0
- `coverage_ratio`: 1
- `coherence_components`: 1
- `consistency_cycles`: 0
- `granularity_entropy`: 1

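The two graph metrics in this list can be re-derived from the five relationships declared in the pilot's `artifacts/index.yaml`. The sketch below does so with different techniques from the commit's `checks.py`: a union-find for weakly connected components and Kahn's algorithm for cycle detection, in place of the depth-first traversals used there. The function names are illustrative, and only artifacts that appear in at least one edge are considered.

```python
# (source, target) pairs copied from the pilot manifest's relationships.
EDGES = [
    ("source/prd-scope.md", "generated/lifecycle-baseline.md"),
    ("source/prd-scope.md", "generated/evaluation-baseline.md"),
    ("source/frs-requirements.md", "generated/lifecycle-baseline.md"),
    ("source/frs-requirements.md", "generated/evaluation-baseline.md"),
    ("generated/lifecycle-baseline.md", "generated/evaluation-baseline.md"),
]


def count_components(edges: list[tuple[str, str]]) -> int:
    """Count weakly connected components with a union-find over edge endpoints."""
    parent: dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        parent[find(source)] = find(target)
    return len({find(node) for node in list(parent)})


def has_cycle(edges: list[tuple[str, str]]) -> bool:
    """Kahn's algorithm: the graph is acyclic iff every node can be removed."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    for _, target in edges:
        indegree[target] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return removed != len(nodes)


print(count_components(EDGES))  # → 1
print(has_cycle(EDGES))         # → False
```

One component and no cycle agree with the reported `coherence_components: 1` and `consistency_cycles: 0`.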
## Migration from markitect-main

Migration from markitect-main is intentionally staged. The large
`examples/infospace-with-history/` corpus remains the main reference candidate,
but this bootstrap pilot proves the successor repo's baseline behavior before
bulk corpus migration.
13
pyproject.toml
Normal file
@@ -0,0 +1,13 @@
[project]
name = "infospace-bench"
version = "0.1.0"
description = "Application-layer workspace for concrete structured knowledge spaces."
requires-python = ">=3.11"
dependencies = ["PyYAML>=6"]

[project.scripts]
infospace-bench = "infospace_bench.cli:main"

[tool.pytest.ini_options]
pythonpath = ["src"]
testpaths = ["tests"]
28
src/infospace_bench/__init__.py
Normal file
@@ -0,0 +1,28 @@
from .errors import InfospaceError
from .evaluation import EntityEvaluation, EvaluationSnapshot, MetricValue, ScoreEntry
from .lifecycle import add_artifact, create_infospace, load_infospace
from .models import (
    DisciplineBinding,
    Infospace,
    InfospaceConfig,
    KnowledgeArtifact,
    TopicConfig,
    ViabilityThreshold,
)

__all__ = [
    "DisciplineBinding",
    "EntityEvaluation",
    "EvaluationSnapshot",
    "Infospace",
    "InfospaceConfig",
    "InfospaceError",
    "KnowledgeArtifact",
    "MetricValue",
    "ScoreEntry",
    "TopicConfig",
    "ViabilityThreshold",
    "add_artifact",
    "create_infospace",
    "load_infospace",
]
5
src/infospace_bench/__main__.py
Normal file
@@ -0,0 +1,5 @@
from .cli import main


if __name__ == "__main__":
    raise SystemExit(main())
112
src/infospace_bench/checks.py
Normal file
@@ -0,0 +1,112 @@
from __future__ import annotations

from dataclasses import dataclass
from math import log2

from .models import KnowledgeArtifact


@dataclass(frozen=True)
class CollectionCheckReport:
    metrics: dict[str, float]
    details: dict[str, object]


def run_collection_checks(artifacts: list[KnowledgeArtifact]) -> CollectionCheckReport:
    graph = _directed_graph(artifacts)
    metrics = {
        "redundancy_ratio": _redundancy_ratio(artifacts),
        "coverage_ratio": _coverage_ratio(artifacts),
        "coherence_components": float(_component_count(graph)),
        "consistency_cycles": float(_cycle_count(graph)),
        "granularity_entropy": _kind_entropy(artifacts),
    }
    return CollectionCheckReport(
        metrics=metrics,
        details={
            "artifact_count": len(artifacts),
            "relationship_count": sum(len(item.relationships) for item in artifacts),
        },
    )


def _redundancy_ratio(artifacts: list[KnowledgeArtifact]) -> float:
    if not artifacts:
        return 0.0
    labels = [item.title or item.id for item in artifacts]
    duplicate_count = len(labels) - len(set(labels))
    return duplicate_count / len(artifacts)


def _coverage_ratio(artifacts: list[KnowledgeArtifact]) -> float:
    if not artifacts:
        return 0.0
    covered = sum(1 for item in artifacts if item.title and item.path)
    return covered / len(artifacts)


def _kind_entropy(artifacts: list[KnowledgeArtifact]) -> float:
    if not artifacts:
        return 0.0
    counts: dict[str, int] = {}
    for artifact in artifacts:
        counts[artifact.kind] = counts.get(artifact.kind, 0) + 1
    total = len(artifacts)
    return -sum((count / total) * log2(count / total) for count in counts.values())


def _directed_graph(artifacts: list[KnowledgeArtifact]) -> dict[str, set[str]]:
    ids = {item.id for item in artifacts}
    graph = {item.id: set() for item in artifacts}
    for item in artifacts:
        for relationship in item.relationships:
            target = relationship.get("target")
            if isinstance(target, str) and target in ids:
                graph[item.id].add(target)
    return graph


def _component_count(graph: dict[str, set[str]]) -> int:
    if not graph:
        return 0
    undirected = {node: set(edges) for node, edges in graph.items()}
    for node, edges in graph.items():
        for target in edges:
            undirected.setdefault(target, set()).add(node)

    seen: set[str] = set()
    count = 0
    for node in undirected:
        if node in seen:
            continue
        count += 1
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(undirected[current] - seen)
    return count


def _cycle_count(graph: dict[str, set[str]]) -> int:
    cycles = 0
    visited: set[str] = set()
    active: set[str] = set()

    def visit(node: str) -> None:
        nonlocal cycles
        visited.add(node)
        active.add(node)
        for target in graph[node]:
            if target not in visited:
                visit(target)
            elif target in active:
                cycles += 1
        active.remove(node)

    for node in graph:
        if node not in visited:
            visit(node)
    return cycles
70
src/infospace_bench/cli.py
Normal file
@@ -0,0 +1,70 @@
from __future__ import annotations

import argparse
import json
import sys
from pathlib import Path

from .errors import InfospaceError
from .lifecycle import add_artifact, create_infospace, load_infospace


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="infospace-bench")
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create", help="Create an infospace")
    create.add_argument("workspace")
    create.add_argument("slug")
    create.add_argument("--name", required=True)
    create.add_argument("--topic-domain", default="")

    inspect = sub.add_parser("inspect", help="Inspect an infospace")
    inspect.add_argument("root")

    add = sub.add_parser("add-artifact", help="Add an artifact to an infospace")
    add.add_argument("root")
    add.add_argument("source")
    add.add_argument("--kind", required=True)
    add.add_argument("--title", default="")

    export = sub.add_parser("export", help="Print the infospace representation")
    export.add_argument("root")

    return parser


def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    try:
        if args.command == "create":
            infospace = create_infospace(
                Path(args.workspace),
                args.slug,
                name=args.name,
                topic_domain=args.topic_domain,
            )
            _write_json({"slug": infospace.config.slug, "root": str(infospace.root)})
        elif args.command == "inspect":
            _write_json(load_infospace(Path(args.root)).to_dict())
        elif args.command == "add-artifact":
            artifact = add_artifact(
                Path(args.root),
                Path(args.source),
                kind=args.kind,
                title=args.title,
            )
            _write_json({"artifact": artifact.to_dict()})
        elif args.command == "export":
            _write_json(load_infospace(Path(args.root)).to_dict())
        else:
            parser.error(f"Unhandled command: {args.command}")
    except InfospaceError as exc:
        print(json.dumps(exc.to_dict(), indent=2), file=sys.stderr)
        return 2
    return 0


def _write_json(payload: dict) -> None:
    print(json.dumps(payload, indent=2))
25
src/infospace_bench/errors.py
Normal file
@@ -0,0 +1,25 @@
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class InfospaceError(Exception):
    """Structured application error suitable for CLI and API surfaces."""

    code: str
    message: str
    detail: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        super().__init__(self.message)

    def to_dict(self) -> dict[str, Any]:
        return {
            "error": {
                "code": self.code,
                "message": self.message,
                "detail": self.detail,
            }
        }
210
src/infospace_bench/evaluation.py
Normal file
@@ -0,0 +1,210 @@
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class ScoreEntry:
    name: str
    value: float
    max_value: float = 5.0
    rationale: str = ""

    def to_dict(self) -> dict[str, Any]:
        data: dict[str, Any] = {
            "name": self.name,
            "value": self.value,
            "max_value": self.max_value,
        }
        if self.rationale:
            data["rationale"] = self.rationale
        return data

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ScoreEntry":
        return cls(
            name=str(data["name"]),
            value=float(data["value"]),
            max_value=float(data.get("max_value", 5.0)),
            rationale=str(data.get("rationale") or ""),
        )


@dataclass(frozen=True)
class EntityEvaluation:
    artifact_id: str
    evaluator: str
    scores: list[ScoreEntry]
    evaluated_at: datetime
    notes: list[str] = field(default_factory=list)

    @property
    def overall_score(self) -> float:
        if not self.scores:
            return 0.0
        return sum(score.value for score in self.scores) / len(self.scores)

    def to_dict(self) -> dict[str, Any]:
        return {
            "artifact_id": self.artifact_id,
            "evaluator": self.evaluator,
            "evaluated_at": self.evaluated_at.isoformat(),
            "overall_score": round(self.overall_score, 4),
            "scores": [score.to_dict() for score in self.scores],
            "notes": self.notes,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "EntityEvaluation":
        return cls(
            artifact_id=str(data["artifact_id"]),
            evaluator=str(data["evaluator"]),
            scores=[ScoreEntry.from_dict(item) for item in data.get("scores", [])],
            evaluated_at=datetime.fromisoformat(str(data["evaluated_at"])),
            notes=list(data.get("notes") or []),
        )


@dataclass(frozen=True)
class MetricValue:
    name: str
    value: float
    concern: str = ""
    details: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        data: dict[str, Any] = {"name": self.name, "value": self.value}
        if self.concern:
            data["concern"] = self.concern
        if self.details:
            data["details"] = self.details
        return data

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "MetricValue":
        return cls(
            name=str(data["name"]),
            value=float(data["value"]),
            concern=str(data.get("concern") or ""),
            details=dict(data.get("details") or {}),
        )


@dataclass(frozen=True)
class EvaluationSnapshot:
    snapshot_id: str
    created_at: datetime
    schema_name: str
    artifact_count: int
    artifact_evaluations: list[EntityEvaluation] = field(default_factory=list)
    collection_metrics: list[MetricValue] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        return {
            "snapshot_id": self.snapshot_id,
            "created_at": self.created_at.isoformat(),
            "schema_name": self.schema_name,
            "artifact_count": self.artifact_count,
            "artifact_evaluations": [
                evaluation.to_dict() for evaluation in self.artifact_evaluations
            ],
            "collection_metrics": [
                metric.to_dict() for metric in self.collection_metrics
            ],
            "metadata": self.metadata,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "EvaluationSnapshot":
        return cls(
            snapshot_id=str(data["snapshot_id"]),
            created_at=datetime.fromisoformat(str(data["created_at"])),
            schema_name=str(data["schema_name"]),
            artifact_count=int(data["artifact_count"]),
            artifact_evaluations=[
                EntityEvaluation.from_dict(item)
                for item in data.get("artifact_evaluations", [])
            ],
            collection_metrics=[
                MetricValue.from_dict(item) for item in data.get("collection_metrics", [])
            ],
            metadata=dict(data.get("metadata") or {}),
        )


@dataclass(frozen=True)
class ScoreChange:
    artifact_id: str
    dimension: str
    before: float
    after: float

    @property
    def delta(self) -> float:
        return self.after - self.before


@dataclass(frozen=True)
class MetricChange:
    name: str
    before: float
    after: float

    @property
    def delta(self) -> float:
        return self.after - self.before


@dataclass(frozen=True)
class SnapshotDiff:
    before_id: str
    after_id: str
    added_artifacts: list[str] = field(default_factory=list)
    removed_artifacts: list[str] = field(default_factory=list)
    score_changes: list[ScoreChange] = field(default_factory=list)
    metric_changes: list[MetricChange] = field(default_factory=list)


def diff_snapshots(
    before: EvaluationSnapshot,
    after: EvaluationSnapshot,
) -> SnapshotDiff:
    before_scores = _score_index(before)
    after_scores = _score_index(after)
    before_artifacts = {artifact_id for artifact_id, _ in before_scores}
    after_artifacts = {artifact_id for artifact_id, _ in after_scores}

    score_changes = [
        ScoreChange(artifact_id, dimension, before_scores[key], after_scores[key])
        for key in sorted(before_scores.keys() & after_scores.keys())
        for artifact_id, dimension in [key]
        if before_scores[key] != after_scores[key]
    ]

    before_metrics = {metric.name: metric.value for metric in before.collection_metrics}
    after_metrics = {metric.name: metric.value for metric in after.collection_metrics}
    metric_changes = [
        MetricChange(name, before_metrics[name], after_metrics[name])
        for name in sorted(before_metrics.keys() & after_metrics.keys())
        if before_metrics[name] != after_metrics[name]
    ]

    return SnapshotDiff(
        before_id=before.snapshot_id,
        after_id=after.snapshot_id,
        added_artifacts=sorted(after_artifacts - before_artifacts),
        removed_artifacts=sorted(before_artifacts - after_artifacts),
        score_changes=score_changes,
        metric_changes=metric_changes,
    )


def _score_index(snapshot: EvaluationSnapshot) -> dict[tuple[str, str], float]:
    return {
        (evaluation.artifact_id, score.name): score.value
        for evaluation in snapshot.artifact_evaluations
        for score in evaluation.scores
    }
54
src/infospace_bench/inspection.py
Normal file
@@ -0,0 +1,54 @@
from __future__ import annotations

from dataclasses import dataclass, field

from .models import KnowledgeArtifact


@dataclass(frozen=True)
class RelationshipEdge:
    source: str
    target: str
    type: str


@dataclass(frozen=True)
class RelationshipSummary:
    nodes: list[str]
    edges: list[RelationshipEdge]
    relationship_types: dict[str, int] = field(default_factory=dict)

    @property
    def node_count(self) -> int:
        return len(self.nodes)

    @property
    def edge_count(self) -> int:
        return len(self.edges)


def relationship_summary(artifacts: list[KnowledgeArtifact]) -> RelationshipSummary:
    ids = {artifact.id for artifact in artifacts}
    edges: list[RelationshipEdge] = []
    type_counts: dict[str, int] = {}
    for artifact in artifacts:
        for relationship in artifact.relationships:
            target = relationship.get("target")
            relation_type = str(relationship.get("type") or "related")
            if isinstance(target, str) and target in ids:
                edges.append(RelationshipEdge(artifact.id, target, relation_type))
                type_counts[relation_type] = type_counts.get(relation_type, 0) + 1
    return RelationshipSummary(
        nodes=sorted(ids),
        edges=edges,
        relationship_types=dict(sorted(type_counts.items())),
    )


def export_mermaid(summary: RelationshipSummary) -> str:
    lines = ["graph TD"]
    for node in summary.nodes:
        lines.append(f" {node}")
    for edge in summary.edges:
        lines.append(f" {edge.source} -->|{edge.type}| {edge.target}")
    return "\n".join(lines) + "\n"
170
src/infospace_bench/lifecycle.py
Normal file
@@ -0,0 +1,170 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import yaml
|
||||
|
||||
from .errors import InfospaceError
|
||||
from .models import Infospace, InfospaceConfig, KnowledgeArtifact, TopicConfig
|
||||
|
||||
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]*[a-z0-9]$|^[a-z0-9]$")
|
||||
CONFIG_FILE = "infospace.yaml"
|
||||
ARTIFACT_INDEX = "artifacts/index.yaml"
|
||||
LAYOUT_DIRS = (
|
||||
"artifacts/sources",
|
||||
"artifacts/generated",
|
||||
"output/evaluations",
|
||||
"output/metrics",
|
||||
"reports",
|
||||
"exports",
|
||||
)
|
||||
KIND_DIRS = {"source": "sources", "generated": "generated"}
|
||||
|
||||
|
||||
def create_infospace(
|
||||
workspace: Path | str,
|
||||
slug: str,
|
||||
*,
|
||||
name: str,
|
||||
topic_domain: str = "",
|
||||
) -> Infospace:
|
||||
_validate_slug(slug)
|
||||
workspace_path = Path(workspace)
|
||||
root = workspace_path / "infospaces" / slug
|
||||
if root.exists():
|
||||
raise InfospaceError(
|
||||
"infospace_exists",
|
||||
f"Infospace already exists: {root}",
|
||||
{"root": str(root)},
|
||||
)
|
||||
|
||||
for relative in LAYOUT_DIRS:
|
||||
(root / relative).mkdir(parents=True, exist_ok=True)
|
||||
|
||||
config = InfospaceConfig(
|
||||
slug=slug,
|
||||
name=name,
|
||||
topic=TopicConfig(name=name, domain=topic_domain),
|
||||
)
|
||||
_write_yaml(root / CONFIG_FILE, config.to_dict())
|
||||
_write_yaml(root / ARTIFACT_INDEX, {"artifacts": []})
|
||||
return Infospace(root=root, config=config, artifacts=[])
|
||||
|
||||
|
||||
def load_infospace(root: Path | str) -> Infospace:
|
||||
root_path = Path(root)
|
||||
if not root_path.exists():
|
||||
raise InfospaceError(
|
||||
"missing_infospace",
|
||||
f"Infospace path does not exist: {root_path}",
|
||||
{"root": str(root_path)},
|
||||
)
|
||||
config_path = root_path / CONFIG_FILE
|
||||
if not config_path.is_file():
|
||||
raise InfospaceError(
|
||||
"missing_config",
|
||||
f"Missing infospace.yaml at {config_path}",
|
||||
{"config_path": str(config_path)},
|
||||
)
|
||||
|
||||
raw_config = _read_yaml(config_path)
|
||||
try:
|
||||
        config = InfospaceConfig.from_dict(raw_config)
    except KeyError as exc:
        raise InfospaceError(
            "invalid_config",
            f"Missing required config field: {exc.args[0]}",
            {"config_path": str(config_path), "field": exc.args[0]},
        ) from exc

    return Infospace(root=root_path, config=config, artifacts=_read_artifacts(root_path))


def add_artifact(
    root: Path | str,
    source: Path | str,
    *,
    kind: str,
    title: str = "",
    relationships: list[dict[str, Any]] | None = None,
) -> KnowledgeArtifact:
    infospace = load_infospace(root)
    if kind not in KIND_DIRS:
        raise InfospaceError(
            "invalid_artifact_kind",
            f"Unsupported artifact kind: {kind}",
            {"kind": kind, "valid_kinds": sorted(KIND_DIRS)},
        )

    source_path = Path(source)
    if not source_path.is_file():
        raise InfospaceError(
            "missing_artifact_source",
            f"Artifact source does not exist: {source_path}",
            {"source": str(source_path)},
        )

    artifact_id = f"{kind}/{source_path.name}"
    if any(item.id == artifact_id for item in infospace.artifacts):
        raise InfospaceError(
            "duplicate_artifact",
            f"Artifact already exists: {artifact_id}",
            {"artifact_id": artifact_id},
        )

    target = infospace.root / "artifacts" / KIND_DIRS[kind] / source_path.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source_path, target)

    artifact = KnowledgeArtifact(
        id=artifact_id,
        path=str(target.relative_to(infospace.root)),
        kind=kind,
        title=title,
        provenance={"source_path": str(source_path)},
        relationships=relationships or [],
    )
    artifacts = [*infospace.artifacts, artifact]
    _write_yaml(
        infospace.root / ARTIFACT_INDEX,
        {"artifacts": [item.to_dict() for item in artifacts]},
    )
    return artifact


def _validate_slug(slug: str) -> None:
    if not SLUG_RE.match(slug):
        raise InfospaceError(
            "invalid_slug",
            "Slug must contain only lowercase letters, numbers, and hyphens",
            {"slug": slug},
        )


def _read_artifacts(root: Path) -> list[KnowledgeArtifact]:
    path = root / ARTIFACT_INDEX
    if not path.exists():
        return []
    data = _read_yaml(path)
    return [KnowledgeArtifact.from_dict(item) for item in data.get("artifacts", [])]


def _read_yaml(path: Path) -> dict[str, Any]:
    with path.open("r", encoding="utf-8") as handle:
        data = yaml.safe_load(handle) or {}
    if not isinstance(data, dict):
        raise InfospaceError(
            "invalid_yaml",
            f"Expected mapping in YAML file: {path}",
            {"path": str(path)},
        )
    return data


def _write_yaml(path: Path, data: dict[str, Any]) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        yaml.safe_dump(data, handle, sort_keys=False)
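The artifact ID and target path that `add_artifact` derives can be illustrated in isolation. The following is a minimal sketch, not the repository's implementation: `plan_copy` is a hypothetical helper, and the `KIND_DIRS` mapping is an assumption inferred from the paths asserted in the tests (`source` maps to `artifacts/sources`, `generated` to `artifacts/generated`).

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path

# Assumed mapping, inferred from the paths the tests assert on.
KIND_DIRS = {"source": "sources", "generated": "generated"}


def plan_copy(root: Path, source: Path, kind: str) -> tuple[str, Path]:
    """Hypothetical helper: derive the artifact id and copy target."""
    artifact_id = f"{kind}/{source.name}"
    target = root / "artifacts" / KIND_DIRS[kind] / source.name
    return artifact_id, target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "infospaces" / "pilot"
    source = Path(tmp) / "chapter.md"
    source.write_text("# Chapter\n", encoding="utf-8")

    artifact_id, target = plan_copy(root, source, "source")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)

    # as_posix() keeps the stored path stable across operating systems.
    relative_path = target.relative_to(root).as_posix()
    copied_text = target.read_text(encoding="utf-8")
```

Because the ID is built from the kind and the file name only, two sources with the same name and kind collide, which is what the `duplicate_artifact` error guards against. The sketch uses `as_posix()` where the diff uses `str(...)`; that is a deliberate variation for portability, not a description of the committed code.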
143
src/infospace_bench/models.py
Normal file
@@ -0,0 +1,143 @@
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class TopicConfig:
    name: str
    domain: str = ""
    sources: str = "artifacts/sources"

    @classmethod
    def from_dict(cls, data: dict[str, Any] | None) -> "TopicConfig":
        data = data or {}
        return cls(
            name=str(data.get("name") or ""),
            domain=str(data.get("domain") or ""),
            sources=str(data.get("sources") or "artifacts/sources"),
        )

    def to_dict(self) -> dict[str, Any]:
        return {"name": self.name, "domain": self.domain, "sources": self.sources}


@dataclass(frozen=True)
class DisciplineBinding:
    name: str
    path: str

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "DisciplineBinding":
        return cls(name=str(data["name"]), path=str(data["path"]))

    def to_dict(self) -> dict[str, Any]:
        return {"name": self.name, "path": self.path}


@dataclass(frozen=True)
class ViabilityThreshold:
    min: float | None = None
    max: float | None = None

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ViabilityThreshold":
        return cls(
            min=float(data["min"]) if data.get("min") is not None else None,
            max=float(data["max"]) if data.get("max") is not None else None,
        )

    def to_dict(self) -> dict[str, float]:
        result: dict[str, float] = {}
        if self.min is not None:
            result["min"] = self.min
        if self.max is not None:
            result["max"] = self.max
        return result


@dataclass(frozen=True)
class InfospaceConfig:
    slug: str
    name: str
    topic: TopicConfig
    disciplines: list[DisciplineBinding] = field(default_factory=list)
    schemas: dict[str, str] = field(default_factory=dict)
    workflows: list[dict[str, Any]] = field(default_factory=list)
    viability: dict[str, ViabilityThreshold] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "InfospaceConfig":
        return cls(
            slug=str(data["slug"]),
            name=str(data.get("name") or data["slug"]),
            topic=TopicConfig.from_dict(data.get("topic")),
            disciplines=[
                DisciplineBinding.from_dict(item)
                for item in data.get("disciplines", [])
            ],
            schemas={str(k): str(v) for k, v in (data.get("schemas") or {}).items()},
            workflows=list(data.get("workflows") or []),
            viability={
                str(k): ViabilityThreshold.from_dict(v)
                for k, v in (data.get("viability") or {}).items()
            },
        )

    def to_dict(self) -> dict[str, Any]:
        return {
            "slug": self.slug,
            "name": self.name,
            "topic": self.topic.to_dict(),
            "disciplines": [item.to_dict() for item in self.disciplines],
            "schemas": self.schemas,
            "workflows": self.workflows,
            "viability": {k: v.to_dict() for k, v in self.viability.items()},
        }


@dataclass(frozen=True)
class KnowledgeArtifact:
    id: str
    path: str
    kind: str
    title: str = ""
    provenance: dict[str, Any] = field(default_factory=dict)
    relationships: list[dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "KnowledgeArtifact":
        return cls(
            id=str(data["id"]),
            path=str(data["path"]),
            kind=str(data["kind"]),
            title=str(data.get("title") or ""),
            provenance=dict(data.get("provenance") or {}),
            relationships=list(data.get("relationships") or []),
        )

    def to_dict(self) -> dict[str, Any]:
        return {
            "id": self.id,
            "path": self.path,
            "kind": self.kind,
            "title": self.title,
            "provenance": self.provenance,
            "relationships": self.relationships,
        }


@dataclass(frozen=True)
class Infospace:
    root: Path
    config: InfospaceConfig
    artifacts: list[KnowledgeArtifact] = field(default_factory=list)

    def to_dict(self) -> dict[str, Any]:
        return {
            "root": str(self.root),
            "config": self.config.to_dict(),
            "artifacts": [item.to_dict() for item in self.artifacts],
        }
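The `from_dict` methods above lean on the `data.get(key) or default` idiom, which treats a missing key, an explicit YAML null, and an empty string the same way. The sketch below isolates that behavior; `normalize_topic` is a hypothetical stand-in for `TopicConfig.from_dict` that returns a plain dict so it runs without the package installed.

```python
from __future__ import annotations

from typing import Any


def normalize_topic(data: dict[str, Any] | None) -> dict[str, str]:
    """Hypothetical stand-in mirroring the defaults in TopicConfig.from_dict."""
    data = data or {}
    return {
        # `or` falls through on None and "" alike, so both yield the default.
        "name": str(data.get("name") or ""),
        "domain": str(data.get("domain") or ""),
        "sources": str(data.get("sources") or "artifacts/sources"),
    }


missing_section = normalize_topic(None)
null_fields = normalize_topic({"name": None, "sources": ""})
explicit = normalize_topic({"name": "Pilot", "sources": "custom/sources"})
```

One consequence worth keeping in mind: with this idiom a config cannot set `sources` to an empty string on purpose, since the default always wins over falsy values.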
43
src/infospace_bench/viability.py
Normal file
@@ -0,0 +1,43 @@
from __future__ import annotations

from dataclasses import dataclass

from .models import ViabilityThreshold


@dataclass(frozen=True)
class ViabilityResult:
    metric: str
    value: float | None
    threshold: ViabilityThreshold
    passed: bool


@dataclass(frozen=True)
class ViabilityReport:
    passed: bool
    results: dict[str, ViabilityResult]


def evaluate_viability(
    metrics: dict[str, float],
    thresholds: dict[str, ViabilityThreshold],
) -> ViabilityReport:
    results: dict[str, ViabilityResult] = {}
    for name, threshold in thresholds.items():
        value = metrics.get(name)
        passed = value is not None
        if value is not None and threshold.min is not None:
            passed = passed and value >= threshold.min
        if value is not None and threshold.max is not None:
            passed = passed and value <= threshold.max
        results[name] = ViabilityResult(
            metric=name,
            value=value,
            threshold=threshold,
            passed=passed,
        )
    return ViabilityReport(
        passed=all(result.passed for result in results.values()),
        results=results,
    )
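The threshold rule in `evaluate_viability` reduces to three checks: a missing metric fails, and both bounds are inclusive. The sketch below restates that rule with simplified stand-ins so it runs on its own; `Threshold` and `passes` are hypothetical names, not part of `infospace_bench`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Simplified stand-in for ViabilityThreshold."""

    min: float | None = None
    max: float | None = None


def passes(value: float | None, threshold: Threshold) -> bool:
    """Hypothetical helper restating the per-metric rule."""
    if value is None:
        return False  # a declared threshold without a metric fails visibly
    if threshold.min is not None and value < threshold.min:
        return False
    if threshold.max is not None and value > threshold.max:
        return False
    return True


metrics = {"coverage_ratio": 0.75}
thresholds = {
    "coverage_ratio": Threshold(min=0.5),
    "consistency_cycles": Threshold(max=0),  # no metric supplied for this one
}
results = {name: passes(metrics.get(name), t) for name, t in thresholds.items()}
overall = all(results.values())
```

Only metrics named in `thresholds` are checked, so extra metrics never affect the outcome. Note also that an empty threshold mapping passes vacuously, because `all()` over an empty collection is `True`.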
59
tests/test_cli.py
Normal file
@@ -0,0 +1,59 @@
import json
import os
import subprocess
import sys
from pathlib import Path


def run_cli(*args: str) -> subprocess.CompletedProcess[str]:
    env = os.environ.copy()
    env["PYTHONPATH"] = "src"
    return subprocess.run(
        [sys.executable, "-m", "infospace_bench", *args],
        check=False,
        env=env,
        text=True,
        capture_output=True,
    )


def test_cli_create_inspect_and_add_artifact(tmp_path: Path) -> None:
    create = run_cli(
        "create",
        str(tmp_path),
        "pilot",
        "--name",
        "Pilot Infospace",
        "--topic-domain",
        "Test Domain",
    )
    assert create.returncode == 0, create.stderr
    assert json.loads(create.stdout)["slug"] == "pilot"

    source = tmp_path / "source.md"
    source.write_text("# Source\n", encoding="utf-8")
    add = run_cli(
        "add-artifact",
        str(tmp_path / "infospaces" / "pilot"),
        str(source),
        "--kind",
        "source",
        "--title",
        "Source",
    )
    assert add.returncode == 0, add.stderr
    assert json.loads(add.stdout)["artifact"]["id"] == "source/source.md"

    inspect = run_cli("inspect", str(tmp_path / "infospaces" / "pilot"))
    assert inspect.returncode == 0, inspect.stderr
    payload = json.loads(inspect.stdout)
    assert payload["config"]["topic"]["domain"] == "Test Domain"
    assert payload["artifacts"][0]["title"] == "Source"


def test_cli_returns_structured_error(tmp_path: Path) -> None:
    result = run_cli("inspect", str(tmp_path / "missing"))

    assert result.returncode == 2
    payload = json.loads(result.stderr)
    assert payload["error"]["code"] == "missing_infospace"
78
tests/test_evaluation.py
Normal file
@@ -0,0 +1,78 @@
from datetime import datetime, timezone

from infospace_bench.evaluation import (
    EntityEvaluation,
    EvaluationSnapshot,
    MetricValue,
    ScoreEntry,
    diff_snapshots,
)


def test_entity_evaluation_round_trips_and_computes_overall_score() -> None:
    evaluated_at = datetime(2026, 5, 14, tzinfo=timezone.utc)
    evaluation = EntityEvaluation(
        artifact_id="source/chapter.md",
        evaluator="test",
        scores=[
            ScoreEntry("definition_precision", 4),
            ScoreEntry("provenance_quality", 5),
        ],
        evaluated_at=evaluated_at,
        notes=["clear enough"],
    )

    payload = evaluation.to_dict()
    loaded = EntityEvaluation.from_dict(payload)

    assert payload["overall_score"] == 4.5
    assert loaded == evaluation


def test_snapshot_diff_reports_added_removed_score_and_metric_changes() -> None:
    now = datetime(2026, 5, 14, tzinfo=timezone.utc)
    before = EvaluationSnapshot(
        snapshot_id="before",
        created_at=now,
        schema_name="baseline",
        artifact_count=1,
        artifact_evaluations=[
            EntityEvaluation(
                artifact_id="source/a.md",
                evaluator="test",
                scores=[ScoreEntry("quality", 3)],
                evaluated_at=now,
            )
        ],
        collection_metrics=[MetricValue("coverage_ratio", 0.5)],
    )
    after = EvaluationSnapshot(
        snapshot_id="after",
        created_at=now,
        schema_name="baseline",
        artifact_count=1,
        artifact_evaluations=[
            EntityEvaluation(
                artifact_id="source/a.md",
                evaluator="test",
                scores=[ScoreEntry("quality", 4)],
                evaluated_at=now,
            ),
            EntityEvaluation(
                artifact_id="source/b.md",
                evaluator="test",
                scores=[ScoreEntry("quality", 5)],
                evaluated_at=now,
            ),
        ],
        collection_metrics=[MetricValue("coverage_ratio", 0.75)],
    )

    diff = diff_snapshots(before, after)

    assert diff.added_artifacts == ["source/b.md"]
    assert diff.removed_artifacts == []
    assert diff.score_changes[0].artifact_id == "source/a.md"
    assert diff.score_changes[0].delta == 1
    assert diff.metric_changes[0].name == "coverage_ratio"
    assert diff.metric_changes[0].delta == 0.25
59
tests/test_inspection.py
Normal file
@@ -0,0 +1,59 @@
from infospace_bench.checks import run_collection_checks
from infospace_bench.inspection import export_mermaid, relationship_summary
from infospace_bench.models import KnowledgeArtifact, ViabilityThreshold
from infospace_bench.viability import evaluate_viability


def artifacts() -> list[KnowledgeArtifact]:
    return [
        KnowledgeArtifact(
            id="source/a.md",
            path="artifacts/sources/a.md",
            kind="source",
            title="A",
            relationships=[{"type": "supports", "target": "generated/b.md"}],
        ),
        KnowledgeArtifact(
            id="generated/b.md",
            path="artifacts/generated/b.md",
            kind="generated",
            title="B",
            relationships=[{"type": "refines", "target": "source/a.md"}],
        ),
    ]


def test_collection_checks_produce_viability_metrics() -> None:
    report = run_collection_checks(artifacts())

    assert report.metrics["redundancy_ratio"] == 0
    assert report.metrics["coverage_ratio"] == 1
    assert report.metrics["coherence_components"] == 1
    assert report.metrics["consistency_cycles"] == 1
    assert report.metrics["granularity_entropy"] == 1


def test_viability_reports_per_threshold_and_overall_result() -> None:
    report = evaluate_viability(
        {"coverage_ratio": 0.75, "consistency_cycles": 1},
        {
            "coverage_ratio": ViabilityThreshold(min=0.5),
            "consistency_cycles": ViabilityThreshold(max=0),
        },
    )

    assert report.passed is False
    assert report.results["coverage_ratio"].passed is True
    assert report.results["consistency_cycles"].passed is False


def test_relationship_summary_and_mermaid_export() -> None:
    summary = relationship_summary(artifacts())

    assert summary.node_count == 2
    assert summary.edge_count == 2
    assert summary.relationship_types == {"refines": 1, "supports": 1}

    mermaid = export_mermaid(summary)
    assert "source/a.md -->|supports| generated/b.md" in mermaid
    assert "generated/b.md -->|refines| source/a.md" in mermaid
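The inspection tests above pin down the observable shape of the relationship summary and the Mermaid edge lines, but `inspection.py` itself is not part of this section. The following is a rough sketch of how such output could be produced; `summarize` and `mermaid_edge_lines` are hypothetical names, and counting nodes from edge endpoints is an assumption (the real code may count artifacts instead).

```python
from collections import Counter

# Hypothetical edge records: (source id, relationship type, target id),
# matching the fixture used in the tests above.
edges = [
    ("source/a.md", "supports", "generated/b.md"),
    ("generated/b.md", "refines", "source/a.md"),
]


def summarize(edges):
    """Count distinct nodes, edges, and relationship types."""
    nodes = {node for src, _, dst in edges for node in (src, dst)}
    types = dict(sorted(Counter(kind for _, kind, _ in edges).items()))
    return {
        "node_count": len(nodes),
        "edge_count": len(edges),
        "relationship_types": types,
    }


def mermaid_edge_lines(edges):
    """Render one labeled Mermaid edge per relationship."""
    return [f"{src} -->|{kind}| {dst}" for src, kind, dst in edges]


summary = summarize(edges)
lines = mermaid_edge_lines(edges)
```

The sketch emits only edge lines; any diagram header the real `export_mermaid` writes is not asserted by the tests and is left out here.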
91
tests/test_lifecycle.py
Normal file
@@ -0,0 +1,91 @@
from pathlib import Path

import pytest

from infospace_bench import (
    InfospaceError,
    add_artifact,
    create_infospace,
    load_infospace,
)


def test_create_infospace_writes_layout_and_loadable_config(tmp_path: Path) -> None:
    infospace = create_infospace(
        tmp_path,
        "wealth-vsm",
        name="Wealth of Nations through VSM",
        topic_domain="Classical Economics",
    )

    root = tmp_path / "infospaces" / "wealth-vsm"
    assert infospace.root == root
    assert (root / "infospace.yaml").is_file()
    assert (root / "artifacts" / "sources").is_dir()
    assert (root / "artifacts" / "generated").is_dir()
    assert (root / "output" / "evaluations").is_dir()
    assert (root / "output" / "metrics").is_dir()
    assert (root / "reports").is_dir()
    assert (root / "exports").is_dir()

    loaded = load_infospace(root)

    assert loaded.config.slug == "wealth-vsm"
    assert loaded.config.name == "Wealth of Nations through VSM"
    assert loaded.config.topic.domain == "Classical Economics"
    assert loaded.config.topic.sources == "artifacts/sources"
    assert loaded.artifacts == []


def test_create_infospace_rejects_unsafe_slug_with_structured_error(tmp_path: Path) -> None:
    with pytest.raises(InfospaceError) as raised:
        create_infospace(tmp_path, "../outside", name="Nope")

    assert raised.value.code == "invalid_slug"
    assert raised.value.detail["slug"] == "../outside"


def test_load_infospace_reports_missing_config(tmp_path: Path) -> None:
    root = tmp_path / "infospaces" / "empty"
    root.mkdir(parents=True)

    with pytest.raises(InfospaceError) as raised:
        load_infospace(root)

    assert raised.value.code == "missing_config"
    assert "infospace.yaml" in raised.value.message


def test_add_artifact_copies_file_and_updates_manifest(tmp_path: Path) -> None:
    create_infospace(tmp_path, "pilot", name="Pilot Infospace")
    source = tmp_path / "chapter.md"
    source.write_text("# Chapter\n\nSource text.\n", encoding="utf-8")

    artifact = add_artifact(
        tmp_path / "infospaces" / "pilot",
        source,
        kind="source",
        title="Chapter 1",
    )

    assert artifact.id == "source/chapter.md"
    assert artifact.path == "artifacts/sources/chapter.md"
    assert (tmp_path / "infospaces" / "pilot" / artifact.path).read_text(
        encoding="utf-8"
    ).startswith("# Chapter")

    loaded = load_infospace(tmp_path / "infospaces" / "pilot")
    assert [item.id for item in loaded.artifacts] == ["source/chapter.md"]
    assert loaded.artifacts[0].title == "Chapter 1"


def test_add_artifact_rejects_duplicate_manifest_entry(tmp_path: Path) -> None:
    create_infospace(tmp_path, "pilot", name="Pilot Infospace")
    source = tmp_path / "chapter.md"
    source.write_text("# Chapter\n", encoding="utf-8")
    add_artifact(tmp_path / "infospaces" / "pilot", source, kind="source")

    with pytest.raises(InfospaceError) as raised:
        add_artifact(tmp_path / "infospaces" / "pilot", source, kind="source")

    assert raised.value.code == "duplicate_artifact"
27
tests/test_reference_pilot.py
Normal file
@@ -0,0 +1,27 @@
from pathlib import Path

from infospace_bench import load_infospace
from infospace_bench.checks import run_collection_checks
from infospace_bench.viability import evaluate_viability


def test_reference_pilot_is_loadable_and_viable() -> None:
    root = Path("infospaces/bootstrap-pilot")

    infospace = load_infospace(root)
    metrics = run_collection_checks(infospace.artifacts).metrics
    viability = evaluate_viability(metrics, infospace.config.viability)

    assert infospace.config.slug == "bootstrap-pilot"
    assert len(infospace.artifacts) == 4
    assert metrics["coverage_ratio"] == 1
    assert metrics["coherence_components"] == 1
    assert viability.passed is True


def test_reference_pilot_has_traceable_decision_and_report() -> None:
    decision = Path("docs/reference-pilot-decision.md")
    report = Path("infospaces/bootstrap-pilot/reports/baseline-inspection.md")

    assert "small purpose-built corpus" in decision.read_text(encoding="utf-8")
    assert "Migration from markitect-main" in report.read_text(encoding="utf-8")
@@ -4,11 +4,11 @@ type: workplan
 title: "Infospace Lifecycle Baseline"
 domain: markitect
 repo: infospace-bench
-status: planned
+status: done
 owner: markitect
 topic_slug: markitect
 created: "2026-05-03"
-updated: "2026-05-03"
+updated: "2026-05-14"
 state_hub_workstream_slug: "ib-wp-0002-infospace-lifecycle-baseline"
 state_hub_workstream_id: "b5972baf-1fb4-4375-a8e3-6e6d6d6fb97c"
 ---
@@ -34,7 +34,7 @@ inspecting, and exporting infospaces.

 ```task
 id: IB-WP-0002-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "c8aaa52e-db89-42e3-9ebc-cb89be3f4d30"
 ```
@@ -47,7 +47,7 @@ state_hub_task_id: "c8aaa52e-db89-42e3-9ebc-cb89be3f4d30"

 ```task
 id: IB-WP-0002-T02
-status: todo
+status: done
 priority: high
 state_hub_task_id: "a96c3439-89e4-40d0-8731-fe6b39a8f451"
 ```
@@ -61,7 +61,7 @@ state_hub_task_id: "a96c3439-89e4-40d0-8731-fe6b39a8f451"

 ```task
 id: IB-WP-0002-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "36ad8a84-f4fd-48ca-bc6f-94827ac03481"
 ```
@@ -74,7 +74,7 @@ state_hub_task_id: "36ad8a84-f4fd-48ca-bc6f-94827ac03481"

 ```task
 id: IB-WP-0002-T04
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "bfa70d92-c896-4133-8b93-0ece3e53c649"
 ```
@@ -4,11 +4,11 @@ type: workplan
 title: "Evaluation And Inspection Framework"
 domain: markitect
 repo: infospace-bench
-status: planned
+status: done
 owner: markitect
 topic_slug: markitect
 created: "2026-05-03"
-updated: "2026-05-03"
+updated: "2026-05-14"
 state_hub_workstream_slug: "ib-wp-0003-evaluation-and-inspection"
 state_hub_workstream_id: "bc368ba0-9fd7-4821-a5d7-e5c301faa80a"
 ---
@@ -32,7 +32,7 @@ application behavior.

 ```task
 id: IB-WP-0003-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "9bab4b20-3fef-469e-9ce2-f0db3e05e26a"
 ```
@@ -45,7 +45,7 @@ state_hub_task_id: "9bab4b20-3fef-469e-9ce2-f0db3e05e26a"

 ```task
 id: IB-WP-0003-T02
-status: todo
+status: done
 priority: high
 state_hub_task_id: "ee335d74-5be3-4b94-91e3-509486909f93"
 ```
@@ -58,7 +58,7 @@ state_hub_task_id: "ee335d74-5be3-4b94-91e3-509486909f93"

 ```task
 id: IB-WP-0003-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "d46b3429-37ef-4375-96e1-304eabf2cc13"
 ```
@@ -70,7 +70,7 @@ state_hub_task_id: "d46b3429-37ef-4375-96e1-304eabf2cc13"

 ```task
 id: IB-WP-0003-T04
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "de4f45e4-81a1-4ddb-98de-15e99ed5605a"
 ```
@@ -4,11 +4,11 @@ type: workplan
 title: "Reference Infospace Pilot"
 domain: markitect
 repo: infospace-bench
-status: planned
+status: done
 owner: markitect
 topic_slug: markitect
 created: "2026-05-03"
-updated: "2026-05-03"
+updated: "2026-05-14"
 state_hub_workstream_slug: "ib-wp-0004-reference-infospace-pilot"
 state_hub_workstream_id: "8940a180-646a-4b20-b41f-c56719adfb0e"
 ---
@@ -17,9 +17,10 @@ state_hub_workstream_id: "8940a180-646a-4b20-b41f-c56719adfb0e"

 ## Goal

-Create the first maintained reference infospace in this repo, using
-`markitect-main/examples/infospace-with-history/` as the primary migration
-candidate.
+Create the first maintained reference infospace in this repo. The large
+`markitect-main/examples/infospace-with-history/` corpus remains the primary
+future migration candidate; this workplan starts with a small purpose-built
+bootstrap corpus so baseline behavior is easy to inspect.

 ## Tasks

@@ -27,7 +28,7 @@ candidate.

 ```task
 id: IB-WP-0004-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "5042482f-a14c-4ae5-8d1a-19039dc97010"
 ```
@@ -40,7 +41,7 @@ state_hub_task_id: "5042482f-a14c-4ae5-8d1a-19039dc97010"

 ```task
 id: IB-WP-0004-T02
-status: todo
+status: done
 priority: high
 state_hub_task_id: "a0909309-775b-4220-94f5-6ca811d44caf"
 ```
@@ -53,7 +54,7 @@ state_hub_task_id: "a0909309-775b-4220-94f5-6ca811d44caf"

 ```task
 id: IB-WP-0004-T03
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "10c001a3-ce42-4f1b-8b9a-9965844a94cb"
 ```
@@ -66,7 +67,7 @@ state_hub_task_id: "10c001a3-ce42-4f1b-8b9a-9965844a94cb"

 ```task
 id: IB-WP-0004-T04
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "b5e43c2a-b29c-4cb6-b33c-b4db631d0079"
 ```