generated from coulomb/repo-seed
9404831069f0d2df61829929c40806420ce440ce
fix(evaluation_io): tolerate code-fenced frontmatter and varied score
shapes from small LLMs
Two bugs surfaced running the first live Lefevre chapter-I smoke
against openai/gpt-4o-mini.
1. _relative_to_root doubled artifact paths when --workspace was a
relative path (e.g. "."). The function received an already-CWD-
relative path like infospaces/foo/artifacts/sources/x.md and
re-prepended root, producing infospaces/foo/infospaces/foo/...
stored in artifacts/index.yaml — which then failed file reads on
the subsequent workflow stage. Fix: when raw is relative, try
CWD-relative resolution first (matches root / sub call shapes);
fall back to root-prefixing only when the CWD interpretation does
not land under root (matches bare relative-subpath call shapes
from rendered template outputs).
2. _read_frontmatter_markdown only accepted a literal ---/---
delimited block at the start of the file. gpt-4o-mini emitted four
non-canonical shapes across the seven evaluation files this chapter
produced:
- ```yaml ... ``` fence (no --- delimiters)
- ```markdown ... ``` outer fence wrapping --- frontmatter
- scores as mapping ({groundedness: 4, ...}) instead of the
canonical list of {name, value} dicts
- scores as list of single-key dicts ([{groundedness: 4}, ...])
Fix: _extract_frontmatter_block tolerates ```yaml fences and strips
```markdown outer fences; _normalise_scores rewrites mapping- and
single-key-dict shapes into the canonical form so ScoreEntry.from_dict
keeps working.
Both fixes are pure-Python; no API changes. 179 tests pass, 2 skipped.
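The resolution order from item 1 can be sketched as follows. This is an
illustration under assumptions, not the committed _relative_to_root: the
signature (raw, root, cwd as plain POSIX strings), the purely lexical
normalisation, and the choice to return a root-relative path are all
hypothetical.

```python
import posixpath
from pathlib import PurePosixPath


def relative_to_root(raw, root, cwd):
    """Return raw as a path relative to root, trying the CWD reading first."""
    # Anchor root at cwd when it is relative (e.g. --workspace ".").
    root_abs = PurePosixPath(posixpath.normpath(posixpath.join(cwd, root)))

    def try_under_root(base):
        # posixpath.join ignores base when raw is already absolute.
        absolute = PurePosixPath(posixpath.normpath(posixpath.join(base, raw)))
        try:
            return absolute.relative_to(root_abs).as_posix()
        except ValueError:
            return None  # this interpretation does not land under root

    # 1. CWD-relative interpretation (matches root / sub call shapes).
    found = try_under_root(cwd)
    if found is not None:
        return found
    # 2. Fall back to root-prefixing (matches bare relative subpaths).
    found = try_under_root(str(root_abs))
    if found is None:
        raise ValueError(f"{raw!r} does not resolve under {root!r}")
    return found
```

With root "infospaces/foo" and cwd "/work", both
"infospaces/foo/artifacts/sources/x.md" and "artifacts/sources/x.md"
come back as "artifacts/sources/x.md" in this sketch, so the root prefix
is never applied twice.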
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
infospace-bench
Workspace and service for creating, developing, evaluating, and inspecting structured knowledge spaces.
This repo is the application-layer successor for the infospace work that began
inside markitect-main. It focuses on concrete infospaces and their lifecycle,
while lower-level markdown tooling and runtime orchestration remain in sibling
projects.
Start with:
- INTENT.md
- wiki/ProductRequirementsDocument.md
- wiki/FunctionalRequirementsSpecification.md
- SCOPE.md
- docs/infospace-layout.md
- docs/evaluation-and-inspection.md
- docs/reference-pilot-decision.md
- docs/markitect-main-scope-assessment.md
- docs/markitect-tool-adapter.md
- docs/entity-relation-model.md
- docs/evaluation-history-and-metrics.md
- docs/workflow-generation-pipeline.md
- docs/kontextual-engine-boundary.md
- docs/orthogonal-successor-roadmap.md
- docs/legacy-infospace-feature-inventory.md
- docs/successor-boundary-interface-map.md
- docs/replacement-acceptance-matrix.md
- docs/legacy-command-parity.md
- docs/legacy-infospace-migration-guide.md
- docs/replacement-readiness-decision.md
- docs/wealth-vsm-generation-pipeline.md
- docs/generic-source-generator.md
- docs/agentic-memory-profile-pilot.md
- docs/lefevre-epub3-validation.md
- infospaces/bootstrap-pilot/
- infospaces/wealth-vsm-legacy-slice/
- infospaces/wealth-vsm-generation-pilot/
- infospaces/agentic-memory-profile-pilot/
- workplans/
Current development command:
python3 -m pytest
Languages: Python 99.9%, Makefile 0.1%