3 Commits

Author SHA1 Message Date
f325f89dc9 feat(infospace): evaluate 3 missing WoN entities (C.1)
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Fills the 988 entity / 985 evaluation gap in the Wealth of Nations
infospace. Entities advanced_state_of_society, bank_notes, and
bank_systemic_risk_management had no evaluation files; runs through
Gemini (2.5-flash / 2.5-flash-lite for the last one, which hit the
free-tier RPM limit) bring the eval count to 988.

per_entity_mean nudged from 3.955635 to 3.95668; viability still
6/6 PASS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 23:52:04 +02:00
36a5136bdf docs(infospace): add advanced-usage, composition guide, and performance notes (C.4/C.5/C.6)
Closes out three docs tasks from roadmap/infospace-s3-closeout/PLAN.md:

- examples/infospace-with-history/docs/advanced-usage.md (C.4) — 5 worked
  patterns covering incremental eval, re-eval workflow (no --force flag
  exists; documents the rm-then-re-run pattern instead), interpreting the
  eval-summary distribution, triaging low scorers via an awk pipeline
  over overall_score (since `entities --sort-by score` does not exist),
  and acting on check --json output.
- docs/composition-guide.md (C.5) — walks through how supply-chain-vsm
  binds WoN as a discipline, then a step-by-step for creating a new
  infospace that binds an existing one. Includes live output from
  `markitect infospace disciplines`.
- examples/infospace-with-history/docs/performance-notes.md (C.6) — cites
  the 6h 28m wall time of the 985-entity S3.3 batch, ~2.5 ent/min rate,
  ~2000–3000 tokens/entity estimate, word_overlap vs embedding backend
  for redundancy checks, and a provider-by-scale recommendation table.

All commands in these docs were run against the live infospace at
commit time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 07:02:46 +02:00
b7e11461f4 chore: rename markitect_project to markitect-main across project
Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.

Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:57:35 +02:00
26 changed files with 677 additions and 40 deletions

View File

@@ -10,7 +10,7 @@ principles with strict separation of concerns.
## Directory Structure & Clean Architecture
```
markitect_project/
markitect-main/
├── domain/ # Business logic (innermost layer)
├── application/ # Use cases and workflows
├── infrastructure/ # External interfaces (database, file system)

2
.gitmodules vendored
View File

@@ -1,6 +1,6 @@
[submodule "wiki"]
path = wiki
url = http://92.205.130.254:32166/coulomb/markitect_project.wiki.git
url = http://92.205.130.254:32166/coulomb/markitect-main.wiki.git
branch = main
[submodule "capabilities/kaizen-agentic"]
path = capabilities/kaizen-agentic

View File

@@ -457,7 +457,7 @@ Sister projects can reuse these capabilities directly:
Install capabilities via local file references:
```toml
[project.dependencies]
release-management = {path = "../markitect_project/capabilities/release-management"}
release-management = {path = "../markitect-main/capabilities/release-management"}
```
### Shared Infrastructure

View File

@@ -15,7 +15,7 @@ You are responsible for:
### Directory Structure
```
markitect_project/
markitect-main/
├── Makefile # Main project Makefile
├── scripts/
│ └── capability_discovery.mk # Auto-discovery and delegation system

View File

@@ -7,7 +7,7 @@ detachment:
capability_name: issue-facade
capability_family: issue-tracking
integration_pattern: capabilities-directory
original_location: /home/worsch/markitect_project/capabilities/issue-facade
original_location: /home/worsch/markitect-main/capabilities/issue-facade
capability_metadata:
spec_file: CAPABILITY-issue-tracking.yaml
@@ -17,23 +17,23 @@ capability_metadata:
integration_details:
parent_project: capabilities
parent_path: /home/worsch/markitect_project/capabilities
parent_path: /home/worsch/markitect-main/capabilities
re_integration_guide: |
To re-integrate this capability using the new architecture:
# Option 1: Git submodule (recommended)
cd /home/worsch/markitect_project/capabilities
cd /home/worsch/markitect-main/capabilities
git submodule add <repo-url> _issue-facade
pip install -e _issue-facade/
# Option 2: Clone directly
cd /home/worsch/markitect_project/capabilities
cd /home/worsch/markitect-main/capabilities
git clone <repo-url> _issue-facade
pip install -e _issue-facade/
# Option 3: Copy into project
cd /home/worsch/markitect_project/capabilities
cd /home/worsch/markitect-main/capabilities
cp -r /path/to/issue-facade _issue-facade
pip install -e _issue-facade/

View File

@@ -8,7 +8,7 @@ This test module validates outline mode schema generation improvements including
- Content instruction integration
- End-to-end workflow from example document to generated drafts
Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect_project/issues/46
Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect-main/issues/46
"""
import pytest

View File

@@ -209,7 +209,7 @@ tests/
## 🎯 Detailed File Structure After Migration
```
markitect_project/
markitect-main/
├── capabilities/
│ └── release-management/
│ ├── README.md ✅ CREATED

View File

@@ -162,7 +162,7 @@ clean_before_build = true
[tool.release-management.registries.gitea]
url = "http://92.205.130.254:32166"
owner = "coulomb"
repo = "markitect_project"
repo = "markitect-main"
auth_token_env = "GITEA_API_TOKEN"
[tool.release-management.registries.pypi]

View File

@@ -141,7 +141,7 @@ make release-publish VERSION=0.8.0
## Registry Information
- **Gitea URL**: http://92.205.130.254:32166
- **Repository**: coulomb/markitect_project
- **Repository**: coulomb/markitect-main
- **PyPI Registry URL**: http://92.205.130.254:32166/api/packages/coulomb/pypi
- **Package List URL**: http://92.205.130.254:32166/api/v1/packages/coulomb

View File

@@ -8,7 +8,7 @@
```bash
# ❌ WRONG - Don't edit capability files from main repo
cd /home/worsch/markitect_project/capabilities/testdrive-jsui
cd /home/worsch/markitect-main/capabilities/testdrive-jsui
vim src/testdrive_jsui/core.py # DON'T DO THIS!
# ✅ CORRECT - Use separate Claude instance/session
@@ -29,7 +29,7 @@ cd /path/to/work/testdrive-jsui
| Session | Purpose | Location |
|---------|---------|----------|
| **Main Repo** | Integration, configuration | `/home/worsch/markitect_project` |
| **Main Repo** | Integration, configuration | `/home/worsch/markitect-main` |
| **Capability** | Feature development, bugs | Separate clone or `capabilities/capability-name` |
**Why?** Prevents accidental cross-contamination and respects repository boundaries.
@@ -40,7 +40,7 @@ cd /path/to/work/testdrive-jsui
```bash
# After pushing changes to capability repo
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
git submodule update --remote capabilities/testdrive-jsui
git add capabilities/testdrive-jsui
git commit -m "chore: update testdrive-jsui to latest"
@@ -50,7 +50,7 @@ git push
### Add New Capability
```bash
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
# Add as submodule
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
@@ -67,7 +67,7 @@ git commit -m "feat: add new-capability submodule"
```bash
# Option 1: In submodule directory (careful!)
cd /home/worsch/markitect_project/capabilities/testdrive-jsui
cd /home/worsch/markitect-main/capabilities/testdrive-jsui
git checkout -b feature-branch
# make changes
git commit -m "feat: new feature"
@@ -86,7 +86,7 @@ git push origin feature-branch
### Check Capability Status
```bash
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
# List all capabilities
make capabilities-list

View File

@@ -9,7 +9,7 @@ MarkiTect is a markdown processing toolkit with transclusion, schema validation,
## Current Directory Structure
```
markitect_project/
markitect-main/
├── markitect/ # Main package
│ ├── [34 root-level .py files] # Core functionality (see below)
│ ├── assets/ # Asset discovery, management, caching (21 files)

View File

@@ -8,7 +8,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
### 1. **Separation of Concerns**
**Critical Rule:** The main repository (`markitect_project`) **MUST NOT** directly modify capability code.
**Critical Rule:** The main repository (`markitect-main`) **MUST NOT** directly modify capability code.
- **DO**: Use capabilities as dependencies
- **DO**: Configure capabilities through documented interfaces
@@ -28,7 +28,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
Capabilities are integrated as **git submodules**, not regular directories:
```
markitect_project/
markitect-main/
├── .gitmodules # Submodule configuration
├── capabilities/
│ ├── testdrive-jsui/ # Git submodule → separate repo
@@ -80,8 +80,8 @@ engine.render_document(content, mode='edit', config=config)
#### Main Repository Session
```bash
# In markitect_project/
cd /home/worsch/markitect_project
# In markitect-main/
cd /home/worsch/markitect-main
# Main repo tasks:
# - Integrate capabilities
@@ -93,7 +93,7 @@ cd /home/worsch/markitect_project
#### Capability Session
```bash
# In capability repository
cd /home/worsch/markitect_project/capabilities/testdrive-jsui
cd /home/worsch/markitect-main/capabilities/testdrive-jsui
# OR clone separately
git clone http://gitea/coulomb/testdrive-jsui.git
@@ -122,7 +122,7 @@ cd testdrive-jsui
2. **Update main project** (different Claude instance)
```bash
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
git submodule update --remote capabilities/testdrive-jsui
git commit -m "chore: update testdrive-jsui submodule"
```
@@ -139,7 +139,7 @@ When a capability releases a new version:
```bash
# In main repo
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
# Update specific capability
cd capabilities/testdrive-jsui
@@ -160,7 +160,7 @@ git commit -am "chore: update all capabilities"
# http://gitea/coulomb/new-capability
# 2. Add as submodule to main repo
cd /home/worsch/markitect_project
cd /home/worsch/markitect-main
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
# 3. Add dependency to pyproject.toml
@@ -324,7 +324,7 @@ def test_testdrive_jsui_integration():
1. **Create separate git repo**
```bash
cd /tmp
cp -r markitect_project/capabilities/capability-name capability-name
cp -r markitect-main/capabilities/capability-name capability-name
cd capability-name
git init
git add .
@@ -335,7 +335,7 @@ def test_testdrive_jsui_integration():
2. **Remove from main repo**
```bash
cd markitect_project
cd markitect-main
git rm -rf capabilities/capability-name
git commit -m "chore: remove capability-name for submodule conversion"
```

203
docs/composition-guide.md Normal file
View File

@@ -0,0 +1,203 @@
# Infospace Composition Guide
One completed, viable infospace can be reused as a **discipline** for
another infospace — a lens applied to a different topic. This guide
explains how composition works and walks through the live
`examples/supply-chain-vsm/` reference.
---
## What composition means
An **infospace** is a directory of typed entities governed by
`infospace.yaml`. Its entities and relations describe a specific topic
(for example, Adam Smith's *Wealth of Nations*).
A **discipline** is an infospace declared as a reusable analytical
framework by another infospace. When infospace B binds infospace A as a
discipline:
1. B's entities can reference A's entities in `## WoN Concept` (or
equivalent) sections.
2. Properties A has already computed on its entities — such as VSM system
placement — become available to B by transitivity through the mapping.
3. B can impose its own viability thresholds independently of A's. The two
infospaces each pass or fail viability on their own terms.
The binding is declarative: a relative path in `infospace.yaml` plus a
display name. No code. No import. The discipline is looked up on disk at
the declared path when B's commands run.
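
A minimal sketch of how a relative discipline path could be resolved against the directory of the binding infospace. The helper name `resolve_disciplines`, the top-level placement of `disciplines` in the config, and the test for an `infospace.yaml` file in the target are assumptions for illustration; markitect's actual lookup code may differ.

```python
import os
from pathlib import Path


def resolve_disciplines(config: dict, infospace_dir: Path) -> list:
    """Resolve each declared discipline path relative to the binding infospace.

    Hypothetical helper, not part of the markitect API.
    """
    resolved = []
    for entry in config.get("disciplines", []):
        # Normalise "../x" against the directory that holds infospace.yaml.
        target = Path(os.path.normpath(infospace_dir / entry["path"]))
        resolved.append(
            {
                "name": entry["name"],
                "path": target,
                # Assumption: a usable discipline has its own infospace.yaml.
                "exists": (target / "infospace.yaml").is_file(),
            }
        )
    return resolved


# Config shape assumed from the YAML examples in this guide.
config = {
    "disciplines": [
        {"name": "Wealth of Nations", "path": "../infospace-with-history"},
    ]
}
bindings = resolve_disciplines(config, Path("examples/supply-chain-vsm"))
for binding in bindings:
    print(binding["name"], binding["path"], binding["exists"])
```

Because the path is resolved relative to the binding infospace, moving both directories together leaves the result unchanged.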
---
## The viability pre-condition
Binding a non-viable infospace as a discipline is a mistake: a framework
that fails its own thresholds is not a stable reference frame. Before
binding, confirm the candidate discipline is viable:
```bash
cd examples/infospace-with-history
markitect infospace viability
```
```
Metric Value Threshold Status
---------------------------------------------------------------
redundancy_ratio 0.0061 max=0.1 PASS
coverage_ratio 0.6190 min=0.4 PASS
coherence_components 0.0000 max=3 PASS
consistency_cycles 0.0000 max=0 PASS
granularity_entropy 2.6748 min=1.0 PASS
per_entity_mean 3.9556 min=3.5 PASS
Viable: YES (6/6 thresholds met)
```
If the discipline is not viable, fix it first (see
`examples/infospace-with-history/docs/advanced-usage.md` §4 for triaging
low scorers).
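
The pass/fail logic behind the table above can be sketched as follows, using the metric values and thresholds from that output. The inclusive comparisons are an assumption inferred from `consistency_cycles` passing at `0.0000` against `max=0`; the function name `check_viability` is hypothetical and this is not markitect's implementation.

```python
# Thresholds and values copied from the viability output above.
THRESHOLDS = {
    "redundancy_ratio": ("max", 0.1),
    "coverage_ratio": ("min", 0.4),
    "coherence_components": ("max", 3),
    "consistency_cycles": ("max", 0),
    "granularity_entropy": ("min", 1.0),
    "per_entity_mean": ("min", 3.5),
}
METRICS = {
    "redundancy_ratio": 0.0061,
    "coverage_ratio": 0.6190,
    "coherence_components": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.6748,
    "per_entity_mean": 3.9556,
}


def check_viability(metrics: dict, thresholds: dict) -> dict:
    """Return {metric: passed} using inclusive min/max comparisons (assumed)."""
    results = {}
    for name, (kind, limit) in thresholds.items():
        value = metrics[name]
        results[name] = value <= limit if kind == "max" else value >= limit
    return results


results = check_viability(METRICS, THRESHOLDS)
passed = sum(results.values())
verdict = "YES" if passed == len(results) else "NO"
print(f"Viable: {verdict} ({passed}/{len(results)} thresholds met)")
```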
---
## Example — how `supply-chain-vsm` binds WoN
The supply-chain infospace declares WoN as a discipline in its
`infospace.yaml`:
```yaml
topic:
name: "Modern Supply Chain Management"
domain: "Operations Management"
sources: artifacts/sources/
disciplines:
- name: "Wealth of Nations"
path: ../infospace-with-history
```
The binding is a **relative path**, so the two infospaces travel together
(they can be moved as a pair without breaking the link).
Verify the binding resolves and the discipline is viable:
```bash
cd examples/supply-chain-vsm
markitect infospace disciplines
```
```
Name Entities Viable Path
----------------------------------------------------------------------
Wealth of Nations 988 YES ../infospace-with-history
```
Each supply-chain entity then carries a `## WoN Concept` section
mapping it to exactly one WoN entity. The consolidated mapping files
(`output/mappings/*-mappings.md`) record the pairing, rationale, and a
conceptual-continuity rating (Strong / Moderate / Weak):
| Supply Chain Entity | WoN Concept | Strength | VSM |
|------------------------------|----------------------------------|----------|-------|
| Demand Signal | Effectual Demand | Strong | S2 |
| Vendor-Managed Inventory | Division of Labour | Strong | S1/S2 |
| Just-in-Time Inventory | Circulating Capital | Strong | S1/S3 |
| Bullwhip Effect | Natural Price as Central Price | Moderate | S2 |
| Safety Stock | Accumulation of Stock | Moderate | S3 |
Because each WoN entity already has a VSM system placement (S1–S5), the
supply-chain entities inherit a VSM position by transitivity through
their mapping — without supply-chain-vsm needing its own VSM reference.
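
The transitive lookup can be sketched with two dictionaries built from the table above. The names `WON_VSM`, `MAPPINGS`, and `inherited_vsm` are illustrative only; the real data lives in the entity files and the consolidated mapping files, not in Python dictionaries.

```python
# VSM placement per WoN entity, taken from the VSM column of the table above.
WON_VSM = {
    "Effectual Demand": "S2",
    "Division of Labour": "S1/S2",
    "Circulating Capital": "S1/S3",
    "Natural Price as Central Price": "S2",
    "Accumulation of Stock": "S3",
}

# Supply-chain entity -> the single WoN concept it maps to.
MAPPINGS = {
    "Demand Signal": "Effectual Demand",
    "Vendor-Managed Inventory": "Division of Labour",
    "Just-in-Time Inventory": "Circulating Capital",
    "Bullwhip Effect": "Natural Price as Central Price",
    "Safety Stock": "Accumulation of Stock",
}


def inherited_vsm(entity: str, mappings: dict, won_vsm: dict) -> str:
    """Look up a supply-chain entity's VSM position through its WoN mapping."""
    return won_vsm[mappings[entity]]


for entity in MAPPINGS:
    print(f"{entity}: {inherited_vsm(entity, MAPPINGS, WON_VSM)}")
```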
---
## Creating a new infospace that binds an existing one
Step-by-step, using WoN as the discipline for a hypothetical "Modern
Monetary Policy" infospace:
### 1. Start from the target topic
```bash
mkdir -p examples/monetary-policy/artifacts/sources
cd examples/monetary-policy
markitect infospace init
```
### 2. Declare the discipline in `infospace.yaml`
```yaml
topic:
name: "Modern Monetary Policy"
domain: "Macroeconomics"
sources: artifacts/sources/
disciplines:
- name: "Wealth of Nations"
path: ../infospace-with-history
```
Alternatively, bind imperatively after `init`:
```bash
markitect infospace bind-discipline ../infospace-with-history --name "Wealth of Nations"
```
### 3. Set your own viability thresholds
Copy the `viability:` block from a reference infospace and tune the
numbers to the scale and maturity of your topic. A smaller infospace
(50 entities, not 988) may need laxer `coverage_ratio` and stricter
`redundancy_ratio`.
### 4. Verify the binding
```bash
markitect infospace disciplines
```
If `Viable` is `NO`, stop and fix the discipline before continuing.
### 5. Reference discipline entities in your own entities
For each entity in the new infospace, add a `## <Discipline> Concept`
section that names the WoN entity the concept maps to, plus a rationale.
The exact section heading is configured per schema — see
`schemas/won-mapping-schema-v1.0.md` in `supply-chain-vsm` for the
template used there.
### 6. Run checks and evaluate
```bash
markitect infospace check
markitect infospace evaluate --provider openrouter
markitect infospace eval-summary --update-metrics
markitect infospace viability
```
The new infospace passes or fails viability independently of WoN.
---
## Why composition, not inclusion?
An alternative would be to copy WoN entities directly into the target
infospace. Composition avoids that by design:
- **One source of truth** — if WoN is refined, every infospace that binds
it picks up the improvement on the next run without a sync step.
- **Separation of concerns** — each infospace owns its own schema,
thresholds, and entity set. Changing the target topic cannot pollute
the discipline.
- **Bounded dependency** — the binding is a path, so the coupling is
visible in one place (`infospace.yaml`) and easy to remove.
---
## See also
- `examples/supply-chain-vsm/README.md` — the full reference composition.
- `examples/supply-chain-vsm/output/mappings/` — consolidated mapping
files showing the rationale and strength rating for each pairing.
- `examples/infospace-with-history/docs/advanced-usage.md` — patterns for
maintaining the discipline once it is in use.

View File

@@ -117,7 +117,7 @@ This graph enables:
```bash
# Ensure MarkiTect is installed
cd /path/to/markitect_project
cd /path/to/markitect-main
pip install -e .
```

View File

@@ -0,0 +1,179 @@
# Advanced Usage — Wealth of Nations Infospace
Patterns for working with the WoN infospace (988 entities) after the initial
pipeline run. Every command in this file has been run against the actual
infospace at the time of writing (2026-04-21); output shapes are excerpted
verbatim.
All commands assume `cwd = examples/infospace-with-history` and the
`markitect-venv` Python environment.
---
## 1. Incremental evaluation — add entities after the initial run
`markitect infospace evaluate` writes one file per entity under
`output/evaluations/<slug>.md`. It skips any entity whose evaluation file
already exists, so re-running after adding a new entity processes only the
new one.
```bash
# Add a new entity file
vim output/entities/new-concept.md
# Evaluate only the new entity (explicit)
markitect infospace evaluate --entity new-concept --provider openrouter
# Or re-run the whole pass — existing 988 are skipped, only the new file hits the LLM
markitect infospace evaluate --provider openrouter
```
**How skip detection works.** Evaluation slugs are normalised to underscores
with `_s_` preserving apostrophes (`farmers-capital` entity →
`farmer_s_capital.md` evaluation). If a new entity slug collides with an
existing evaluation under this normalisation, the eval will be skipped.
To be sure an entity was picked up, check:
```bash
# Count entities vs evaluations
ls output/entities/*.md | grep -Ev 'book-[0-9]+-(chapter-[0-9]+|introduction)-' | wc -l
ls output/evaluations/*.md | wc -l
```
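
The same comparison can be sketched in Python, assuming (as the `grep` pattern above does) that per-chapter files in `output/entities/` are not entities themselves. The function name `count_gap` is hypothetical.

```python
import re

# Same pattern as the grep -Ev filter above: per-chapter files are excluded.
CHAPTER_FILE = re.compile(r"book-[0-9]+-(chapter-[0-9]+|introduction)-")


def count_gap(entity_files, evaluation_files):
    """Return (entity count, evaluation count, entities minus evaluations)."""
    entities = [
        name
        for name in entity_files
        if name.endswith(".md") and not CHAPTER_FILE.search(name)
    ]
    evaluations = [name for name in evaluation_files if name.endswith(".md")]
    return len(entities), len(evaluations), len(entities) - len(evaluations)


# Illustrative file names only.
print(
    count_gap(
        ["bank-notes.md", "new-concept.md", "book-1-chapter-06-entities.md"],
        ["bank_notes.md"],
    )
)
```

To run it against the live infospace, pass `(p.name for p in Path("output/entities").glob("*.md"))` and the equivalent generator for `output/evaluations`. A non-zero third value means some entity still has no evaluation file.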
---
## 2. Re-evaluating after guideline changes
`evaluate` has no `--force` flag; re-evaluation requires deleting the
existing file first.
```bash
# Re-evaluate a single entity after updating the evaluation rubric
rm output/evaluations/accumulation_of_stock.md
markitect infospace evaluate --entity accumulation-of-stock --provider openrouter
# Re-evaluate a whole chapter
ls output/entities/book-1-chapter-06-entities.md # see which entities the chapter produced
# Map chapter entities to eval filenames (apostrophe/underscore normalisation) and rm them
```
After re-evaluating, refresh the aggregate:
```bash
markitect infospace eval-summary --update-metrics
```
This merges `per_entity_mean` into `output/metrics/metrics.yaml` so the next
`markitect infospace viability` check reflects the new scores.
---
## 3. Interpreting per-entity score distributions
`eval-summary` shows the mean for each of the five evaluation dimensions
plus the overall range:
```
$ markitect infospace eval-summary
Evaluation summary — 985 entities evaluated
Dimension Mean
--------------------------------------
overall 3.956
definition_precision 3.620
domain_placement 4.559
explanatory_value 3.936
source_grounding 4.358
vsm_relevance 3.305
Range: 1.00 – 4.80
```
Interpretation:
- `overall` above the 3.5 viability threshold → the collection passes
`per_entity_mean`.
- The lowest dimension (`vsm_relevance` = 3.305) is the weakest signal. If
the collection is meant to be VSM-grounded, this is the dimension most
worth improving (via sharper entity definitions or schema changes).
- A wide range (1.00–4.80) tells you there are outliers at both ends that
  are worth triaging (see pattern 4).
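
The reading above can be reproduced from the printed means. This is a small sketch using the figures from the `eval-summary` output; the variable names are illustrative. For these figures, the unweighted mean of the five dimension means agrees with the reported `overall` value to three decimals.

```python
# Means copied from the eval-summary output above.
DIMENSION_MEANS = {
    "definition_precision": 3.620,
    "domain_placement": 4.559,
    "explanatory_value": 3.936,
    "source_grounding": 4.358,
    "vsm_relevance": 3.305,
}
OVERALL_MEAN = 3.956
PER_ENTITY_MEAN_MIN = 3.5  # viability threshold for per_entity_mean

weakest = min(DIMENSION_MEANS, key=DIMENSION_MEANS.get)
mean_of_dimensions = sum(DIMENSION_MEANS.values()) / len(DIMENSION_MEANS)
passes = OVERALL_MEAN >= PER_ENTITY_MEAN_MIN

print(f"weakest dimension: {weakest} ({DIMENSION_MEANS[weakest]:.3f})")
print(f"mean of dimension means: {mean_of_dimensions:.4f}")
print(f"passes per_entity_mean: {passes}")
```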
---
## 4. Triaging low scorers
`markitect infospace entities --by-type` prints each entity's star score
in-line:
```
$ markitect infospace entities --by-type | head
=== Element (315 entities) ===
active_and_productive_stock Accumulation S1 ★4.6
advanced_state_of_society General Theory S5
agio_of_bank_money Exchange S2 ★4.8
```
Entities with no `★` have no evaluation yet. To list the lowest-scoring
entities across the whole collection:
```bash
# Extract overall_score from every evaluation file and sort ascending
for f in output/evaluations/*.md; do
score=$(awk '/^overall_score:/ {print $2; exit}' "$f")
printf "%s\t%s\n" "$score" "$(basename "$f" .md)"
done | sort -n | head -20
```
The 20 lowest scorers are the natural triage list — inspect their
`output/entities/<slug>.md` and evaluation rationales to decide whether to
refine the entity, merge it with a better-formed neighbour, or drop it.
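
A Python sketch of the same triage, for cases where the result needs further processing. Like the `awk` one-liner, it reads the first `overall_score:` line of each file. Unlike the shell pipeline, it skips files with a missing or non-numeric score instead of sorting them first. The function names are hypothetical.

```python
from typing import Optional


def extract_overall_score(text: str) -> Optional[float]:
    """Return the first overall_score value in an evaluation file, if numeric."""
    for line in text.splitlines():
        if line.startswith("overall_score:"):
            fields = line.split()
            try:
                return float(fields[1])
            except (IndexError, ValueError):
                return None
    return None


def lowest_scorers(evaluations: dict, limit: int = 20) -> list:
    """Return up to `limit` (score, slug) pairs, lowest score first."""
    scored = []
    for slug, text in evaluations.items():
        score = extract_overall_score(text)
        if score is not None:
            scored.append((score, slug))
    return sorted(scored)[:limit]


# Frontmatter excerpts from three evaluation files, shortened for the example.
sample = {
    "advanced_state_of_society": "---\noverall_score: 4.5\n---\n",
    "bank_notes": "---\noverall_score: 4.4\n---\n",
    "bank_systemic_risk_management": "---\noverall_score: 4.0\n---\n",
}
for score, slug in lowest_scorers(sample, limit=2):
    print(f"{score}\t{slug}")
```

To feed it real data, build the dictionary with `{p.stem: p.read_text() for p in Path("output/evaluations").glob("*.md")}`.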
---
## 5. Reading and acting on collection-check output
`markitect infospace check` runs five concerns (C1–C5). Use `--concern` to
focus on one and `--json` for machine-readable output:
```bash
# Redundancy — which pairs of entities are suspiciously similar?
markitect infospace check --concern redundancy --json
```
```json
{
"redundancy": {
"concern": "C1",
"redundancy_ratio": 0.0061,
"similar_pairs": [
{"entity_a": "bank_economic_contribution_metrics",
"entity_b": "bank_economic_development_metrics",
"similarity": 1.0, "method": "word_overlap"},
{"entity_a": "economic_system_objectives",
"entity_b": "economic_system_purpose",
"similarity": 0.9394, "method": "word_overlap"}
]
}
}
```
Acting on this:
- **Similarity = 1.0** is almost certainly a duplicate — pick one slug and
merge or delete the other.
- **0.85–0.99** usually means two entities genuinely cover the same idea
with slight phrasing differences. Merging is the cleanest fix.
- **< 0.85** usually represents legitimate adjacent concepts — leave as-is
unless the definition rubric says otherwise.
For coverage and coherence, the pattern is the same: the `--json` output
surfaces the specific entities / missing links / disconnected components
you need to look at, rather than a bare ratio.
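
The redundancy rules of thumb above can be applied to the `--json` output with a short script. This sketch assumes the JSON shape shown in the excerpt; the bucket names and the `triage_pairs` function are illustrative, not part of the CLI.

```python
import json

# Excerpt of the check --json output shown above.
REPORT = json.loads(
    """
{
  "redundancy": {
    "concern": "C1",
    "redundancy_ratio": 0.0061,
    "similar_pairs": [
      {"entity_a": "bank_economic_contribution_metrics",
       "entity_b": "bank_economic_development_metrics",
       "similarity": 1.0, "method": "word_overlap"},
      {"entity_a": "economic_system_objectives",
       "entity_b": "economic_system_purpose",
       "similarity": 0.9394, "method": "word_overlap"}
    ]
  }
}
"""
)


def triage_pairs(report: dict) -> dict:
    """Bucket similar pairs by the thresholds described above."""
    buckets = {"duplicate": [], "merge_candidate": [], "adjacent": []}
    for pair in report["redundancy"]["similar_pairs"]:
        names = (pair["entity_a"], pair["entity_b"])
        if pair["similarity"] >= 1.0:
            buckets["duplicate"].append(names)
        elif pair["similarity"] >= 0.85:
            buckets["merge_candidate"].append(names)
        else:
            buckets["adjacent"].append(names)
    return buckets


for bucket, pairs in triage_pairs(REPORT).items():
    print(bucket, pairs)
```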
---
## See also
- `METRICS-METHODOLOGY.md` — how each metric is computed.
- `docs/composition-guide.md` — using this infospace as a discipline for a
different domain.
- `docs/performance-notes.md` — observed timings and provider choices.

View File

@@ -0,0 +1,106 @@
# Performance Notes — Wealth of Nations Infospace
Observed timings, file sizes, and provider choices from the 988-entity WoN
example. These are **operational notes**, not a benchmark — numbers come
from the actual S3.3 evaluation run (2026-02-23) rather than a controlled
experiment.
---
## Evaluation batch duration
The initial evaluation pass produced 985 `output/evaluations/*.md` files:
- First `evaluated_at`: `2026-02-23T00:11:52`
- Last `evaluated_at`: `2026-02-23T06:39:45`
- **Total wall time: ~6h 28m**
- **Effective throughput: ~2.5 entities/min** (~152 entities/hour)
Extracted from evaluation frontmatter:
```bash
grep -h '^evaluated_at:' output/evaluations/*.md | sort | sed -n '1p;$p'
```
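
The wall time and throughput figures follow directly from the two timestamps. A short sketch of the arithmetic, with illustrative variable names:

```python
from datetime import datetime

# First and last evaluated_at values quoted above.
first = datetime.fromisoformat("2026-02-23T00:11:52")
last = datetime.fromisoformat("2026-02-23T06:39:45")
evaluated = 985  # evaluation files produced by the initial pass

elapsed = last - first
minutes = elapsed.total_seconds() / 60
per_minute = evaluated / minutes
per_hour = per_minute * 60

print(f"wall time: {elapsed}")
print(f"throughput: {per_minute:.2f} entities/min, {per_hour:.0f} entities/hour")
```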
Caveats:
- This was against OpenRouter's free tier, which applies implicit
rate-limiting and occasional retries.
- Throughput is not constant — gaps between bursts show up as plateaus
when you plot the timestamps.
- The batch was not fully parallelised; a tuned concurrent client could
likely 2–4× this throughput on a paid OpenRouter tier.
---
## Tokens per entity (estimate)
Direct token counts are not logged in the evaluation files, but the
inputs and outputs are on disk:
- **Input per request**: evaluation schema (~3.7 KB) + entity file
(~0.7 KB median) + fixed system prompt ≈ **~1500–2500 tokens in**
- **Output per request**: structured evaluation with 5 dimensions and
rationales, median eval file 3.6 KB ≈ **~600–800 tokens out**
- **Round-trip total**: **~2000–3000 tokens per entity**
- **Batch total estimate**: 985 entities × ~2500 tokens ≈ **~2.5M tokens**
for the full pass
The constant per-entity input means the cheapest way to reduce spend on a
re-run is to narrow the targeted entities (`--entity <slug>` or
`--chapter <n>`), not to shorten the schema.
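
The batch estimate is simple arithmetic over the per-entity range. A sketch using the rounded round-trip figures above, with illustrative names:

```python
ENTITIES = 985
ROUND_TRIP_TOKENS = (2000, 3000)  # per-entity estimate from the list above

midpoint = sum(ROUND_TRIP_TOKENS) // 2
low, high = (ENTITIES * tokens for tokens in ROUND_TRIP_TOKENS)
estimate = ENTITIES * midpoint

print(f"range: {low:,} to {high:,} tokens")
print(f"midpoint estimate: {estimate:,} tokens")
```

The midpoint works out to roughly 2.46M tokens, which the notes round to ~2.5M.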
---
## Embedding cache and collection checks
`markitect infospace check --concern redundancy` supports two similarity
backends (see `markitect/infospace/checks/redundancy.py`):
- **`word_overlap`** — the default, used when no embeddings are provided.
Pure-Python set intersection over tokenised entity text. **No LLM calls,
no cache needed.** This is what the current WoN check runs.
- **`embedding`** — active when a pre-computed `{slug: vector}` mapping is
passed in. No persistent on-disk embedding cache exists today; the
caller is responsible for computing and supplying the vectors.
Implication: the 988-entity `check` runs in seconds because it's all
word-overlap. Switching to embedding similarity would add an embedding
API pass (another ~988 requests) which is currently a manual step
outside the CLI.
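
For intuition, a word-overlap similarity can be sketched as a Jaccard ratio over token sets. This is an assumption about the general technique only: the tokenisation and the exact ratio used in `markitect/infospace/checks/redundancy.py` may differ, and the function names here are hypothetical.

```python
import re


def tokenise(text: str) -> set:
    """Lower-case the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(text_a: str, text_b: str) -> float:
    """Jaccard similarity of the two token sets (assumed formula)."""
    tokens_a, tokens_b = tokenise(text_a), tokenise(text_b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(word_overlap("Bank money", "bank notes"))
print(word_overlap("bank notes", "Bank Notes"))
```

Set operations like these need no network access, which is consistent with the check finishing in seconds.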
---
## Provider choice — recommendation
For the WoN dataset specifically (text-heavy entities, 5-dimension
rubric):
| Scale | Recommended provider | Rationale |
|-----------------------|----------------------------------|-----------|
| < 50 entities | `gemini/gemini-2.5-flash` | Fast default; free tier is generous enough; consistent with `markitect llm-check` out of the box. |
| 50–1000 entities | `openrouter` with a `:free` model (e.g. `arcee-ai/trinity-large-preview:free`) | What the S3.3 batch used; gets through 988 entities in one overnight run without cost. |
| > 1000 entities | `openrouter` with a paid small-context model, or `openai` | Free-tier rate limits start to dominate wall time; paying for higher concurrency is cheaper than calendar time. |
All providers are accepted by `markitect infospace evaluate --provider`.
The evaluation schema doesn't assume any provider-specific features.
Note on provider mixing: if part of a collection is evaluated under one
provider/model and the rest under another, `per_entity_mean` can drift
slightly (different models calibrate scores differently). For the
viability threshold of 3.5 the drift is usually negligible, but for
fine-grained outlier analysis prefer a single provider per batch.
---
## What is *not* measured here
- **End-to-end pipeline time** (entity extraction from raw chapters,
classification, relation graph) — only the evaluation phase is timed.
- **Memory footprint** — the full in-memory state for 988 entities is
small (< 200 MB observed), but not systematically measured.
- **Failure/retry rates** — the 985 vs 988 gap is three entities the
original run missed (plus one added later); no structured retry log
was kept.
Expanding any of these into a proper benchmark is **out of scope** for
the WoN example and should live alongside a synthetic corpus that can be
regenerated deterministically.

View File

@@ -0,0 +1,28 @@
---
entity_slug: advanced_state_of_society
evaluator: gemini-2.5-flash
evaluated_at: '2026-04-21T21:32:17.135192'
overall_score: 4.5
scores:
- name: definition_precision
value: 4.0
max_value: 5.0
rationale: The definition is precise, listing key characteristics like accumulated
stock and private property. It clearly distinguishes the concept by contrasting
it with earlier economic conditions.
- name: source_grounding
value: 5.0
max_value: 5.0
rationale: This entity is deeply grounded in Smith's work, particularly in Book
I
---
# Evaluation: Advanced State Of Society
## definition_precision — 4.0 / 5.0
The definition is precise, listing key characteristics like accumulated stock and private property. It clearly distinguishes the concept by contrasting it with earlier economic conditions.
## source_grounding — 5.0 / 5.0
This entity is deeply grounded in Smith's work, particularly in Book I

View File

@@ -0,0 +1,61 @@
---
entity_slug: bank_notes
evaluator: null
evaluated_at: '2026-04-21T21:33:16.736926'
overall_score: 4.4
scores:
- name: definition_precision
value: 5.0
max_value: 5.0
rationale: The definition is precise, clearly distinguishing bank notes by their
issuer, form, and key characteristics (payable on demand, confidence-based). It
avoids circularity and captures a distinct concept.
- name: source_grounding
value: 5.0
max_value: 5.0
rationale: The entity is excellently grounded in "The Wealth of Nations," specifically
Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing
precious metals and their reliance on public confidence.
- name: domain_placement
value: 4.0
max_value: 5.0
rationale: '"Exchange" is an appropriate domain as bank notes primarily function
as a medium for facilitating transactions. While "Money" or "Finance" could also
fit, "Exchange" accurately reflects their operational role in the economy.'
- name: vsm_relevance
value: 3.0
max_value: 5.0
rationale: Bank notes are a critical *medium* or *tool* that enables the primary
operations (S1) of an economy (i.e., exchange of goods and services). However,
they are not a VSM system or management function themselves, making their direct
mapping somewhat abstract.
- name: explanatory_value
value: 5.0
max_value: 5.0
rationale: This entity offers significant explanatory power by detailing how paper
money functions, its reliance on confidence, and its role in reducing the need
for precious metals, thereby illuminating a key mechanism in Smith's economic
theory.
---
# Evaluation: Bank Notes
## definition_precision — 5.0 / 5.0
The definition is precise, clearly distinguishing bank notes by their issuer, form, and key characteristics (payable on demand, confidence-based). It avoids circularity and captures a distinct concept.
## source_grounding — 5.0 / 5.0
The entity is excellently grounded in "The Wealth of Nations," specifically Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing precious metals and their reliance on public confidence.
## domain_placement — 4.0 / 5.0
"Exchange" is an appropriate domain as bank notes primarily function as a medium for facilitating transactions. While "Money" or "Finance" could also fit, "Exchange" accurately reflects their operational role in the economy.
## vsm_relevance — 3.0 / 5.0
Bank notes are a critical *medium* or *tool* that enables the primary operations (S1) of an economy (i.e., exchange of goods and services). However, they are not a VSM system or management function themselves, making their direct mapping somewhat abstract.
## explanatory_value — 5.0 / 5.0
This entity offers significant explanatory power by detailing how paper money functions, its reliance on confidence, and its role in reducing the need for precious metals, thereby illuminating a key mechanism in Smith's economic theory.

@@ -0,0 +1,60 @@
---
entity_slug: bank_systemic_risk_management
evaluator: gemini-2.5-flash-lite
evaluated_at: '2026-04-21T21:49:35.222637'
overall_score: 4.0
scores:
- name: definition_precision
value: 4.0
max_value: 5.0
rationale: The definition is precise and clearly outlines the purpose of bank systemic
risk management. It avoids being an overly broad umbrella term.
- name: source_grounding
value: 3.0
max_value: 5.0
rationale: While the concept of managing risks to the banking system is present
in Book II, Chapter 2, the explicit framing of "systemic risk management" as a
distinct entity with specific practices might be a slight abstraction beyond Smith's
direct terminology.
- name: domain_placement
value: 5.0
max_value: 5.0
rationale: The "Regulation" domain is highly appropriate. Managing systemic risk
is fundamentally a regulatory concern aimed at ensuring the stability of the financial
system.
- name: vsm_relevance
value: 4.0
max_value: 5.0
rationale: This entity strongly maps to VSM System 3 (Internal Regulation/Audit)
as it involves monitoring and controlling internal operations to prevent systemic
failures. It also has elements of System 5 (Policy) in setting overall stability
goals.
- name: explanatory_value
value: 4.0
max_value: 5.0
rationale: The entity provides good explanatory value by highlighting a crucial
mechanism for maintaining financial stability. It explains *how* the banking system
can be protected from cascading failures.
---
# Evaluation: Bank Systemic Risk Management
## definition_precision — 4.0 / 5.0
The definition is precise and clearly outlines the purpose of bank systemic risk management. It avoids being an overly broad umbrella term.
## source_grounding — 3.0 / 5.0
While the concept of managing risks to the banking system is present in Book II, Chapter 2, the explicit framing of "systemic risk management" as a distinct entity with specific practices might be a slight abstraction beyond Smith's direct terminology.
## domain_placement — 5.0 / 5.0
The "Regulation" domain is highly appropriate. Managing systemic risk is fundamentally a regulatory concern aimed at ensuring the stability of the financial system.
## vsm_relevance — 4.0 / 5.0
This entity strongly maps to VSM System 3 (Internal Regulation/Audit) as it involves monitoring and controlling internal operations to prevent systemic failures. It also has elements of System 5 (Policy) in setting overall stability goals.
## explanatory_value — 4.0 / 5.0
The entity provides good explanatory value by highlighting a crucial mechanism for maintaining financial stability. It explains *how* the banking system can be protected from cascading failures.
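
The `overall_score` of 4.0 in the front matter above matches the unweighted mean of the five criterion values (4, 3, 5, 4, 4). A minimal sketch of that check follows, assuming the overall score is a plain arithmetic mean. The aggregation rule is not stated in this diff, and `overall_from_scores` is a hypothetical helper, not a markitect function.

```python
from statistics import mean

# Criterion values copied from the bank_systemic_risk_management front matter.
scores = {
    "definition_precision": 4.0,
    "source_grounding": 3.0,
    "domain_placement": 5.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 4.0,
}


def overall_from_scores(criteria: dict[str, float]) -> float:
    """Hypothetical aggregation: unweighted mean of all criterion values."""
    return round(mean(criteria.values()), 2)


overall = overall_from_scores(scores)
print(overall)
```

If the evaluator weights criteria differently, the recorded `overall_score` would diverge from this value, which makes the comparison a cheap consistency check on a single evaluation file.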

@@ -3,7 +3,7 @@ consistency_cycles: 0.0
coverage_ratio: 0.619048
granularity_entropy: 2.674752
modularity: 0.0
-per_entity_mean: 3.955635
+per_entity_mean: 3.95668
redundancy_ratio: 0.006073
type_distribution:
Element: 315
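
The change in `per_entity_mean` can be sanity-checked against the counts given in the commit message (985 evaluations before, 988 after). The sketch below assumes `per_entity_mean` is the unweighted mean of every evaluation's `overall_score`, which this diff does not confirm. The variable names are illustrative.

```python
# Values taken from the hunk above and from the commit message.
old_mean, old_count = 3.955635, 985
new_mean, new_count = 3.95668, 988

# If the metric is a plain mean, the totals before and after differ by
# exactly the sum of the newly added overall scores.
implied_sum = new_mean * new_count - old_mean * old_count
implied_avg = implied_sum / (new_count - old_count)

print(f"implied sum of new scores: {implied_sum:.3f}")
print(f"implied mean of new scores: {implied_avg:.3f}")
```

Under that assumption the three new evaluations sum to roughly 12.9, or about 4.3 each on average, which is compatible with the 4.0 recorded for `bank_systemic_risk_management`. Both means are rounded to six decimals, so the implied sum is only accurate to about ±0.001.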

package-lock.json generated

@@ -1,11 +1,11 @@
{
-"name": "markitect_project",
+"name": "markitect-main",
"version": "1.0.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
-"name": "markitect_project",
+"name": "markitect-main",
"version": "1.0.0",
"license": "ISC",
"dependencies": {

@@ -1,5 +1,5 @@
{
-"name": "markitect_project",
+"name": "markitect-main",
"version": "1.0.0",
"description": "",
"main": "index.js",
@@ -14,7 +14,7 @@
},
"repository": {
"type": "git",
-"url": "http://92.205.130.254:32166/coulomb/markitect_project"
+"url": "http://92.205.130.254:32166/coulomb/markitect-main"
},
"keywords": [],
"author": "",

@@ -39,7 +39,7 @@ Confirm the main Markitect application still works correctly with the current
capability code before publishing.
```bash
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
make testdrive-jsui-test-all # 84 tests must pass
# Manually verify view and edit modes in the running Markitect app
```

@@ -30,7 +30,7 @@ class TestActualRoundtripBehavior:
cmd = ["python", "-m", "markitect.cli"] + args
result = subprocess.run(
cmd,
-cwd="/home/worsch/markitect_project",
+cwd="/home/worsch/markitect-main",
capture_output=True,
text=True
)

@@ -5,7 +5,7 @@ This test implements the requirements for initializing a SQLite database
and storing markdown files with front matter parsing.
Issue #1: Initialize Database and Store Example Markdown File
-https://gitea.coulomb.social/coulomb/markitect_project/issues/1
+https://gitea.coulomb.social/coulomb/markitect-main/issues/1
"""
import pytest

@@ -33,7 +33,7 @@ class TestRoundtripBase:
cmd,
capture_output=True,
text=True,
-cwd="/home/worsch/markitect_project"
+cwd="/home/worsch/markitect-main"
)
def validate_basic_structure_preservation(self, original: str, reconstructed: str) -> Dict[str, Any]: