markitect-tool Capacity Risk Sentinels

Date: 2026-05-05

Status: opt-in bottleneck tests for the kontextual-engine to markitect-tool integration boundary.

Purpose

The example-backed contract tests prove that the Markitect interface behaves correctly for representative documents. Capacity sentinels add one more layer: they exercise larger generated examples so we can notice algorithmic trouble before engine workplans depend on the interface.

These tests are not microbenchmarks. They are deliberately coarse, generous, and opt-in. A failure should trigger investigation, profiling, or an upstream markitect-tool improvement before the engine builds more assumptions on top.

Suspected Bottleneck Areas

Area	Risk	Sentinel
Large Markdown parsing	Section-heavy documents may create many headings, blocks, tokens, and sections.	Parse a generated document with hundreds of sections and verify document shape under a generous wall-clock budget.
Selector extraction	Repeated selectors over a large document can scale as `queries x document-size`.	Run multiple heading, section, frontmatter, and block selectors over one parsed large document.
Include resolution and composition	Fan-out includes with selectors may repeatedly parse included files and expand output size.	Resolve a generated include fan-out bundle and compose many Markdown files.
Context package creation	Packing many source files can parse and query each file, then filter by policy.	Create and activate a context package from many generated public/internal Markdown sources.
Snapshot identity	Hashing many or larger files should remain predictable and content-addressed.	Generate many Markdown files and compute stable snapshot identities.
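
As an illustration of the first row, a coarse sentinel can be sketched as a generated document, a shape check, and a generous wall-clock budget. This is a minimal sketch and not the code in tests/test_markitect_tool_capacity.py: `generate_section_heavy_markdown`, `count_headings`, `run_sentinel`, the section count, and the budget are all hypothetical, and `count_headings` is only a stand-in for the real markitect-tool parse call.

```python
import time


def generate_section_heavy_markdown(section_count: int) -> str:
    """Build a synthetic document with one heading and one paragraph per section."""
    parts = ["# Generated Decisions", ""]
    for index in range(section_count):
        parts.append(f"## Decision {index}")
        parts.append("")
        parts.append(f"Body text for decision {index}.")
        parts.append("")
    return "\n".join(parts)


def count_headings(markdown: str) -> int:
    """Stand-in for a real parser (assumption): counts ATX heading lines only."""
    return sum(1 for line in markdown.splitlines() if line.startswith("#"))


def run_sentinel(section_count: int = 500, budget_seconds: float = 5.0) -> float:
    """Parse a generated document, check its shape, and enforce a coarse budget."""
    document = generate_section_heavy_markdown(section_count)

    started = time.perf_counter()
    heading_count = count_headings(document)
    elapsed = time.perf_counter() - started

    # Shape check: one title heading plus one heading per generated section.
    assert heading_count == section_count + 1

    # Budget check: deliberately generous, so only gross regressions fail.
    assert elapsed < budget_seconds, f"parse took {elapsed:.3f}s"
    return elapsed
```

The shape assertion comes before the timing assertion so that a wrong result is never reported as a slow one.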

Running The Sentinels

Normal test runs skip these tests. Run them against the sibling markitect-tool checkout with:

KONTEXTUAL_RUN_CAPACITY=1 \
PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
  python3 -m pytest tests/test_markitect_tool_capacity.py -q

Run all Markitect interface checks with:

KONTEXTUAL_RUN_CAPACITY=1 \
PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
  python3 -m pytest -m "markitect_tool" -q
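
The opt-in behavior behind these commands can be sketched as a small environment check. This is a sketch under assumptions: the helper name `capacity_enabled` is hypothetical, and treating only the value `1` as "enabled" is a guess based on the commands above, not a documented rule.

```python
import os
from typing import Mapping, Optional


def capacity_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the capacity sentinels were explicitly requested.

    Assumption: only the exact value "1" opts in; the real tests may
    accept other values.
    """
    env = os.environ if environ is None else environ
    return env.get("KONTEXTUAL_RUN_CAPACITY", "") == "1"
```

In a test module, a check like this would typically feed `pytest.mark.skipif(not capacity_enabled(), reason="set KONTEXTUAL_RUN_CAPACITY=1 to run")`, so that normal runs report the sentinels as skipped rather than silently omitting them.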

Interpretation

Passing sentinels mean the current integration boundary is healthy enough for the planned engine work.
Failing sentinels should be treated as interface risk, not as proof of engine failure.
If a sentinel is too noisy, prefer improving its generated scenario or threshold over deleting it.
If a real use case exceeds the current generated sizes, add a new sentinel before relying on the behavior in an engine workplan.

Current Generated Sizes

The tests currently generate:

one section-heavy document with hundreds of decision sections,
dozens of repeated selector queries over a large parsed document,
a fan-out include bundle over many partial files,
a context package over many public/internal source files,
and many snapshot identities over generated Markdown files.

The generated data lives in temporary pytest directories so the repository does not carry bulky synthetic corpora.
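
A minimal sketch of that pattern, applied to the snapshot-identity sentinel, might look as follows. The hashing scheme here (SHA-256 over sorted relative paths and file bytes) is an assumption chosen for illustration and is not markitect-tool's snapshot algorithm; the function names are hypothetical. The sketch uses the standard-library `tempfile` module so it runs standalone, where a pytest test would more likely use the `tmp_path` fixture.

```python
import hashlib
import tempfile
from pathlib import Path


def write_generated_sources(root: Path, file_count: int) -> None:
    """Write small synthetic Markdown files into a throwaway directory."""
    for index in range(file_count):
        body = f"# Source {index}\n\nGenerated body {index}.\n"
        (root / f"source_{index:04d}.md").write_text(body, encoding="utf-8")


def snapshot_identity(root: Path) -> str:
    """Hypothetical content-addressed identity over all Markdown files in root."""
    digest = hashlib.sha256()
    # Sort so the identity does not depend on directory listing order.
    for path in sorted(root.rglob("*.md")):
        data = path.read_bytes()
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Length prefix keeps adjacent files from blurring into each other.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def demo(file_count: int = 50):
    """Return identities for two identical corpora and one modified corpus."""
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        first_root, second_root = Path(first), Path(second)
        write_generated_sources(first_root, file_count)
        write_generated_sources(second_root, file_count)
        same_a = snapshot_identity(first_root)
        same_b = snapshot_identity(second_root)
        # Change one file: a content-addressed identity should change too.
        (second_root / "source_0000.md").write_text("# Changed\n", encoding="utf-8")
        changed = snapshot_identity(second_root)
    return same_a, same_b, changed
```

Both temporary directories are removed when the `with` block exits, so nothing generated is left behind in the repository.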

Initial Local Baseline

On 2026-05-05, all sentinels passed when run against /home/worsch/markitect-tool/src in the local WSL workspace. The slowest observed sentinel was repeated selector queries over a large parsed document, followed by the large parse/query sentinel and context-package creation. This suggests selector extraction is the first area to watch as engine retrieval workloads grow.
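
To make the `queries x document-size` concern concrete, the toy comparison below counts how many lines are visited when each query rescans the document versus when headings are indexed once. It says nothing about how markitect-tool actually implements selectors; all names and sizes are hypothetical, and the "cost" is a simple visit count rather than a timing.

```python
def build_lines(section_count):
    """Generate a flat list of lines: one heading and one body line per section."""
    lines = []
    for index in range(section_count):
        lines.append(f"## Section {index}")
        lines.append(f"Body {index}")
    return lines


def select_by_scan(lines, title):
    """Linear scan per query; returns (line_number, lines_visited)."""
    target = f"## {title}"
    for visited, line in enumerate(lines, start=1):
        if line == target:
            return visited - 1, visited
    return None, len(lines)


def build_heading_index(lines):
    """One pass over the document; maps heading title to line number."""
    return {
        line[3:]: number
        for number, line in enumerate(lines)
        if line.startswith("## ")
    }


def compare(section_count=1000, query_count=40):
    """Return (results_match, scan_cost, index_cost) in visited-line units."""
    lines = build_lines(section_count)
    # Query headings near the end of the document, the worst case for a scan.
    titles = [f"Section {section_count - 1 - i}" for i in range(query_count)]

    scan_hits, scan_cost = [], 0
    for title in titles:
        hit, visited = select_by_scan(lines, title)
        scan_hits.append(hit)
        scan_cost += visited

    index = build_heading_index(lines)
    index_hits = [index.get(title) for title in titles]
    # One full pass to build the index, then one lookup per query.
    index_cost = len(lines) + len(titles)

    return scan_hits == index_hits, scan_cost, index_cost
```

If profiling ever shows the scan-style growth in the real selector path, that would be a candidate for the upstream markitect-tool improvement mentioned under Purpose.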

The baseline is observational, not a committed performance guarantee. The budgets in tests/test_markitect_tool_capacity.py are intentionally wider than the observed timings to avoid false failures from normal workstation variance.
