generated from coulomb/repo-seed
Contract robustness and bottleneck test
This commit is contained in:
82
docs/markitect-tool-capacity-risks.md
Normal file
@@ -0,0 +1,82 @@
# markitect-tool Capacity Risk Sentinels

Date: 2026-05-05

Status: opt-in bottleneck tests for the `kontextual-engine` to
`markitect-tool` integration boundary.

## Purpose

The example-backed contract tests prove that the Markitect interface behaves
correctly for representative documents. Capacity sentinels add one more layer:
they exercise larger generated examples so we can notice algorithmic trouble
before engine workplans depend on the interface.

These tests are not microbenchmarks. They are deliberately coarse, generous,
and opt-in. A failure should trigger investigation, profiling, or an upstream
`markitect-tool` improvement before the engine builds more assumptions on top.
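
As an illustration of what "coarse and generous" can mean in practice, the following is a minimal sketch of a wall-clock budget check. The helper `wall_clock_budget` and the stand-in workload `build_large_input` are hypothetical names for this example and are not taken from the test suite; the 5-second budget is likewise only a placeholder.

```python
import time
from contextlib import contextmanager


@contextmanager
def wall_clock_budget(label, budget_seconds):
    """Raise AssertionError if the wrapped block runs longer than the budget.

    Hypothetical helper: the real sentinels may measure time differently.
    """
    started = time.perf_counter()
    yield
    elapsed = time.perf_counter() - started
    if elapsed > budget_seconds:
        raise AssertionError(
            f"{label} took {elapsed:.2f}s, budget was {budget_seconds:.2f}s"
        )


def build_large_input(size):
    # Stand-in for the work a sentinel would time (assumption, not the real API).
    return [f"## Section {index}" for index in range(size)]


# The budget is far wider than the expected runtime, so only a gross
# regression (for example an accidental quadratic loop) would trip it.
with wall_clock_budget("build input", budget_seconds=5.0):
    headings = build_large_input(10_000)
```

A check of this shape says nothing about small regressions, which matches the stated intent: it flags algorithmic trouble rather than measuring performance.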

## Suspected Bottleneck Areas

| Area | Risk | Sentinel |
| --- | --- | --- |
| Large Markdown parsing | Section-heavy documents may create many headings, blocks, tokens, and sections. | Parse a generated document with hundreds of sections and verify document shape under a generous wall-clock budget. |
| Selector extraction | Repeated selectors over large documents can become `queries x document-size`. | Run multiple heading, section, frontmatter, and block selectors over one parsed large document. |
| Include resolution and composition | Fan-out includes with selectors may repeatedly parse included files and expand output size. | Resolve a generated include fan-out bundle and compose many Markdown files. |
| Context package creation | Packing many source files can parse and query each file, then filter by policy. | Create and activate a context package from many generated public/internal Markdown sources. |
| Snapshot identity | Hashing many or larger files should remain predictable and content-addressed. | Generate many Markdown files and compute stable snapshot identities. |
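
To make the `queries x document-size` risk concrete, here is a rough sketch that generates a section-heavy document and runs a naive heading selector over it repeatedly. Both functions are hypothetical stand-ins written in plain Python; they do not use or represent the `markitect-tool` API, and the sizes (300 sections, 30 queries) are illustrative assumptions.

```python
def generate_section_heavy_markdown(section_count):
    """Build a synthetic document: one title plus `section_count` sections."""
    lines = ["# Generated Decisions", ""]
    for index in range(section_count):
        lines.append(f"## Decision {index}")
        lines.append("")
        lines.append(f"Body text for decision {index}.")
        lines.append("")
    return "\n".join(lines)


def select_headings(lines, title):
    """Naive selector stand-in: scans every line of the document per query."""
    return [pos for pos, line in enumerate(lines) if line == f"## {title}"]


document = generate_section_heavy_markdown(300)
lines = document.splitlines()

# Thirty queries, each of which walks the whole document again.
queries = [f"Decision {index}" for index in range(0, 300, 10)]
matches = {query: select_headings(lines, query) for query in queries}

# Total work grows as the product of query count and document length.
lines_visited = len(queries) * len(lines)
```

If the real selectors behave like this stand-in, doubling either the number of queries or the document size doubles the work, which is the pattern the selector sentinel is meant to surface.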

## Running The Sentinels

Normal test runs skip these tests. Run them against the sibling
`markitect-tool` checkout with:

```bash
KONTEXTUAL_RUN_CAPACITY=1 \
PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
python3 -m pytest tests/test_markitect_tool_capacity.py -q
```

Run all Markitect interface checks with:

```bash
KONTEXTUAL_RUN_CAPACITY=1 \
PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
python3 -m pytest -m "markitect_tool" -q
```
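
The opt-in behaviour implied by `KONTEXTUAL_RUN_CAPACITY=1` could be implemented along the following lines. This is a sketch under assumptions: the helper name `capacity_enabled` is hypothetical, and treating only the exact value `"1"` as enabled is a guess based on the commands above, not a documented rule of the test suite.

```python
import os

CAPACITY_ENV_VAR = "KONTEXTUAL_RUN_CAPACITY"


def capacity_enabled(environ=None):
    """Return True only when the opt-in variable is set to "1".

    Assumption: the real tests may accept other truthy values.
    """
    env = os.environ if environ is None else environ
    return env.get(CAPACITY_ENV_VAR) == "1"


# Passing an explicit mapping keeps the check easy to exercise in isolation.
default_run = capacity_enabled({})
opted_in = capacity_enabled({CAPACITY_ENV_VAR: "1"})
```

In a pytest module, a result like this could feed `pytest.mark.skipif(not capacity_enabled(), reason="capacity sentinels are opt-in")`, so that normal runs report the sentinels as skipped rather than silently omitting them.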

## Interpretation

- Passing sentinels mean the current integration boundary is healthy enough for
  the planned engine work.
- Failing sentinels should be treated as interface risk, not as proof of engine
  failure.
- If a sentinel is too noisy, prefer improving its generated scenario or
  threshold over deleting it.
- If a real use case exceeds the current generated sizes, add a new sentinel
  before relying on the behavior in an engine workplan.

## Current Generated Sizes

The tests currently generate:

- one section-heavy document with hundreds of decision sections,
- dozens of repeated selector queries over a large parsed document,
- a fan-out include bundle over many partial files,
- a context package over many public/internal source files,
- many snapshot identities over generated Markdown files.

The generated data lives in temporary pytest directories so the repository
does not carry bulky synthetic corpora.
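
The following sketch shows one way generated sources and a content-addressed identity might fit together. It uses `tempfile.TemporaryDirectory` as a stand-in for pytest's temporary directories, and the functions `write_generated_sources` and `snapshot_identity` are hypothetical: the hashing scheme (SHA-256 over sorted relative paths and file bytes) is an assumption for illustration, not the scheme `markitect-tool` uses.

```python
import hashlib
import tempfile
from pathlib import Path


def write_generated_sources(root, file_count):
    """Write small synthetic Markdown files into `root`."""
    for index in range(file_count):
        (root / f"source_{index:03d}.md").write_text(
            f"# Source {index}\n\nGenerated body {index}.\n", encoding="utf-8"
        )


def snapshot_identity(root):
    """Hash relative paths and file bytes in sorted order.

    Sorting makes the result independent of directory listing order.
    """
    digest = hashlib.sha256()
    for path in sorted(root.rglob("*.md")):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_generated_sources(root, file_count=50)
    first = snapshot_identity(root)
    second = snapshot_identity(root)  # unchanged content, same identity
    (root / "source_000.md").write_text("# Changed\n", encoding="utf-8")
    changed = snapshot_identity(root)  # edited content, different identity
```

Because everything is written under a temporary directory, nothing is left behind once the block exits, which mirrors the goal of keeping synthetic corpora out of the repository.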

## Initial Local Baseline

On 2026-05-05, running against `/home/worsch/markitect-tool/src` on the local
WSL workspace, all sentinels passed. The slowest observed sentinel was repeated
selector queries over a large parsed document, followed by large parse/query
and context-package creation. This suggests selectors are the first area to
watch as engine retrieval workloads grow.

The baseline is observational, not a committed performance guarantee. The
budgets in `tests/test_markitect_tool_capacity.py` are intentionally wider than
the observed timings to avoid false failures from normal workstation variance.