generated from coulomb/repo-seed
Contract robustness and bottleneck test
@@ -17,6 +17,8 @@ Start here:
 - `docs/markitect-main-scope-assessment.md`
 - `docs/markitect-tool-reuse-boundary.md`
 - `docs/markitect-tool-integration-usecases.md`
+- `docs/markitect-tool-capacity-risks.md`
+- `examples/markitect-tool-contract/`
 - `docs/phase-memory-boundary.md`
 - `docs/system-layer-extraction-inventory.md`
 - `docs/system-layer-migration-backlog.md`
82
docs/markitect-tool-capacity-risks.md
Normal file
@@ -0,0 +1,82 @@
+# markitect-tool Capacity Risk Sentinels
+
+Date: 2026-05-05
+
+Status: opt-in bottleneck tests for the `kontextual-engine` to
+`markitect-tool` integration boundary.
+
+## Purpose
+
+The example-backed contract tests prove that the Markitect interface behaves
+correctly for representative documents. Capacity sentinels add one more layer:
+they exercise larger generated examples so we can notice algorithmic trouble
+before engine workplans depend on the interface.
+
+These tests are not microbenchmarks. They are deliberately coarse, generous,
+and opt-in. A failure should trigger investigation, profiling, or an upstream
+`markitect-tool` improvement before the engine builds more assumptions on top.
+
+## Suspected Bottleneck Areas
+
+| Area | Risk | Sentinel |
+| --- | --- | --- |
+| Large Markdown parsing | Section-heavy documents may create many headings, blocks, tokens, and sections. | Parse a generated document with hundreds of sections and verify document shape under a generous wall-clock budget. |
+| Selector extraction | Repeated selectors over large documents can become `queries x document-size`. | Run multiple heading, section, frontmatter, and block selectors over one parsed large document. |
+| Include resolution and composition | Fan-out includes with selectors may repeatedly parse included files and expand output size. | Resolve a generated include fan-out bundle and compose many Markdown files. |
+| Context package creation | Packing many source files can parse and query each file, then filter by policy. | Create and activate a context package from many generated public/internal Markdown sources. |
+| Snapshot identity | Hashing many or larger files should remain predictable and content-addressed. | Generate many Markdown files and compute stable snapshot identities. |
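The `queries x document-size` risk named in the table can be illustrated without `markitect-tool` at all. The sketch below is a hypothetical model, not Markitect's selector engine: `scan_for_heading` and `build_heading_index` are names invented here, and a "document" is reduced to a list of `(heading, body)` pairs. It only shows why repeated linear scans grow with both the number of queries and the number of sections, while a one-time index does not.

```python
from collections import defaultdict

# Hypothetical stand-in for a parsed document: (heading, body) pairs.
Section = tuple[str, str]


def make_sections(count: int) -> list[Section]:
    """Generate synthetic sections shaped like the capacity fixture."""
    return [(f"Decision {i}", f"CAPACITY-MARKER-{i}") for i in range(count)]


def scan_for_heading(sections: list[Section], heading: str) -> list[Section]:
    """Linear scan: every query compares against every section."""
    return [section for section in sections if section[0] == heading]


def build_heading_index(sections: list[Section]) -> dict[str, list[Section]]:
    """One pass over the document; later lookups avoid rescanning it."""
    index: dict[str, list[Section]] = defaultdict(list)
    for section in sections:
        index[section[0]].append(section)
    return index


sections = make_sections(420)
headings = [f"Decision {i}" for i in range(0, 420, 6)]  # 70 queries

scanned = [scan_for_heading(sections, heading) for heading in headings]
index = build_heading_index(sections)
indexed = [index.get(heading, []) for heading in headings]

# Rough comparison counts for the two strategies.
comparisons_scan = len(headings) * len(sections)   # queries x document size
comparisons_index = len(sections) + len(headings)  # one build pass plus lookups
```

Whether the real selector engine scans or indexes is exactly what the sentinel is meant to surface; this model makes no claim either way.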
+
+## Running The Sentinels
+
+Normal test runs skip these tests. Run them against the sibling
+`markitect-tool` checkout with:
+
+```bash
+KONTEXTUAL_RUN_CAPACITY=1 \
+PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
+python3 -m pytest tests/test_markitect_tool_capacity.py -q
+```
+
+Run all Markitect interface checks with:
+
+```bash
+KONTEXTUAL_RUN_CAPACITY=1 \
+PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
+python3 -m pytest -m "markitect_tool" -q
+```
+
+## Interpretation
+
+- Passing sentinels mean the current integration boundary is healthy enough for
+  the planned engine work.
+- Failing sentinels should be treated as interface risk, not as proof of engine
+  failure.
+- If a sentinel is too noisy, prefer improving its generated scenario or
+  threshold over deleting it.
+- If a real use case exceeds the current generated sizes, add a new sentinel
+  before relying on the behavior in an engine workplan.
+
+## Current Generated Sizes
+
+The tests currently generate:
+
+- one section-heavy document with 650 decision sections,
+- 72 selector queries (6 selectors, 12 passes) over a parsed 420-section document,
+- a fan-out include bundle over 90 partial files,
+- a context package over 140 source files, alternating public and internal,
+- 120 snapshot identities over generated Markdown files.
+
+The generated data lives in temporary pytest directories so the repository
+does not carry bulky synthetic corpora.
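As a rough illustration of the last list item and of keeping generated data in a temporary directory, here is a minimal sketch using only the standard library. It is not `mkt.snapshot_identity_for_file`: the helper `content_hash` is hypothetical, it hashes raw file bytes only, and the `sha256:` prefix is borrowed from the assertion in `tests/test_markitect_tool_capacity.py`. The real identity may also depend on inputs such as parse options.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path) -> str:
    """Hypothetical content-addressed identifier: same bytes, same hash."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return f"sha256:{digest}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    paths = []
    for index in range(120):
        path = root / f"snapshot-{index:03}.md"
        path.write_text(f"# Capacity Source {index}\n", encoding="utf-8")
        paths.append(path)

    # A file with identical bytes under a different name.
    twin = root / "twin.md"
    twin.write_bytes(paths[0].read_bytes())

    hashes = [content_hash(path) for path in paths]
    twin_hash = content_hash(twin)
# The temporary directory and all generated files are removed here.
```

Distinct contents yield distinct identifiers, and the renamed copy hashes to the same value as its source, which is the property "content-addressed" refers to.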
+
+## Initial Local Baseline
+
+On 2026-05-05, running against `/home/worsch/markitect-tool/src` on the local
+WSL workspace, all sentinels passed. The slowest observed sentinel was repeated
+selector queries over a large parsed document, followed by large parse/query
+and context-package creation. This suggests selectors are the first area to
+watch as engine retrieval workloads grow.
+
+The baseline is observational, not a committed performance guarantee. The
+budgets in `tests/test_markitect_tool_capacity.py` are intentionally wider than
+the observed timings to avoid false failures from normal workstation variance.
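One way to think about "wider than the observed timings" is a headroom rule. The sketch below is an assumption for illustration only: `budget_from_baseline`, the factor of 10, and the one-second floor are invented here and are not how the budgets in the test file were chosen.

```python
def budget_from_baseline(
    observed_seconds: float, *, headroom: float = 10.0, floor: float = 1.0
) -> float:
    """Hypothetical rule: multiply the observed time by a headroom factor,
    but never go below a floor that absorbs scheduler and disk noise."""
    if observed_seconds < 0:
        raise ValueError("observed_seconds must be non-negative")
    return max(floor, observed_seconds * headroom)


def within_budget(elapsed: float, budget: float) -> bool:
    """Coarse pass/fail check, matching the spirit of a sentinel."""
    return elapsed <= budget


# An operation observed at 0.5 s gets a 5 s budget under these assumptions.
selector_budget = budget_from_baseline(0.5)
# A very fast operation still gets the floor.
tiny_budget = budget_from_baseline(0.01)
```

A rule like this keeps the sentinel sensitive to order-of-magnitude regressions while ignoring ordinary workstation variance.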
@@ -14,7 +14,11 @@ Instead, it should wrap them as adapters and persist engine-owned assets,
 lineage, policy decisions, audit events, and service contracts around them.
 
 The executable companion for this document is
-`tests/test_markitect_tool_contract.py`.
+`tests/test_markitect_tool_contract.py`. The reusable fixture corpus lives in
+`examples/markitect-tool-contract/`.
+Opt-in bottleneck sentinels are described in
+`docs/markitect-tool-capacity-risks.md` and implemented in
+`tests/test_markitect_tool_capacity.py`.
 
 ## Expected Dependency Shape
 
@@ -26,6 +30,22 @@ The executable companion for this document is
 - Persistence posture: store serializable Markitect results and provenance as
   adapter metadata, not as canonical domain objects.
 
+Run the examples against the sibling source checkout during integration
+development with:
+
+```bash
+PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
+python3 -m pytest tests/test_markitect_tool_contract.py -q
+```
+
+Run the larger capacity sentinels with:
+
+```bash
+KONTEXTUAL_RUN_CAPACITY=1 \
+PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
+python3 -m pytest tests/test_markitect_tool_capacity.py -q
+```
+
 ## Use Case 1: Markdown Normalization
 
 Intent: convert Markdown source content into structured frontmatter, headings,
@@ -207,7 +227,10 @@ Engine expectation:
 | Transform and include provenance | Markdown ops retain Markitect provenance. |
 | Snapshot identity | Engine stores Markitect snapshot metadata without owning the algorithm. |
 | Context package policy filtering | Agent context can reuse Markitect packages and local label policy. |
+| Document contracts | Markdown validation can call Markitect contracts without moving contract semantics into the engine. |
+| Capacity sentinels | Larger generated examples expose likely parser, selector, include, context-package, and snapshot bottlenecks. |
 
-These tests are intentionally small. They are not a replacement for
-`markitect-tool`'s own test suite; they assert only the behaviors this engine
-depends on.
+These tests are intentionally small but example-backed. They are not a
+replacement for `markitect-tool`'s own test suite; they assert only the
+behaviors this engine depends on and provide concrete data for diagnosing
+interface drift.
21
examples/markitect-tool-contract/README.md
Normal file
@@ -0,0 +1,21 @@
+# markitect-tool Contract Examples
+
+This directory is a small interface lab for the `kontextual-engine` dependency
+on `markitect-tool`.
+
+The files are intentionally ordinary Markdown/YAML fixtures rather than inline
+test strings. They should help us validate Markitect behavior before engine
+workplans depend on it, and they should be updated whenever the expected
+integration contract changes.
+
+Covered examples:
+
+- Markdown parsing with frontmatter, headings, sections, lists, and source
+  paths.
+- Selector extraction for sections, frontmatter paths, and blocks.
+- Include resolution, heading shifts, composition, and operation provenance.
+- Snapshot identity for Markdown files.
+- Context-package creation from sources and manifests.
+- Local label policy filtering for public versus internal context.
+- Basic document contract validation for decision records.
+
@@ -0,0 +1,10 @@
+# Kontextual Engine Context Bundle
+
+{{include:../corpus/adr-0001-context-packages.md}}
+
+<!-- mkt:include path="../corpus/engineering-policy.md" selector="sections[heading=Controls]" heading_delta="1" -->
+
+```markdown
+{{include:../corpus/internal-risk-note.md}}
+```
+
@@ -0,0 +1,51 @@
|
|||||||
|
# Decision Record Contract
|
||||||
|
|
||||||
|
```yaml contract
|
||||||
|
id: kontextual-decision-record-v1
|
||||||
|
document:
|
||||||
|
type: adr
|
||||||
|
title: Architecture Decision Record
|
||||||
|
fields:
|
||||||
|
status:
|
||||||
|
type: string
|
||||||
|
required: true
|
||||||
|
enum: [proposed, accepted, superseded]
|
||||||
|
owner:
|
||||||
|
type: string
|
||||||
|
required: true
|
||||||
|
metrics:
|
||||||
|
document:
|
||||||
|
words:
|
||||||
|
min: 35
|
||||||
|
max: 500
|
||||||
|
severity: warning
|
||||||
|
sections:
|
||||||
|
- id: context
|
||||||
|
title: Context
|
||||||
|
presence: required
|
||||||
|
level: 2
|
||||||
|
order:
|
||||||
|
before: decision
|
||||||
|
assertions:
|
||||||
|
- id: context-names-problem
|
||||||
|
contains_any: [problem, motivation, need]
|
||||||
|
severity: warning
|
||||||
|
guidance: Explain why the decision exists.
|
||||||
|
- id: decision
|
||||||
|
title: Decision
|
||||||
|
presence: required
|
||||||
|
level: 2
|
||||||
|
assertions:
|
||||||
|
- id: decision-commits
|
||||||
|
matches: "\\b(use|adopt|choose|will)\\b"
|
||||||
|
severity: error
|
||||||
|
guidance: State the actual decision.
|
||||||
|
- id: consequences
|
||||||
|
title: Consequences
|
||||||
|
presence: recommended
|
||||||
|
level: 2
|
||||||
|
- id: deprecated
|
||||||
|
title: Deprecated Approach
|
||||||
|
presence: forbidden
|
||||||
|
```
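To make the two assertion kinds in the contract above concrete, here is a minimal sketch of how `contains_any` and `matches` could be evaluated against section text. It is not the Markitect contract checker: the helper names are hypothetical, and case-insensitive matching and plain substring search are assumptions. Note that the YAML value `"\\b(use|adopt|choose|will)\\b"` is a double-quoted scalar, so the regular expression it denotes is `\b(use|adopt|choose|will)\b`.

```python
import re


def check_contains_any(text: str, words: list[str]) -> bool:
    """Assumed semantics: pass if any word occurs as a substring, ignoring case."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in words)


def check_matches(text: str, pattern: str) -> bool:
    """Assumed semantics: pass if the regex matches anywhere, ignoring case."""
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Section bodies taken from the example ADR fixture in this commit.
context_text = (
    "The problem is that the engine needs Markdown-native structure and context "
    "packages without owning a second Markdown parser or selector language."
)
decision_text = (
    "We will use markitect-tool as the Markdown syntax, selector, deterministic "
    "operation, snapshot, and context-package layer for Markdown-backed assets."
)

context_ok = check_contains_any(context_text, ["problem", "motivation", "need"])
decision_ok = check_matches(decision_text, r"\b(use|adopt|choose|will)\b")
vague_ok = check_matches("Markitect is nice.", r"\b(use|adopt|choose|will)\b")
```

Under these assumptions the example ADR satisfies both assertions, while a decision section that commits to nothing fails the `matches` check.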
|
||||||
|
|
||||||
@@ -0,0 +1,33 @@
+---
+document_type: adr
+status: accepted
+owner: Platform Knowledge
+tags:
+  - context
+  - markdown
+  - governance
+policy:
+  labels: [public, engineering]
+source:
+  system: repo
+  path: examples/markitect-tool-contract/corpus/adr-0001-context-packages.md
+---
+
+# Use Markitect Context Packages
+
+## Context
+
+The problem is that the engine needs Markdown-native structure and context
+packages without owning a second Markdown parser or selector language.
+
+## Decision
+
+We will use markitect-tool as the Markdown syntax, selector, deterministic
+operation, snapshot, and context-package layer for Markdown-backed assets.
+
+## Consequences
+
+- Engine assets stay cross-format and durable.
+- Markdown selectors stay Markitect-owned.
+- Adapter provenance can be stored with engine transformation runs.
+
@@ -0,0 +1,18 @@
+---
+document_type: adr
+status: accepted
+owner: Platform Knowledge
+policy:
+  labels: [public, engineering]
+---
+
+# Weak Decision Record
+
+## Context
+
+The note mentions a need but does not contain the required decision section.
+
+## Deprecated Approach
+
+This forbidden section should be reported by the Markitect contract checker.
+
@@ -0,0 +1,24 @@
+---
+document_type: policy
+status: active
+owner: Platform Knowledge
+policy:
+  labels: [public, governance]
+source:
+  system: repo
+  path: examples/markitect-tool-contract/corpus/engineering-policy.md
+---
+
+# Engineering Knowledge Policy
+
+## Controls
+
+Published context packages must preserve source paths, source spans, policy
+labels, and enough provenance for the engine to audit how the package was
+assembled.
+
+## Review
+
+Sensitive or high-impact generated artifacts must pass through an engine-owned
+review gate before publication or export.
+
@@ -0,0 +1,22 @@
+---
+document_type: risk-note
+status: draft
+owner: Platform Knowledge
+policy:
+  labels: [internal]
+source:
+  system: repo
+  path: examples/markitect-tool-contract/corpus/internal-risk-note.md
+---
+
+# Internal Retrieval Risk
+
+## Risk
+
+This internal note should not appear in a public context activation.
+
+## Mitigation
+
+Permission filtering must happen before snippets, context packages, or derived
+outputs are returned to a caller.
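The mitigation stated above (filter first, assemble output second) can be sketched as follows. This is a hypothetical model, not `mkt.LocalLabelPolicyGateway`: `ContextItem` and `filter_by_labels` are invented names, and the rule that an item passes only when every one of its labels is allowed is an assumption. The 140 alternating public/internal items mirror the capacity test in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    """Hypothetical stand-in for one packaged snippet and its policy labels."""

    marker: str
    labels: frozenset[str]


def filter_by_labels(
    items: list[ContextItem], allowed_labels: set[str]
) -> tuple[list[ContextItem], int]:
    """Return the permitted items and the number denied.

    Assumption: an item is permitted only when every label it carries is
    allowed (an item with no labels therefore passes).
    """
    allowed = frozenset(allowed_labels)
    permitted = [item for item in items if item.labels <= allowed]
    return permitted, len(items) - len(permitted)


items = [
    ContextItem(
        marker=f"{'PUBLIC' if index % 2 == 0 else 'INTERNAL'}-CAPACITY-{index:03}",
        labels=frozenset({"public" if index % 2 == 0 else "internal"}),
    )
    for index in range(140)
]

permitted, denied = filter_by_labels(items, {"public"})
# Output is assembled only from items that already passed the filter.
content = "\n".join(item.marker for item in permitted)
```

Because the output string is built from the filtered list, internal markers never reach it, which is the ordering the note asks for.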
+
@@ -0,0 +1,21 @@
+title: Kontextual Engine Markdown Adapter Context
+intent: Provide public Markdown-backed context for adapter boundary testing.
+namespace:
+  project: kontextual-engine
+  task: markitect-tool-contract
+budget:
+  max_items: 4
+retrieval_recipes:
+  - kind: selector
+    engine: selector
+    query: sections[heading=Decision]
+    sources:
+      - corpus/adr-0001-context-packages.md
+  - kind: selector
+    engine: selector
+    query: sections[heading=Controls]
+    sources:
+      - corpus/engineering-policy.md
+metadata:
+  fixture: markitect-tool-contract
+
@@ -40,4 +40,5 @@ pythonpath = ["src"]
 markers = [
 "integration: tests that exercise optional external package contracts",
 "markitect_tool: tests for the optional markitect-tool adapter boundary",
+"capacity: opt-in capacity sentinel tests for bottleneck and scaling risks",
 ]
249
tests/test_markitect_tool_capacity.py
Normal file
@@ -0,0 +1,249 @@
+import importlib.util
+import os
+import time
+from pathlib import Path
+
+import pytest
+
+
+pytestmark = [pytest.mark.integration, pytest.mark.markitect_tool, pytest.mark.capacity]
+if importlib.util.find_spec("markitect_tool") is None:
+    pytestmark.append(
+        pytest.mark.skip(
+            reason="Install kontextual-engine[markdown] to run markitect-tool capacity tests."
+        )
+    )
+    mkt = None
+elif os.environ.get("KONTEXTUAL_RUN_CAPACITY", "").lower() not in {"1", "true", "yes"}:
+    pytestmark.append(
+        pytest.mark.skip(
+            reason="Set KONTEXTUAL_RUN_CAPACITY=1 to run opt-in capacity sentinels."
+        )
+    )
+    mkt = None
+else:
+    import markitect_tool as mkt
+
+
+def test_large_markdown_parse_query_and_extract_capacity() -> None:
+    markdown = _large_decision_markdown(section_count=650)
+
+    elapsed, document = _timed(lambda: mkt.parse_markdown(markdown, source_path="large.md"))
+    _assert_within("parse 650-section markdown", elapsed, seconds=6.0)
+
+    elapsed, matches = _timed(lambda: mkt.query_document(document, "sections[heading=Decision 640]"))
+    _assert_within("query exact section in 650-section markdown", elapsed, seconds=2.0)
+
+    elapsed, extracted = _timed(lambda: mkt.extract_document(document, "sections[heading=Decision 640]"))
+    _assert_within("extract exact section in 650-section markdown", elapsed, seconds=2.0)
+
+    assert len(document.sections) == 651
+    assert len(document.headings) == 651
+    assert len(matches) == 1
+    assert "CAPACITY-MARKER-640" in extracted[0]
+
+
+def test_repeated_selectors_over_large_document_capacity() -> None:
+    document = mkt.parse_markdown(_large_decision_markdown(section_count=420))
+    selectors = [
+        "frontmatter.status",
+        "headings[level=2]",
+        "blocks[type=bullet_list]",
+        "sections[contains~=CAPACITY-MARKER-120]",
+        "sections[heading=Decision 240]",
+        "metrics.document.sections",
+    ]
+
+    def run_queries() -> list[int]:
+        counts = []
+        for _ in range(12):
+            for selector in selectors:
+                counts.append(len(mkt.query_document(document, selector)))
+        return counts
+
+    elapsed, counts = _timed(run_queries)
+
+    _assert_within("72 selector queries over 420-section markdown", elapsed, seconds=5.0)
+    assert min(counts) >= 1
+    assert max(counts) >= 420
+
+
+def test_include_fanout_compose_and_transform_capacity(tmp_path: Path) -> None:
+    partials = []
+    for index in range(90):
+        partial = tmp_path / f"partial-{index:03}.md"
+        partial.write_text(_partial_markdown(index), encoding="utf-8")
+        partials.append(partial)
+    bundle = tmp_path / "bundle.md"
+    bundle.write_text(
+        "\n".join(
+            f'<!-- mkt:include path="{partial.name}" selector="sections[heading=Include Target]" heading_delta="1" -->'
+            for partial in partials
+        ),
+        encoding="utf-8",
+    )
+
+    elapsed, included = _timed(
+        lambda: mkt.resolve_includes(
+            bundle.read_text(encoding="utf-8"),
+            base_dir=tmp_path,
+            current_path=bundle,
+        )
+    )
+    _assert_within("resolve 90 include fan-out bundle", elapsed, seconds=8.0)
+
+    elapsed, composed = _timed(lambda: mkt.compose_files(partials, title="Capacity Bundle", heading_delta=1))
+    _assert_within("compose 90 markdown partials", elapsed, seconds=5.0)
+
+    elapsed, transformed = _timed(
+        lambda: mkt.transform_markdown(
+            included.markdown,
+            set_frontmatter={"status": "capacity-check"},
+            heading_delta=1,
+            source_path=str(bundle),
+        )
+    )
+    _assert_within("transform resolved include fan-out bundle", elapsed, seconds=5.0)
+
+    assert len(included.included_paths) == 90
+    assert "### Include Target" in included.markdown
+    assert composed.markdown.startswith("# Capacity Bundle")
+    assert "status: capacity-check" in transformed.markdown
+
+
+def test_context_package_many_sources_policy_filtering_capacity(tmp_path: Path) -> None:
+    sources = []
+    for index in range(140):
+        source = tmp_path / f"source-{index:03}.md"
+        label = "public" if index % 2 == 0 else "internal"
+        source.write_text(_context_source_markdown(index, label), encoding="utf-8")
+        sources.append(source)
+    gateway = mkt.LocalLabelPolicyGateway(
+        {
+            "id": "capacity-policy",
+            "subjects": {
+                "reader": {
+                    "allowed_labels": ["public"],
+                    "allowed_actions": ["read", "activate"],
+                }
+            },
+            "default_subject": "reader",
+        }
+    )
+
+    elapsed, package = _timed(
+        lambda: mkt.create_context_package_from_sources(
+            "sections[heading=Decision]",
+            sources,
+            root=tmp_path,
+            namespace=mkt.MemoryNamespace(project="kontextual-engine", task="capacity"),
+            budget=mkt.ContextBudget(max_items=160),
+        )
+    )
+    _assert_within("create context package from 140 markdown sources", elapsed, seconds=12.0)
+
+    elapsed, activation = _timed(
+        lambda: mkt.activate_context_package(
+            package,
+            policy_gateway=gateway,
+            subject="reader",
+        )
+    )
+    _assert_within("activate and policy-filter 140-source context package", elapsed, seconds=6.0)
+
+    assert len(package.items) == 140
+    assert len(activation.items) == 70
+    assert "PUBLIC-CAPACITY-000" in activation.content
+    assert "INTERNAL-CAPACITY-001" not in activation.content
+    assert activation.policy["summary"]["denied"] == 70
+
+
+def test_snapshot_identity_many_files_capacity(tmp_path: Path) -> None:
+    paths = []
+    for index in range(120):
+        path = tmp_path / f"snapshot-{index:03}.md"
+        path.write_text(_context_source_markdown(index, "public"), encoding="utf-8")
+        paths.append(path)
+
+    elapsed, identities = _timed(lambda: [mkt.snapshot_identity_for_file(path) for path in paths])
+    _assert_within("compute 120 markdown snapshot identities", elapsed, seconds=4.0)
+
+    assert len({identity.snapshot_id for identity in identities}) == 120
+    assert all(identity.content_hash.startswith("sha256:") for identity in identities)
+
+
+def _large_decision_markdown(section_count: int) -> str:
+    sections = [
+        "---",
+        "document_type: capacity-fixture",
+        "status: active",
+        "owner: Platform Knowledge",
+        "---",
+        "",
+        "# Capacity Fixture",
+        "",
+    ]
+    for index in range(section_count):
+        sections.extend(
+            [
+                f"## Decision {index}",
+                "",
+                (
+                    f"CAPACITY-MARKER-{index} records a synthetic decision section "
+                    "with enough text to exercise parsing, selector matching, and extraction."
+                ),
+                "",
+                "- Parser shape must stay stable.",
+                "- Selector scans must remain bounded enough for adapter use.",
+                "",
+            ]
+        )
+    return "\n".join(sections)
+
+
+def _partial_markdown(index: int) -> str:
+    return "\n".join(
+        [
+            f"# Partial {index}",
+            "",
+            "## Include Target",
+            "",
+            f"Included capacity text {index}.",
+            "",
+            "## Ignore",
+            "",
+            "This section should not be selected by the include resolver.",
+            "",
+        ]
+    )
+
+
+def _context_source_markdown(index: int, label: str) -> str:
+    marker = f"{label.upper()}-CAPACITY-{index:03}"
+    return "\n".join(
+        [
+            "---",
+            "document_type: capacity-source",
+            f"status: {'active' if label == 'public' else 'draft'}",
+            "policy:",
+            f" labels: [{label}]",
+            "---",
+            "",
+            f"# Capacity Source {index}",
+            "",
+            "## Decision",
+            "",
+            f"{marker} uses Markitect context packaging for generated source {index}.",
+            "",
+        ]
+    )
+
+
+def _timed(operation):
+    start = time.perf_counter()
+    value = operation()
+    return time.perf_counter() - start, value
+
+
+def _assert_within(name: str, elapsed: float, *, seconds: float) -> None:
+    assert elapsed <= seconds, f"{name} took {elapsed:.3f}s, expected <= {seconds:.3f}s"
@@ -1,48 +1,39 @@
+import importlib.util
 from pathlib import Path
 
 import pytest
 
 
 pytestmark = [pytest.mark.integration, pytest.mark.markitect_tool]
+if importlib.util.find_spec("markitect_tool") is None:
+    pytestmark.append(
+        pytest.mark.skip(
+            reason="Install kontextual-engine[markdown] to run markitect-tool contract tests."
+        )
+    )
+    mkt = None
+else:
+    import markitect_tool as mkt
 
-mkt = pytest.importorskip(
-    "markitect_tool",
-    reason="Install kontextual-engine[markdown] to run markitect-tool contract tests.",
-)
-
-
-SAMPLE_MARKDOWN = """---
-document_type: decision
-status: accepted
-policy:
-  labels: [public]
----
-
-# Engine Boundary
-
-## Context
-
-The engine needs Markdown-native structure without owning a Markdown parser.
-
-## Decision
-
-Use markitect-tool as the syntax and deterministic operations layer.
-
-## Consequences
-
-- Engine assets stay cross-format.
-- Markdown selectors stay Markitect-owned.
-"""
+EXAMPLE_ROOT = Path(__file__).resolve().parents[1] / "examples" / "markitect-tool-contract"
+ADR = EXAMPLE_ROOT / "corpus" / "adr-0001-context-packages.md"
+INVALID_ADR = EXAMPLE_ROOT / "corpus" / "adr-invalid-missing-decision.md"
+POLICY = EXAMPLE_ROOT / "corpus" / "engineering-policy.md"
+INTERNAL = EXAMPLE_ROOT / "corpus" / "internal-risk-note.md"
+BUNDLE = EXAMPLE_ROOT / "composition" / "context-bundle.md"
+MANIFEST = EXAMPLE_ROOT / "manifests" / "agent-context.yaml"
+CONTRACT = EXAMPLE_ROOT / "contracts" / "decision-record.contract.md"
 
 
 def test_markitect_parser_returns_structured_markdown_document() -> None:
-    document = mkt.parse_markdown(SAMPLE_MARKDOWN, source_path="docs/decision.md")
+    document = mkt.parse_markdown_file(ADR)
     serialized = document.to_dict()
 
     assert serialized["frontmatter"]["status"] == "accepted"
-    assert serialized["source_path"] == "docs/decision.md"
+    assert serialized["frontmatter"]["owner"] == "Platform Knowledge"
+    assert serialized["source_path"] == str(ADR)
     assert [heading["text"] for heading in serialized["headings"]] == [
-        "Engine Boundary",
+        "Use Markitect Context Packages",
         "Context",
         "Decision",
         "Consequences",
@@ -51,80 +42,93 @@ def test_markitect_parser_returns_structured_markdown_document() -> None:
 
 
 def test_markitect_selectors_extract_source_grounded_markdown_units() -> None:
-    document = mkt.parse_markdown(SAMPLE_MARKDOWN)
+    document = mkt.parse_markdown_file(ADR)
 
+    status = mkt.extract_document(document, "frontmatter.status")
     matches = mkt.query_document(document, "sections[heading=Decision]")
     extracted = mkt.extract_document(document, "sections[heading=Decision]")
+    bullets = mkt.query_document(document, "blocks[type=bullet_list]")
 
+    assert status == ["accepted"]
     assert len(matches) == 1
     assert matches[0].kind == "section"
     assert matches[0].line is not None
-    assert "deterministic operations layer" in matches[0].text
+    assert "context-package layer" in matches[0].text
     assert extracted == [
-        "## Decision\n\nUse markitect-tool as the syntax and deterministic operations layer."
+        "## Decision\n\n"
+        "We will use markitect-tool as the Markdown syntax, selector, deterministic\n"
+        "operation, snapshot, and context-package layer for Markdown-backed assets."
     ]
+    assert len(bullets) == 1
+    assert "Engine assets stay cross-format" in bullets[0].text
 
 
-def test_markitect_ops_resolve_includes_transform_and_return_provenance(tmp_path: Path) -> None:
-    partial = tmp_path / "partial.md"
-    partial.write_text(
-        "# Included\n\n## Decision\n\nReuse Markitect operations.\n",
-        encoding="utf-8",
-    )
-
+def test_markitect_ops_compose_include_transform_and_return_provenance() -> None:
     included = mkt.resolve_includes(
-        '{{include:partial.md}}',
-        base_dir=tmp_path,
+        BUNDLE.read_text(encoding="utf-8"),
+        base_dir=EXAMPLE_ROOT,
+        current_path=BUNDLE,
+    )
+    composed = mkt.compose_files(
+        [ADR, POLICY],
+        title="Combined Markdown Context",
+        heading_delta=1,
     )
     transformed = mkt.transform_markdown(
         included.markdown,
-        set_frontmatter={"status": "draft"},
+        set_frontmatter={"status": "draft", "producer": {"name": "kontextual-engine"}},
         heading_delta=1,
-        source_path="composed.md",
+        source_path=str(BUNDLE),
     )
 
-    assert included.included_paths == [str(partial.resolve())]
-    assert included.provenance[0].operation == "include"
-    assert included.provenance[0].target_path == str(partial.resolve())
-    assert "status: draft" in transformed.markdown
-    assert "## Included" in transformed.markdown
-    assert "### Decision" in transformed.markdown
+    assert included.included_paths == [str(ADR.resolve()), str(POLICY.resolve())]
+    assert [event.operation for event in included.provenance] == ["include", "include"]
+    assert included.provenance[1].metadata["selector"] == "sections[heading=Controls]"
+    assert "### Controls" in included.markdown
+    assert "{{include:../corpus/internal-risk-note.md}}" in included.markdown
+    assert "This internal note should not appear" not in included.markdown
+    assert composed.markdown.startswith("# Combined Markdown Context")
+    assert "## Use Markitect Context Packages" in composed.markdown
+    assert "document_type: adr" not in composed.markdown
+    assert "producer:" in transformed.markdown
     assert [event.operation for event in transformed.provenance] == [
         "set_frontmatter",
         "shift_headings",
     ]
 
 
-def test_markitect_snapshot_identity_is_content_addressed_adapter_metadata(tmp_path: Path) -> None:
-    source = tmp_path / "decision.md"
+def test_markitect_snapshot_identity_is_content_addressed_adapter_metadata() -> None:
+    first = mkt.snapshot_identity_for_file(ADR, parse_options={"profile": "default"})
|
||||||
source.write_text(SAMPLE_MARKDOWN, encoding="utf-8")
|
second = mkt.snapshot_identity_for_file(ADR, parse_options={"profile": "default"})
|
||||||
|
changed = mkt.snapshot_identity_for_file(ADR, parse_options={"profile": "strict"})
|
||||||
first = mkt.snapshot_identity_for_file(source, parse_options={"profile": "default"})
|
|
||||||
second = mkt.snapshot_identity_for_file(source, parse_options={"profile": "default"})
|
|
||||||
changed = mkt.snapshot_identity_for_file(source, parse_options={"profile": "strict"})
|
|
||||||
|
|
||||||
assert first.snapshot_id == second.snapshot_id
|
assert first.snapshot_id == second.snapshot_id
|
||||||
assert first.content_hash == second.content_hash
|
assert first.content_hash == second.content_hash
|
||||||
assert first.parser == "markdown-it-py/commonmark"
|
assert first.parser == "markdown-it-py/commonmark"
|
||||||
assert first.snapshot_id != changed.snapshot_id
|
assert first.snapshot_id != changed.snapshot_id
|
||||||
assert first.to_dict()["source_path"] == str(source)
|
assert first.to_dict()["source_path"] == str(ADR)
|
||||||
|
|
||||||
|
|
||||||
def test_markitect_context_packages_filter_by_local_policy(tmp_path: Path) -> None:
|
def test_markitect_context_packages_from_manifest_preserve_sources() -> None:
|
||||||
public = tmp_path / "public.md"
|
package = mkt.create_context_package_from_manifest(MANIFEST, root=EXAMPLE_ROOT)
|
||||||
private = tmp_path / "private.md"
|
activation = mkt.activate_context_package(package, target="thread:contract-test")
|
||||||
public.write_text(
|
|
||||||
"---\npolicy:\n labels: [public]\n---\n# Public\n\nVisible context.\n",
|
assert package.title == "Kontextual Engine Markdown Adapter Context"
|
||||||
encoding="utf-8",
|
assert package.namespace.project == "kontextual-engine"
|
||||||
)
|
assert [item.source.path for item in package.items] == [
|
||||||
private.write_text(
|
"corpus/adr-0001-context-packages.md",
|
||||||
"---\npolicy:\n labels: [internal]\n---\n# Private\n\nHidden context.\n",
|
"corpus/engineering-policy.md",
|
||||||
encoding="utf-8",
|
]
|
||||||
)
|
assert "Markdown-backed assets" in activation.content
|
||||||
|
assert "source paths" in activation.content
|
||||||
|
assert activation.metadata["package_title"] == package.title
|
||||||
|
|
||||||
|
|
||||||
|
def test_markitect_context_packages_filter_by_local_policy() -> None:
|
||||||
package = mkt.create_context_package_from_sources(
|
package = mkt.create_context_package_from_sources(
|
||||||
"document",
|
"document",
|
||||||
[public, private],
|
[ADR, INTERNAL],
|
||||||
root=tmp_path,
|
root=EXAMPLE_ROOT,
|
||||||
namespace=mkt.MemoryNamespace(project="kontextual-engine", task="boundary"),
|
namespace=mkt.MemoryNamespace(project="kontextual-engine", task="boundary"),
|
||||||
budget=mkt.ContextBudget(max_items=5),
|
budget=mkt.ContextBudget(max_items=5),
|
||||||
)
|
)
|
||||||
@@ -133,7 +137,7 @@ def test_markitect_context_packages_filter_by_local_policy(tmp_path: Path) -> No
|
|||||||
"id": "kontextual-engine-boundary",
|
"id": "kontextual-engine-boundary",
|
||||||
"subjects": {
|
"subjects": {
|
||||||
"reader": {
|
"reader": {
|
||||||
"allowed_labels": ["public"],
|
"allowed_labels": ["public", "engineering"],
|
||||||
"allowed_actions": ["read", "activate"],
|
"allowed_actions": ["read", "activate"],
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -149,6 +153,20 @@ def test_markitect_context_packages_filter_by_local_policy(tmp_path: Path) -> No
|
|||||||
|
|
||||||
assert package.namespace.project == "kontextual-engine"
|
assert package.namespace.project == "kontextual-engine"
|
||||||
assert len(activation.items) == 1
|
assert len(activation.items) == 1
|
||||||
assert "Visible context" in activation.content
|
assert "Use Markitect Context Packages" in activation.content
|
||||||
assert "Hidden context" not in activation.content
|
assert "Internal Retrieval Risk" not in activation.content
|
||||||
assert activation.policy["summary"]["denied"] == 1
|
assert activation.policy["summary"]["denied"] == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_markitect_document_contracts_accept_valid_and_report_invalid_documents() -> None:
|
||||||
|
contract = mkt.load_contract_file(CONTRACT)
|
||||||
|
valid = mkt.check_markdown_file(ADR, CONTRACT)
|
||||||
|
invalid = mkt.check_markdown_file(INVALID_ADR, CONTRACT)
|
||||||
|
invalid_codes = {diagnostic.code for diagnostic in invalid.diagnostics}
|
||||||
|
|
||||||
|
assert contract.id == "kontextual-decision-record-v1"
|
||||||
|
assert valid.valid is True
|
||||||
|
assert valid.diagnostics == []
|
||||||
|
assert invalid.valid is False
|
||||||
|
assert "contract.section.missing" in invalid_codes
|
||||||
|
assert "contract.section.forbidden" in invalid_codes
|
||||||
|
|||||||