E2E tests and first-use docs

2026-05-04 22:58:59 +02:00
parent 5dfb403979
commit a7ab4904d5
9 changed files with 583 additions and 10 deletions
--- a/docs/performance-notes.md
+++ b/docs/performance-notes.md
@@ -0,0 +1,60 @@
+# Performance Notes
+
+Markitect is designed to remain useful without persistent services. The current
+performance posture is therefore local-first:
+
+- parse individual Markdown files directly for one-off use
+- build a local SQLite index for repeated corpus operations
+- use refresh planning to avoid unnecessary parse/index work
+- keep policy filtering and context package creation deterministic
+
+## Current Smoke Coverage
+
+`tests/test_practical_usecases_e2e.py` includes a large-corpus smoke test that
+creates 120 synthetic Markdown files, indexes them locally, and runs an FTS
+search.
+
+The thresholds are intentionally generous:
+
+- local cache/index build: under 30 seconds
+- local indexed search: under 5 seconds
+
+These are not benchmark claims. They are regression guards to catch accidental
+algorithmic or IO mistakes while keeping the test portable.
+
+## Practical Guidance
+
+For one file:
+
+```bash
+mkt parse file.md
+mkt query file.md 'sections[heading=Decision]'
+```
+
+For a directory you will query repeatedly:
+
+```bash
+mkt cache index docs --root .
+mkt search keyword --root .
+mkt cache query 'sections[heading=Decision]' --root .
+```
+
+Before refreshing derived work:
+
+```bash
+mkt backend refresh-plan docs --root . --verify-hashes
+```
+
+## Future Measurement
+
+If performance becomes a release gate, add a separate benchmark suite instead
+of tightening normal E2E tests. Good benchmark dimensions would be:
+
+- number of documents
+- total bytes
+- heading/section density
+- frontmatter size
+- number of reference/include relationships
+- policy labels per document
+- index rebuild versus incremental refresh
+- context package item and token budgets
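
A minimal sketch of the smoke-test pattern described under "Current Smoke Coverage", assuming nothing about Markitect's internals. It uses a plain SQLite table with a `LIKE` query as a stand-in for the real index and FTS search. The helper names (`write_corpus`, `build_index`, `search`, `run_smoke`) are hypothetical and do not come from the repository. Only the corpus size (120 files) and the two thresholds (30 s, 5 s) are taken from the notes.

```python
from __future__ import annotations

import sqlite3
import tempfile
import time
from pathlib import Path

# Generous regression guards from the notes, not benchmark targets.
INDEX_BUDGET_S = 30.0
SEARCH_BUDGET_S = 5.0
DOC_COUNT = 120


def write_corpus(root: Path, count: int = DOC_COUNT) -> list[Path]:
    """Create `count` small synthetic Markdown files under `root`."""
    paths = []
    for i in range(count):
        path = root / f"doc_{i:03d}.md"
        path.write_text(
            f"# Document {i}\n\n## Decision\n\nKeyword token{i} appears here.\n",
            encoding="utf-8",
        )
        paths.append(path)
    return paths


def build_index(db_path: Path, paths: list[Path]) -> None:
    """Stand-in indexer (assumption): one row per file in a plain table."""
    con = sqlite3.connect(str(db_path))
    try:
        with con:  # commits on success, rolls back on error
            con.execute("CREATE TABLE docs (path TEXT PRIMARY KEY, body TEXT)")
            con.executemany(
                "INSERT INTO docs (path, body) VALUES (?, ?)",
                [(p.name, p.read_text(encoding="utf-8")) for p in paths],
            )
    finally:
        con.close()


def search(db_path: Path, term: str) -> list[str]:
    """Stand-in search (assumption): substring match instead of real FTS."""
    con = sqlite3.connect(str(db_path))
    try:
        rows = con.execute(
            "SELECT path FROM docs WHERE body LIKE ? ORDER BY path",
            (f"%{term}%",),
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


def run_smoke() -> tuple[int, float, float]:
    """Build the corpus and index, run one search, return hits and timings."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        paths = write_corpus(root)
        db_path = root / "index.sqlite"

        start = time.perf_counter()
        build_index(db_path, paths)
        index_s = time.perf_counter() - start

        start = time.perf_counter()
        hits = search(db_path, "token119")
        search_s = time.perf_counter() - start
    return len(hits), index_s, search_s
```

In a test, the returned timings would be compared against `INDEX_BUDGET_S` and `SEARCH_BUDGET_S`. Keeping the budgets loose lets the test stay portable across slow CI machines while still catching accidental algorithmic or IO mistakes.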