# Test Performance Monitoring

Date: 2026-05-05

Status: lightweight pytest performance history for local situational awareness.

## Purpose

The test suite records a compact performance history on every pytest run. The goal is not detailed profiling. It is a small scorekeeping loop that helps us notice negative drift while the engine grows.

The monitor captures:

- run start and finish timestamps,
- total test run duration,
- per-test duration and outcome,
- Python and platform identity,
- logical CPU count,
- load averages and load-per-CPU where available,
- memory total, available memory, and available ratio from `/proc/meminfo` where available,
- process user/system CPU deltas and peak resident memory.

## Storage

Default history path:

```text
.pytest_cache/kontextual/performance-history.json
```

`.pytest_cache/` is ignored by git, so regular test runs do not dirty the repository. A different path can be supplied with `--perf-history-path` or `KONTEXTUAL_PERF_HISTORY`.

## Retention Model

The JSON file keeps a bounded, compact record:

- the last `N` raw runs,
- the last `N` rolling averages over the retained runs,
- the average of the last `N` rolling averages,
- one compact daily average record per day, updated on every run,
- daily records retained for a configurable number of days.

Defaults:

- `N = 20`,
- daily retention = `730` days,
- drift warning ratio = `35%`,
- minimum duration delta before warning = `0.05s`.

Skipped tests are recorded in raw runs and aggregate counts, but they are not used as per-test duration baselines. This keeps optional Markitect and capacity tests from producing false regressions when they switch from skipped to executed.

## Warnings

At the end of the pytest run, the monitor compares the current run with the previous average-of-averages. It prints warnings for:

- total run duration drift, when the executed test count is comparable,
- individual test duration drift,
- materially higher normalized start load,
- materially lower available-memory ratio.

Warnings do not fail the test run. They are meant to create attention, not gate development.

## Configuration

Disable monitoring:

```bash
python3 -m pytest --perf-history-disable
```

or:

```bash
KONTEXTUAL_PERF_MONITOR=0 python3 -m pytest
```

Override retention and warning thresholds:

```bash
python3 -m pytest \
  --perf-history-window 30 \
  --perf-history-drift-ratio 0.50 \
  --perf-history-min-delta 0.10
```

Environment equivalents:

- `KONTEXTUAL_PERF_HISTORY`,
- `KONTEXTUAL_PERF_WINDOW`,
- `KONTEXTUAL_PERF_DAILY_RETENTION_DAYS`,
- `KONTEXTUAL_PERF_DRIFT_RATIO`,
- `KONTEXTUAL_PERF_MIN_DELTA_SECONDS`.

## When To Profile Instead

Use this monitor to spot drift and identify candidate tests or areas. If a warning points to a real bottleneck, create a focused profiling experiment or a capacity sentinel. Do not add large traces or per-function profiling data to the rolling history.
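
When deciding whether a warning deserves a profiling experiment, it can help to know roughly how the two duration thresholds interact. The following is a minimal sketch, not the monitor's actual code: the names `DriftThresholds` and `is_drift` are hypothetical, and it assumes a warning requires both the relative drift ratio and the minimum absolute delta to be met, using the documented defaults of `0.35` and `0.05s`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftThresholds:
    """Hypothetical container for the documented default thresholds."""

    ratio: float = 0.35              # drift warning ratio (35%)
    min_delta_seconds: float = 0.05  # minimum duration delta before warning


def is_drift(
    current: float,
    baseline: float,
    thresholds: DriftThresholds = DriftThresholds(),
) -> bool:
    """Return True when `current` is slower than `baseline` by enough to warn.

    Assumption: both conditions must hold, so very short tests do not
    trigger warnings on tiny absolute changes.
    """
    if baseline <= 0:
        # No usable baseline (for example, a test that was previously skipped).
        return False
    delta = current - baseline
    if delta < thresholds.min_delta_seconds:
        return False
    return delta / baseline >= thresholds.ratio


if __name__ == "__main__":
    print(is_drift(1.40, 1.00))  # True: +0.40s and +40%
    print(is_drift(0.02, 0.01))  # False: +100% but only +0.01s
    print(is_drift(1.20, 1.00))  # False: +0.20s but only +20%
```

Under these assumptions, a test that doubles from 10 ms to 20 ms stays quiet, while a one-second test that slows by 40% is flagged as a candidate for closer inspection.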