Expose retained runs through service API

2026-05-16 03:04:17 +02:00
parent 2412f30975
commit 2a1a53c140
8 changed files with 378 additions and 21 deletions


@@ -164,6 +164,15 @@ Fetch reports after the job status is `succeeded`:
curl -sf http://127.0.0.1:8080/runs/JOB_ID/reports | python3 -m json.tool
```
Inspect retained run history, including runs produced before the current
service process started:
```sh
curl -sf "http://127.0.0.1:8080/retained-runs?runs_dir=runs" | python3 -m json.tool
curl -sf "http://127.0.0.1:8080/retained-runs/latest?runs_dir=runs" | python3 -m json.tool
curl -sf "http://127.0.0.1:8080/retained-runs/RUN_ID/artifact-manifest?runs_dir=runs" | python3 -m json.tool
```
Service job state is currently held in memory by the running service process. Run
artifacts are durable in the output directory and can still be inspected after a
service restart. See `docs/SERVICE-JOB-DURABILITY.md` for the restart and


@@ -144,4 +144,9 @@ podman run --rm -p 8080:8080 \
The service layer adds in-memory job tracking and HTTP transport. Execution
semantics remain the CLI/core semantics documented in
`docs/LOCAL-SERVICE-API.md`.
`docs/LOCAL-SERVICE-API.md`. Mounted run directories remain discoverable through
the retained-run endpoints, for example:
```sh
curl -sf "http://127.0.0.1:8080/retained-runs?runs_dir=/runs" | python3 -m json.tool
```


@@ -98,7 +98,41 @@ errors.
### `GET /runs/{job_id}/reports`
Returns the Markdown report content, assessment package JSON, retention summary,
and their filesystem paths after a job has succeeded.
submission package JSON when present, and their filesystem paths after a job has
succeeded.
### `GET /retained-runs`
Lists durable retained run summaries by scanning a runs directory. Without a
query parameter, the service scans `<root>/runs`.
```text
GET /retained-runs?runs_dir=/runs
```
### `GET /retained-runs/latest`
Selects the latest retained run, optionally filtered by target and assessment
profile refs.
```text
GET /retained-runs/latest?runs_dir=/runs&target=sample-repository&assessment=sample-noop-assessment
```
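As an illustration of the query parameters above, the following sketch builds a `/retained-runs/latest` URL from Python. The helper name `retained_latest_url` and the base URL are assumptions made for this example; only the endpoint path and the `runs_dir`, `target`, and `assessment` parameters come from this document.

```python
from urllib.parse import urlencode


def retained_latest_url(base_url, runs_dir=None, target=None, assessment=None):
    """Build a GET /retained-runs/latest URL, omitting filters that are unset.

    Hypothetical client-side helper; it is not part of the service itself.
    """
    params = {
        "runs_dir": runs_dir,
        "target": target,
        "assessment": assessment,
    }
    # urlencode percent-encodes values, so "/runs" is sent as "%2Fruns".
    query = urlencode({key: value for key, value in params.items() if value is not None})
    url = f"{base_url.rstrip('/')}/retained-runs/latest"
    return f"{url}?{query}" if query else url


example_url = retained_latest_url(
    "http://127.0.0.1:8080",
    runs_dir="/runs",
    target="sample-repository",
    assessment="sample-noop-assessment",
)
```

Leaving out `runs_dir` produces a URL without that parameter, which the service is documented to treat as a scan of `<root>/runs`.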
### `GET /retained-runs/{run_id}/reports`
Returns the retained summary plus safe report paths for a durable run. This
works after a service restart because it reads `retention-summary.json` from
disk instead of in-memory job records.
### `GET /retained-runs/{run_id}/artifact-manifest`
Returns the assessment package `artifact_manifest` for a retained run. If the
run predates assessment packages, the endpoint still succeeds and returns an
empty manifest with `compatibility: "assessment-package-missing"`.
Retained-run endpoints validate report and artifact paths before returning
them. A path that escapes the selected run directory is rejected.
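The containment rule can be sketched as a resolve-then-compare check. This is a minimal illustration under the assumption that paths are resolved before comparison; the function name `is_inside_run_dir` is hypothetical and the service's own helpers may differ in detail.

```python
from pathlib import Path


def is_inside_run_dir(run_dir: Path, ref: str) -> bool:
    """Return True when ref, resolved against run_dir, stays inside run_dir."""
    base = run_dir.resolve()
    # Resolving collapses ".." segments, so "../outside.txt" lands outside base.
    candidate = (base / ref).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        return False
    return True
```

With this check, a manifest entry such as `reports/report.md` is accepted, while `../outside.txt` or an absolute path outside the run directory is refused.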
## Container Mode
@@ -112,5 +146,6 @@ podman run --rm -p 8080:8080 \
```
The service keeps job state in memory. Durable run evidence remains in the
mounted output directory. See `docs/SERVICE-JOB-DURABILITY.md` for the explicit
restart and recovery contract.
mounted output directory and can be discovered through `GET /retained-runs`
after restart. See `docs/SERVICE-JOB-DURABILITY.md` for the explicit recovery
contract.


@@ -13,16 +13,19 @@ Durable state lives in run directories:
- `run.json`
- `plan.json`
- `sources.lock.json`
- `retention-summary.json`
- `normalized/evidence.json`
- `normalized/findings.json`
- `normalized/mappings.json`
- `reports/assessment-package.json`
- `reports/report.md`
- `reports/submission-package.json`
- `artifacts/`
The durable recovery index is the set of `retention-summary.json` files under a
runs directory.
runs directory. No separate durable service index is required for the baseline;
the service reconstructs retained-run views by scanning those summaries.
## Why In-Memory Jobs Stay The Baseline
@@ -47,9 +50,14 @@ After a service restart:
- old `job_id` values are invalid,
- `GET /runs/{job_id}` cannot recover pre-restart job metadata,
- `GET /runs/{job_id}/reports` only works for jobs known to the current process,
- run artifacts from earlier service processes remain available on disk.
- run artifacts from earlier service processes remain available on disk,
- `GET /retained-runs`, `GET /retained-runs/latest`,
`GET /retained-runs/{run_id}/reports`, and
`GET /retained-runs/{run_id}/artifact-manifest` can expose completed retained
runs after restart.
Operators should recover previous results with the CLI run-history commands:
Operators can recover previous results with either the CLI run-history commands
or the retained-run service endpoints:
```sh
PYTHONPATH=src python3 -m guide_board runs list --runs-dir runs
@@ -57,6 +65,12 @@ PYTHONPATH=src python3 -m guide_board runs latest --runs-dir runs
PYTHONPATH=src python3 -m guide_board runs report --runs-dir runs --run-id RUN_ID
```
```sh
curl -sf "http://127.0.0.1:8080/retained-runs?runs_dir=runs" | python3 -m json.tool
curl -sf "http://127.0.0.1:8080/retained-runs/RUN_ID/reports?runs_dir=runs" | python3 -m json.tool
curl -sf "http://127.0.0.1:8080/retained-runs/RUN_ID/artifact-manifest?runs_dir=runs" | python3 -m json.tool
```
## Recovery Flow
Use this flow when the service process restarted or a browser/UI lost its job
@@ -64,8 +78,9 @@ state:
1. Identify the output directory passed to `POST /runs`.
2. Confirm whether `retention-summary.json` exists.
3. If it exists, use `guide-board runs report --runs-dir <parent>` to retrieve
report paths.
3. If it exists, use `guide-board runs report --runs-dir <parent>` or
`GET /retained-runs/{run_id}/reports?runs_dir=<parent>` to retrieve report
paths.
4. If only partial files exist, inspect `run.json`, `plan.json`, and artifacts
before rerunning.
5. Rerun into a fresh output directory when the prior status is unclear.
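The decision in steps 2 to 5 can be sketched as a small classifier. The function name `classify_run_dir` and the labels `retained`, `partial`, and `unclear` are assumptions for illustration only; the file names are the ones listed for durable run directories.

```python
from pathlib import Path


def classify_run_dir(run_dir: Path) -> str:
    """Classify an output directory for the recovery flow (hypothetical labels)."""
    if (run_dir / "retention-summary.json").is_file():
        # Step 3: a summary exists, so report paths can be retrieved.
        return "retained"
    has_partial_files = any(
        (run_dir / name).is_file() for name in ("run.json", "plan.json")
    )
    if has_partial_files or (run_dir / "artifacts").is_dir():
        # Step 4: inspect the partial files before deciding to rerun.
        return "partial"
    # Step 5: nothing conclusive on disk, rerun into a fresh directory.
    return "unclear"
```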
@@ -73,8 +88,8 @@ state:
## Future Durable Index Option
A future durable service index may be added if UI or automation workflows need
cross-restart job lookup. If added, it should remain reconstructable from run
directories and should not become the authority for assessment results.
cross-restart transport job lookup. If added, it should remain reconstructable
from run directories and should not become the authority for assessment results.
The minimum acceptable durable index would contain:


@@ -35,12 +35,14 @@ echo "==> Verifying mounted run artifacts"
for path in \
"$RUNS_DIR/sample-noop/run.json" \
"$RUNS_DIR/sample-noop/plan.json" \
"$RUNS_DIR/sample-noop/sources.lock.json" \
"$RUNS_DIR/sample-noop/retention-summary.json" \
"$RUNS_DIR/sample-noop/normalized/evidence.json" \
"$RUNS_DIR/sample-noop/normalized/findings.json" \
"$RUNS_DIR/sample-noop/normalized/mappings.json" \
"$RUNS_DIR/sample-noop/reports/assessment-package.json" \
"$RUNS_DIR/sample-noop/reports/report.md"
"$RUNS_DIR/sample-noop/reports/report.md" \
"$RUNS_DIR/sample-noop/reports/submission-package.json"
do
if [ ! -f "$path" ]; then
echo "ERROR: expected artifact missing: $path" >&2


@@ -10,7 +10,7 @@ from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
from urllib.parse import parse_qs, unquote, urlparse
from guide_board.discovery import discover_extensions
from guide_board.errors import GuideBoardError
@@ -21,6 +21,11 @@ from guide_board.planning import (
validate_assessment_profile,
validate_target_profile,
)
from guide_board.retention import (
list_retained_runs,
retained_run_report_paths,
select_retained_run,
)
@dataclass(frozen=True)
@@ -131,7 +136,7 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
def _handle(self, method: str) -> None:
parsed = urlparse(self.path)
try:
response, status_code = self._route(method, parsed.path)
response, status_code = self._route(method, parsed.path, parsed.query)
except HttpProblem as exc:
response = _error_response(exc.message, exc.__class__.__name__, exc.status_code)
status_code = exc.status_code
@@ -147,7 +152,8 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
self._send_json(status_code, response)
def _route(self, method: str, path: str) -> tuple[dict[str, Any], int]:
def _route(self, method: str, path: str, query: str = "") -> tuple[dict[str, Any], int]:
query_params = _query_params(query)
if method == "GET" and path == "/health":
return self._health(), 200
if method == "GET" and path == "/extensions":
@@ -160,6 +166,10 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
return {"runs": self.server.context.jobs.list()}, 200
if method == "POST" and path == "/runs":
return self._start_run(), 202
if method == "GET" and path == "/retained-runs":
return self._retained_runs(query_params), 200
if method == "GET" and path == "/retained-runs/latest":
return self._retained_latest(query_params), 200
run_match = _match_run_path(path)
if method == "GET" and run_match is not None:
@@ -169,6 +179,14 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
if suffix == "reports":
return self._run_reports(job_id), 200
retained_match = _match_retained_run_path(path)
if method == "GET" and retained_match is not None:
run_id, suffix = retained_match
if suffix == "reports":
return self._retained_run_reports(run_id, query_params), 200
if suffix == "artifact-manifest":
return self._retained_artifact_manifest(run_id, query_params), 200
raise HttpProblem(404, f"endpoint not found: {method} {path}")
def _health(self) -> dict[str, Any]:
@@ -261,10 +279,21 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
report_path = Path(result["report"])
package_path = Path(result["assessment_package"])
retention_path = Path(result["retention_summary"])
submission_value = result.get("submission_package")
submission_path = (
Path(submission_value)
if isinstance(submission_value, str) and submission_value
else None
)
try:
report_markdown = report_path.read_text(encoding="utf-8")
assessment_package = load_json(package_path)
retention_summary = load_json(retention_path)
submission_package = (
load_json(submission_path)
if submission_path is not None and submission_path.is_file()
else None
)
except OSError as exc:
raise HttpProblem(404, f"run report artifact is missing: {exc}") from exc
@@ -277,6 +306,7 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
"report": str(report_path),
"assessment_package": str(package_path),
"retention_summary": str(retention_path),
"submission_package": str(submission_path) if submission_path is not None else None,
},
"report": {
"path": str(report_path),
@@ -290,6 +320,69 @@ class GuideBoardRequestHandler(BaseHTTPRequestHandler):
"path": str(retention_path),
"json": retention_summary,
},
"submission_package": {
"path": str(submission_path) if submission_package is not None else None,
"json": submission_package,
},
}
    def _retained_runs(self, query: dict[str, str]) -> dict[str, Any]:
        runs_dir = _runs_dir_from_query(self.server.context.root, query)
        return {
            "runs_dir": str(runs_dir),
            "runs": list_retained_runs(runs_dir),
        }

    def _retained_latest(self, query: dict[str, str]) -> dict[str, Any]:
        runs_dir = _runs_dir_from_query(self.server.context.root, query)
        run = select_retained_run(
            runs_dir,
            target_profile_ref=query.get("target"),
            assessment_profile_ref=query.get("assessment"),
        )
        return {
            "runs_dir": str(runs_dir),
            "selection": {
                "target_profile_ref": query.get("target"),
                "assessment_profile_ref": query.get("assessment"),
            },
            "run": _retained_run_with_paths(run) if run else None,
        }

    def _retained_run_reports(self, run_id: str, query: dict[str, str]) -> dict[str, Any]:
        runs_dir = _runs_dir_from_query(self.server.context.root, query)
        run = _select_retained_run_or_404(runs_dir, run_id)
        return {
            "runs_dir": str(runs_dir),
            "run": _retained_run_with_paths(run),
        }

    def _retained_artifact_manifest(self, run_id: str, query: dict[str, str]) -> dict[str, Any]:
        runs_dir = _runs_dir_from_query(self.server.context.root, query)
        run = _select_retained_run_or_404(runs_dir, run_id)
        run_dir = _safe_run_dir(runs_dir, run)
        package_path = run_dir / "reports" / "assessment-package.json"
        if not package_path.exists():
            return {
                "runs_dir": str(runs_dir),
                "run_id": run_id,
                "run_dir": str(run_dir),
                "artifact_manifest": [],
                "compatibility": "assessment-package-missing",
            }
        package = load_json(package_path)
        artifacts = package.get("artifact_manifest", [])
        if not isinstance(artifacts, list):
            raise HttpProblem(400, f"{package_path}: artifact_manifest must be a list")
        for artifact in artifacts:
            if isinstance(artifact, dict):
                _safe_run_ref(run_dir, artifact.get("path"))
        return {
            "runs_dir": str(runs_dir),
            "run_id": run_id,
            "run_dir": str(run_dir),
            "artifact_manifest": artifacts,
            "compatibility": "current",
        }

def _read_payload(self) -> dict[str, Any]:
@@ -430,6 +523,81 @@ def _match_run_path(path: str) -> tuple[str, str | None] | None:
return None
def _match_retained_run_path(path: str) -> tuple[str, str] | None:
    parts = [unquote(part) for part in path.split("/") if part]
    if len(parts) == 3 and parts[0] == "retained-runs":
        return parts[1], parts[2]
    return None


def _query_params(query: str) -> dict[str, str]:
    parsed = parse_qs(query, keep_blank_values=False)
    params = {}
    for key, values in parsed.items():
        if values:
            params[key] = values[-1]
    return params


def _runs_dir_from_query(root: Path, query: dict[str, str]) -> Path:
    runs_dir = query.get("runs_dir")
    if not runs_dir:
        return (root / "runs").resolve()
    return _resolve_path(root, runs_dir)


def _select_retained_run_or_404(runs_dir: Path, run_id: str) -> dict[str, Any]:
    run = select_retained_run(runs_dir, run_id=run_id)
    if run is None:
        raise HttpProblem(404, f"retained run not found: {run_id}")
    return run


def _retained_run_with_paths(run: dict[str, Any] | None) -> dict[str, Any] | None:
    if run is None:
        return None
    paths = retained_run_report_paths(run)
    run_dir = Path(run["run_dir"]).resolve()
    safe_paths = {}
    for key, value in paths.items():
        path = Path(value).resolve()
        try:
            path.relative_to(run_dir)
        except ValueError as exc:
            raise HttpProblem(
                400,
                f"retained run report path escapes run directory: {value}",
            ) from exc
        safe_paths[key] = str(path)
    return {
        **run,
        "paths": dict(sorted(safe_paths.items())),
    }


def _safe_run_dir(runs_dir: Path, run: dict[str, Any]) -> Path:
    run_dir_value = run.get("run_dir")
    if not isinstance(run_dir_value, str) or not run_dir_value:
        raise HttpProblem(400, "retained run is missing run_dir")
    run_dir = Path(run_dir_value).resolve()
    try:
        run_dir.relative_to(runs_dir.resolve())
    except ValueError as exc:
        raise HttpProblem(400, f"retained run escapes runs_dir: {run_dir}") from exc
    return run_dir


def _safe_run_ref(run_dir: Path, ref: Any) -> Path:
    if not isinstance(ref, str) or not ref:
        raise HttpProblem(400, "artifact manifest entry path must be a non-empty string")
    path = (run_dir / ref).resolve()
    try:
        path.relative_to(run_dir.resolve())
    except ValueError as exc:
        raise HttpProblem(400, f"artifact path escapes run directory: {ref}") from exc
    return path


def _display_path(root: Path, path: Path) -> str:
try:
return str(path.resolve().relative_to(root.resolve()))


@@ -7,6 +7,7 @@ import time
import unittest
from tempfile import TemporaryDirectory
from pathlib import Path
from urllib.parse import quote
from guide_board.discovery import discover_extensions
from guide_board.errors import ValidationError
@@ -458,6 +459,64 @@ class CoreArchitectureTests(unittest.TestCase):
reports["assessment_package"]["json"]["run_id"],
status["result"]["run_id"],
)
self.assertEqual(
reports["submission_package"]["json"]["run_id"],
status["result"]["run_id"],
)
finally:
service.stop()

    def test_service_exposes_retained_runs_after_restart(self) -> None:
        with TemporaryDirectory() as temporary_directory:
            runs_dir = Path(temporary_directory) / "runs"
            result = run_assessment(
                ROOT,
                ROOT / "profiles" / "targets" / "sample-repository.json",
                ROOT / "profiles" / "assessments" / "sample-noop.json",
                runs_dir / "sample",
            )
            _write_unsafe_artifact_run(runs_dir / "unsafe-run")
            service = start_service(ROOT, host="127.0.0.1", port=0)
            try:
                query = f"runs_dir={quote(str(runs_dir), safe='')}"
                listing = _request_json(service, "GET", f"/retained-runs?{query}")
                self.assertEqual(listing["runs_dir"], str(runs_dir))
                self.assertIn(result["run_id"], [run["run_id"] for run in listing["runs"]])
                latest = _request_json(
                    service,
                    "GET",
                    f"/retained-runs/latest?{query}&target=sample-repository&assessment=sample-noop-assessment",
                )
                self.assertEqual(latest["run"]["run_id"], result["run_id"])
                self.assertIn("submission_package", latest["run"]["paths"])
                reports = _request_json(
                    service,
                    "GET",
                    f"/retained-runs/{result['run_id']}/reports?{query}",
                )
                self.assertEqual(
                    reports["run"]["paths"]["assessment_package"],
                    str(runs_dir / "sample" / "reports" / "assessment-package.json"),
                )
                artifacts = _request_json(
                    service,
                    "GET",
                    f"/retained-runs/{result['run_id']}/artifact-manifest?{query}",
                )
                self.assertEqual(artifacts["artifact_manifest"], [])
                self.assertEqual(artifacts["compatibility"], "current")
                unsafe = _request_json(
                    service,
                    "GET",
                    f"/retained-runs/unsafe-run/artifact-manifest?{query}",
                    expected_status=400,
                )
                self.assertIn("escapes run directory", unsafe["error"]["message"])
            finally:
                service.stop()
@@ -603,6 +662,34 @@ def _write_retention_summary(
)
def _write_unsafe_artifact_run(run_dir: Path) -> None:
    _write_retention_summary(
        run_dir,
        "unsafe-run",
        "2026-05-07T12:00:00+00:00",
        "completed",
        {"pass": 1},
        0,
        1,
    )
    reports_dir = run_dir / "reports"
    reports_dir.mkdir(parents=True, exist_ok=True)
    (reports_dir / "assessment-package.json").write_text(
        json.dumps(
            {
                "artifact_manifest": [
                    {
                        "id": "artifact:unsafe",
                        "path": "../outside.txt",
                        "checksum": "sha256:unsafe",
                    }
                ]
            }
        ),
        encoding="utf-8",
    )


def _request_json(
service: ServiceHandle,
method: str,


@@ -4,12 +4,12 @@ type: workplan
title: "Service Artifact Access And Durable Run Index"
repo: guide-board
domain: markitect
status: active
status: completed
owner: codex
planning_priority: medium
planning_order: 6
created: "2026-05-15"
updated: "2026-05-15"
updated: "2026-05-16"
state_hub_workstream_id: "ba008283-1631-467b-868e-1052c3870ab9"
---
@@ -40,7 +40,7 @@ existing run artifacts.
```task
id: GUIDE-BOARD-WP-0006-T001
status: todo
status: done
priority: high
state_hub_task_id: "4d392fc5-6a1c-46f7-9cbf-6c02bbd744c6"
```
@@ -54,11 +54,22 @@ Acceptance:
directory layout.
- Document the operational tradeoff and failure modes.
Decision:
- Keep the durable index as retained run summaries and helper scans.
- Do not add a separate service index file for the baseline.
Progress:
- Documented reconstruction from `retention-summary.json` files.
- Kept compatibility with older runs that lack newer assessment package or
submission manifest files.
## D6.2 - Service Run History And Artifact Endpoints
```task
id: GUIDE-BOARD-WP-0006-T002
status: todo
status: done
priority: high
state_hub_task_id: "8f209920-6b14-4d6f-bfa1-8f1d03bcdbf1"
```
@@ -71,11 +82,21 @@ Acceptance:
- Avoid serving arbitrary filesystem paths outside configured run directories.
- Add tests for successful retrieval and path-safety failures.
Progress:
- Added `GET /retained-runs`.
- Added `GET /retained-runs/latest`.
- Added `GET /retained-runs/{run_id}/reports`.
- Added `GET /retained-runs/{run_id}/artifact-manifest`.
- Added path containment checks for report refs and artifact manifest paths.
- Added service tests for retained history retrieval after a fresh service
process and unsafe artifact path rejection.
## D6.3 - Restart Recovery And Compatibility
```task
id: GUIDE-BOARD-WP-0006-T003
status: todo
status: done
priority: medium
state_hub_task_id: "0857e7d8-3d23-4426-b7fa-73362d7041a0"
```
@@ -89,11 +110,19 @@ Acceptance:
files.
- Update service durability documentation with examples.
Progress:
- Preserved `/runs` as in-memory job history.
- Exposed durable run results through retained-run endpoints after restart.
- Returned a compatibility marker when an older retained run lacks an
assessment package artifact manifest.
- Updated service durability and local API docs.
## D6.4 - Container And Service Acceptance Tests
```task
id: GUIDE-BOARD-WP-0006-T004
status: todo
status: done
priority: medium
state_hub_task_id: "900a70fa-65ff-4815-9c0c-31f0da4019f0"
```
@@ -106,6 +135,13 @@ Acceptance:
- Document service endpoint usage in local and container modes.
- Keep tests dependency-light.
Progress:
- Added dependency-light service tests for durable run lookup, report paths, and
artifact manifest retrieval.
- Updated container smoke artifact expectations for current run outputs.
- Documented retained-run endpoint usage in local and container modes.
## Definition Of Done
- The local service can expose retained runs and artifacts after restart.