Document service job durability contract
@@ -1,17 +1,16 @@
 <!-- custodian-brief: generated by fix-consistency — do not edit manually -->
 # Custodian Brief — guide-board
 
 **Domain:** markitect
-**Last synced:** 2026-05-15 11:49 UTC
+**Last synced:** 2026-05-15 12:35 UTC
 **State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
 
 ## Active Workstreams
 
 ### Assessment Operations Baseline
-Progress: 3/6 done | workstream_id: `fc5b1573-91b2-4a19-b6a9-dd4d17057d9b`
+Progress: 4/6 done | workstream_id: `fc5b1573-91b2-4a19-b6a9-dd4d17057d9b`
 
 **Open tasks:**
-- · D2.4 - Service Job Durability Contract `10e4003c`
 - · D2.5 - Container Smoke Acceptance `9e2e7fa7`
 - · D2.6 - External Extension Acceptance Path `65fbf1df`
 
@@ -56,5 +56,6 @@ See:
 - [docs/CONTAINER.md](docs/CONTAINER.md)
 - [docs/EXTENSION-SDK.md](docs/EXTENSION-SDK.md)
 - [docs/LOCAL-SERVICE-API.md](docs/LOCAL-SERVICE-API.md)
+- [docs/SERVICE-JOB-DURABILITY.md](docs/SERVICE-JOB-DURABILITY.md)
 - [extensions/CANDIDATES.md](extensions/CANDIDATES.md)
 - [workplans/GUIDE-BOARD-WP-0001-bootstrapping.md](workplans/GUIDE-BOARD-WP-0001-bootstrapping.md)
@@ -147,7 +147,8 @@ curl -sf http://127.0.0.1:8080/runs/JOB_ID/reports | python3 -m json.tool
 
 Service job state is currently in memory for the running service process. Run
 artifacts are durable in the output directory and can still be inspected after a
-service restart.
+service restart. See `docs/SERVICE-JOB-DURABILITY.md` for the restart and
+recovery contract.
 
 ## Status Vocabulary
 
@@ -87,7 +87,8 @@ run directory; the assessment result itself is still reported separately as
 
 ### `GET /runs`
 
-Lists known in-memory jobs for the current service process.
+Lists known in-memory jobs for the current service process. Job records are not
+durable across service restarts.
 
 ### `GET /runs/{job_id}`
 
@@ -111,4 +112,5 @@ podman run --rm -p 8080:8080 \
 ```
 
 The service keeps job state in memory. Durable run evidence remains in the
-mounted output directory.
+mounted output directory. See `docs/SERVICE-JOB-DURABILITY.md` for the explicit
+restart and recovery contract.
90
docs/SERVICE-JOB-DURABILITY.md
Normal file
@@ -0,0 +1,90 @@
# Service Job Durability

Status: draft
Created: 2026-05-15

## Decision

The guide-board local service keeps HTTP job state in memory for the baseline.
This is intentional. The service is a thin local transport over the CLI
contracts, not a workflow database.

Durable state lives in run directories:

- `run.json`
- `plan.json`
- `retention-summary.json`
- `normalized/evidence.json`
- `normalized/findings.json`
- `normalized/mappings.json`
- `reports/assessment-package.json`
- `reports/report.md`
- `artifacts/`

The durable recovery index is the set of `retention-summary.json` files under a
runs directory.
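
As an illustration of treating the summaries as an index, the following minimal sketch collects every `retention-summary.json` under a runs directory. It is not part of guide-board: the function name `load_recovery_index` is hypothetical, it assumes each run lives in a direct child of the runs directory, and because the summary schema is not described here the parsed JSON is returned unchanged.

```python
import json
from pathlib import Path


def load_recovery_index(runs_dir):
    """Map each run directory name to its parsed retention-summary.json.

    Assumption: every run lives in a direct child directory of runs_dir.
    The summary schema is not specified, so the parsed JSON is returned as-is.
    """
    index = {}
    for summary_path in sorted(Path(runs_dir).glob("*/retention-summary.json")):
        try:
            index[summary_path.parent.name] = json.loads(
                summary_path.read_text(encoding="utf-8")
            )
        except (OSError, json.JSONDecodeError):
            # Unreadable or malformed summaries are skipped rather than fatal.
            continue
    return index
```

Run directories without a summary are simply absent from the result, which matches the idea that only retained summaries are recoverable.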

## Why In-Memory Jobs Stay The Baseline

In-memory service jobs keep the first service layer dependency-light and easy to
embed in local, container, and extension-specific environments. Operators can
restart the service without migrating or repairing a service database, and the
CLI remains the source of truth for execution semantics.

This also keeps interrupted service runs easy to reason about:

- if the process exits before a run completes, the HTTP job record is gone,
- any partial run directory remains for inspection,
- completed runs are recoverable through retained run summaries,
- repeated runs should use a new output directory or an intentional overwrite
  policy chosen by the operator.

## Restart Semantics

After a service restart:

- `GET /runs` returns only jobs created since the new service process started,
- old `job_id` values are invalid,
- `GET /runs/{job_id}` cannot recover pre-restart job metadata,
- `GET /runs/{job_id}/reports` only works for jobs known to the current process,
- run artifacts from earlier service processes remain available on disk.

Operators should recover previous results with the CLI run-history commands:

```sh
PYTHONPATH=src python3 -m guide_board runs list --runs-dir runs
PYTHONPATH=src python3 -m guide_board runs latest --runs-dir runs
PYTHONPATH=src python3 -m guide_board runs report --runs-dir runs --run-id RUN_ID
```

## Recovery Flow

Use this flow when the service process restarted or a browser/UI lost its job
state:

1. Identify the output directory passed to `POST /runs`.
2. Confirm whether `retention-summary.json` exists.
3. If it exists, use `guide-board runs report --runs-dir <parent>` to retrieve
   report paths.
4. If only partial files exist, inspect `run.json`, `plan.json`, and artifacts
   before rerunning.
5. Rerun into a fresh output directory when the prior status is unclear.
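
The decision in steps 2 to 5 can be sketched as a small directory check. This is only an illustration under stated assumptions: `classify_run_dir` and its four labels are hypothetical names, not guide-board API, and the presence of `retention-summary.json` is used as the sole signal that a run is recoverable.

```python
from pathlib import Path


def classify_run_dir(output_dir):
    """Return a hypothetical label describing what recovery can do with a run dir.

    'missing'     -> the directory does not exist
    'recoverable' -> retention-summary.json is present (use the CLI report command)
    'partial'     -> some files exist but no summary (inspect before rerunning)
    'empty'       -> the directory exists but holds nothing
    """
    path = Path(output_dir)
    if not path.is_dir():
        return "missing"
    if (path / "retention-summary.json").is_file():
        return "recoverable"
    if any(path.iterdir()):
        return "partial"
    return "empty"
```

A caller would typically rerun into a fresh output directory for `missing` or `empty`, and inspect `run.json`, `plan.json`, and artifacts by hand for `partial`.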

## Future Durable Index Option

A future durable service index may be added if UI or automation workflows need
cross-restart job lookup. If added, it should remain reconstructable from run
directories and should not become the authority for assessment results.

The minimum acceptable durable index would contain:

- job id,
- request payload,
- job transport status,
- run id,
- output directory,
- result paths,
- error summary.

The index should be optional, dependency-light, and repairable by scanning
retained run summaries.
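
To make the field list concrete, here is one possible shape for a single index entry, using only the Python standard library. Nothing like this exists in guide-board yet: the class name `JobIndexRecord`, the attribute names, the optional defaults, and the one-JSON-object-per-line encoding are all assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class JobIndexRecord:
    """One hypothetical index entry; attributes mirror the minimum field list."""

    job_id: str
    request_payload: Dict[str, Any]
    transport_status: str
    run_id: Optional[str] = None          # unknown until a run is created
    output_dir: Optional[str] = None
    result_paths: List[str] = field(default_factory=list)
    error_summary: Optional[str] = None   # set only when the job failed

    def to_json_line(self) -> str:
        # One JSON object per line keeps the index appendable with plain files.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json_line(cls, line: str) -> "JobIndexRecord":
        return cls(**json.loads(line))
```

A plain append-only file of such lines would stay dependency-light, and it could be discarded and rebuilt from retained run summaries if it ever became inconsistent.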
@@ -120,7 +120,7 @@ Progress:
 
 ```task
 id: GUIDE-BOARD-WP-0002-T004
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "10e4003c-dc11-4a8e-aecc-7815559ac439"
 ```
@@ -134,6 +134,18 @@ Acceptance:
 - If durable indexing is added, keep it dependency-light and reconstructable
   from retained run artifacts.
 
+Decision:
+
+- Keep local service job state intentionally in-memory for the baseline.
+- Treat run directories and `retention-summary.json` as the durable recovery
+  source.
+
+Progress:
+
+- Added `docs/SERVICE-JOB-DURABILITY.md`.
+- Linked the contract from README, the local service API docs, and the
+  assessment operations guide.
+
 ## D2.5 - Container Smoke Acceptance
 
 ```task