generated from coulomb/repo-seed
Add guide-board pilot ingestion
@@ -3,8 +3,9 @@
Status: v0.1 (WP-0003 baseline)
Updated: 2026-05-16

This guide is the user manual for running `artifact-store` v0.1 — the
library, CLI, HTTP ingestion API, manifest surface, and retention lifecycle.
This guide is the user manual for running `artifact-store` v0.1: the library,
CLI, HTTP ingestion API, manifest surface, retention lifecycle, storage checks,
and the guide-board pilot path.

For architectural background see
[ARCHITECTURE-BLUEPRINT.md](ARCHITECTURE-BLUEPRINT.md), the ADRs under
@@ -52,6 +53,7 @@ All settings are prefixed with ``ARTIFACTSTORE_`` and read by
| `ARTIFACTSTORE_ANON_READ` | `false` | Set `true` only for local demos where read endpoints may be anonymous. |
| `ARTIFACTSTORE_API_URL` | `http://127.0.0.1:8000` | Default API base URL used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_API_TOKEN` | empty | Default bearer token used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_GUIDE_BOARD_SCHEMA` | `schemas/guide-board.run.v1.json` | Schema path used by guide-board pilot bootstrap helpers. |
| `ARTIFACTSTORE_RETENTION_CONFIG_PATH` | empty | Optional TOML file overriding retention-class default durations. |
| `ARTIFACTSTORE_RETENTION_SWEEP_INTERVAL_SECONDS` | `3600` | Default interval for external schedulers that invoke the retention sweeper. |
| `ARTIFACTSTORE_STORAGE_BACKENDS` | `local` | Comma-separated backend IDs to configure (`local`, `s3`). |
@@ -67,6 +69,9 @@ All settings are prefixed with ``ARTIFACTSTORE_`` and read by
| `ARTIFACTSTORE_S3_SSE` | empty | Optional server-side encryption value, e.g. `AES256`. |
| `ARTIFACTSTORE_S3_MULTIPART_THRESHOLD_BYTES` | `67108864` | Multipart threshold for the S3 backend. |
| `ARTIFACTSTORE_S3_MULTIPART_CHUNK_BYTES` | `8388608` | Multipart part size for the S3 backend. |
| `STATE_HUB_URL` | `http://127.0.0.1:8000` | State Hub base URL used by guide-board linkage helpers. |
| `STATE_HUB_WORKSTREAM_ID` | empty | Optional workstream id for State Hub linkage events. |
| `STATE_HUB_TASK_ID` | empty | Optional task id for State Hub linkage events. |

See [`.env.example`](../.env.example) for the canonical template.
@@ -201,6 +206,7 @@ digest, emits `v1.storage.location_verified`, and marks failed locations as
| `artifactstore manifest <package_id>` | Fetch the JSON manifest projection through the HTTP API. |
| `artifactstore retention sweep` | Run one deletion-eligibility sweep against the configured DB. |
| `artifactstore storage verify --backend <id>` | Re-read stored objects for a backend and record verification events. |
| `artifactstore guide-board ingest <run-dir>` | Ingest one guide-board run directory as an artifact package. |

The CLI is a thin client over `artifactstore.registry.Registry`
(see [ADR-0005](adr/0005-v1-tech-stack.md)).
@@ -215,6 +221,7 @@ The CLI is a thin client over `artifactstore.registry.Registry`
| `/files...` | File metadata and byte downloads, including single-range reads. |
| `/uploads...` | Upload-session wire shape for whole-body v1 uploads. |
| `/packages/{id}/retention...` | Extend retention, apply/release holds, and read retention history. |
| `POST /metadata-schemas` | Register package metadata schemas by slug. |
| `GET /events` | Long-poll event feed, CBOR by default or JSON with `Accept: application/json`. |

All non-health routes require a bearer token unless
@@ -267,6 +274,14 @@ asyncio.run(main())
Prerequisites: `make migrate-fresh` has been run so the schema and the
retention class seeds exist.

## Guide-board pilot

The guide-board pilot stores a run directory as one artifact package and records
only package identifiers in State Hub. See
[docs/pilots/guide-board.md](pilots/guide-board.md) for schema registration,
the real `~/guide-board` plus `~/open-cmis-tck` smoke procedure, and the exact
`POST /progress/` linkage payload.

## Replay / disaster recovery

Every state-changing operation writes one row to `events` and updates the
@@ -303,6 +318,8 @@ sequence order through the canonical view writer. The result is
and the v1 schema commitments.
- [ROADMAP.md](ROADMAP.md) — workplan sequencing.
- [ASSEMBLY-EXPERIMENT.md](ASSEMBLY-EXPERIMENT.md) — opt-in asm research line.
- [pilots/guide-board.md](pilots/guide-board.md) — guide-board pilot ingestion
and State Hub linkage.

### Architecture Decision Records
162
docs/pilots/guide-board.md
Normal file
@@ -0,0 +1,162 @@
# Guide-Board Pilot

Status: active pilot
Updated: 2026-05-16

This guide wires the first real producer into artifact-store. A guide-board run
directory becomes one artifact package; State Hub records the package identity
and manifest digest, but never stores artifact bytes.

## One-Time Schema Registration

Start artifact-store and register the pilot metadata schema:

```sh
cd /home/worsch/artifact-store
export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
export ARTIFACTSTORE_API_TOKEN=dev-token
python3 scripts/register-guide-board-schema.py
```

The script posts this payload shape to `POST /metadata-schemas`:

```json
{
  "slug": "guide-board.run.v1",
  "json_schema": {
    "$id": "artifactstore:schemas:guide-board.run.v1"
  }
}
```
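
For readers who want to see what such a registration request could look like in code, the following is a minimal sketch using only the Python standard library. It is not the contents of `scripts/register-guide-board-schema.py`: the function name `build_schema_request` is hypothetical, and the `Authorization: Bearer` header is an assumption based on the bearer-token requirement for non-health routes. The request is built but not sent.

```python
import json
import os
import urllib.request


def build_schema_request(
    api_url: str, token: str, slug: str, json_schema: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST /metadata-schemas request.

    Hypothetical helper: the payload keys mirror the shape documented above;
    the bearer header is an assumption about how the API authenticates.
    """
    body = json.dumps({"slug": slug, "json_schema": json_schema}).encode("utf-8")
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/metadata-schemas",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    request = build_schema_request(
        api_url=os.environ.get("ARTIFACTSTORE_API_URL", "http://127.0.0.1:8000"),
        token=os.environ.get("ARTIFACTSTORE_API_TOKEN", ""),
        slug="guide-board.run.v1",
        json_schema={"$id": "artifactstore:schemas:guide-board.run.v1"},
    )
    # Show what would be sent; call urllib.request.urlopen(request) to send it.
    print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect before anything touches a running server.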

## Ingest A Run

The local CLI path opens the configured database and storage backend directly:

```sh
artifactstore guide-board ingest /tmp/guide-board-run \
  --schema schemas/guide-board.run.v1.json
```

Output is JSON:

```json
{
  "package_id": "00000000-0000-0000-0000-000000000000",
  "manifest_digest": "blake3:...",
  "file_count": 8,
  "reused_existing": false
}
```

The helper is idempotent by guide-board `run_id`. Re-ingesting the same
finalized run returns the existing package id and manifest digest with
`reused_existing: true`.
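
A consumer of this output may want to check the idempotency contract rather than trust it. The sketch below parses the four documented fields and compares a first ingest with a re-ingest. The names `IngestResult`, `parse_ingest_result`, and `same_package` are hypothetical, and the assumption that every digest carries an algorithm prefix (as in `blake3:...`) is inferred from the sample output, not from a published contract.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestResult:
    """The four fields shown in the sample CLI output."""

    package_id: str
    manifest_digest: str
    file_count: int
    reused_existing: bool


def parse_ingest_result(text: str) -> IngestResult:
    """Parse the CLI's JSON output into a typed record."""
    data = json.loads(text)
    result = IngestResult(
        package_id=str(data["package_id"]),
        manifest_digest=str(data["manifest_digest"]),
        file_count=int(data["file_count"]),
        reused_existing=bool(data["reused_existing"]),
    )
    # Assumption: digests always look like "<algorithm>:<value>".
    if ":" not in result.manifest_digest:
        raise ValueError("manifest_digest has no algorithm prefix")
    return result


def same_package(first: IngestResult, second: IngestResult) -> bool:
    """True when a re-ingest reports the package created by the first ingest."""
    return (
        second.reused_existing
        and first.package_id == second.package_id
        and first.manifest_digest == second.manifest_digest
    )


if __name__ == "__main__":
    # Placeholder values for illustration only, not real output.
    template = (
        '{"package_id": "00000000-0000-0000-0000-000000000000", '
        '"manifest_digest": "blake3:placeholder", "file_count": 8, '
        '"reused_existing": %s}'
    )
    first = parse_ingest_result(template % "false")
    second = parse_ingest_result(template % "true")
    print(same_package(first, second))
```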

## State Hub Linkage

After ingest, record a progress event with structured `detail`. This is the
canonical linkage shape:

```sh
curl -s -X POST "$STATE_HUB_URL/progress/" \
  -H "Content-Type: application/json" \
  -d '{
    "event_type": "artifact_link",
    "author": "artifact-store",
    "workstream_id": "701c4d8c-5cf4-4a4a-ab60-1dcae53fe771",
    "task_id": "bffa3573-4a1f-4c12-8c73-6d55bd8f6297",
    "summary": "guide-board run <run_id> artifacts stored in artifact-store package <package_id>",
    "detail": {
      "producer": "guide-board",
      "artifact_store_api_url": "http://127.0.0.1:8000",
      "run_dir": "/tmp/guide-board-run",
      "run_id": "<run_id>",
      "target_profile_ref": "<target>",
      "assessment_profile_ref": "<assessment>",
      "result_status": "<status>",
      "package_id": "<package_id>",
      "manifest_digest": "<manifest_digest>",
      "file_count": 8,
      "retention_class": "release-evidence"
    }
  }'
```

Use the checked-in helper to build the same event from environment variables:

```sh
export STATE_HUB_URL=http://127.0.0.1:8000
export STATE_HUB_WORKSTREAM_ID=701c4d8c-5cf4-4a4a-ab60-1dcae53fe771
export STATE_HUB_TASK_ID=bffa3573-4a1f-4c12-8c73-6d55bd8f6297
export GUIDE_BOARD_RUN_DIR=/tmp/guide-board-run
export ARTIFACTSTORE_INGEST_RESULT_PATH=/tmp/artifactstore-guide-board-ingest.json
python3 scripts/link-guide-board-package.py
```

The helper posts only identifiers, summary metadata, and links. Artifact bytes
remain in artifact-store storage backends.
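
To make the event shape concrete, here is a sketch of how such a payload could be assembled from an ingest result and the environment variables above. It does not reproduce `scripts/link-guide-board-package.py`. The function `build_link_event` and the `run` dictionary are hypothetical, since this guide does not say how the real helper derives `run_id`, the profile refs, or `result_status` from the run directory. The fixed `release-evidence` retention class simply copies the example payload.

```python
import json


def build_link_event(ingest: dict, run: dict, env: dict) -> dict:
    """Assemble a progress event in the linkage shape documented above.

    ingest: parsed JSON output of the ingest command.
    run:    run metadata (hypothetical input; its source is an assumption).
    env:    mapping of environment variables, e.g. os.environ.
    """
    detail = {
        "producer": "guide-board",
        "artifact_store_api_url": env.get(
            "ARTIFACTSTORE_API_URL", "http://127.0.0.1:8000"
        ),
        "run_dir": env["GUIDE_BOARD_RUN_DIR"],
        "run_id": run["run_id"],
        "target_profile_ref": run["target_profile_ref"],
        "assessment_profile_ref": run["assessment_profile_ref"],
        "result_status": run["result_status"],
        "package_id": ingest["package_id"],
        "manifest_digest": ingest["manifest_digest"],
        "file_count": ingest["file_count"],
        # Assumption: copied from the example payload, not read from config.
        "retention_class": "release-evidence",
    }
    event = {
        "event_type": "artifact_link",
        "author": "artifact-store",
        "summary": (
            f"guide-board run {run['run_id']} artifacts stored in "
            f"artifact-store package {ingest['package_id']}"
        ),
        "detail": detail,
    }
    # The workstream and task ids are documented as optional settings.
    optional = (
        ("workstream_id", "STATE_HUB_WORKSTREAM_ID"),
        ("task_id", "STATE_HUB_TASK_ID"),
    )
    for key, variable in optional:
        if env.get(variable):
            event[key] = env[variable]
    return event


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    demo = build_link_event(
        ingest={"package_id": "<package_id>", "manifest_digest": "<manifest_digest>", "file_count": 8},
        run={
            "run_id": "<run_id>",
            "target_profile_ref": "<target>",
            "assessment_profile_ref": "<assessment>",
            "result_status": "<status>",
        },
        env={"GUIDE_BOARD_RUN_DIR": "/tmp/guide-board-run"},
    )
    print(json.dumps(demo, indent=2))
```

Note that nothing in the event carries file contents: only identifiers, counts, and a digest, which matches the stated rule that State Hub never stores artifact bytes.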

## Real Producer Smoke

This path uses the real guide-board core and the external `open-cmis-tck`
extension. It is expected to complete under five minutes on a developer
workstation once Python dependencies and local candidate prerequisites are in
place.

1. Produce a guide-board run:

```sh
cd /home/worsch/guide-board
mkdir -p /tmp/guide-board-artifact-store-smoke
PYTHONPATH=src python3 -m guide_board \
  --extension-dir ../open-cmis-tck \
  run \
  --target ../open-cmis-tck/profiles/targets/kontextual-cmis-compat.json \
  --assessment ../open-cmis-tck/profiles/assessments/cmis-browser-baseline.json \
  --output-dir /tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline
```

2. Start artifact-store:

```sh
cd /home/worsch/artifact-store
cp .env.example .env
make migrate-fresh
make dev
```

3. Register the schema and ingest the run:

```sh
export ARTIFACTSTORE_API_TOKEN=dev-token
python3 scripts/register-guide-board-schema.py
artifactstore guide-board ingest \
  /tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline \
  --schema schemas/guide-board.run.v1.json \
  > /tmp/artifactstore-guide-board-ingest.json
cat /tmp/artifactstore-guide-board-ingest.json
```

4. Verify the manifest:

```sh
PACKAGE_ID=$(python3 -c 'import json; print(json.load(open("/tmp/artifactstore-guide-board-ingest.json"))["package_id"])')
artifactstore manifest "$PACKAGE_ID"
```

5. Record State Hub linkage:

```sh
export STATE_HUB_URL=http://127.0.0.1:8000
export STATE_HUB_WORKSTREAM_ID=701c4d8c-5cf4-4a4a-ab60-1dcae53fe771
export STATE_HUB_TASK_ID=bffa3573-4a1f-4c12-8c73-6d55bd8f6297
export GUIDE_BOARD_RUN_DIR=/tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline
export ARTIFACTSTORE_INGEST_RESULT_PATH=/tmp/artifactstore-guide-board-ingest.json
python3 scripts/link-guide-board-package.py
```

To smoke the storage swap after enabling WP-0004 S3 settings, keep the same
guide-board ingest command and set
`ARTIFACTSTORE_STORAGE_BACKEND_ROUTES='guide-board:release-evidence=s3,*:*=local'`
before starting artifact-store.
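
The route string reads as a list of `producer:retention_class=backend` entries with `*` as a wildcard. The sketch below shows one plausible interpretation, and it is an assumption throughout: this guide does not specify the grammar, the matching order, or the behavior when nothing matches. The functions `parse_routes` and `resolve_backend` are hypothetical and are not part of artifact-store.

```python
from typing import List, Optional, Tuple

Route = Tuple[str, str, str]  # (producer, retention_class, backend)


def parse_routes(spec: str) -> List[Route]:
    """Split 'producer:class=backend,...' into route tuples (assumed grammar)."""
    routes = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        selector, backend = entry.split("=", 1)
        producer, retention_class = selector.split(":", 1)
        routes.append((producer.strip(), retention_class.strip(), backend.strip()))
    return routes


def resolve_backend(
    routes: List[Route], producer: str, retention_class: str
) -> Optional[str]:
    """Return the backend of the first matching route (assumed first-match-wins)."""
    for route_producer, route_class, backend in routes:
        producer_ok = route_producer in ("*", producer)
        class_ok = route_class in ("*", retention_class)
        if producer_ok and class_ok:
            return backend
    return None  # Assumption: no match means no backend is selected.


if __name__ == "__main__":
    routes = parse_routes("guide-board:release-evidence=s3,*:*=local")
    print(resolve_backend(routes, "guide-board", "release-evidence"))  # s3
    print(resolve_backend(routes, "guide-board", "scratch"))  # local
```

Under this reading, the catch-all `*:*=local` entry has to come last, otherwise it would shadow the more specific guide-board route.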