generated from coulomb/repo-seed
Add guide-board pilot ingestion
@@ -26,6 +26,15 @@ ARTIFACTSTORE_ANON_READ=false
ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
ARTIFACTSTORE_API_TOKEN=dev-token

# Guide-board pilot helper defaults.
ARTIFACTSTORE_GUIDE_BOARD_SCHEMA=schemas/guide-board.run.v1.json
STATE_HUB_URL=http://127.0.0.1:8000
STATE_HUB_AUTHOR=artifact-store
STATE_HUB_WORKSTREAM_ID=
STATE_HUB_TASK_ID=
GUIDE_BOARD_RUN_DIR=
ARTIFACTSTORE_INGEST_RESULT_PATH=

# Optional TOML file overriding retention class default durations.
ARTIFACTSTORE_RETENTION_CONFIG_PATH=
@@ -3,8 +3,9 @@
 Status: v0.1 (WP-0003 baseline)
 Updated: 2026-05-16

-This guide is the user manual for running `artifact-store` v0.1 — the
-library, CLI, HTTP ingestion API, manifest surface, and retention lifecycle.
+This guide is the user manual for running `artifact-store` v0.1: the library,
+CLI, HTTP ingestion API, manifest surface, retention lifecycle, storage checks,
+and the guide-board pilot path.

 For architectural background see
 [ARCHITECTURE-BLUEPRINT.md](ARCHITECTURE-BLUEPRINT.md), the ADRs under
@@ -52,6 +53,7 @@ All settings are prefixed with ``ARTIFACTSTORE_`` and read by
| `ARTIFACTSTORE_ANON_READ` | `false` | Set `true` only for local demos where read endpoints may be anonymous. |
| `ARTIFACTSTORE_API_URL` | `http://127.0.0.1:8000` | Default API base URL used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_API_TOKEN` | empty | Default bearer token used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_GUIDE_BOARD_SCHEMA` | `schemas/guide-board.run.v1.json` | Schema path used by guide-board pilot bootstrap helpers. |
| `ARTIFACTSTORE_RETENTION_CONFIG_PATH` | empty | Optional TOML file overriding retention-class default durations. |
| `ARTIFACTSTORE_RETENTION_SWEEP_INTERVAL_SECONDS` | `3600` | Default interval for external schedulers that invoke the retention sweeper. |
| `ARTIFACTSTORE_STORAGE_BACKENDS` | `local` | Comma-separated backend IDs to configure (`local`, `s3`). |
@@ -67,6 +69,9 @@ All settings are prefixed with ``ARTIFACTSTORE_`` and read by
| `ARTIFACTSTORE_S3_SSE` | empty | Optional server-side encryption value, e.g. `AES256`. |
| `ARTIFACTSTORE_S3_MULTIPART_THRESHOLD_BYTES` | `67108864` | Multipart threshold for the S3 backend. |
| `ARTIFACTSTORE_S3_MULTIPART_CHUNK_BYTES` | `8388608` | Multipart part size for the S3 backend. |
| `STATE_HUB_URL` | `http://127.0.0.1:8000` | State Hub base URL used by guide-board linkage helpers. |
| `STATE_HUB_WORKSTREAM_ID` | empty | Optional workstream id for State Hub linkage events. |
| `STATE_HUB_TASK_ID` | empty | Optional task id for State Hub linkage events. |

See [`.env.example`](../.env.example) for the canonical template.
@@ -201,6 +206,7 @@ digest, emits `v1.storage.location_verified`, and marks failed locations as
| `artifactstore manifest <package_id>` | Fetch the JSON manifest projection through the HTTP API. |
| `artifactstore retention sweep` | Run one deletion-eligibility sweep against the configured DB. |
| `artifactstore storage verify --backend <id>` | Re-read stored objects for a backend and record verification events. |
| `artifactstore guide-board ingest <run-dir>` | Ingest one guide-board run directory as an artifact package. |

The CLI is a thin client over `artifactstore.registry.Registry`
(see [ADR-0005](adr/0005-v1-tech-stack.md)).
@@ -215,6 +221,7 @@ The CLI is a thin client over `artifactstore.registry.Registry`
| `/files...` | File metadata and byte downloads, including single-range reads. |
| `/uploads...` | Upload-session wire shape for whole-body v1 uploads. |
| `/packages/{id}/retention...` | Extend retention, apply/release holds, and read retention history. |
| `POST /metadata-schemas` | Register package metadata schemas by slug. |
| `GET /events` | Long-poll event feed, CBOR by default or JSON with `Accept: application/json`. |

All non-health routes require a bearer token unless
@@ -267,6 +274,14 @@ asyncio.run(main())
Prerequisites: `make migrate-fresh` has been run so the schema and the
retention class seeds exist.

## Guide-board pilot

The guide-board pilot stores a run directory as one artifact package and records
only package identifiers in State Hub. See
[docs/pilots/guide-board.md](pilots/guide-board.md) for schema registration,
the real `~/guide-board` plus `~/open-cmis-tck` smoke procedure, and the exact
`POST /progress/` linkage payload.

## Replay / disaster recovery

Every state-changing operation writes one row to `events` and updates the
@@ -303,6 +318,8 @@ sequence order through the canonical view writer. The result is
  and the v1 schema commitments.
- [ROADMAP.md](ROADMAP.md) — workplan sequencing.
- [ASSEMBLY-EXPERIMENT.md](ASSEMBLY-EXPERIMENT.md) — opt-in asm research line.
- [pilots/guide-board.md](pilots/guide-board.md) — guide-board pilot ingestion
  and State Hub linkage.

### Architecture Decision Records

162
docs/pilots/guide-board.md
Normal file
@@ -0,0 +1,162 @@
# Guide-Board Pilot

Status: active pilot
Updated: 2026-05-16

This guide wires the first real producer into artifact-store. A guide-board run
directory becomes one artifact package; State Hub records the package identity
and manifest digest, but never stores artifact bytes.

## One-Time Schema Registration

Start artifact-store and register the pilot metadata schema:

```sh
cd /home/worsch/artifact-store
export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
export ARTIFACTSTORE_API_TOKEN=dev-token
python3 scripts/register-guide-board-schema.py
```

The script posts this payload shape to `POST /metadata-schemas`:

```json
{
  "slug": "guide-board.run.v1",
  "json_schema": {
    "$id": "artifactstore:schemas:guide-board.run.v1"
  }
}
```
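
As a rough illustration of what the registered schema demands of package
metadata, the sketch below checks a metadata object for missing and unexpected
top-level keys using only the standard library. It is not a JSON Schema
validator and is not part of artifact-store: `check_top_level_keys` and
`REQUIRED_KEYS` are hypothetical names, and the key list is copied from
`schemas/guide-board.run.v1.json`, which declares exactly these properties and
sets `additionalProperties` to `false`.

```python
"""Hypothetical pre-check for guide-board metadata; not a JSON Schema validator."""
from typing import Any

# Copied from the "required" list in schemas/guide-board.run.v1.json.
REQUIRED_KEYS = (
    "run_id",
    "target_profile_ref",
    "assessment_profile_ref",
    "result_status",
    "source_commits",
    "report_paths",
    "evidence_counts",
    "finding_counts",
)


def check_top_level_keys(metadata: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (missing, unexpected) top-level keys for a metadata object."""
    missing = sorted(key for key in REQUIRED_KEYS if key not in metadata)
    # The schema declares only the required properties and forbids extras.
    unexpected = sorted(key for key in metadata if key not in REQUIRED_KEYS)
    return missing, unexpected


if __name__ == "__main__":
    sample = {"run_id": "run-1", "result_status": "passed", "extra": 1}
    missing, unexpected = check_top_level_keys(sample)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

A check like this only catches key-level mistakes early; value types and
nested constraints still need real schema validation.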

## Ingest A Run

The local CLI path opens the configured database and storage backend directly:

```sh
artifactstore guide-board ingest /tmp/guide-board-run \
  --schema schemas/guide-board.run.v1.json
```

Output is JSON:

```json
{
  "package_id": "00000000-0000-0000-0000-000000000000",
  "manifest_digest": "blake3:...",
  "file_count": 8,
  "reused_existing": false
}
```

The helper is idempotent by guide-board `run_id`. Re-ingesting the same
finalized run returns the existing package id and manifest digest with
`reused_existing: true` and `file_count: 0`, because the reuse path ingests no
files.
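
The reuse rule can be sketched in isolation. The example below is a simplified
model rather than the registry code: `PackageRecord` and `find_reusable` are
hypothetical stand-ins, and the only behaviour taken from this commit is that a
package is reused when it is `finalized` and already has a manifest digest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PackageRecord:
    """Hypothetical stand-in for a package row returned by the registry."""

    id: str
    run_id: str
    status: str
    manifest_digest_hex: Optional[str]


def find_reusable(packages: list[PackageRecord], run_id: str) -> Optional[PackageRecord]:
    """Return the first finalized package with a digest for run_id, if any."""
    for package in packages:
        if package.run_id != run_id:
            continue
        # Packages left unfinalized by an interrupted ingest are not reused.
        if package.status == "finalized" and package.manifest_digest_hex:
            return package
    return None


if __name__ == "__main__":
    rows = [
        PackageRecord("pkg-1", "run-1", "created", None),
        PackageRecord("pkg-2", "run-1", "finalized", "abc123"),
    ]
    print(find_reusable(rows, "run-1"))
```

Under this rule a run whose earlier ingest never reached finalization gets a
fresh package on the next attempt instead of returning a half-written one.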

## State Hub Linkage

After ingest, record a progress event with structured `detail`. This is the
canonical linkage shape:

```sh
curl -s -X POST "$STATE_HUB_URL/progress/" \
  -H "Content-Type: application/json" \
  -d '{
    "event_type": "artifact_link",
    "author": "artifact-store",
    "workstream_id": "701c4d8c-5cf4-4a4a-ab60-1dcae53fe771",
    "task_id": "bffa3573-4a1f-4c12-8c73-6d55bd8f6297",
    "summary": "guide-board run <run_id> artifacts stored in artifact-store package <package_id>",
    "detail": {
      "producer": "guide-board",
      "artifact_store_api_url": "http://127.0.0.1:8000",
      "run_dir": "/tmp/guide-board-run",
      "run_id": "<run_id>",
      "target_profile_ref": "<target>",
      "assessment_profile_ref": "<assessment>",
      "result_status": "<status>",
      "package_id": "<package_id>",
      "manifest_digest": "<manifest_digest>",
      "file_count": 8,
      "retention_class": "release-evidence"
    }
  }'
```

Use the checked-in helper to build the same event from environment variables:

```sh
export STATE_HUB_URL=http://127.0.0.1:8000
export STATE_HUB_WORKSTREAM_ID=701c4d8c-5cf4-4a4a-ab60-1dcae53fe771
export STATE_HUB_TASK_ID=bffa3573-4a1f-4c12-8c73-6d55bd8f6297
export GUIDE_BOARD_RUN_DIR=/tmp/guide-board-run
export ARTIFACTSTORE_INGEST_RESULT_PATH=/tmp/artifactstore-guide-board-ingest.json
python3 scripts/link-guide-board-package.py
```

The helper posts only identifiers, summary metadata, and links. Artifact bytes
remain in artifact-store storage backends. It adds `retention_class` to
`detail` only when `ARTIFACTSTORE_RETENTION_CLASS` is set.

## Real Producer Smoke

This path uses the real guide-board core and the external `open-cmis-tck`
extension. It is expected to complete in under five minutes on a developer
workstation once Python dependencies and local candidate prerequisites are in
place.

1. Produce a guide-board run:

   ```sh
   cd /home/worsch/guide-board
   mkdir -p /tmp/guide-board-artifact-store-smoke
   PYTHONPATH=src python3 -m guide_board \
     --extension-dir ../open-cmis-tck \
     run \
     --target ../open-cmis-tck/profiles/targets/kontextual-cmis-compat.json \
     --assessment ../open-cmis-tck/profiles/assessments/cmis-browser-baseline.json \
     --output-dir /tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline
   ```

2. Start artifact-store:

   ```sh
   cd /home/worsch/artifact-store
   cp .env.example .env
   make migrate-fresh
   make dev
   ```

3. Register the schema and ingest the run:

   ```sh
   export ARTIFACTSTORE_API_TOKEN=dev-token
   python3 scripts/register-guide-board-schema.py
   artifactstore guide-board ingest \
     /tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline \
     --schema schemas/guide-board.run.v1.json \
     > /tmp/artifactstore-guide-board-ingest.json
   cat /tmp/artifactstore-guide-board-ingest.json
   ```

4. Verify the manifest:

   ```sh
   PACKAGE_ID=$(python3 -c 'import json; print(json.load(open("/tmp/artifactstore-guide-board-ingest.json"))["package_id"])')
   artifactstore manifest "$PACKAGE_ID"
   ```

5. Record State Hub linkage:

   ```sh
   export STATE_HUB_URL=http://127.0.0.1:8000
   export STATE_HUB_WORKSTREAM_ID=701c4d8c-5cf4-4a4a-ab60-1dcae53fe771
   export STATE_HUB_TASK_ID=bffa3573-4a1f-4c12-8c73-6d55bd8f6297
   export GUIDE_BOARD_RUN_DIR=/tmp/guide-board-artifact-store-smoke/open-cmis-tck-baseline
   export ARTIFACTSTORE_INGEST_RESULT_PATH=/tmp/artifactstore-guide-board-ingest.json
   python3 scripts/link-guide-board-package.py
   ```
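
Steps 4 and 5 both depend on the ingest result file written in step 3. A
minimal sketch of a sanity check for that file is shown here, assuming the
output shape documented under "Ingest A Run"; `validate_ingest_result` and
`load_ingest_result` are hypothetical helpers, not functions shipped in this
repository.

```python
import json
from pathlib import Path
from typing import Any


def validate_ingest_result(payload: Any) -> dict[str, Any]:
    """Check the fields that the manifest and linkage steps rely on."""
    if not isinstance(payload, dict):
        raise ValueError("ingest result must be a JSON object")
    for key in ("package_id", "manifest_digest"):
        value = payload.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"ingest result is missing a non-empty {key!r}")
    # The CLI emits digests in the form "blake3:<hex>".
    if not payload["manifest_digest"].startswith("blake3:"):
        raise ValueError("manifest_digest is expected to start with 'blake3:'")
    return payload


def load_ingest_result(path: Path) -> dict[str, Any]:
    """Read the JSON file written by the ingest command and validate it."""
    return validate_ingest_result(json.loads(path.read_text(encoding="utf-8")))
```

Calling `load_ingest_result(Path("/tmp/artifactstore-guide-board-ingest.json"))`
before step 5 surfaces an empty or truncated result file before anything is
posted to State Hub.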

To smoke the storage swap after enabling WP-0004 S3 settings, keep the same
guide-board ingest command and set
`ARTIFACTSTORE_STORAGE_BACKEND_ROUTES='guide-board:release-evidence=s3,*:*=local'`
before starting artifact-store.
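
One way to read the route string above is as a comma-separated list of
`producer:retention_class=backend` entries. The sketch below parses that shape
and picks a backend. The matching semantics are an assumption made for
illustration (first entry wins, `*` matches anything); this commit does not
show how artifact-store actually evaluates routes, and `parse_routes` and
`pick_backend` are hypothetical names.

```python
from typing import Optional

Route = tuple[str, str, str]  # (producer, retention_class, backend)


def parse_routes(raw: str) -> list[Route]:
    """Split 'producer:retention_class=backend' entries into tuples."""
    routes: list[Route] = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        selector, equals, backend = entry.partition("=")
        producer, colon, retention_class = selector.partition(":")
        if not equals or not colon or not backend:
            raise ValueError(f"malformed route entry: {entry!r}")
        routes.append((producer, retention_class, backend))
    return routes


def pick_backend(routes: list[Route], producer: str, retention_class: str) -> Optional[str]:
    """ASSUMED semantics: the first entry that matches wins; '*' is a wildcard."""
    for route_producer, route_class, backend in routes:
        if route_producer in ("*", producer) and route_class in ("*", retention_class):
            return backend
    return None


if __name__ == "__main__":
    routes = parse_routes("guide-board:release-evidence=s3,*:*=local")
    print(pick_backend(routes, "guide-board", "release-evidence"))
    print(pick_backend(routes, "other-producer", "scratch"))
```

Under these assumed rules the guide-board release-evidence package would land
on `s3` while every other package stays on `local`, which is the swap the
smoke test is meant to exercise.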
42
schemas/guide-board.run.v1.json
Normal file
@@ -0,0 +1,42 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "artifactstore:schemas:guide-board.run.v1",
  "title": "Guide-board run metadata",
  "type": "object",
  "additionalProperties": false,
  "required": [
    "run_id",
    "target_profile_ref",
    "assessment_profile_ref",
    "result_status",
    "source_commits",
    "report_paths",
    "evidence_counts",
    "finding_counts"
  ],
  "properties": {
    "run_id": { "type": "string", "minLength": 1 },
    "target_profile_ref": { "type": "string", "minLength": 1 },
    "assessment_profile_ref": { "type": "string", "minLength": 1 },
    "result_status": { "type": "string", "minLength": 1 },
    "source_commits": {
      "type": "object",
      "additionalProperties": {
        "type": "string",
        "minLength": 7
      }
    },
    "report_paths": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 }
    },
    "evidence_counts": {
      "type": "object",
      "additionalProperties": { "type": "integer", "minimum": 0 }
    },
    "finding_counts": {
      "type": "object",
      "additionalProperties": { "type": "integer", "minimum": 0 }
    }
  }
}
133
scripts/link-guide-board-package.py
Normal file
@@ -0,0 +1,133 @@
#!/usr/bin/env python3
"""Record guide-board artifact package linkage in State Hub."""

from __future__ import annotations

import json
import os
import urllib.error
import urllib.request
from pathlib import Path
from typing import Any


def main() -> None:
    state_hub_url = _env("STATE_HUB_URL", "http://127.0.0.1:8000").rstrip("/")
    artifact_api_url = _env("ARTIFACTSTORE_API_URL", "http://127.0.0.1:8000").rstrip("/")
    run_dir = Path(_required("GUIDE_BOARD_RUN_DIR"))
    run_json = _read_json(run_dir / "run.json")
    retention_summary = _read_json(run_dir / "retention-summary.json")
    ingest_result = _ingest_result()

    package_id = _env("ARTIFACTSTORE_PACKAGE_ID") or _required_from(
        ingest_result,
        "package_id",
        "ARTIFACTSTORE_PACKAGE_ID",
    )
    manifest_digest = _env("ARTIFACTSTORE_MANIFEST_DIGEST") or _required_from(
        ingest_result,
        "manifest_digest",
        "ARTIFACTSTORE_MANIFEST_DIGEST",
    )
    run_id = _env("GUIDE_BOARD_RUN_ID") or str(
        run_json.get("run_id") or run_json.get("id") or retention_summary.get("run_id")
    )
    summary = retention_summary.get("summary", {})
    if not isinstance(summary, dict):
        summary = {}
    result_status = _env("GUIDE_BOARD_RESULT_STATUS") or str(
        run_json.get("result_status") or run_json.get("status") or summary.get("status")
    )

    detail: dict[str, Any] = {
        "producer": "guide-board",
        "artifact_store_api_url": artifact_api_url,
        "run_dir": str(run_dir),
        "run_id": run_id,
        "target_profile_ref": str(run_json["target_profile_ref"]),
        "assessment_profile_ref": str(run_json["assessment_profile_ref"]),
        "result_status": result_status,
        "package_id": package_id,
        "manifest_digest": manifest_digest,
    }
    if "file_count" in ingest_result:
        detail["file_count"] = ingest_result["file_count"]
    retention_class = _env("ARTIFACTSTORE_RETENTION_CLASS")
    if retention_class:
        detail["retention_class"] = retention_class

    payload: dict[str, Any] = {
        "event_type": _env("STATE_HUB_EVENT_TYPE", "artifact_link"),
        "author": _env("STATE_HUB_AUTHOR", "artifact-store"),
        "summary": _env(
            "STATE_HUB_SUMMARY",
            f"guide-board run {run_id} artifacts stored in artifact-store package {package_id}",
        ),
        "detail": detail,
    }
    for field, env_name in (
        ("topic_id", "STATE_HUB_TOPIC_ID"),
        ("workstream_id", "STATE_HUB_WORKSTREAM_ID"),
        ("task_id", "STATE_HUB_TASK_ID"),
        ("session_id", "STATE_HUB_SESSION_ID"),
    ):
        value = _env(env_name)
        if value:
            payload[field] = value

    request = urllib.request.Request(
        f"{state_hub_url}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            print(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        detail_text = exc.read().decode("utf-8", errors="replace")
        raise SystemExit(f"HTTP {exc.code}: {detail_text}") from exc


def _env(name: str, default: str = "") -> str:
    return os.environ.get(name, default)


def _required(name: str) -> str:
    value = _env(name)
    if not value:
        raise SystemExit(f"missing required environment variable: {name}")
    return value


def _required_from(payload: dict[str, Any], key: str, env_name: str) -> str:
    value = payload.get(key)
    if isinstance(value, str) and value:
        return value
    raise SystemExit(f"missing {key!r}; set {env_name} or ARTIFACTSTORE_INGEST_RESULT_PATH")


def _ingest_result() -> dict[str, Any]:
    raw_json = _env("ARTIFACTSTORE_INGEST_RESULT_JSON")
    if raw_json:
        payload = json.loads(raw_json)
        if not isinstance(payload, dict):
            raise SystemExit("ARTIFACTSTORE_INGEST_RESULT_JSON must be a JSON object")
        return payload

    result_path = _env("ARTIFACTSTORE_INGEST_RESULT_PATH")
    if result_path:
        return _read_json(Path(result_path))
    return {}


def _read_json(path: Path) -> dict[str, Any]:
    with path.open("r", encoding="utf-8") as fh:
        payload = json.load(fh)
    if not isinstance(payload, dict):
        raise SystemExit(f"{path} must contain a JSON object")
    return payload


if __name__ == "__main__":
    main()
44
scripts/register-guide-board-schema.py
Normal file
@@ -0,0 +1,44 @@
#!/usr/bin/env python3
"""Register the guide-board pilot metadata schema through the HTTP API."""

from __future__ import annotations

import json
import os
import urllib.error
import urllib.request
from pathlib import Path

SCHEMA_SLUG = "guide-board.run.v1"


def main() -> None:
    api_url = os.environ.get("ARTIFACTSTORE_API_URL", "http://127.0.0.1:8000").rstrip("/")
    token = os.environ["ARTIFACTSTORE_API_TOKEN"]
    schema_path = Path(
        os.environ.get("ARTIFACTSTORE_GUIDE_BOARD_SCHEMA", "schemas/guide-board.run.v1.json")
    )
    payload = {
        "slug": SCHEMA_SLUG,
        "json_schema": json.loads(schema_path.read_text(encoding="utf-8")),
    }
    request = urllib.request.Request(
        f"{api_url}/metadata-schemas",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            print(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", errors="replace")
        raise SystemExit(f"HTTP {exc.code}: {detail}") from exc


if __name__ == "__main__":
    main()
@@ -52,6 +52,12 @@ class PackageCreate(BaseModel):
    subject: str = Field(min_length=1)
    retention_class: str = Field(min_length=1)
    metadata: dict[str, Any] = Field(default_factory=dict)
    metadata_schema_slug: str | None = None


class MetadataSchemaCreate(BaseModel):
    slug: str = Field(min_length=1)
    json_schema: dict[str, Any]


class UploadCreate(BaseModel):
@@ -224,6 +230,24 @@ def create_app(settings: Settings | None = None) -> FastAPI:
        classes = await registry.list_retention_classes()
        return {"retention_classes": [_retention_class_dict(c) for c in classes]}

    @application.post("/metadata-schemas", status_code=status.HTTP_201_CREATED)
    async def register_metadata_schema(
        body: MetadataSchemaCreate,
        _actor: str = Depends(require_write_auth),
        registry: Registry = Depends(get_registry),
    ) -> dict[str, Any]:
        schema_id = await registry.register_metadata_schema(
            slug=body.slug,
            json_schema=body.json_schema,
        )
        schema = await registry.get_metadata_schema(body.slug)
        return {
            "id": str(schema_id),
            "slug": schema.slug,
            "json_schema": schema.json_schema,
            "created_at": _iso(schema.created_at),
        }

    @application.post("/packages", status_code=status.HTTP_201_CREATED)
    async def create_package(
        body: PackageCreate,
@@ -238,6 +262,7 @@ def create_app(settings: Settings | None = None) -> FastAPI:
                retention_class=body.retention_class,
                actor=actor,
                metadata=body.metadata,
                metadata_schema_slug=body.metadata_schema_slug,
            )
            return _package_dict(await registry.get_package(package_id))
        except ValueError as exc:
@@ -35,8 +35,10 @@ app = typer.Typer(
)
retention_app = typer.Typer(help="Retention lifecycle commands", no_args_is_help=True)
storage_app = typer.Typer(help="Storage backend commands", no_args_is_help=True)
guide_board_app = typer.Typer(help="Guide-board pilot commands", no_args_is_help=True)
app.add_typer(retention_app, name="retention")
app.add_typer(storage_app, name="storage")
app.add_typer(guide_board_app, name="guide-board")


@app.callback()
@@ -208,6 +210,28 @@ def storage_verify(
    )


@guide_board_app.command("ingest")
def guide_board_ingest(
    run_dir: Path = typer.Argument(
        ...,
        exists=True,
        file_okay=False,
        dir_okay=True,
        readable=True,
        help="Guide-board run directory.",
    ),
    schema_path: Path = typer.Option(
        Path("schemas/guide-board.run.v1.json"),
        "--schema",
        help="Path to the guide-board metadata schema JSON.",
    ),
) -> None:
    """Ingest a guide-board run directory through the local registry."""
    settings = get_settings()
    result = asyncio.run(_guide_board_ingest_async(settings, run_dir, schema_path))
    typer.echo(json.dumps(result, indent=2))


# ---- internals -------------------------------------------------------------


@@ -286,6 +310,34 @@ async def _storage_verify_async(
    ]


async def _guide_board_ingest_async(
    settings: Settings,
    run_dir: Path,
    schema_path: Path,
) -> dict[str, Any]:
    from artifactstore.app import build_registry
    from artifactstore.pilots.guide_board import GUIDE_BOARD_SCHEMA_SLUG, ingest_run

    registry: Registry = build_registry(settings)
    try:
        schema = json.loads(schema_path.read_text(encoding="utf-8"))
        if not isinstance(schema, dict):
            raise click.BadParameter(f"schema must be a JSON object: {schema_path}")
        await registry.register_metadata_schema(
            slug=GUIDE_BOARD_SCHEMA_SLUG,
            json_schema=schema,
        )
        result = await ingest_run(run_dir, registry=registry)
    finally:
        await registry.dispose()
    return {
        "package_id": result.package_id,
        "manifest_digest": result.manifest_digest,
        "file_count": result.file_count,
        "reused_existing": result.reused_existing,
    }


def _http_json(
    method: str,
    base_url: str,
@@ -68,7 +68,9 @@ async def _apply_package_created(connection: AsyncConnection, event: Event) -> N
         producer=payload["producer"],
         subject=payload["subject"],
         retention_class=payload["retention_class"],
-        metadata_schema_id=None,
+        metadata_schema_id=UUID(payload["metadata_schema_id"])
+        if payload.get("metadata_schema_id")
+        else None,
         metadata=payload.get("metadata", {}),
         status="created",
         manifest_digest=None,
1
src/artifactstore/pilots/__init__.py
Normal file
@@ -0,0 +1 @@
"""Pilot producer integrations."""
308
src/artifactstore/pilots/guide_board.py
Normal file
@@ -0,0 +1,308 @@
"""Guide-board pilot ingestion helper."""

from __future__ import annotations

import json
import mimetypes
import subprocess
from collections.abc import AsyncIterator
from dataclasses import dataclass
from pathlib import Path
from typing import Any

from artifactstore.registry import Registry

__all__ = ["GUIDE_BOARD_SCHEMA_SLUG", "GuideBoardIngestResult", "ingest_run"]

GUIDE_BOARD_SCHEMA_SLUG = "guide-board.run.v1"
CORE_RUN_PATHS = (
    "run.json",
    "retention-summary.json",
    "plan.json",
    "sources.lock.json",
    "target-profile.snapshot.json",
    "assessment-profile.snapshot.json",
    "normalized/evidence.json",
    "normalized/findings.json",
    "normalized/mappings.json",
    "reports/fragments.json",
    "reports/submission-package.json",
    "exports/export-manifest.json",
)


@dataclass(frozen=True, slots=True)
class GuideBoardIngestResult:
    package_id: str
    manifest_digest: str
    file_count: int
    reused_existing: bool = False


async def ingest_run(
    run_dir: str | Path,
    *,
    registry: Registry,
    actor: str = "guide-board",
    metadata_schema_slug: str = GUIDE_BOARD_SCHEMA_SLUG,
) -> GuideBoardIngestResult:
    """Ingest one guide-board run directory into artifact-store."""
    root = Path(run_dir)
    run_json = _read_json(root / "run.json")
    retention_summary = _read_json(root / "retention-summary.json")
    source_lock = _read_json_if_exists(root / "sources.lock.json")
    package_manifest_path = root / "reports" / "assessment-package.json"
    package_manifest = _read_json(package_manifest_path)

    metadata = _metadata(run_json, retention_summary, source_lock)
    run_id = str(metadata["run_id"])
    existing = await registry.list_packages(
        producer="guide-board",
        metadata_key="run_id",
        metadata_value=run_id,
    )
    for package in existing:
        if package.status == "finalized" and package.manifest_digest_hex:
            return GuideBoardIngestResult(
                package_id=str(package.id),
                manifest_digest=f"blake3:{package.manifest_digest_hex}",
                file_count=0,
                reused_existing=True,
            )

    package_id = await registry.create_package(
        name=f"guide-board run {run_id}",
        producer="guide-board",
        subject=str(metadata["target_profile_ref"]),
        retention_class=str(retention_summary.get("retention_class", "release-evidence")),
        actor=actor,
        metadata=metadata,
        metadata_schema_slug=metadata_schema_slug,
    )

    paths = _declared_paths(package_manifest)
    paths.update(_retained_report_paths(retention_summary))
    paths.add("reports/assessment-package.json")
    for rel_path in CORE_RUN_PATHS:
        if (root / rel_path).is_file():
            paths.add(rel_path)
    for rel_path in sorted(paths):
        source = root / rel_path
        await registry.ingest_file(
            package_id,
            relative_path=rel_path,
            media_type=mimetypes.guess_type(source.name)[0] or "application/octet-stream",
            stream=_file_chunks(source),
            actor=actor,
        )

    await registry.finalize_package(package_id, actor=actor)
    package = await registry.get_package(package_id)
    if package.manifest_digest_hex is None:
        raise RuntimeError(f"package {package_id} finalized without manifest digest")
    return GuideBoardIngestResult(
        package_id=str(package_id),
        manifest_digest=f"blake3:{package.manifest_digest_hex}",
        file_count=len(paths),
    )


def _metadata(
    run_json: dict[str, Any],
    retention_summary: dict[str, Any],
    source_lock: dict[str, Any] | None,
) -> dict[str, Any]:
    summary = retention_summary.get("summary", {})
    if not isinstance(summary, dict):
        summary = {}
    return {
        "run_id": str(run_json.get("run_id") or run_json.get("id") or retention_summary["run_id"]),
        "target_profile_ref": str(run_json["target_profile_ref"]),
        "assessment_profile_ref": str(run_json["assessment_profile_ref"]),
        "result_status": str(
            run_json.get("result_status") or run_json.get("status") or summary.get("status")
        ),
        "source_commits": _source_commits(run_json, source_lock),
        "report_paths": sorted(_retained_report_paths(retention_summary)),
        "evidence_counts": _evidence_counts(retention_summary, summary),
        "finding_counts": _finding_counts(retention_summary, summary),
    }


def _declared_paths(package_manifest: dict[str, Any]) -> set[str]:
    paths: set[str] = set()
    raw_files = package_manifest.get("files", [])
    if raw_files is not None and not isinstance(raw_files, list):
        raise ValueError("assessment-package.json 'files' must be a list")
    for entry in raw_files or []:
        if isinstance(entry, str):
            paths.add(entry)
        elif isinstance(entry, dict) and isinstance(entry.get("path"), str):
            paths.add(entry["path"])
        else:
            raise ValueError(f"invalid assessment package file entry: {entry!r}")

    raw_artifacts = package_manifest.get("artifact_manifest", [])
    if raw_artifacts is not None and not isinstance(raw_artifacts, list):
        raise ValueError("assessment-package.json 'artifact_manifest' must be a list")
    for entry in raw_artifacts or []:
        if isinstance(entry, dict) and isinstance(entry.get("path"), str):
            paths.add(entry["path"])
        else:
            raise ValueError(f"invalid assessment package artifact entry: {entry!r}")
    return paths


def _retained_report_paths(retention_summary: dict[str, Any]) -> set[str]:
    paths: set[str] = set()
    for key in ("report_paths", "report_refs", "export_refs"):
        raw_paths = retention_summary.get(key, [])
        if not isinstance(raw_paths, list):
            continue
        paths.update(path for path in raw_paths if isinstance(path, str) and path)
    return paths


def _source_commits(
    run_json: dict[str, Any],
    source_lock: dict[str, Any] | None,
) -> dict[str, str]:
    raw = run_json.get("source_commits")
    if isinstance(raw, dict):
        return {str(key): str(value) for key, value in raw.items()}

    commits: dict[str, str] = {}
    if source_lock is not None:
        for label, path in _source_paths(source_lock).items():
            commit = _git_head(path)
            if commit is not None:
                commits[label] = commit
    if commits:
        return commits

    # source_lock is None when sources.lock.json is absent; skip fingerprints then.
    fingerprints = _source_fingerprints(source_lock) if source_lock is not None else {}
    if fingerprints:
        return fingerprints

    return {"unknown": "unrecorded-source"}
|
||||
|
||||
|
||||
def _source_paths(source_lock: dict[str, Any]) -> dict[str, Path]:
    paths: dict[str, Path] = {}
    profiles = source_lock.get("profiles", {})
    if isinstance(profiles, dict):
        for key, value in profiles.items():
            if isinstance(value, dict) and isinstance(value.get("path"), str):
                paths[f"profile:{key}"] = Path(value["path"])

    extensions = source_lock.get("extensions", [])
    if isinstance(extensions, list):
        for entry in extensions:
            if not isinstance(entry, dict):
                continue
            extension_id = str(entry.get("id") or "unknown-extension")
            raw_path = entry.get("path")
            if isinstance(raw_path, str) and Path(raw_path).is_absolute():
                paths[f"extension:{extension_id}"] = Path(raw_path)
    return paths


def _git_head(path: Path) -> str | None:
    try:
        completed = subprocess.run(
            ["git", "-C", str(path.parent if path.is_file() else path), "rev-parse", "HEAD"],
            check=True,
            capture_output=True,
            text=True,
            timeout=5,
        )
    except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
        return None
    commit = completed.stdout.strip()
    return commit or None


def _source_fingerprints(source_lock: dict[str, Any]) -> dict[str, str]:
    fingerprints: dict[str, str] = {}
    for key, value in source_lock.items():
        if key == "id" and isinstance(value, str):
            fingerprints["source_lock"] = value

    profiles = source_lock.get("profiles", {})
    if isinstance(profiles, dict):
        for key, value in profiles.items():
            if isinstance(value, dict) and isinstance(value.get("checksum"), str):
                fingerprints[f"profile:{key}"] = value["checksum"]

    extensions = source_lock.get("extensions", [])
    if isinstance(extensions, list):
        for entry in extensions:
            if isinstance(entry, dict) and isinstance(entry.get("manifest_checksum"), str):
                fingerprints[f"extension:{entry.get('id', 'unknown-extension')}"] = entry[
                    "manifest_checksum"
                ]
    return fingerprints
def _evidence_counts(
    retention_summary: dict[str, Any],
    summary: dict[str, Any],
) -> dict[str, int]:
    raw = retention_summary.get("evidence_counts")
    if isinstance(raw, dict):
        return _int_mapping(raw)
    raw_evidence = summary.get("evidence_results")
    if isinstance(raw_evidence, dict):
        return _int_mapping(raw_evidence)
    return {}


def _finding_counts(
    retention_summary: dict[str, Any],
    summary: dict[str, Any],
) -> dict[str, int]:
    raw = retention_summary.get("finding_counts")
    if isinstance(raw, dict):
        return _int_mapping(raw)
    keys = (
        "finding_count",
        "unexpected_findings",
        "expected_findings",
        "waived_findings",
        "challenged_findings",
        "authority_exclusions",
        "unresolved_defects",
        "unresolved_review_items",
    )
    return _int_mapping({key: summary[key] for key in keys if key in summary})


def _int_mapping(raw: dict[str, Any]) -> dict[str, int]:
    return {
        str(key): int(value)
        for key, value in raw.items()
        if isinstance(value, int) and not isinstance(value, bool)
    }


def _read_json(path: Path) -> dict[str, Any]:
    with path.open("r", encoding="utf-8") as fh:
        payload = json.load(fh)
    if not isinstance(payload, dict):
        raise ValueError(f"{path} must contain a JSON object")
    return payload


def _read_json_if_exists(path: Path) -> dict[str, Any] | None:
    if not path.exists():
        return None
    return _read_json(path)


async def _file_chunks(path: Path, chunk_size: int = 64 * 1024) -> AsyncIterator[bytes]:
    with path.open("rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk
@@ -33,6 +33,7 @@ from artifactstore.dataplane.spi import DataPlane, IngestHints
from artifactstore.db.schema import (
    artifact_files,
    artifact_packages,
    metadata_schemas,
    retention_classes,
    retention_state,
    storage_locations,
@@ -70,6 +71,7 @@ __all__ = [
    "FileNotFoundError",
    "FileRecord",
    "IllegalPackageStateError",
    "MetadataSchemaRecord",
    "PackageNotFoundError",
    "PackageRecord",
    "Registry",
@@ -100,6 +102,16 @@ class RetentionStateError(ValueError):
    """Raised when a retention lifecycle operation is invalid."""


@dataclass(frozen=True, slots=True)
class MetadataSchemaRecord:
    """Registered package metadata schema."""

    id: UUID
    slug: str
    json_schema: dict[str, Any]
    created_at: datetime | None


@dataclass(frozen=True, slots=True)
class PackageRecord:
    """Materialised package row projected into the registry API."""
@@ -208,9 +220,15 @@ class Registry:
        retention_class: str,
        actor: str,
        metadata: dict[str, Any] | None = None,
        metadata_schema_slug: str | None = None,
    ) -> UUID:
        """Create a new package; returns its ``UUID``."""
        retention_class_row = await self._get_retention_class(retention_class)
        package_metadata = metadata or {}
        metadata_schema_id = await self._validate_metadata_schema(
            metadata_schema_slug,
            package_metadata,
        )
        package_id = uuid.uuid4()
        payload = cbor2.dumps(
            {
@@ -218,7 +236,8 @@ class Registry:
                 "producer": producer,
                 "subject": subject,
                 "retention_class": retention_class,
-                "metadata": metadata or {},
+                "metadata": package_metadata,
+                "metadata_schema_id": str(metadata_schema_id) if metadata_schema_id else None,
             },
             canonical=True,
         )
@@ -513,6 +532,48 @@ class Registry:
                for r in rows
            ]

    async def register_metadata_schema(
        self,
        *,
        slug: str,
        json_schema: dict[str, Any],
    ) -> UUID:
        """Register a package metadata JSON Schema, idempotent by slug."""
        schema_id = uuid.uuid4()
        async with self._engine.begin() as conn:
            existing = (
                await conn.execute(
                    select(metadata_schemas.c.id).where(metadata_schemas.c.slug == slug)
                )
            ).first()
            if existing is not None:
                return UUID(str(existing.id))
            await conn.execute(
                metadata_schemas.insert().values(
                    id=schema_id,
                    slug=slug,
                    json_schema=json_schema,
                )
            )
        return schema_id

    async def get_metadata_schema(self, slug: str) -> MetadataSchemaRecord:
        """Return one registered metadata schema by slug."""
        async with self._engine.connect() as conn:
            row = (
                await conn.execute(
                    select(metadata_schemas).where(metadata_schemas.c.slug == slug)
                )
            ).first()
        if row is None:
            raise KeyError(f"metadata schema not found: {slug}")
        return MetadataSchemaRecord(
            id=row.id,
            slug=row.slug,
            json_schema=dict(row.json_schema),
            created_at=row.created_at,
        )

    async def get_retention_state(self, package_id: UUID) -> RetentionStateRecord:
        """Return the retention materialised view for one package."""
        async with self._engine.connect() as conn:
@@ -902,6 +963,25 @@ class Registry:
            deletion_strategy=row.deletion_strategy,
        )

    async def _validate_metadata_schema(
        self,
        slug: str | None,
        metadata: dict[str, Any],
    ) -> UUID | None:
        if slug is None:
            return None
        try:
            schema = await self.get_metadata_schema(slug)
        except KeyError as exc:
            raise ValueError(str(exc)) from exc
        required = schema.json_schema.get("required", [])
        if not isinstance(required, list):
            raise ValueError(f"metadata schema {slug!r} has invalid required list")
        missing = [key for key in required if isinstance(key, str) and key not in metadata]
        if missing:
            raise ValueError(f"metadata missing required schema keys: {', '.join(missing)}")
        return schema.id


def _iso(value: datetime | None) -> str | None:
    if value is None:
8
tests/fixtures/guide-board/logs/log-review-summary.json
vendored
Normal file
@@ -0,0 +1,8 @@
{
  "reviewed_logs": [
    "raw/session/transcript.txt"
  ],
  "warnings": [
    "Repository returned one optional capability warning."
  ]
}
6
tests/fixtures/guide-board/raw/session/browser-response.json
vendored
Normal file
@@ -0,0 +1,6 @@
{
  "repositoryId": "fixture-repo",
  "capabilities": {
    "capabilityQuery": "metadataonly"
  }
}
3
tests/fixtures/guide-board/raw/session/transcript.txt
vendored
Normal file
@@ -0,0 +1,3 @@
GET /cmis/browser
200 OK
Repository info collected for fixture.
12
tests/fixtures/guide-board/reports/assessment-package.json
vendored
Normal file
@@ -0,0 +1,12 @@
{
  "package_version": 1,
  "files": [
    { "path": "run.json", "kind": "run-metadata" },
    { "path": "retention-summary.json", "kind": "retention-summary" },
    { "path": "reports/report.md", "kind": "report" },
    { "path": "scorecards/cmis-scorecard.json", "kind": "scorecard" },
    { "path": "logs/log-review-summary.json", "kind": "log-review" },
    { "path": "raw/session/transcript.txt", "kind": "raw-artifact" },
    { "path": "raw/session/browser-response.json", "kind": "raw-artifact" }
  ]
}
3
tests/fixtures/guide-board/reports/report.md
vendored
Normal file
@@ -0,0 +1,3 @@
# Guide-board CMIS Assessment

Fixture run `gb-fixture-001` completed with one warning and no failed checks.
17
tests/fixtures/guide-board/retention-summary.json
vendored
Normal file
@@ -0,0 +1,17 @@
{
  "retention_class": "release-evidence",
  "report_paths": [
    "reports/report.md",
    "scorecards/cmis-scorecard.json",
    "logs/log-review-summary.json"
  ],
  "evidence_counts": {
    "raw_artifacts": 2,
    "reports": 3
  },
  "finding_counts": {
    "pass": 17,
    "warning": 1,
    "fail": 0
  }
}
10
tests/fixtures/guide-board/run.json
vendored
Normal file
@@ -0,0 +1,10 @@
{
  "run_id": "gb-fixture-001",
  "target_profile_ref": "open-cmis-tck:browser-binding",
  "assessment_profile_ref": "guide-board:cmis-assessment:v1",
  "result_status": "passed-with-findings",
  "source_commits": {
    "guide-board": "1234567890abcdef",
    "open-cmis-tck": "abcdef1234567890"
  }
}
7
tests/fixtures/guide-board/scorecards/cmis-scorecard.json
vendored
Normal file
@@ -0,0 +1,7 @@
{
  "scorecard": "cmis-browser-binding",
  "checks": 18,
  "passed": 17,
  "warnings": 1,
  "failed": 0
}
110
tests/integration/test_guide_board_pilot.py
Normal file
@@ -0,0 +1,110 @@
"""Guide-board pilot ingestion tests (ARTIFACT-STORE-WP-0005)."""

from __future__ import annotations

import json
from collections.abc import AsyncIterator
from pathlib import Path
from uuid import UUID

import pytest
import pytest_asyncio
from sqlalchemy import create_engine, insert
from sqlalchemy.ext.asyncio import create_async_engine
from typer.testing import CliRunner

from artifactstore.cli import app as cli_app
from artifactstore.dataplane import InProcessDataPlane
from artifactstore.db.schema import metadata, retention_classes
from artifactstore.db.seed import RETENTION_CLASS_SEEDS
from artifactstore.events import RegistryViewWriter
from artifactstore.manifest import decode as manifest_decode
from artifactstore.pilots.guide_board import GUIDE_BOARD_SCHEMA_SLUG, ingest_run
from artifactstore.registry import Registry
from artifactstore.storage import LocalBackend

REPO_ROOT = Path(__file__).resolve().parents[2]
FIXTURE = REPO_ROOT / "tests" / "fixtures" / "guide-board"
SCHEMA = REPO_ROOT / "schemas" / "guide-board.run.v1.json"


@pytest_asyncio.fixture
async def registry(tmp_path: Path) -> AsyncIterator[Registry]:
    db_path = tmp_path / "guide-board.db"
    engine = create_async_engine(f"sqlite+aiosqlite:///{db_path}")
    async with engine.begin() as conn:
        await conn.run_sync(metadata.create_all)
        for seed in RETENTION_CLASS_SEEDS:
            await conn.execute(insert(retention_classes).values(**seed))
    backend = LocalBackend(tmp_path / "storage", backend_id="local")
    reg = Registry(engine, InProcessDataPlane(backend), RegistryViewWriter())
    try:
        yield reg
    finally:
        await reg.dispose()


async def _consume(stream: AsyncIterator[bytes]) -> bytes:
    out = bytearray()
    async for chunk in stream:
        out.extend(chunk)
    return bytes(out)
async def test_guide_board_library_ingest_is_idempotent_and_downloadable(
    registry: Registry,
) -> None:
    schema = json.loads(SCHEMA.read_text(encoding="utf-8"))
    await registry.register_metadata_schema(slug=GUIDE_BOARD_SCHEMA_SLUG, json_schema=schema)

    first = await ingest_run(FIXTURE, registry=registry)
    second = await ingest_run(FIXTURE, registry=registry)

    assert first.package_id
    assert first.manifest_digest.startswith("blake3:")
    assert first.manifest_digest == second.manifest_digest
    assert second.reused_existing is True

    manifest = manifest_decode(
        await registry.get_manifest_bytes(UUID(first.package_id), format="cbor")
    )
    assert manifest.package.producer == "guide-board"
    assert manifest.package.metadata_schema_id is not None
    assert manifest.retention_summary.retention_class == "release-evidence"
    assert len(manifest.files) == 8

    for file_entry in manifest.files:
        stream = await registry.get_file(UUID(file_entry.id))
        assert await _consume(stream) == (FIXTURE / file_entry.relative_path).read_bytes()

    state = await registry.get_retention_state(UUID(first.package_id))
    assert state.effective_class == "release-evidence"
def test_guide_board_cli_ingest_outputs_package_and_digest(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    db_path = tmp_path / "guide-board-cli.db"
    storage_root = tmp_path / "storage"
    storage_root.mkdir()
    sync_engine = create_engine(f"sqlite:///{db_path}", future=True)
    metadata.create_all(sync_engine)
    with sync_engine.begin() as conn:
        conn.execute(insert(retention_classes), [dict(s) for s in RETENTION_CLASS_SEEDS])
    sync_engine.dispose()

    monkeypatch.setenv("ARTIFACTSTORE_DATABASE_URL", f"sqlite+aiosqlite:///{db_path}")
    monkeypatch.setenv("ARTIFACTSTORE_STORAGE_LOCAL_ROOT", str(storage_root))

    result = CliRunner().invoke(
        cli_app,
        ["guide-board", "ingest", str(FIXTURE), "--schema", str(SCHEMA)],
    )

    assert result.exit_code == 0, result.output
    payload = json.loads(result.output)
    assert payload["package_id"]
    assert payload["manifest_digest"].startswith("blake3:")
    assert payload["file_count"] == 8
    assert payload["reused_existing"] is False
@@ -4,13 +4,13 @@ type: workplan
 title: "Guide-Board Pilot Ingestion"
 repo: artifact-store
 domain: stack
-status: planned
+status: active
 owner: codex
 topic_slug: stack
 planning_priority: high
 planning_order: 5
 created: "2026-05-15"
-updated: "2026-05-15"
+updated: "2026-05-16"
 state_hub_workstream_id: "701c4d8c-5cf4-4a4a-ab60-1dcae53fe771"
 ---
@@ -41,9 +41,9 @@ bytes itself. This is the pilot success criterion in INTENT.md.

 ```task
 id: ARTIFACT-STORE-WP-0005-T001
-status: cancelled
+status: done
 priority: high
-state_hub_task_id: "eb822821-353c-4cd2-95bf-acb2f084b7ea"
+state_hub_task_id: "830f6822-1cfe-4955-a4e0-5b9a42fb5db1"
 ```

 Acceptance:
@@ -61,7 +61,7 @@ Acceptance:

 ```task
 id: ARTIFACT-STORE-WP-0005-T002
-status: todo
+status: done
 priority: high
 state_hub_task_id: "ff0ba2eb-b8d3-418a-8685-a54457cea2ed"
 ```
@@ -82,7 +82,7 @@ Acceptance:

 ```task
 id: ARTIFACT-STORE-WP-0005-T003
-status: todo
+status: done
 priority: high
 state_hub_task_id: "5c367257-2d2a-4de9-9a06-663ba2c60d77"
 ```
@@ -106,7 +106,7 @@ Acceptance:

 ```task
 id: ARTIFACT-STORE-WP-0005-T004
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "b1ca7133-ad27-4091-93a0-a4e1b7450791"
 ```
@@ -124,7 +124,7 @@ Acceptance:

 ```task
 id: ARTIFACT-STORE-WP-0005-T005
-status: todo
+status: blocked
 priority: medium
 state_hub_task_id: "bffa3573-4a1f-4c12-8c73-6d55bd8f6297"
 ```
@@ -139,6 +139,17 @@ Acceptance:
 - Procedure runs end-to-end on a developer workstation under 5
   minutes.

+Blocked note: the artifact-store ingest path was verified against an
+existing non-fixture OpenCMIS guide-board run at
+`/home/worsch/open-cmis-tck/.local/runs/opencmis-inmemory-pilot` using
+an isolated SQLite DB and local storage root. It ingested 23 files,
+replayed the event log through sequence 26, and verified 23 storage
+locations with zero failures. A fresh guide-board/OpenCMIS producer run
+from `~/guide-board` currently stops before artifact-store handoff with
+`cmis-summary: report fragment not found: reports/cmis-summary.md`,
+which needs to be fixed in the producer/extension before the documented
+fresh-run procedure can be marked complete.
+
 ## Success criteria

 - A real guide-board CMIS run is ingested with one CLI invocation.
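The blocked note in the last hunk reports a manual check that an ingest picked up the expected number of files. A minimal sketch of automating that check on the JSON printed by `guide-board ingest` might look like the following. It assumes the output carries the four fields asserted in `tests/integration/test_guide_board_pilot.py` (`package_id`, `manifest_digest`, `file_count`, `reused_existing`); the helper name `check_ingest_payload` is hypothetical and not part of this commit.

```python
import json
from typing import Any

# Field names are taken from the CLI test in this commit.
# The helper itself is an illustrative assumption, not artifact-store API.
EXPECTED_FIELDS = ("package_id", "manifest_digest", "file_count", "reused_existing")


def check_ingest_payload(raw: str, expected_files: int) -> dict[str, Any]:
    """Parse the JSON printed by the ingest command and sanity-check it."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("ingest output must be a JSON object")
    missing = [name for name in EXPECTED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"ingest output missing fields: {', '.join(missing)}")
    if not str(payload["manifest_digest"]).startswith("blake3:"):
        raise ValueError("manifest_digest does not look like a blake3 digest")
    if payload["file_count"] != expected_files:
        raise ValueError(
            f"expected {expected_files} files, got {payload['file_count']}"
        )
    return payload
```

For the non-fixture run described in the note, the caller would pass `expected_files=23`; for the vendored fixture the tests expect 8.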