feat: TTL enforcement and operational hardening (SAND-WP-0009)

Add TTL parser, expires_at on create, extend_ttl and expire/reap APIs,
activity-core integration doc, repo classification, registry refresh,
HTTP parity, and 69 tests.
2026-06-24 12:44:04 +02:00
parent b58191b23e
commit df658e7ef9
20 changed files with 913 additions and 39 deletions

.repo-classification.yaml Normal file

@@ -0,0 +1,25 @@
repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-24'
  classified_by: codex
  category: tooling
  domain: infotech
  secondary_domains:
    - agents
  capability_tags:
    - sandbox
    - isolation
    - provision
    - execution
    - orchestration
  business_stake:
    - technology
    - execution
    - automation
  business_mechanics:
    - operation
    - coordination
  notes: >
    Sandbox establishment meta-framework — profiles, extensions, routing,
    lifecycle, TTL, and host telemetry for agentic and deterministic workloads.


@@ -44,7 +44,7 @@ orchestration from `create` remains deferred.
## In Scope
- **Unified establishment API** — CLI v0 + HTTP stub (`create`, `get`, `list`,
`destroy`, `recreate`, `snapshot`, `restore`); `extend_ttl` planned
`destroy`, `recreate`, `snapshot`, `restore`, `extend-ttl`, `expire`)
- **Profile catalog** — six profiles: compose e2e/checkpoint, sandbox canary,
vm-haskell-build, saas-stub, burst-sandbox
- **Extension platform** — `ext.compose-ssh`, `ext.vm-packer`, `ext.saas-stub`;
@@ -128,12 +128,12 @@ own tunnels or CAs.
- **Docs:** `meta-framework`, `extension-sdk`, `host-telemetry`, `routing`,
`payments`, `snapshots`, `migration-gaps`, `migration-build-machines`
- **Registry:** `capability.execution.sandbox-provision` indexed (draft)
- **Tests:** 54 pytest cases; `make check` green
- **Tests:** 69 pytest cases; `make check` green
- **Siblings:** wise-validator `validate run` (SAND-WP-0003); the-custodian
`make e2e REPO=` shim (SAND-WP-0004)
Latest gap analysis: `history/2026-06-24-post-wp0007-intent-scope-gap-analysis.md`
Next workplan: **SAND-WP-0009** (TTL enforcement and operational hardening).
Latest workplan: **SAND-WP-0009** (TTL enforcement — finished).
---
@@ -149,6 +149,9 @@ sandboxer get <id> / list / destroy / recreate
sandboxer snapshot <id> [--name LABEL]
sandboxer restore <snapshot_id>
sandboxer snapshots list / snapshots get <id>
sandboxer extend-ttl <id> --duration 2h
sandboxer expire [--apply]
sandboxer create --ttl 2h ...
sandboxer credits show / credits add <amount>
sandboxer inspect host / inspect stale / reap-stale [--apply]
make smoke-remote # CoulombCore compose smoke (SANDBOXER_HOST)
@@ -168,14 +171,14 @@ cd ~/the-custodian && make e2e REPO=activity-core
## What Is Not Possible Yet
- TTL auto-expiry / `extend_ttl` enforcement
- ~~TTL auto-expiry / `extend_ttl` enforcement~~ — done (SAND-WP-0009)
- Packer build orchestration from `create` (attach-only today)
- Real E2B / Modal / Daytona adapters (in-repo stub only)
- Cross-host snapshot transfer
- Formal ops-bridge tunnel attachment in reachability descriptor
- Dedicated sandboxer01 host (CoulombCore interim only today)
- `reuse-surface validate` / federation publish workflow
- `.repo-classification.yaml` (State Hub C-24 hygiene)
- ~~`.repo-classification.yaml`~~ — done (SAND-WP-0009)
- fin-hub billing export for metered usage
---
@@ -239,6 +242,8 @@ see `registry/capabilities/execution.sandbox-provision.md`.
| `docs/routing.md` | Backend selection strategies |
| `docs/payments.md` | Credits and metering |
| `docs/snapshots.md` | Checkpoint snapshot/restore |
| `docs/ttl.md` | TTL extend and expire/reap |
| `docs/security.md` | Blast-radius vs intent enforcement |
| `docs/migration-gaps.md` | Legacy cutover status |
| `docs/integrations/` | Consumer contracts |
| `workplans/` | ADR-001 work structure |


@@ -0,0 +1,42 @@
# activity-core integration
activity-core schedules bounded work on Railiance01. sand-boxer provides
**sandbox venues** with TTL enforcement; activity-core owns **when** expire runs.
## Scheduled TTL reap
Run periodically (cron, Temporal activity, or CI):
```bash
sandboxer expire --apply
```
HTTP equivalent:
```http
POST /v1/sandboxes/expire?apply=true
```
Returns a list of `ExpireActionResult` entries (`dry-run`, `destroyed`, `failed`).
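A scheduler that calls this endpoint will usually want to branch on the `action` field of each entry. The following is a minimal sketch of that step, not part of sand-boxer: `summarize_expire_results` and the sample payload are hypothetical, and the field names are assumed to match `ExpireActionResult` as described above.

```python
from collections import Counter
from typing import Any


def summarize_expire_results(results: list[dict[str, Any]]) -> dict[str, Any]:
    """Count results per action and collect the sandbox ids that failed."""
    counts = Counter(r["action"] for r in results)
    failed = [r["sandbox_id"] for r in results if r["action"] == "failed"]
    return {"counts": dict(counts), "failed": failed}


# Hypothetical response body from POST /v1/sandboxes/expire?apply=true
sample = [
    {"sandbox_id": "abc12345", "reason": "ttl", "action": "destroyed", "error": None},
    {"sandbox_id": "def67890", "reason": "idle", "action": "failed", "error": "ssh timeout"},
]
summary = summarize_expire_results(sample)
```

A caller could alert on a non-empty `failed` list and retry on the next scheduled run, since `destroy` is described as idempotent.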
## Lifecycle events
Each expired sandbox emits a State Hub progress event:
- `state`: `expired` (`event_type`: `milestone`)
- Followed by `destroying` → `destroyed`
Event `detail` includes `ttl`, `expires_at`, and reachability fields.
## What sand-boxer does not do
- No Temporal workflows or activity-core code in this repo
- No push webhook to activity-core on expiry (poll/schedule only in v0)
- TTL parsing and destroy orchestration live in sand-boxer
## Consumer pattern
1. activity-core activity provisions via `sandboxer create` (or HTTP)
2. Work runs in the sandbox (glas-harness, wise-validator, etc.)
3. Scheduled `sandboxer expire --apply` reaps past-TTL sandboxes
4. State Hub records full lifecycle for audit


@@ -82,7 +82,7 @@ Extends the `build-agent` self-register pattern: generic sandbox identities carr
| `create` | Provision from profile + inputs | **Yes** |
| `get` | Inspect sandbox status | **Yes** |
| `list` | List sandboxes (filter by consumer optional) | **Yes** |
| `extend_ttl` | Extend time-to-live | Stub |
| `extend_ttl` | Extend time-to-live | **Yes** |
| `recreate` | Destroy and reprovision from stored seed | **Yes** |
| `destroy` | Idempotent teardown | **Yes** |
| `snapshot` / `restore` | Checkpoint workspace | **Yes** (compose-ssh, saas-stub) |
@@ -97,6 +97,9 @@ HTTP surface (optional v0; CLI calls core library directly):
- `POST /v1/sandboxes/{id}/snapshot` — checkpoint
- `POST /v1/snapshots/{id}/restore` — restore
- `GET /v1/snapshots` — list checkpoints
- `POST /v1/sandboxes/{id}/recreate` — recreate
- `PATCH /v1/sandboxes/{id}/ttl` — extend TTL
- `POST /v1/sandboxes/expire` — TTL reap (query `apply=true`)
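
As a hedged illustration of the two TTL routes listed above, the sketch below builds the requests with the standard library only. `BASE_URL` and the helper names are assumptions for illustration; the paths, the `duration` body field, and the `apply` query parameter come from the list above. Nothing is sent, so the sketch runs offline.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: a locally running API instance


def build_extend_ttl_request(sandbox_id: str, duration: str) -> urllib.request.Request:
    """Build PATCH /v1/sandboxes/{id}/ttl with a JSON duration body."""
    body = json.dumps({"duration": duration}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/sandboxes/{sandbox_id}/ttl",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


def build_expire_request(apply: bool = False) -> urllib.request.Request:
    """Build POST /v1/sandboxes/expire; apply=False keeps the dry-run default."""
    flag = "true" if apply else "false"
    return urllib.request.Request(
        f"{BASE_URL}/v1/sandboxes/expire?apply={flag}",
        data=b"",
        method="POST",
    )


req = build_extend_ttl_request("abc12345", "2h")
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```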
---


@@ -46,4 +46,4 @@ Deferred: Packer orchestration from API, `make remote-build` shim.
| ~~SaaS extensions + payments v0~~ | SAND-WP-0006 — stub + routing + credits |
| E2B / Modal real adapters | Post SAND-WP-0006 |
| ~~Snapshot / restore~~ | SAND-WP-0007 — `docs/snapshots.md` |
| TTL enforcement + scheduled reap | **SAND-WP-0009** |
| ~~TTL enforcement + scheduled reap~~ | SAND-WP-0009 — `docs/ttl.md` |

docs/security.md Normal file

@@ -0,0 +1,23 @@
# Security posture
sand-boxer limits **blast radius** — it does not enforce **intent**.
## What sandboxing provides
- Isolated compose projects and workspace directories on placement hosts
- Profile-declared network default-deny (declarative in v0; enforcement varies by extension)
- TTL-bound disposable venues with automated expire/reap
- Consumer attribution (`adm` / `agt` / `atm`) on lifecycle events
## What sandboxing does not provide
- Protection against a malicious or compromised agent *inside* the sandbox
- Guarantee that an agent follows instructions or policy
- Replacement for secrets management (use OpenBao / operator paths via `warden route`)
- Production isolation on Railiance01 (sandboxes run on sandboxer01 / CoulombCore)
Per INTENT: *"Honest security — sandboxing limits blast radius; it is not intent
enforcement."*
Operators should combine sand-boxer with flex-auth, credential routing, and
harness-level controls for end-to-end safety.

docs/ttl.md Normal file

@@ -0,0 +1,65 @@
# Time-to-live (TTL)
Disposable-by-default sandboxes — SAND-WP-0009.
## Semantics
Each ready sandbox has:
| Field | Meaning |
|-------|---------|
| `ttl` | Active duration string (e.g. `4h`) |
| `expires_at` | UTC timestamp when the sandbox should be reaped |
On `create`, TTL comes from `SandboxCreateRequest.ttl` or the profile
`ttl.default`, capped at `ttl.max`. Anchor is `ready_at`.
Duration format: positive integer + unit — `s`, `m`, `h`, `d` (e.g. `30m`, `4h`).
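
The create-time rules above (request override or profile default, capped at the maximum, anchored at `ready_at`) can be sketched in a self-contained way. This is an illustration under stated assumptions, not the shipped module: `resolve_ttl` is a hypothetical name, and the regex simply encodes the `s`/`m`/`h`/`d` format described above.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_DURATION_RE = re.compile(r"^(\d+)([smhd])$", re.IGNORECASE)
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value: str) -> timedelta:
    """Parse a positive integer plus unit, e.g. '30m' or '4h'."""
    match = _DURATION_RE.match(value.strip())
    if not match or int(match.group(1)) <= 0:
        raise ValueError(f"Invalid duration: {value!r}")
    unit = match.group(2).lower()
    return timedelta(seconds=int(match.group(1)) * _UNIT_SECONDS[unit])


def resolve_ttl(request_ttl: Optional[str], default: str, maximum: str) -> str:
    """Prefer the request TTL, fall back to the default, cap at the maximum."""
    requested = request_ttl or default
    if parse_duration(requested) > parse_duration(maximum):
        return maximum
    return requested


ready_at = datetime(2026, 6, 24, 10, 0, tzinfo=timezone.utc)
ttl = resolve_ttl("48h", default="4h", maximum="24h")  # request exceeds the cap
expires_at = ready_at + parse_duration(ttl)
```

Here a `48h` request against a `24h` maximum resolves to `24h`, so the sandbox would expire one day after `ready_at`.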
## extend_ttl
Extend a live sandbox (`ready` or `active`):
```bash
sandboxer extend-ttl <sandbox_id> --duration 2h
```
HTTP: `PATCH /v1/sandboxes/{id}/ttl` with `{"duration": "2h"}`.
Extension adds to the current `expires_at` (or now if already past). Total lifetime
from `ready_at` cannot exceed profile `ttl.max`.
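
A worked example may make the cap concrete. The sketch below restates the rule with plain `datetime` arithmetic; `extend` is a hypothetical helper that takes `now` as an explicit argument so the result is reproducible, which is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone


def extend(current, *, ready_at, extension, max_lifetime, now):
    """Return (new_expires_at, applied_extension) under the profile cap."""
    base = max(current, now)            # restart from now if already past expiry
    proposed = base + extension
    ceiling = ready_at + max_lifetime   # total lifetime is measured from ready_at
    new_expires = min(proposed, ceiling)
    applied = new_expires - base
    if applied <= timedelta(0):
        raise ValueError("Cannot extend: already at profile max")
    return new_expires, applied


ready_at = datetime(2026, 6, 24, 10, 0, tzinfo=timezone.utc)
now = ready_at + timedelta(hours=22)
current = ready_at + timedelta(hours=23)   # one hour of headroom left under 24h

new_expires, applied = extend(
    current,
    ready_at=ready_at,
    extension=timedelta(hours=4),
    max_lifetime=timedelta(hours=24),
    now=now,
)
```

With 23 of 24 hours already granted, a requested `4h` extension is trimmed to one hour, and a further request at the ceiling raises an error.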
## expire / reap
TTL reap is distinct from host inventory `reap-stale`:
| Command | Purpose |
|---------|---------|
| `sandboxer expire` | Sandboxes past `expires_at` or profile `ttl.idle_reap` |
| `sandboxer reap-stale` | Orphan host resources vs store inventory |
```bash
sandboxer expire # dry-run (default)
sandboxer expire --apply # mark expired, destroy
```
HTTP: `POST /v1/sandboxes/expire?apply=true`
Flow on `--apply`:
1. Transition to `expired` (State Hub milestone)
2. `destroy` (idempotent teardown)
## Profile fields
```yaml
ttl:
  default: 4h
  max: 24h
  idle_reap: null     # optional; reap when updated_at + idle_reap elapsed
```
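
To show how `expires_at` and `idle_reap` interact when choosing reap candidates, here is a small sketch. `expire_reason` is a hypothetical helper, and `idle_reap` is assumed to be already parsed into a `timedelta` (or `None` when unset); the TTL check takes precedence over the idle check.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expire_reason(
    expires_at: Optional[datetime],
    updated_at: datetime,
    idle_reap: Optional[timedelta],
    now: datetime,
) -> Optional[str]:
    """Return 'ttl', 'idle', or None for a live sandbox."""
    if expires_at is not None and expires_at <= now:
        return "ttl"
    if idle_reap is not None and updated_at + idle_reap <= now:
        return "idle"
    return None


now = datetime(2026, 6, 24, 12, 0, tzinfo=timezone.utc)

# Past its expires_at: reaped for TTL regardless of idle settings.
past_ttl = expire_reason(now - timedelta(minutes=5), now - timedelta(hours=5), None, now)

# Still within TTL but untouched for 3h with idle_reap of 2h: reaped as idle.
idle = expire_reason(
    now + timedelta(hours=2), now - timedelta(hours=3), timedelta(hours=2), now
)

# Within TTL and no idle_reap configured: left alone.
alive = expire_reason(now + timedelta(hours=2), now - timedelta(hours=3), None, now)
```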
## activity-core
Scheduled jobs should invoke `sandboxer expire --apply` (or HTTP equivalent).
See `docs/integrations/activity-core.md`.


@@ -6,7 +6,16 @@ Markdown-first capability index for federation and reuse planning.
1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
2. Add the row to `indexes/capabilities.yaml`.
3. Run `reuse-surface validate` from a checkout with the CLI installed.
3. Run `reuse-surface validate` from a checkout with the CLI installed:
```bash
cd ~/reuse-surface
reuse-surface validate --repo ~/sand-boxer
```
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
sand-boxer v0 maturity (post SAND-WP-0009): D5/A4/C4 — see
`registry/capabilities/execution.sandbox-provision.md`.
Federation contract: reuse-surface `docs/RegistryFederation.md`.


@@ -9,32 +9,35 @@ tags: [sandbox, isolation, provision, e2e, agentic, execution, profile]
maturity:
discovery:
current: D4
current: D5
target: D6
confidence: high
rationale: >
Charter (INTENT.md), meta-framework spec (docs/meta-framework.md), and
research synthesis define scope. First extension (ext.compose-ssh) in progress.
Charter (INTENT.md), meta-framework spec, extension SDK, integration docs,
and research synthesis. Capability indexed in registry/.
availability:
current: A2
current: A4
target: A5
confidence: medium
confidence: high
rationale: >
CLI v0 and ext.compose-ssh scaffold land in SAND-WP-0002. SaaS extensions
and payments deferred.
CLI v0 (create/destroy/snapshot/TTL), HTTP API, CoulombCore remote smoke.
SaaS stub + routing + credits shipped (SAND-WP-0006).
external_evidence:
completeness:
level: C2
name: Partial
confidence: medium
level: C4
name: Substantial
confidence: high
basis: scope_vs_intent_and_consumer_expectations
satisfied_expectations:
- profile-based create/destroy via CLI
- profile-based create/destroy/snapshot/restore via CLI
- TTL extend and expire/reap (SAND-WP-0009)
- State Hub lifecycle events on transitions
- wise-validator and the-custodian migration arc complete
- extension SDK with compose-ssh, vm-packer attach, saas-stub
broken_expectations:
- Real E2B/Modal adapters not yet built (saas-stub + credits v0 done)
- wise-validator migration not complete
- Real E2B/Modal adapters not yet built
- sandboxer01 dedicated host not live (CoulombCore interim)
out_of_scope_expectations:
- agent harness and tool orchestration (glas-harness)
- e2e test semantics (wise-validator)
@@ -42,4 +45,4 @@ external_evidence:
consumption_modes:
- CLI (sandboxer)
- core library (Python)
- HTTP API (planned)
- HTTP API (uvicorn sandboxer.api.app:app)


@@ -6,6 +6,8 @@ from fastapi import FastAPI, HTTPException
from sandboxer.core.manager import SandboxManager
from sandboxer.models import (
ExpireActionResult,
ExtendTtlRequest,
SandboxCreateRequest,
SandboxStatus,
SnapshotRecord,
@@ -82,4 +84,29 @@ def get_snapshot(snapshot_id: str) -> SnapshotRecord:
record = _manager.get_snapshot(snapshot_id)
if not record:
raise HTTPException(status_code=404, detail="snapshot not found")
return record
return record


@app.post("/v1/sandboxes/{sandbox_id}/recreate", response_model=SandboxStatus)
def recreate_sandbox(sandbox_id: str) -> SandboxStatus:
    try:
        return _manager.recreate(sandbox_id)
    except KeyError as exc:
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    except Exception as exc:
        raise HTTPException(status_code=400, detail=str(exc)) from exc


@app.patch("/v1/sandboxes/{sandbox_id}/ttl", response_model=SandboxStatus)
def extend_sandbox_ttl(sandbox_id: str, request: ExtendTtlRequest) -> SandboxStatus:
    try:
        return _manager.extend_ttl(sandbox_id, request.duration)
    except KeyError as exc:
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    except (RuntimeError, ValueError) as exc:
        raise HTTPException(status_code=400, detail=str(exc)) from exc


@app.post("/v1/sandboxes/expire", response_model=list[ExpireActionResult])
def expire_sandboxes(apply: bool = False) -> list[ExpireActionResult]:
    return _manager.expire(apply=apply)


@@ -90,6 +90,7 @@ def sandbox_create(
actor: Annotated[str, typer.Option(help="Consumer actor type")] = "adm",
project: Annotated[str, typer.Option(help="Calling project id")] = "sand-boxer",
host: Annotated[str | None, typer.Option(help="Override placement host")] = None,
ttl: Annotated[str | None, typer.Option(help="TTL override (e.g. 4h)")] = None,
) -> None:
"""Provision a sandbox. No args → canary self-deploy of sand-boxer."""
parsed = _parse_inputs(input or [])
@@ -98,6 +99,7 @@ def sandbox_create(
profile=resolved_profile,
inputs=resolved_inputs,
consumer=Consumer(actor=ActorType(actor), project=project),
ttl=ttl,
)
manager = SandboxManager()
try:
@@ -196,6 +198,33 @@ def snapshots_get(snapshot_id: str) -> None:
_print_json(record.model_dump(mode="json"))


@app.command("extend-ttl")
def sandbox_extend_ttl(
    sandbox_id: str,
    duration: Annotated[str, typer.Option("--duration", help="Extension duration (e.g. 2h)")],
) -> None:
    """Extend sandbox time-to-live (capped at profile max)."""
    manager = SandboxManager()
    try:
        status = manager.extend_ttl(sandbox_id, duration)
    except (KeyError, RuntimeError, ValueError) as exc:
        typer.echo(f"Error: {exc}", err=True)
        raise typer.Exit(code=1) from exc
    _print_json(status.model_dump(mode="json"))


@app.command("expire")
def sandbox_expire(
    apply: Annotated[bool, typer.Option("--apply", help="Destroy expired sandboxes")] = False,
) -> None:
    """Report or destroy sandboxes past TTL or idle-reap threshold."""
    manager = SandboxManager()
    results = manager.expire(apply=apply)
    mode = "apply" if apply else "dry-run"
    typer.echo(f"expire ({mode}): {len(results)} candidate(s)", err=True)
    _print_json([r.model_dump(mode="json") for r in results])

@app.command("recreate")
def sandbox_recreate(sandbox_id: str) -> None:
"""Destroy and reprovision from stored inputs."""


@@ -3,10 +3,17 @@
from __future__ import annotations
from sandboxer.extensions.registry import load_extension, resolve_backend
from sandboxer.lifecycle.expire import (
ExpireCandidate,
apply_expired_state,
find_expire_candidates,
)
from sandboxer.lifecycle.state_hub import emit_lifecycle_event, event_type_for_state
from sandboxer.lifecycle.store import SandboxStore, utcnow
from sandboxer.lifecycle.ttl import expires_at_from, extend_expires_at, resolve_initial_ttl
from sandboxer.models import (
Consumer,
ExpireActionResult,
MeterRecord,
Reachability,
SandboxCreateRequest,
@@ -60,6 +67,18 @@ class SandboxManager:
return extension.config.get("provider", "saas")
return resolve_host(profile, override=host_override)

    @staticmethod
    def _assign_ttl(
        status: SandboxStatus,
        profile,
        *,
        request_ttl: str | None,
    ) -> None:
        ttl_str = resolve_initial_ttl(profile, request_ttl)
        anchor = status.ready_at or utcnow()
        status.ttl = ttl_str
        status.expires_at = expires_at_from(anchor, ttl_str)

def create(self, request: SandboxCreateRequest, *, host: str | None = None) -> SandboxStatus:
profile = load_profile(request.profile)
extension = resolve_extension(profile, request.inputs, host_override=host)
@@ -119,6 +138,7 @@ class SandboxManager:
status.state = SandboxState.READY
status.ready_at = utcnow()
status.updated_at = status.ready_at
self._assign_ttl(status, profile, request_ttl=request.ttl)
if wants_telemetry and provision_before:
provision_after = collect_host_snapshot(resolved_host)
@@ -224,11 +244,98 @@ class SandboxManager:
profile=existing.profile_id,
inputs=dict(existing.inputs),
consumer=existing.consumer,
ttl=existing.ttl,
)
if existing.state != SandboxState.DESTROYED:
self.destroy(sandbox_id)
return self.create(request, host=existing.host)

    def extend_ttl(self, sandbox_id: str, duration: str) -> SandboxStatus:
        status = self.store.get(sandbox_id)
        if not status:
            raise KeyError(f"Sandbox not found: {sandbox_id}")
        if status.state not in (SandboxState.READY, SandboxState.ACTIVE):
            raise RuntimeError(
                f"Cannot extend TTL for sandbox in state {status.state.value}"
            )
        if not status.expires_at or not status.ready_at:
            raise RuntimeError("Sandbox has no expiry metadata")

        profile = load_profile(status.profile_id)
        new_expires, applied = extend_expires_at(
            status.expires_at,
            anchor=status.ready_at,
            extension=duration,
            max_duration=profile.ttl.max,
        )
        status.expires_at = new_expires
        status.ttl = applied
        status.updated_at = utcnow()
        self.store.save(status)
        emit_lifecycle_event(
            status,
            summary=f"TTL extended by {applied} (expires {new_expires.isoformat()})",
            event_type="note",
        )
        return status

    def expire(
        self,
        *,
        apply: bool = False,
        now=None,
    ) -> list[ExpireActionResult]:
        candidates = find_expire_candidates(self.store, now=now)
        results: list[ExpireActionResult] = []
        for candidate in candidates:
            if not apply:
                results.append(
                    ExpireActionResult(
                        sandbox_id=candidate.sandbox_id,
                        reason=candidate.reason,
                        action="dry-run",
                    )
                )
                continue
            try:
                status = self.store.get(candidate.sandbox_id)
                if not status or status.state not in (
                    SandboxState.READY,
                    SandboxState.ACTIVE,
                ):
                    continue
                status = apply_expired_state(status, now=now)
                self.store.save(status)
                emit_lifecycle_event(
                    status,
                    summary=f"Sandbox expired ({candidate.reason})",
                    event_type=event_type_for_state(status.state),
                )
                self.destroy(candidate.sandbox_id)
                results.append(
                    ExpireActionResult(
                        sandbox_id=candidate.sandbox_id,
                        reason=candidate.reason,
                        action="destroyed",
                    )
                )
            except Exception as exc:
                results.append(
                    ExpireActionResult(
                        sandbox_id=candidate.sandbox_id,
                        reason=candidate.reason,
                        action="failed",
                        error=str(exc),
                    )
                )
        return results

    def list_expire_candidates(self, *, now=None) -> list[ExpireCandidate]:
        return find_expire_candidates(self.store, now=now)

def snapshot(self, sandbox_id: str, *, name: str | None = None) -> SnapshotRecord:
status = self.store.get(sandbox_id)
if not status:
@@ -345,6 +452,7 @@ class SandboxManager:
status.state = SandboxState.READY
status.ready_at = utcnow()
status.updated_at = status.ready_at
self._assign_ttl(status, profile, request_ttl=None)
self.store.save(status)
emit_lifecycle_event(
status,


@@ -0,0 +1,77 @@
"""TTL and idle-reap expiry candidate selection."""

from __future__ import annotations

from dataclasses import dataclass
from datetime import UTC, datetime
from typing import Literal

from sandboxer.lifecycle.store import SandboxStore
from sandboxer.lifecycle.ttl import is_idle_expired, is_past_expiry
from sandboxer.models import SandboxState, SandboxStatus
from sandboxer.profiles.loader import load_profile

ExpireReason = Literal["ttl", "idle"]


@dataclass
class ExpireCandidate:
    sandbox_id: str
    profile_id: str
    reason: ExpireReason
    expires_at: datetime | None = None
    updated_at: datetime | None = None


_LIVE_STATES = frozenset({SandboxState.READY, SandboxState.ACTIVE})


def find_expire_candidates(
    store: SandboxStore,
    *,
    now: datetime | None = None,
) -> list[ExpireCandidate]:
    ref = now or datetime.now(UTC)
    candidates: list[ExpireCandidate] = []
    seen: set[str] = set()

    for status in store.list_all():
        if status.state not in _LIVE_STATES:
            continue
        if is_past_expiry(status.expires_at, now=ref):
            candidates.append(
                ExpireCandidate(
                    sandbox_id=status.sandbox_id,
                    profile_id=status.profile_id,
                    reason="ttl",
                    expires_at=status.expires_at,
                )
            )
            seen.add(status.sandbox_id)
            continue
        try:
            profile = load_profile(status.profile_id)
        except FileNotFoundError:
            continue
        if is_idle_expired(status.updated_at, profile.ttl.idle_reap, now=ref):
            candidates.append(
                ExpireCandidate(
                    sandbox_id=status.sandbox_id,
                    profile_id=status.profile_id,
                    reason="idle",
                    updated_at=status.updated_at,
                )
            )
            seen.add(status.sandbox_id)

    return sorted(candidates, key=lambda c: c.sandbox_id)


def apply_expired_state(status: SandboxStatus, *, now: datetime | None = None) -> SandboxStatus:
    ref = now or datetime.now(UTC)
    status.state = SandboxState.EXPIRED
    status.updated_at = ref
    return status


@@ -38,6 +38,8 @@ def emit_lifecycle_event(
"consumer": status.consumer.model_dump(),
"actor_type": status.consumer.actor.value,
"state": status.state.value,
"ttl": status.ttl,
"expires_at": status.expires_at.isoformat() if status.expires_at else None,
"reachability": status.reachability.model_dump() if status.reachability else None,
"telemetry": status.telemetry,
"timestamps": {
@@ -58,6 +60,6 @@ def emit_lifecycle_event(
def event_type_for_state(state: SandboxState) -> str:
if state in (SandboxState.READY, SandboxState.DESTROYED):
if state in (SandboxState.READY, SandboxState.DESTROYED, SandboxState.EXPIRED):
return "milestone"
return "note"


@@ -0,0 +1,121 @@
"""TTL duration parsing and expiry calculation."""

from __future__ import annotations

import re
from datetime import UTC, datetime, timedelta

from sandboxer.models import Profile

_DURATION_RE = re.compile(r"^(\d+)([smhd])$", re.IGNORECASE)
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value: str) -> timedelta:
    """Parse a duration string like ``4h``, ``30m``, ``1d``."""
    raw = value.strip()
    match = _DURATION_RE.match(raw)
    if not match:
        raise ValueError(f"Invalid duration: {value!r} (expected e.g. 4h, 30m, 1d)")
    amount = int(match.group(1))
    if amount <= 0:
        raise ValueError(f"Duration must be positive: {value!r}")
    unit = match.group(2).lower()
    return timedelta(seconds=amount * _UNIT_SECONDS[unit])


def duration_seconds(value: str) -> int:
    return int(parse_duration(value).total_seconds())


def resolve_initial_ttl(profile: Profile, request_ttl: str | None) -> str:
    """Pick create TTL from request override or profile default, capped at profile max."""
    requested = request_ttl or profile.ttl.default
    return cap_duration(requested, profile.ttl.max)


def cap_duration(requested: str, maximum: str) -> str:
    """Return ``requested`` if within ``maximum``; otherwise return ``maximum``."""
    req_s = duration_seconds(requested)
    max_s = duration_seconds(maximum)
    if req_s > max_s:
        return maximum
    return requested


def expires_at_from(base: datetime, duration: str) -> datetime:
    if base.tzinfo is None:
        base = base.replace(tzinfo=UTC)
    return base + parse_duration(duration)


def cap_expires_at(
    candidate: datetime,
    *,
    anchor: datetime,
    max_duration: str,
) -> datetime:
    """Cap ``candidate`` so it does not exceed ``anchor + max_duration``."""
    ceiling = expires_at_from(anchor, max_duration)
    if candidate.tzinfo is None:
        candidate = candidate.replace(tzinfo=UTC)
    return min(candidate, ceiling)


def extend_expires_at(
    current: datetime,
    *,
    anchor: datetime,
    extension: str,
    max_duration: str,
) -> tuple[datetime, str]:
    """Add ``extension`` to ``current`` and cap at ``anchor + max_duration``."""
    now = datetime.now(UTC)
    base = max(current, now)
    proposed = expires_at_from(base, extension)
    capped = cap_expires_at(proposed, anchor=anchor, max_duration=max_duration)
    applied = extension
    if capped < proposed:
        remaining = capped - base
        if remaining.total_seconds() <= 0:
            raise ValueError(f"Cannot extend: already at profile max ({max_duration})")
        applied = format_timedelta(remaining)
    return capped, applied


def format_timedelta(delta: timedelta) -> str:
    seconds = int(delta.total_seconds())
    if seconds <= 0:
        raise ValueError("Duration must be positive")
    if seconds >= 86400 and seconds % 86400 == 0:
        return f"{seconds // 86400}d"
    if seconds >= 3600 and seconds % 3600 == 0:
        return f"{seconds // 3600}h"
    if seconds >= 60 and seconds % 60 == 0:
        return f"{seconds // 60}m"
    return f"{seconds}s"


def is_past_expiry(expires_at: datetime | None, *, now: datetime | None = None) -> bool:
    if expires_at is None:
        return False
    ref = now or datetime.now(UTC)
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=UTC)
    return expires_at <= ref


def is_idle_expired(
    updated_at: datetime,
    idle_reap: str | None,
    *,
    now: datetime | None = None,
) -> bool:
    if not idle_reap:
        return False
    ref = now or datetime.now(UTC)
    if updated_at.tzinfo is None:
        updated_at = updated_at.replace(tzinfo=UTC)
    return updated_at + parse_duration(idle_reap) <= ref


@@ -164,6 +164,8 @@ class SandboxStatus(BaseModel):
host: str | None = None
reachability: Reachability | None = None
inputs: dict[str, str] = Field(default_factory=dict)
ttl: str | None = None
expires_at: datetime | None = None
error: str | None = None
meter: MeterRecord | None = None
telemetry: dict | None = None # IntrospectionReport JSON when canary
@@ -173,6 +175,17 @@ class SandboxStatus(BaseModel):
destroyed_at: datetime | None = None


class ExtendTtlRequest(BaseModel):
    duration: str


class ExpireActionResult(BaseModel):
    sandbox_id: str
    reason: Literal["ttl", "idle"]
    action: Literal["dry-run", "expired", "destroyed", "failed"]
    error: str | None = None


class SnapshotRestoreRequest(BaseModel):
host: str | None = None
consumer: Consumer | None = None


@@ -88,4 +88,64 @@ def test_restore_snapshot() -> None:
json={"consumer": {"actor": "adm", "project": "sand-boxer"}},
)
assert resp.status_code == 200
assert resp.json()["sandbox_id"] == "restored1"
assert resp.json()["sandbox_id"] == "restored1"
def test_recreate_sandbox() -> None:
from datetime import UTC, datetime
status = SandboxStatus(
sandbox_id="new12345",
profile_id="profile.compose-e2e",
extension_id="ext.compose-ssh",
state=SandboxState.READY,
consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
created_at=datetime.now(UTC),
updated_at=datetime.now(UTC),
)
with patch("sandboxer.api.app._manager") as mgr:
mgr.recreate.return_value = status
client = TestClient(app)
resp = client.post("/v1/sandboxes/abc12345/recreate")
assert resp.status_code == 200
assert resp.json()["sandbox_id"] == "new12345"
def test_extend_ttl() -> None:
from datetime import UTC, datetime
now = datetime.now(UTC)
status = SandboxStatus(
sandbox_id="abc12345",
profile_id="profile.compose-e2e",
extension_id="ext.compose-ssh",
state=SandboxState.READY,
consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
ttl="2h",
expires_at=now,
created_at=now,
updated_at=now,
ready_at=now,
)
with patch("sandboxer.api.app._manager") as mgr:
mgr.extend_ttl.return_value = status
client = TestClient(app)
resp = client.patch(
"/v1/sandboxes/abc12345/ttl",
json={"duration": "2h"},
)
assert resp.status_code == 200
assert resp.json()["ttl"] == "2h"
def test_expire_sandboxes() -> None:
from sandboxer.models import ExpireActionResult
with patch("sandboxer.api.app._manager") as mgr:
mgr.expire.return_value = [
ExpireActionResult(sandbox_id="x", reason="ttl", action="dry-run")
]
client = TestClient(app)
resp = client.post("/v1/sandboxes/expire")
assert resp.status_code == 200
assert resp.json()[0]["action"] == "dry-run"


@@ -126,9 +126,7 @@ def test_manager_snapshot_and_restore(store: SandboxStore, snapshots: SnapshotSt
with (
patch("sandboxer.core.manager.resolve_backend", return_value=backend),
patch("sandboxer.core.manager.load_extension"),
patch("sandboxer.core.manager.emit_lifecycle_event", return_value=None),
patch("sandboxer.core.manager.load_profile"),
patch("sandboxer.core.manager.resolve_host", return_value="coulombcore"),
):
record = manager.snapshot("test1234", name="pre-test")

tests/test_ttl.py Normal file

@@ -0,0 +1,265 @@
"""TTL parsing, extend, and expire tests."""
from __future__ import annotations
from datetime import UTC, datetime, timedelta
from pathlib import Path
from unittest.mock import patch
import pytest
from sandboxer.core.manager import SandboxManager
from sandboxer.lifecycle.expire import find_expire_candidates
from sandboxer.lifecycle.store import SandboxStore
from sandboxer.lifecycle.ttl import (
cap_duration,
extend_expires_at,
format_timedelta,
is_idle_expired,
is_past_expiry,
parse_duration,
resolve_initial_ttl,
)
from sandboxer.models import (
ActorType,
Consumer,
Profile,
Reachability,
SandboxCreateRequest,
SandboxState,
SandboxStatus,
)
def _profile(**ttl_overrides) -> Profile:
ttl_data = {"default": "4h", "max": "24h", "idle_reap": None}
ttl_data.update(ttl_overrides)
return Profile.model_validate(
{
"id": "profile.compose-e2e",
"version": "1.0.0",
"extension": "ext.compose-ssh",
"ttl": ttl_data,
}
)
def test_parse_duration_units() -> None:
assert parse_duration("30m") == timedelta(minutes=30)
assert parse_duration("4h") == timedelta(hours=4)
assert parse_duration("1d") == timedelta(days=1)
assert parse_duration("90s") == timedelta(seconds=90)
def test_parse_duration_invalid() -> None:
with pytest.raises(ValueError, match="Invalid duration"):
parse_duration("4hours")
with pytest.raises(ValueError, match="positive"):
parse_duration("0h")
def test_cap_duration() -> None:
assert cap_duration("4h", "24h") == "4h"
assert cap_duration("48h", "24h") == "24h"
def test_resolve_initial_ttl() -> None:
profile = _profile()
assert resolve_initial_ttl(profile, None) == "4h"
assert resolve_initial_ttl(profile, "2h") == "2h"
assert resolve_initial_ttl(profile, "48h") == "24h"
def test_extend_expires_at_caps_at_max() -> None:
anchor = datetime(2026, 6, 24, 10, 0, tzinfo=UTC)
current = anchor + timedelta(hours=23)
new_expires, applied = extend_expires_at(
current,
anchor=anchor,
extension="4h",
max_duration="24h",
)
assert new_expires == anchor + timedelta(hours=24)
assert applied == "1h"
def test_extend_expires_at_at_max_raises() -> None:
anchor = datetime(2026, 6, 24, 10, 0, tzinfo=UTC)
current = anchor + timedelta(hours=24)
with pytest.raises(ValueError, match="profile max"):
extend_expires_at(
current,
anchor=anchor,
extension="1h",
max_duration="24h",
)
def test_format_timedelta() -> None:
assert format_timedelta(timedelta(hours=2)) == "2h"
assert format_timedelta(timedelta(minutes=30)) == "30m"
def test_is_past_expiry_and_idle() -> None:
now = datetime(2026, 6, 24, 12, 0, tzinfo=UTC)
assert is_past_expiry(now - timedelta(minutes=1), now=now)
assert not is_past_expiry(now + timedelta(minutes=1), now=now)
updated = now - timedelta(hours=2)
assert is_idle_expired(updated, "1h", now=now)
assert not is_idle_expired(updated, "4h", now=now)
@pytest.fixture
def store(tmp_path: Path) -> SandboxStore:
return SandboxStore(path=tmp_path / "sandboxes.json")
def test_find_expire_candidates_ttl_and_idle(store: SandboxStore) -> None:
now = datetime(2026, 6, 24, 12, 0, tzinfo=UTC)
store.save(
SandboxStatus(
sandbox_id="expired1",
profile_id="profile.compose-e2e",
extension_id="ext.compose-ssh",
state=SandboxState.READY,
consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
expires_at=now - timedelta(minutes=5),
created_at=now - timedelta(hours=5),
updated_at=now - timedelta(hours=5),
ready_at=now - timedelta(hours=5),
)
)
store.save(
SandboxStatus(
sandbox_id="idle1",
profile_id="profile.sandbox-canary",
extension_id="ext.compose-ssh",
state=SandboxState.READY,
consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
expires_at=now + timedelta(hours=2),
created_at=now - timedelta(hours=5),
updated_at=now - timedelta(hours=3),
ready_at=now - timedelta(hours=5),
)
)
with patch("sandboxer.lifecycle.expire.load_profile") as load_profile:
load_profile.side_effect = lambda pid: _profile(
idle_reap="2h" if pid == "profile.sandbox-canary" else None
)
candidates = find_expire_candidates(store, now=now)
reasons = {c.sandbox_id: c.reason for c in candidates}
assert reasons["expired1"] == "ttl"
assert reasons["idle1"] == "idle"


class FakeBackend:
    def provision(self, profile, inputs, host):
        return {
            "sandbox_id": "test1234",
            "host": host,
            "remote_dir": "/tmp/sandboxer/test1234",
            "compose_project": "sbx-e2e-test1234",
            "compose_file": "docker-compose.yml",
            "ssh_user": "root",
        }

    def wait_ready(self, handle):
        return {
            "ssh": f"root@{handle['host']}",
            "remote_dir": handle["remote_dir"],
            "compose_project": handle["compose_project"],
            "host": handle["host"],
        }

    def teardown(self, handle):
        return {"compose_removed": "True", "remote_dir_removed": "True"}


def test_manager_create_sets_expires_at(store: SandboxStore) -> None:
    manager = SandboxManager(store=store)
    request = SandboxCreateRequest(
        profile="profile.compose-e2e",
        inputs={"repo": "/tmp/repo"},
        consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
        ttl="2h",
    )
    fake = FakeBackend()
    with (
        patch("sandboxer.core.manager.resolve_backend", return_value=fake),
        patch("sandboxer.core.manager.emit_lifecycle_event", return_value=None),
        patch("sandboxer.core.manager.resolve_host", return_value="coulombcore"),
    ):
        status = manager.create(request)
    assert status.ttl == "2h"
    assert status.expires_at is not None
    assert status.ready_at is not None
    assert status.expires_at > status.ready_at


def test_manager_extend_ttl(store: SandboxStore) -> None:
    now = datetime.now(UTC)
    store.save(
        SandboxStatus(
            sandbox_id="live1234",
            profile_id="profile.compose-e2e",
            extension_id="ext.compose-ssh",
            state=SandboxState.READY,
            consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
            host="coulombcore",
            reachability=Reachability(remote_dir="/tmp/x", host="coulombcore"),
            ttl="4h",
            expires_at=now + timedelta(hours=1),
            created_at=now - timedelta(hours=1),
            updated_at=now,
            ready_at=now - timedelta(hours=1),
        )
    )
    manager = SandboxManager(store=store)
    with patch("sandboxer.core.manager.emit_lifecycle_event", return_value=None):
        extended = manager.extend_ttl("live1234", "2h")
    assert extended.expires_at > now + timedelta(hours=1)


def test_manager_expire_dry_run_and_apply(store: SandboxStore) -> None:
    now = datetime.now(UTC)
    store.save(
        SandboxStatus(
            sandbox_id="gone5678",
            profile_id="profile.compose-e2e",
            extension_id="ext.compose-ssh",
            state=SandboxState.READY,
            consumer=Consumer(actor=ActorType.ADM, project="sand-boxer"),
            host="coulombcore",
            reachability=Reachability(
                remote_dir="/tmp/sandboxer/gone5678",
                compose_project="sbx-e2e-gone5678",
                host="coulombcore",
            ),
            inputs={"compose_file": "docker-compose.yml"},
            ttl="1h",
            expires_at=now - timedelta(minutes=1),
            created_at=now - timedelta(hours=2),
            updated_at=now - timedelta(hours=2),
            ready_at=now - timedelta(hours=2),
        )
    )
    manager = SandboxManager(store=store)
    fake = FakeBackend()
    dry = manager.expire(apply=False, now=now)
    assert len(dry) == 1
    assert dry[0].action == "dry-run"
    assert manager.get("gone5678").state == SandboxState.READY
    with (
        patch("sandboxer.core.manager.resolve_backend", return_value=fake),
        patch("sandboxer.core.manager.emit_lifecycle_event", return_value=None),
        patch("sandboxer.core.manager.load_extension"),
        patch("sandboxer.core.manager.load_profile"),
    ):
        applied = manager.expire(apply=True, now=now)
    assert applied[0].action == "destroyed"
    assert manager.get("gone5678").state == SandboxState.DESTROYED


@@ -4,7 +4,7 @@ type: workplan
title: "TTL enforcement and operational hardening"
domain: infotech
repo: sand-boxer
status: ready
status: finished
owner: codex
topic_slug: custodian
created: "2026-06-24"
@@ -30,7 +30,7 @@ consumer profiles), SAND-WP-0012 (Packer orchestration)
```task
id: SAND-WP-0009-T01
status: todo
status: done
priority: high
state_hub_task_id: "44cee754-2874-40eb-9cb3-168e5bc8dd54"
```
@@ -43,7 +43,7 @@ max-cap enforcement.
```task
id: SAND-WP-0009-T02
status: todo
status: done
priority: high
state_hub_task_id: "a5a6503c-56a3-4876-8211-e06b9eed6292"
```
@@ -56,7 +56,7 @@ Persist in `SandboxStore`. Emit expiry in State Hub `detail`.
```task
id: SAND-WP-0009-T03
status: todo
status: done
priority: high
state_hub_task_id: "ff32a3e5-0bf6-479c-8373-d601588461e7"
```
@@ -69,7 +69,7 @@ HTTP: `PATCH /v1/sandboxes/{id}/ttl` with body `{"duration": "2h"}`.
```task
id: SAND-WP-0009-T04
status: todo
status: done
priority: high
state_hub_task_id: "ce597f28-a2f3-44ed-8e85-f8bd254bc4ce"
```
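
For the `PATCH /v1/sandboxes/{id}/ttl` endpoint with body `{"duration": "2h"}` named in the workplan, a client-side request might be built as in the sketch below. The helper name `build_extend_ttl_request`, the base URL, and the port are hypothetical and chosen only for illustration; the request is constructed but not sent.

```python
import json
from urllib.request import Request


def build_extend_ttl_request(base_url: str, sandbox_id: str, duration: str) -> Request:
    """Build (but do not send) a TTL-extension request. Hypothetical helper."""
    body = json.dumps({"duration": duration}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/v1/sandboxes/{sandbox_id}/ttl",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


# Assumed local address; substitute wherever the HTTP stub actually listens.
req = build_extend_ttl_request("http://localhost:8080", "live1234", "2h")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; the response shape is not specified here.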
@@ -83,7 +83,7 @@ with existing `reap-stale` docs (host inventory vs TTL are distinct concerns).
```task
id: SAND-WP-0009-T05
status: todo
status: done
priority: medium
state_hub_task_id: "9ad34d90-bbc7-4ede-8549-f4291e27ba22"
```
@@ -96,7 +96,7 @@ state; no Temporal code in this repo.
```task
id: SAND-WP-0009-T06
status: todo
status: done
priority: medium
state_hub_task_id: "ffde8196-18e3-4762-8cfd-1b69874e51e1"
```
@@ -110,7 +110,7 @@ run validate if reuse-surface CLI available in environment.
```task
id: SAND-WP-0009-T07
status: todo
status: done
priority: medium
state_hub_task_id: "69b192c7-8599-46e7-bb63-8457bfb72a81"
```
@@ -122,21 +122,20 @@ Align OpenAPI with CLI surface from SAND-WP-0007.
```task
id: SAND-WP-0009-T08
status: todo
status: done
priority: medium
state_hub_task_id: "69d1a23f-b3a3-4aa7-846c-e953f02977f3"
```
`docs/ttl.md` — semantics, extend, expire, profile fields. Update
`docs/meta-framework.md`, `SCOPE.md`, `docs/migration-gaps.md`. Brief security
note in `docs/runbooks/` or `docs/security.md`: sandbox limits blast radius, not
intent enforcement (INTENT design principle).
note in `docs/security.md`: sandbox limits blast radius, not intent enforcement.
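
The tests added in this commit pin down the TTL semantics that `docs/ttl.md` is meant to describe. A minimal sketch of those semantics follows, assuming a single-unit duration grammar (`30m`, `2h`) and hypothetical `*_sketch` helper names. It is illustrative only and not the repository's implementation; in particular, treating the exact expiry instant as already expired is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration_sketch(text: str) -> timedelta:
    """Parse a single-unit duration such as "30m" or "2h" (assumed grammar)."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def extend_expires_at_sketch(current, *, anchor, extension, max_duration=None):
    """Push the expiry forward, refusing to exceed anchor + max_duration."""
    proposed = current + parse_duration_sketch(extension)
    if max_duration is not None:
        cap = anchor + parse_duration_sketch(max_duration)
        if proposed > cap:
            raise ValueError(f"extension exceeds profile max of {max_duration}")
    return proposed


def is_past_expiry_sketch(expires_at, *, now):
    """True once the absolute expiry has been reached (boundary is assumed)."""
    return expires_at is not None and now >= expires_at


def is_idle_expired_sketch(updated_at, idle_reap, *, now):
    """True when no update has occurred within the idle window."""
    return now - updated_at >= parse_duration_sketch(idle_reap)


now = datetime(2026, 6, 24, 12, 0, tzinfo=timezone.utc)
print(extend_expires_at_sketch(now, anchor=now, extension="1h", max_duration="24h"))
```

The cap is measured from a fixed anchor rather than from the current expiry, so repeated extensions cannot accumulate past the profile maximum.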
## Tests
```task
id: SAND-WP-0009-T09
status: todo
status: done
priority: high
state_hub_task_id: "0683b09a-0dd9-4880-9bd0-13003e3621a6"
```