perf(doi): fingerprint-based DB cache for DoI results

Adds doi_cache table (migration k8f9a0b1c2d3). Results are stored after
each evaluation and reused on subsequent requests when the fingerprint
matches. Fingerprint covers repo.updated_at, latest TPSC snapshot_at,
latest goal updated_at, and mtime of SCOPE.md / CLAUDE.md / tpsc.yaml.
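
The lookup described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the `doi_cache` column names (`repo_id`, `fingerprint`, `result_json`), the helper `get_or_compute`, and the use of `sqlite3` are all assumptions, since the migration itself is not shown here.

```python
import json
import sqlite3
from typing import Callable

# Hypothetical schema: the real doi_cache columns are not shown in this commit.
SCHEMA = """
CREATE TABLE IF NOT EXISTS doi_cache (
    repo_id     TEXT PRIMARY KEY,
    fingerprint TEXT NOT NULL,
    result_json TEXT NOT NULL
)
"""


def get_or_compute(
    conn: sqlite3.Connection,
    repo_id: str,
    fingerprint: str,
    evaluate: Callable[[], dict],
) -> tuple[dict, bool]:
    """Return (result, cache_hit); re-evaluate only when the fingerprint differs."""
    row = conn.execute(
        "SELECT fingerprint, result_json FROM doi_cache WHERE repo_id = ?",
        (repo_id,),
    ).fetchone()
    if row is not None and row[0] == fingerprint:
        # Stored fingerprint matches the current inputs: reuse the stored result.
        return json.loads(row[1]), True

    # Missing or stale entry: run the evaluation and overwrite the row.
    result = evaluate()
    conn.execute(
        "INSERT OR REPLACE INTO doi_cache (repo_id, fingerprint, result_json) "
        "VALUES (?, ?, ?)",
        (repo_id, fingerprint, json.dumps(result)),
    )
    conn.commit()
    return result, False
```

Keying the row on the repo and comparing a single fingerprint string keeps invalidation to one equality check, at the cost of recomputing everything for that repo whenever any one input changes.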

Behaviour:
- Summary (warm cache, nothing changed): ~0.4s (was 0.9s)
- Summary (one repo stale): ~0.9s (only stale repos recomputed)
- Single repo (cache hit): ~0.2s (was 40s for full check)
- Single repo ?force_refresh=true: ~2s (full C7/C13 subprocess check)
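
The summary behaviour listed above (only stale repos recomputed, with `force_refresh` bypassing the cache) might look roughly like this. It is a sketch under assumptions: `refresh_summary`, the in-memory `cache` dict, the `id` key on each repo, and the `fingerprint_of` / `evaluate` callables are hypothetical stand-ins for the real DB-backed code.

```python
from typing import Callable


def refresh_summary(
    repos: list[dict],
    cache: dict[str, tuple[str, dict]],
    fingerprint_of: Callable[[dict], str],
    evaluate: Callable[[dict], dict],
    force_refresh: bool = False,
) -> tuple[dict[str, dict], list[str]]:
    """Return (results by repo id, ids that were recomputed)."""
    results: dict[str, dict] = {}
    recomputed: list[str] = []
    for repo in repos:
        repo_id = repo["id"]  # assumed key; the real repo dict may differ
        fingerprint = fingerprint_of(repo)
        cached = cache.get(repo_id)
        if not force_refresh and cached is not None and cached[0] == fingerprint:
            # Fingerprint unchanged: serve the cached result.
            results[repo_id] = cached[1]
            continue
        # Stale, missing, or forced: run the full evaluation for this repo only.
        result = evaluate(repo)
        cache[repo_id] = (fingerprint, result)
        results[repo_id] = result
        recomputed.append(repo_id)
    return results, recomputed
```

With a warm cache the loop costs one fingerprint computation per repo, which is consistent with the summary timing depending on how many repos are stale.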

Total journey: 108s (original) → 6s → <1s → 0.2s (cached single repo)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:47:19 +01:00
parent 245cd72ba3
commit 1ee0343f75
5 changed files with 251 additions and 38 deletions


@@ -45,6 +45,35 @@ class DoIReport:
    checked_at: str = field(default_factory=lambda: datetime.now(tz=timezone.utc).isoformat())


def compute_fingerprint(
    repo: dict,
    latest_tpsc_snap_at: str | None,
    latest_goal_updated_at: str | None,
) -> str:
    """Compute a pipe-joined fingerprint of all inputs that affect DoI criteria.

    If any component changes, the fingerprint changes and the cache is invalidated:
      - repo.updated_at → covers last_sbom_at, remote_url, host_paths, domain changes
      - latest_tpsc_snap_at → C9 (TPSC snapshot exists)
      - latest_goal_updated_at → C10 (active repo goal)
      - mtime of SCOPE.md, CLAUDE.md, tpsc.yaml → C5, C6, C9, C11, C12
    """
    parts = [
        str(repo.get("updated_at") or ""),
        str(latest_tpsc_snap_at or ""),
        str(latest_goal_updated_at or ""),
    ]
    repo_path = _resolve_path(repo)
    if repo_path:
        for fname in ("SCOPE.md", "CLAUDE.md", "tpsc.yaml"):
            f = Path(repo_path) / fname
            try:
                parts.append(f"{fname}:{f.stat().st_mtime:.3f}")
            except FileNotFoundError:
                parts.append(f"{fname}:absent")
    return "|".join(parts)


def _resolve_path(repo: dict) -> str:
    hostname = socket.gethostname()
    host_paths = repo.get("host_paths") or {}