feat(repos): git-fingerprint-based machine-independent repo identity

Add git_fingerprint (root commit SHA-1) to managed_repos as a stable,
machine-independent identifier — identical across every clone regardless
of checkout path, remote URL, or SSH alias.

- Migration n1i2j3k4l5m6: adds git_fingerprint column + non-unique index
  (non-unique to support repos that share ancestry via forks/splits)
- GET /repos/by-fingerprint?hash=<sha>[&remote_url=<url>]: lookup by
  fingerprint; optional remote_url disambiguates shared-ancestry repos
- GET /repos/by-remote?url=<url>: fallback lookup by remote URL
- consistency_check.py --here [PATH]: auto-detects repo slug from any
  local checkout via fingerprint (falls back to remote URL), then auto-
  registers host_paths[hostname] so subsequent runs need no override
- --all now includes repos with host_paths[current_hostname], not just
  those with local_path
- fix-consistency-here / check-consistency-here Makefile targets
- Fixed _api_get bug: httpx strips query strings when params={} is passed
- Backfilled fingerprints for 14 repos on this host

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-28 23:28:22 +01:00
parent 3f96dc035d
commit 1f8ef7f88b
6 changed files with 232 additions and 7 deletions

View File

@@ -27,6 +27,7 @@ class ManagedRepo(Base, TimestampMixin):
topic_id: Mapped[uuid.UUID | None] = mapped_column(
UUID(as_uuid=True), ForeignKey("topics.id", ondelete="SET NULL"), nullable=True
)
git_fingerprint: Mapped[str | None] = mapped_column(String(40), nullable=True, index=True)
sbom_source: Mapped[str | None] = mapped_column(Text, nullable=True)
last_sbom_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True