Add reference-counted garbage collection

2026-05-17 00:26:19 +02:00
parent 1dae855700
commit a60d24f814
7 changed files with 631 additions and 3 deletions


@@ -99,8 +99,20 @@ default_duration_seconds = 220752000
```
Run `artifactstore retention sweep` from cron or another scheduler to mark
-expired, unheld packages eligible for deletion. This work only records
-eligibility; it never deletes bytes.
+expired, unheld packages eligible for deletion. Then run
+`artifactstore retention gc` to release the eligible packages' storage
+locations and delete physical objects whose final reference has been
+released:
+```sh
+artifactstore retention sweep
+artifactstore retention gc
+```
+GC is reference-counted by `(backend_id, content_address)`: shared bytes stay in
+the backend until every non-deleted storage location has been released. Each
+released location emits a `v1.storage.location_deleted` event. A package becomes
+`garbage_collected` only after all of its storage locations are released.
## Database backends
@@ -205,6 +217,7 @@ digest, emits `v1.storage.location_verified`, and marks failed locations as
| `artifactstore push <dir>` | Push a directory through the HTTP API and finalize the package. |
| `artifactstore manifest <package_id>` | Fetch the JSON manifest projection through the HTTP API. |
| `artifactstore retention sweep` | Run one deletion-eligibility sweep against the configured DB. |
+| `artifactstore retention gc` | Run one reference-counted garbage-collection pass. |
| `artifactstore storage verify --backend <id>` | Re-read stored objects for a backend and record verification events. |
| `artifactstore guide-board ingest <run-dir>` | Ingest one guide-board run directory as an artifact package. |
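
The reference-counting rule described in the README change can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code from this commit: the `Location` and `Store` types, their field names, and the `gc_pass` function are hypothetical, and the real implementation works against the configured database and storage backends. Only the `(backend_id, content_address)` key, the `v1.storage.location_deleted` event name, and the `garbage_collected` state come from the documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    # Hypothetical record of one stored copy of a package's bytes.
    package_id: str
    backend_id: str
    content_address: str
    released: bool = False


@dataclass
class Store:
    # Hypothetical in-memory stand-in for the database and backends.
    locations: list
    eligible: set                 # package ids marked by the retention sweep
    objects: set                  # physical objects as (backend_id, content_address)
    events: list = field(default_factory=list)
    garbage_collected: set = field(default_factory=set)


def gc_pass(store: Store) -> None:
    """Release locations of eligible packages; delete bytes on the last release."""
    for loc in store.locations:
        if loc.released or loc.package_id not in store.eligible:
            continue
        loc.released = True
        store.events.append(
            ("v1.storage.location_deleted", loc.package_id, loc.content_address)
        )
        key = (loc.backend_id, loc.content_address)
        # The physical object survives while any unreleased location shares the key.
        still_referenced = any(
            not other.released
            and (other.backend_id, other.content_address) == key
            for other in store.locations
        )
        if not still_referenced:
            store.objects.discard(key)
    # A package is garbage-collected only once all of its locations are released.
    for package_id in store.eligible:
        if all(l.released for l in store.locations if l.package_id == package_id):
            store.garbage_collected.add(package_id)


def example_store() -> Store:
    # Packages "A" and "B" share one object; "A" also owns a second object.
    return Store(
        locations=[
            Location("A", "s3", "sha256:aa"),
            Location("A", "s3", "sha256:bb"),
            Location("B", "s3", "sha256:aa"),
        ],
        eligible={"A"},
        objects={("s3", "sha256:aa"), ("s3", "sha256:bb")},
    )
```

In this model, collecting package `A` removes the object that only `A` referenced but keeps the shared one, because `B` still holds an unreleased location for the same `(backend_id, content_address)`. The shared bytes disappear only after `B` also becomes eligible and a later pass releases its location.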