the-custodian/workplans/CUST-WP-0014-repo-sync-automation.md
tegwick 07031fa63e feat(CUST-WP-0014): repo sync automation & Gitea inventory
- Migration e2f3a4b5c6d7: add last_state_synced_at to managed_repos
- consistency_check.py: PATCH last_state_synced_at after fix run;
  fix ~ treated as non-empty state_hub_task_id (C-03 vs C-11);
  fix _inject_task_id_into_block skipping injection when field exists
  with null value
- install_hooks.sh: idempotent post-commit hook installer for all
  registered repos (make install-hooks REPO= / install-hooks-all)
- gitea_inventory.py: compare coulomb Gitea org against state-hub
  registered repos — registered / unregistered / hub-only sections
- infra/README.md: document systemd user timer + crontab fallback
- systemd user timer: custodian-sync.{service,timer} runs
  fix-consistency-all every 15 min (enabled)
- dashboard/src/repo-sync.md: Repo Sync Health page — sync age table,
  unregistered Gitea repos, hub-only repos
- api/routers/repos.py: GET /repos/{slug}/dispatch endpoint returning
  active goal, pending tasks per workstream, human interventions
- mcp_server/server.py: get_repo_dispatch() MCP tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 01:41:16 +01:00

---
id: CUST-WP-0014
type: workplan
title: Repo Sync Automation & Gitea Inventory
domain: custodian
repo: the-custodian
status: done
state_hub_workstream_id: 27ea80bd-76bf-44a7-b0ed-e09748d5390b
created: 2026-03-16
updated: 2026-03-16
---
# CUST-WP-0014 — Repo Sync Automation & Gitea Inventory
## Problem
When a repo agent completes work and commits, the state-hub does not automatically
learn about it. Task statuses in workplan `.md` files stay unsynced until a human
manually runs `make fix-consistency REPO=<slug>`. This breaks the episodic memory
loop: future sessions read stale hub state and orient themselves incorrectly.

In parallel, the custodian only tracks repos that have been manually registered.
All other repos living on Gitea (`http://92.205.130.254:32166`, org `coulomb`) are
invisible to it: no workplan tracking, no SBOM, no goal alignment.
## Goal
1. **Automatic sync**: after every commit in a registered repo, the state-hub learns
about it within seconds — no agent discipline required.
2. **Gitea inventory**: the hub knows about every repo on Gitea; unregistered repos
are surfaced so they can be onboarded or explicitly marked out-of-scope.
3. **Sync timestamp**: every registered repo carries a `last_state_synced_at`
timestamp so health dashboards can detect stale repos at a glance.
4. **Dispatch endpoint** (Tier 3): the hub can tell any repo what active workplan
it should be working on and what tasks are pending — foundation for autonomous
agent sessions.
## Architecture
```
┌──────────────┐  post-commit hook   ┌───────────────────────────┐
│  repo agent  │ ──────────────────► │ fix-consistency REPO=x    │
│  (any repo)  │                     │ → updates task statuses   │
└──────────────┘                     │ → sets last_state_synced  │
                                     └───────────────────────────┘
┌──────────────────────────┐  cron (15 min)       │
│  fix-consistency-all     │ ─────────────────────┘  (belt & suspenders)
└──────────────────────────┘

┌─────────────────────────────┐
│  Gitea API (:32166/coulomb) │ ──► gitea_inventory.py ──► surface gaps
└─────────────────────────────┘

┌─────────────────────────────────────┐
│  GET /repos/{slug}/dispatch         │ ──► active workplan + pending tasks
└─────────────────────────────────────┘     for autonomous agent sessions
```
---
## Task: Add `last_state_synced_at` to managed_repos
```task
id: CUST-WP-0014-T01
status: todo
priority: high
state_hub_task_id: "f35c86a9-d927-4543-9e74-ff32cadcc766"
```
Migration: add `last_state_synced_at: DateTime (nullable)` to `managed_repos`.
Update `consistency_check.py` to PATCH this field to `utcnow()` after every
successful `--fix` run via `PATCH /repos/{slug}/` (add endpoint if missing).
Update `ManagedRepoRead` schema to include the field.
Acceptance: `GET /repos/the-custodian/` shows `last_state_synced_at` non-null
after running `make fix-consistency REPO=the-custodian`.
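
A minimal sketch of the timestamp PATCH described above, using only the standard library. The helper name `build_sync_patch` and the single-field JSON body are assumptions for illustration; the real `consistency_check.py` may use a different HTTP client or payload. It uses a timezone-aware `datetime.now(timezone.utc)` rather than the naive `utcnow()`, so the serialized value carries an explicit offset.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional


def build_sync_patch(
    base_url: str, slug: str, now: Optional[datetime] = None
) -> urllib.request.Request:
    """Build (but do not send) the PATCH that stamps a repo as synced.

    Assumption: the endpoint accepts a JSON body with just the one field.
    """
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    body = json.dumps({"last_state_synced_at": stamp}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/repos/{slug}/",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Sending is a single call once the state-hub API is up:
#   urllib.request.urlopen(build_sync_patch("http://127.0.0.1:8000", "the-custodian"))
```

Building the request separately from sending it keeps the payload easy to inspect in a dry run.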
---
## Task: Git post-commit hook installer
```task
id: CUST-WP-0014-T02
status: todo
priority: high
state_hub_task_id: "97c831d9-d915-4b77-9dd6-929ff24dfd5e"
```
Create `state-hub/scripts/install_hooks.sh`:
- Accepts `--repo <slug>` or `--all` (iterates `GET /repos/`)
- Resolves repo path from slug (convention: `/home/worsch/<slug>` or via a
`local_path` field — see T05)
- Writes `.git/hooks/post-commit` that calls:
```bash
cd ~/the-custodian/state-hub && make fix-consistency REPO=<slug>
```
- Idempotent: prepends block guarded by `# custodian-sync-hook` marker if hook
already exists; skips if marker present
- Makes hook executable
Add `make install-hooks REPO=<slug>` and `make install-hooks-all` Makefile targets.
Acceptance: commit in `marki-docx` → `last_state_synced_at` updates within 2s.
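
The idempotency rule above can be sketched in Python to make the three cases explicit (no hook yet, hook without marker, hook with marker). This is an illustration of the logic only; the actual installer is a shell script, and the function names `merge_hook` and `install_hook` are hypothetical. Wrapping the command in a subshell, so the `cd` does not leak into the rest of an existing hook, is an added assumption.

```python
from pathlib import Path
from typing import Optional

MARKER = "# custodian-sync-hook"


def render_block(slug: str) -> str:
    """The guarded block; the subshell keeps `cd` local to this command."""
    return (
        f"{MARKER}\n"
        f"(cd ~/the-custodian/state-hub && make fix-consistency REPO={slug})\n"
    )


def merge_hook(existing: Optional[str], slug: str) -> str:
    """Return new hook contents; unchanged if the marker is already present."""
    if existing is None or not existing.strip():
        return "#!/bin/sh\n" + render_block(slug)
    if MARKER in existing:
        return existing  # already installed: skip
    lines = existing.splitlines(keepends=True)
    if lines[0].startswith("#!"):
        # Keep the existing shebang first, then prepend our block.
        first = lines[0] if lines[0].endswith("\n") else lines[0] + "\n"
        return first + render_block(slug) + "".join(lines[1:])
    return "#!/bin/sh\n" + render_block(slug) + existing


def install_hook(repo_path: Path, slug: str) -> Path:
    """Write the merged hook and make it executable."""
    hook = repo_path / ".git" / "hooks" / "post-commit"
    existing = hook.read_text() if hook.exists() else None
    hook.parent.mkdir(parents=True, exist_ok=True)
    hook.write_text(merge_hook(existing, slug))
    hook.chmod(hook.stat().st_mode | 0o111)
    return hook
```

Running `merge_hook` twice on its own output returns the same text, which is the property the installer needs when `install-hooks-all` is re-run.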
---
## Task: Periodic cron sync (belt-and-suspenders)
```task
id: CUST-WP-0014-T03
status: todo
priority: medium
state_hub_task_id: "06be1c0b-893b-4fbb-967c-9842ba59ffaa"
```
Add a cron entry (via systemd user timer or direct crontab) that runs:
```bash
cd ~/the-custodian/state-hub && make fix-consistency-all
```
every 15 minutes when the state-hub API is reachable. Use a guard:
```bash
curl -sf http://127.0.0.1:8000/state/health || exit 0
```
Document the timer setup in `state-hub/infra/README.md` (systemd user timer
preferred on WSL2 if systemd is available; otherwise crontab fallback).
Acceptance: after stopping all agents for 15 min and making a manual workplan
edit, `last_state_synced_at` updates without human intervention.
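
For setups where the timer calls a script rather than a shell one-liner, the health guard could look roughly like the sketch below. The function names and the 5-second timeout are assumptions; the health URL and the `make` target come from the text above. Like `curl -sf ... || exit 0`, it treats an unreachable hub or an HTTP error as "skip quietly" and returns 0.

```python
import subprocess
import urllib.error
import urllib.request
from pathlib import Path
from typing import Callable

HEALTH_URL = "http://127.0.0.1:8000/state/health"
STATE_HUB_DIR = Path.home() / "the-custodian" / "state-hub"


def hub_is_healthy(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """True only if the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP >= 400 all count as unhealthy.
        return False


def run_fix_all() -> int:
    """Run the periodic sync and return its exit code."""
    return subprocess.run(
        ["make", "fix-consistency-all"], cwd=STATE_HUB_DIR
    ).returncode


def guarded_sync(
    is_healthy: Callable[[], bool] = hub_is_healthy,
    run: Callable[[], int] = run_fix_all,
) -> int:
    """Skip with exit code 0 when the hub is down, otherwise run the sync."""
    if not is_healthy():
        return 0
    return run()
```

Returning 0 on an unhealthy hub keeps the systemd unit from being marked failed every time the API is simply not running.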
---
## Task: Gitea repo discovery tool
```task
id: CUST-WP-0014-T04
status: todo
priority: high
state_hub_task_id: "f05a04e4-10f3-4c41-a73f-057f0dea5126"
```
Create `state-hub/scripts/gitea_inventory.py`:
- Reads Gitea base URL + token from env (`GITEA_URL`, `GITEA_TOKEN`) or `.env`
- Calls `GET /api/v1/orgs/coulomb/repos?limit=50&page=N` (paginate)
- Also includes user repos if needed: `GET /api/v1/user/repos`
- Compares result against `GET /repos/` from state-hub
- Outputs three sections:
1. **Registered** — in both (show `last_state_synced_at`)
2. **Unregistered** — on Gitea but not in hub (candidate for onboarding)
3. **Hub-only** — in hub but no matching Gitea remote (stale or local-only)
Add `make gitea-inventory` Makefile target.
Add `GITEA_URL=http://92.205.130.254:32166` and `GITEA_TOKEN=` to `.env.example`.
Acceptance: running `make gitea-inventory` with a valid token prints a clear
three-section report.
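
The core of the comparison is a pair of set differences plus a pagination loop. The sketch below assumes that a Gitea repo's `name` matches the hub slug and that the API returns an empty list once `page` runs past the last page; `fetch_page` is a hypothetical callable standing in for the HTTP request.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def iter_org_repos(fetch_page: Callable[[int], List[dict]]) -> Iterator[dict]:
    """Yield repos page by page until an empty page is returned.

    `fetch_page(n)` stands in for
    GET /api/v1/orgs/coulomb/repos?limit=50&page=n.
    """
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            return
        yield from batch
        page += 1


def classify(
    gitea_names: Iterable[str], hub_slugs: Iterable[str]
) -> Dict[str, List[str]]:
    """Split repo names into the three report sections."""
    gitea, hub = set(gitea_names), set(hub_slugs)
    return {
        "registered": sorted(gitea & hub),    # in both
        "unregistered": sorted(gitea - hub),  # on Gitea, not in hub
        "hub_only": sorted(hub - gitea),      # in hub, no Gitea match
    }
```

If slugs and Gitea names can differ, the match would need to go through the repo's remote URL instead of the bare name.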
---
## Task: Dashboard — Repo Sync Health page
```task
id: CUST-WP-0014-T05
status: todo
priority: medium
state_hub_task_id: "ceae2737-4762-49e5-ae41-9eca3ca79dda"
```
Add `dashboard/src/repo-sync.md` Observable page:
- Table: all registered repos, `last_state_synced_at` (age in h/m), colour-coded
(green < 1h, orange 1-24h, red > 24h or null)
- Section: Gitea repos not yet registered (calls a new data loader that wraps
`gitea_inventory.py --json`)
- Inline "Register" action placeholder (links to `make register-project` docs)
Add to nav in `observablehq.config.js`.
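
The colour thresholds can be pinned down with a small sketch. It is written in Python for consistency with the other scripts, although the dashboard itself would implement this in the Observable page; the function names are hypothetical, and the orange band is read as 1 to 24 hours inclusive.

```python
from datetime import datetime, timedelta
from typing import Optional


def sync_colour(last_synced: Optional[datetime], now: datetime) -> str:
    """Map sync age to a status colour: green < 1h, orange 1-24h, else red."""
    if last_synced is None:
        return "red"  # never synced
    age = now - last_synced
    if age < timedelta(hours=1):
        return "green"
    if age <= timedelta(hours=24):
        return "orange"
    return "red"


def format_age(age: timedelta) -> str:
    """Render an age as hours and minutes, e.g. '3h 5m' or '42m'."""
    total_minutes = int(age.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"
```

Passing `now` explicitly keeps the function deterministic, which makes the boundary cases (exactly 1h, exactly 24h) easy to check.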
---
## Task: Dispatch endpoint
```task
id: CUST-WP-0014-T06
status: todo
priority: low
state_hub_task_id: "86b646f3-a966-4ff4-9c9f-8684f1e81c54"
```
Add a `GET /repos/{slug}/dispatch` route to `api/routers/repos.py`.
Response shape:
```json
{
"repo_slug": "marki-docx",
"active_goal": { "id": "...", "title": "...", "description": "..." },
"active_workstreams": [
{
"id": "...",
"title": "...",
"workplan_file": "workplans/MRKD-WP-0001-level1-core.md",
"pending_tasks": [
{ "id": "...", "title": "...", "priority": "high", "needs_human": false }
]
}
],
"human_interventions": [...],
"last_state_synced_at": "2026-03-16T..."
}
```
`workplan_file` is derived from the workstream's `slug` field matched against
known workplan naming conventions — or stored explicitly (stretch: add
`workplan_path` column to workstreams).
This endpoint is the foundation for a cron-triggered autonomous agent session:
```bash
curl http://127.0.0.1:8000/repos/marki-docx/dispatch | \
claude --print "You are the marki-docx agent. $(cat -)"
```
MCP tool: `get_repo_dispatch(repo_slug)`.
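
A rough sketch of how the response above might be assembled from already-loaded rows. The input shapes (plain dicts with `workstream_id` and `status` keys) and the rule that any status other than `done` counts as pending are assumptions; the real router would work against the hub's ORM models.

```python
from typing import Any, Dict, List, Optional


def build_dispatch(
    repo_slug: str,
    active_goal: Optional[Dict[str, Any]],
    workstreams: List[Dict[str, Any]],
    tasks: List[Dict[str, Any]],
    human_interventions: List[Dict[str, Any]],
    last_state_synced_at: Optional[str],
) -> Dict[str, Any]:
    """Group pending tasks under their workstream in the dispatch shape."""
    active = []
    for ws in workstreams:
        pending = [
            {
                "id": t["id"],
                "title": t["title"],
                "priority": t["priority"],
                "needs_human": t.get("needs_human", False),
            }
            for t in tasks
            # Assumption: everything not yet "done" is still pending.
            if t["workstream_id"] == ws["id"] and t["status"] != "done"
        ]
        active.append(
            {
                "id": ws["id"],
                "title": ws["title"],
                "workplan_file": ws.get("workplan_file"),
                "pending_tasks": pending,
            }
        )
    return {
        "repo_slug": repo_slug,
        "active_goal": active_goal,
        "active_workstreams": active,
        "human_interventions": human_interventions,
        "last_state_synced_at": last_state_synced_at,
    }
```

Keeping the assembly in a pure function lets the HTTP route and the `get_repo_dispatch` MCP tool share one code path.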
---
## Milestones
| # | Milestone | Tasks |
|---|-----------|-------|
| M1 | Sync timestamp live | T01 |
| M2 | Auto-sync on commit | T01, T02 |
| M3 | Belt-and-suspenders | T03 |
| M4 | Gitea inventory visible | T04, T05 |
| M5 | Dispatch endpoint ready | T06 |
## Dependencies
- Consistency engine (CUST-WP-0008) — completed ✓
- `managed_repos` table (v0.5) — live ✓
## Out of Scope
- Autonomous agent scheduling (that builds on T06 but is a separate workplan)
- Gitea webhook integration (post-commit hook covers the same use case locally)
- Multi-user Gitea orgs beyond `coulomb`