railiance-forge/workplans/FORGE-WP-0003-actions-runner-substrate.md
tegwick de6178764c
Record haskelseed runner smoke state
2026-06-08 00:51:50 +02:00


id: FORGE-WP-0003
type: workplan
title: Gitea Actions runner substrate for Railiance workloads
domain: railiance
repo: railiance-forge
status: active
owner: codex
topic_slug: railiance
planning_priority: high
created: 2026-06-07
updated: 2026-06-07
state_hub_workstream_id: 149a0316-64d1-4664-96d0-274577c32e63

Gitea Actions runner substrate for Railiance workloads

Context

Inter-Hub reported that its production deployment is blocked on a forge-owned Actions runner substrate. The inter-hub workflow currently targets the runner labels self-hosted and haskelseed, but production remained on the older API surface after the deployment-trigger commits were pushed. The current forge migration notes explicitly excluded an Actions runner deployment, while the forge operating contract says railiance-forge owns runner deployment, registration, labels, credential boundaries, and health evidence.

This workplan turns that ownership contract into an actionable runner substrate without weakening repo or app boundaries. It should unblock inter-hub only after the runner is registered, visible, and has passed a non-production sample job.

T01 - Register blocker and dependency evidence

id: FORGE-WP-0003-T01
status: done
priority: high
state_hub_task_id: "b5a42f74-7792-4fbc-8e1f-16c1082ea194"

Capture the immediate dependency chain:

  • inter-hub R7 waits on a self-hosted runner for labels currently written as self-hosted and haskelseed;
  • hub.coulomb.social still serves the older API surface after pushed deployment-trigger commits;
  • docs/first-migration-plan.md made runner deployment a non-goal for the first forge migration;
  • docs/ci-runner-actions-gitops-ownership.md assigns runner substrate ownership to railiance-forge.

Done when this workplan is registered in State Hub and the unread forge inbox messages that created the blocker are marked read.


T02 - Inventory current Gitea Actions state

id: FORGE-WP-0003-T02
status: done
priority: high
state_hub_task_id: "87181d63-049e-4a2b-a5e3-bf16763246d7"

Inspect the current Gitea Actions configuration without printing secrets.

Check:

  • whether Actions are enabled for the current Gitea instance;
  • whether any act_runner service is already registered and online;
  • whether a haskelseed runner exists, and which labels it advertises;
  • runner logs around the inter-hub Build and Deploy attempts;
  • registry tags for the blocked inter-hub commits, including the commit tag and latest where applicable.

Done when the actual current runner/registry state is recorded as non-secret evidence in the repo and State Hub.

2026-06-07: Added docs/gitea-actions-runner-evidence.md and make runner-status to capture non-secret inventory. Current session evidence: public inter-hub /api/v2/hubs still returns 404, the direct haskelseed SSH alias timed out, and skopeo is unavailable for registry tag inspection. After ops-bridge was updated, haskelseed is reachable at root@192.168.178.135 with /home/worsch/.ssh/id_ops. Haskelseed has act_runner v0.6.1-1-g8e6b3be9 and /root/.runner registered as haskelseed with labels haskelseed:host, linux:host, and x86_64:host, but no OpenRC service or live runner process was observed. This task still waits on Gitea runner admin visibility and registry tag inspection.

2026-06-07: Activated the existing haskelseed runner registration through ops-bridge. Backed up /root/.runner to /root/.runner.bak-20260607225905, updated labels to include self-hosted, linux_amd64, container-build, and registry-publish, installed the OpenRC service from runner/act-runner-haskelseed.openrc.example, and started act_runner as PID 5911. The daemon log reports that runner haskelseed declared successfully with labels self-hosted, haskelseed, linux, linux_amd64, x86_64, container-build, and registry-publish.

2026-06-08: Completed current-state inventory. Gitea created forge-runner-smoke.yaml run #1 for commit 19ee47fe82, but the run remains in the Waiting state with a duration of 0s. The haskelseed login shell has the deploy tools needed by inter-hub (skopeo, helm, kubectl, nix, git, curl). Registry inspection from haskelseed shows that the inter-hub tags 91037a4, ae9e497, fa96fb8, 7cc3173, and latest all return manifest unknown, confirming that the blocked inter-hub workflow did not publish those images.


T03 - Decide runner placement, labels, and capacity rules

id: FORGE-WP-0003-T03
status: done
priority: high
state_hub_task_id: "eecde550-43a5-4d77-8e19-c991c5456b42"

Choose the first supported runner model.

Decisions:

  • place the runner on haskelseed or on a separate approved runner host;
  • publish semantic labels such as linux, container-build, and registry-publish;
  • decide whether to keep compatibility labels like self-hosted and haskelseed during the first unblock;
  • use concurrency 1 or an explicit build lock if haskelseed remains shared infrastructure;
  • treat cluster-deploy or cluster-access labels as separate approvals, not as implicit side effects of the build runner.

Done when the label and placement contract is documented with any required human approvals called out.

2026-06-07: Documented the first supported runner model in docs/gitea-actions-runner-substrate.md: one haskelseed compatibility runner named railiance-haskelseed-build-01, capacity 1, compatibility labels self-hosted and haskelseed, semantic labels linux, linux_amd64, container-build, and registry-publish, and no implicit cluster-deploy label.
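
The label contract can be checked mechanically before a consumer relies on it. The sketch below is illustrative only, not the forge's actual tooling. It assumes that act_runner labels may carry a `:schema` suffix (as in `linux:host` from the T02 inventory) and that a job is eligible for a runner only when every `runs-on` label is advertised by that runner. The function names and the `SEPARATELY_APPROVED` set are hypothetical.

```python
from typing import Iterable

# Labels that need a separate approval per T03 (names taken from this workplan).
SEPARATELY_APPROVED = {"cluster-deploy", "cluster-access"}


def label_name(raw: str) -> str:
    """Return the bare label name, dropping an optional ':schema' suffix."""
    return raw.split(":", 1)[0].strip()


def runner_accepts(runner_labels: Iterable[str], runs_on: Iterable[str]) -> bool:
    """Assumed rule: every runs-on label must be advertised by the runner."""
    advertised = {label_name(label) for label in runner_labels}
    return set(runs_on) <= advertised


def implicit_authority(runner_labels: Iterable[str]) -> set[str]:
    """Labels that would grant cluster authority without a separate approval."""
    return {label_name(label) for label in runner_labels} & SEPARATELY_APPROVED


# Label sets as recorded in T02, before and after the 2026-06-07 activation.
before = ["haskelseed:host", "linux:host", "x86_64:host"]
after = ["self-hosted", "haskelseed", "linux", "linux_amd64", "x86_64",
         "container-build", "registry-publish"]

print(runner_accepts(before, ["self-hosted", "haskelseed"]))  # False
print(runner_accepts(after, ["self-hosted", "haskelseed"]))   # True
print(implicit_authority(after))                              # set()
```

Under these assumptions, the original registration could not have matched a job targeting self-hosted and haskelseed, while the updated label set can, and neither set carries a cluster label.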


T04 - Build the runner deployment and recovery runbook

id: FORGE-WP-0003-T04
status: done
priority: high
state_hub_task_id: "a3d0adfb-d1f9-4a5f-8e05-c4a8fbb160b1"

Create the forge-owned runner operating surface.

Include:

  • installation or service definition for the selected runner host;
  • registration-token custody path, referenced by name only;
  • start, stop, restart, drain, replacement, and token-rotation steps;
  • log inspection commands that avoid secret output;
  • health and label inspection commands;
  • rollback or disable path for a bad runner registration.

Done when an operator can register and operate the runner from the forge repo without committing decrypted secrets or machine-local assumptions.

2026-06-07: Added the attended install/recovery runbook, non-secret runner/ templates, systemd and OpenRC service examples, make runner-docs, make runner-status, and make check-runner-tools. Registration tokens are referenced by file path only and are never committed.
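
Log inspection that avoids secret output can be approximated by masking lines before they are displayed or recorded as evidence. The following is a rough sketch under stated assumptions: the key names in the pattern and the `redact` helper are hypothetical, and they do not describe the actual act_runner log format, which should be confirmed before relying on a filter like this.

```python
import re

# Hypothetical pattern: key=value or key: value pairs whose key ends in a
# credential-like word. Real log formats may differ.
_CREDENTIAL = re.compile(r"(token|secret|password)(\s*[=:]\s*)(\S+)", re.IGNORECASE)


def redact(line: str) -> str:
    """Mask values that look like credentials, keeping the key for context."""
    return _CREDENTIAL.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", line)


sample = "registering runner, token=abc123 labels=linux"
print(redact(sample))  # registering runner, token=[REDACTED] labels=linux
```

A filter like this is a safety net rather than a custody control; the runbook still references registration tokens by file path only.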


T05 - Prove a non-production sample job

id: FORGE-WP-0003-T05
status: wait
priority: high
state_hub_task_id: "9ada5b3e-2ddb-4a55-b9f4-5a6e00fef8b2"

Run a tiny non-production workflow against the runner before using it for inter-hub deployment.

The proof should show:

  • job scheduling reaches the expected runner;
  • labels match the published contract;
  • build tooling required by the first supported workload is present;
  • no cluster deployment authority is granted unless separately approved;
  • logs and State Hub evidence identify the runner and commit without exposing tokens.

Done when the sample job result is recorded and consumers can cite the runner label as available.

2026-06-07: Added .gitea/workflows/forge-runner-smoke.yaml. It cannot pass until an approved runner is registered and visible to Gitea.

2026-06-07: Haskelseed now has a running runner with matching labels. Smoke execution is still pending until the workflow exists in the remote Gitea repo and is dispatched or triggered.

2026-06-08: The workflow exists in Gitea and run #1 was created from the push, but it is still Waiting. This task now waits on authenticated Gitea Actions inspection to approve, rerun, or diagnose runner assignment.


T06 - Unblock the inter-hub deployment path

id: FORGE-WP-0003-T06
status: wait
priority: high
state_hub_task_id: "53929202-40aa-4470-a249-9d0ee02d3213"

Coordinate the first real consumer unblock with inter-hub after T05 passes.

Steps:

  • confirm the inter-hub workflow can target the approved runner labels;
  • rerun or inspect the Build and Deploy workflow for the blocked commits;
  • verify the expected inter-hub image tag exists in the registry;
  • hand off runner evidence and any workflow adjustment recommendation to inter-hub;
  • avoid repeated production push probes until the runner is visible and ready.

Done when inter-hub has a clear deployment result or a narrower non-runner blocker.

2026-06-07: Inter-hub unblock remains gated on T05. Do not rerun production push probes until the forge smoke workflow passes.
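
Verifying the expected image tag can be reduced to classifying each inspection result. The sketch below assumes an inspection tool (for example skopeo, which the T02 inventory found on haskelseed) has already been run and its exit code and error text captured. The `TagResult` type and `missing_tags` helper are hypothetical names, and matching on the text `manifest unknown` is an assumption based on the wording recorded in T02.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagResult:
    tag: str
    exit_code: int
    stderr: str = ""


def missing_tags(results: list[TagResult]) -> list[str]:
    """Return tags whose inspection failed with a 'manifest unknown' error."""
    return [
        r.tag
        for r in results
        if r.exit_code != 0 and "manifest unknown" in r.stderr.lower()
    ]


# Illustrative input only: the error text is a placeholder, not captured
# output, and "example-ok" is a made-up tag standing in for a published image.
results = [
    TagResult("91037a4", 1, "manifest unknown"),
    TagResult("latest", 1, "manifest unknown"),
    TagResult("example-ok", 0),
]
print(missing_tags(results))  # ['91037a4', 'latest']
```

An empty result for the blocked commits would be the signal that the Build and Deploy workflow published its images; any other failure mode is deliberately left unclassified so it surfaces as a narrower blocker.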


T07 - Publish runner evidence and ongoing health checks

id: FORGE-WP-0003-T07
status: done
priority: medium
state_hub_task_id: "c959a553-ec48-4e98-a752-168a2b067a81"

Update forge evidence docs and read-only operator targets so the runner is not a one-off fix.

Include:

  • runner inventory by label, placement, and trust level;
  • last successful sample job and any publish job evidence;
  • expected logs, dashboards, or status commands;
  • documented alert or escalation condition for stuck jobs and offline runners;
  • Forgejo migration notes so the same semantic labels can survive the future Gitea-to-Forgejo cutover.

Done when forge can continuously explain whether the runner substrate is healthy and what labels downstream workflows may depend on.

2026-06-07: Published runner evidence docs and Makefile probes. Current health is explicitly not proven: no runner registration has been observed from this session, and live host/Gitea inspection requires attended access.
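
An escalation condition for stuck jobs and offline runners can be written as a small predicate over non-secret status fields. This is a sketch under stated assumptions: the `RunStatus` fields, the 15-minute threshold, and the `needs_escalation` helper are all hypothetical and would need to be aligned with whatever the status commands actually report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical threshold; this workplan does not define one.
MAX_WAIT = timedelta(minutes=15)


@dataclass(frozen=True)
class RunStatus:
    state: str           # e.g. "Waiting", "Running", "Success"
    created: datetime    # when the run was created
    runner_online: bool  # whether a live runner process was observed


def needs_escalation(status: RunStatus, now: datetime) -> bool:
    """Escalate when the runner is offline or a run has waited too long."""
    if not status.runner_online:
        return True
    return status.state == "Waiting" and now - status.created > MAX_WAIT


# Illustrative timestamps, not taken from a real run.
created = datetime(2026, 6, 8, 0, 0)
stuck = RunStatus("Waiting", created, runner_online=True)
print(needs_escalation(stuck, created + timedelta(hours=1)))    # True
print(needs_escalation(stuck, created + timedelta(minutes=5)))  # False
```

Keeping the threshold as a single named constant makes it easy to tighten once real queue times for the smoke workflow are known.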