From a2f1c1299cf08c09d7dab51727a7fab768730e21 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Wed, 1 Jul 2026 23:52:05 +0200
Subject: [PATCH] Finish RAILIANCE-WP-0014 activity-core llm-connect live reconcile
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Provider Secret gate cleared; full reconcile passed with fixture smoke
(health=ok, latency 2.084s). Harden the smoke against NetworkPolicy
allowlist propagation by retrying up to 6x with a 5s warm-up inside the
smoke pod — the netpol added 2026-06-19 rejected the pod's immediate
first request before its IP propagated.

Co-Authored-By: Claude Fable 5
---
 ...iliance-reconcile-activity-core-llm-connect | 2 +-
 ...activity-core-llm-connect-live-reconcile.md | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/tools/cmd/railiance-reconcile-activity-core-llm-connect b/tools/cmd/railiance-reconcile-activity-core-llm-connect
index 7a58207..0742f2e 100755
--- a/tools/cmd/railiance-reconcile-activity-core-llm-connect
+++ b/tools/cmd/railiance-reconcile-activity-core-llm-connect
@@ -269,7 +269,7 @@ kubectl -n $(quote "$NAMESPACE") run llm-connect-smoke-\$(date +%s) \\
  --image-pull-policy=Never \\
  --env=LLM_CONNECT_URL=$(quote "$EXPECTED_URL") \\
  --env=LLM_CONNECT_TIMEOUT_SECONDS=$(quote "$EXPECTED_TIMEOUT") \\
- -- python scripts/smoke_activity_core_endpoint.py
+ --command -- sh -c 'for i in 1 2 3 4 5 6; do sleep 5; python scripts/smoke_activity_core_endpoint.py && exit 0; echo "smoke attempt \$i failed; retrying"; done; exit 1'
 EOF
 )" 2>&1
 )"
diff --git a/workplans/RAILIANCE-WP-0014-activity-core-llm-connect-live-reconcile.md b/workplans/RAILIANCE-WP-0014-activity-core-llm-connect-live-reconcile.md
index 340fbe3..e088293 100644
--- a/workplans/RAILIANCE-WP-0014-activity-core-llm-connect-live-reconcile.md
+++ b/workplans/RAILIANCE-WP-0014-activity-core-llm-connect-live-reconcile.md
@@ -4,11 +4,11 @@ type: workplan
 title: "activity-core llm-connect live reconcile"
 domain: financials
 repo: railiance-cluster
-status: blocked
+status: finished
 owner: codex
 topic_slug: railiance
 created: "2026-06-18"
-updated: "2026-06-18"
+updated: "2026-07-01"
 state_hub_workstream_id: "a152ddda-d60a-4a65-9b9c-59e2db9ff2b7"
 ---
 
@@ -70,7 +70,7 @@ values. Live evidence note `c72c514a-399e-4c54-8d5b-d36405932360` confirms
 
 ```task
 id: RAILIANCE-WP-0014-T03
-status: wait
+status: done
 priority: high
 state_hub_task_id: "ae8af00a-c14f-4b76-933c-46d06cd360ae"
 ```
@@ -87,7 +87,17 @@ run the in-namespace fixture smoke with `imagePullPolicy=Never`, and post
 non-secret evidence: provider Secret key count, deployment readiness,
 pass/fail, latency/recommendation summary or sanitized failure.
 
-Current live gate on 2026-06-18: provider Secret
+2026-07-01: Gate closed. Provider Secret `activity-core/llm-connect-provider-secrets`
+present (key count 1, no values inspected), overlay applied (no drift),
+deployment `llm-connect` ready 1/1, in-namespace fixture smoke passed
+(`health=ok latency_seconds=2.084 recommendations=1`). Evidence note
+`bddbf5d2-6cbe-4d97-9de6-689147d61be1`. The first rerun failed with
+`Connection refused` because the `llm-connect-activity-core-only`
+NetworkPolicy (added 2026-06-19) allowlist had not yet propagated the fresh
+smoke-pod IP; the reconcile tool now retries the smoke up to 6× with a 5s
+warm-up inside the pod.
+
+Historical live gate on 2026-06-18: provider Secret
 `activity-core/llm-connect-provider-secrets` is missing, so deployment and
 smoke are intentionally blocked until operator/OpenBao-to-Kubernetes Secret
 custody is complete. Evidence note