Finish RAILIANCE-WP-0014 activity-core llm-connect live reconcile

Provider Secret gate cleared; full reconcile passed with fixture smoke
(health=ok, latency 2.084s). Harden the smoke against NetworkPolicy
allowlist propagation by retrying up to 6x with a 5s warm-up inside the
smoke pod — the netpol added 2026-06-19 rejected the pod's immediate
first request before its IP propagated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-01 23:52:05 +02:00
parent 23c23798be
commit a2f1c1299c
2 changed files with 15 additions and 5 deletions

@@ -269,7 +269,7 @@ kubectl -n $(quote "$NAMESPACE") run llm-connect-smoke-\$(date +%s) \\
--image-pull-policy=Never \\
--env=LLM_CONNECT_URL=$(quote "$EXPECTED_URL") \\
--env=LLM_CONNECT_TIMEOUT_SECONDS=$(quote "$EXPECTED_TIMEOUT") \\
-  -- python scripts/smoke_activity_core_endpoint.py
+  --command -- sh -c 'for i in 1 2 3 4 5 6; do sleep 5; python scripts/smoke_activity_core_endpoint.py && exit 0; echo "smoke attempt \$i failed; retrying"; done; exit 1'
EOF
)" 2>&1
)"
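
The one-line retry loop in the hunk above can be hard to read inside the quoted `sh -c` string. The following is a minimal sketch of the same pattern as a standalone POSIX shell function, not the code this commit ships. The name `retry_with_warmup` and its parameters are hypothetical. Like the committed loop, it sleeps before every attempt (including the first) so the NetworkPolicy allowlist has time to pick up the pod IP. Unlike the committed loop, it prints a different message on the final failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): run a command up to
# $1 times, sleeping $2 seconds before each attempt.
# Returns 0 on the first success, 1 if every attempt fails.
retry_with_warmup() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Warm-up: wait before the attempt, mirroring `sleep 5` in the diff.
    sleep "$delay"
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      echo "smoke attempt $i failed; retrying" >&2
    else
      echo "smoke attempt $i failed; giving up" >&2
    fi
    i=$((i + 1))
  done
  return 1
}
```

Under these assumptions, `retry_with_warmup 6 5 python scripts/smoke_activity_core_endpoint.py` would behave like the committed loop: up to six attempts, each preceded by a 5-second pause, so a persistent failure takes at least 30 seconds to surface.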

@@ -4,11 +4,11 @@ type: workplan
title: "activity-core llm-connect live reconcile"
domain: financials
repo: railiance-cluster
-status: blocked
+status: finished
owner: codex
topic_slug: railiance
created: "2026-06-18"
-updated: "2026-06-18"
+updated: "2026-07-01"
state_hub_workstream_id: "a152ddda-d60a-4a65-9b9c-59e2db9ff2b7"
---
@@ -70,7 +70,7 @@ values. Live evidence note `c72c514a-399e-4c54-8d5b-d36405932360` confirms
```task
id: RAILIANCE-WP-0014-T03
-status: wait
+status: done
priority: high
state_hub_task_id: "ae8af00a-c14f-4b76-933c-46d06cd360ae"
```
@@ -87,7 +87,17 @@ run the in-namespace fixture smoke with `imagePullPolicy=Never`, and post
non-secret evidence: provider Secret key count, deployment readiness,
pass/fail, latency/recommendation summary or sanitized failure.
-Current live gate on 2026-06-18: provider Secret
+2026-07-01: Gate closed. Provider Secret `activity-core/llm-connect-provider-secrets`
+present (key count 1, no values inspected), overlay applied (no drift),
+deployment `llm-connect` ready 1/1, in-namespace fixture smoke passed
+(`health=ok latency_seconds=2.084 recommendations=1`). Evidence note
+`bddbf5d2-6cbe-4d97-9de6-689147d61be1`. The first rerun failed with
+`Connection refused` because the `llm-connect-activity-core-only`
+NetworkPolicy (added 2026-06-19) allowlist had not yet propagated the fresh
+smoke-pod IP; the reconcile tool now retries the smoke up to 6× with a 5s
+warm-up inside the pod.
+Historical live gate on 2026-06-18: provider Secret
`activity-core/llm-connect-provider-secrets` is missing, so deployment and
smoke are intentionally blocked until operator/OpenBao-to-Kubernetes Secret
custody is complete. Evidence note