Add activity-core LLM endpoint support
docs/activity-core-llm-endpoint.md
@@ -0,0 +1,104 @@
# Activity-Core LLM Endpoint Handoff

This document records the `llm-connect` endpoint contract for activity-core
daily WSJF triage.

## Service URL

Proposed stable in-cluster URL:

```text
http://llm-connect.activity-core.svc.cluster.local:8080
```

Set activity-core `LLM_CONNECT_URL` to this value once the Kubernetes overlay
has been applied and smoke-tested from the `activity-core` namespace. Keep
`LLM_CONNECT_TIMEOUT_SECONDS=300`.
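
A minimal sketch of how a client might read these two settings, assuming they arrive as environment variables. The helper name `load_llm_connect_settings`, the `LlmConnectSettings` type, and the fallback to the proposed URL are illustrative assumptions, not part of activity-core.

```python
import os
from dataclasses import dataclass

# Proposed in-cluster URL from this document; using it as a fallback is an assumption.
DEFAULT_URL = "http://llm-connect.activity-core.svc.cluster.local:8080"
DEFAULT_TIMEOUT_SECONDS = 300.0


@dataclass(frozen=True)
class LlmConnectSettings:
    url: str
    timeout_seconds: float


def load_llm_connect_settings(env=None) -> LlmConnectSettings:
    """Hypothetical helper: read LLM_CONNECT_URL and LLM_CONNECT_TIMEOUT_SECONDS."""
    env = os.environ if env is None else env
    # Drop a trailing slash so paths such as "/health" can be appended directly.
    url = env.get("LLM_CONNECT_URL", DEFAULT_URL).rstrip("/")
    timeout = float(env.get("LLM_CONNECT_TIMEOUT_SECONDS", DEFAULT_TIMEOUT_SECONDS))
    if timeout <= 0:
        raise ValueError("LLM_CONNECT_TIMEOUT_SECONDS must be positive")
    return LlmConnectSettings(url=url, timeout_seconds=timeout)
```

Passing a plain dict as `env` keeps the helper easy to exercise without touching the process environment.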

## Runtime Profile

The service supports the activity-core profile name:

```text
custodian-triage-balanced
```

Default runtime values:

```text
provider=openrouter
model=anthropic/claude-sonnet-4
temperature=0.2
max_tokens=1800
max_depth=2
timeout_seconds=300
model_params.reasoning_effort=medium
```

Operators can override the provider and model through the Deployment ConfigMap
or runtime environment variables:

```text
LLM_CONNECT_CUSTODIAN_TRIAGE_PROVIDER
LLM_CONNECT_CUSTODIAN_TRIAGE_MODEL
```
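
The override behaviour can be sketched roughly as follows. This is an illustration only: how `llm-connect` actually resolves the profile is not shown in this document, and the function name `resolve_custodian_triage_profile` and the choice to treat empty values as unset are assumptions.

```python
import os

# Defaults copied from the "Default runtime values" listing in this document.
CUSTODIAN_TRIAGE_DEFAULTS = {
    "provider": "openrouter",
    "model": "anthropic/claude-sonnet-4",
    "temperature": 0.2,
    "max_tokens": 1800,
    "max_depth": 2,
    "timeout_seconds": 300,
    "model_params": {"reasoning_effort": "medium"},
}

# Only provider and model are documented as overridable.
OVERRIDE_VARS = {
    "provider": "LLM_CONNECT_CUSTODIAN_TRIAGE_PROVIDER",
    "model": "LLM_CONNECT_CUSTODIAN_TRIAGE_MODEL",
}


def resolve_custodian_triage_profile(env=None) -> dict:
    """Hypothetical resolver: apply env overrides on top of the defaults."""
    env = os.environ if env is None else env
    profile = dict(CUSTODIAN_TRIAGE_DEFAULTS)
    # Copy the nested dict so callers cannot mutate the shared defaults.
    profile["model_params"] = dict(CUSTODIAN_TRIAGE_DEFAULTS["model_params"])
    for field, var in OVERRIDE_VARS.items():
        value = env.get(var, "").strip()
        if value:  # assumption: an empty value means "not set"
            profile[field] = value
    return profile
```

For example, `resolve_custodian_triage_profile({"LLM_CONNECT_CUSTODIAN_TRIAGE_MODEL": "example/model"})` keeps `provider=openrouter` and swaps only the model; `example/model` is a placeholder, not a real model id.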

Provider credentials must be injected at runtime through
`llm-connect-provider-secrets`; do not store credential values in Git or State
Hub.

## Local Smoke

Run a mock server that returns known schema-valid daily triage JSON:

```bash
export LLM_CONNECT_MOCK_RESPONSE="$(python -c 'import json; print(json.dumps(json.load(open("fixtures/activity_core/daily-triage-valid-content.json"))))')"
python -m llm_connect.server --host 127.0.0.1 --port 8080 --provider mock
```

In another shell:

```bash
python scripts/smoke_activity_core_endpoint.py --url http://127.0.0.1:8080
```

The smoke script checks:

- `GET /health`
- fixture `POST /execute`
- response has a string `content` field
- `content` parses as JSON
- parsed JSON matches `fixtures/activity_core/daily-triage-report.schema.json`
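
The two `content` checks in the list above could look roughly like this. It is a sketch, not the code in `scripts/smoke_activity_core_endpoint.py`; the function name `parse_triage_content` is hypothetical, and the final schema check is left out because it needs a JSON Schema validator (for example the third-party `jsonschema` package).

```python
import json


def parse_triage_content(response_body: dict):
    """Hypothetical check: `content` must be a string that parses as JSON."""
    content = response_body.get("content")
    if not isinstance(content, str):
        raise ValueError("response 'content' must be a string")
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"'content' is not valid JSON: {exc}") from exc
    # The caller would next validate `parsed` against
    # fixtures/activity_core/daily-triage-report.schema.json.
    return parsed
```

Raising `ValueError` for both failure modes lets a smoke script report a single kind of contract violation, whichever check fails.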

## Cluster Smoke

Apply the overlay from the repo root after creating the provider Secret:

```bash
kubectl apply -k deploy/k8s/activity-core-llm-connect
kubectl -n activity-core rollout status deployment/llm-connect
```

Run the in-namespace smoke:

```bash
kubectl -n activity-core run llm-connect-smoke \
  --rm -i --restart=Never \
  --image=llm-connect:latest \
  --env=LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080 \
  --env=LLM_CONNECT_TIMEOUT_SECONDS=300 \
  -- python scripts/smoke_activity_core_endpoint.py
```

## Handoff Status

Code-owned artifacts are present in this repo. Live handoff is still pending
operator action:

- Build and publish the `llm-connect` image selected by Railiance.
- Create the runtime provider Secret outside Git.
- Apply `deploy/k8s/activity-core-llm-connect`.
- Run the smoke test from the `activity-core` namespace.
- Set activity-core `LLM_CONNECT_URL` to the stable URL above.
- Run or observe one daily WSJF smoke or manual activity run and confirm a
  non-secret State Hub `daily_triage` progress event.