TeleMcp
Mission control for Kubernetes hosts, exposed to LLM agents through MCP.
TeleMcp deploys a standard observability stack onto a Linux Kubernetes host via Ansible + Helm, then surfaces metrics, logs, and cluster state through a read-only MCP bridge so an LLM agent can bootstrap, monitor, triage, and operate the box.
For project goals, scope, and design principles, see INTENT.md.
Components
| Component | Namespace | Role |
|---|---|---|
| kube-prometheus-stack | monitoring | Prometheus, Alertmanager, Grafana, node-exporter, kube-state-metrics |
| Loki + Promtail | logging | Log aggregation and shipping |
| OpenTelemetry Collector | observability | Optional OTLP fan-out to Prometheus and Loki |
| mcp-telemetry-bridge | mcp | FastAPI service exposing MCP resources, tools, and prompts |
Quick Start
0) Prereqs
- Ubuntu 24.04 host with k8s (k3s or kubeadm) reachable and kubectl context configured
- Ansible 2.15+ on your control machine
- Helm 3 on the host (Ansible role installs if missing)
1) Run Ansible
cd ansible
ansible-playbook -i inventories/local.ini playbook.yml
2) Smoke tests
From any machine with a kubectl context:
kubectl get pods -n monitoring
kubectl get pods -n logging
kubectl get pods -n mcp
kubectl port-forward -n mcp svc/mcp-telemetry-bridge 8080:80
curl http://localhost:8080/mcp/schema | jq .
curl http://localhost:8080/healthz
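The two curl checks above can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the port-forward is active on localhost:8080. The helper names (http_get, smoke_test) are illustrative and not part of the bridge, and the pass criteria are assumptions: /healthz answers with HTTP 200, and /mcp/schema returns a body that parses as JSON.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

# Assumption: `kubectl port-forward -n mcp svc/mcp-telemetry-bridge 8080:80` is running.
BASE_URL = "http://localhost:8080"


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, bytes]:
    """Return (status, body) for a GET request; status 0 means unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and body.
        return err.code, err.read()
    except urllib.error.URLError:
        return 0, b""


def smoke_test(
    base_url: str = BASE_URL,
    fetch: Callable[[str], Tuple[int, bytes]] = http_get,
) -> Dict[str, bool]:
    """Check the two endpoints used in the curl commands."""
    results = {}

    status, _ = fetch(f"{base_url}/healthz")
    results["healthz"] = status == 200

    status, body = fetch(f"{base_url}/mcp/schema")
    try:
        json.loads(body)
        parsed = True
    except ValueError:  # covers invalid JSON and undecodable bytes
        parsed = False
    results["schema"] = status == 200 and parsed

    return results
```

Calling smoke_test() with no arguments hits the port-forwarded bridge and returns a dict such as {"healthz": ..., "schema": ...}. The fetch parameter is injectable so the checks can be exercised without a cluster.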
3) Point your LLM agent
Point your agent's MCP client at the bridge endpoint (ClusterIP, Ingress, or port-forward).
Implemented tools:
| Tool | Description |
|---|---|
| promql.query | Run a PromQL expression against Prometheus |
| loki.query | Run a LogQL query against Loki |
| k8s.get | Fetch Kubernetes objects (pods, nodes, deployments, etc.) |
| k8s.events | List cluster or namespace events |
| inventory.snapshot | JSON snapshot of nodes, namespaces, and workloads |
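To give a sense of what promql.query and loki.query run against, here is a hedged sketch of the upstream HTTP APIs they target. This is not the bridge's actual code. The endpoints (/api/v1/query for Prometheus instant queries, /loki/api/v1/query_range for Loki) are the documented upstream ones, but the in-cluster service URLs and the function names are assumptions that depend on your Helm release.

```python
import json
from urllib.parse import urlencode

# Assumption: placeholder service URLs; real names depend on the Helm release names.
PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"
LOKI_URL = "http://loki.logging.svc:3100"


def promql_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build the URL for a Prometheus instant query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def logql_query_url(expr: str, limit: int = 100, base_url: str = LOKI_URL) -> str:
    """Build the URL for a Loki range query, capped at `limit` entries."""
    params = urlencode({"query": expr, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


def vector_samples(response_body: str) -> list:
    """Extract (labels, value) pairs from a Prometheus instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample looks like {"metric": {...}, "value": [<unix_time>, "<value>"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]
```

For example, promql_query_url("up") yields a URL ending in /api/v1/query?query=up. An agent would normally go through the bridge's tools rather than call these backends directly.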
Saved resources (via /mcp/resource?uri=...):
- res://dashboards/top-pods-by-cpu.promql
- res://dashboards/pod-restarts.promql
- res://dashboards/warn-events.logql
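Because a res:// URI contains ":" and "/", it should be percent-encoded when passed in the uri query parameter. A minimal sketch, assuming the bridge is reachable at the port-forwarded address from the smoke tests; the function name resource_url is illustrative only.

```python
from urllib.parse import urlencode

# Assumption: bridge reachable via the port-forward used in the smoke tests.
BRIDGE_URL = "http://localhost:8080"


def resource_url(resource_uri: str, base_url: str = BRIDGE_URL) -> str:
    """Build the /mcp/resource URL with the resource URI percent-encoded."""
    return f"{base_url}/mcp/resource?{urlencode({'uri': resource_uri})}"


print(resource_url("res://dashboards/top-pods-by-cpu.promql"))
# http://localhost:8080/mcp/resource?uri=res%3A%2F%2Fdashboards%2Ftop-pods-by-cpu.promql
```

The resulting URL can be fetched with curl or any HTTP client.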
The bridge currently exposes an HTTP schema approximation (/mcp/schema, /tools/...). Full MCP transport (stdio/SSE) is planned; see INTENT.md.
Repo layout
tele-mcp/
  INTENT.md                # Project north star: goals, scope, current state
  ansible/                 # Bootstrap playbook and roles
  helm/
    values/                # Chart values for monitoring, logging, OTel
    mcp-telemetry-bridge/  # Bridge Helm chart
  mcp-telemetry-bridge/    # FastAPI bridge application
  environments/            # Per-environment overrides
  wiki/                    # Extended project and design docs
Documentation
| Document | Purpose |
|---|---|
| INTENT.md | Goals, principles, scope, success criteria |
| wiki/TeleMcpProject.md | Project overview and audience |
| wiki/TeleMcpBlueprint.md | Component rationale and bridge design |
| environments/dev/README.md | Dev environment notes |
Security
- MCP bridge ServiceAccount is read-only (get/list/watch only)
- NetworkPolicy limits bridge egress to Prometheus and Loki
- Consider mTLS or OIDC if exposing the bridge outside the cluster
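The read-only claim can be checked mechanically. The sketch below assumes you have the RBAC rules bound to the bridge's ServiceAccount as a list of dicts, for example the rules field of a Role or ClusterRole exported as JSON. The function names are illustrative, and the role name in your cluster is not specified here.

```python
from typing import Iterable, Mapping, Set

READ_ONLY_VERBS = frozenset({"get", "list", "watch"})


def non_read_only_verbs(rules: Iterable[Mapping]) -> Set[str]:
    """Collect every verb in the RBAC rules that goes beyond get/list/watch."""
    extra: Set[str] = set()
    for rule in rules:
        # A wildcard verb ("*") is not in the read-only set, so it is reported too.
        extra.update(set(rule.get("verbs", [])) - READ_ONLY_VERBS)
    return extra


def is_read_only(rules: Iterable[Mapping]) -> bool:
    """True when no rule grants anything other than get, list, or watch."""
    return not non_read_only_verbs(rules)
```

A rule set that grants only get, list, and watch passes; any rule containing delete, patch, or a wildcard verb is flagged.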
Current limitations
See INTENT.md — Current State for the full list. Notable gaps:
- Bridge container image is a placeholder (ghcr.io/example/telemcp-bridge)
- No Alertmanager integration in the bridge yet
- Host-level signals (systemd, certs, firewall) are deferred to a future DaemonSet sidecar