# TeleMcp

**Mission control for Kubernetes hosts, exposed to LLM agents through MCP.**

TeleMcp deploys a standard observability stack onto a Linux Kubernetes host via **Ansible + Helm**, then surfaces metrics, logs, and cluster state through a read-only **MCP bridge** so an LLM agent can bootstrap, monitor, triage, and operate the box.

> For project goals, scope, and design principles, see **[INTENT.md](INTENT.md)**.

## Components

| Component | Namespace | Role |
|-----------|-----------|------|
| **kube-prometheus-stack** | `monitoring` | Prometheus, Alertmanager, Grafana, node-exporter, kube-state-metrics |
| **Loki + Promtail** | `logging` | Log aggregation and shipping |
| **OpenTelemetry Collector** | `observability` | Optional OTLP fan-out to Prometheus and Loki |
| **mcp-telemetry-bridge** | `mcp` | FastAPI service exposing MCP resources, tools, and prompts |

## Quick Start

### 0) Prereqs

- Ubuntu 24.04 host running Kubernetes (k3s or kubeadm), reachable from your control machine, with a `kubectl` context configured
- Ansible 2.15+ on your control machine
- Helm 3 on the host (the Ansible role installs it if missing)

### 1) Run Ansible

```bash
cd ansible
ansible-playbook -i inventories/local.ini playbook.yml
```

### 2) Smoke tests

From any machine with a `kubectl` context:

```bash
# Check that the stack's pods are running
kubectl get pods -n monitoring
kubectl get pods -n logging
kubectl get pods -n mcp

# port-forward runs in the foreground; run the curl commands from a second terminal
kubectl port-forward -n mcp svc/mcp-telemetry-bridge 8080:80
curl http://localhost:8080/mcp/schema | jq .
curl http://localhost:8080/healthz
```

### 3) Point your LLM agent

Point your agent's MCP client at the bridge endpoint (ClusterIP, Ingress, or port-forward).

**Implemented tools:**

| Tool | Description |
|------|-------------|
| `promql.query` | Run a PromQL expression against Prometheus |
| `loki.query` | Run a LogQL query against Loki |
| `k8s.get` | Fetch Kubernetes objects (pods, nodes, deployments, etc.) |
| `k8s.events` | List cluster or namespace events |
| `inventory.snapshot` | JSON snapshot of nodes, namespaces, and workloads |

**Saved resources** (via `/mcp/resource?uri=...`):

- `res://dashboards/top-pods-by-cpu.promql`
- `res://dashboards/pod-restarts.promql`
- `res://dashboards/warn-events.logql`

> The bridge currently exposes an HTTP schema approximation (`/mcp/schema`, `/tools/...`). Full MCP transport (stdio/SSE) is planned; see [INTENT.md](INTENT.md).

## Repo layout

```
tele-mcp/
  INTENT.md                  # Project north star: goals, scope, current state
  ansible/                   # Bootstrap playbook and roles
  helm/
    values/                  # Chart values for monitoring, logging, OTel
    mcp-telemetry-bridge/    # Bridge Helm chart
  mcp-telemetry-bridge/      # FastAPI bridge application
  environments/              # Per-environment overrides
  wiki/                      # Extended project and design docs
```

## Documentation

| Document | Purpose |
|----------|---------|
| [INTENT.md](INTENT.md) | Goals, principles, scope, success criteria |
| [wiki/TeleMcpProject.md](wiki/TeleMcpProject.md) | Project overview and audience |
| [wiki/TeleMcpBlueprint.md](wiki/TeleMcpBlueprint.md) | Component rationale and bridge design |
| [environments/dev/README.md](environments/dev/README.md) | Dev environment notes |

## Security

- The MCP bridge ServiceAccount is read-only (`get` / `list` / `watch` only)
- A NetworkPolicy limits bridge egress to Prometheus and Loki
- Consider mTLS or OIDC if exposing the bridge outside the cluster

## Current limitations

See [INTENT.md: Current State](INTENT.md#current-state-as-of-initial-scaffold) for the full list. Notable gaps:

- Bridge container image is a placeholder (`ghcr.io/example/telemcp-bridge`)
- No Alertmanager integration in the bridge yet
- Host-level signals (systemd, certs, firewall) are deferred to a future DaemonSet sidecar
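
The saved resources listed under step 3 are served from `/mcp/resource?uri=...`, and their `res://` URIs contain characters (`:` and `/`) that are safer to percent-encode in a query string. The following is a minimal sketch of building those URLs, not part of the bridge itself. It assumes the port-forward from the smoke tests (`http://localhost:8080`) and that the bridge decodes the `uri` parameter in the usual way; `resource_url` and `SAVED_RESOURCES` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlencode

# Assumption: the bridge is reachable through the port-forward from the smoke tests.
BASE_URL = "http://localhost:8080"

# Saved resource URIs as listed in this README.
SAVED_RESOURCES = [
    "res://dashboards/top-pods-by-cpu.promql",
    "res://dashboards/pod-restarts.promql",
    "res://dashboards/warn-events.logql",
]


def resource_url(uri: str, base_url: str = BASE_URL) -> str:
    """Build the /mcp/resource URL for a saved resource, percent-encoding the URI."""
    return f"{base_url}/mcp/resource?{urlencode({'uri': uri})}"


if __name__ == "__main__":
    # Print one fetchable URL per saved resource (for example, to pass to curl).
    for uri in SAVED_RESOURCES:
        print(resource_url(uri))
```

Each printed URL can be passed to `curl` while the port-forward is active. What the endpoint returns is not specified here, so inspect the output before relying on its format.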