TeleMcp Project
Telemetry for autonomous control
What is TeleMcp?
TeleMcp is mission control for Kubernetes hosts. It collects health, performance, and alert signals from a Linux k8s cluster and exposes them through a single Model Context Protocol (MCP) interface so intelligent assistants can understand what's happening, triage problems, and help keep systems running smoothly — without constant human supervision.
The project name reflects its two halves:
- Tele — telemetry: metrics, logs, events, and cluster inventory
- MCP — the standardized bridge between observability backends and LLM agents
Who is it for?
- Operators who want repeatable, one-command observability on a k3s or kubeadm host
- LLM agent builders who need a safe, read-only API for cluster situational awareness
- Developers running local or edge Kubernetes who want agent-assisted monitoring without wiring up bespoke integrations
What problem does it solve?
Running a Kubernetes host means tracking signals across many systems. Humans reach for Grafana, kubectl, and ad-hoc PromQL. Agents need the same information through a standardized, safe contract — not raw shell access or scattered API credentials.
TeleMcp solves this in three steps:
- Collect: deploy Prometheus, Loki, and supporting exporters via Helm
- Deploy: bootstrap everything with a single Ansible playbook
- Bridge: expose resources, tools, and prompts through `mcp-telemetry-bridge`
What can an agent do today?
With the current scaffold, an agent connected to the bridge can:
- Query Prometheus with `promql.query`
- Search logs with `loki.query`
- Inspect Kubernetes objects with `k8s.get` and `k8s.events`
- Pull a cluster inventory snapshot with `inventory.snapshot`
- Use pre-built PromQL/LogQL resources for common triage queries
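As a rough illustration of how an agent-side client might prepare a call to one of the tools listed above, the sketch below builds a JSON payload and rejects any tool name not on that list. Only the tool names come from this document. The `{"tool": ..., "arguments": ...}` envelope, the `build_tool_call` helper, and the `query` argument name are assumptions made for illustration and may not match the bridge's actual request schema.

```python
import json

# Tool names taken from the list above. Everything else in this sketch
# (envelope shape, helper name, argument names) is hypothetical.
KNOWN_TOOLS = {
    "promql.query",
    "loki.query",
    "k8s.get",
    "k8s.events",
    "inventory.snapshot",
}


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize a tool invocation as JSON using an assumed envelope."""
    if tool not in KNOWN_TOOLS:
        # Refuse anything outside the documented read-only tool set.
        raise ValueError(f"unknown tool: {tool}")
    return json.dumps({"tool": tool, "arguments": arguments}, sort_keys=True)


# Example: ask which scrape targets are down ("up == 0" is standard PromQL).
payload = build_tool_call("promql.query", {"query": "up == 0"})
print(payload)
```

Checking the tool name on the client side mirrors the read-only intent of the bridge, but it is not a substitute for the bridge's own enforcement.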
What is planned?
Stretch goals — explicitly deferred in v1 — include host-level signals (systemd status, cert expiry, firewall summary), Alertmanager integration, additional prompts (Capacity-Check, CrashLoop-Playbook), and full MCP protocol transport. See INTENT.md for the authoritative scope list.
Design principles
| Principle | Summary |
|---|---|
| Read-only by default | No cluster mutations through the bridge |
| Standard stack | CNCF/Grafana components, not custom collectors |
| MCP as the interface | One bridge, one contract for agents |
| Deployable in one shot | Ansible + Helm, no manual assembly |
| Least privilege | Scoped RBAC and NetworkPolicy |
Repository map
| Path | Contents |
|---|---|
| INTENT.md | North star — goals, scope, current state |
| README.md | Quick start and operational guide |
| TeleMcpBlueprint.md | Architecture and component rationale |
| ansible/ | Bootstrap playbook |
| helm/ | Chart values and bridge chart |
| mcp-telemetry-bridge/ | FastAPI bridge source |
Success criteria
TeleMcp is working when:
- `ansible-playbook` brings up healthy pods in the `monitoring`, `logging`, and `mcp` namespaces
- `/mcp/schema` returns resources, tools, and prompts
- An agent can query metrics, logs, and cluster state without direct API credentials
- Default alert rules fire on induced failures and the agent can triage them
- The stack redeploys cleanly on a fresh Ubuntu 24.04 + k3s/kubeadm host
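The schema criterion above could be checked with a small helper along these lines. This is a minimal sketch that assumes the `/mcp/schema` response is a JSON object with top-level `resources`, `tools`, and `prompts` keys. That shape, the `missing_sections` helper, and the sample data are illustrative assumptions, not the bridge's confirmed format.

```python
from typing import List

# The three sections the success criteria expect /mcp/schema to return.
REQUIRED_SECTIONS = ("resources", "tools", "prompts")


def missing_sections(schema: dict) -> List[str]:
    """Return the required sections that are absent or empty in a schema dict."""
    return [name for name in REQUIRED_SECTIONS if not schema.get(name)]


# Hypothetical sample response; the real payload may be structured differently.
sample = {
    "resources": ["promql/common-triage"],
    "tools": ["promql.query", "loki.query"],
    "prompts": [],
}

print(missing_sections(sample))  # prints ['prompts']
```

An empty result would indicate that all three sections are populated. In a real smoke test, the dictionary would come from the parsed response of the deployed bridge rather than from a literal.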