diff --git a/FEATURE_REQUESTS.md b/FEATURE_REQUESTS.md
new file mode 100644
index 0000000..e1b993f
--- /dev/null
+++ b/FEATURE_REQUESTS.md
@@ -0,0 +1,107 @@
+# llm-connect Feature Requests
+
+Raised by: IHF Phase 11 — Advanced AI Federation (IHUB-WP-0012)
+Date: 2026-04-01
+
+These gaps were identified during integration of llm-connect into the
+Interaction Hub Framework (IHF) as a subprocess bridge for multi-agent
+federation. None are blockers for Phase 11, but they affect performance
+and architectural elegance.
+
+---
+
+## FR-1 — HTTP/JSON-RPC serve mode
+
+**Problem:** The current architecture requires spawning a new `python3
+scripts/llm_bridge.py` process for every agent invocation. This adds
+significant overhead in production when collective proposals invoke 3–5
+agents in sequence.
+
+**Proposed API:**
+```bash
+python -m llm_connect.server --port 9999
+```
+IHP (Haskell) would call `POST localhost:9999/execute` with the same JSON
+payload the bridge script currently reads from stdin.
+
+**Impact:** Eliminates process spawn overhead. A single persistent server
+process handles all requests in the session lifetime.
+
+---
+
+## FR-2 — `RoutingPolicy` class for declarative provider/model selection
+
+**Problem:** `RunConfig.model_name` is the only selection mechanism. IHF
+needs declarative routing rules — e.g. "for triage tasks, prefer
+openrouter/claude-haiku-4-5; fall back to gemini if cost exceeds 0.5/1k
+tokens; never use auto_apply trust agents for autonomous actions".
+
+**Proposed API:**
+```python
+from llm_connect import RoutingPolicy
+
+policy = RoutingPolicy(rules=[
+    {
+        "task_type": "triage",
+        "prefer": [{"provider": "openrouter", "model": "claude-haiku-4-5"}],
+        "max_cost_per_1k": 0.5,
+        "fallback": {"provider": "gemini", "model": "gemini-flash-1.5"},
+    }
+])
+adapter = policy.resolve(task_type="triage")
+```
+
+**Impact:** Moves routing logic into llm-connect instead of duplicating it
+in every consumer (currently IHF implements this in `ModelRouter.hs`).
+
+---
+
+## FR-3 — `async_execute_prompt()` for concurrent execution
+
+**Problem:** Collective proposals invoke agents sequentially because
+`execute_prompt` is synchronous. With 3–5 agents this is 3–5× slower than
+necessary.
+
+**Proposed API:**
+```python
+import asyncio
+from llm_connect import create_adapter
+
+async def main():
+    adapters = [create_adapter(...) for _ in agents]
+    responses = await asyncio.gather(*[
+        a.async_execute_prompt(prompt, config) for a in adapters
+    ])
+```
+
+Standard `asyncio` coroutine interface, same signature as `execute_prompt`.
+
+**Impact:** Collective proposal latency scales with the slowest agent
+rather than the sum of all agent latencies.
+
+---
+
+## FR-4 — `BudgetTracker` for delegation chains
+
+**Problem:** IHF's inter-agent delegation model enforces token budgets at
+the Haskell layer (`AgentDelegation.tokenBudget`), but the bridge itself
+has no concept of a shared budget. A delegation chain (A → B → C) cannot
+enforce that the total token spend stays below a cap set by A.
+
+**Proposed API:**
+```python
+from llm_connect import BudgetTracker, RunConfig
+
+tracker = BudgetTracker(total=4000)
+config = RunConfig(model_name="...", budget_tracker=tracker)
+# Subsequent calls on any adapter sharing this tracker will raise
+# LLMBudgetExceededError if the cumulative spend exceeds 4000 tokens.
+resp = adapter.execute_prompt(prompt, config)
+```
+
+`LLMBudgetExceededError` should be a subclass of `LLMError` so existing
+error handling catches it.
+
+**Impact:** Budget enforcement moves into the bridge layer where it can be
+applied uniformly across all providers, rather than requiring each consumer
+to track it manually.