# 🤖 Tutorial 8 — Agent-Driven TDD Automation

This tutorial focuses on *Agent-Driven TDD Automation*: turning TestDrive-UI into a living collaboration space between human developers and coding agents.

### “Coding with Companions” — Using AI Agents for Continuous Improvement

---

## 🎯 Goal

Integrate LLM-based coding agents into your TestDrive-UI workflow so they can:

1. **Generate new tests** from natural-language requirements.
2. **Run tests and detect regressions** automatically.
3. **Propose targeted refactorings or patches** when failures occur.

By combining deterministic testing with creative reasoning, you build a feedback loop that keeps improving your code.

---

## 🧩 Concept Overview

Traditional TDD cycle:

```
requirement → test → fail → code → pass → refactor
```

Agent-Driven TDD cycle:

```
requirement → agent generates test → run → fail →
agent proposes fix → run → pass → review → merge
```

You remain the architect and reviewer; the agent acts as an automated junior developer executing fast, repeatable loops.

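The agent-driven cycle can be read as a bounded retry loop. The sketch below is one hypothetical way to express it, assuming three caller-supplied async functions (`generateTest`, `runTests`, `proposeFix`) and a `maxAttempts` cap. None of these names come from TestDrive-UI. The loop stops at `needs-review` instead of merging, because review stays with a human.

```javascript
// Minimal sketch of the agent-driven cycle as a bounded loop.
// All function names and the result shape are hypothetical.
async function runCycle(requirement, { generateTest, runTests, proposeFix, maxAttempts = 3 }) {
  const test = await generateTest(requirement); // requirement → agent generates test
  const history = [];

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await runTests(test); // run
    history.push({ attempt, passed: result.passed });

    if (result.passed) {
      // pass → hand over to a human for review and merge
      return { status: "needs-review", attempts: attempt, history };
    }
    if (attempt < maxAttempts) {
      await proposeFix(test, result); // fail → agent proposes fix, then re-run
    }
  }
  // Give up after maxAttempts so the loop cannot run forever.
  return { status: "gave-up", attempts: maxAttempts, history };
}
```
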
---

## 🧪 Step 1 — Define a Machine-Readable Requirement

Create a simple JSON file to store behavioral specifications.

`specs/hello-world.name-change.json`

```json
{
  "component": "hello-world",
  "feature": "name-change",
  "description": "Greeting text should update when the user types a new name.",
  "expected_behavior": [
    "Typing in the input field updates the displayed greeting immediately.",
    "The component property 'name' reflects the latest input value."
  ]
}
```

Agents use these files as prompts to generate corresponding tests.

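Because agents consume these files as prompts, it can help to validate a spec and render it as prompt text in one place. The following is a minimal sketch, assuming the four fields in the example spec are all required. `buildTestPrompt` and the prompt wording are illustrative and not part of TestDrive-UI.

```javascript
// Hypothetical helper: validate a spec object and render it as an LLM prompt.
// The required fields mirror the example spec; the prompt wording is an assumption.
function buildTestPrompt(spec) {
  const required = ["component", "feature", "description", "expected_behavior"];
  const missing = required.filter((key) => !(key in spec));
  if (missing.length > 0) {
    throw new Error(`Spec is missing field(s): ${missing.join(", ")}`);
  }
  if (!Array.isArray(spec.expected_behavior) || spec.expected_behavior.length === 0) {
    throw new Error("expected_behavior must be a non-empty array");
  }

  // Number the behaviors so the agent can refer to them individually.
  const behaviors = spec.expected_behavior
    .map((line, i) => `${i + 1}. ${line}`)
    .join("\n");

  return [
    `Write Mocha tests for the <${spec.component}> component.`,
    `Feature: ${spec.feature}`,
    `Description: ${spec.description}`,
    "Expected behavior:",
    behaviors,
  ].join("\n");
}

// Usage (in an ES module, assuming the spec file from this step exists):
//   import fs from "node:fs";
//   const spec = JSON.parse(fs.readFileSync("specs/hello-world.name-change.json", "utf-8"));
//   console.log(buildTestPrompt(spec));
```
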
---

## ⚙️ Step 2 — Agent Generates a Test

An agent reads the JSON spec and produces Mocha tests automatically.

Example output from an LLM agent:

`generated-tests/hello-world.name-change.test.js`

```javascript
import "./hello-world.js";

// `expect` is assumed to be provided globally by the test runner (e.g. Chai).
describe("<hello-world> (auto-generated name-change)", () => {
  it("updates greeting text when user types", async () => {
    const el = document.createElement("hello-world");
    document.body.appendChild(el);
    // Wait for the first render before querying the shadow DOM
    // (assumes a Lit-based component that exposes `updateComplete`).
    await el.updateComplete;

    const input = el.shadowRoot.querySelector("input");
    input.value = "Agent";
    input.dispatchEvent(new Event("input"));
    await el.updateComplete;

    const greeting = el.shadowRoot.querySelector(".greeting");
    expect(greeting.textContent.trim()).to.equal("Hello, Agent!");
    el.remove(); // keep the document clean for later tests
  });
});
```

Run your full suite:

```bash
npm test
```

If the test fails, the agent analyzes the output and proposes a minimal fix for `hello-world.js`.

---

## 🔍 Step 3 — Detect Regressions

Agents monitor test results over time by parsing Mocha’s machine-readable report (`--reporter json`).

Example agent logic (a Python sketch; `propose_fix` is a placeholder for your agent hook):

```python
def propose_fix(test):
    # Placeholder: hand the failing test to your agent.
    print(f"Requesting fix for {test['title']}")


def analyze_report(report):
    # Passing tests carry an empty `err` object, which is falsy in Python.
    failed = [t for t in report["tests"] if t.get("err")]
    if not failed:
        return "All green!"
    for f in failed:
        print(f"Regression in {f['title']}: {f['err'].get('message', 'unknown error')}")
        propose_fix(f)
    return f"{len(failed)} failing test(s)"
```

This analysis lets agents flag new failures, auto-create issues, or propose PRs.

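A single report only shows what fails now. Telling a regression apart from a long-standing failure requires a comparison with an earlier run. The sketch below does that comparison, assuming both reports come from Mocha's JSON reporter and that each entry in `tests` carries `fullTitle` and `err`. The function names `hasError` and `findRegressions` are hypothetical.

```javascript
// Sketch: compare two Mocha JSON reports and classify the current failures.
// Assumes each report has a `tests` array whose entries carry `fullTitle` and
// `err` (an empty object for passing tests).
function hasError(test) {
  return Boolean(test.err) && Object.keys(test.err).length > 0;
}

function findRegressions(previousReport, currentReport) {
  // Tests that passed in the previous run, keyed by full title.
  const previouslyPassing = new Set(
    previousReport.tests.filter((t) => !hasError(t)).map((t) => t.fullTitle)
  );

  const regressions = [];   // passed before, fail now
  const otherFailures = []; // already failing before, or newly added tests
  for (const test of currentReport.tests.filter(hasError)) {
    const entry = { title: test.fullTitle, message: test.err.message ?? "unknown error" };
    (previouslyPassing.has(test.fullTitle) ? regressions : otherFailures).push(entry);
  }
  return { regressions, otherFailures };
}
```
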
---

## 🧠 Step 4 — Agent Proposes Refactorings

Once tests are green, agents can examine code for quality signals:

| Heuristic             | Example Action                             |
| --------------------- | ------------------------------------------ |
| Duplicate logic       | Extract shared helper or controller        |
| Excessive DOM queries | Cache references or use ReactiveController |
| Long methods          | Suggest method splitting                   |
| Repeated strings      | Propose localization constants             |

Example prompt to your agent:

```
“Review src/components/hello-world.js and propose a refactor
that reduces duplication without breaking current tests.”
```

The agent runs tests after each change to validate its proposal.

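To make one of these heuristics concrete, here is a deliberately naive sketch of the "repeated strings" check. It counts identical quoted literals in a source text with a regular expression. The function name `findRepeatedStrings` and its options are hypothetical, and a regex cannot parse JavaScript reliably (comments, template literals, and escaped quotes are not handled), so its hits are hints for the agent and not findings.

```javascript
// Naive sketch of the "repeated strings" heuristic: count identical string
// literals in a source text. Treat the results as hints only.
function findRepeatedStrings(source, { minLength = 3, minCount = 2 } = {}) {
  // Matches "..." or '...' on one line, without backslash escapes.
  const literal = /"([^"\\\n]*)"|'([^'\\\n]*)'/g;
  const counts = new Map();

  for (const match of source.matchAll(literal)) {
    const text = match[1] ?? match[2];
    if (text.length < minLength) continue; // skip very short literals
    counts.set(text, (counts.get(text) ?? 0) + 1);
  }

  // Most frequently repeated literals first.
  return [...counts.entries()]
    .filter(([, count]) => count >= minCount)
    .sort((a, b) => b[1] - a[1])
    .map(([text, count]) => ({ text, count }));
}
```
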
---

## 🔁 Step 5 — Automate the Loop

Create a simple Node script to chain the whole process.

`scripts/agent-tdd.js`

```javascript
import { execSync } from "child_process";
import fs from "fs";
import OpenAI from "openai"; // or another LLM SDK

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Run Mocha with the JSON reporter and return the parsed report.
function runTests() {
  try {
    const output = execSync("npx mocha --reporter json", { encoding: "utf-8" });
    return JSON.parse(output);
  } catch (e) {
    // Mocha exits non-zero when tests fail; the report is still on stdout.
    return JSON.parse(e.stdout);
  }
}

async function main() {
  const report = runTests();
  // Passing tests carry an empty `err` object, so only keep non-empty ones.
  const failed = report.tests.filter((t) => Object.keys(t.err ?? {}).length > 0);
  if (failed.length === 0) return console.log("✅ All tests passed");

  const prompt = `Some tests failed:\n${JSON.stringify(failed, null, 2)}\n
Propose minimal code changes to fix them.`;
  const completion = await client.responses.create({ model: "gpt-5", input: prompt });
  fs.writeFileSync("agent-proposal.txt", completion.output_text);
  console.log("💡 Agent proposal written to agent-proposal.txt");
}

// Surface API or parsing errors as a non-zero exit code.
main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

This script passes failing test results to an LLM and saves the suggested fix to `agent-proposal.txt` for you to review. It keeps no memory between runs; each run starts from the current test report.

---

## 🧰 Step 6 — Integrate into CI

Add a scheduled job (for example, a GitHub Actions workflow or a local cron entry) to run the agent loop daily or on push.

Example workflow:

```yaml
on:
  push:
  schedule:
    - cron: '0 2 * * *'
jobs:
  agent-tdd:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: node scripts/agent-tdd.js
        env:
          # Assumes a repository secret with this name exists.
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      # Keep the proposal so it can be reviewed after the run.
      - uses: actions/upload-artifact@v4
        with:
          name: agent-proposal
          path: agent-proposal.txt
```

Now your project self-tests and self-critiques even when you’re offline.

---

## 🧩 Step 7 — Visualize Agent Progress

Agents can log progress into a dashboard component (`<agent-console>`) showing:

* Number of tests generated.
* Pass/fail trend over time.
* Proposed vs. accepted refactors.

It’s your window into the machine’s learning curve.

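The three metrics in the list can be derived from a plain list of log entries before anything is rendered. The sketch below shows one way to do that. The entry shape (`type`, `passed`, `failed`), the event names, and `summarizeAgentLog` are assumptions made for illustration; the tutorial does not define a log format for `<agent-console>`.

```javascript
// Sketch: reduce a list of agent log entries to the three dashboard metrics.
// The entry shape ({ type, passed, failed }) is an assumption for illustration.
function summarizeAgentLog(entries) {
  const summary = {
    testsGenerated: 0,
    passRateTrend: [], // one value per test run, in log order
    refactors: { proposed: 0, accepted: 0 },
  };

  for (const entry of entries) {
    switch (entry.type) {
      case "test-generated":
        summary.testsGenerated += 1;
        break;
      case "test-run": {
        const total = entry.passed + entry.failed;
        // Avoid dividing by zero for empty runs.
        summary.passRateTrend.push(total === 0 ? 0 : entry.passed / total);
        break;
      }
      case "refactor-proposed":
        summary.refactors.proposed += 1;
        break;
      case "refactor-accepted":
        summary.refactors.accepted += 1;
        break;
      default:
        break; // ignore unknown entry types
    }
  }
  return summary;
}
```
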
---

## ✅ Outcome

You now have a self-reinforcing loop:

1. Humans write specs.
2. Agents create tests and code.
3. The test suite guards against regressions.
4. Agents refactor and review with the tests as a safety net.

This combines the discipline of TDD with the creativity and endurance of AI.

---

## 🔍 Next Steps

* Add **semantic diff filters** so agents learn from accepted patches.
* Train agents to cluster tests into *feature domains* for smarter coverage analysis.
* Integrate Storybook snapshots for visual regression detection.
* Build a CLI (`npx agent-tdd`) to run and audit your AI test loops interactively.

---

Congratulations! You have finished all the tutorials and should be well equipped to build components going forward.

Feel free to tell us which additional tutorials we should provide.