🤖 Tutorial 8 — Agent-Driven TDD Automation

This tutorial focuses on Agent-Driven TDD Automation — turning TestDrive-UI into a living collaboration space between human developers and coding agents.

“Coding with Companions” — Using AI Agents for Continuous Improvement


🎯 Goal

Integrate LLM-based coding agents into your TestDrive-UI workflow so they can:

  1. Generate new tests from natural-language requirements.
  2. Run tests and detect regressions automatically.
  3. Propose targeted refactorings or patches when failures occur.

By combining deterministic testing with creative reasoning, you build a feedback loop that never stops improving your code.


🧩 Concept Overview

Traditional TDD cycle:

requirement → test → fail → code → pass → refactor

Agent-Driven TDD cycle:

requirement → agent generates test → run → fail →
agent proposes fix → run → pass → review → merge

You remain the architect and reviewer — the agent acts as an automated junior developer executing fast, repeatable loops.


🧪 Step 1 — Define a Machine-Readable Requirement

Create a simple JSON file to store behavioral specifications.

specs/hello-world.name-change.json

{
  "component": "hello-world",
  "feature": "name-change",
  "description": "Greeting text should update when the user types a new name.",
  "expected_behavior": [
    "Typing in the input field updates the displayed greeting immediately.",
    "The component property 'name' reflects the latest input value."
  ]
}

Agents use these files as prompts to generate corresponding tests.
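One way a harness might turn such a spec into a prompt is sketched below. The helper name `buildTestPrompt` and the prompt wording are illustrative assumptions, not part of TestDrive-UI.

```javascript
// Hypothetical helper: turns a behavioral spec into an LLM prompt.
// The prompt wording is an assumption; tune it for your agent.
function buildTestPrompt(spec) {
  const behaviors = spec.expected_behavior
    .map((line, i) => `${i + 1}. ${line}`)
    .join("\n");
  return [
    `Write Mocha tests for the <${spec.component}> component.`,
    `Feature: ${spec.feature}`,
    `Description: ${spec.description}`,
    "Expected behavior:",
    behaviors,
    "Return only the contents of one test file.",
  ].join("\n");
}

// The spec from this step, inlined so the sketch runs on its own.
const exampleSpec = {
  component: "hello-world",
  feature: "name-change",
  description: "Greeting text should update when the user types a new name.",
  expected_behavior: [
    "Typing in the input field updates the displayed greeting immediately.",
    "The component property 'name' reflects the latest input value.",
  ],
};
console.log(buildTestPrompt(exampleSpec));
```

Numbering the expected behaviors makes it easier to ask the agent for one test per item and to check that none was skipped.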


⚙️ Step 2 — Agent Generates a Test

An agent reads the JSON spec and produces Mocha tests automatically.

Example output from an LLM agent:

generated-tests/hello-world.name-change.test.js

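// import { expect } from "chai"; // uncomment if your runner does not expose expect globally
import "../src/components/hello-world.js"; // assumes the component lives in src/components; adjust to your layout

describe("<hello-world> (auto-generated name-change)", () => {
  it("updates greeting text when user types", async () => {
    const el = document.createElement("hello-world");
    document.body.appendChild(el);
    await el.updateComplete; // wait for the first render before querying the shadow DOM
    const input = el.shadowRoot.querySelector("input");
    input.value = "Agent";
    input.dispatchEvent(new Event("input"));
    await el.updateComplete; // wait for the re-render triggered by the input event
    const greeting = el.shadowRoot.querySelector(".greeting");
    expect(greeting.textContent.trim()).to.equal("Hello, Agent!");
    el.remove(); // clean up so later tests start from an empty body
  });
});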

Run your full suite:

npm test

If the test fails, the agent analyzes output and proposes a minimal fix for hello-world.js.


🔍 Step 3 — Detect Regressions

Agents monitor test results over time by parsing Mocha's machine-readable report (--reporter json).

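Example agent logic (Python sketch):

def propose_fix(failure):
    # Placeholder hook: hand the failure to your agent here.
    pass

def analyze_report(report):
    # Mocha's JSON reporter gives passing tests an empty `err` object.
    failed = [t for t in report["tests"] if t.get("err")]
    if not failed:
        return "All green!"
    for f in failed:
        print(f"Regression in {f['title']}: {f['err'].get('message')}")
        propose_fix(f)
    return f"{len(failed)} failing test(s)"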

This analysis lets agents flag new failures, auto-create issues, or propose PRs.
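To flag only new failures, an agent can compare the current report with the previous one. The sketch below assumes both are parsed Mocha JSON reports; `findRegressions` and `hasFailed` are hypothetical helper names.

```javascript
// A test counts as failed when its `err` object is non-empty
// (Mocha's JSON reporter emits {} for passing tests).
function hasFailed(test) {
  return Boolean(test.err && Object.keys(test.err).length > 0);
}

// A regression is a test that passed in the previous run and fails now.
// Tests are matched by `fullTitle`, which assumes titles stay stable.
function findRegressions(previous, current) {
  const passedBefore = new Set(
    previous.tests.filter(t => !hasFailed(t)).map(t => t.fullTitle)
  );
  return current.tests
    .filter(t => hasFailed(t) && passedBefore.has(t.fullTitle))
    .map(t => ({ title: t.fullTitle, message: t.err.message }));
}
```

Tests that were already failing, or that are new in the current run, are deliberately left out, so the result lists only behavior that used to work.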


🧠 Step 4 — Agent Proposes Refactorings

Once tests are green, agents can examine code for quality signals:

| Heuristic | Example Action |
| --- | --- |
| Duplicate logic | Extract shared helper or controller |
| Excessive DOM queries | Cache references or use ReactiveController |
| Long methods | Suggest method splitting |
| Repeated strings | Propose localization constants |

Example prompt to your agent:

“Review src/components/hello-world.js and propose a refactor
that reduces duplication without breaking current tests.”

The agent runs tests after each change to validate its proposal.
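That validation step can be sketched as a small guard around each proposal. The three callbacks are hypothetical hooks you would supply; only `stats.failures` comes from Mocha's JSON report.

```javascript
// Sketch of a guard around one agent proposal. Hypothetical hooks:
//   apply()    writes the proposed change to disk
//   runTests() resolves to a parsed Mocha JSON report
//   revert()   restores the previous state (for example via git)
async function validateProposal({ apply, runTests, revert }) {
  await apply();
  const report = await runTests();
  const failures = report.stats.failures;
  if (failures > 0) {
    await revert(); // never keep a change that breaks the suite
    return { accepted: false, failures };
  }
  return { accepted: true, failures: 0 };
}
```

Passing the hooks in as arguments keeps the guard independent of any particular agent or version-control setup, and makes it easy to exercise with stubs.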


🔁 Step 5 — Automate the Loop

Create a simple Node script to chain the whole process.

scripts/agent-tdd.js

import { execSync } from "node:child_process";
import fs from "node:fs";
import OpenAI from "openai"; // or another LLM SDK

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Run Mocha with the JSON reporter and return the parsed report.
function runTests() {
  try {
    const output = execSync("npx mocha --reporter json", { encoding: "utf-8" });
    return JSON.parse(output);
  } catch (e) {
    // Mocha exits non-zero when tests fail; the report is still on stdout.
    return JSON.parse(e.stdout);
  }
}

async function main() {
  const report = runTests();
  // Passing tests carry an empty `err` object, so guard the property access.
  const failed = report.tests.filter(t => t.err && t.err.message);
  if (failed.length === 0) return console.log("✅ All tests passed");

  const prompt = `Some tests failed:\n${JSON.stringify(failed, null, 2)}\n
Propose minimal code changes to fix them.`;
  const completion = await client.responses.create({ model: "gpt-5", input: prompt });
  fs.writeFileSync("agent-proposal.txt", completion.output_text);
  console.log("💡 Agent proposal written to agent-proposal.txt");
}

// Surface API or parsing errors and fail the process so CI notices.
main().catch(err => {
  console.error(err);
  process.exit(1);
});

This script feeds failing test results to an LLM and writes the suggested fix to agent-proposal.txt for you to review. It does not apply any change on its own.
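Full failure objects can make the prompt long, mostly because of stack traces. A minimal sketch of trimming them before they reach the model follows; `summarizeFailures` and the `maxStackLines` default are illustrative assumptions.

```javascript
// Sketch: keep only the fields an LLM needs from each failing test.
// Assumes a parsed Mocha JSON report; `maxStackLines` is an arbitrary default.
function summarizeFailures(report, maxStackLines = 5) {
  return report.tests
    .filter(t => t.err && t.err.message)
    .map(t => ({
      title: t.fullTitle,
      message: t.err.message,
      // Keep the top of the stack, where the failing assertion usually is.
      stack: (t.err.stack || "")
        .split("\n")
        .slice(0, maxStackLines)
        .join("\n"),
    }));
}
```

In scripts/agent-tdd.js, the result of this function could replace `failed` inside the prompt template.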


🧰 Step 6 — Integrate into CI

Add a CI job (e.g., GitHub Actions or local cron) to run the agent loop daily or on push.

Example workflow:

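on:
  push:
  schedule:
    - cron: '0 2 * * *'   # daily at 02:00 UTC
jobs:
  agent-tdd:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: node scripts/agent-tdd.js
        env:
          # assumes a repository secret named OPENAI_API_KEY
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}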

Now your project self-tests and self-critiques even when you're offline.


🧩 Step 7 — Visualize Agent Progress

Agents can log progress into a dashboard component (<agent-console>) showing:

  • Number of tests generated.
  • Pass/fail trend over time.
  • Proposed vs. accepted refactors.

It's your window into the machine's learning curve.
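The numbers behind such a dashboard could be derived from a simple log of runs. The sketch below assumes a hypothetical record shape ({ date, passes, failures, proposed, accepted }); nothing here is an existing <agent-console> API.

```javascript
// Sketch: derive dashboard numbers from a log of agent runs.
// Each record is assumed to look like:
//   { date: "2025-01-01", passes: 3, failures: 1, proposed: 2, accepted: 1 }
function summarizeRuns(runs) {
  // Pass rate per run, for the trend chart.
  const trend = runs.map(r => {
    const total = r.passes + r.failures;
    return { date: r.date, passRate: total === 0 ? 0 : r.passes / total };
  });
  // Totals across all runs, for the proposed vs. accepted comparison.
  const proposed = runs.reduce((n, r) => n + r.proposed, 0);
  const accepted = runs.reduce((n, r) => n + r.accepted, 0);
  return {
    trend,
    proposed,
    accepted,
    acceptanceRate: proposed === 0 ? 0 : accepted / proposed,
  };
}
```

A low acceptance rate is a useful signal in its own right: it suggests the prompts or specs need work before the loop is worth running unattended.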


Outcome

You now have a self-reinforcing loop:

  1. Humans write specs.
  2. Agents create tests and code.
  3. The suite proves stability.
  4. Agents refactor and review under guard of tests.

This combines the discipline of TDD with the creativity and endurance of AI.


🔍 Next Steps

  • Add semantic diff filters so agents learn from accepted patches.
  • Train agents to cluster tests into feature domains for smarter coverage analysis.
  • Integrate Storybook snapshots for visual regression detection.
  • Build a CLI (npx agent-tdd) to run and audit your AI test loops interactively.

Congratulations! You have finished all the tutorials and should be well equipped to build components going forward.

Feel free to tell us which additional tutorials we should provide.
