chore: setup agentic coding workplan

2025-12-01 22:37:27 +01:00
parent 45d60fc1a9
commit 9081cb80d3
2 changed files with 1013 additions and 0 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,360 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+**DirektVermittlungDe (DVD)** is a document-centric communication platform between citizens and German authorities. It eliminates the "phone hunt" in three steps:
+1. The citizen uploads a document or provides an *Aktenzeichen* (reference number)
+2. The system auto-routes the request to the responsible unit
+3. The system opens an interaction thread for direct clarification
+
+## Current Development Status
+
+**⚠️ IMPORTANT**: This repository is currently in transition from the prototype phase to a main production codebase.
+
+- **Current State**: Three prototype implementations (chatgpt5, geminiNbt3pro, grok4.1) in `/prototype-*` directories
+- **Target State**: Unified production codebase in `/src` with TDD-driven development
+- **Active Workplan**: See `docs/WORKPLAN_MainCodebase_Integration.md` for detailed migration plan
+
+**When working on this codebase:**
+- Check the workplan to understand which phase we're in
+- Follow TDD practices: write tests first, then implementation
+- Reference ADRs for architectural decisions (see below)
+- New code goes in `/src`, not prototypes
+
+## Repository Structure (Target State)
+
+```
+/
+├── docs/
+│   ├── architecture/
+│   │   ├── adr/                    # Individual ADR files (0001-*.md)
+│   │   │   ├── 0001-split-payload-model.md
+│   │   │   ├── 0002-stateless-authentication.md
+│   │   │   └── ...
+│   │   ├── architecture-overview.md
+│   │   └── design-patterns.md
+│   ├── api/
+│   │   ├── openapi.yaml
+│   │   └── api-scenarios.md
+│   ├── development/
+│   │   ├── testing-strategy.md
+│   │   ├── agentic-coding-guide.md
+│   │   └── setup-guide.md
+│   └── WORKPLAN_MainCodebase_Integration.md
+├── src/
+│   ├── domain/          # Pure business logic (TDD tested)
+│   ├── adapters/        # External integrations (mocked in tests)
+│   ├── service/         # Application services (TDD tested)
+│   ├── api/             # FastAPI routes (integration tested)
+│   └── workers/         # Background jobs
+├── tests/
+│   ├── unit/            # TDD unit tests
+│   ├── integration/     # Integration tests
+│   └── fixtures/        # Test data and mocks
+├── scripts/             # Utility scripts
+├── prototype-*/         # ARCHIVED prototypes (reference only)
+├── pyproject.toml       # Dependencies & tool config
+└── pytest.ini           # Test configuration
+```
+
+## Development Commands
+
+### Setup and Run (Main Codebase)
+
+```bash
+# Install dependencies (from repo root)
+pip install -e ".[dev]"
+
+# Initialize database metadata
+python -m scripts.init_db
+
+# Run development server
+uvicorn src.main:app --reload
+
+# View API docs
+# http://localhost:8000/docs
+```
+
+### Testing (TDD Workflow)
+
+```bash
+# Run all tests
+pytest
+
+# Run with coverage report
+pytest --cov=src --cov-report=html
+
+# Run only unit tests
+pytest tests/unit/
+
+# Run only integration tests
+pytest tests/integration/
+
+# Run specific test file
+pytest tests/unit/domain/test_document_metadata.py -v
+
+# Watch mode (requires pytest-watch)
+ptw
+```
+
+### Code Quality
+
+```bash
+# Lint and format with ruff
+ruff check src/ tests/
+ruff format src/ tests/
+
+# Type checking with mypy
+mypy src/
+
+# Run pre-commit hooks manually
+pre-commit run --all-files
+```
+
+## Key Architectural Patterns
+
+### 1. Split-Payload Model (ADR-001)
+
+**Critical**: Documents use a split design to enable both encryption and routing:
+- **metadata** (plaintext JSON): authorityId, referenceNumber, docType, issuedAt
+- **encryptedPayload** (encrypted blob): actual PDF/scan content
+
+The backend can route based on metadata without decrypting the payload. This is the core architectural constraint.
+
+When working with documents:
+- NEVER attempt to decrypt `encryptedPayload` in the main service
+- The backend treats encrypted content as opaque
+- Routing logic operates only on metadata fields
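A minimal sketch of the split-payload idea, using the metadata field names listed above. The `ROUTING_TABLE` contents, the `route` helper, and the `DocumentEnvelope` dataclass are illustrative assumptions, not code from this repository. The sketch only shows that routing reads the plaintext metadata and never inspects the ciphertext.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DocumentMetadata:
    # Plaintext fields the backend is allowed to read
    authorityId: str
    referenceNumber: str
    docType: str
    issuedAt: datetime


@dataclass(frozen=True)
class DocumentEnvelope:
    metadata: DocumentMetadata
    encryptedPayload: bytes  # opaque ciphertext; never decrypted server-side


# Hypothetical routing table keyed on (authorityId, docType)
ROUTING_TABLE = {("FA-BERLIN", "NOTICE"): "NoticeTeam"}


def route(envelope: DocumentEnvelope) -> str:
    """Pick a unit from metadata only; the payload is not inspected."""
    meta = envelope.metadata
    return ROUTING_TABLE.get((meta.authorityId, meta.docType), "UNASSIGNED")


envelope = DocumentEnvelope(
    metadata=DocumentMetadata("FA-BERLIN", "123/456", "NOTICE", datetime(2025, 12, 1)),
    encryptedPayload=b"\x00opaque-ciphertext",
)
print(route(envelope))
```

Because `route` depends only on `envelope.metadata`, a unit test can pass arbitrary bytes as the payload and still exercise the full routing decision.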
+
+### 2. Hybrid Concurrency Pattern (ADR-007)
+
+Python's GIL prevents CPU-bound work from running in parallel within a single process, so I/O-bound and CPU-bound work are handled differently:
+- **I/O operations** (DB, network): Use native `async`/`await`
+- **CPU operations** (crypto, PDF): Offload to `ProcessPoolExecutor`
+
+Example pattern:
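+```python
+import asyncio
+from concurrent.futures import ProcessPoolExecutor
+
+# Shared pool for CPU-bound work (crypto, PDF processing)
+cpu_pool = ProcessPoolExecutor(max_workers=4)
+
+async def handler():
+    # I/O: async/await (`db` is a placeholder for the async DB adapter)
+    data = await db.fetch()
+
+    # CPU: executor (`heavy_function` must be a picklable, module-level function)
+    loop = asyncio.get_running_loop()
+    result = await loop.run_in_executor(cpu_pool, heavy_function, data)
+    return result
+```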
+
+### 3. Cursor-Based Pagination (ADR-003)
+
+For message threads, use cursor-based pagination (timestamp) instead of offset-based:
+- Ensures consistent performance regardless of thread length
+- Prevents duplicate/missing messages during real-time updates
+- Target: <300ms response time (NFR-1)
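The timestamp cursor described above can be sketched as follows. This is an in-memory illustration under stated assumptions: the `Message` dataclass and the `list_messages` signature are hypothetical, and the sort stands in for a database query ordered by `created_at DESC`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Message:
    id: int
    created_at: datetime


def list_messages(messages, cursor: Optional[datetime] = None, limit: int = 2):
    """Return (page, next_cursor), newest first.

    `cursor` is the created_at of the last item already seen; only strictly
    older messages are returned, so new arrivals never shift the page.
    """
    ordered = sorted(messages, key=lambda m: m.created_at, reverse=True)
    older = [m for m in ordered if cursor is None or m.created_at < cursor]
    page = older[:limit]
    # A further page exists only if something is left beyond this one
    next_cursor = page[-1].created_at if len(older) > limit else None
    return page, next_cursor


base = datetime(2025, 12, 1, 12, 0, 0)
thread = [Message(i, base + timedelta(minutes=i)) for i in range(5)]

first, cursor = list_messages(thread)
second, cursor = list_messages(thread, cursor)
```

One caveat with a pure timestamp cursor: two messages with the same `created_at` can be skipped at a page boundary. A common remedy is a composite cursor such as `(created_at, id)`; whether the project needs this is not stated here.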
+
+### 4. Async Export Pattern (ADR-004)
+
+Data exports use async request-reply:
+1. POST `/exports` → 202 Accepted + jobId
+2. Background worker processes via message queue
+3. Client polls status endpoint or receives webhook
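The three steps above can be sketched with plain `asyncio`. The in-memory `JOBS` dict stands in for the message queue and the job table, and `create_export` and `run_export` are hypothetical names; a real worker would run in a separate process behind the queue.

```python
import asyncio
import uuid

# Hypothetical in-memory job store standing in for the queue and the DB
JOBS: dict[str, str] = {}


async def run_export(job_id: str) -> None:
    """Background worker: QUEUED -> RUNNING -> COMPLETED or FAILED."""
    JOBS[job_id] = "RUNNING"
    try:
        await asyncio.sleep(0.01)  # placeholder for building the export package
        JOBS[job_id] = "COMPLETED"
    except Exception:
        JOBS[job_id] = "FAILED"
        raise


def create_export() -> tuple[int, str]:
    """Handler for POST /exports: enqueue and answer immediately with 202."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = "QUEUED"
    return 202, job_id


async def main() -> str:
    status_code, job_id = create_export()
    assert status_code == 202
    worker = asyncio.create_task(run_export(job_id))
    # Client side: poll the status until the job reaches a terminal state
    while JOBS[job_id] not in ("COMPLETED", "FAILED"):
        await asyncio.sleep(0.005)
    await worker
    return JOBS[job_id]


final_status = asyncio.run(main())
```

The request handler never waits for the export itself; it only records the job and returns, which keeps the endpoint fast regardless of export size.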
+
+### 5. Stateless Authentication (ADR-002)
+
+OAuth2/JWT with scopes:
+- `citizen:write`: Document submission, thread creation
+- `official:read`: View assigned cases
+- `official:write`: Respond to inquiries
+
+JWTs enable horizontal scaling without session affinity.
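A small sketch of scope checking, assuming the token's signature and expiry have already been verified elsewhere and that scopes arrive as a space-delimited `scope` claim. The claim layout and the helper names `has_scope` and `require_scope` are assumptions for illustration.

```python
def has_scope(claims: dict, required: str) -> bool:
    """Check a space-delimited `scope` claim from an already verified JWT."""
    granted = claims.get("scope", "").split()
    return required in granted


def require_scope(claims: dict, required: str) -> None:
    """Raise if the caller lacks the scope (least-privilege check)."""
    if not has_scope(claims, required):
        raise PermissionError(f"missing scope: {required}")


# Example claim sets (hypothetical subjects)
citizen = {"sub": "citizen-1", "scope": "citizen:write"}
official = {"sub": "official-7", "scope": "official:read official:write"}

require_scope(official, "official:write")  # passes silently
```

Because the decision uses only the claims in the token, any instance can authorize a request without looking up session state.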
+
+### 6. GDPR-Compliant Retention (ADR-006)
+
+Documents have a `retention_date` field:
+- Default: `closedAt + gracePeriod`
+- Personal archive: `null` or extended date
+- Automated cleanup via TTL engine
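The retention rule above can be sketched as a pure function. The 90-day default is taken from the workplan's "default 90 days" test item; the function names and the `personal_archive` flag are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 90 days, as named in the workplan's retention test item
DEFAULT_GRACE_PERIOD = timedelta(days=90)


def retention_date(
    closed_at: datetime,
    grace_period: timedelta = DEFAULT_GRACE_PERIOD,
    personal_archive: bool = False,
) -> Optional[datetime]:
    """Default: closedAt + gracePeriod. Personal archive: no expiry (None)."""
    if personal_archive:
        return None
    return closed_at + grace_period


def is_expired(retention: Optional[datetime], now: datetime) -> bool:
    """A cleanup job would delete documents for which this returns True."""
    return retention is not None and now >= retention


closed = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(retention_date(closed))
```

Keeping the calculation free of I/O makes it easy to cover with fast unit tests before any cleanup job exists.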
+
+## API Structure
+
+The API follows a document-centric REST hierarchy:
+- `/documents` - Root envelope (FR-1: Document intake)
+- `/documents/{id}/threads` - Sub-resource for communication (FR-3)
+- `/threads/{threadId}/messages` - Message exchange
+- `/exports` - Async data export for authority systems (FR-7)
+
+## Critical Domain Models
+
+Located in `src/domain/models.py` (target) or `prototype-chatgpt5/src/app/domain/models.py` (reference):
+
+**Enums:**
+- **ThreadType**: TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT
+- **SenderRole**: CITIZEN, OFFICIAL, SYSTEM
+- **ExportJobStatus**: QUEUED, RUNNING, COMPLETED, FAILED
+
+**Core Models:**
+- **DocumentMetadata**: authorityId, referenceNumber, docType, issuedAt
+  - Validation: authorityId (1-50 chars), referenceNumber required
+- **DocumentCreateRequest**: metadata, encryptedPayload (base64)
+- **DocumentCreatedResponse**: id, status, assignedUnit
+- **ThreadCreateRequest**: type, initialMessage, preferredTimeSlot
+- **MessageDto**: id, senderRole, content (encrypted), timestamp
+
+## Performance Targets (NFR)
+
+- Core operations: <300ms response time
+- Document upload routing: <500ms
+- Concurrent sessions: 10k+ per region
+- Availability: ≥99.5%
+
+## Security Constraints
+
+- E2E encryption required for sensitive data (NFR-5)
+- Backend cannot decrypt document payloads
+- Least-privilege access via OAuth scopes (NFR-6)
+- No PII in plaintext application logs
+
+## TDD Development Workflow
+
+**CRITICAL**: This project follows a strict Test-Driven Development approach.
+
+### The Red-Green-Refactor Cycle
+
+1. **RED**: Write a failing test that defines desired behavior
+2. **GREEN**: Write minimal code to make the test pass
+3. **REFACTOR**: Improve code while keeping tests green
+
+### Example TDD Workflow
+
+```python
+# Step 1: Write failing test (tests/unit/domain/test_document_metadata.py)
+from datetime import datetime
+
+import pytest
+from pydantic import ValidationError
+
+# Import path assumes `src` is importable as a package
+from src.domain.models import DocumentMetadata
+
+def test_document_metadata_validates_authority_id():
+    with pytest.raises(ValidationError):
+        DocumentMetadata(
+            authorityId="",  # Empty should fail
+            referenceNumber="123/456",
+            docType="NOTICE",
+            issuedAt=datetime.now()
+        )
+
+# Step 2: Run test (it fails)
+# $ pytest tests/unit/domain/test_document_metadata.py
+
+# Step 3: Implement validation (src/domain/models.py)
+from pydantic import BaseModel, Field
+
+class DocumentMetadata(BaseModel):
+    authorityId: str = Field(..., min_length=1, max_length=50)
+    # ... rest of fields
+
+# Step 4: Run test again (it passes)
+
+# Step 5: Refactor if needed, ensure tests still pass
+```
+
+### Agentic TDD Pattern
+
+When asking Claude to implement features, use this pattern:
+
+```
+"Write a test for [feature] that verifies [specific behavior].
+Use pytest and follow the pattern in tests/unit/[module]/.
+Reference ADR-[number] for architectural constraints."
+```
+
+Example:
+```
+"Write a test for document routing that verifies documents with docType='NOTICE'
+are routed to the NoticeTeam. Mock the routing adapter using the protocol defined
+in src/adapters/protocols.py. Follow ADR-001 split-payload constraints."
+```
+
+See `docs/development/agentic-coding-guide.md` for comprehensive examples.
+
+## Documentation References
+
+### Architecture & Decisions
+- **Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md` - Current migration plan
+- **ADRs**: `docs/architecture/adr/` - Individual decision records
+  - `0001-split-payload-model.md` - Core constraint: metadata vs encrypted payload
+  - `0002-stateless-authentication.md` - OAuth2/JWT architecture
+  - `0003-cursor-based-pagination.md` - Performance-optimized pagination
+  - `0004-async-export-workflow.md` - Background job pattern
+  - `0005-rest-resource-structure.md` - API design hierarchy
+  - `0006-gdpr-retention-model.md` - Data lifecycle management
+  - `0007-hybrid-concurrency-pattern.md` - Python GIL mitigation
+  - `0008-agentic-tdd-workflow.md` - LLM-driven development process
+- **Overview**: `docs/architecture/architecture-overview.md` - System design
+- **Patterns**: `docs/architecture/design-patterns.md` - Common patterns
+
+### API Documentation
+- **OpenAPI**: `docs/api/openapi.yaml` - Full API specification
+- **Scenarios**: `docs/api/api-scenarios.md` - Real-world usage examples
+
+### Development Guides
+- **Testing**: `docs/development/testing-strategy.md` - TDD practices
+- **Agentic Coding**: `docs/development/agentic-coding-guide.md` - AI-assisted development
+- **Setup**: `docs/development/setup-guide.md` - Environment setup
+
+### Legacy (Prototypes)
+- `prototype-chatgpt5/README.md` - Reference implementation (ARCHIVED)
+- `docs/decisions.md` - Original ADRs (DEPRECATED, see /docs/architecture/adr/)
+
+## Database Schema
+
+Key tables (PostgreSQL):
+- **documents**: id, reference_number, authority_id, status, storage_path, retention_date
+- **threads**: id, document_id, type, assigned_official_id, last_activity_at
+- **messages**: id, thread_id, sender_role, content_blob, created_at
+
+Indexes:
+- `idx_docs_authority` on documents(authority_id, status)
+- `idx_msgs_thread_time` on messages(thread_id, created_at DESC)
+
+## Technology Stack
+
+- **Language**: Python 3.11+
+- **Framework**: FastAPI + Uvicorn
+- **ORM**: SQLAlchemy (async)
+- **Database**: PostgreSQL 15+
+- **Blob Storage**: S3-compatible (MinIO/AWS S3)
+- **Task Queue**: ARQ (Redis-based, async) or Celery
+- **Auth**: OAuth2/JWT (stateless)
+- **Testing**: pytest + pytest-asyncio + pytest-cov
+- **Linting**: ruff (fast, replaces flake8/isort/pyupgrade)
+- **Type Checking**: mypy (strict mode)
+- **CI/CD**: GitHub Actions
+
+## Workplan & Phase Tracking
+
+**Active Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md`
+
+### Current Phase Status
+
+To check which phase we're in:
+1. Open `docs/WORKPLAN_MainCodebase_Integration.md`
+2. Look for checked `[x]` items in each phase section
+3. Focus development on uncompleted `[ ]` items in the current phase
+
+### How to Contribute
+
+**Before implementing any feature:**
+1. Check if it's in the current phase of the workplan
+2. Read the relevant ADR(s) in `docs/architecture/adr/`
+3. Write tests first (TDD approach)
+4. Implement to make tests pass
+5. Ensure all quality gates pass (pytest, mypy, ruff)
+
+**When adding new architectural decisions:**
+1. Create a new ADR file in `docs/architecture/adr/`
+2. Use the template in `0000-template.md`
+3. Number sequentially (next available number)
+4. Update the ADR index in `docs/architecture/adr/README.md`
+
+## Common Gotchas
+
+1. **Stream large files**: Use `aiobotocore` to stream uploads to S3; never load entire payloads into memory
+2. **ProcessPoolExecutor**: CPU-heavy operations MUST be offloaded to avoid blocking the event loop
+3. **Metadata separation**: The routing engine needs plaintext metadata - design API contracts accordingly
+4. **Retention dates**: Always set `retention_date` on document creation for GDPR compliance
+5. **Cursor pagination**: Use timestamp-based cursors for message history, not offset-based
--- a/docs/WORKPLAN_MainCodebase_Integration.md
+++ b/docs/WORKPLAN_MainCodebase_Integration.md
@@ -0,0 +1,653 @@
+# Workplan: Main Codebase Integration & TDD Setup
+
+**Status:** Draft v1.0
+**Date:** 2025-12-01
+**Goal:** Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.
+
+---
+
+## 1. Current State Assessment
+
+### Existing Assets
+- **3 Prototype Implementations**: chatgpt5 (most complete), geminiNbt3pro, grok4.1
+- **Documentation**:
+  - Single `decisions.md` file with all ADRs (needs restructuring)
+  - `implementation_guide.md` (comprehensive)
+  - `api_docs.md` (scenario-based)
+  - `openapi.yaml` (API spec)
+- **Architecture**: Clean Architecture pattern established in chatgpt5 prototype
+
+### Gaps
+- No unified main codebase
+- ADRs not in individual files (not easily referenceable)
+- No TDD test suite defining core interfaces
+- No CI/CD pipeline configuration
+- Prototypes have overlapping implementations without consolidation
+
+---
+
+## 2. Target State Definition
+
+### Main Codebase Structure
+```
+/
+├── docs/
+│   ├── architecture/
+│   │   ├── adr/                    # Individual ADR files
+│   │   │   ├── 0001-split-payload-model.md
+│   │   │   ├── 0002-stateless-authentication.md
+│   │   │   └── ...
+│   │   ├── architecture-overview.md
+│   │   └── design-patterns.md
+│   ├── api/
+│   │   ├── openapi.yaml
+│   │   └── api-scenarios.md
+│   └── development/
+│       ├── testing-strategy.md
+│       └── agentic-coding-guide.md
+├── src/
+│   ├── domain/          # Pure business logic (TDD tested)
+│   ├── adapters/        # External integrations (mocked in tests)
+│   ├── service/         # Application services (TDD tested)
+│   ├── api/             # FastAPI routes (integration tested)
+│   └── workers/         # Background jobs
+├── tests/
+│   ├── unit/            # TDD unit tests
+│   ├── integration/     # Integration tests
+│   └── fixtures/        # Test data and mocks
+├── scripts/
+│   └── init_db.py
+├── pyproject.toml       # Dependencies & tool config
+├── pytest.ini           # Test configuration
+├── .github/
+│   └── workflows/       # CI/CD pipelines
+└── CLAUDE.md            # AI coding assistant guidance
+```
+
+### TDD Approach
+- **Interface-first**: Define Pydantic models and service interfaces via tests
+- **Red-Green-Refactor**: Write failing test → implement → refactor
+- **Test pyramid**: Many unit tests, fewer integration tests, minimal e2e tests
+- **Async testing**: Use pytest-asyncio for async operations
+
+### ADR Documentation Standards
+- **Format**: Follow the Markdown Any Decision Records (MADR) template
+- **Naming**: `NNNN-title-with-dashes.md` (e.g., `0001-split-payload-model.md`)
+- **Location**: `/docs/architecture/adr/`
+- **Template**:
+  ```markdown
+  # ADR-NNNN: [Title]
+
+  **Status:** [Accepted|Proposed|Deprecated|Superseded]
+  **Date:** YYYY-MM-DD
+  **Deciders:** [Team/Role]
+
+  ## Context and Problem Statement
+  [What is the issue we're addressing?]
+
+  ## Decision Drivers
+  * [Driver 1]
+  * [Driver 2]
+
+  ## Considered Options
+  * Option 1
+  * Option 2
+
+  ## Decision Outcome
+  Chosen option: [option], because [rationale].
+
+  ### Positive Consequences
+  * [Consequence 1]
+
+  ### Negative Consequences
+  * [Consequence 1]
+
+  ## Implementation Notes
+  [Specific technical guidance]
+  ```
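The naming rule `NNNN-title-with-dashes.md` can be checked or generated mechanically. The sketch below is one possible helper; `adr_filename` and `ADR_NAME` are hypothetical names, and the slug rules (lowercase, ASCII letters and digits only) are an assumption inferred from the example filename.

```python
import re

# Four digits, then lowercase words joined by single dashes, then .md
ADR_NAME = re.compile(r"^\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def adr_filename(number: int, title: str) -> str:
    """Build an ADR filename from a sequence number and a free-form title."""
    # Collapse every run of non-alphanumeric characters into one dash
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


name = adr_filename(1, "Split-Payload Model")
print(name, bool(ADR_NAME.match(name)))
```

A check like `ADR_NAME.match` could run in CI or a pre-commit hook to keep the ADR directory consistent.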
+
+---
+
+## 3. Phase 1: Foundation Setup (Week 1)
+
+### 3.1 ADR Restructuring
+**Goal**: Extract individual ADRs from `decisions.md` into separate files.
+
+**Tasks**:
+- [ ] Create `/docs/architecture/adr/` directory
+- [ ] Create ADR template file: `docs/architecture/adr/0000-template.md`
+- [ ] Extract ADR-001 (Split-Payload Model) → `0001-split-payload-model.md`
+- [ ] Extract ADR-002 (Stateless Auth) → `0002-stateless-authentication.md`
+- [ ] Extract ADR-003 (Pagination) → `0003-cursor-based-pagination.md`
+- [ ] Extract ADR-004 (Async Exports) → `0004-async-export-workflow.md`
+- [ ] Extract ADR-005 (Resource Naming) → `0005-rest-resource-structure.md`
+- [ ] Extract ADR-006 (Data Retention) → `0006-gdpr-retention-model.md`
+- [ ] Extract ADR-007 (Python & ProcessPoolExecutor) → `0007-hybrid-concurrency-pattern.md`
+- [ ] Create index file: `docs/architecture/adr/README.md` with ADR list and links
+- [ ] Update `decisions.md` with deprecation notice and redirect to ADR directory
+
+**Acceptance Criteria**:
+- Each ADR is a standalone markdown file
+- ADRs follow consistent template structure
+- Index file allows easy navigation
+- CLAUDE.md updated to reference new ADR location
+
+### 3.2 Main Codebase Scaffolding
+**Goal**: Create production-ready directory structure with tooling.
+
+**Tasks**:
+- [ ] Create `/src/` directory with Clean Architecture structure
+- [ ] Initialize `pyproject.toml` with dependencies from prototype-chatgpt5
+- [ ] Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
+- [ ] Create `pytest.ini` with async and coverage configuration
+- [ ] Create `.gitignore` (Python, IDE, env files)
+- [ ] Create `tests/` structure: unit/, integration/, fixtures/
+- [ ] Create `scripts/` directory
+- [ ] Set up pre-commit hooks configuration (ruff, mypy)
+
+**Acceptance Criteria**:
+- Directory structure matches target state
+- `pip install -e ".[dev]"` works
+- `pytest` runs (even with 0 tests)
+- Type checking with `mypy src/` works
+
+### 3.3 Documentation Organization
+**Goal**: Restructure documentation for clarity and AI assistant consumption.
+
+**Tasks**:
+- [ ] Create `/docs/architecture/` directory
+- [ ] Create `/docs/api/` directory
+- [ ] Create `/docs/development/` directory
+- [ ] Move `openapi.yaml` → `docs/api/openapi.yaml`
+- [ ] Move `api_docs.md` → `docs/api/api-scenarios.md`
+- [ ] Refactor `implementation_guide.md` → split into:
+  - `docs/architecture/design-patterns.md` (architectural patterns)
+  - `docs/development/testing-strategy.md` (TDD approach)
+  - `docs/development/agentic-coding-guide.md` (LLM-specific guidance)
+- [ ] Create `docs/architecture/architecture-overview.md` (high-level system design)
+- [ ] Update CLAUDE.md with new documentation structure
+
+**Acceptance Criteria**:
+- Documentation is logically organized by concern
+- Each document has a single, clear purpose
+- CLAUDE.md points to correct locations
+
+---
+
+## 4. Phase 2: TDD Interface Definition (Week 2)
+
+### 4.1 Domain Models (TDD)
+**Goal**: Define core domain models with comprehensive test coverage.
+
+**Approach**: Write tests first, then implement models.
+
+**Tasks**:
+- [ ] **Test**: `tests/unit/domain/test_document_metadata.py`
+  - Test: Valid metadata creation
+  - Test: Invalid authorityId (empty, too long)
+  - Test: Invalid referenceNumber format
+  - Test: Future issuedAt date validation
+  - Implement: `src/domain/models.py` → `DocumentMetadata`
+
+- [ ] **Test**: `tests/unit/domain/test_thread_models.py`
+  - Test: ThreadType enum values
+  - Test: SenderRole enum values
+  - Test: ThreadCreateRequest validation
+  - Test: Message model with encrypted content
+  - Implement: Thread-related models
+
+- [ ] **Test**: `tests/unit/domain/test_document_envelope.py`
+  - Test: Split payload structure
+  - Test: encryptedPayload validation (base64, size limits)
+  - Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
+  - Implement: DocumentEnvelope and status management
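The status chain named in the task list (RECEIVED → ROUTED → ASSIGNED → CLOSED) could be modelled as below. The sketch assumes a strictly linear lifecycle with no skipping or reopening, which the workplan does not spell out; `DocumentStatus`, `ALLOWED`, and `transition` are hypothetical names.

```python
from enum import Enum


class DocumentStatus(Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Assumption: each status may only advance to the next one in the chain
ALLOWED = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}


def transition(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return the new status or raise ValueError for an illegal move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


status = transition(DocumentStatus.RECEIVED, DocumentStatus.ROUTED)
```

Expressing the rules as a table keeps the tests simple: one parametrized test for allowed moves and one for rejected moves.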
+
+**Acceptance Criteria**:
+- All domain models have >90% test coverage
+- Pydantic validation catches invalid inputs
+- Tests run in <1 second
+- Zero mypy type errors
+
+### 4.2 Service Layer Interfaces (TDD)
+**Goal**: Define service contracts via protocol/ABC classes with tests.
+
+**Tasks**:
+- [ ] **Test**: `tests/unit/service/test_documents_service.py`
+  - Mock: Database adapter
+  - Mock: Storage adapter (S3)
+  - Mock: Routing engine
+  - Test: create_document() - happy path
+  - Test: create_document() - routing failure
+  - Test: create_document() - storage failure
+  - Test: get_document() - found
+  - Test: get_document() - not found
+  - Test: Retention date calculation (default 90 days)
+  - Implement: `src/service/documents_service.py`
+
+- [ ] **Test**: `tests/unit/service/test_threads_service.py`
+  - Test: create_thread() - links to document
+  - Test: create_thread() - document not found (404)
+  - Test: list_messages() - cursor pagination
+  - Test: add_message() - role validation
+  - Implement: `src/service/threads_service.py`
+
+- [ ] **Test**: `tests/unit/service/test_exports_service.py`
+  - Test: create_export_job() - returns jobId
+  - Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
+  - Test: Async job enqueuing (mock Redis/ARQ)
+  - Implement: `src/service/exports_service.py`
+
+**Acceptance Criteria**:
+- Service layer has clear, testable interfaces
+- All external dependencies are mocked
+- Tests verify business logic, not infrastructure
+- Each service method has both success and failure test cases
+
+### 4.3 Adapter Contracts (Protocols)
+**Goal**: Define adapter interfaces using Python Protocols for mockability.
+
+**Tasks**:
+- [ ] Create `src/adapters/protocols.py`:
+  - `StorageAdapter` protocol (save_encrypted_payload, get_encrypted_payload)
+  - `DatabaseAdapter` protocol (CRUD operations)
+  - `RoutingEngine` protocol (route_document)
+  - `JobQueue` protocol (enqueue, get_status)
+
+- [ ] Create stub implementations for testing:
+  - `tests/fixtures/storage_stub.py` (in-memory storage)
+  - `tests/fixtures/db_stub.py` (in-memory DB)
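A sketch of how the `StorageAdapter` protocol and its in-memory stub might look. Only the two method names come from the task list; the parameter names, return types, and the `memory://` path scheme are assumptions.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageAdapter(Protocol):
    # Signatures are assumptions; the workplan names only the methods
    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str: ...
    async def get_encrypted_payload(self, storage_path: str) -> bytes: ...


class InMemoryStorage:
    """Test stub; satisfies StorageAdapter structurally, without inheritance."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str:
        path = f"memory://{document_id}"  # hypothetical path scheme
        self._blobs[path] = payload
        return path

    async def get_encrypted_payload(self, storage_path: str) -> bytes:
        return self._blobs[storage_path]


async def roundtrip() -> bytes:
    storage: StorageAdapter = InMemoryStorage()
    path = await storage.save_encrypted_payload("doc-1", b"ciphertext")
    return await storage.get_encrypted_payload(path)


result = asyncio.run(roundtrip())
```

Since protocols use structural typing, a production S3 adapter and this stub can be swapped freely as long as both expose the same two methods.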
+
+**Acceptance Criteria**:
+- Protocols are narrow and focused
+- Test fixtures implement all protocols
+- Production adapters can be swapped without changing service layer
+
+---
+
+## 5. Phase 3: Prototype Integration (Week 3)
+
+### 5.1 Comparative Analysis
+**Goal**: Identify best implementations across prototypes.
+
+**Tasks**:
+- [ ] Analyze `prototype-chatgpt5/src/app/adapters/`:
+  - auth.py - OAuth2 implementation quality
+  - db.py - SQLAlchemy async patterns
+  - storage.py - S3 streaming approach
+  - routing.py - Routing logic structure
+
+- [ ] Analyze `prototype-geminiNbt3pro/`:
+  - Identify unique features or better implementations
+
+- [ ] Analyze `prototype-grok4.1/`:
+  - Compare test coverage and patterns
+
+- [ ] Document findings in: `docs/development/prototype-analysis.md`
+  - Table: Feature vs Prototype vs Recommendation
+  - Rationale for selections
+
+**Acceptance Criteria**:
+- Clear decision on which implementation to use for each component
+- Documented rationale for selections
+- Identified any missing features across all prototypes
+
+### 5.2 Core Adapter Implementation
+**Goal**: Implement production adapters based on best prototype code.
+
+**Tasks**:
+- [ ] **Database Adapter** (`src/adapters/db.py`):
+  - Port SQLAlchemy models from chosen prototype
+  - Implement async session management
+  - Add connection pooling configuration
+  - Write integration tests: `tests/integration/test_db_adapter.py`
+
+- [ ] **Storage Adapter** (`src/adapters/storage.py`):
+  - Implement S3 client using aiobotocore
+  - Add streaming upload/download (no in-memory buffering)
+  - Mock S3 in tests using moto or similar
+  - Write tests: `tests/integration/test_storage_adapter.py`
+
+- [ ] **Routing Engine** (`src/adapters/routing.py`):
+  - Port routing logic from prototype
+  - Make routing rules configurable (not hardcoded)
+  - Add caching layer (Redis) for routing rules
+  - Write tests: `tests/unit/adapters/test_routing.py`
+
+- [ ] **Authentication** (`src/adapters/auth.py`):
+  - Implement JWT validation
+  - Add JWKS caching
+  - Create FastAPI dependency for auth
+  - Write tests: `tests/unit/adapters/test_auth.py`
+
+**Acceptance Criteria**:
+- All adapters follow Protocol contracts
+- Integration tests use real dependencies (testcontainers)
+- Unit tests use mocks
+- Streaming works for large files (>50MB)
+
+### 5.3 API Layer Implementation
+**Goal**: Build FastAPI routes with OpenAPI compliance.
+
+**Tasks**:
+- [ ] **Documents API** (`src/api/documents.py`):
+  - POST /documents - implement with streaming upload
+  - GET /documents/{id} - implement with ETag support
+  - Add request validation (Pydantic)
+  - Write integration tests: `tests/integration/test_documents_api.py`
+
+- [ ] **Threads API** (`src/api/threads.py`):
+  - POST /documents/{id}/threads
+  - GET /threads/{id}/messages (cursor pagination)
+  - POST /threads/{id}/messages
+  - Write integration tests: `tests/integration/test_threads_api.py`
+
+- [ ] **Exports API** (`src/api/exports.py`):
+  - POST /exports (async job creation)
+  - GET /exports/{jobId} (status polling)
+  - Write integration tests: `tests/integration/test_exports_api.py`
+
+- [ ] **Main App** (`src/main.py`):
+  - Configure FastAPI with CORS, middleware
+  - Include all routers
+  - Add exception handlers
+  - Add health check endpoint: GET /health
+
+**Acceptance Criteria**:
+- OpenAPI schema matches `docs/api/openapi.yaml`
+- All endpoints have auth middleware
+- Integration tests achieve >80% coverage
+- API responses match documented format
+
+### 5.4 Background Workers
+**Goal**: Implement async export worker.
+
+**Tasks**:
+- [ ] Choose task queue: ARQ (Redis-based, async) or Celery
+- [ ] Implement `src/workers/exports_worker.py`:
+  - Fetch document from storage
+  - Fetch message history from DB
+  - Generate export package (PDF + metadata)
+  - Update job status
+
+- [ ] Write worker tests: `tests/unit/workers/test_exports_worker.py`
+- [ ] Document worker deployment in: `docs/development/worker-deployment.md`
+
+**Acceptance Criteria**:
+- Worker processes export jobs independently
+- Failures are logged and job marked as FAILED
+- Worker can be scaled horizontally
+
+---
+
+## 6. Phase 4: CI/CD & Quality Gates (Week 4)
+
+### 6.1 GitHub Actions Workflows
+**Goal**: Automate testing and quality checks.
+
+**Tasks**:
+- [ ] Create `.github/workflows/test.yml`:
+  - Run on: push, pull_request
+  - Matrix: Python 3.11, 3.12
+  - Steps: Install deps, run pytest, upload coverage
+
+- [ ] Create `.github/workflows/lint.yml`:
+  - Run ruff linting
+  - Run mypy type checking
+  - Check code formatting
+
+- [ ] Create `.github/workflows/integration.yml`:
+  - Spin up PostgreSQL, Redis via services
+  - Run integration tests with real dependencies
+
+- [ ] Add status badges to README.md
+
+**Acceptance Criteria**:
+- All workflows pass on main branch
+- Pull requests blocked if tests fail
+- Coverage report available in PR comments
+
+### 6.2 Pre-commit Hooks
+**Goal**: Catch issues before commit.
+
+**Tasks**:
+- [ ] Create `.pre-commit-config.yaml`:
+  - ruff linting and formatting
+  - mypy type checking
+  - trailing whitespace removal
+  - YAML validation
+
+- [ ] Document setup in: `docs/development/setup-guide.md`
+
+**Acceptance Criteria**:
+- Hooks auto-format code
+- Hooks prevent commits with type errors
+- Setup documented for new developers
+
+### 6.3 Test Coverage Requirements
+**Goal**: Enforce quality thresholds.
+
+**Tasks**:
+- [ ] Configure pytest-cov in `pytest.ini`:
+  - Minimum coverage: 80%
+  - Exclude: tests/, scripts/
+
+- [ ] Add coverage badge to README.md
+- [ ] Document coverage exemptions (e.g., `# pragma: no cover`)
+
+**Acceptance Criteria**:
+- `pytest --cov` fails if <80% coverage
+- Coverage report generated in HTML format
+- Uncovered lines are intentional and documented
+
+---
+
+## 7. Phase 5: Agentic Coding Enablement (Week 5)
+
+### 7.1 Agentic Coding Guide
+**Goal**: Create comprehensive guide for LLM-driven development.
+
+**Tasks**:
+- [ ] Create `docs/development/agentic-coding-guide.md`:
+  - TDD workflow for Claude/GPT
+  - Example prompts for generating tests
+  - How to use Protocol adapters for mocking
+  - Async testing patterns
+  - Common pitfalls (GIL, blocking operations)
+
+- [ ] Add example prompt templates:
+  - "Write async pytest for POST /documents with ProcessPoolExecutor mock"
+  - "Implement cursor pagination for messages following ADR-003"
+
+- [ ] Update CLAUDE.md with agentic coding patterns
+
+**Acceptance Criteria**:
+- Guide includes concrete examples
+- Prompts reference specific ADRs
+- Guide covers both unit and integration test generation
+
+### 7.2 Testing Utilities & Fixtures
+**Goal**: Provide reusable test infrastructure.
+
+**Tasks**:
+- [ ] Create `tests/fixtures/factories.py`:
+  - DocumentMetadataFactory (faker-based)
+  - ThreadFactory
+  - MessageFactory
+
+- [ ] Create `tests/fixtures/db_fixtures.py`:
+  - @pytest.fixture for async DB session
+  - @pytest.fixture for testcontainers Postgres
+
+- [ ] Create `tests/fixtures/auth_fixtures.py`:
+  - Mock JWT tokens with different scopes
+  - Mock JWKS endpoints
+
+- [ ] Document in: `docs/development/testing-utilities.md`
+
+**Acceptance Criteria**:
+- Fixtures reduce boilerplate in tests
+- Factories generate realistic test data
+- Documentation shows usage examples
+
+### 7.3 ADR for Agentic Development
+**Goal**: Document the TDD + AI approach as an architectural decision.
+
+**Tasks**:
+- [ ] Create `docs/architecture/adr/0008-agentic-tdd-workflow.md`:
+  - Context: LLM-driven development velocity vs. quality
+  - Decision: Interface-first TDD with AI assistance
+  - Rationale: Tests serve as executable specification
+  - Implementation: Workflow, tooling, prompts
+
+**Acceptance Criteria**:
+- ADR approved by team
+- Links to agentic-coding-guide.md
+- Referenced in CLAUDE.md
+
+---
+
+## 8. Phase 6: Migration & Validation (Week 6)
+
+### 8.1 Prototype Deprecation
+**Goal**: Mark prototypes as archived.
+
+**Tasks**:
+- [ ] Add README.md to each prototype directory:
+  - Status: ARCHIVED
+  - Reason: Consolidated into main codebase
+  - Date: 2025-12-XX
+
+- [ ] Document migration decisions in: `docs/development/prototype-migration.md`
+- [ ] Keep prototypes in repo for reference (don't delete)
+
+**Acceptance Criteria**:
+- Clear indication that prototypes are not maintained
+- Migration rationale documented
+
+### 8.2 End-to-End Validation
+**Goal**: Verify complete system integration.
+
+**Tasks**:
+- [ ] Write E2E test: `tests/e2e/test_full_workflow.py`:
+  - Citizen uploads document
+  - Document is routed
+  - Thread is created
+  - Messages are exchanged
+  - Export is generated
+
+- [ ] Run against local environment (Docker Compose)
+- [ ] Measure performance against NFRs:
+  - Document upload + routing: <500ms
+  - Message retrieval: <300ms
+
+- [ ] Document E2E setup in: `docs/development/e2e-testing.md`
+
+**Acceptance Criteria**:
+- E2E test passes consistently
+- Performance targets met
+- E2E environment reproducible via Docker Compose
+
+### 8.3 Documentation Review
+**Goal**: Ensure all documentation is accurate and complete.
+
+**Tasks**:
+- [ ] Review all ADRs for consistency
+- [ ] Update CLAUDE.md with final structure
+- [ ] Review API documentation against implementation
+- [ ] Spell check and grammar check all docs
+- [ ] Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)
+
+**Acceptance Criteria**:
+- No broken links in documentation
+- Code examples in docs are tested
+- CLAUDE.md accurately reflects current state
+
+---
+
+## 9. Success Criteria (Overall)
+
+### Functional Requirements
+- ✅ Main codebase has all features from best prototype
+- ✅ All core APIs implemented and tested
+- ✅ Background worker functional
+
+### Quality Requirements
+- ✅ Test coverage >80%
+- ✅ Zero mypy type errors
+- ✅ All linting rules pass
+- ✅ CI/CD pipeline green
+
+### Documentation Requirements
+- ✅ ADRs in individual files with consistent structure
+- ✅ All major decisions documented
+- ✅ Agentic coding guide comprehensive
+- ✅ CLAUDE.md accurate and complete
+
+### Performance Requirements
+- ✅ Document routing <500ms (measured in E2E tests)
+- ✅ Message retrieval <300ms (measured in E2E tests)
+- ✅ Large file upload streaming works (>50MB test)
+
+### Process Requirements
+- ✅ TDD workflow established and documented
+- ✅ Pre-commit hooks prevent quality issues
+- ✅ GitHub Actions enforce quality gates
+- ✅ Agentic development patterns proven with at least 3 features
+
+---
+
+## 10. Risk Mitigation
+
+### Risk 1: Prototype Integration Conflicts
+**Mitigation**: Complete the comparative analysis (Phase 3, Section 5.1) before implementation. Document decision rationale.
+
+### Risk 2: TDD Slowing Initial Progress
+**Mitigation**: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.
+
+### Risk 3: Incomplete ADR Extraction
+**Mitigation**: Use checklist approach. Review original `decisions.md` multiple times. Cross-reference with implementation guide.
+
+### Risk 4: Agentic Coding Learning Curve
+**Mitigation**: Create an example-driven guide. Include actual prompts that worked. Pair with a human for the first few features.
+
+### Risk 5: Performance Targets Not Met
+**Mitigation**: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.
+
+---
+
+## 11. Next Steps
+
+1. **Review this workplan** with the team
+2. **Adjust timeline** based on team capacity
+3. **Start Phase 1** (Foundation Setup)
+4. **Daily standup** to track progress and blockers
+5. **Weekly retrospective** to improve agentic coding workflow
+
+---
+
+## Appendix A: Recommended Tools
+
+- **Testing**: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
+- **Mocking**: respx (HTTP), moto (AWS), testcontainers-python (real services)
+- **Linting**: ruff (fast, replaces flake8 + isort + pyupgrade)
+- **Type Checking**: mypy with strict mode
+- **Factories**: factory_boy or custom Pydantic factories
+- **Performance**: py-spy (profiling), locust (load testing)
+- **Pre-commit**: pre-commit framework
+- **CI/CD**: GitHub Actions (free for public repos)
+
+---
+
+## Appendix B: ADR Numbering Convention
+
+- **0001-0099**: Core Architecture (payload model, auth, concurrency)
+- **0100-0199**: Data & Persistence (pagination, retention, schema)
+- **0200-0299**: API & Integration (REST structure, export workflow)
+- **0300-0399**: Development Process (TDD, agentic coding)
+- **0400+**: Future decisions
+
+---
+
+**Document Owner**: Backend Engineering Team
+**Last Updated**: 2025-12-01
+**Status**: Ready for Review