chore: setup agentic coding workplan

2025-12-01 22:37:27 +01:00
parent 45d60fc1a9
commit 9081cb80d3
2 changed files with 1013 additions and 0 deletions

360
CLAUDE.md Normal file

@@ -0,0 +1,360 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
**DirektVermittlungDe (DVD)** is a document-centric communication platform between citizens and German authorities. It eliminates the "phone hunt" in three steps:
1. The citizen uploads a document or provides an *Aktenzeichen* (reference number)
2. The system auto-routes the request to the responsible unit
3. The system opens an interaction thread for direct clarification
## Current Development Status
**⚠️ IMPORTANT**: This repository is currently in transition from the prototype phase to the main production codebase.
- **Current State**: Three prototype implementations (chatgpt5, geminiNbt3pro, grok4.1) in `/prototype-*` directories
- **Target State**: Unified production codebase in `/src` with TDD-driven development
- **Active Workplan**: See `docs/WORKPLAN_MainCodebase_Integration.md` for detailed migration plan
**When working on this codebase:**
- Check the workplan to understand which phase we're in
- Follow TDD practices: write tests first, then implementation
- Reference ADRs for architectural decisions (see below)
- New code goes in `/src`, not prototypes
## Repository Structure (Target State)
```
/
├── docs/
│   ├── architecture/
│   │   ├── adr/                    # Individual ADR files (0001-*.md)
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-authentication.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   ├── development/
│   │   ├── testing-strategy.md
│   │   ├── agentic-coding-guide.md
│   │   └── setup-guide.md
│   └── WORKPLAN_MainCodebase_Integration.md
├── src/
│   ├── domain/                     # Pure business logic (TDD tested)
│   ├── adapters/                   # External integrations (mocked in tests)
│   ├── service/                    # Application services (TDD tested)
│   ├── api/                        # FastAPI routes (integration tested)
│   └── workers/                    # Background jobs
├── tests/
│   ├── unit/                       # TDD unit tests
│   ├── integration/                # Integration tests
│   └── fixtures/                   # Test data and mocks
├── scripts/                        # Utility scripts
├── prototype-*/                    # ARCHIVED prototypes (reference only)
├── pyproject.toml                  # Dependencies & tool config
└── pytest.ini                      # Test configuration
```
## Development Commands
### Setup and Run (Main Codebase)
```bash
# Install dependencies (from repo root)
pip install -e ".[dev]"
# Initialize database metadata
python -m scripts.init_db
# Run development server
uvicorn src.main:app --reload
# View API docs
# http://localhost:8000/docs
```
### Testing (TDD Workflow)
```bash
# Run all tests
pytest
# Run with coverage report
pytest --cov=src --cov-report=html
# Run only unit tests
pytest tests/unit/
# Run only integration tests
pytest tests/integration/
# Run specific test file
pytest tests/unit/domain/test_document_metadata.py -v
# Watch mode (requires pytest-watch)
ptw
```
### Code Quality
```bash
# Lint and format with ruff
ruff check src/ tests/
ruff format src/ tests/
# Type checking with mypy
mypy src/
# Run pre-commit hooks manually
pre-commit run --all-files
```
## Key Architectural Patterns
### 1. Split-Payload Model (ADR-001)
**Critical**: Documents use a split design to enable both encryption and routing:
- **metadata** (plaintext JSON): authorityId, referenceNumber, docType, issuedAt
- **encryptedPayload** (encrypted blob): actual PDF/scan content
The backend can route based on metadata without decrypting the payload. This is the core architectural constraint.
When working with documents:
- NEVER attempt to decrypt `encryptedPayload` in the main service
- The backend treats encrypted content as opaque
- Routing logic operates only on metadata fields
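A minimal sketch of metadata-only routing, assuming a simple lookup table. `DocumentEnvelope`, `ROUTING_TABLE`, `route_document`, the authority ID, and the default unit are illustrative names chosen for this example, not the project's actual implementation:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DocumentMetadata:
    authorityId: str
    referenceNumber: str
    docType: str
    issuedAt: datetime


@dataclass(frozen=True)
class DocumentEnvelope:
    metadata: DocumentMetadata  # plaintext, readable by the backend
    encryptedPayload: bytes     # opaque ciphertext, never decrypted here


# Hypothetical routing table: (authorityId, docType) -> responsible unit
ROUTING_TABLE = {
    ("tax-office-berlin", "NOTICE"): "NoticeTeam",
}


def route_document(envelope: DocumentEnvelope, default_unit: str = "Intake") -> str:
    """Pick the responsible unit from metadata only; the payload is not read."""
    key = (envelope.metadata.authorityId, envelope.metadata.docType)
    return ROUTING_TABLE.get(key, default_unit)
```

The function never touches `encryptedPayload`, so a routing test can pass arbitrary bytes as the payload.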
### 2. Hybrid Concurrency Pattern (ADR-007)
Because Python's GIL keeps CPU-bound code from running in parallel across threads, I/O and CPU work are handled differently:
- **I/O operations** (DB, network): Use native `async`/`await`
- **CPU operations** (crypto, PDF): Offload to `ProcessPoolExecutor`
Example pattern:
```python
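import asyncio
from concurrent.futures import ProcessPoolExecutor

# Shared pool for CPU-bound work; create it once at startup, not per request
cpu_pool = ProcessPoolExecutor(max_workers=4)

async def handler():
    # I/O: async/await (`db` stands in for an async database adapter)
    data = await db.fetch()
    # CPU: executor (`heavy_function` and `data` must be picklable)
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(cpu_pool, heavy_function, data)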
```
### 3. Cursor-Based Pagination (ADR-003)
For message threads, use cursor-based pagination (timestamp) instead of offset-based:
- Ensures consistent performance regardless of thread length
- Prevents duplicate/missing messages during real-time updates
- Target: <300ms response time (NFR-1)
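A sketch of the cursor logic over an in-memory list, to make the contract concrete. The `Message` class, the `list_messages` signature, and the page size are assumptions for illustration; a real implementation would push the same filter into SQL (`WHERE created_at < :cursor ORDER BY created_at DESC LIMIT :limit`):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Message:
    id: int
    created_at: datetime


def list_messages(messages, cursor: Optional[datetime] = None, limit: int = 2):
    """Return up to `limit` messages older than `cursor`, newest first."""
    ordered = sorted(messages, key=lambda m: m.created_at, reverse=True)
    if cursor is not None:
        # Only messages strictly older than the cursor belong to the next page
        ordered = [m for m in ordered if m.created_at < cursor]
    page = ordered[:limit]
    # The next cursor is the timestamp of the last returned item,
    # or None when no older messages remain
    next_cursor = page[-1].created_at if len(ordered) > limit else None
    return page, next_cursor
```

Note that a pure timestamp cursor can skip messages that share the exact same `created_at`; adding the message ID as a tiebreaker is a common remedy.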
### 4. Async Export Pattern (ADR-004)
Data exports use async request-reply:
1. POST `/exports` → 202 Accepted + jobId
2. Background worker processes via message queue
3. Client polls status endpoint or receives webhook
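The three steps can be sketched with in-memory stand-ins for the job table and the queue. `JOBS`, `QUEUE`, `create_export_job`, `run_worker_once`, and `get_export_status` are hypothetical names; only the status values come from `ExportJobStatus` as listed in this file:

```python
import uuid
from enum import Enum


class ExportJobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


JOBS: dict[str, ExportJobStatus] = {}  # stand-in for a persistent job table
QUEUE: list[str] = []                  # stand-in for the message queue


def create_export_job() -> tuple[int, str]:
    """Handler body for POST /exports: enqueue and answer immediately."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = ExportJobStatus.QUEUED
    QUEUE.append(job_id)
    return 202, job_id


def run_worker_once(export_fn) -> None:
    """Process one queued job; any exception marks the job FAILED."""
    job_id = QUEUE.pop(0)
    JOBS[job_id] = ExportJobStatus.RUNNING
    try:
        export_fn(job_id)
        JOBS[job_id] = ExportJobStatus.COMPLETED
    except Exception:
        JOBS[job_id] = ExportJobStatus.FAILED


def get_export_status(job_id: str) -> ExportJobStatus:
    """Handler body for the status endpoint the client polls."""
    return JOBS[job_id]
```

The request handler returns before any export work happens, which is what keeps `POST /exports` fast regardless of export size.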
### 5. Stateless Authentication (ADR-002)
OAuth2/JWT with scopes:
- `citizen:write`: Document submission, thread creation
- `official:read`: View assigned cases
- `official:write`: Respond to inquiries
JWTs enable horizontal scaling without session affinity.
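A sketch of a scope check on already-verified token claims. It assumes the common OAuth2 convention of a space-delimited `scope` claim; `has_scope` is a hypothetical helper, and signature verification is deliberately out of scope here:

```python
def has_scope(claims: dict, required: str) -> bool:
    """Return True if the verified JWT claims grant the required scope."""
    # Assumption: scopes arrive as one space-delimited string
    granted = claims.get("scope", "").split()
    return required in granted


claims = {"sub": "official-42", "scope": "official:read official:write"}
can_reply = has_scope(claims, "official:write")      # True
can_submit = has_scope(claims, "citizen:write")      # False
```

Because the decision depends only on the token contents, any instance can evaluate it without a shared session store.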
### 6. GDPR-Compliant Retention (ADR-006)
Documents have a `retention_date` field:
- Default: `closedAt + gracePeriod`
- Personal archive: `null` or extended date
- Automated cleanup via TTL engine
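A sketch of the retention rule. The 90-day grace period is an assumption taken from the workplan's test list, and `compute_retention_date` and `is_expired` are hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 90 days, as named in the workplan's retention test item
DEFAULT_GRACE_PERIOD = timedelta(days=90)


def compute_retention_date(
    closed_at: datetime,
    grace_period: timedelta = DEFAULT_GRACE_PERIOD,
    personal_archive: bool = False,
) -> Optional[datetime]:
    """Default: closedAt + gracePeriod. Personal archive: no expiry (None)."""
    if personal_archive:
        return None
    return closed_at + grace_period


def is_expired(retention_date: Optional[datetime], now: datetime) -> bool:
    """A cleanup job may delete the document once this returns True."""
    return retention_date is not None and now >= retention_date
```

Using timezone-aware datetimes throughout avoids comparing naive and aware values in the cleanup job.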
## API Structure
The API follows a document-centric REST hierarchy:
- `/documents` - Root envelope (FR-1: Document intake)
- `/documents/{id}/threads` - Sub-resource for communication (FR-3)
- `/threads/{threadId}/messages` - Message exchange
- `/exports` - Async data export for authority systems (FR-7)
## Critical Domain Models
Located in `src/domain/models.py` (target) or `prototype-chatgpt5/src/app/domain/models.py` (reference):
**Enums:**
- **ThreadType**: TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT
- **SenderRole**: CITIZEN, OFFICIAL, SYSTEM
- **ExportJobStatus**: QUEUED, RUNNING, COMPLETED, FAILED
**Core Models:**
- **DocumentMetadata**: authorityId, referenceNumber, docType, issuedAt
  - Validation: authorityId (1-50 chars), referenceNumber required
- **DocumentCreateRequest**: metadata, encryptedPayload (base64)
- **DocumentCreatedResponse**: id, status, assignedUnit
- **ThreadCreateRequest**: type, initialMessage, preferredTimeSlot
- **MessageDto**: id, senderRole, content (encrypted), timestamp
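Since `encryptedPayload` arrives base64-encoded, its encoding and size can be checked without decrypting anything. A sketch, assuming a hypothetical 50 MB limit and a hypothetical helper name `decode_encrypted_payload`:

```python
import base64
import binascii

MAX_PAYLOAD_BYTES = 50 * 1024 * 1024  # hypothetical limit, not a project decision


def decode_encrypted_payload(b64_payload: str, max_bytes: int = MAX_PAYLOAD_BYTES) -> bytes:
    """Decode the base64 transport encoding; the ciphertext itself stays opaque."""
    try:
        raw = base64.b64decode(b64_payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("encryptedPayload is not valid base64") from exc
    if not raw:
        raise ValueError("encryptedPayload must not be empty")
    if len(raw) > max_bytes:
        raise ValueError("encryptedPayload exceeds the size limit")
    return raw
```

In a Pydantic model this check would typically live in a field validator; for large uploads a streaming path avoids holding the decoded bytes in memory.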
## Performance Targets (NFR)
- Core operations: <300ms response time
- Document upload routing: <500ms
- Concurrent sessions: 10k+ per region
- Availability: ≥99.5%
## Security Constraints
- E2E encryption required for sensitive data (NFR-5)
- Backend cannot decrypt document payloads
- Least-privilege access via OAuth scopes (NFR-6)
- No PII in plaintext application logs
## TDD Development Workflow
**CRITICAL**: This project follows a strict Test-Driven Development approach.
### The Red-Green-Refactor Cycle
1. **RED**: Write a failing test that defines desired behavior
2. **GREEN**: Write minimal code to make the test pass
3. **REFACTOR**: Improve code while keeping tests green
### Example TDD Workflow
```python
# Step 1: Write failing test (tests/unit/domain/test_document_metadata.py)
from datetime import datetime

import pytest
from pydantic import ValidationError

# Import path assumes `src` is importable as a package
from src.domain.models import DocumentMetadata


def test_document_metadata_validates_authority_id():
    with pytest.raises(ValidationError):
        DocumentMetadata(
            authorityId="",  # Empty should fail
            referenceNumber="123/456",
            docType="NOTICE",
            issuedAt=datetime.now(),
        )

# Step 2: Run test (it fails)
# $ pytest tests/unit/domain/test_document_metadata.py

# Step 3: Implement validation (src/domain/models.py)
from pydantic import BaseModel, Field


class DocumentMetadata(BaseModel):
    authorityId: str = Field(..., min_length=1, max_length=50)
    # ... rest of fields

# Step 4: Run test again (it passes)
# Step 5: Refactor if needed, ensure tests still pass
```
### Agentic TDD Pattern
When asking Claude to implement features, use this pattern:
```
"Write a test for [feature] that verifies [specific behavior].
Use pytest and follow the pattern in tests/unit/[module]/.
Reference ADR-[number] for architectural constraints."
```
Example:
```
"Write a test for document routing that verifies documents with docType='NOTICE'
are routed to the NoticeTeam. Mock the routing adapter using the protocol defined
in src/adapters/protocols.py. Follow ADR-0001 split-payload constraints."
```
See `docs/development/agentic-coding-guide.md` for comprehensive examples.
## Documentation References
### Architecture & Decisions
- **Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md` - Current migration plan
- **ADRs**: `docs/architecture/adr/` - Individual decision records
  - `0001-split-payload-model.md` - Core constraint: metadata vs encrypted payload
  - `0002-stateless-authentication.md` - OAuth2/JWT architecture
  - `0003-cursor-based-pagination.md` - Performance-optimized pagination
  - `0004-async-export-workflow.md` - Background job pattern
  - `0005-rest-resource-structure.md` - API design hierarchy
  - `0006-gdpr-retention-model.md` - Data lifecycle management
  - `0007-hybrid-concurrency-pattern.md` - Python GIL mitigation
  - `0008-agentic-tdd-workflow.md` - LLM-driven development process
- **Overview**: `docs/architecture/architecture-overview.md` - System design
- **Patterns**: `docs/architecture/design-patterns.md` - Common patterns
### API Documentation
- **OpenAPI**: `docs/api/openapi.yaml` - Full API specification
- **Scenarios**: `docs/api/api-scenarios.md` - Real-world usage examples
### Development Guides
- **Testing**: `docs/development/testing-strategy.md` - TDD practices
- **Agentic Coding**: `docs/development/agentic-coding-guide.md` - AI-assisted development
- **Setup**: `docs/development/setup-guide.md` - Environment setup
### Legacy (Prototypes)
- `prototype-chatgpt5/README.md` - Reference implementation (ARCHIVED)
- `docs/decisions.md` - Original ADRs (DEPRECATED, see /docs/architecture/adr/)
## Database Schema
Key tables (PostgreSQL):
- **documents**: id, reference_number, authority_id, status, storage_path, retention_date
- **threads**: id, document_id, type, assigned_official_id, last_activity_at
- **messages**: id, thread_id, sender_role, content_blob, created_at
Indexes:
- `idx_docs_authority` on documents(authority_id, status)
- `idx_msgs_thread_time` on messages(thread_id, created_at DESC)
## Technology Stack
- **Language**: Python 3.11+
- **Framework**: FastAPI + Uvicorn
- **ORM**: SQLAlchemy (async)
- **Database**: PostgreSQL 15+
- **Blob Storage**: S3-compatible (MinIO/AWS S3)
- **Task Queue**: ARQ (Redis-based, async) or Celery
- **Auth**: OAuth2/JWT (stateless)
- **Testing**: pytest + pytest-asyncio + pytest-cov
- **Linting**: ruff (fast, replaces flake8/isort/pyupgrade)
- **Type Checking**: mypy (strict mode)
- **CI/CD**: GitHub Actions
## Workplan & Phase Tracking
**Active Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md`
### Current Phase Status
To check which phase we're in:
1. Open `docs/WORKPLAN_MainCodebase_Integration.md`
2. Look for checked `[x]` items in each phase section
3. Focus development on uncompleted `[ ]` items in the current phase
### How to Contribute
**Before implementing any feature:**
1. Check if it's in the current phase of the workplan
2. Read the relevant ADR(s) in `docs/architecture/adr/`
3. Write tests first (TDD approach)
4. Implement to make tests pass
5. Ensure all quality gates pass (pytest, mypy, ruff)
**When adding new architectural decisions:**
1. Create a new ADR file in `docs/architecture/adr/`
2. Use the template in `0000-template.md`
3. Number sequentially (next available number)
4. Update the ADR index in `docs/architecture/adr/README.md`
## Common Gotchas
1. **Stream large files**: Use `aiobotocore` to stream uploads to S3; don't load entire payloads into memory
2. **ProcessPoolExecutor**: CPU-heavy operations MUST be offloaded to avoid blocking the event loop
3. **Metadata separation**: The routing engine needs plaintext metadata - design API contracts accordingly
4. **Retention dates**: Always set `retention_date` on document creation for GDPR compliance
5. **Cursor pagination**: Use timestamp-based cursors for message history, not offset-based


@@ -0,0 +1,653 @@
# Workplan: Main Codebase Integration & TDD Setup
**Status:** Draft v1.0
**Date:** 2025-12-01
**Goal:** Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.
---
## 1. Current State Assessment
### Existing Assets
- **3 Prototype Implementations**: chatgpt5 (most complete), geminiNbt3pro, grok4.1
- **Documentation**:
  - Single `decisions.md` file with all ADRs (needs restructuring)
  - `implementation_guide.md` (comprehensive)
  - `api_docs.md` (scenario-based)
  - `openapi.yaml` (API spec)
- **Architecture**: Clean Architecture pattern established in chatgpt5 prototype
### Gaps
- No unified main codebase
- ADRs not in individual files (not easily referenceable)
- No TDD test suite defining core interfaces
- No CI/CD pipeline configuration
- Prototypes have overlapping implementations without consolidation
---
## 2. Target State Definition
### Main Codebase Structure
```
/
├── docs/
│   ├── architecture/
│   │   ├── adr/                    # Individual ADR files
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-authentication.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   └── development/
│       ├── testing-strategy.md
│       └── agentic-coding-guide.md
├── src/
│   ├── domain/                     # Pure business logic (TDD tested)
│   ├── adapters/                   # External integrations (mocked in tests)
│   ├── service/                    # Application services (TDD tested)
│   ├── api/                        # FastAPI routes (integration tested)
│   └── workers/                    # Background jobs
├── tests/
│   ├── unit/                       # TDD unit tests
│   ├── integration/                # Integration tests
│   └── fixtures/                   # Test data and mocks
├── scripts/
│   └── init_db.py
├── pyproject.toml                  # Dependencies & tool config
├── pytest.ini                      # Test configuration
├── .github/
│   └── workflows/                  # CI/CD pipelines
└── CLAUDE.md                       # AI coding assistant guidance
```
### TDD Approach
- **Interface-first**: Define Pydantic models and service interfaces via tests
- **Red-Green-Refactor**: Write failing test → implement → refactor
- **Test pyramid**: Many unit tests, fewer integration tests, minimal e2e tests
- **Async testing**: Use pytest-asyncio for async operations
### ADR Documentation Standards
- **Format**: Follow Markdown Any Decision Records (MADR) template
- **Naming**: `NNNN-title-with-dashes.md` (e.g., `0001-split-payload-model.md`)
- **Location**: `/docs/architecture/adr/`
- **Template**:
```markdown
# ADR-NNNN: [Title]
**Status:** [Accepted|Proposed|Deprecated|Superseded]
**Date:** YYYY-MM-DD
**Deciders:** [Team/Role]
## Context and Problem Statement
[What is the issue we're addressing?]
## Decision Drivers
* [Driver 1]
* [Driver 2]
## Considered Options
* Option 1
* Option 2
## Decision Outcome
Chosen option: [option], because [rationale].
### Positive Consequences
* [Consequence 1]
### Negative Consequences
* [Consequence 1]
## Implementation Notes
[Specific technical guidance]
```
---
## 3. Phase 1: Foundation Setup (Week 1)
### 3.1 ADR Restructuring
**Goal**: Extract individual ADRs from `decisions.md` into separate files.
**Tasks**:
- [ ] Create `/docs/architecture/adr/` directory
- [ ] Create ADR template file: `docs/architecture/adr/0000-template.md`
- [ ] Extract ADR-001 (Split-Payload Model) → `0001-split-payload-model.md`
- [ ] Extract ADR-002 (Stateless Auth) → `0002-stateless-authentication.md`
- [ ] Extract ADR-003 (Pagination) → `0003-cursor-based-pagination.md`
- [ ] Extract ADR-004 (Async Exports) → `0004-async-export-workflow.md`
- [ ] Extract ADR-005 (Resource Naming) → `0005-rest-resource-structure.md`
- [ ] Extract ADR-006 (Data Retention) → `0006-gdpr-retention-model.md`
- [ ] Extract ADR-007 (Python & ProcessPoolExecutor) → `0007-hybrid-concurrency-pattern.md`
- [ ] Create index file: `docs/architecture/adr/README.md` with ADR list and links
- [ ] Update `decisions.md` with deprecation notice and redirect to ADR directory
**Acceptance Criteria**:
- Each ADR is a standalone markdown file
- ADRs follow consistent template structure
- Index file allows easy navigation
- CLAUDE.md updated to reference new ADR location
### 3.2 Main Codebase Scaffolding
**Goal**: Create production-ready directory structure with tooling.
**Tasks**:
- [ ] Create `/src/` directory with Clean Architecture structure
- [ ] Initialize `pyproject.toml` with dependencies from prototype-chatgpt5
- [ ] Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
- [ ] Create `pytest.ini` with async and coverage configuration
- [ ] Create `.gitignore` (Python, IDE, env files)
- [ ] Create `tests/` structure: unit/, integration/, fixtures/
- [ ] Create `scripts/` directory
- [ ] Set up pre-commit hooks configuration (ruff, mypy)
**Acceptance Criteria**:
- Directory structure matches target state
- `pip install -e ".[dev]"` works
- `pytest` runs (even with 0 tests)
- Type checking with `mypy src/` works
### 3.3 Documentation Organization
**Goal**: Restructure documentation for clarity and AI assistant consumption.
**Tasks**:
- [ ] Create `/docs/architecture/` directory
- [ ] Create `/docs/api/` directory
- [ ] Create `/docs/development/` directory
- [ ] Move `openapi.yaml` → `docs/api/openapi.yaml`
- [ ] Move `api_docs.md` → `docs/api/api-scenarios.md`
- [ ] Refactor `implementation_guide.md` → split into:
  - `docs/architecture/design-patterns.md` (architectural patterns)
  - `docs/development/testing-strategy.md` (TDD approach)
  - `docs/development/agentic-coding-guide.md` (LLM-specific guidance)
- [ ] Create `docs/architecture/architecture-overview.md` (high-level system design)
- [ ] Update CLAUDE.md with new documentation structure
**Acceptance Criteria**:
- Documentation is logically organized by concern
- Each document has a single, clear purpose
- CLAUDE.md points to correct locations
---
## 4. Phase 2: TDD Interface Definition (Week 2)
### 4.1 Domain Models (TDD)
**Goal**: Define core domain models with comprehensive test coverage.
**Approach**: Write tests first, then implement models.
**Tasks**:
- [ ] **Test**: `tests/unit/domain/test_document_metadata.py`
- Test: Valid metadata creation
- Test: Invalid authorityId (empty, too long)
- Test: Invalid referenceNumber format
- Test: Future issuedAt date validation
- Implement: `src/domain/models.py` → `DocumentMetadata`
- [ ] **Test**: `tests/unit/domain/test_thread_models.py`
- Test: ThreadType enum values
- Test: SenderRole enum values
- Test: ThreadCreateRequest validation
- Test: Message model with encrypted content
- Implement: Thread-related models
- [ ] **Test**: `tests/unit/domain/test_document_envelope.py`
- Test: Split payload structure
- Test: encryptedPayload validation (base64, size limits)
- Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
- Implement: DocumentEnvelope and status management
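The status-transition test above could target something like the following sketch. It assumes the lifecycle is strictly linear with `CLOSED` as a terminal state; `DocumentStatus`, `ALLOWED_TRANSITIONS`, and `transition` are illustrative names, not settled design:

```python
from enum import Enum


class DocumentStatus(str, Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Assumption: no skipping or going back; CLOSED is terminal
ALLOWED_TRANSITIONS = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}


def transition(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return the new status, or raise ValueError for an illegal move."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the allowed moves in a table makes the rule itself testable and easy to extend if, for example, reassignment is later permitted.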
**Acceptance Criteria**:
- All domain models have >90% test coverage
- Pydantic validation catches invalid inputs
- Tests run in <1 second
- Zero mypy type errors
### 4.2 Service Layer Interfaces (TDD)
**Goal**: Define service contracts via protocol/ABC classes with tests.
**Tasks**:
- [ ] **Test**: `tests/unit/service/test_documents_service.py`
- Mock: Database adapter
- Mock: Storage adapter (S3)
- Mock: Routing engine
- Test: create_document() - happy path
- Test: create_document() - routing failure
- Test: create_document() - storage failure
- Test: get_document() - found
- Test: get_document() - not found
- Test: Retention date calculation (default 90 days)
- Implement: `src/service/documents_service.py`
- [ ] **Test**: `tests/unit/service/test_threads_service.py`
- Test: create_thread() - links to document
- Test: create_thread() - document not found (404)
- Test: list_messages() - cursor pagination
- Test: add_message() - role validation
- Implement: `src/service/threads_service.py`
- [ ] **Test**: `tests/unit/service/test_exports_service.py`
- Test: create_export_job() - returns jobId
- Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
- Test: Async job enqueuing (mock Redis/ARQ)
- Implement: `src/service/exports_service.py`
**Acceptance Criteria**:
- Service layer has clear, testable interfaces
- All external dependencies are mocked
- Tests verify business logic, not infrastructure
- Each service method has both success and failure test cases
### 4.3 Adapter Contracts (Protocols)
**Goal**: Define adapter interfaces using Python Protocols for mockability.
**Tasks**:
- [ ] Create `src/adapters/protocols.py`:
- `StorageAdapter` protocol (save_encrypted_payload, get_encrypted_payload)
- `DatabaseAdapter` protocol (CRUD operations)
- `RoutingEngine` protocol (route_document)
- `JobQueue` protocol (enqueue, get_status)
- [ ] Create stub implementations for testing:
- `tests/fixtures/storage_stub.py` (in-memory storage)
- `tests/fixtures/db_stub.py` (in-memory DB)
**Acceptance Criteria**:
- Protocols are narrow and focused
- Test fixtures implement all protocols
- Production adapters can be swapped without changing service layer
---
## 5. Phase 3: Prototype Integration (Week 3)
### 5.1 Comparative Analysis
**Goal**: Identify best implementations across prototypes.
**Tasks**:
- [ ] Analyze `prototype-chatgpt5/src/app/adapters/`:
- auth.py - OAuth2 implementation quality
- db.py - SQLAlchemy async patterns
- storage.py - S3 streaming approach
- routing.py - Routing logic structure
- [ ] Analyze `prototype-geminiNbt3pro/`:
- Identify unique features or better implementations
- [ ] Analyze `prototype-grok4.1/`:
- Compare test coverage and patterns
- [ ] Document findings in: `docs/development/prototype-analysis.md`
- Table: Feature vs Prototype vs Recommendation
- Rationale for selections
**Acceptance Criteria**:
- Clear decision on which implementation to use for each component
- Documented rationale for selections
- Identified any missing features across all prototypes
### 5.2 Core Adapter Implementation
**Goal**: Implement production adapters based on best prototype code.
**Tasks**:
- [ ] **Database Adapter** (`src/adapters/db.py`):
- Port SQLAlchemy models from chosen prototype
- Implement async session management
- Add connection pooling configuration
- Write integration tests: `tests/integration/test_db_adapter.py`
- [ ] **Storage Adapter** (`src/adapters/storage.py`):
- Implement S3 client using aiobotocore
- Add streaming upload/download (no in-memory buffering)
- Mock S3 in tests using moto or similar
- Write tests: `tests/integration/test_storage_adapter.py`
- [ ] **Routing Engine** (`src/adapters/routing.py`):
- Port routing logic from prototype
- Make routing rules configurable (not hardcoded)
- Add caching layer (Redis) for routing rules
- Write tests: `tests/unit/adapters/test_routing.py`
- [ ] **Authentication** (`src/adapters/auth.py`):
- Implement JWT validation
- Add JWKS caching
- Create FastAPI dependency for auth
- Write tests: `tests/unit/adapters/test_auth.py`
**Acceptance Criteria**:
- All adapters follow Protocol contracts
- Integration tests use real dependencies (testcontainers)
- Unit tests use mocks
- Streaming works for large files (>50MB)
### 5.3 API Layer Implementation
**Goal**: Build FastAPI routes with OpenAPI compliance.
**Tasks**:
- [ ] **Documents API** (`src/api/documents.py`):
- POST /documents - implement with streaming upload
- GET /documents/{id} - implement with ETag support
- Add request validation (Pydantic)
- Write integration tests: `tests/integration/test_documents_api.py`
- [ ] **Threads API** (`src/api/threads.py`):
- POST /documents/{id}/threads
- GET /threads/{id}/messages (cursor pagination)
- POST /threads/{id}/messages
- Write integration tests: `tests/integration/test_threads_api.py`
- [ ] **Exports API** (`src/api/exports.py`):
- POST /exports (async job creation)
- GET /exports/{jobId} (status polling)
- Write integration tests: `tests/integration/test_exports_api.py`
- [ ] **Main App** (`src/main.py`):
- Configure FastAPI with CORS, middleware
- Include all routers
- Add exception handlers
- Add health check endpoint: GET /health
**Acceptance Criteria**:
- OpenAPI schema matches `docs/api/openapi.yaml`
- All endpoints have auth middleware
- Integration tests achieve >80% coverage
- API responses match documented format
### 5.4 Background Workers
**Goal**: Implement async export worker.
**Tasks**:
- [ ] Choose task queue: ARQ (Redis-based, async) or Celery
- [ ] Implement `src/workers/exports_worker.py`:
- Fetch document from storage
- Fetch message history from DB
- Generate export package (PDF + metadata)
- Update job status
- [ ] Write worker tests: `tests/unit/workers/test_exports_worker.py`
- [ ] Document worker deployment in: `docs/development/worker-deployment.md`
**Acceptance Criteria**:
- Worker processes export jobs independently
- Failures are logged and the job is marked as FAILED
- Worker can be scaled horizontally
---
## 6. Phase 4: CI/CD & Quality Gates (Week 4)
### 6.1 GitHub Actions Workflows
**Goal**: Automate testing and quality checks.
**Tasks**:
- [ ] Create `.github/workflows/test.yml`:
- Run on: push, pull_request
- Matrix: Python 3.11, 3.12
- Steps: Install deps, run pytest, upload coverage
- [ ] Create `.github/workflows/lint.yml`:
- Run ruff linting
- Run mypy type checking
- Check code formatting
- [ ] Create `.github/workflows/integration.yml`:
- Spin up PostgreSQL, Redis via services
- Run integration tests with real dependencies
- [ ] Add status badges to README.md
**Acceptance Criteria**:
- All workflows pass on main branch
- Pull requests blocked if tests fail
- Coverage report available in PR comments
### 6.2 Pre-commit Hooks
**Goal**: Catch issues before commit.
**Tasks**:
- [ ] Create `.pre-commit-config.yaml`:
- ruff linting and formatting
- mypy type checking
- trailing whitespace removal
- YAML validation
- [ ] Document setup in: `docs/development/setup-guide.md`
**Acceptance Criteria**:
- Hooks auto-format code
- Hooks prevent commits with type errors
- Setup documented for new developers
### 6.3 Test Coverage Requirements
**Goal**: Enforce quality thresholds.
**Tasks**:
- [ ] Configure pytest-cov in `pytest.ini`:
- Minimum coverage: 80%
- Exclude: tests/, scripts/
- [ ] Add coverage badge to README.md
- [ ] Document coverage exemptions (e.g., `# pragma: no cover`)
**Acceptance Criteria**:
- `pytest --cov` fails if <80% coverage
- Coverage report generated in HTML format
- Uncovered lines are intentional and documented
---
## 7. Phase 5: Agentic Coding Enablement (Week 5)
### 7.1 Agentic Coding Guide
**Goal**: Create comprehensive guide for LLM-driven development.
**Tasks**:
- [ ] Create `docs/development/agentic-coding-guide.md`:
- TDD workflow for Claude/GPT
- Example prompts for generating tests
- How to use Protocol adapters for mocking
- Async testing patterns
- Common pitfalls (GIL, blocking operations)
- [ ] Add example prompt templates:
- "Write async pytest for POST /documents with ProcessPoolExecutor mock"
- "Implement cursor pagination for messages following ADR-003"
- [ ] Update CLAUDE.md with agentic coding patterns
**Acceptance Criteria**:
- Guide includes concrete examples
- Prompts reference specific ADRs
- Guide covers both unit and integration test generation
### 7.2 Testing Utilities & Fixtures
**Goal**: Provide reusable test infrastructure.
**Tasks**:
- [ ] Create `tests/fixtures/factories.py`:
- DocumentMetadataFactory (faker-based)
- ThreadFactory
- MessageFactory
- [ ] Create `tests/fixtures/db_fixtures.py`:
- @pytest.fixture for async DB session
- @pytest.fixture for testcontainers Postgres
- [ ] Create `tests/fixtures/auth_fixtures.py`:
- Mock JWT tokens with different scopes
- Mock JWKS endpoints
- [ ] Document in: `docs/development/testing-utilities.md`
**Acceptance Criteria**:
- Fixtures reduce boilerplate in tests
- Factories generate realistic test data
- Documentation shows usage examples
### 7.3 ADR for Agentic Development
**Goal**: Document the TDD + AI approach as an architectural decision.
**Tasks**:
- [ ] Create `docs/architecture/adr/0008-agentic-tdd-workflow.md`:
- Context: LLM-driven development velocity vs. quality
- Decision: Interface-first TDD with AI assistance
- Rationale: Tests serve as executable specification
- Implementation: Workflow, tooling, prompts
**Acceptance Criteria**:
- ADR approved by team
- Links to agentic-coding-guide.md
- Referenced in CLAUDE.md
---
## 8. Phase 6: Migration & Validation (Week 6)
### 8.1 Prototype Deprecation
**Goal**: Mark prototypes as archived.
**Tasks**:
- [ ] Add README.md to each prototype directory:
- Status: ARCHIVED
- Reason: Consolidated into main codebase
- Date: 2025-12-XX
- [ ] Document migration decisions in: `docs/development/prototype-migration.md`
- [ ] Keep prototypes in repo for reference (don't delete)
**Acceptance Criteria**:
- Clear indication that prototypes are not maintained
- Migration rationale documented
### 8.2 End-to-End Validation
**Goal**: Verify complete system integration.
**Tasks**:
- [ ] Write E2E test: `tests/e2e/test_full_workflow.py`:
- Citizen uploads document
- Document is routed
- Thread is created
- Messages are exchanged
- Export is generated
- [ ] Run against local environment (Docker Compose)
- [ ] Measure performance against NFRs:
- Document upload + routing: <500ms
- Message retrieval: <300ms
- [ ] Document E2E setup in: `docs/development/e2e-testing.md`
**Acceptance Criteria**:
- E2E test passes consistently
- Performance targets met
- E2E environment reproducible via Docker Compose
### 8.3 Documentation Review
**Goal**: Ensure all documentation is accurate and complete.
**Tasks**:
- [ ] Review all ADRs for consistency
- [ ] Update CLAUDE.md with final structure
- [ ] Review API documentation against implementation
- [ ] Spell check and grammar check all docs
- [ ] Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)
**Acceptance Criteria**:
- No broken links in documentation
- Code examples in docs are tested
- CLAUDE.md accurately reflects current state
---
## 9. Success Criteria (Overall)
### Functional Requirements
- ✅ Main codebase has all features from best prototype
- ✅ All core APIs implemented and tested
- ✅ Background worker functional
### Quality Requirements
- ✅ Test coverage >80%
- ✅ Zero mypy type errors
- ✅ All linting rules pass
- ✅ CI/CD pipeline green
### Documentation Requirements
- ✅ ADRs in individual files with consistent structure
- ✅ All major decisions documented
- ✅ Agentic coding guide comprehensive
- ✅ CLAUDE.md accurate and complete
### Performance Requirements
- ✅ Document routing <500ms (measured in E2E tests)
- ✅ Message retrieval <300ms (measured in E2E tests)
- ✅ Large file upload streaming works (>50MB test)
### Process Requirements
- ✅ TDD workflow established and documented
- ✅ Pre-commit hooks prevent quality issues
- ✅ GitHub Actions enforce quality gates
- ✅ Agentic development patterns proven with at least 3 features
---
## 10. Risk Mitigation
### Risk 1: Prototype Integration Conflicts
**Mitigation**: Complete comparative analysis (Phase 3.1) before implementation. Document decision rationale.
### Risk 2: TDD Slowing Initial Progress
**Mitigation**: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.
### Risk 3: Incomplete ADR Extraction
**Mitigation**: Use checklist approach. Review original `decisions.md` multiple times. Cross-reference with implementation guide.
### Risk 4: Agentic Coding Learning Curve
**Mitigation**: Create example-driven guide. Include actual prompts that worked. Pair with human for first few features.
### Risk 5: Performance Targets Not Met
**Mitigation**: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.
---
## 11. Next Steps
1. **Review this workplan** with the team
2. **Adjust timeline** based on team capacity
3. **Start Phase 1** (Foundation Setup)
4. **Daily standup** to track progress and blockers
5. **Weekly retrospective** to improve agentic coding workflow
---
## Appendix A: Recommended Tools
- **Testing**: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
- **Mocking**: respx (HTTP), moto (AWS), testcontainers-python (real services)
- **Linting**: ruff (fast, replaces flake8 + isort + pyupgrade)
- **Type Checking**: mypy with strict mode
- **Factories**: factory_boy or custom Pydantic factories
- **Performance**: py-spy (profiling), locust (load testing)
- **Pre-commit**: pre-commit framework
- **CI/CD**: GitHub Actions (free for public repos)
---
## Appendix B: ADR Numbering Convention
- **0001-0099**: Core Architecture (payload model, auth, concurrency)
- **0100-0199**: Data & Persistence (pagination, retention, schema)
- **0200-0299**: API & Integration (REST structure, export workflow)
- **0300-0399**: Development Process (TDD, agentic coding)
- **0400+**: Future decisions
---
**Document Owner**: Backend Engineering Team
**Last Updated**: 2025-12-01
**Status**: Ready for Review