# Workplan: Main Codebase Integration & TDD Setup

**Status:** Draft v1.0
**Date:** 2025-12-01
**Goal:** Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.

---

## 1. Current State Assessment

### Existing Assets

- **3 Prototype Implementations**: chatgpt5 (most complete), geminiNbt3pro, grok4.1
- **Documentation**:
  - Single `decisions.md` file with all ADRs (needs restructuring)
  - `implementation_guide.md` (comprehensive)
  - `api_docs.md` (scenario-based)
  - `openapi.yaml` (API spec)
- **Architecture**: Clean Architecture pattern established in chatgpt5 prototype

### Gaps

- No unified main codebase
- ADRs not in individual files (not easily referenceable)
- No TDD test suite defining core interfaces
- No CI/CD pipeline configuration
- Prototypes have overlapping implementations without consolidation

---

## 2. Target State Definition

### Main Codebase Structure

```
/
├── docs/
│   ├── architecture/
│   │   ├── adr/                 # Individual ADR files
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-auth.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   └── development/
│       ├── testing-strategy.md
│       └── agentic-coding-guide.md
├── src/
│   ├── domain/                  # Pure business logic (TDD tested)
│   ├── adapters/                # External integrations (mocked in tests)
│   ├── service/                 # Application services (TDD tested)
│   ├── api/                     # FastAPI routes (integration tested)
│   └── workers/                 # Background jobs
├── tests/
│   ├── unit/                    # TDD unit tests
│   ├── integration/             # Integration tests
│   └── fixtures/                # Test data and mocks
├── scripts/
│   └── init_db.py
├── pyproject.toml               # Dependencies & tool config
├── pytest.ini                   # Test configuration
├── .github/
│   └── workflows/               # CI/CD pipelines
└── CLAUDE.md                    # AI coding assistant guidance
```

### TDD Approach

- **Interface-first**: Define Pydantic models and service interfaces via tests
- **Red-Green-Refactor**: Write failing test → implement → refactor
- **Test pyramid**: Many unit tests, fewer integration tests, minimal e2e tests
- **Async testing**: Use pytest-asyncio for async operations

### ADR Documentation Standards

- **Format**: Follow Markdown Any Decision Records (MADR) template
- **Naming**: `NNNN-title-with-dashes.md` (e.g., `0001-split-payload-model.md`)
- **Location**: `/docs/architecture/adr/`
- **Template**:

  ```markdown
  # ADR-NNNN: [Title]

  **Status:** [Accepted|Proposed|Deprecated|Superseded]
  **Date:** YYYY-MM-DD
  **Deciders:** [Team/Role]

  ## Context and Problem Statement

  [What is the issue we're addressing?]

  ## Decision Drivers

  * [Driver 1]
  * [Driver 2]

  ## Considered Options

  * Option 1
  * Option 2

  ## Decision Outcome

  Chosen option: [option], because [rationale].

  ### Positive Consequences

  * [Consequence 1]

  ### Negative Consequences

  * [Consequence 1]

  ## Implementation Notes

  [Specific technical guidance]
  ```

---

## 3. Phase 1: Foundation Setup (Week 1)

### 3.1 ADR Restructuring

**Goal**: Extract individual ADRs from `decisions.md` into separate files.
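
The extraction could be partly scripted. The following is a minimal sketch, not a definitive tool: it assumes each ADR in `decisions.md` starts with a heading of the form `## ADR-001: Title`, which this workplan does not confirm, and the helper names `adr_filename` and `split_adrs` are hypothetical. It derives file names from the heading text using the `NNNN-title-with-dashes.md` convention.

```python
import re

# Assumption: every ADR section begins with a heading like
# "## ADR-001: Split-Payload Model". Adjust the pattern to the real file.
ADR_HEADING = re.compile(r"^##\s+ADR-(\d+):\s*(.+?)\s*$", re.MULTILINE)


def adr_filename(number: int, title: str) -> str:
    """Build an `NNNN-title-with-dashes.md` file name from a number and title."""
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


def split_adrs(text: str) -> dict[str, str]:
    """Map each target file name to the text of its ADR section."""
    matches = list(ADR_HEADING.finditer(text))
    sections: dict[str, str] = {}
    for index, match in enumerate(matches):
        # A section runs from its heading to the next ADR heading (or EOF).
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        name = adr_filename(int(match.group(1)), match.group(2))
        sections[name] = text[match.start():end].strip() + "\n"
    return sections
```

File names generated this way follow the heading text, so they would still need renaming by hand wherever the task list specifies a different title (for example `0003-cursor-based-pagination.md` for the ADR on pagination).
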

**Tasks**:
- [ ] Create `/docs/architecture/adr/` directory
- [ ] Create ADR template file: `docs/architecture/adr/0000-template.md`
- [ ] Extract ADR-001 (Split-Payload Model) → `0001-split-payload-model.md`
- [ ] Extract ADR-002 (Stateless Auth) → `0002-stateless-authentication.md`
- [ ] Extract ADR-003 (Pagination) → `0003-cursor-based-pagination.md`
- [ ] Extract ADR-004 (Async Exports) → `0004-async-export-workflow.md`
- [ ] Extract ADR-005 (Resource Naming) → `0005-rest-resource-structure.md`
- [ ] Extract ADR-006 (Data Retention) → `0006-gdpr-retention-model.md`
- [ ] Extract ADR-007 (Python & ProcessPoolExecutor) → `0007-hybrid-concurrency-pattern.md`
- [ ] Create index file: `docs/architecture/adr/README.md` with ADR list and links
- [ ] Update `decisions.md` with deprecation notice and redirect to ADR directory

**Acceptance Criteria**:
- Each ADR is a standalone markdown file
- ADRs follow consistent template structure
- Index file allows easy navigation
- CLAUDE.md updated to reference new ADR location

### 3.2 Main Codebase Scaffolding

**Goal**: Create production-ready directory structure with tooling.

**Tasks**:
- [ ] Create `/src/` directory with Clean Architecture structure
- [ ] Initialize `pyproject.toml` with dependencies from prototype-chatgpt5
- [ ] Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
- [ ] Create `pytest.ini` with async and coverage configuration
- [ ] Create `.gitignore` (Python, IDE, env files)
- [ ] Create `tests/` structure: unit/, integration/, fixtures/
- [ ] Create `scripts/` directory
- [ ] Set up pre-commit hooks configuration (ruff, mypy)

**Acceptance Criteria**:
- Directory structure matches target state
- `pip install -e ".[dev]"` works
- `pytest` runs (even with 0 tests)
- Type checking with `mypy src/` works

### 3.3 Documentation Organization

**Goal**: Restructure documentation for clarity and AI assistant consumption.

**Tasks**:
- [ ] Create `/docs/architecture/` directory
- [ ] Create `/docs/api/` directory
- [ ] Create `/docs/development/` directory
- [ ] Move `openapi.yaml` → `docs/api/openapi.yaml`
- [ ] Move `api_docs.md` → `docs/api/api-scenarios.md`
- [ ] Refactor `implementation_guide.md` → split into:
  - `docs/architecture/design-patterns.md` (architectural patterns)
  - `docs/development/testing-strategy.md` (TDD approach)
  - `docs/development/agentic-coding-guide.md` (LLM-specific guidance)
- [ ] Create `docs/architecture/architecture-overview.md` (high-level system design)
- [ ] Update CLAUDE.md with new documentation structure

**Acceptance Criteria**:
- Documentation is logically organized by concern
- Each document has a single, clear purpose
- CLAUDE.md points to correct locations

---

## 4. Phase 2: TDD Interface Definition (Week 2)

### 4.1 Domain Models (TDD)

**Goal**: Define core domain models with comprehensive test coverage.

**Approach**: Write tests first, then implement models.

**Tasks**:
- [ ] **Test**: `tests/unit/domain/test_document_metadata.py`
  - Test: Valid metadata creation
  - Test: Invalid authorityId (empty, too long)
  - Test: Invalid referenceNumber format
  - Test: Future issuedAt date validation
  - Implement: `src/domain/models.py` → `DocumentMetadata`
- [ ] **Test**: `tests/unit/domain/test_thread_models.py`
  - Test: ThreadType enum values
  - Test: SenderRole enum values
  - Test: ThreadCreateRequest validation
  - Test: Message model with encrypted content
  - Implement: Thread-related models
- [ ] **Test**: `tests/unit/domain/test_document_envelope.py`
  - Test: Split payload structure
  - Test: encryptedPayload validation (base64, size limits)
  - Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
  - Implement: DocumentEnvelope and status management

**Acceptance Criteria**:
- All domain models have >90% test coverage
- Pydantic validation catches invalid inputs
- Tests run in <1 second
- Zero mypy type errors

### 4.2 Service Layer Interfaces (TDD)

**Goal**: Define service contracts via protocol/ABC classes with tests.

**Tasks**:
- [ ] **Test**: `tests/unit/service/test_documents_service.py`
  - Mock: Database adapter
  - Mock: Storage adapter (S3)
  - Mock: Routing engine
  - Test: create_document() - happy path
  - Test: create_document() - routing failure
  - Test: create_document() - storage failure
  - Test: get_document() - found
  - Test: get_document() - not found
  - Test: Retention date calculation (default 90 days)
  - Implement: `src/service/documents_service.py`
- [ ] **Test**: `tests/unit/service/test_threads_service.py`
  - Test: create_thread() - links to document
  - Test: create_thread() - document not found (404)
  - Test: list_messages() - cursor pagination
  - Test: add_message() - role validation
  - Implement: `src/service/threads_service.py`
- [ ] **Test**: `tests/unit/service/test_exports_service.py`
  - Test: create_export_job() - returns jobId
  - Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
  - Test: Async job enqueuing (mock Redis/ARQ)
  - Implement: `src/service/exports_service.py`

**Acceptance Criteria**:
- Service layer has clear, testable interfaces
- All external dependencies are mocked
- Tests verify business logic, not infrastructure
- Each service method has both success and failure test cases

### 4.3 Adapter Contracts (Protocols)

**Goal**: Define adapter interfaces using Python Protocols for mockability.
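
To make this goal concrete, here is a rough sketch of what a Protocol-based contract and an in-memory test stub might look like. The method names `save_encrypted_payload` and `get_encrypted_payload` come from this workplan; the parameter names, return types, the `memory://` reference format, and the `store_and_reload` helper are assumptions made purely for illustration.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageAdapter(Protocol):
    """Storage contract. Signatures here are assumed, not taken from a prototype."""

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str: ...

    async def get_encrypted_payload(self, document_id: str) -> bytes: ...


class InMemoryStorage:
    """Test stub that satisfies StorageAdapter structurally (no inheritance)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str:
        self._blobs[document_id] = payload
        return f"memory://{document_id}"  # hypothetical storage reference

    async def get_encrypted_payload(self, document_id: str) -> bytes:
        return self._blobs[document_id]


async def store_and_reload(storage: StorageAdapter, document_id: str, payload: bytes) -> bytes:
    """Hypothetical service-level helper that depends only on the protocol."""
    await storage.save_encrypted_payload(document_id, payload)
    return await storage.get_encrypted_payload(document_id)


if __name__ == "__main__":
    print(asyncio.run(store_and_reload(InMemoryStorage(), "doc-1", b"ciphertext")))
```

Because `Protocol` uses structural typing, the stub never imports or subclasses the production adapter. mypy can verify that any object passed as `storage` matches the contract, which is what would allow a production S3 adapter to be swapped in without touching the service layer.
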

**Tasks**:
- [ ] Create `src/adapters/protocols.py`:
  - `StorageAdapter` protocol (save_encrypted_payload, get_encrypted_payload)
  - `DatabaseAdapter` protocol (CRUD operations)
  - `RoutingEngine` protocol (route_document)
  - `JobQueue` protocol (enqueue, get_status)
- [ ] Create stub implementations for testing:
  - `tests/fixtures/storage_stub.py` (in-memory storage)
  - `tests/fixtures/db_stub.py` (in-memory DB)

**Acceptance Criteria**:
- Protocols are narrow and focused
- Test fixtures implement all protocols
- Production adapters can be swapped without changing service layer

---

## 5. Phase 3: Prototype Integration (Week 3)

### 5.1 Comparative Analysis

**Goal**: Identify best implementations across prototypes.

**Tasks**:
- [ ] Analyze `prototype-chatgpt5/src/app/adapters/`:
  - auth.py - OAuth2 implementation quality
  - db.py - SQLAlchemy async patterns
  - storage.py - S3 streaming approach
  - routing.py - Routing logic structure
- [ ] Analyze `prototype-geminiNbt3pro/`:
  - Identify unique features or better implementations
- [ ] Analyze `prototype-grok4.1/`:
  - Compare test coverage and patterns
- [ ] Document findings in: `docs/development/prototype-analysis.md`
  - Table: Feature vs Prototype vs Recommendation
  - Rationale for selections

**Acceptance Criteria**:
- Clear decision on which implementation to use for each component
- Documented rationale for selections
- Identified any missing features across all prototypes

### 5.2 Core Adapter Implementation

**Goal**: Implement production adapters based on best prototype code.

**Tasks**:
- [ ] **Database Adapter** (`src/adapters/db.py`):
  - Port SQLAlchemy models from chosen prototype
  - Implement async session management
  - Add connection pooling configuration
  - Write integration tests: `tests/integration/test_db_adapter.py`
- [ ] **Storage Adapter** (`src/adapters/storage.py`):
  - Implement S3 client using aiobotocore
  - Add streaming upload/download (no in-memory buffering)
  - Mock S3 in tests using moto or similar
  - Write tests: `tests/integration/test_storage_adapter.py`
- [ ] **Routing Engine** (`src/adapters/routing.py`):
  - Port routing logic from prototype
  - Make routing rules configurable (not hardcoded)
  - Add caching layer (Redis) for routing rules
  - Write tests: `tests/unit/adapters/test_routing.py`
- [ ] **Authentication** (`src/adapters/auth.py`):
  - Implement JWT validation
  - Add JWKS caching
  - Create FastAPI dependency for auth
  - Write tests: `tests/unit/adapters/test_auth.py`

**Acceptance Criteria**:
- All adapters follow Protocol contracts
- Integration tests use real dependencies (testcontainers)
- Unit tests use mocks
- Streaming works for large files (>50MB)

### 5.3 API Layer Implementation

**Goal**: Build FastAPI routes with OpenAPI compliance.
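
One part of this layer that may benefit from a small illustration is the cursor pagination planned for message listing (ADR-003). The ADR's actual cursor format is not reproduced in this workplan, so the sketch below is only one possible shape: an opaque, URL-safe base64 cursor wrapping the last seen message id. The `Message` dataclass, the function names, and the `last_id` field are all hypothetical.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    id: int
    body: str


def encode_cursor(last_id: int) -> str:
    """Wrap the last seen id in an opaque, URL-safe token."""
    raw = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the last seen id from a token made by encode_cursor."""
    return int(json.loads(base64.urlsafe_b64decode(cursor.encode()))["last_id"])


def list_messages(
    messages: list[Message], cursor: Optional[str] = None, limit: int = 2
) -> tuple[list[Message], Optional[str]]:
    """Return one page plus the cursor for the next page (None when done).

    Assumes `messages` is already sorted by ascending id.
    """
    after = decode_cursor(cursor) if cursor is not None else None
    remaining = [m for m in messages if after is None or m.id > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    next_cursor = encode_cursor(page[-1].id) if has_more else None
    return page, next_cursor
```

In a real adapter the filter would presumably become a `WHERE id > :after ORDER BY id LIMIT :limit + 1` style query, so that the extra row signals whether another page exists. Keeping the cursor opaque lets the encoding change later without breaking clients.
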

**Tasks**:
- [ ] **Documents API** (`src/api/documents.py`):
  - POST /documents - implement with streaming upload
  - GET /documents/{id} - implement with ETag support
  - Add request validation (Pydantic)
  - Write integration tests: `tests/integration/test_documents_api.py`
- [ ] **Threads API** (`src/api/threads.py`):
  - POST /documents/{id}/threads
  - GET /threads/{id}/messages (cursor pagination)
  - POST /threads/{id}/messages
  - Write integration tests: `tests/integration/test_threads_api.py`
- [ ] **Exports API** (`src/api/exports.py`):
  - POST /exports (async job creation)
  - GET /exports/{jobId} (status polling)
  - Write integration tests: `tests/integration/test_exports_api.py`
- [ ] **Main App** (`src/main.py`):
  - Configure FastAPI with CORS, middleware
  - Include all routers
  - Add exception handlers
  - Add health check endpoint: GET /health

**Acceptance Criteria**:
- OpenAPI schema matches `docs/api/openapi.yaml`
- All endpoints have auth middleware
- Integration tests achieve >80% coverage
- API responses match documented format

### 5.4 Background Workers

**Goal**: Implement async export worker.

**Tasks**:
- [ ] Choose task queue: ARQ (Redis-based, async) or Celery
- [ ] Implement `src/workers/exports_worker.py`:
  - Fetch document from storage
  - Fetch message history from DB
  - Generate export package (PDF + metadata)
  - Update job status
- [ ] Write worker tests: `tests/unit/workers/test_exports_worker.py`
- [ ] Document worker deployment in: `docs/development/worker-deployment.md`

**Acceptance Criteria**:
- Worker processes export jobs independently
- Failures are logged and job marked as FAILED
- Worker can be scaled horizontally

---

## 6. Phase 4: CI/CD & Quality Gates (Week 4)

### 6.1 GitHub Actions Workflows

**Goal**: Automate testing and quality checks.

**Tasks**:
- [ ] Create `.github/workflows/test.yml`:
  - Run on: push, pull_request
  - Matrix: Python 3.11, 3.12
  - Steps: Install deps, run pytest, upload coverage
- [ ] Create `.github/workflows/lint.yml`:
  - Run ruff linting
  - Run mypy type checking
  - Check code formatting
- [ ] Create `.github/workflows/integration.yml`:
  - Spin up PostgreSQL, Redis via services
  - Run integration tests with real dependencies
- [ ] Add status badges to README.md

**Acceptance Criteria**:
- All workflows pass on main branch
- Pull requests blocked if tests fail
- Coverage report available in PR comments

### 6.2 Pre-commit Hooks

**Goal**: Catch issues before commit.

**Tasks**:
- [ ] Create `.pre-commit-config.yaml`:
  - ruff linting and formatting
  - mypy type checking
  - trailing whitespace removal
  - YAML validation
- [ ] Document setup in: `docs/development/setup-guide.md`

**Acceptance Criteria**:
- Hooks auto-format code
- Hooks prevent commits with type errors
- Setup documented for new developers

### 6.3 Test Coverage Requirements

**Goal**: Enforce quality thresholds.

**Tasks**:
- [ ] Configure pytest-cov in `pytest.ini`:
  - Minimum coverage: 80%
  - Exclude: tests/, scripts/
- [ ] Add coverage badge to README.md
- [ ] Document coverage exemptions (e.g., `# pragma: no cover`)

**Acceptance Criteria**:
- `pytest --cov` fails if <80% coverage
- Coverage report generated in HTML format
- Uncovered lines are intentional and documented

---

## 7. Phase 5: Agentic Coding Enablement (Week 5)

### 7.1 Agentic Coding Guide

**Goal**: Create comprehensive guide for LLM-driven development.

**Tasks**:
- [ ] Create `docs/development/agentic-coding-guide.md`:
  - TDD workflow for Claude/GPT
  - Example prompts for generating tests
  - How to use Protocol adapters for mocking
  - Async testing patterns
  - Common pitfalls (GIL, blocking operations)
- [ ] Add example prompt templates:
  - "Write async pytest for POST /documents with ProcessPoolExecutor mock"
  - "Implement cursor pagination for messages following ADR-003"
- [ ] Update CLAUDE.md with agentic coding patterns

**Acceptance Criteria**:
- Guide includes concrete examples
- Prompts reference specific ADRs
- Guide covers both unit and integration test generation

### 7.2 Testing Utilities & Fixtures

**Goal**: Provide reusable test infrastructure.

**Tasks**:
- [ ] Create `tests/fixtures/factories.py`:
  - DocumentMetadataFactory (faker-based)
  - ThreadFactory
  - MessageFactory
- [ ] Create `tests/fixtures/db_fixtures.py`:
  - @pytest.fixture for async DB session
  - @pytest.fixture for testcontainers Postgres
- [ ] Create `tests/fixtures/auth_fixtures.py`:
  - Mock JWT tokens with different scopes
  - Mock JWKS endpoints
- [ ] Document in: `docs/development/testing-utilities.md`

**Acceptance Criteria**:
- Fixtures reduce boilerplate in tests
- Factories generate realistic test data
- Documentation shows usage examples

### 7.3 ADR for Agentic Development

**Goal**: Document the TDD + AI approach as architectural decision.

**Tasks**:
- [ ] Create `docs/architecture/adr/0008-agentic-tdd-workflow.md`:
  - Context: LLM-driven development velocity vs. quality
  - Decision: Interface-first TDD with AI assistance
  - Rationale: Tests serve as executable specification
  - Implementation: Workflow, tooling, prompts

**Acceptance Criteria**:
- ADR approved by team
- Links to agentic-coding-guide.md
- Referenced in CLAUDE.md

---

## 8. Phase 6: Migration & Validation (Week 6)

### 8.1 Prototype Deprecation

**Goal**: Mark prototypes as archived.

**Tasks**:
- [ ] Add README.md to each prototype directory:
  - Status: ARCHIVED
  - Reason: Consolidated into main codebase
  - Date: 2025-12-XX
- [ ] Document migration decisions in: `docs/development/prototype-migration.md`
- [ ] Keep prototypes in repo for reference (don't delete)

**Acceptance Criteria**:
- Clear indication that prototypes are not maintained
- Migration rationale documented

### 8.2 End-to-End Validation

**Goal**: Verify complete system integration.

**Tasks**:
- [ ] Write E2E test: `tests/e2e/test_full_workflow.py`:
  - Citizen uploads document
  - Document is routed
  - Thread is created
  - Messages are exchanged
  - Export is generated
- [ ] Run against local environment (Docker Compose)
- [ ] Measure performance against NFRs:
  - Document upload + routing: <500ms
  - Message retrieval: <300ms
- [ ] Document E2E setup in: `docs/development/e2e-testing.md`

**Acceptance Criteria**:
- E2E test passes consistently
- Performance targets met
- E2E environment reproducible via Docker Compose

### 8.3 Documentation Review

**Goal**: Ensure all documentation is accurate and complete.

**Tasks**:
- [ ] Review all ADRs for consistency
- [ ] Update CLAUDE.md with final structure
- [ ] Review API documentation against implementation
- [ ] Spell check and grammar check all docs
- [ ] Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)

**Acceptance Criteria**:
- No broken links in documentation
- Code examples in docs are tested
- CLAUDE.md accurately reflects current state

---

## 9. Success Criteria (Overall)

### Functional Requirements

- ✅ Main codebase has all features from best prototype
- ✅ All core APIs implemented and tested
- ✅ Background worker functional

### Quality Requirements

- ✅ Test coverage >80%
- ✅ Zero mypy type errors
- ✅ All linting rules pass
- ✅ CI/CD pipeline green

### Documentation Requirements

- ✅ ADRs in individual files with consistent structure
- ✅ All major decisions documented
- ✅ Agentic coding guide comprehensive
- ✅ CLAUDE.md accurate and complete

### Performance Requirements

- ✅ Document routing <500ms (measured in E2E tests)
- ✅ Message retrieval <300ms (measured in E2E tests)
- ✅ Large file upload streaming works (>50MB test)

### Process Requirements

- ✅ TDD workflow established and documented
- ✅ Pre-commit hooks prevent quality issues
- ✅ GitHub Actions enforce quality gates
- ✅ Agentic development patterns proven with at least 3 features

---

## 10. Risk Mitigation

### Risk 1: Prototype Integration Conflicts

**Mitigation**: Complete comparative analysis (Phase 3.1) before implementation. Document decision rationale.

### Risk 2: TDD Slowing Initial Progress

**Mitigation**: Front-load interface definition (Phase 2). Once interfaces stable, implementation accelerates.

### Risk 3: Incomplete ADR Extraction

**Mitigation**: Use checklist approach. Review original `decisions.md` multiple times. Cross-reference with implementation guide.

### Risk 4: Agentic Coding Learning Curve

**Mitigation**: Create example-driven guide. Include actual prompts that worked. Pair with human for first few features.

### Risk 5: Performance Targets Not Met

**Mitigation**: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.

---

## 11. Next Steps

1. **Review this workplan** with the team
2. **Adjust timeline** based on team capacity
3. **Start Phase 1** (Foundation Setup)
4. **Daily standup** to track progress and blockers
5. **Weekly retrospective** to improve agentic coding workflow

---

## Appendix A: Recommended Tools

- **Testing**: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
- **Mocking**: respx (HTTP), moto (AWS), testcontainers-python (real services)
- **Linting**: ruff (fast, replaces flake8 + isort + pyupgrade)
- **Type Checking**: mypy with strict mode
- **Factories**: factory_boy or custom Pydantic factories
- **Performance**: py-spy (profiling), locust (load testing)
- **Pre-commit**: pre-commit framework
- **CI/CD**: GitHub Actions (free for public repos)

---

## Appendix B: ADR Numbering Convention

- **0001-0099**: Core Architecture (payload model, auth, concurrency)
- **0100-0199**: Data & Persistence (pagination, retention, schema)
- **0200-0299**: API & Integration (REST structure, export workflow)
- **0300-0399**: Development Process (TDD, agentic coding)
- **0400+**: Future decisions

---

**Document Owner**: Backend Engineering Team
**Last Updated**: 2025-12-01
**Status**: Ready for Review