Workplan: Main Codebase Integration & TDD Setup
Status: Draft v1.0
Date: 2025-12-01
Goal: Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.
1. Current State Assessment
Existing Assets
- 3 Prototype Implementations: chatgpt5 (most complete), geminiNbt3pro, grok4.1
- Documentation:
  - Single decisions.md file with all ADRs (needs restructuring)
  - implementation_guide.md (comprehensive)
  - api_docs.md (scenario-based)
  - openapi.yaml (API spec)
- Architecture: Clean Architecture pattern established in chatgpt5 prototype
Gaps
- No unified main codebase
- ADRs not in individual files (not easily referenceable)
- No TDD test suite defining core interfaces
- No CI/CD pipeline configuration
- Prototypes have overlapping implementations without consolidation
2. Target State Definition
Main Codebase Structure
/
├── docs/
│ ├── architecture/
│ │ ├── adr/ # Individual ADR files
│ │ │ ├── 0001-split-payload-model.md
│ │ │ ├── 0002-stateless-authentication.md
│ │ │ └── ...
│ │ ├── architecture-overview.md
│ │ └── design-patterns.md
│ ├── api/
│ │ ├── openapi.yaml
│ │ └── api-scenarios.md
│ └── development/
│ ├── testing-strategy.md
│ └── agentic-coding-guide.md
├── src/
│ ├── domain/ # Pure business logic (TDD tested)
│ ├── adapters/ # External integrations (mocked in tests)
│ ├── service/ # Application services (TDD tested)
│ ├── api/ # FastAPI routes (integration tested)
│ └── workers/ # Background jobs
├── tests/
│ ├── unit/ # TDD unit tests
│ ├── integration/ # Integration tests
│ └── fixtures/ # Test data and mocks
├── scripts/
│ └── init_db.py
├── pyproject.toml # Dependencies & tool config
├── pytest.ini # Test configuration
├── .github/
│ └── workflows/ # CI/CD pipelines
└── CLAUDE.md # AI coding assistant guidance
TDD Approach
- Interface-first: Define Pydantic models and service interfaces via tests
- Red-Green-Refactor: Write failing test → implement → refactor
- Test pyramid: Many unit tests, fewer integration tests, minimal e2e tests
- Async testing: Use pytest-asyncio for async operations
ADR Documentation Standards
- Format: Follow Markdown Any Decision Records (MADR) template
- Naming: NNNN-title-with-dashes.md (e.g., 0001-split-payload-model.md)
- Location: /docs/architecture/adr/
- Template:

    # ADR-NNNN: [Title]

    **Status:** [Accepted|Proposed|Deprecated|Superseded]
    **Date:** YYYY-MM-DD
    **Deciders:** [Team/Role]

    ## Context and Problem Statement
    [What is the issue we're addressing?]

    ## Decision Drivers
    * [Driver 1]
    * [Driver 2]

    ## Considered Options
    * Option 1
    * Option 2

    ## Decision Outcome
    Chosen option: [option], because [rationale].

    ### Positive Consequences
    * [Consequence 1]

    ### Negative Consequences
    * [Consequence 1]

    ## Implementation Notes
    [Specific technical guidance]
3. Phase 1: Foundation Setup (Week 1)
3.1 ADR Restructuring
Goal: Extract individual ADRs from decisions.md into separate files.
Tasks:
- Create /docs/architecture/adr/ directory
- Create ADR template file: docs/architecture/adr/0000-template.md
- Extract ADR-001 (Split-Payload Model) → 0001-split-payload-model.md
- Extract ADR-002 (Stateless Auth) → 0002-stateless-authentication.md
- Extract ADR-003 (Pagination) → 0003-cursor-based-pagination.md
- Extract ADR-004 (Async Exports) → 0004-async-export-workflow.md
- Extract ADR-005 (Resource Naming) → 0005-rest-resource-structure.md
- Extract ADR-006 (Data Retention) → 0006-gdpr-retention-model.md
- Extract ADR-007 (Python & ProcessPoolExecutor) → 0007-hybrid-concurrency-pattern.md
- Create index file: docs/architecture/adr/README.md with ADR list and links
- Update decisions.md with deprecation notice and redirect to ADR directory
Acceptance Criteria:
- Each ADR is a standalone markdown file
- ADRs follow consistent template structure
- Index file allows easy navigation
- CLAUDE.md updated to reference new ADR location
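The naming convention and the index file lend themselves to a small helper script. The following is a minimal sketch, not part of the plan itself: `ADR_NAME` and `build_index` are hypothetical names, and the regular expression is one assumed reading of the NNNN-title-with-dashes.md rule.

```python
import re

# Assumed pattern for the NNNN-title-with-dashes.md convention.
ADR_NAME = re.compile(r"^(\d{4})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def build_index(filenames: list[str]) -> str:
    """Return a Markdown list linking every validly named ADR file."""
    lines = []
    for name in sorted(filenames):
        match = ADR_NAME.match(name)
        if match is None or match.group(1) == "0000":
            continue  # skip the template, README.md, and misnamed files
        number, slug = match.groups()
        title = slug.replace("-", " ").capitalize()
        lines.append(f"- [ADR-{number}: {title}]({name})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_index([
        "0002-stateless-authentication.md",
        "0000-template.md",
        "README.md",
        "0001-split-payload-model.md",
    ]))
```

Deriving titles from the slug keeps the sketch short; a real script would more likely read the first heading of each file.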
3.2 Main Codebase Scaffolding
Goal: Create production-ready directory structure with tooling.
Tasks:
- Create /src/ directory with Clean Architecture structure
- Initialize pyproject.toml with dependencies from prototype-chatgpt5
- Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
- Create pytest.ini with async and coverage configuration
- Create .gitignore (Python, IDE, env files)
- Create tests/ structure: unit/, integration/, fixtures/
- Create scripts/ directory
- Set up pre-commit hooks configuration (ruff, mypy)
Acceptance Criteria:
- Directory structure matches target state
- pip install -e ".[dev]" works
- pytest runs (even with 0 tests; note that pytest exits with code 5 when no tests are collected)
- Type checking with mypy src/ works
3.3 Documentation Organization
Goal: Restructure documentation for clarity and AI assistant consumption.
Tasks:
- Create /docs/architecture/ directory
- Create /docs/api/ directory
- Create /docs/development/ directory
- Move openapi.yaml → docs/api/openapi.yaml
- Move api_docs.md → docs/api/api-scenarios.md
- Refactor implementation_guide.md → split into:
  - docs/architecture/design-patterns.md (architectural patterns)
  - docs/development/testing-strategy.md (TDD approach)
  - docs/development/agentic-coding-guide.md (LLM-specific guidance)
- Create docs/architecture/architecture-overview.md (high-level system design)
- Update CLAUDE.md with new documentation structure
Acceptance Criteria:
- Documentation is logically organized by concern
- Each document has a single, clear purpose
- CLAUDE.md points to correct locations
4. Phase 2: TDD Interface Definition (Week 2)
4.1 Domain Models (TDD)
Goal: Define core domain models with comprehensive test coverage.
Approach: Write tests first, then implement models.
Tasks:
- Test: tests/unit/domain/test_document_metadata.py
  - Test: Valid metadata creation
  - Test: Invalid authorityId (empty, too long)
  - Test: Invalid referenceNumber format
  - Test: Future issuedAt date validation
  - Implement: src/domain/models.py → DocumentMetadata
- Test: tests/unit/domain/test_thread_models.py
  - Test: ThreadType enum values
  - Test: SenderRole enum values
  - Test: ThreadCreateRequest validation
  - Test: Message model with encrypted content
  - Implement: Thread-related models
- Test: tests/unit/domain/test_document_envelope.py
  - Test: Split payload structure
  - Test: encryptedPayload validation (base64, size limits)
  - Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
  - Implement: DocumentEnvelope and status management
Acceptance Criteria:
- All domain models have >90% test coverage
- Pydantic validation catches invalid inputs
- Tests run in <1 second
- Zero mypy type errors
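To make the test-first approach concrete, here is a minimal sketch of the status-transition rule together with the tests that would drive it. It uses only the standard library so that it runs anywhere; the plan itself calls for Pydantic models. `DocumentStatus`, `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `transition` are hypothetical names, and the strictly linear lifecycle is an assumption read from the RECEIVED → ROUTED → ASSIGNED → CLOSED sequence.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Assumption: the lifecycle is strictly linear, with no skipping or reopening.
ALLOWED_TRANSITIONS = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}


class InvalidTransition(ValueError):
    """Raised when a status change is not allowed."""


def transition(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return the new status, or raise if the step is not permitted."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target


# In a TDD flow these tests are written first and fail until transition() exists.
def test_happy_path_walks_the_full_lifecycle():
    status = DocumentStatus.RECEIVED
    for nxt in (DocumentStatus.ROUTED, DocumentStatus.ASSIGNED, DocumentStatus.CLOSED):
        status = transition(status, nxt)
    assert status is DocumentStatus.CLOSED


def test_skipping_a_step_is_rejected():
    try:
        transition(DocumentStatus.RECEIVED, DocumentStatus.CLOSED)
    except InvalidTransition:
        return
    raise AssertionError("expected InvalidTransition")
```

Keeping the allowed transitions in a plain mapping makes the rule easy to test exhaustively and easy to change if the lifecycle later gains branches.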
4.2 Service Layer Interfaces (TDD)
Goal: Define service contracts via protocol/ABC classes with tests.
Tasks:
- Test: tests/unit/service/test_documents_service.py
  - Mock: Database adapter
  - Mock: Storage adapter (S3)
  - Mock: Routing engine
  - Test: create_document() - happy path
  - Test: create_document() - routing failure
  - Test: create_document() - storage failure
  - Test: get_document() - found
  - Test: get_document() - not found
  - Test: Retention date calculation (default 90 days)
  - Implement: src/service/documents_service.py
- Test: tests/unit/service/test_threads_service.py
  - Test: create_thread() - links to document
  - Test: create_thread() - document not found (404)
  - Test: list_messages() - cursor pagination
  - Test: add_message() - role validation
  - Implement: src/service/threads_service.py
- Test: tests/unit/service/test_exports_service.py
  - Test: create_export_job() - returns jobId
  - Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
  - Test: Async job enqueuing (mock Redis/ARQ)
  - Implement: src/service/exports_service.py
Acceptance Criteria:
- Service layer has clear, testable interfaces
- All external dependencies are mocked
- Tests verify business logic, not infrastructure
- Each service method has both success and failure test cases
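A service test with every dependency mocked might look like the sketch below. The `DocumentsService` shape, its constructor arguments, the `insert` method on the database adapter, and the order of the routing and storage calls are all assumptions made for illustration. `asyncio.run` stands in for pytest-asyncio so the example needs only the standard library.

```python
import asyncio
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock

DEFAULT_RETENTION_DAYS = 90  # default named in the task list above


class DocumentsService:
    """Hypothetical service shape; the three arguments are adapter objects."""

    def __init__(self, db, storage, router):
        self._db, self._storage, self._router = db, storage, router

    async def create_document(self, doc_id: str, payload: bytes, now: datetime) -> dict:
        # Assumed order: route first, then store the payload, then persist metadata.
        team = await self._router.route_document(doc_id)
        await self._storage.save_encrypted_payload(doc_id, payload)
        record = {
            "id": doc_id,
            "assignedTo": team,
            "retainUntil": now + timedelta(days=DEFAULT_RETENTION_DAYS),
        }
        await self._db.insert(record)
        return record


def test_create_document_happy_path():
    db, storage, router = AsyncMock(), AsyncMock(), AsyncMock()
    router.route_document.return_value = "team-a"
    now = datetime(2025, 12, 1, tzinfo=timezone.utc)
    service = DocumentsService(db, storage, router)
    record = asyncio.run(service.create_document("d1", b"ciphertext", now))
    assert record["assignedTo"] == "team-a"
    assert record["retainUntil"] - now == timedelta(days=90)
    storage.save_encrypted_payload.assert_awaited_once_with("d1", b"ciphertext")
    db.insert.assert_awaited_once()


def test_create_document_storage_failure_skips_db_write():
    db, storage, router = AsyncMock(), AsyncMock(), AsyncMock()
    storage.save_encrypted_payload.side_effect = RuntimeError("storage unavailable")
    service = DocumentsService(db, storage, router)
    now = datetime(2025, 12, 1, tzinfo=timezone.utc)
    try:
        asyncio.run(service.create_document("d1", b"ciphertext", now))
    except RuntimeError:
        db.insert.assert_not_awaited()
        return
    raise AssertionError("expected the storage error to propagate")
```

Passing `now` in explicitly keeps the retention calculation deterministic, which is what lets the test assert an exact 90-day difference.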
4.3 Adapter Contracts (Protocols)
Goal: Define adapter interfaces using Python Protocols for mockability.
Tasks:
- Create src/adapters/protocols.py:
  - StorageAdapter protocol (save_encrypted_payload, get_encrypted_payload)
  - DatabaseAdapter protocol (CRUD operations)
  - RoutingEngine protocol (route_document)
  - JobQueue protocol (enqueue, get_status)
- Create stub implementations for testing:
  - tests/fixtures/storage_stub.py (in-memory storage)
  - tests/fixtures/db_stub.py (in-memory DB)
Acceptance Criteria:
- Protocols are narrow and focused
- Test fixtures implement all protocols
- Production adapters can be swapped without changing service layer
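As an illustration of how a Protocol keeps adapters swappable, the sketch below defines the storage contract and an in-memory stub that satisfies it structurally. The method names come from the task list; the parameter names (`key`, `payload`) and the `InMemoryStorage` class name are assumptions.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageAdapter(Protocol):
    """Contract the service layer depends on; no S3 details leak through."""

    async def save_encrypted_payload(self, key: str, payload: bytes) -> None: ...

    async def get_encrypted_payload(self, key: str) -> bytes: ...


class InMemoryStorage:
    """Test stub; matches StorageAdapter by shape, without inheriting from it."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    async def save_encrypted_payload(self, key: str, payload: bytes) -> None:
        self._blobs[key] = payload

    async def get_encrypted_payload(self, key: str) -> bytes:
        return self._blobs[key]


async def roundtrip(storage: StorageAdapter, key: str, payload: bytes) -> bytes:
    """Works with any object that fulfils the protocol, stub or production."""
    await storage.save_encrypted_payload(key, payload)
    return await storage.get_encrypted_payload(key)


if __name__ == "__main__":
    print(asyncio.run(roundtrip(InMemoryStorage(), "doc-1", b"ciphertext")))
```

Note that `runtime_checkable` only verifies that the methods exist; signature conformance is left to mypy, which fits the plan's type-checking requirement.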
5. Phase 3: Prototype Integration (Week 3)
5.1 Comparative Analysis
Goal: Identify best implementations across prototypes.
Tasks:
- Analyze prototype-chatgpt5/src/app/adapters/:
  - auth.py - OAuth2 implementation quality
  - db.py - SQLAlchemy async patterns
  - storage.py - S3 streaming approach
  - routing.py - Routing logic structure
- Analyze prototype-geminiNbt3pro/:
  - Identify unique features or better implementations
- Analyze prototype-grok4.1/:
  - Compare test coverage and patterns
- Document findings in: docs/development/prototype-analysis.md
  - Table: Feature vs Prototype vs Recommendation
  - Rationale for selections
Acceptance Criteria:
- Clear decision on which implementation to use for each component
- Documented rationale for selections
- Identified any missing features across all prototypes
5.2 Core Adapter Implementation
Goal: Implement production adapters based on best prototype code.
Tasks:
- Database Adapter (src/adapters/db.py):
  - Port SQLAlchemy models from chosen prototype
  - Implement async session management
  - Add connection pooling configuration
  - Write integration tests: tests/integration/test_db_adapter.py
- Storage Adapter (src/adapters/storage.py):
  - Implement S3 client using aiobotocore
  - Add streaming upload/download (no in-memory buffering)
  - Mock S3 in tests using moto or similar
  - Write tests: tests/integration/test_storage_adapter.py
- Routing Engine (src/adapters/routing.py):
  - Port routing logic from prototype
  - Make routing rules configurable (not hardcoded)
  - Add caching layer (Redis) for routing rules
  - Write tests: tests/unit/adapters/test_routing.py
- Authentication (src/adapters/auth.py):
  - Implement JWT validation
  - Add JWKS caching
  - Create FastAPI dependency for auth
  - Write tests: tests/unit/adapters/test_auth.py
Acceptance Criteria:
- All adapters follow Protocol contracts
- Integration tests use real dependencies (testcontainers)
- Unit tests use mocks
- Streaming works for large files (>50MB)
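The "no in-memory buffering" requirement boils down to forwarding each chunk as it arrives rather than collecting the whole payload. The sketch below shows that shape with standard-library code only; it does not call aiobotocore. `iter_chunks`, `stream_upload`, the `write_part` callback, and the 1 MiB chunk size are all assumptions for illustration.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

CHUNK_SIZE = 1024 * 1024  # 1 MiB; an assumed value, tune per deployment


async def iter_chunks(data: bytes, size: int = CHUNK_SIZE) -> AsyncIterator[bytes]:
    """Stand-in for a request body stream: yields the payload piece by piece."""
    for start in range(0, len(data), size):
        yield data[start:start + size]
        await asyncio.sleep(0)  # give other tasks a turn between chunks


async def stream_upload(
    source: AsyncIterator[bytes],
    write_part: Callable[[bytes], Awaitable[None]],
) -> int:
    """Forward each chunk to the sink as it arrives; return total bytes sent."""
    total = 0
    async for chunk in source:
        await write_part(chunk)  # only one chunk is held in memory at a time
        total += len(chunk)
    return total


async def demo() -> tuple[int, list[int]]:
    sizes: list[int] = []

    async def sink(chunk: bytes) -> None:
        sizes.append(len(chunk))

    total = await stream_upload(iter_chunks(b"a" * 2500, size=1000), sink)
    return total, sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a production adapter, `write_part` would wrap whatever upload call the chosen S3 client provides. S3 multipart uploads require every part except the last to be at least 5 MiB, so real chunk sizes would need to respect that limit.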
5.3 API Layer Implementation
Goal: Build FastAPI routes with OpenAPI compliance.
Tasks:
- Documents API (src/api/documents.py):
  - POST /documents - implement with streaming upload
  - GET /documents/{id} - implement with ETag support
  - Add request validation (Pydantic)
  - Write integration tests: tests/integration/test_documents_api.py
- Threads API (src/api/threads.py):
  - POST /documents/{id}/threads
  - GET /threads/{id}/messages (cursor pagination)
  - POST /threads/{id}/messages
  - Write integration tests: tests/integration/test_threads_api.py
- Exports API (src/api/exports.py):
  - POST /exports (async job creation)
  - GET /exports/{jobId} (status polling)
  - Write integration tests: tests/integration/test_exports_api.py
- Main App (src/main.py):
  - Configure FastAPI with CORS, middleware
  - Include all routers
  - Add exception handlers
  - Add health check endpoint: GET /health
Acceptance Criteria:
- OpenAPI schema matches docs/api/openapi.yaml
- All endpoints have auth middleware
- Integration tests achieve >80% coverage
- API responses match documented format
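For the cursor-paginated messages endpoint, one common approach is an opaque cursor that encodes the last item the client has seen. The sketch below is an assumption about how that could work, not the design recorded in ADR-003: the cursor format, the `nextCursor` field name, and the integer message ids are all hypothetical.

```python
import base64
import json
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Pack the last seen id into an opaque, URL-safe token."""
    raw = json.dumps({"after": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the last seen id from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["after"]


def list_messages(messages: list[dict], cursor: Optional[str] = None, limit: int = 2) -> dict:
    """Return one page; messages must be sorted by ascending id."""
    after = decode_cursor(cursor) if cursor else 0
    remaining = [m for m in messages if m["id"] > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"items": page, "nextCursor": next_cursor}


if __name__ == "__main__":
    data = [{"id": i} for i in range(1, 6)]
    first = list_messages(data)
    second = list_messages(data, first["nextCursor"])
    print(first["items"], second["items"])
```

Unlike offset pagination, a cursor keyed on the last id stays stable when new messages arrive between requests, which matters for an active thread.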
5.4 Background Workers
Goal: Implement async export worker.
Tasks:
- Choose task queue: ARQ (Redis-based, async) or Celery
- Implement src/workers/exports_worker.py:
  - Fetch document from storage
  - Fetch message history from DB
  - Generate export package (PDF + metadata)
  - Update job status
- Write worker tests: tests/unit/workers/test_exports_worker.py
- Document worker deployment in: docs/development/worker-deployment.md
Acceptance Criteria:
- Worker processes export jobs independently
- Failures are logged and job marked as FAILED
- Worker can be scaled horizontally
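The failure-handling criterion can be sketched independently of the queue library that ends up being chosen. In the example below, the job states match those listed for the exports service tests; `run_export_job`, the dict-shaped job record, and the `build_package` coroutine are hypothetical.

```python
import asyncio
import logging
from enum import Enum

logger = logging.getLogger("exports_worker")


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


async def run_export_job(job: dict, build_package) -> dict:
    """Run one job; build_package is a hypothetical coroutine doing the real work."""
    job["status"] = JobStatus.RUNNING
    try:
        job["result"] = await build_package(job["documentId"])
        job["status"] = JobStatus.COMPLETED
    except Exception:
        # Log with traceback and record the failure instead of crashing the worker.
        logger.exception("export job %s failed", job["id"])
        job["status"] = JobStatus.FAILED
    return job


async def fake_build(document_id: str) -> str:
    """Stand-in for fetching the document and messages and building the package."""
    return f"package-for-{document_id}"


async def failing_build(document_id: str) -> str:
    raise RuntimeError("storage unavailable")


if __name__ == "__main__":
    job = {"id": "j1", "documentId": "d1", "status": JobStatus.QUEUED}
    print(asyncio.run(run_export_job(job, fake_build))["status"])
```

Because each call touches only its own job record, several workers can run this function in parallel, which is the property horizontal scaling relies on.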
6. Phase 4: CI/CD & Quality Gates (Week 4)
6.1 GitHub Actions Workflows
Goal: Automate testing and quality checks.
Tasks:
- Create .github/workflows/test.yml:
  - Run on: push, pull_request
  - Matrix: Python 3.11, 3.12
  - Steps: Install deps, run pytest, upload coverage
- Create .github/workflows/lint.yml:
  - Run ruff linting
  - Run mypy type checking
  - Check code formatting
- Create .github/workflows/integration.yml:
  - Spin up PostgreSQL, Redis via services
  - Run integration tests with real dependencies
- Add status badges to README.md
Acceptance Criteria:
- All workflows pass on main branch
- Pull requests blocked if tests fail
- Coverage report available in PR comments
6.2 Pre-commit Hooks
Goal: Catch issues before commit.
Tasks:
- Create .pre-commit-config.yaml:
  - ruff linting and formatting
  - mypy type checking
  - trailing whitespace removal
  - YAML validation
- Document setup in: docs/development/setup-guide.md
Acceptance Criteria:
- Hooks auto-format code
- Hooks prevent commits with type errors
- Setup documented for new developers
6.3 Test Coverage Requirements
Goal: Enforce quality thresholds.
Tasks:
- Configure pytest-cov in pytest.ini:
  - Minimum coverage: 80% (for example via pytest-cov's --cov-fail-under=80 option)
  - Exclude: tests/, scripts/
- Add coverage badge to README.md
- Document coverage exemptions (e.g., # pragma: no cover)
Acceptance Criteria:
- pytest --cov fails if coverage is <80%
- Coverage report generated in HTML format
- Uncovered lines are intentional and documented
7. Phase 5: Agentic Coding Enablement (Week 5)
7.1 Agentic Coding Guide
Goal: Create comprehensive guide for LLM-driven development.
Tasks:
- Create docs/development/agentic-coding-guide.md:
  - TDD workflow for Claude/GPT
  - Example prompts for generating tests
  - How to use Protocol adapters for mocking
  - Async testing patterns
  - Common pitfalls (GIL, blocking operations)
- Add example prompt templates:
  - "Write async pytest for POST /documents with ProcessPoolExecutor mock"
  - "Implement cursor pagination for messages following ADR-003"
- Update CLAUDE.md with agentic coding patterns
Acceptance Criteria:
- Guide includes concrete examples
- Prompts reference specific ADRs
- Guide covers both unit and integration test generation
7.2 Testing Utilities & Fixtures
Goal: Provide reusable test infrastructure.
Tasks:
- Create tests/fixtures/factories.py:
  - DocumentMetadataFactory (faker-based)
  - ThreadFactory
  - MessageFactory
- Create tests/fixtures/db_fixtures.py:
  - @pytest.fixture for async DB session
  - @pytest.fixture for testcontainers Postgres
- Create tests/fixtures/auth_fixtures.py:
  - Mock JWT tokens with different scopes
  - Mock JWKS endpoints
- Document in: docs/development/testing-utilities.md
Acceptance Criteria:
- Fixtures reduce boilerplate in tests
- Factories generate realistic test data
- Documentation shows usage examples
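A factory in this spirit can be as small as a function that fills in defaults and accepts overrides. The sketch below avoids faker and factory_boy so that it runs with the standard library alone. The `DocumentMetadata` dataclass is a stand-in for the real Pydantic model, its field names follow the Phase 2 test list, and the `REF-000001` reference format is an assumption.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timezone

_sequence = itertools.count(1)  # gives every generated object a unique suffix


@dataclass
class DocumentMetadata:
    """Stand-in for the real model; camelCase mirrors the API field names."""

    authorityId: str
    referenceNumber: str
    issuedAt: datetime


def make_document_metadata(**overrides) -> DocumentMetadata:
    """Build valid metadata; pass keyword arguments to override single fields."""
    n = next(_sequence)
    defaults = {
        "authorityId": f"authority-{n}",
        "referenceNumber": f"REF-{n:06d}",  # format is an assumption
        "issuedAt": datetime(2025, 1, 1, tzinfo=timezone.utc),
    }
    defaults.update(overrides)
    return DocumentMetadata(**defaults)


if __name__ == "__main__":
    print(make_document_metadata())
    print(make_document_metadata(authorityId=""))  # handy for invalid-input tests
```

Tests then state only the field they care about, for example an empty authorityId, and leave the rest to the factory.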
7.3 ADR for Agentic Development
Goal: Document the TDD + AI approach as architectural decision.
Tasks:
- Create docs/architecture/adr/0008-agentic-tdd-workflow.md:
  - Context: LLM-driven development velocity vs. quality
  - Decision: Interface-first TDD with AI assistance
  - Rationale: Tests serve as executable specification
  - Implementation: Workflow, tooling, prompts
Acceptance Criteria:
- ADR approved by team
- Links to agentic-coding-guide.md
- Referenced in CLAUDE.md
8. Phase 6: Migration & Validation (Week 6)
8.1 Prototype Deprecation
Goal: Mark prototypes as archived.
Tasks:
- Add README.md to each prototype directory:
  - Status: ARCHIVED
  - Reason: Consolidated into main codebase
  - Date: 2025-12-XX
- Document migration decisions in: docs/development/prototype-migration.md
- Keep prototypes in repo for reference (don't delete)
Acceptance Criteria:
- Clear indication that prototypes are not maintained
- Migration rationale documented
8.2 End-to-End Validation
Goal: Verify complete system integration.
Tasks:
- Write E2E test: tests/e2e/test_full_workflow.py:
  - Citizen uploads document
  - Document is routed
  - Thread is created
  - Messages are exchanged
  - Export is generated
- Run against local environment (Docker Compose)
- Measure performance against NFRs:
  - Document upload + routing: <500ms
  - Message retrieval: <300ms
- Document E2E setup in: docs/development/e2e-testing.md
Acceptance Criteria:
- E2E test passes consistently
- Performance targets met
- E2E environment reproducible via Docker Compose
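Checking the latency targets inside an E2E test could be done with a small timing helper such as the one sketched below. The budgets come from the NFR list above; the helper names, the nearest-rank quantile method, and the choice of the 95th percentile as the default are assumptions, since the plan does not say which statistic the targets refer to.

```python
import math
import time
from contextlib import contextmanager

# Budgets in milliseconds, taken from the NFR list above.
BUDGET_MS = {"upload_and_route": 500, "message_retrieval": 300}


@contextmanager
def timed(samples: list):
    """Append the elapsed wall-clock time of the with-block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        samples.append((time.perf_counter() - start) * 1000)


def within_budget(samples: list, budget_ms: float, quantile: float = 0.95) -> bool:
    """Compare the chosen quantile (nearest-rank method) with the budget."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1] < budget_ms


if __name__ == "__main__":
    samples: list = []
    for _ in range(5):
        with timed(samples):
            sum(range(10_000))  # placeholder for a real request
    print(within_budget(samples, BUDGET_MS["message_retrieval"]))
```

Judging a quantile over several runs rather than a single request makes the check less sensitive to one-off spikes on a local Docker Compose setup.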
8.3 Documentation Review
Goal: Ensure all documentation is accurate and complete.
Tasks:
- Review all ADRs for consistency
- Update CLAUDE.md with final structure
- Review API documentation against implementation
- Spell check and grammar check all docs
- Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)
Acceptance Criteria:
- No broken links in documentation
- Code examples in docs are tested
- CLAUDE.md accurately reflects current state
9. Success Criteria (Overall)
Functional Requirements
- ✅ Main codebase has all features from best prototype
- ✅ All core APIs implemented and tested
- ✅ Background worker functional
Quality Requirements
- ✅ Test coverage >80%
- ✅ Zero mypy type errors
- ✅ All linting rules pass
- ✅ CI/CD pipeline green
Documentation Requirements
- ✅ ADRs in individual files with consistent structure
- ✅ All major decisions documented
- ✅ Agentic coding guide comprehensive
- ✅ CLAUDE.md accurate and complete
Performance Requirements
- ✅ Document upload + routing <500ms (measured in E2E tests)
- ✅ Message retrieval <300ms (measured in E2E tests)
- ✅ Large file upload streaming works (>50MB test)
Process Requirements
- ✅ TDD workflow established and documented
- ✅ Pre-commit hooks prevent quality issues
- ✅ GitHub Actions enforce quality gates
- ✅ Agentic development patterns proven with at least 3 features
10. Risk Mitigation
Risk 1: Prototype Integration Conflicts
Mitigation: Complete the comparative analysis (Phase 3, Section 5.1) before implementation. Document decision rationale.
Risk 2: TDD Slowing Initial Progress
Mitigation: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.
Risk 3: Incomplete ADR Extraction
Mitigation: Use checklist approach. Review original decisions.md multiple times. Cross-reference with implementation guide.
Risk 4: Agentic Coding Learning Curve
Mitigation: Create an example-driven guide. Include actual prompts that worked. Pair with a human reviewer for the first few features.
Risk 5: Performance Targets Not Met
Mitigation: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.
11. Next Steps
- Review this workplan with the team
- Adjust timeline based on team capacity
- Start Phase 1 (Foundation Setup)
- Daily standup to track progress and blockers
- Weekly retrospective to improve agentic coding workflow
Appendix A: Recommended Tools
- Testing: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
- Mocking: respx (HTTP), moto (AWS), testcontainers-python (real services)
- Linting: ruff (fast, replaces flake8 + isort + pyupgrade)
- Type Checking: mypy with strict mode
- Factories: factory_boy or custom Pydantic factories
- Performance: py-spy (profiling), locust (load testing)
- Pre-commit: pre-commit framework
- CI/CD: GitHub Actions (free for public repos)
Appendix B: ADR Numbering Convention
- 0001-0099: Core Architecture (payload model, auth, concurrency)
- 0100-0199: Data & Persistence (pagination, retention, schema)
- 0200-0299: API & Integration (REST structure, export workflow)
- 0300-0399: Development Process (TDD, agentic coding)
- 0400+: Future decisions
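The ranges above can be encoded as data so that tooling, such as an index generator or a lint check, can label an ADR by its number. This is a minimal sketch; `ADR_RANGES` and `adr_category` are hypothetical names, and treating 0000 as reserved for the template is an assumption based on the 0000-template.md file from Phase 1.

```python
# (first number, last number, category label), as listed in Appendix B.
ADR_RANGES = [
    (1, 99, "Core Architecture"),
    (100, 199, "Data & Persistence"),
    (200, 299, "API & Integration"),
    (300, 399, "Development Process"),
]


def adr_category(number: int) -> str:
    """Map an ADR number to its category under the numbering convention."""
    if number < 1:
        raise ValueError("0000 is reserved for the template; ADR numbers start at 0001")
    for low, high, label in ADR_RANGES:
        if low <= number <= high:
            return label
    return "Future decisions"  # 0400 and above


if __name__ == "__main__":
    for n in (7, 150, 250, 300, 400):
        print(f"{n:04d}: {adr_category(n)}")
```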
Document Owner: Backend Engineering Team
Last Updated: 2025-12-01
Status: Ready for Review