Workplan: Main Codebase Integration & TDD Setup

Status: Draft v1.0
Date: 2025-12-01
Goal: Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.


1. Current State Assessment

Existing Assets

  • 3 Prototype Implementations: chatgpt5 (most complete), geminiNbt3pro, grok4.1
  • Documentation:
    • Single decisions.md file with all ADRs (needs restructuring)
    • implementation_guide.md (comprehensive)
    • api_docs.md (scenario-based)
    • openapi.yaml (API spec)
  • Architecture: Clean Architecture pattern established in chatgpt5 prototype

Gaps

  • No unified main codebase
  • ADRs not in individual files (not easily referenceable)
  • No TDD test suite defining core interfaces
  • No CI/CD pipeline configuration
  • Prototypes have overlapping implementations without consolidation

2. Target State Definition

Main Codebase Structure

/
├── docs/
│   ├── architecture/
│   │   ├── adr/                    # Individual ADR files
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-authentication.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   └── development/
│       ├── testing-strategy.md
│       └── agentic-coding-guide.md
├── src/
│   ├── domain/          # Pure business logic (TDD tested)
│   ├── adapters/        # External integrations (mocked in tests)
│   ├── service/         # Application services (TDD tested)
│   ├── api/             # FastAPI routes (integration tested)
│   └── workers/         # Background jobs
├── tests/
│   ├── unit/            # TDD unit tests
│   ├── integration/     # Integration tests
│   └── fixtures/        # Test data and mocks
├── scripts/
│   └── init_db.py
├── pyproject.toml       # Dependencies & tool config
├── pytest.ini           # Test configuration
├── .github/
│   └── workflows/       # CI/CD pipelines
└── CLAUDE.md            # AI coding assistant guidance

TDD Approach

  • Interface-first: Define Pydantic models and service interfaces via tests
  • Red-Green-Refactor: Write failing test → implement → refactor
  • Test pyramid: Many unit tests, fewer integration tests, minimal e2e tests
  • Async testing: Use pytest-asyncio for async operations

ADR Documentation Standards

  • Format: Follow Markdown Any Decision Records (MADR) template
  • Naming: NNNN-title-with-dashes.md (e.g., 0001-split-payload-model.md)
  • Location: /docs/architecture/adr/
  • Template:
    # ADR-NNNN: [Title]
    
    **Status:** [Accepted|Proposed|Deprecated|Superseded]
    **Date:** YYYY-MM-DD
    **Deciders:** [Team/Role]
    
    ## Context and Problem Statement
    [What is the issue we're addressing?]
    
    ## Decision Drivers
    * [Driver 1]
    * [Driver 2]
    
    ## Considered Options
    * Option 1
    * Option 2
    
    ## Decision Outcome
    Chosen option: [option], because [rationale].
    
    ### Positive Consequences
    * [Consequence 1]
    
    ### Negative Consequences
    * [Consequence 1]
    
    ## Implementation Notes
    [Specific technical guidance]
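The naming rule above lends itself to a mechanical check, for example in a pre-commit hook or a docs test. The following is a minimal sketch, assuming slugs are lowercase ASCII words separated by single dashes; the helper name `is_valid_adr_filename` and the exact pattern are illustrative and not part of the workplan.

```python
import re

# Assumption: four digits, a dash, then a lowercase slug of dash-separated
# words, ending in ".md" (e.g. 0001-split-payload-model.md).
ADR_FILENAME = re.compile(r"\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*\.md")


def is_valid_adr_filename(name: str) -> bool:
    """Return True if `name` follows the NNNN-title-with-dashes.md convention."""
    return ADR_FILENAME.fullmatch(name) is not None
```

A check like this could run over every file in `docs/architecture/adr/` except `README.md`.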
    

3. Phase 1: Foundation Setup (Week 1)

3.1 ADR Restructuring

Goal: Extract individual ADRs from decisions.md into separate files.

Tasks:

  • Create /docs/architecture/adr/ directory
  • Create ADR template file: docs/architecture/adr/0000-template.md
  • Extract ADR-001 (Split-Payload Model) → 0001-split-payload-model.md
  • Extract ADR-002 (Stateless Auth) → 0002-stateless-authentication.md
  • Extract ADR-003 (Pagination) → 0003-cursor-based-pagination.md
  • Extract ADR-004 (Async Exports) → 0004-async-export-workflow.md
  • Extract ADR-005 (Resource Naming) → 0005-rest-resource-structure.md
  • Extract ADR-006 (Data Retention) → 0006-gdpr-retention-model.md
  • Extract ADR-007 (Python & ProcessPoolExecutor) → 0007-hybrid-concurrency-pattern.md
  • Create index file: docs/architecture/adr/README.md with ADR list and links
  • Update decisions.md with deprecation notice and redirect to ADR directory

Acceptance Criteria:

  • Each ADR is a standalone markdown file
  • ADRs follow consistent template structure
  • Index file allows easy navigation
  • CLAUDE.md updated to reference new ADR location

3.2 Main Codebase Scaffolding

Goal: Create production-ready directory structure with tooling.

Tasks:

  • Create /src/ directory with Clean Architecture structure
  • Initialize pyproject.toml with dependencies from prototype-chatgpt5
  • Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
  • Create pytest.ini with async and coverage configuration
  • Create .gitignore (Python, IDE, env files)
  • Create tests/ structure: unit/, integration/, fixtures/
  • Create scripts/ directory
  • Set up pre-commit hooks configuration (ruff, mypy)

Acceptance Criteria:

  • Directory structure matches target state
  • pip install -e ".[dev]" works
  • pytest runs (even with 0 tests)
  • Type checking with mypy src/ works

3.3 Documentation Organization

Goal: Restructure documentation for clarity and AI assistant consumption.

Tasks:

  • Create /docs/architecture/ directory
  • Create /docs/api/ directory
  • Create /docs/development/ directory
  • Move openapi.yaml → docs/api/openapi.yaml
  • Move api_docs.md → docs/api/api-scenarios.md
  • Refactor implementation_guide.md → split into:
    • docs/architecture/design-patterns.md (architectural patterns)
    • docs/development/testing-strategy.md (TDD approach)
    • docs/development/agentic-coding-guide.md (LLM-specific guidance)
  • Create docs/architecture/architecture-overview.md (high-level system design)
  • Update CLAUDE.md with new documentation structure

Acceptance Criteria:

  • Documentation is logically organized by concern
  • Each document has a single, clear purpose
  • CLAUDE.md points to correct locations

4. Phase 2: TDD Interface Definition (Week 2)

4.1 Domain Models (TDD)

Goal: Define core domain models with comprehensive test coverage.

Approach: Write tests first, then implement models.

Tasks:

  • Test: tests/unit/domain/test_document_metadata.py

    • Test: Valid metadata creation
    • Test: Invalid authorityId (empty, too long)
    • Test: Invalid referenceNumber format
    • Test: Future issuedAt date validation
    • Implement: src/domain/models.py → DocumentMetadata
  • Test: tests/unit/domain/test_thread_models.py

    • Test: ThreadType enum values
    • Test: SenderRole enum values
    • Test: ThreadCreateRequest validation
    • Test: Message model with encrypted content
    • Implement: Thread-related models
  • Test: tests/unit/domain/test_document_envelope.py

    • Test: Split payload structure
    • Test: encryptedPayload validation (base64, size limits)
    • Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
    • Implement: DocumentEnvelope and status management

Acceptance Criteria:

  • All domain models have >90% test coverage
  • Pydantic validation catches invalid inputs
  • Tests run in <1 second
  • Zero mypy type errors
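To make the envelope test cases concrete, here is a minimal sketch of the status lifecycle and the base64 payload check using only the standard library. It assumes the lifecycle is strictly linear (no skipping, no going back) and uses a hypothetical 50 MiB size limit; the names `transition`, `validate_encrypted_payload`, and `MAX_PAYLOAD_BYTES` are illustrative, and the production models are expected to be Pydantic classes rather than plain functions.

```python
import base64
import binascii
from enum import Enum


class DocumentStatus(str, Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Assumption: each status has exactly one legal successor; CLOSED is terminal.
ALLOWED_TRANSITIONS = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}

MAX_PAYLOAD_BYTES = 50 * 1024 * 1024  # hypothetical limit, not from the plan


def transition(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def validate_encrypted_payload(payload_b64: str, max_bytes: int = MAX_PAYLOAD_BYTES) -> bytes:
    """Decode a base64 payload strictly and enforce non-empty and size limits."""
    try:
        raw = base64.b64decode(payload_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError("encryptedPayload is not valid base64") from exc
    if not raw:
        raise ValueError("encryptedPayload is empty")
    if len(raw) > max_bytes:
        raise ValueError("encryptedPayload exceeds the size limit")
    return raw
```

Keeping the transition table as data makes each row a natural unit test, and the same table can later back a Pydantic validator.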

4.2 Service Layer Interfaces (TDD)

Goal: Define service contracts via protocol/ABC classes with tests.

Tasks:

  • Test: tests/unit/service/test_documents_service.py

    • Mock: Database adapter
    • Mock: Storage adapter (S3)
    • Mock: Routing engine
    • Test: create_document() - happy path
    • Test: create_document() - routing failure
    • Test: create_document() - storage failure
    • Test: get_document() - found
    • Test: get_document() - not found
    • Test: Retention date calculation (default 90 days)
    • Implement: src/service/documents_service.py
  • Test: tests/unit/service/test_threads_service.py

    • Test: create_thread() - links to document
    • Test: create_thread() - document not found (404)
    • Test: list_messages() - cursor pagination
    • Test: add_message() - role validation
    • Implement: src/service/threads_service.py
  • Test: tests/unit/service/test_exports_service.py

    • Test: create_export_job() - returns jobId
    • Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
    • Test: Async job enqueuing (mock Redis/ARQ)
    • Implement: src/service/exports_service.py

Acceptance Criteria:

  • Service layer has clear, testable interfaces
  • All external dependencies are mocked
  • Tests verify business logic, not infrastructure
  • Each service method has both success and failure test cases
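The service tests listed above might look roughly like the following sketch, which mocks all three collaborators with `unittest.mock.AsyncMock` and checks the 90-day retention default. The `DocumentsService` constructor, the `insert_document` method, and the record layout are assumptions made for illustration; the real signatures are to be defined by the tests written in this phase. The demo functions use `asyncio.run` so the block stays runnable without pytest-asyncio.

```python
import asyncio
from datetime import date, timedelta
from unittest.mock import AsyncMock

DEFAULT_RETENTION_DAYS = 90  # default named in the retention test case


class DocumentsService:
    """Hypothetical shape of the service; real signatures may differ."""

    def __init__(self, db, storage, router):
        self._db = db
        self._storage = storage
        self._router = router

    async def create_document(self, doc_id, metadata, payload, received_on):
        # Route first, then persist the payload, then write the DB record.
        department = await self._router.route_document(metadata)
        await self._storage.save_encrypted_payload(doc_id, payload)
        record = {
            "id": doc_id,
            "department": department,
            "retention_until": received_on + timedelta(days=DEFAULT_RETENTION_DAYS),
        }
        await self._db.insert_document(record)
        return record


async def demo_happy_path():
    db, storage, router = AsyncMock(), AsyncMock(), AsyncMock()
    router.route_document.return_value = "building-permits"  # made-up department
    service = DocumentsService(db, storage, router)
    record = await service.create_document(
        "doc-1", {"authorityId": "A1"}, b"\x00", date(2025, 12, 1)
    )
    storage.save_encrypted_payload.assert_awaited_once_with("doc-1", b"\x00")
    db.insert_document.assert_awaited_once_with(record)
    return record


async def demo_storage_failure():
    db, storage, router = AsyncMock(), AsyncMock(), AsyncMock()
    storage.save_encrypted_payload.side_effect = OSError("storage unavailable")
    service = DocumentsService(db, storage, router)
    try:
        await service.create_document("doc-2", {}, b"\x00", date(2025, 12, 1))
    except OSError:
        # No DB record may be written when the payload could not be stored.
        db.insert_document.assert_not_awaited()
        return True
    return False
```

Because the collaborators are injected, the same tests keep passing when the mocks are later replaced by the in-memory stubs from section 4.3.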

4.3 Adapter Contracts (Protocols)

Goal: Define adapter interfaces using Python Protocols for mockability.

Tasks:

  • Create src/adapters/protocols.py:

    • StorageAdapter protocol (save_encrypted_payload, get_encrypted_payload)
    • DatabaseAdapter protocol (CRUD operations)
    • RoutingEngine protocol (route_document)
    • JobQueue protocol (enqueue, get_status)
  • Create stub implementations for testing:

    • tests/fixtures/storage_stub.py (in-memory storage)
    • tests/fixtures/db_stub.py (in-memory DB)

Acceptance Criteria:

  • Protocols are narrow and focused
  • Test fixtures implement all protocols
  • Production adapters can be swapped without changing service layer
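As an illustration of the Protocol approach, the sketch below defines the `StorageAdapter` protocol with the two methods named above and an in-memory stub of the kind intended for `tests/fixtures/storage_stub.py`. The parameter names, return types, and the `LookupError` on a missing key are assumptions; only the method names come from this plan.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageAdapter(Protocol):
    """Narrow storage contract; parameter names are illustrative."""

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> None: ...

    async def get_encrypted_payload(self, document_id: str) -> bytes: ...


class InMemoryStorage:
    """Test stub that satisfies StorageAdapter structurally (no inheritance)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> None:
        self._blobs[document_id] = payload

    async def get_encrypted_payload(self, document_id: str) -> bytes:
        try:
            return self._blobs[document_id]
        except KeyError:
            raise LookupError(f"no payload stored for {document_id}") from None
```

Note that `isinstance` checks against a `runtime_checkable` protocol only verify that the methods exist, not their signatures; mypy performs the full structural check.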

5. Phase 3: Prototype Integration (Week 3)

5.1 Comparative Analysis

Goal: Identify best implementations across prototypes.

Tasks:

  • Analyze prototype-chatgpt5/src/app/adapters/:

    • auth.py - OAuth2 implementation quality
    • db.py - SQLAlchemy async patterns
    • storage.py - S3 streaming approach
    • routing.py - Routing logic structure
  • Analyze prototype-geminiNbt3pro/:

    • Identify unique features or better implementations
  • Analyze prototype-grok4.1/:

    • Compare test coverage and patterns
  • Document findings in: docs/development/prototype-analysis.md

    • Table: Feature vs Prototype vs Recommendation
    • Rationale for selections

Acceptance Criteria:

  • Clear decision on which implementation to use for each component
  • Documented rationale for selections
  • Identified any missing features across all prototypes

5.2 Core Adapter Implementation

Goal: Implement production adapters based on best prototype code.

Tasks:

  • Database Adapter (src/adapters/db.py):

    • Port SQLAlchemy models from chosen prototype
    • Implement async session management
    • Add connection pooling configuration
    • Write integration tests: tests/integration/test_db_adapter.py
  • Storage Adapter (src/adapters/storage.py):

    • Implement S3 client using aiobotocore
    • Add streaming upload/download (no in-memory buffering)
    • Mock S3 in tests using moto or similar
    • Write tests: tests/integration/test_storage_adapter.py
  • Routing Engine (src/adapters/routing.py):

    • Port routing logic from prototype
    • Make routing rules configurable (not hardcoded)
    • Add caching layer (Redis) for routing rules
    • Write tests: tests/unit/adapters/test_routing.py
  • Authentication (src/adapters/auth.py):

    • Implement JWT validation
    • Add JWKS caching
    • Create FastAPI dependency for auth
    • Write tests: tests/unit/adapters/test_auth.py

Acceptance Criteria:

  • All adapters follow Protocol contracts
  • Integration tests use real dependencies (testcontainers)
  • Unit tests use mocks
  • Streaming works for large files (>50MB)
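The "no in-memory buffering" requirement can be sketched independently of any S3 client: read the source in fixed-size chunks and hand each chunk to an upload callback. In the sketch below, `upload_part` is a hypothetical async callable standing in for whatever multipart-upload call the real adapter uses, and the 8 MiB chunk size is an arbitrary assumption.

```python
import asyncio
import io
from typing import AsyncIterator, Awaitable, BinaryIO, Callable

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical part size


async def iter_chunks(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> AsyncIterator[bytes]:
    """Yield the source in chunks so only one chunk is held in memory at a time."""
    while True:
        # read() blocks, so run it in a worker thread to keep the event loop free.
        chunk = await asyncio.to_thread(source.read, chunk_size)
        if not chunk:
            break
        yield chunk


async def upload_stream(
    source: BinaryIO,
    upload_part: Callable[[int, bytes], Awaitable[None]],
    chunk_size: int = CHUNK_SIZE,
) -> int:
    """Send each chunk through `upload_part` and return the total byte count."""
    total = 0
    part_number = 1  # 1-based part numbering, as multipart APIs commonly expect
    async for chunk in iter_chunks(source, chunk_size):
        await upload_part(part_number, chunk)
        total += len(chunk)
        part_number += 1
    return total
```

A unit test can pass an `io.BytesIO` source and a recording callback, while the >50MB acceptance check would use a real file and the real adapter.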

5.3 API Layer Implementation

Goal: Build FastAPI routes with OpenAPI compliance.

Tasks:

  • Documents API (src/api/documents.py):

    • POST /documents - implement with streaming upload
    • GET /documents/{id} - implement with ETag support
    • Add request validation (Pydantic)
    • Write integration tests: tests/integration/test_documents_api.py
  • Threads API (src/api/threads.py):

    • POST /documents/{id}/threads
    • GET /threads/{id}/messages (cursor pagination)
    • POST /threads/{id}/messages
    • Write integration tests: tests/integration/test_threads_api.py
  • Exports API (src/api/exports.py):

    • POST /exports (async job creation)
    • GET /exports/{jobId} (status polling)
    • Write integration tests: tests/integration/test_exports_api.py
  • Main App (src/main.py):

    • Configure FastAPI with CORS, middleware
    • Include all routers
    • Add exception handlers
    • Add health check endpoint: GET /health

Acceptance Criteria:

  • OpenAPI schema matches docs/api/openapi.yaml
  • All endpoints have auth middleware
  • Integration tests achieve >80% coverage
  • API responses match documented format
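For the cursor-paginated `GET /threads/{id}/messages` endpoint, one common approach is an opaque cursor that encodes the sort key of the last item returned. The sketch below is an assumption about how that could work, not a restatement of ADR-003: the `created_at` and `id` fields, the `nextCursor` response key, and the helper names are all illustrative. It also assumes timestamps are ISO 8601 strings in a single format and time zone, so they sort lexicographically.

```python
import base64
import json
from typing import Optional


def encode_cursor(created_at: str, message_id: str) -> str:
    """Pack the last item's sort key into an opaque, URL-safe token."""
    raw = json.dumps({"t": created_at, "id": message_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple[str, str]:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["t"], data["id"]


def list_messages(messages: list[dict], limit: int, cursor: Optional[str] = None) -> dict:
    """Return one page from `messages`, which must be sorted by (created_at, id)."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    after = decode_cursor(cursor) if cursor else None
    remaining = [
        m for m in messages
        if after is None or (m["created_at"], m["id"]) > after
    ]
    page = remaining[:limit]
    next_cursor = None
    if len(remaining) > limit:  # more items exist beyond this page
        last = page[-1]
        next_cursor = encode_cursor(last["created_at"], last["id"])
    return {"items": page, "nextCursor": next_cursor}
```

In the real adapter the filter would become a `WHERE (created_at, id) > (...)` clause with `LIMIT limit + 1`, but the cursor handling stays the same.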

5.4 Background Workers

Goal: Implement async export worker.

Tasks:

  • Choose task queue: ARQ (Redis-based, async) or Celery

  • Implement src/workers/exports_worker.py:

    • Fetch document from storage
    • Fetch message history from DB
    • Generate export package (PDF + metadata)
    • Update job status
  • Write worker tests: tests/unit/workers/test_exports_worker.py

  • Document worker deployment in: docs/development/worker-deployment.md

Acceptance Criteria:

  • Worker processes export jobs independently
  • Failures are logged and job marked as FAILED
  • Worker can be scaled horizontally
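The failure-handling criterion can be sketched without committing to ARQ or Celery. Below, `build_export` and `set_status` are hypothetical async callables standing in for the export pipeline and the job-status store; the status names match the job states listed in section 4.2.

```python
import asyncio
import logging
from enum import Enum

logger = logging.getLogger("exports_worker")


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


async def run_export_job(job_id, build_export, set_status) -> JobStatus:
    """Run one export job, recording RUNNING and then COMPLETED or FAILED."""
    await set_status(job_id, JobStatus.RUNNING)
    try:
        await build_export(job_id)
    except Exception:
        # Log the traceback and mark the job failed instead of crashing the worker.
        logger.exception("export job %s failed", job_id)
        await set_status(job_id, JobStatus.FAILED)
        return JobStatus.FAILED
    await set_status(job_id, JobStatus.COMPLETED)
    return JobStatus.COMPLETED
```

Because the function holds no state of its own, several worker processes can run it concurrently against a shared queue, which is what horizontal scaling requires.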

6. Phase 4: CI/CD & Quality Gates (Week 4)

6.1 GitHub Actions Workflows

Goal: Automate testing and quality checks.

Tasks:

  • Create .github/workflows/test.yml:

    • Run on: push, pull_request
    • Matrix: Python 3.11, 3.12
    • Steps: Install deps, run pytest, upload coverage
  • Create .github/workflows/lint.yml:

    • Run ruff linting
    • Run mypy type checking
    • Check code formatting
  • Create .github/workflows/integration.yml:

    • Spin up PostgreSQL, Redis via services
    • Run integration tests with real dependencies
  • Add status badges to README.md

Acceptance Criteria:

  • All workflows pass on main branch
  • Pull requests blocked if tests fail
  • Coverage report available in PR comments

6.2 Pre-commit Hooks

Goal: Catch issues before commit.

Tasks:

  • Create .pre-commit-config.yaml:

    • ruff linting and formatting
    • mypy type checking
    • trailing whitespace removal
    • YAML validation
  • Document setup in: docs/development/setup-guide.md

Acceptance Criteria:

  • Hooks auto-format code
  • Hooks prevent commits with type errors
  • Setup documented for new developers

6.3 Test Coverage Requirements

Goal: Enforce quality thresholds.

Tasks:

  • Configure pytest-cov in pytest.ini:

    • Minimum coverage: 80%
    • Exclude: tests/, scripts/
  • Add coverage badge to README.md

  • Document coverage exemptions (e.g., # pragma: no cover)

Acceptance Criteria:

  • pytest --cov fails if <80% coverage
  • Coverage report generated in HTML format
  • Uncovered lines are intentional and documented

7. Phase 5: Agentic Coding Enablement (Week 5)

7.1 Agentic Coding Guide

Goal: Create comprehensive guide for LLM-driven development.

Tasks:

  • Create docs/development/agentic-coding-guide.md:

    • TDD workflow for Claude/GPT
    • Example prompts for generating tests
    • How to use Protocol adapters for mocking
    • Async testing patterns
    • Common pitfalls (GIL, blocking operations)
  • Add example prompt templates:

    • "Write async pytest for POST /documents with ProcessPoolExecutor mock"
    • "Implement cursor pagination for messages following ADR-003"
  • Update CLAUDE.md with agentic coding patterns

Acceptance Criteria:

  • Guide includes concrete examples
  • Prompts reference specific ADRs
  • Guide covers both unit and integration test generation

7.2 Testing Utilities & Fixtures

Goal: Provide reusable test infrastructure.

Tasks:

  • Create tests/fixtures/factories.py:

    • DocumentMetadataFactory (faker-based)
    • ThreadFactory
    • MessageFactory
  • Create tests/fixtures/db_fixtures.py:

    • @pytest.fixture for async DB session
    • @pytest.fixture for testcontainers Postgres
  • Create tests/fixtures/auth_fixtures.py:

    • Mock JWT tokens with different scopes
    • Mock JWKS endpoints
  • Document in: docs/development/testing-utilities.md

Acceptance Criteria:

  • Fixtures reduce boilerplate in tests
  • Factories generate realistic test data
  • Documentation shows usage examples
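A factory along these lines could reduce test boilerplate. This sketch uses only the standard library as a stand-in for the faker-based factory planned above, and a frozen dataclass as a stand-in for the Pydantic model. The snake_case field names and the `AUTH-`/`REF-` value formats are invented for illustration, since the real formats are not specified in this plan.

```python
import itertools
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DocumentMetadata:
    """Stand-in for the Pydantic model; field names are illustrative."""

    authority_id: str
    reference_number: str
    issued_at: date


class DocumentMetadataFactory:
    _seq = itertools.count(1)  # shared counter keeps generated values unique

    @classmethod
    def build(cls, **overrides) -> DocumentMetadata:
        """Build valid metadata; keyword overrides replace individual fields."""
        n = next(cls._seq)
        values = {
            "authority_id": f"AUTH-{n:04d}",            # hypothetical format
            "reference_number": f"REF-2025-{n:06d}",    # hypothetical format
            "issued_at": date(2025, 12, 1) - timedelta(days=n),  # always in the past
        }
        values.update(overrides)
        return DocumentMetadata(**values)
```

A test for the "invalid authorityId" case would then only spell out the field under test, for example `DocumentMetadataFactory.build(authority_id="")`.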

7.3 ADR for Agentic Development

Goal: Document the TDD + AI approach as architectural decision.

Tasks:

  • Create docs/architecture/adr/0008-agentic-tdd-workflow.md:
    • Context: LLM-driven development velocity vs. quality
    • Decision: Interface-first TDD with AI assistance
    • Rationale: Tests serve as executable specification
    • Implementation: Workflow, tooling, prompts

Acceptance Criteria:

  • ADR approved by team
  • Links to agentic-coding-guide.md
  • Referenced in CLAUDE.md

8. Phase 6: Migration & Validation (Week 6)

8.1 Prototype Deprecation

Goal: Mark prototypes as archived.

Tasks:

  • Add README.md to each prototype directory:

    • Status: ARCHIVED
    • Reason: Consolidated into main codebase
    • Date: 2025-12-XX
  • Document migration decisions in: docs/development/prototype-migration.md

  • Keep prototypes in repo for reference (don't delete)

Acceptance Criteria:

  • Clear indication that prototypes are not maintained
  • Migration rationale documented

8.2 End-to-End Validation

Goal: Verify complete system integration.

Tasks:

  • Write E2E test: tests/e2e/test_full_workflow.py:

    • Citizen uploads document
    • Document is routed
    • Thread is created
    • Messages are exchanged
    • Export is generated
  • Run against local environment (Docker Compose)

  • Measure performance against NFRs:

    • Document upload + routing: <500ms
    • Message retrieval: <300ms
  • Document E2E setup in: docs/development/e2e-testing.md

Acceptance Criteria:

  • E2E test passes consistently
  • Performance targets met
  • E2E environment reproducible via Docker Compose
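The latency targets above can be asserted directly in the E2E test. The sketch below times an async operation several times and compares the slowest run against the budget. Using the worst case rather than a percentile is an assumption, since the plan does not say which statistic the NFRs refer to; the budget keys and helper names are likewise illustrative.

```python
import asyncio
import time

# Budgets in milliseconds, taken from the NFR list above.
NFR_BUDGETS_MS = {"upload_and_route": 500, "message_retrieval": 300}


async def measure_ms(operation) -> float:
    """Return the wall-clock duration of one awaited call, in milliseconds."""
    start = time.perf_counter()
    await operation()
    return (time.perf_counter() - start) * 1000


async def check_budget(name: str, operation, runs: int = 5) -> tuple[float, bool]:
    """Run `operation` several times; report the slowest run and whether it fits."""
    samples = [await measure_ms(operation) for _ in range(runs)]
    worst = max(samples)
    return worst, worst < NFR_BUDGETS_MS[name]
```

In `tests/e2e/test_full_workflow.py`, `operation` would be a coroutine function that performs the HTTP call against the Docker Compose environment.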

8.3 Documentation Review

Goal: Ensure all documentation is accurate and complete.

Tasks:

  • Review all ADRs for consistency
  • Update CLAUDE.md with final structure
  • Review API documentation against implementation
  • Spell check and grammar check all docs
  • Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)

Acceptance Criteria:

  • No broken links in documentation
  • Code examples in docs are tested
  • CLAUDE.md accurately reflects current state

9. Success Criteria (Overall)

Functional Requirements

  • Main codebase has all features from best prototype
  • All core APIs implemented and tested
  • Background worker functional

Quality Requirements

  • Test coverage >80%
  • Zero mypy type errors
  • All linting rules pass
  • CI/CD pipeline green

Documentation Requirements

  • ADRs in individual files with consistent structure
  • All major decisions documented
  • Agentic coding guide comprehensive
  • CLAUDE.md accurate and complete

Performance Requirements

  • Document upload + routing <500ms (measured in E2E tests)
  • Message retrieval <300ms (measured in E2E tests)
  • Large file upload streaming works (>50MB test)

Process Requirements

  • TDD workflow established and documented
  • Pre-commit hooks prevent quality issues
  • GitHub Actions enforce quality gates
  • Agentic development patterns proven with at least 3 features

10. Risk Mitigation

Risk 1: Prototype Integration Conflicts

Mitigation: Complete the comparative analysis (Section 5.1, Phase 3) before implementation. Document the rationale for each decision.

Risk 2: TDD Slowing Initial Progress

Mitigation: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.

Risk 3: Incomplete ADR Extraction

Mitigation: Use a checklist approach. Review the original decisions.md multiple times. Cross-reference with the implementation guide.

Risk 4: Agentic Coding Learning Curve

Mitigation: Create an example-driven guide. Include actual prompts that worked. Pair with a human reviewer for the first few features.

Risk 5: Performance Targets Not Met

Mitigation: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.


11. Next Steps

  1. Review this workplan with the team
  2. Adjust timeline based on team capacity
  3. Start Phase 1 (Foundation Setup)
  4. Daily standup to track progress and blockers
  5. Weekly retrospective to improve agentic coding workflow

Appendix A: Recommended Tools

  • Testing: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
  • Mocking: respx (HTTP), moto (AWS), testcontainers-python (real services)
  • Linting: ruff (fast, replaces flake8 + isort + pyupgrade)
  • Type Checking: mypy with strict mode
  • Factories: factory_boy or custom Pydantic factories
  • Performance: py-spy (profiling), locust (load testing)
  • Pre-commit: pre-commit framework
  • CI/CD: GitHub Actions (free for public repos)

Appendix B: ADR Numbering Convention

  • 0001-0099: Core Architecture (payload model, auth, concurrency)
  • 0100-0199: Data & Persistence (pagination, retention, schema)
  • 0200-0299: API & Integration (REST structure, export workflow)
  • 0300-0399: Development Process (TDD, agentic coding)
  • 0400+: Future decisions

Document Owner: Backend Engineering Team
Last Updated: 2025-12-01
Status: Ready for Review