# Workplan: Main Codebase Integration & TDD Setup
**Status:** Draft v1.0
**Date:** 2025-12-01
**Goal:** Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.

---
## 1. Current State Assessment
### Existing Assets
- **3 Prototype Implementations**: chatgpt5 (most complete), geminiNbt3pro, grok4.1
- **Documentation**:
  - Single `decisions.md` file with all ADRs (needs restructuring)
  - `implementation_guide.md` (comprehensive)
  - `api_docs.md` (scenario-based)
  - `openapi.yaml` (API spec)
- **Architecture**: Clean Architecture pattern established in chatgpt5 prototype
### Gaps
- No unified main codebase
- ADRs not in individual files (not easily referenceable)
- No TDD test suite defining core interfaces
- No CI/CD pipeline configuration
- Prototypes have overlapping implementations without consolidation
---
## 2. Target State Definition
### Main Codebase Structure
```
/
├── docs/
│ ├── architecture/
│ │ ├── adr/ # Individual ADR files
│ │ │ ├── 0001-split-payload-model.md
│ │ │ ├── 0002-stateless-authentication.md
│ │ │ └── ...
│ │ ├── architecture-overview.md
│ │ └── design-patterns.md
│ ├── api/
│ │ ├── openapi.yaml
│ │ └── api-scenarios.md
│ └── development/
│ ├── testing-strategy.md
│ └── agentic-coding-guide.md
├── src/
│ ├── domain/ # Pure business logic (TDD tested)
│ ├── adapters/ # External integrations (mocked in tests)
│ ├── service/ # Application services (TDD tested)
│ ├── api/ # FastAPI routes (integration tested)
│ └── workers/ # Background jobs
├── tests/
│ ├── unit/ # TDD unit tests
│ ├── integration/ # Integration tests
│ └── fixtures/ # Test data and mocks
├── scripts/
│ └── init_db.py
├── pyproject.toml # Dependencies & tool config
├── pytest.ini # Test configuration
├── .github/
│ └── workflows/ # CI/CD pipelines
└── CLAUDE.md # AI coding assistant guidance
```
### TDD Approach
- **Interface-first**: Define Pydantic models and service interfaces via tests
- **Red-Green-Refactor**: Write failing test → implement → refactor
- **Test pyramid**: Many unit tests, fewer integration tests, minimal e2e tests
- **Async testing**: Use pytest-asyncio for async operations
### ADR Documentation Standards
- **Format**: Follow Markdown Any Decision Records (MADR) template
- **Naming**: `NNNN-title-with-dashes.md` (e.g., `0001-split-payload-model.md`)
- **Location**: `/docs/architecture/adr/`
- **Template**:
```markdown
# ADR-NNNN: [Title]
**Status:** [Accepted|Proposed|Deprecated|Superseded]
**Date:** YYYY-MM-DD
**Deciders:** [Team/Role]
## Context and Problem Statement
[What is the issue we're addressing?]
## Decision Drivers
* [Driver 1]
* [Driver 2]
## Considered Options
* Option 1
* Option 2
## Decision Outcome
Chosen option: [option], because [rationale].
### Positive Consequences
* [Consequence 1]
### Negative Consequences
* [Consequence 1]
## Implementation Notes
[Specific technical guidance]
```
---
## 3. Phase 1: Foundation Setup (Week 1)
### 3.1 ADR Restructuring
**Goal**: Extract individual ADRs from `decisions.md` into separate files.
**Tasks**:
- [ ] Create `/docs/architecture/adr/` directory
- [ ] Create ADR template file: `docs/architecture/adr/0000-template.md`
- [ ] Extract ADR-001 (Split-Payload Model) → `0001-split-payload-model.md`
- [ ] Extract ADR-002 (Stateless Auth) → `0002-stateless-authentication.md`
- [ ] Extract ADR-003 (Pagination) → `0003-cursor-based-pagination.md`
- [ ] Extract ADR-004 (Async Exports) → `0004-async-export-workflow.md`
- [ ] Extract ADR-005 (Resource Naming) → `0005-rest-resource-structure.md`
- [ ] Extract ADR-006 (Data Retention) → `0006-gdpr-retention-model.md`
- [ ] Extract ADR-007 (Python & ProcessPoolExecutor) → `0007-hybrid-concurrency-pattern.md`
- [ ] Create index file: `docs/architecture/adr/README.md` with ADR list and links
- [ ] Update `decisions.md` with deprecation notice and redirect to ADR directory
**Acceptance Criteria**:
- Each ADR is a standalone markdown file
- ADRs follow consistent template structure
- Index file allows easy navigation
- CLAUDE.md updated to reference new ADR location
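
The index file from the task list can be generated rather than maintained by hand. The following is a minimal sketch, assuming every ADR file follows the `NNNN-title-with-dashes.md` naming rule and that titles can be derived from the file name; `build_adr_index` and `ADR_FILENAME` are hypothetical names, not part of the existing tooling.

```python
import re
from pathlib import Path

# Matches the naming rule NNNN-title-with-dashes.md (lowercase slug assumed).
ADR_FILENAME = re.compile(r"^(\d{4})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def build_adr_index(adr_dir: Path) -> str:
    """Return Markdown for an ADR index: one bullet per ADR file, sorted by number."""
    lines = ["# Architecture Decision Records", ""]
    for path in sorted(adr_dir.glob("*.md")):
        match = ADR_FILENAME.match(path.name)
        if match is None or match.group(1) == "0000":
            continue  # skip README.md, the 0000 template, and misnamed files
        number, slug = match.groups()
        title = slug.replace("-", " ").title()
        lines.append(f"- [ADR-{number}: {title}]({path.name})")
    return "\n".join(lines) + "\n"
```

Writing the result to `docs/architecture/adr/README.md` keeps the index in sync with the directory; titles derived from slugs lose punctuation, so hand-edited titles may still be preferable.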
### 3.2 Main Codebase Scaffolding
**Goal**: Create production-ready directory structure with tooling.
**Tasks**:
- [ ] Create `/src/` directory with Clean Architecture structure
- [ ] Initialize `pyproject.toml` with dependencies from prototype-chatgpt5
- [ ] Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
- [ ] Create `pytest.ini` with async and coverage configuration
- [ ] Create `.gitignore` (Python, IDE, env files)
- [ ] Create `tests/` structure: unit/, integration/, fixtures/
- [ ] Create `scripts/` directory
- [ ] Set up pre-commit hooks configuration (ruff, mypy)
**Acceptance Criteria**:
- Directory structure matches target state
- `pip install -e ".[dev]"` works
- `pytest` runs (even with 0 tests)
- Type checking with `mypy src/` works
### 3.3 Documentation Organization
**Goal**: Restructure documentation for clarity and AI assistant consumption.
**Tasks**:
- [ ] Create `/docs/architecture/` directory
- [ ] Create `/docs/api/` directory
- [ ] Create `/docs/development/` directory
- [ ] Move `openapi.yaml` → `docs/api/openapi.yaml`
- [ ] Move `api_docs.md` → `docs/api/api-scenarios.md`
- [ ] Refactor `implementation_guide.md` → split into:
  - `docs/architecture/design-patterns.md` (architectural patterns)
  - `docs/development/testing-strategy.md` (TDD approach)
  - `docs/development/agentic-coding-guide.md` (LLM-specific guidance)
- [ ] Create `docs/architecture/architecture-overview.md` (high-level system design)
- [ ] Update CLAUDE.md with new documentation structure
**Acceptance Criteria**:
- Documentation is logically organized by concern
- Each document has a single, clear purpose
- CLAUDE.md points to correct locations
---
## 4. Phase 2: TDD Interface Definition (Week 2)
### 4.1 Domain Models (TDD)
**Goal**: Define core domain models with comprehensive test coverage.
**Approach**: Write tests first, then implement models.
**Tasks**:
- [ ] **Test**: `tests/unit/domain/test_document_metadata.py`
  - Test: Valid metadata creation
  - Test: Invalid authorityId (empty, too long)
  - Test: Invalid referenceNumber format
  - Test: Future issuedAt date validation
  - Implement: `src/domain/models.py` → `DocumentMetadata`
- [ ] **Test**: `tests/unit/domain/test_thread_models.py`
  - Test: ThreadType enum values
  - Test: SenderRole enum values
  - Test: ThreadCreateRequest validation
  - Test: Message model with encrypted content
  - Implement: Thread-related models
- [ ] **Test**: `tests/unit/domain/test_document_envelope.py`
  - Test: Split payload structure
  - Test: encryptedPayload validation (base64, size limits)
  - Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
  - Implement: DocumentEnvelope and status management
**Acceptance Criteria**:
- All domain models have >90% test coverage
- Pydantic validation catches invalid inputs
- Tests run in <1 second
- Zero mypy type errors
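
As an illustration of the validation rules and status chain listed above, here is a minimal sketch. It uses stdlib dataclasses as a stand-in for Pydantic so that it runs without dependencies. The snake_case field names, the 64-character limit, and the `next_status` helper are assumptions; the `referenceNumber` format is left out because the plan does not specify it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum

MAX_AUTHORITY_ID_LENGTH = 64  # assumed limit; the plan only says "too long"


class DocumentStatus(str, Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Each status may only advance to the next one in the chain.
ALLOWED_TRANSITIONS = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}


def next_status(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


@dataclass(frozen=True)
class DocumentMetadata:
    authority_id: str
    reference_number: str
    issued_at: datetime

    def __post_init__(self) -> None:
        if not self.authority_id or len(self.authority_id) > MAX_AUTHORITY_ID_LENGTH:
            raise ValueError("authority_id must be 1 to 64 characters long")
        if self.issued_at.tzinfo is None:
            raise ValueError("issued_at must be timezone-aware")
        if self.issued_at > datetime.now(timezone.utc):
            raise ValueError("issued_at must not be in the future")
```

In a Pydantic version the same checks would live in field constraints and validators, and the tests would assert on Pydantic's validation error instead of `ValueError`.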
### 4.2 Service Layer Interfaces (TDD)
**Goal**: Define service contracts via protocol/ABC classes with tests.
**Tasks**:
- [ ] **Test**: `tests/unit/service/test_documents_service.py`
  - Mock: Database adapter
  - Mock: Storage adapter (S3)
  - Mock: Routing engine
  - Test: create_document() - happy path
  - Test: create_document() - routing failure
  - Test: create_document() - storage failure
  - Test: get_document() - found
  - Test: get_document() - not found
  - Test: Retention date calculation (default 90 days)
  - Implement: `src/service/documents_service.py`
- [ ] **Test**: `tests/unit/service/test_threads_service.py`
  - Test: create_thread() - links to document
  - Test: create_thread() - document not found (404)
  - Test: list_messages() - cursor pagination
  - Test: add_message() - role validation
  - Implement: `src/service/threads_service.py`
- [ ] **Test**: `tests/unit/service/test_exports_service.py`
  - Test: create_export_job() - returns jobId
  - Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
  - Test: Async job enqueuing (mock Redis/ARQ)
  - Implement: `src/service/exports_service.py`
**Acceptance Criteria**:
- Service layer has clear, testable interfaces
- All external dependencies are mocked
- Tests verify business logic, not infrastructure
- Each service method has both success and failure test cases
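
The mocked-adapter approach can be sketched as follows, covering the happy path, the storage failure, and the 90-day retention default. This is an illustration only: the `DocumentsService` constructor, the record layout, and the `insert_document` method are assumptions, and the tests use `asyncio.run` so that the sketch runs without pytest-asyncio.

```python
import asyncio
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock

DEFAULT_RETENTION_DAYS = 90  # default retention window named in the task list


class DocumentsService:
    """Hypothetical service; the adapters are injected so tests can mock them."""

    def __init__(self, db, storage, router) -> None:
        self._db = db
        self._storage = storage
        self._router = router

    async def create_document(self, document_id: str, payload: bytes, received_at: datetime) -> dict:
        # Store the encrypted payload first, then route, then persist the record.
        storage_key = await self._storage.save_encrypted_payload(document_id, payload)
        department = await self._router.route_document(document_id)
        record = {
            "id": document_id,
            "storage_key": storage_key,
            "department": department,
            "retain_until": received_at + timedelta(days=DEFAULT_RETENTION_DAYS),
        }
        await self._db.insert_document(record)  # insert_document is an assumed method name
        return record


def make_service():
    db, storage, router = AsyncMock(), AsyncMock(), AsyncMock()
    storage.save_encrypted_payload.return_value = "memory://doc-1"
    router.route_document.return_value = "department-a"
    return DocumentsService(db, storage, router), db, storage


def test_create_document_happy_path():
    service, db, _ = make_service()
    received = datetime(2025, 12, 1, tzinfo=timezone.utc)
    record = asyncio.run(service.create_document("doc-1", b"ciphertext", received))
    assert record["retain_until"] == received + timedelta(days=90)
    db.insert_document.assert_awaited_once_with(record)


def test_create_document_storage_failure():
    service, db, storage = make_service()
    storage.save_encrypted_payload.side_effect = RuntimeError("storage down")
    try:
        asyncio.run(service.create_document("doc-1", b"x", datetime.now(timezone.utc)))
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected the storage error to propagate")
    db.insert_document.assert_not_awaited()  # nothing is persisted after a failed upload
```

Whether a storage failure should propagate or be translated into a domain error is a design decision the real tests would pin down; the sketch simply lets it propagate.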
### 4.3 Adapter Contracts (Protocols)
**Goal**: Define adapter interfaces using Python Protocols for mockability.
**Tasks**:
- [ ] Create `src/adapters/protocols.py`:
  - `StorageAdapter` protocol (save_encrypted_payload, get_encrypted_payload)
  - `DatabaseAdapter` protocol (CRUD operations)
  - `RoutingEngine` protocol (route_document)
  - `JobQueue` protocol (enqueue, get_status)
- [ ] Create stub implementations for testing:
  - `tests/fixtures/storage_stub.py` (in-memory storage)
  - `tests/fixtures/db_stub.py` (in-memory DB)
**Acceptance Criteria**:
- Protocols are narrow and focused
- Test fixtures implement all protocols
- Production adapters can be swapped without changing service layer
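
A minimal sketch of one such protocol with its in-memory stub is shown below. The method names come from the task list; the signatures, the `memory://` key format, and the class name `InMemoryStorage` are assumptions.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageAdapter(Protocol):
    """Narrow storage contract; the service layer depends on this, not on S3."""

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str: ...

    async def get_encrypted_payload(self, document_id: str) -> bytes: ...


class InMemoryStorage:
    """Test stub: satisfies StorageAdapter structurally, no inheritance needed."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str:
        self._blobs[document_id] = payload
        return f"memory://{document_id}"

    async def get_encrypted_payload(self, document_id: str) -> bytes:
        return self._blobs[document_id]


async def roundtrip(storage: StorageAdapter) -> bytes:
    # Any object with the two methods can be passed here, stub or production adapter.
    await storage.save_encrypted_payload("doc-1", b"ciphertext")
    return await storage.get_encrypted_payload("doc-1")
```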
---
## 5. Phase 3: Prototype Integration (Week 3)
### 5.1 Comparative Analysis
**Goal**: Identify best implementations across prototypes.
**Tasks**:
- [ ] Analyze `prototype-chatgpt5/src/app/adapters/`:
  - auth.py - OAuth2 implementation quality
  - db.py - SQLAlchemy async patterns
  - storage.py - S3 streaming approach
  - routing.py - Routing logic structure
- [ ] Analyze `prototype-geminiNbt3pro/`:
  - Identify unique features or better implementations
- [ ] Analyze `prototype-grok4.1/`:
  - Compare test coverage and patterns
- [ ] Document findings in: `docs/development/prototype-analysis.md`
  - Table: Feature vs Prototype vs Recommendation
  - Rationale for selections
**Acceptance Criteria**:
- Clear decision on which implementation to use for each component
- Documented rationale for selections
- Identified any missing features across all prototypes
### 5.2 Core Adapter Implementation
**Goal**: Implement production adapters based on best prototype code.
**Tasks**:
- [ ] **Database Adapter** (`src/adapters/db.py`):
  - Port SQLAlchemy models from chosen prototype
  - Implement async session management
  - Add connection pooling configuration
  - Write integration tests: `tests/integration/test_db_adapter.py`
- [ ] **Storage Adapter** (`src/adapters/storage.py`):
  - Implement S3 client using aiobotocore
  - Add streaming upload/download (no in-memory buffering)
  - Mock S3 in tests using moto or similar
  - Write tests: `tests/integration/test_storage_adapter.py`
- [ ] **Routing Engine** (`src/adapters/routing.py`):
  - Port routing logic from prototype
  - Make routing rules configurable (not hardcoded)
  - Add caching layer (Redis) for routing rules
  - Write tests: `tests/unit/adapters/test_routing.py`
- [ ] **Authentication** (`src/adapters/auth.py`):
  - Implement JWT validation
  - Add JWKS caching
  - Create FastAPI dependency for auth
  - Write tests: `tests/unit/adapters/test_auth.py`
**Acceptance Criteria**:
- All adapters follow Protocol contracts
- Integration tests use real dependencies (testcontainers)
- Unit tests use mocks
- Streaming works for large files (>50MB)
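
The "no in-memory buffering" requirement can be illustrated independently of S3. The sketch below forwards a body chunk by chunk to a sink while tracking size and checksum, so at most one chunk is held in memory at a time. `iter_chunks`, `stream_to_sink`, and the 1 MiB chunk size are assumptions; a real adapter would hand each chunk to the S3 client instead of a plain callable.

```python
import asyncio
import hashlib
from typing import AsyncIterator, Awaitable, Callable

CHUNK_SIZE = 1024 * 1024  # 1 MiB; an assumed value


async def iter_chunks(total_size: int, chunk_size: int = CHUNK_SIZE) -> AsyncIterator[bytes]:
    """Stand-in for a request body stream: yields zero-filled chunks."""
    remaining = total_size
    while remaining > 0:
        size = min(chunk_size, remaining)
        yield bytes(size)
        remaining -= size
        await asyncio.sleep(0)  # give other tasks a chance to run between chunks


async def stream_to_sink(
    source: AsyncIterator[bytes],
    write_chunk: Callable[[bytes], Awaitable[None]],
) -> tuple[int, str]:
    """Forward each chunk to the sink as it arrives; return (total bytes, SHA-256 hex)."""
    digest = hashlib.sha256()
    total = 0
    async for chunk in source:
        digest.update(chunk)
        await write_chunk(chunk)
        total += len(chunk)
    return total, digest.hexdigest()
```

A test for the >50MB criterion can pass `iter_chunks(60 * 1024 * 1024)` and a sink that only records chunk lengths, which verifies the size without allocating the whole payload.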
### 5.3 API Layer Implementation
**Goal**: Build FastAPI routes with OpenAPI compliance.
**Tasks**:
- [ ] **Documents API** (`src/api/documents.py`):
  - POST /documents - implement with streaming upload
  - GET /documents/{id} - implement with ETag support
  - Add request validation (Pydantic)
  - Write integration tests: `tests/integration/test_documents_api.py`
- [ ] **Threads API** (`src/api/threads.py`):
  - POST /documents/{id}/threads
  - GET /threads/{id}/messages (cursor pagination)
  - POST /threads/{id}/messages
  - Write integration tests: `tests/integration/test_threads_api.py`
- [ ] **Exports API** (`src/api/exports.py`):
  - POST /exports (async job creation)
  - GET /exports/{jobId} (status polling)
  - Write integration tests: `tests/integration/test_exports_api.py`
- [ ] **Main App** (`src/main.py`):
  - Configure FastAPI with CORS, middleware
  - Include all routers
  - Add exception handlers
  - Add health check endpoint: GET /health
**Acceptance Criteria**:
- OpenAPI schema matches `docs/api/openapi.yaml`
- All endpoints have auth middleware
- Integration tests achieve >80% coverage
- API responses match documented format
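
Cursor pagination for `GET /threads/{id}/messages` could look roughly like the sketch below. The opaque base64 JSON cursor, the integer message ids, and the `items`/`next_cursor` response keys are assumptions; the actual contract is whatever ADR-003 and `openapi.yaml` define.

```python
import base64
import json
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Pack the last seen message id into an opaque, URL-safe token."""
    raw = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    """Unpack a cursor; any malformed token is reported as ValueError."""
    try:
        return int(json.loads(base64.urlsafe_b64decode(cursor.encode()))["last_id"])
    except (ValueError, KeyError, TypeError) as exc:
        raise ValueError("invalid cursor") from exc


def list_messages(messages: list, limit: int, cursor: Optional[str] = None) -> dict:
    """Return one page from messages, which must be sorted by ascending integer 'id'."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    after = decode_cursor(cursor) if cursor else 0
    # Fetch one extra row to learn whether another page exists.
    window = [m for m in messages if m["id"] > after][: limit + 1]
    page = window[:limit]
    has_more = len(window) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"items": page, "next_cursor": next_cursor}
```

Against a database the list comprehension becomes a `WHERE id > :after ORDER BY id LIMIT :limit + 1` query; the extra row avoids a separate count query.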
### 5.4 Background Workers
**Goal**: Implement async export worker.
**Tasks**:
- [ ] Choose task queue: ARQ (Redis-based, async) or Celery
- [ ] Implement `src/workers/exports_worker.py`:
  - Fetch document from storage
  - Fetch message history from DB
  - Generate export package (PDF + metadata)
  - Update job status
- [ ] Write worker tests: `tests/unit/workers/test_exports_worker.py`
- [ ] Document worker deployment in: `docs/development/worker-deployment.md`
**Acceptance Criteria**:
- Worker processes export jobs independently
- Failures are logged and job marked as FAILED
- Worker can be scaled horizontally
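
The failure-handling criterion can be sketched independently of the queue choice. In the example below the job is a plain dict and `build_package` is a hypothetical coroutine standing in for the fetch-and-generate steps; neither reflects ARQ or Celery APIs.

```python
import asyncio
import logging
from enum import Enum

logger = logging.getLogger("exports_worker")


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


async def run_export_job(job: dict, build_package) -> dict:
    """Run one export job and record the outcome on the job itself."""
    job["status"] = JobStatus.RUNNING
    try:
        job["result"] = await build_package(job["document_id"])
        job["status"] = JobStatus.COMPLETED
    except Exception:
        # Log with traceback and mark the job FAILED instead of crashing the worker.
        logger.exception("export job %s failed", job["id"])
        job["status"] = JobStatus.FAILED
    return job
```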
---
## 6. Phase 4: CI/CD & Quality Gates (Week 4)
### 6.1 GitHub Actions Workflows
**Goal**: Automate testing and quality checks.
**Tasks**:
- [ ] Create `.github/workflows/test.yml`:
  - Run on: push, pull_request
  - Matrix: Python 3.11, 3.12
  - Steps: Install deps, run pytest, upload coverage
- [ ] Create `.github/workflows/lint.yml`:
  - Run ruff linting
  - Run mypy type checking
  - Check code formatting
- [ ] Create `.github/workflows/integration.yml`:
  - Spin up PostgreSQL, Redis via services
  - Run integration tests with real dependencies
- [ ] Add status badges to README.md
**Acceptance Criteria**:
- All workflows pass on main branch
- Pull requests blocked if tests fail
- Coverage report available in PR comments
### 6.2 Pre-commit Hooks
**Goal**: Catch issues before commit.
**Tasks**:
- [ ] Create `.pre-commit-config.yaml`:
  - ruff linting and formatting
  - mypy type checking
  - trailing whitespace removal
  - YAML validation
- [ ] Document setup in: `docs/development/setup-guide.md`
**Acceptance Criteria**:
- Hooks auto-format code
- Hooks prevent commits with type errors
- Setup documented for new developers
### 6.3 Test Coverage Requirements
**Goal**: Enforce quality thresholds.
**Tasks**:
- [ ] Configure pytest-cov in `pytest.ini`:
  - Minimum coverage: 80%
  - Exclude: tests/, scripts/
- [ ] Add coverage badge to README.md
- [ ] Document coverage exemptions (e.g., `# pragma: no cover`)
**Acceptance Criteria**:
- `pytest --cov --cov-fail-under=80` (or the equivalent `addopts` entry in `pytest.ini`) fails if coverage is below 80%
- Coverage report generated in HTML format
- Uncovered lines are intentional and documented
---
## 7. Phase 5: Agentic Coding Enablement (Week 5)
### 7.1 Agentic Coding Guide
**Goal**: Create comprehensive guide for LLM-driven development.
**Tasks**:
- [ ] Create `docs/development/agentic-coding-guide.md`:
  - TDD workflow for Claude/GPT
  - Example prompts for generating tests
  - How to use Protocol adapters for mocking
  - Async testing patterns
  - Common pitfalls (GIL, blocking operations)
- [ ] Add example prompt templates:
  - "Write async pytest for POST /documents with ProcessPoolExecutor mock"
  - "Implement cursor pagination for messages following ADR-003"
- [ ] Update CLAUDE.md with agentic coding patterns
**Acceptance Criteria**:
- Guide includes concrete examples
- Prompts reference specific ADRs
- Guide covers both unit and integration test generation
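
The GIL pitfall and the "ProcessPoolExecutor mock" prompt can be made concrete with a small example for the guide. The sketch below keeps CPU-bound work off the event loop by injecting the executor, so production code can pass a `ProcessPoolExecutor` while tests pass a cheap `ThreadPoolExecutor`. The function names and the hashing workload are placeholders, not code from any prototype.

```python
import asyncio
import hashlib
from concurrent.futures import Executor, ThreadPoolExecutor


def cpu_bound_digest(payload: bytes, rounds: int = 1000) -> str:
    """Placeholder for CPU-heavy work that would block the event loop if run inline."""
    digest = payload
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def digest_off_loop(payload: bytes, executor: Executor) -> str:
    """Run the CPU-bound function in the given executor and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, cpu_bound_digest, payload)


def test_digest_uses_injected_executor():
    # A thread pool stands in for the process pool: same interface, no process spawn.
    with ThreadPoolExecutor(max_workers=1) as pool:
        result = asyncio.run(digest_off_loop(b"abc", pool))
    assert result == cpu_bound_digest(b"abc")
```

With a `ProcessPoolExecutor` the function and its arguments must be picklable, which is why `cpu_bound_digest` is a module-level function rather than a lambda or closure.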
### 7.2 Testing Utilities & Fixtures
**Goal**: Provide reusable test infrastructure.
**Tasks**:
- [ ] Create `tests/fixtures/factories.py`:
  - DocumentMetadataFactory (faker-based)
  - ThreadFactory
  - MessageFactory
- [ ] Create `tests/fixtures/db_fixtures.py`:
  - @pytest.fixture for async DB session
  - @pytest.fixture for testcontainers Postgres
- [ ] Create `tests/fixtures/auth_fixtures.py`:
  - Mock JWT tokens with different scopes
  - Mock JWKS endpoints
- [ ] Document in: `docs/development/testing-utilities.md`
**Acceptance Criteria**:
- Fixtures reduce boilerplate in tests
- Factories generate realistic test data
- Documentation shows usage examples
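
A factory with overridable defaults might look like the sketch below. It uses a seeded `random.Random` in place of faker so that it runs without dependencies and produces reproducible data. The `DocumentMetadata` stand-in, its field names, and the value formats are assumptions.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DocumentMetadata:
    """Stand-in for the real domain model (field names assumed)."""

    authority_id: str
    reference_number: str
    issued_at: datetime


class DocumentMetadataFactory:
    """Builds valid metadata; tests override only the fields they care about."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)  # seeded so failures are reproducible
        self._sequence = 0

    def build(self, **overrides) -> DocumentMetadata:
        self._sequence += 1
        defaults = {
            "authority_id": f"AUTH-{self._rng.randint(1000, 9999)}",
            "reference_number": f"REF-{self._sequence:06d}",
            "issued_at": datetime(2025, 1, 1, tzinfo=timezone.utc)
            + timedelta(days=self._rng.randint(0, 300)),
        }
        defaults.update(overrides)
        return DocumentMetadata(**defaults)
```

A test for the empty `authorityId` case then reads `factory.build(authority_id="")`, which keeps the intent visible and the remaining fields valid.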
### 7.3 ADR for Agentic Development
**Goal**: Document the TDD + AI approach as architectural decision.
**Tasks**:
- [ ] Create `docs/architecture/adr/0008-agentic-tdd-workflow.md`:
  - Context: LLM-driven development velocity vs. quality
  - Decision: Interface-first TDD with AI assistance
  - Rationale: Tests serve as executable specification
  - Implementation: Workflow, tooling, prompts
**Acceptance Criteria**:
- ADR approved by team
- Links to agentic-coding-guide.md
- Referenced in CLAUDE.md
---
## 8. Phase 6: Migration & Validation (Week 6)
### 8.1 Prototype Deprecation
**Goal**: Mark prototypes as archived.
**Tasks**:
- [ ] Add README.md to each prototype directory:
  - Status: ARCHIVED
  - Reason: Consolidated into main codebase
  - Date: 2025-12-XX
- [ ] Document migration decisions in: `docs/development/prototype-migration.md`
- [ ] Keep prototypes in repo for reference (don't delete)
**Acceptance Criteria**:
- Clear indication that prototypes are not maintained
- Migration rationale documented
### 8.2 End-to-End Validation
**Goal**: Verify complete system integration.
**Tasks**:
- [ ] Write E2E test: `tests/e2e/test_full_workflow.py`:
  - Citizen uploads document
  - Document is routed
  - Thread is created
  - Messages are exchanged
  - Export is generated
- [ ] Run against local environment (Docker Compose)
- [ ] Measure performance against NFRs:
  - Document upload + routing: <500ms
  - Message retrieval: <300ms
- [ ] Document E2E setup in: `docs/development/e2e-testing.md`
**Acceptance Criteria**:
- E2E test passes consistently
- Performance targets met
- E2E environment reproducible via Docker Compose
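
The NFR check can be automated with a small timing helper. The sketch below samples an operation repeatedly and compares a percentile with the budget. Using the 95th percentile (nearest-rank) and 20 runs are assumptions, since the plan states only the thresholds; all helper names are hypothetical.

```python
import time
from typing import Callable


def measure_latencies_ms(operation: Callable[[], object], runs: int = 20) -> list:
    """Call the operation repeatedly and return wall-clock latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def p95(samples: list) -> float:
    """Nearest-rank 95th percentile, computed with integer arithmetic."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def assert_within_budget(samples: list, budget_ms: float) -> None:
    """Raise AssertionError if the observed p95 exceeds the budget."""
    observed = p95(samples)
    if observed > budget_ms:
        raise AssertionError(f"p95 {observed:.1f}ms exceeds budget {budget_ms}ms")
```

In the E2E test the operation would be a closure that performs the upload or the message fetch against the Docker Compose environment, checked against the 500 ms and 300 ms budgets respectively.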
### 8.3 Documentation Review
**Goal**: Ensure all documentation is accurate and complete.
**Tasks**:
- [ ] Review all ADRs for consistency
- [ ] Update CLAUDE.md with final structure
- [ ] Review API documentation against implementation
- [ ] Spell check and grammar check all docs
- [ ] Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)
**Acceptance Criteria**:
- No broken links in documentation
- Code examples in docs are tested
- CLAUDE.md accurately reflects current state
---
## 9. Success Criteria (Overall)
### Functional Requirements
- ✅ Main codebase has all features from best prototype
- ✅ All core APIs implemented and tested
- ✅ Background worker functional
### Quality Requirements
- ✅ Test coverage >80%
- ✅ Zero mypy type errors
- ✅ All linting rules pass
- ✅ CI/CD pipeline green
### Documentation Requirements
- ✅ ADRs in individual files with consistent structure
- ✅ All major decisions documented
- ✅ Agentic coding guide comprehensive
- ✅ CLAUDE.md accurate and complete
### Performance Requirements
- ✅ Document upload + routing <500ms (measured in E2E tests)
- ✅ Message retrieval <300ms (measured in E2E tests)
- ✅ Large file upload streaming works (>50MB test)
### Process Requirements
- ✅ TDD workflow established and documented
- ✅ Pre-commit hooks prevent quality issues
- ✅ GitHub Actions enforce quality gates
- ✅ Agentic development patterns proven with at least 3 features
---
## 10. Risk Mitigation
### Risk 1: Prototype Integration Conflicts
**Mitigation**: Complete the comparative analysis (Section 5.1, Phase 3) before implementation. Document decision rationale.
### Risk 2: TDD Slowing Initial Progress
**Mitigation**: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.
### Risk 3: Incomplete ADR Extraction
**Mitigation**: Use checklist approach. Review original `decisions.md` multiple times. Cross-reference with implementation guide.
### Risk 4: Agentic Coding Learning Curve
**Mitigation**: Create example-driven guide. Include actual prompts that worked. Pair with human for first few features.
### Risk 5: Performance Targets Not Met
**Mitigation**: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or similar.

---
## 11. Next Steps
1. **Review this workplan** with the team
2. **Adjust timeline** based on team capacity
3. **Start Phase 1** (Foundation Setup)
4. **Daily standup** to track progress and blockers
5. **Weekly retrospective** to improve agentic coding workflow
---
## Appendix A: Recommended Tools
- **Testing**: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
- **Mocking**: respx (HTTP), moto (AWS), testcontainers-python (real services)
- **Linting**: ruff (fast, replaces flake8 + isort + pyupgrade)
- **Type Checking**: mypy with strict mode
- **Factories**: factory_boy or custom Pydantic factories
- **Performance**: py-spy (profiling), locust (load testing)
- **Pre-commit**: pre-commit framework
- **CI/CD**: GitHub Actions (free for public repos)
---
## Appendix B: ADR Numbering Convention
- **0001-0099**: Core Architecture (payload model, auth, concurrency)
- **0100-0199**: Data & Persistence (pagination, retention, schema)
- **0200-0299**: API & Integration (REST structure, export workflow)
- **0300-0399**: Development Process (TDD, agentic coding)
- **0400+**: Future decisions
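
The ranges above can be encoded as a small lookup, for example to label entries in the ADR index. This is a sketch only; `ADR_RANGES`, `adr_category`, and `category_for_file` are hypothetical names.

```python
# (first number, last number, category) for each reserved range in the convention.
ADR_RANGES = [
    (1, 99, "Core Architecture"),
    (100, 199, "Data & Persistence"),
    (200, 299, "API & Integration"),
    (300, 399, "Development Process"),
]


def adr_category(number: int) -> str:
    """Map an ADR number to its category; 0400 and above are future decisions."""
    if number < 1:
        raise ValueError("ADR numbers start at 0001")
    for low, high, name in ADR_RANGES:
        if low <= number <= high:
            return name
    return "Future decisions"


def category_for_file(filename: str) -> str:
    """Read the four-digit prefix of an NNNN-title-with-dashes.md file name."""
    return adr_category(int(filename[:4]))
```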
---
**Document Owner**: Backend Engineering Team
**Last Updated**: 2025-12-01
**Status**: Ready for Review