generated from coulomb/repo-seed
chore: setup agentic coding workplan
360
CLAUDE.md
Normal file
@@ -0,0 +1,360 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**DirektVermittlungDe (DVD)** is a document-centric communication platform between citizens and German authorities. It eliminates the "phone hunt" in three steps:
1. The citizen uploads a document or provides an *Aktenzeichen* (reference number)
2. The system auto-routes the request to the responsible unit
3. The system opens an interaction thread for direct clarification

## Current Development Status

**⚠️ IMPORTANT**: This repository is currently in transition from prototype phase to main production codebase.

- **Current State**: Three prototype implementations (chatgpt5, geminiNbt3pro, grok4.1) in `/prototype-*` directories
- **Target State**: Unified production codebase in `/src` with TDD-driven development
- **Active Workplan**: See `docs/WORKPLAN_MainCodebase_Integration.md` for detailed migration plan

**When working on this codebase:**
- Check the workplan to understand which phase we're in
- Follow TDD practices: write tests first, then implementation
- Reference ADRs for architectural decisions (see below)
- New code goes in `/src`, not prototypes

## Repository Structure (Target State)

```
/
├── docs/
│   ├── architecture/
│   │   ├── adr/                    # Individual ADR files (0001-*.md)
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-authentication.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   ├── development/
│   │   ├── testing-strategy.md
│   │   ├── agentic-coding-guide.md
│   │   └── setup-guide.md
│   └── WORKPLAN_MainCodebase_Integration.md
├── src/
│   ├── domain/                     # Pure business logic (TDD tested)
│   ├── adapters/                   # External integrations (mocked in tests)
│   ├── service/                    # Application services (TDD tested)
│   ├── api/                        # FastAPI routes (integration tested)
│   └── workers/                    # Background jobs
├── tests/
│   ├── unit/                       # TDD unit tests
│   ├── integration/                # Integration tests
│   └── fixtures/                   # Test data and mocks
├── scripts/                        # Utility scripts
├── prototype-*/                    # ARCHIVED prototypes (reference only)
├── pyproject.toml                  # Dependencies & tool config
└── pytest.ini                      # Test configuration
```

## Development Commands

### Setup and Run (Main Codebase)

```bash
# Install dependencies (from repo root)
pip install -e ".[dev]"

# Initialize database metadata
python -m scripts.init_db

# Run development server
uvicorn src.main:app --reload

# View API docs
# http://localhost:8000/docs
```

### Testing (TDD Workflow)

```bash
# Run all tests
pytest

# Run with coverage report
pytest --cov=src --cov-report=html

# Run only unit tests
pytest tests/unit/

# Run only integration tests
pytest tests/integration/

# Run specific test file
pytest tests/unit/domain/test_document_metadata.py -v

# Watch mode (requires pytest-watch)
ptw
```

### Code Quality

```bash
# Lint and format with ruff
ruff check src/ tests/
ruff format src/ tests/

# Type checking with mypy
mypy src/

# Run pre-commit hooks manually
pre-commit run --all-files
```

## Key Architectural Patterns

### 1. Split-Payload Model (ADR-001)

**Critical**: Documents use a split design to enable both encryption and routing:
- **metadata** (plaintext JSON): authorityId, referenceNumber, docType, issuedAt
- **encryptedPayload** (encrypted blob): actual PDF/scan content

The backend can route based on metadata without decrypting the payload. This is the core architectural constraint.

When working with documents:
- NEVER attempt to decrypt `encryptedPayload` in the main service
- The backend treats encrypted content as opaque
- Routing logic operates only on metadata fields
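
The constraint above can be sketched as follows. This is a minimal illustration, not the project's routing engine: the field names come from the metadata list above, while `DocumentEnvelope`'s shape here, `ROUTING_TABLE`, `DEFAULT_UNIT`, the authority id `tax-office-berlin`, and the `route_document` signature are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DocumentMetadata:
    """Plaintext routing fields from the split-payload model."""
    authorityId: str
    referenceNumber: str
    docType: str
    issuedAt: datetime


@dataclass(frozen=True)
class DocumentEnvelope:
    metadata: DocumentMetadata
    encryptedPayload: bytes  # opaque ciphertext; never decrypted by the backend


# Hypothetical routing table: (authorityId, docType) -> responsible unit.
ROUTING_TABLE = {
    ("tax-office-berlin", "NOTICE"): "NoticeTeam",
}
DEFAULT_UNIT = "GeneralInbox"  # assumed fallback, not taken from the ADR


def route_document(envelope: DocumentEnvelope) -> str:
    """Pick a unit from metadata alone; the payload is never inspected."""
    meta = envelope.metadata
    return ROUTING_TABLE.get((meta.authorityId, meta.docType), DEFAULT_UNIT)
```

Because `route_document` only reads `envelope.metadata`, a unit test can pass arbitrary bytes as `encryptedPayload` and still assert the routing result.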

### 2. Hybrid Concurrency Pattern (ADR-007)

Because of Python's GIL, CPU-bound work run inside a coroutine blocks the event loop, so I/O and CPU work are handled differently:
- **I/O operations** (DB, network): Use native `async`/`await`
- **CPU operations** (crypto, PDF): Offload to `ProcessPoolExecutor`

Example pattern:
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# Shared pool for CPU-bound work; create it once, not per request
cpu_pool = ProcessPoolExecutor(max_workers=4)

async def handler():
    # I/O: async/await (`db` stands in for the async database adapter)
    data = await db.fetch()

    # CPU: executor (`heavy_function` must be a picklable, module-level function)
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(cpu_pool, heavy_function, data)
    return result
```

### 3. Cursor-Based Pagination (ADR-003)

For message threads, use cursor-based pagination (timestamp) instead of offset-based:
- Ensures consistent performance regardless of thread length
- Prevents duplicate/missing messages during real-time updates
- Target: <300ms response time (NFR-1)
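
A minimal in-memory sketch of the idea, assuming the cursor is the `created_at` of the last message the client has already seen. The `Message` dataclass and the `list_messages` signature are hypothetical; the real implementation would express the same filter as a database query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Message:
    id: int
    created_at: datetime


def list_messages(
    messages: List[Message],
    cursor: Optional[datetime] = None,
    limit: int = 50,
) -> Tuple[List[Message], Optional[datetime]]:
    """Return (page, next_cursor), newest first.

    Only messages strictly older than `cursor` are returned, so messages
    that arrive while the client is paging cannot shift the window.
    """
    ordered = sorted(messages, key=lambda m: m.created_at, reverse=True)
    if cursor is not None:
        ordered = [m for m in ordered if m.created_at < cursor]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = page[-1].created_at if has_more else None
    return page, next_cursor
```

With an offset, a message arriving between two requests would push an already-seen message onto the next page; with the cursor, the second request still starts right after the last message seen. One caveat worth checking against ADR-003: if two messages can share the same timestamp, a pure timestamp cursor may skip one of them, and a tiebreaker such as the message id would be needed.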

### 4. Async Export Pattern (ADR-004)

Data exports use async request-reply:
1. POST `/exports` → 202 Accepted + jobId
2. Background worker processes via message queue
3. Client polls status endpoint or receives webhook
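
The three steps can be sketched without any infrastructure. Everything below is illustrative: `InMemoryJobQueue`, `post_exports`, and `run_next` are hypothetical stand-ins for the real queue (ARQ or Celery), the FastAPI route, and the worker; only the status names match the `ExportJobStatus` enum in this document.

```python
import uuid
from enum import Enum


class ExportJobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


class InMemoryJobQueue:
    """Stand-in for the real message queue; for illustration only."""

    def __init__(self) -> None:
        self._status = {}
        self._pending = []

    def enqueue(self) -> str:
        job_id = str(uuid.uuid4())
        self._status[job_id] = ExportJobStatus.QUEUED
        self._pending.append(job_id)
        return job_id

    def get_status(self, job_id: str) -> ExportJobStatus:
        return self._status[job_id]

    def run_next(self, export_fn) -> None:
        """What a background worker would do for one queued job."""
        job_id = self._pending.pop(0)
        self._status[job_id] = ExportJobStatus.RUNNING
        try:
            export_fn(job_id)
            self._status[job_id] = ExportJobStatus.COMPLETED
        except Exception:
            self._status[job_id] = ExportJobStatus.FAILED


def post_exports(queue: InMemoryJobQueue):
    """Mimics POST /exports: answer immediately with 202 and a jobId."""
    return 202, {"jobId": queue.enqueue()}
```

The request handler never waits for the export itself; the client learns the outcome by calling the status lookup with the returned `jobId`.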

### 5. Stateless Authentication (ADR-002)

OAuth2/JWT with scopes:
- `citizen:write`: Document submission, thread creation
- `official:read`: View assigned cases
- `official:write`: Respond to inquiries

JWTs enable horizontal scaling without session affinity.
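
A minimal sketch of a scope check, assuming the token's signature has already been verified elsewhere and that scopes arrive as a space-delimited `scope` claim (a common OAuth2 convention; the actual claim layout depends on the identity provider). `granted_scopes` and `require_scope` are hypothetical helper names.

```python
def granted_scopes(claims: dict) -> set:
    """Read scopes from already-verified JWT claims (space-delimited string)."""
    return set(claims.get("scope", "").split())


def require_scope(claims: dict, required: str) -> None:
    """Raise if the caller lacks the scope; no server-side session is consulted."""
    if required not in granted_scopes(claims):
        raise PermissionError(f"missing scope: {required}")
```

Since the decision depends only on the claims carried by the request, any instance behind the load balancer can make it, which is what removes the need for session affinity.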

### 6. GDPR-Compliant Retention (ADR-006)

Documents have `retention_date` field:
- Default: `closedAt + gracePeriod`
- Personal archive: `null` or extended date
- Automated cleanup via TTL engine
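
The default rule can be sketched as below. The 90-day grace period is the default named in the workplan's service tests; the function names, the `personal_archive` flag, and the `is_expired` helper are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_GRACE_PERIOD = timedelta(days=90)  # default mentioned in the workplan


def compute_retention_date(
    closed_at: datetime,
    grace_period: timedelta = DEFAULT_GRACE_PERIOD,
    personal_archive: bool = False,
) -> Optional[datetime]:
    """retention_date = closedAt + gracePeriod; None means keep indefinitely."""
    if personal_archive:
        return None
    return closed_at + grace_period


def is_expired(retention_date: Optional[datetime], now: datetime) -> bool:
    """Check a cleanup job might apply before deleting a document."""
    return retention_date is not None and now >= retention_date
```

Using timezone-aware datetimes (for example `datetime.now(timezone.utc)`) for both arguments avoids comparing naive and aware values in the cleanup check.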

## API Structure

The API follows document-centric REST hierarchy:
- `/documents` - Root envelope (FR-1: Document intake)
- `/documents/{id}/threads` - Sub-resource for communication (FR-3)
- `/threads/{threadId}/messages` - Message exchange
- `/exports` - Async data export for authority systems (FR-7)

## Critical Domain Models

Located in `src/domain/models.py` (target) or `prototype-chatgpt5/src/app/domain/models.py` (reference):

**Enums:**
- **ThreadType**: TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT
- **SenderRole**: CITIZEN, OFFICIAL, SYSTEM
- **ExportJobStatus**: QUEUED, RUNNING, COMPLETED, FAILED

**Core Models:**
- **DocumentMetadata**: authorityId, referenceNumber, docType, issuedAt
  - Validation: authorityId (1-50 chars), referenceNumber required
- **DocumentCreateRequest**: metadata, encryptedPayload (base64)
- **DocumentCreatedResponse**: id, status, assignedUnit
- **ThreadCreateRequest**: type, initialMessage, preferredTimeSlot
- **MessageDto**: id, senderRole, content (encrypted), timestamp

## Performance Targets (NFR)

- Core operations: <300ms response time
- Document upload routing: <500ms
- Concurrent sessions: 10k+ per region
- Availability: ≥99.5%

## Security Constraints

- E2E encryption required for sensitive data (NFR-5)
- Backend cannot decrypt document payloads
- Least-privilege access via OAuth scopes (NFR-6)
- No PII in plaintext application logs

## TDD Development Workflow

**CRITICAL**: This project follows a strict Test-Driven Development approach.

### The Red-Green-Refactor Cycle

1. **RED**: Write a failing test that defines desired behavior
2. **GREEN**: Write minimal code to make the test pass
3. **REFACTOR**: Improve code while keeping tests green

### Example TDD Workflow

```python
# Step 1: Write failing test (tests/unit/domain/test_document_metadata.py)
from datetime import datetime

import pytest
from pydantic import ValidationError

from src.domain.models import DocumentMetadata


def test_document_metadata_validates_authority_id():
    with pytest.raises(ValidationError):
        DocumentMetadata(
            authorityId="",  # Empty should fail
            referenceNumber="123/456",
            docType="NOTICE",
            issuedAt=datetime.now()
        )

# Step 2: Run test (it fails)
# $ pytest tests/unit/domain/test_document_metadata.py

# Step 3: Implement validation (src/domain/models.py)
from pydantic import BaseModel, Field


class DocumentMetadata(BaseModel):
    authorityId: str = Field(..., min_length=1, max_length=50)
    # ... rest of fields

# Step 4: Run test again (it passes)

# Step 5: Refactor if needed, ensure tests still pass
```

### Agentic TDD Pattern

When asking Claude to implement features, use this pattern:

```
"Write a test for [feature] that verifies [specific behavior].
Use pytest and follow the pattern in tests/unit/[module]/.
Reference ADR-[number] for architectural constraints."
```

Example:
```
"Write a test for document routing that verifies documents with docType='NOTICE'
are routed to the NoticeTeam. Mock the routing adapter using the protocol defined
in src/adapters/protocols.py. Follow ADR-0001 split-payload constraints."
```

See `docs/development/agentic-coding-guide.md` for comprehensive examples.

## Documentation References

### Architecture & Decisions
- **Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md` - Current migration plan
- **ADRs**: `docs/architecture/adr/` - Individual decision records
  - `0001-split-payload-model.md` - Core constraint: metadata vs encrypted payload
  - `0002-stateless-authentication.md` - OAuth2/JWT architecture
  - `0003-cursor-based-pagination.md` - Performance-optimized pagination
  - `0004-async-export-workflow.md` - Background job pattern
  - `0005-rest-resource-structure.md` - API design hierarchy
  - `0006-gdpr-retention-model.md` - Data lifecycle management
  - `0007-hybrid-concurrency-pattern.md` - Python GIL mitigation
  - `0008-agentic-tdd-workflow.md` - LLM-driven development process
- **Overview**: `docs/architecture/architecture-overview.md` - System design
- **Patterns**: `docs/architecture/design-patterns.md` - Common patterns

### API Documentation
- **OpenAPI**: `docs/api/openapi.yaml` - Full API specification
- **Scenarios**: `docs/api/api-scenarios.md` - Real-world usage examples

### Development Guides
- **Testing**: `docs/development/testing-strategy.md` - TDD practices
- **Agentic Coding**: `docs/development/agentic-coding-guide.md` - AI-assisted development
- **Setup**: `docs/development/setup-guide.md` - Environment setup

### Legacy (Prototypes)
- `prototype-chatgpt5/README.md` - Reference implementation (ARCHIVED)
- `docs/decisions.md` - Original ADRs (DEPRECATED, see /docs/architecture/adr/)

## Database Schema

Key tables (PostgreSQL):
- **documents**: id, reference_number, authority_id, status, storage_path, retention_date
- **threads**: id, document_id, type, assigned_official_id, last_activity_at
- **messages**: id, thread_id, sender_role, content_blob, created_at

Indexes:
- `idx_docs_authority` on documents(authority_id, status)
- `idx_msgs_thread_time` on messages(thread_id, created_at DESC)

## Technology Stack

- **Language**: Python 3.11+
- **Framework**: FastAPI + Uvicorn
- **ORM**: SQLAlchemy (async)
- **Database**: PostgreSQL 15+
- **Blob Storage**: S3-compatible (MinIO/AWS S3)
- **Task Queue**: ARQ (Redis-based, async) or Celery
- **Auth**: OAuth2/JWT (stateless)
- **Testing**: pytest + pytest-asyncio + pytest-cov
- **Linting**: ruff (fast, replaces flake8/isort/pyupgrade)
- **Type Checking**: mypy (strict mode)
- **CI/CD**: GitHub Actions

## Workplan & Phase Tracking

**Active Workplan**: `docs/WORKPLAN_MainCodebase_Integration.md`

### Current Phase Status

To check which phase we're in:
1. Open `docs/WORKPLAN_MainCodebase_Integration.md`
2. Look for checked `[x]` items in each phase section
3. Focus development on uncompleted `[ ]` items in the current phase

### How to Contribute

**Before implementing any feature:**
1. Check if it's in the current phase of the workplan
2. Read the relevant ADR(s) in `docs/architecture/adr/`
3. Write tests first (TDD approach)
4. Implement to make tests pass
5. Ensure all quality gates pass (pytest, mypy, ruff)

**When adding new architectural decisions:**
1. Create a new ADR file in `docs/architecture/adr/`
2. Use the template in `0000-template.md`
3. Number sequentially (next available number)
4. Update the ADR index in `docs/architecture/adr/README.md`

## Common Gotchas

1. **Stream large files**: Use `aiobotocore` to stream uploads to S3, don't load entire payloads into memory
2. **ProcessPoolExecutor**: CPU-heavy operations MUST be offloaded to avoid blocking the event loop
3. **Metadata separation**: The routing engine needs plaintext metadata - design API contracts accordingly
4. **Retention dates**: Always set `retention_date` on document creation for GDPR compliance
5. **Cursor pagination**: Use timestamp-based cursors for message history, not offset-based
653
docs/WORKPLAN_MainCodebase_Integration.md
Normal file
@@ -0,0 +1,653 @@
# Workplan: Main Codebase Integration & TDD Setup

**Status:** Draft v1.0
**Date:** 2025-12-01
**Goal:** Consolidate prototype learnings into a production-ready main codebase with TDD practices and proper ADR documentation structure.

---

## 1. Current State Assessment

### Existing Assets
- **3 Prototype Implementations**: chatgpt5 (most complete), geminiNbt3pro, grok4.1
- **Documentation**:
  - Single `decisions.md` file with all ADRs (needs restructuring)
  - `implementation_guide.md` (comprehensive)
  - `api_docs.md` (scenario-based)
  - `openapi.yaml` (API spec)
- **Architecture**: Clean Architecture pattern established in chatgpt5 prototype

### Gaps
- No unified main codebase
- ADRs not in individual files (not easily referenceable)
- No TDD test suite defining core interfaces
- No CI/CD pipeline configuration
- Prototypes have overlapping implementations without consolidation

---

## 2. Target State Definition

### Main Codebase Structure
```
/
├── docs/
│   ├── architecture/
│   │   ├── adr/                    # Individual ADR files
│   │   │   ├── 0001-split-payload-model.md
│   │   │   ├── 0002-stateless-authentication.md
│   │   │   └── ...
│   │   ├── architecture-overview.md
│   │   └── design-patterns.md
│   ├── api/
│   │   ├── openapi.yaml
│   │   └── api-scenarios.md
│   └── development/
│       ├── testing-strategy.md
│       └── agentic-coding-guide.md
├── src/
│   ├── domain/                     # Pure business logic (TDD tested)
│   ├── adapters/                   # External integrations (mocked in tests)
│   ├── service/                    # Application services (TDD tested)
│   ├── api/                        # FastAPI routes (integration tested)
│   └── workers/                    # Background jobs
├── tests/
│   ├── unit/                       # TDD unit tests
│   ├── integration/                # Integration tests
│   └── fixtures/                   # Test data and mocks
├── scripts/
│   └── init_db.py
├── pyproject.toml                  # Dependencies & tool config
├── pytest.ini                      # Test configuration
├── .github/
│   └── workflows/                  # CI/CD pipelines
└── CLAUDE.md                       # AI coding assistant guidance
```

### TDD Approach
- **Interface-first**: Define Pydantic models and service interfaces via tests
- **Red-Green-Refactor**: Write failing test → implement → refactor
- **Test pyramid**: Many unit tests, fewer integration tests, minimal e2e tests
- **Async testing**: Use pytest-asyncio for async operations

### ADR Documentation Standards
- **Format**: Follow Markdown Any Decision Records (MADR) template
- **Naming**: `NNNN-title-with-dashes.md` (e.g., `0001-split-payload-model.md`)
- **Location**: `/docs/architecture/adr/`
- **Template**:
  ```markdown
  # ADR-NNNN: [Title]

  **Status:** [Accepted|Proposed|Deprecated|Superseded]
  **Date:** YYYY-MM-DD
  **Deciders:** [Team/Role]

  ## Context and Problem Statement
  [What is the issue we're addressing?]

  ## Decision Drivers
  * [Driver 1]
  * [Driver 2]

  ## Considered Options
  * Option 1
  * Option 2

  ## Decision Outcome
  Chosen option: [option], because [rationale].

  ### Positive Consequences
  * [Consequence 1]

  ### Negative Consequences
  * [Consequence 1]

  ## Implementation Notes
  [Specific technical guidance]
  ```

---

## 3. Phase 1: Foundation Setup (Week 1)

### 3.1 ADR Restructuring
**Goal**: Extract individual ADRs from `decisions.md` into separate files.

**Tasks**:
- [ ] Create `/docs/architecture/adr/` directory
- [ ] Create ADR template file: `docs/architecture/adr/0000-template.md`
- [ ] Extract ADR-001 (Split-Payload Model) → `0001-split-payload-model.md`
- [ ] Extract ADR-002 (Stateless Auth) → `0002-stateless-authentication.md`
- [ ] Extract ADR-003 (Pagination) → `0003-cursor-based-pagination.md`
- [ ] Extract ADR-004 (Async Exports) → `0004-async-export-workflow.md`
- [ ] Extract ADR-005 (Resource Naming) → `0005-rest-resource-structure.md`
- [ ] Extract ADR-006 (Data Retention) → `0006-gdpr-retention-model.md`
- [ ] Extract ADR-007 (Python & ProcessPoolExecutor) → `0007-hybrid-concurrency-pattern.md`
- [ ] Create index file: `docs/architecture/adr/README.md` with ADR list and links
- [ ] Update `decisions.md` with deprecation notice and redirect to ADR directory

**Acceptance Criteria**:
- Each ADR is a standalone markdown file
- ADRs follow consistent template structure
- Index file allows easy navigation
- CLAUDE.md updated to reference new ADR location

### 3.2 Main Codebase Scaffolding
**Goal**: Create production-ready directory structure with tooling.

**Tasks**:
- [ ] Create `/src/` directory with Clean Architecture structure
- [ ] Initialize `pyproject.toml` with dependencies from prototype-chatgpt5
- [ ] Add dev dependencies: pytest, pytest-asyncio, pytest-cov, httpx, respx, mypy, ruff
- [ ] Create `pytest.ini` with async and coverage configuration
- [ ] Create `.gitignore` (Python, IDE, env files)
- [ ] Create `tests/` structure: unit/, integration/, fixtures/
- [ ] Create `scripts/` directory
- [ ] Set up pre-commit hooks configuration (ruff, mypy)

**Acceptance Criteria**:
- Directory structure matches target state
- `pip install -e ".[dev]"` works
- `pytest` runs (even with 0 tests)
- Type checking with `mypy src/` works

### 3.3 Documentation Organization
**Goal**: Restructure documentation for clarity and AI assistant consumption.

**Tasks**:
- [ ] Create `/docs/architecture/` directory
- [ ] Create `/docs/api/` directory
- [ ] Create `/docs/development/` directory
- [ ] Move `openapi.yaml` → `docs/api/openapi.yaml`
- [ ] Move `api_docs.md` → `docs/api/api-scenarios.md`
- [ ] Refactor `implementation_guide.md` → split into:
  - `docs/architecture/design-patterns.md` (architectural patterns)
  - `docs/development/testing-strategy.md` (TDD approach)
  - `docs/development/agentic-coding-guide.md` (LLM-specific guidance)
- [ ] Create `docs/architecture/architecture-overview.md` (high-level system design)
- [ ] Update CLAUDE.md with new documentation structure

**Acceptance Criteria**:
- Documentation is logically organized by concern
- Each document has a single, clear purpose
- CLAUDE.md points to correct locations

---

## 4. Phase 2: TDD Interface Definition (Week 2)

### 4.1 Domain Models (TDD)
**Goal**: Define core domain models with comprehensive test coverage.

**Approach**: Write tests first, then implement models.

**Tasks**:
- [ ] **Test**: `tests/unit/domain/test_document_metadata.py`
  - Test: Valid metadata creation
  - Test: Invalid authorityId (empty, too long)
  - Test: Invalid referenceNumber format
  - Test: Future issuedAt date validation
  - Implement: `src/domain/models.py` → `DocumentMetadata`

- [ ] **Test**: `tests/unit/domain/test_thread_models.py`
  - Test: ThreadType enum values
  - Test: SenderRole enum values
  - Test: ThreadCreateRequest validation
  - Test: Message model with encrypted content
  - Implement: Thread-related models

- [ ] **Test**: `tests/unit/domain/test_document_envelope.py`
  - Test: Split payload structure
  - Test: encryptedPayload validation (base64, size limits)
  - Test: Status transitions (RECEIVED → ROUTED → ASSIGNED → CLOSED)
  - Implement: DocumentEnvelope and status management
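
The status-transition task above could start from a sketch like the following. It assumes the lifecycle is strictly linear, as the arrow notation suggests; `DocumentStatus`, `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `transition` are hypothetical names, and the real rules may allow additional paths.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    RECEIVED = "RECEIVED"
    ROUTED = "ROUTED"
    ASSIGNED = "ASSIGNED"
    CLOSED = "CLOSED"


# Assumed linear lifecycle: RECEIVED -> ROUTED -> ASSIGNED -> CLOSED.
ALLOWED_TRANSITIONS = {
    DocumentStatus.RECEIVED: {DocumentStatus.ROUTED},
    DocumentStatus.ROUTED: {DocumentStatus.ASSIGNED},
    DocumentStatus.ASSIGNED: {DocumentStatus.CLOSED},
    DocumentStatus.CLOSED: set(),
}


class InvalidTransition(ValueError):
    """Raised when a status change skips or reverses the lifecycle."""


def transition(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

Keeping the allowed moves in one table means each test case is a single `(current, target)` pair, which fits the sub-second unit test target in the acceptance criteria.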

**Acceptance Criteria**:
- All domain models have >90% test coverage
- Pydantic validation catches invalid inputs
- Tests run in <1 second
- Zero mypy type errors

### 4.2 Service Layer Interfaces (TDD)
**Goal**: Define service contracts via protocol/ABC classes with tests.

**Tasks**:
- [ ] **Test**: `tests/unit/service/test_documents_service.py`
  - Mock: Database adapter
  - Mock: Storage adapter (S3)
  - Mock: Routing engine
  - Test: create_document() - happy path
  - Test: create_document() - routing failure
  - Test: create_document() - storage failure
  - Test: get_document() - found
  - Test: get_document() - not found
  - Test: Retention date calculation (default 90 days)
  - Implement: `src/service/documents_service.py`

- [ ] **Test**: `tests/unit/service/test_threads_service.py`
  - Test: create_thread() - links to document
  - Test: create_thread() - document not found (404)
  - Test: list_messages() - cursor pagination
  - Test: add_message() - role validation
  - Implement: `src/service/threads_service.py`

- [ ] **Test**: `tests/unit/service/test_exports_service.py`
  - Test: create_export_job() - returns jobId
  - Test: get_export_status() - job states (QUEUED, RUNNING, COMPLETED, FAILED)
  - Test: Async job enqueuing (mock Redis/ARQ)
  - Implement: `src/service/exports_service.py`

**Acceptance Criteria**:
- Service layer has clear, testable interfaces
- All external dependencies are mocked
- Tests verify business logic, not infrastructure
- Each service method has both success and failure test cases

### 4.3 Adapter Contracts (Protocols)
**Goal**: Define adapter interfaces using Python Protocols for mockability.

**Tasks**:
- [ ] Create `src/adapters/protocols.py`:
  - `StorageAdapter` protocol (save_encrypted_payload, get_encrypted_payload)
  - `DatabaseAdapter` protocol (CRUD operations)
  - `RoutingEngine` protocol (route_document)
  - `JobQueue` protocol (enqueue, get_status)

- [ ] Create stub implementations for testing:
  - `tests/fixtures/storage_stub.py` (in-memory storage)
  - `tests/fixtures/db_stub.py` (in-memory DB)
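
A possible shape for one protocol and its in-memory stub, using `typing.Protocol`. Only the method names `save_encrypted_payload` and `get_encrypted_payload` come from the task list; the parameters, return types, the `InMemoryStorage` class, and the `store_and_reload` helper are assumptions for illustration.

```python
import asyncio
from typing import Dict, Protocol


class StorageAdapter(Protocol):
    """Contract the service layer depends on (signatures are assumed)."""

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str: ...

    async def get_encrypted_payload(self, document_id: str) -> bytes: ...


class InMemoryStorage:
    """Test stub; satisfies StorageAdapter structurally, without inheritance."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    async def save_encrypted_payload(self, document_id: str, payload: bytes) -> str:
        self._blobs[document_id] = payload
        return f"memory://{document_id}"

    async def get_encrypted_payload(self, document_id: str) -> bytes:
        return self._blobs[document_id]


async def store_and_reload(
    storage: StorageAdapter, document_id: str, payload: bytes
) -> bytes:
    """Service-style function that knows only the protocol, not the backend."""
    await storage.save_encrypted_payload(document_id, payload)
    return await storage.get_encrypted_payload(document_id)
```

Because protocols are checked structurally, an S3-backed adapter with the same two methods could later replace `InMemoryStorage` without changing `store_and_reload`, which is the swap the acceptance criteria ask for.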

**Acceptance Criteria**:
- Protocols are narrow and focused
- Test fixtures implement all protocols
- Production adapters can be swapped without changing service layer

---

## 5. Phase 3: Prototype Integration (Week 3)

### 5.1 Comparative Analysis
**Goal**: Identify best implementations across prototypes.

**Tasks**:
- [ ] Analyze `prototype-chatgpt5/src/app/adapters/`:
  - auth.py - OAuth2 implementation quality
  - db.py - SQLAlchemy async patterns
  - storage.py - S3 streaming approach
  - routing.py - Routing logic structure

- [ ] Analyze `prototype-geminiNbt3pro/`:
  - Identify unique features or better implementations

- [ ] Analyze `prototype-grok4.1/`:
  - Compare test coverage and patterns

- [ ] Document findings in: `docs/development/prototype-analysis.md`
  - Table: Feature vs Prototype vs Recommendation
  - Rationale for selections

**Acceptance Criteria**:
- Clear decision on which implementation to use for each component
- Documented rationale for selections
- Identified any missing features across all prototypes

### 5.2 Core Adapter Implementation
**Goal**: Implement production adapters based on best prototype code.

**Tasks**:
- [ ] **Database Adapter** (`src/adapters/db.py`):
  - Port SQLAlchemy models from chosen prototype
  - Implement async session management
  - Add connection pooling configuration
  - Write integration tests: `tests/integration/test_db_adapter.py`

- [ ] **Storage Adapter** (`src/adapters/storage.py`):
  - Implement S3 client using aiobotocore
  - Add streaming upload/download (no in-memory buffering)
  - Mock S3 in tests using moto or similar
  - Write tests: `tests/integration/test_storage_adapter.py`

- [ ] **Routing Engine** (`src/adapters/routing.py`):
  - Port routing logic from prototype
  - Make routing rules configurable (not hardcoded)
  - Add caching layer (Redis) for routing rules
  - Write tests: `tests/unit/adapters/test_routing.py`

- [ ] **Authentication** (`src/adapters/auth.py`):
  - Implement JWT validation
  - Add JWKS caching
  - Create FastAPI dependency for auth
  - Write tests: `tests/unit/adapters/test_auth.py`

**Acceptance Criteria**:
- All adapters follow Protocol contracts
- Integration tests use real dependencies (testcontainers)
- Unit tests use mocks
- Streaming works for large files (>50MB)

### 5.3 API Layer Implementation
**Goal**: Build FastAPI routes with OpenAPI compliance.

**Tasks**:
- [ ] **Documents API** (`src/api/documents.py`):
  - POST /documents - implement with streaming upload
  - GET /documents/{id} - implement with ETag support
  - Add request validation (Pydantic)
  - Write integration tests: `tests/integration/test_documents_api.py`

- [ ] **Threads API** (`src/api/threads.py`):
  - POST /documents/{id}/threads
  - GET /threads/{id}/messages (cursor pagination)
  - POST /threads/{id}/messages
  - Write integration tests: `tests/integration/test_threads_api.py`

- [ ] **Exports API** (`src/api/exports.py`):
  - POST /exports (async job creation)
  - GET /exports/{jobId} (status polling)
  - Write integration tests: `tests/integration/test_exports_api.py`

- [ ] **Main App** (`src/main.py`):
  - Configure FastAPI with CORS, middleware
  - Include all routers
  - Add exception handlers
  - Add health check endpoint: GET /health

**Acceptance Criteria**:
- OpenAPI schema matches `docs/api/openapi.yaml`
- All endpoints have auth middleware
- Integration tests achieve >80% coverage
- API responses match documented format

### 5.4 Background Workers
**Goal**: Implement async export worker.

**Tasks**:
- [ ] Choose task queue: ARQ (Redis-based, async) or Celery
- [ ] Implement `src/workers/exports_worker.py`:
  - Fetch document from storage
  - Fetch message history from DB
  - Generate export package (PDF + metadata)
  - Update job status

- [ ] Write worker tests: `tests/unit/workers/test_exports_worker.py`
- [ ] Document worker deployment in: `docs/development/worker-deployment.md`

**Acceptance Criteria**:
- Worker processes export jobs independently
- Failures are logged and job marked as FAILED
- Worker can be scaled horizontally

---

## 6. Phase 4: CI/CD & Quality Gates (Week 4)

### 6.1 GitHub Actions Workflows
**Goal**: Automate testing and quality checks.

**Tasks**:
- [ ] Create `.github/workflows/test.yml`:
  - Run on: push, pull_request
  - Matrix: Python 3.11, 3.12
  - Steps: Install deps, run pytest, upload coverage

- [ ] Create `.github/workflows/lint.yml`:
  - Run ruff linting
  - Run mypy type checking
  - Check code formatting

- [ ] Create `.github/workflows/integration.yml`:
  - Spin up PostgreSQL, Redis via services
  - Run integration tests with real dependencies

- [ ] Add status badges to README.md

**Acceptance Criteria**:
- All workflows pass on main branch
- Pull requests blocked if tests fail
- Coverage report available in PR comments

### 6.2 Pre-commit Hooks
**Goal**: Catch issues before commit.

**Tasks**:
- [ ] Create `.pre-commit-config.yaml`:
  - ruff linting and formatting
  - mypy type checking
  - trailing whitespace removal
  - YAML validation

- [ ] Document setup in: `docs/development/setup-guide.md`

**Acceptance Criteria**:
- Hooks auto-format code
- Hooks prevent commits with type errors
- Setup documented for new developers
|
||||
|
||||
### 6.3 Test Coverage Requirements
|
||||
**Goal**: Enforce quality thresholds.
|
||||
|
||||
**Tasks**:
|
||||
- [ ] Configure pytest-cov in `pytest.ini`:
|
||||
- Minimum coverage: 80%
|
||||
- Exclude: tests/, scripts/
|
||||
|
||||
- [ ] Add coverage badge to README.md
|
||||
- [ ] Document coverage exemptions (e.g., `# pragma: no cover`)
|
||||
|
||||
**Acceptance Criteria**:
|
||||
- `pytest --cov` fails if <80% coverage
|
||||
- Coverage report generated in HTML format
|
||||
- Uncovered lines are intentional and documented

---

## 7. Phase 5: Agentic Coding Enablement (Week 5)

### 7.1 Agentic Coding Guide
**Goal**: Create a comprehensive guide for LLM-driven development.

**Tasks**:
- [ ] Create `docs/development/agentic-coding-guide.md`:
  - TDD workflow for Claude/GPT
  - Example prompts for generating tests
  - How to use Protocol adapters for mocking
  - Async testing patterns
  - Common pitfalls (GIL, blocking operations)

- [ ] Add example prompt templates:
  - "Write async pytest for POST /documents with ProcessPoolExecutor mock"
  - "Implement cursor pagination for messages following ADR-003"

- [ ] Update CLAUDE.md with agentic coding patterns
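
As an illustration of the "Protocol adapters for mocking" item, the sketch below shows one way an interface-first test could look. `RoutingClient`, `FakeRoutingClient`, and `open_thread` are hypothetical names, not the project's actual API, and the test uses `asyncio.run` so it stays dependency-free; with pytest-asyncio the test would more likely be an `async def`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Protocol


class RoutingClient(Protocol):
    """Port for the routing backend (hypothetical interface)."""

    async def resolve_unit(self, aktenzeichen: str) -> str: ...


@dataclass
class FakeRoutingClient:
    """In-memory stand-in that satisfies RoutingClient structurally."""

    units: dict[str, str]
    calls: list[str] = field(default_factory=list)

    async def resolve_unit(self, aktenzeichen: str) -> str:
        self.calls.append(aktenzeichen)  # record the call for later assertions
        return self.units[aktenzeichen]


async def open_thread(aktenzeichen: str, router: RoutingClient) -> dict[str, str]:
    """Code under test: depends only on the Protocol, not on a concrete client."""
    unit = await router.resolve_unit(aktenzeichen)
    return {"aktenzeichen": aktenzeichen, "unit": unit, "status": "open"}


def test_open_thread_routes_to_responsible_unit() -> None:
    fake = FakeRoutingClient(units={"AZ-2025-0001": "Bauamt"})
    thread = asyncio.run(open_thread("AZ-2025-0001", fake))
    assert thread == {"aktenzeichen": "AZ-2025-0001", "unit": "Bauamt", "status": "open"}
    assert fake.calls == ["AZ-2025-0001"]
```

Because the fake matches the Protocol by shape rather than by inheritance, no patching library is needed and mypy can still check the call site.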

**Acceptance Criteria**:
- Guide includes concrete examples
- Prompts reference specific ADRs
- Guide covers both unit and integration test generation
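
For the "blocking operations" pitfall and the ProcessPoolExecutor prompt, a guide example might inject the executor so that tests can substitute a cheaper one. This is a sketch under assumptions: `fingerprint` and `analyze_document` are hypothetical, and a `ThreadPoolExecutor` stands in for the process pool that production code would presumably use for CPU-bound work.

```python
import asyncio
import hashlib
from concurrent.futures import Executor, ThreadPoolExecutor


def fingerprint(payload: bytes) -> str:
    """CPU-bound stand-in for document analysis (hypothetical workload)."""
    digest = payload
    for _ in range(1000):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def analyze_document(payload: bytes, executor: Executor) -> str:
    """Offload the blocking call so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, fingerprint, payload)


def test_analyze_document_uses_injected_executor() -> None:
    # The test injects a thread pool in place of a process pool; the code
    # under test only relies on the Executor interface.
    with ThreadPoolExecutor(max_workers=1) as pool:
        result = asyncio.run(analyze_document(b"scan.pdf", pool))
    assert result == fingerprint(b"scan.pdf")
```

Passing the executor as a parameter keeps the pool's lifetime under the caller's control and avoids pickling constraints in unit tests.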

### 7.2 Testing Utilities & Fixtures
**Goal**: Provide reusable test infrastructure.

**Tasks**:
- [ ] Create `tests/fixtures/factories.py`:
  - `DocumentMetadataFactory` (faker-based)
  - `ThreadFactory`
  - `MessageFactory`

- [ ] Create `tests/fixtures/db_fixtures.py`:
  - `@pytest.fixture` for async DB session
  - `@pytest.fixture` for testcontainers Postgres

- [ ] Create `tests/fixtures/auth_fixtures.py`:
  - Mock JWT tokens with different scopes
  - Mock JWKS endpoints

- [ ] Document in: `docs/development/testing-utilities.md`
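
A factory along these lines could look roughly as follows. This sketch deliberately uses only the standard library with a seeded `random.Random` instead of faker so that it is self-contained; the `DocumentMetadata` fields, the authority names, and the Aktenzeichen format are all assumptions made for illustration.

```python
import random
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DocumentMetadata:
    """Hypothetical shape; the real model may carry different fields."""

    document_id: str
    aktenzeichen: str
    authority: str
    uploaded_at: datetime


class DocumentMetadataFactory:
    """Builds plausible-looking metadata; a seed makes the output repeatable."""

    AUTHORITIES = ("Bauamt", "Finanzamt", "Standesamt")

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def build(self, **overrides: object) -> DocumentMetadata:
        values: dict[str, object] = {
            "document_id": str(uuid.UUID(int=self._rng.getrandbits(128), version=4)),
            "aktenzeichen": (
                f"AZ-{self._rng.randint(2020, 2025)}-{self._rng.randint(1, 9999):04d}"
            ),
            "authority": self._rng.choice(self.AUTHORITIES),
            "uploaded_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
        }
        values.update(overrides)  # explicit overrides win over generated values
        return DocumentMetadata(**values)
```

A test would then write `DocumentMetadataFactory().build(authority="Bauamt")` and state only the field it cares about.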

**Acceptance Criteria**:
- Fixtures reduce boilerplate in tests
- Factories generate realistic test data
- Documentation shows usage examples
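
One of the usage examples could show how scoped test tokens are minted. The sketch below signs with HS256 using only the standard library, which is an assumption made for self-containment: a JWKS-based setup implies asymmetric keys, so the real fixtures would likely sign with a test key pair instead. The helper names and the scope string are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_test_token(subject: str, scopes: list[str], secret: bytes = b"test-secret") -> str:
    """Mint a compact HS256 token carrying a space-delimited `scope` claim."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "scope": " ".join(scopes)}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_claims(token: str, secret: bytes = b"test-secret") -> dict:
    """Verify the signature and return the payload claims (test helper only)."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected), signature):
        raise ValueError("signature mismatch")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))
```

A fixture can wrap `make_test_token` once per scope set, so that each test states only which permissions it needs.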

### 7.3 ADR for Agentic Development
**Goal**: Document the TDD + AI approach as an architectural decision.

**Tasks**:
- [ ] Create `docs/architecture/adr/0008-agentic-tdd-workflow.md`:
  - Context: LLM-driven development velocity vs. quality
  - Decision: Interface-first TDD with AI assistance
  - Rationale: Tests serve as executable specification
  - Implementation: Workflow, tooling, prompts

**Acceptance Criteria**:
- ADR approved by team
- Links to `agentic-coding-guide.md`
- Referenced in CLAUDE.md

---

## 8. Phase 6: Migration & Validation (Week 6)

### 8.1 Prototype Deprecation
**Goal**: Mark prototypes as archived.

**Tasks**:
- [ ] Add README.md to each prototype directory:
  - Status: ARCHIVED
  - Reason: Consolidated into main codebase
  - Date: 2025-12-XX

- [ ] Document migration decisions in: `docs/development/prototype-migration.md`
- [ ] Keep prototypes in repo for reference (don't delete)

**Acceptance Criteria**:
- Clear indication that prototypes are not maintained
- Migration rationale documented

### 8.2 End-to-End Validation
**Goal**: Verify complete system integration.

**Tasks**:
- [ ] Write E2E test `tests/e2e/test_full_workflow.py`:
  - Citizen uploads document
  - Document is routed
  - Thread is created
  - Messages are exchanged
  - Export is generated

- [ ] Run against local environment (Docker Compose)
- [ ] Measure performance against NFRs:
  - Document upload + routing: <500 ms
  - Message retrieval: <300 ms

- [ ] Document E2E setup in: `docs/development/e2e-testing.md`

**Acceptance Criteria**:
- E2E test passes consistently
- Performance targets met
- E2E environment reproducible via Docker Compose
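
The NFR measurement could be wired into the E2E test with a small timing helper. The sketch below is one possible shape, assuming wall-clock timing via `time.perf_counter`; `LatencyRecorder` and the operation names are hypothetical, and only the two budgets come from the task list above.

```python
import time
from collections.abc import Iterator
from contextlib import contextmanager

# Budgets in milliseconds, taken from the NFR targets; the key names are assumptions.
NFR_BUDGETS_MS = {"upload_and_route": 500.0, "message_retrieval": 300.0}


class LatencyRecorder:
    """Collects wall-clock samples per operation and checks them against budgets."""

    def __init__(self) -> None:
        self.samples_ms: dict[str, list[float]] = {}

    @contextmanager
    def measure(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.samples_ms.setdefault(name, []).append(elapsed_ms)

    def violations(self, budgets: dict[str, float]) -> dict[str, float]:
        """Return the worst sample of each operation that is not strictly under budget."""
        return {
            name: max(samples)
            for name, samples in self.samples_ms.items()
            if name in budgets and max(samples) >= budgets[name]
        }
```

An E2E test would wrap each step in `with recorder.measure("upload_and_route"):` and finish with `assert recorder.violations(NFR_BUDGETS_MS) == {}`.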

### 8.3 Documentation Review
**Goal**: Ensure all documentation is accurate and complete.

**Tasks**:
- [ ] Review all ADRs for consistency
- [ ] Update CLAUDE.md with final structure
- [ ] Review API documentation against implementation
- [ ] Spell check and grammar check all docs
- [ ] Generate API documentation from OpenAPI spec (ReDoc or Swagger UI)

**Acceptance Criteria**:
- No broken links in documentation
- Code examples in docs are tested
- CLAUDE.md accurately reflects current state

---

## 9. Success Criteria (Overall)

### Functional Requirements
- ✅ Main codebase has all features from best prototype
- ✅ All core APIs implemented and tested
- ✅ Background worker functional

### Quality Requirements
- ✅ Test coverage ≥80%
- ✅ Zero mypy type errors
- ✅ All linting rules pass
- ✅ CI/CD pipeline green

### Documentation Requirements
- ✅ ADRs in individual files with consistent structure
- ✅ All major decisions documented
- ✅ Agentic coding guide comprehensive
- ✅ CLAUDE.md accurate and complete

### Performance Requirements
- ✅ Document upload + routing <500 ms (measured in E2E tests)
- ✅ Message retrieval <300 ms (measured in E2E tests)
- ✅ Large file upload streaming works (>50 MB test)

### Process Requirements
- ✅ TDD workflow established and documented
- ✅ Pre-commit hooks prevent quality issues
- ✅ GitHub Actions enforce quality gates
- ✅ Agentic development patterns proven with at least 3 features

---

## 10. Risk Mitigation

### Risk 1: Prototype Integration Conflicts
**Mitigation**: Complete the comparative analysis (Phase 3.1) before implementation. Document the decision rationale.

### Risk 2: TDD Slowing Initial Progress
**Mitigation**: Front-load interface definition (Phase 2). Once the interfaces are stable, implementation accelerates.

### Risk 3: Incomplete ADR Extraction
**Mitigation**: Use a checklist approach. Review the original `decisions.md` multiple times. Cross-reference with the implementation guide.

### Risk 4: Agentic Coding Learning Curve
**Mitigation**: Create an example-driven guide. Include actual prompts that worked. Pair with a human reviewer for the first few features.

### Risk 5: Performance Targets Not Met
**Mitigation**: Include performance testing from Phase 2 onwards. Identify bottlenecks early. Profile with py-spy or a similar tool.

---

## 11. Next Steps

1. **Review this workplan** with the team
2. **Adjust timeline** based on team capacity
3. **Start Phase 1** (Foundation Setup)
4. **Daily standup** to track progress and blockers
5. **Weekly retrospective** to improve agentic coding workflow

---

## Appendix A: Recommended Tools

- **Testing**: pytest, pytest-asyncio, pytest-cov, hypothesis (property testing)
- **Mocking**: respx (HTTP), moto (AWS), testcontainers-python (real services)
- **Linting**: ruff (fast, replaces flake8 + isort + pyupgrade)
- **Type Checking**: mypy in strict mode
- **Factories**: factory_boy or custom Pydantic factories
- **Performance**: py-spy (profiling), locust (load testing)
- **Pre-commit**: pre-commit framework
- **CI/CD**: GitHub Actions (free for public repos)

---

## Appendix B: ADR Numbering Convention

- **0001-0099**: Core Architecture (payload model, auth, concurrency)
- **0100-0199**: Data & Persistence (pagination, retention, schema)
- **0200-0299**: API & Integration (REST structure, export workflow)
- **0300-0399**: Development Process (TDD, agentic coding)
- **0400+**: Future decisions
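
The ranges above can be expressed as a small lookup, for example to validate new ADR filenames in a script. This is an illustrative sketch only; `adr_category` and `adr_filename` are hypothetical helpers, and the filename pattern is inferred from names such as `0001-split-payload-model.md`.

```python
# (low, high, label) triples copied from the numbering convention.
ADR_RANGES = (
    (1, 99, "Core Architecture"),
    (100, 199, "Data & Persistence"),
    (200, 299, "API & Integration"),
    (300, 399, "Development Process"),
)


def adr_category(number: int) -> str:
    """Map an ADR number to its category; 400 and above are reserved."""
    if number < 1:
        raise ValueError("ADR numbers start at 0001")
    for low, high, label in ADR_RANGES:
        if low <= number <= high:
            return label
    return "Future decisions"


def adr_filename(number: int, slug: str) -> str:
    """Build a zero-padded filename such as 0001-split-payload-model.md."""
    return f"{number:04d}-{slug}.md"
```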

---

**Document Owner**: Backend Engineering Team
**Last Updated**: 2025-12-01
**Status**: Ready for Review