Implementation Guide: DirektVermittlungDe
While the API Documentation explains how to use the API, this guide explains how to build it, focusing on the backend engineering challenges like encryption handling, database schema, and performance patterns defined in your architecture.
Status: Draft v1.0
Target Audience: Backend Engineering Team
Context: Implements constraints from DvdArchitektur.txt [cite: 6]
1. Technology Stack & Setup
Based on the architectural constraints [cite: 45, 46, 47], the recommended reference stack is:
- Service Layer: Java (Spring Boot 3.x) or Go (Gin/Echo) for high-concurrency performance.
- Primary Database: PostgreSQL 15+ (Relational data for Routing/Threads).
- Blob Storage: S3-Compatible Storage (AWS S3 / MinIO) for encrypted PDF payloads.
- Cache/PubSub: Redis 7.x (Session store, Rate limiting, Async Job queues).
Project Structure (Bounded Contexts)
Organize the codebase into modules matching the architecture [cite: 43]:
- dvd-intake-service: Handles /documents and metadata extraction.
- dvd-communication-service: Handles /threads and /messages.
- dvd-routing-engine: The logic component for assigning units.
- dvd-export-worker: Async background workers for eAkte exports.
2. Security Implementation Details
2.1 Handling "Blind" E2E Encryption [cite: 27]
The backend must not attempt to decrypt the encryptedPayload.
- Ingest: Receive the encryptedPayload (Base64 or binary multipart) and stream it directly to S3 blob storage. Do not load the full file into RAM; large files would otherwise cause out-of-memory (OOM) errors.
- Metadata: Persist only the metadata JSON object to PostgreSQL for routing logic.
- Validation: Verify that the encryptedPayload is a valid encrypted container (e.g., check PGP/AES headers), but treat the content as opaque.
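The ingest and validation steps above can be sketched with the standard library alone. This is a minimal illustration, assuming the payload is an OpenPGP container; the names `looks_like_openpgp`, `stream_payload`, and `CHUNK_SIZE` are hypothetical, and the in-memory `sink` stands in for a real S3 multipart writer. The header check is only a cheap sanity test (an AES-only format has no standard header, so the real check depends on the agreed container format).

```python
import hashlib
import io
from typing import BinaryIO

PGP_ARMOR_PREFIX = b"-----BEGIN PGP MESSAGE-----"
CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for the deployment


def looks_like_openpgp(head: bytes) -> bool:
    """Content-blind sanity check on the first bytes of a payload."""
    if head.startswith(PGP_ARMOR_PREFIX):
        return True  # ASCII-armored OpenPGP message
    # Binary OpenPGP packets always set the high bit of the first octet.
    return len(head) > 0 and (head[0] & 0x80) != 0


def stream_payload(source: BinaryIO, sink: BinaryIO) -> tuple[int, str]:
    """Copy source to sink chunk by chunk; return (size, sha256 hex digest)."""
    chunk = source.read(CHUNK_SIZE)
    if not looks_like_openpgp(chunk):
        raise ValueError("payload does not look like an encrypted container")
    digest = hashlib.sha256()
    total = 0
    while chunk:
        sink.write(chunk)      # only one chunk is held in memory at a time
        digest.update(chunk)   # checksum of the ciphertext, never the plaintext
        total += len(chunk)
        chunk = source.read(CHUNK_SIZE)
    return total, digest.hexdigest()


# Usage: in-memory streams stand in for the upload and the blob-store writer.
source = io.BytesIO(PGP_ARMOR_PREFIX + b"\n" + b"A" * 200_000)
sink = io.BytesIO()
size, checksum = stream_payload(source, sink)
```

The ciphertext checksum is an optional extra: it could be stored next to the metadata to detect corruption later without ever touching the plaintext.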
2.2 Stateless Authentication (OAuth2) [cite: 44]
- Gateway Layer: Implement a centralized API Gateway (e.g., Spring Cloud Gateway / Nginx) that validates JWT signatures (JWKS) from BundID (Citizens) and Authority-IDP (Officials).
- Context Propagation: Extract the sub (User ID) and scope from the JWT and pass them to downstream microservices via HTTP headers (e.g., X-User-Id, X-User-Role).
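A minimal sketch of the context-propagation step, assuming the gateway has already verified the token signature and hands over the decoded claims as a mapping. The helper names `build_context_headers` and `strip_spoofed_headers` are hypothetical; the second one reflects the assumption that client-supplied X-User-* headers should be discarded before the gateway sets its own.

```python
from typing import Mapping


def build_context_headers(claims: Mapping[str, object]) -> dict[str, str]:
    """Map already-verified JWT claims to the downstream context headers."""
    subject = claims.get("sub")
    if not isinstance(subject, str) or not subject:
        raise ValueError("verified token has no usable 'sub' claim")
    scope = claims.get("scope", "")
    # OAuth2 scopes are a space-delimited string; normalize the whitespace.
    normalized_scope = " ".join(scope.split()) if isinstance(scope, str) else ""
    return {"X-User-Id": subject, "X-User-Role": normalized_scope}


def strip_spoofed_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Drop any X-User-* header sent by the client itself."""
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith("x-user-")
    }
```

Downstream services can then trust these headers only if they are reachable exclusively through the gateway.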
3. Database Schema Recommendations (PostgreSQL)
Map the domain objects [cite: 64] to the following relational schema.
Table: documents
```sql
CREATE TABLE documents (
    id               UUID PRIMARY KEY,
    reference_number VARCHAR(50)  NOT NULL,       -- "Aktenzeichen"
    authority_id     VARCHAR(50)  NOT NULL,       -- Routing target
    status           VARCHAR(20)  DEFAULT 'RECEIVED', -- RECEIVED, ROUTED, ASSIGNED
    storage_path     VARCHAR(255) NOT NULL,       -- S3 key for encrypted blob
    created_at       TIMESTAMPTZ  DEFAULT NOW(),
    retention_date   TIMESTAMPTZ                  -- For GDPR auto-deletion [cite: 14]
);

CREATE INDEX idx_docs_authority ON documents(authority_id, status);
```
Table: threads
```sql
CREATE TABLE threads (
    id                   UUID PRIMARY KEY,
    document_id          UUID REFERENCES documents(id),
    type                 VARCHAR(20) NOT NULL, -- CHAT, CALLBACK, APPOINTMENT
    assigned_official_id VARCHAR(100),         -- Nullable until claimed
    last_activity_at     TIMESTAMPTZ
);
```
Table: messages
```sql
CREATE TABLE messages (
    id           UUID PRIMARY KEY,
    thread_id    UUID REFERENCES threads(id),
    sender_role  VARCHAR(20) NOT NULL,
    content_blob TEXT NOT NULL,             -- Encrypted content
    created_at   TIMESTAMPTZ DEFAULT NOW()
);

-- Efficient cursor pagination: index on (thread_id, created_at)
CREATE INDEX idx_msgs_thread_time ON messages(thread_id, created_at DESC);
```
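To illustrate how the (thread_id, created_at) index supports cursor pagination, here is a minimal keyset-pagination sketch. It uses SQLite from the standard library purely as a stand-in for PostgreSQL, and the function name `fetch_page` is hypothetical. Using `id` as a tie-breaker for messages with identical timestamps is an assumption that is not part of the schema above; to serve it fully from the index, the index would need `id` as a third column.

```python
import sqlite3

# Keyset predicate: rows strictly "older" than the cursor (created_at, id).
PAGE_SQL = """
SELECT id, created_at
FROM messages
WHERE thread_id = :thread
  AND (:ts IS NULL
       OR created_at < :ts
       OR (created_at = :ts AND id < :last_id))
ORDER BY created_at DESC, id DESC
LIMIT :limit
"""


def fetch_page(conn: sqlite3.Connection, thread_id: str, cursor=None, limit: int = 50):
    """Return (message ids, next cursor); next cursor is None on a short page."""
    ts, last_id = cursor if cursor is not None else (None, None)
    rows = conn.execute(
        PAGE_SQL,
        {"thread": thread_id, "ts": ts, "last_id": last_id, "limit": limit},
    ).fetchall()
    # A full page may have more rows behind it, so hand back a cursor.
    next_cursor = (rows[-1][1], rows[-1][0]) if len(rows) == limit else None
    return [row[0] for row in rows], next_cursor
```

Unlike OFFSET-based paging, the cost of each page does not grow with the page number, and newly arriving messages do not shift the rows a client has already seen.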
4. Performance & Scalability Patterns
4.1 Rate Limiting (Redis Token Bucket) [cite: 24]
To protect against DDoS and ensure fair usage (NFR-2), implement specific limits:
- Citizens: 10 requests/minute (prevent spamming threads).
- Officials: 1000 requests/minute (allow rapid batch processing).
Implementation Tip: Use a Redis-based "Sliding Window" Lua script. Key format: rate_limit:{user_id}.
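The sliding-window logic from the tip can be sketched in plain Python to make the algorithm concrete. This in-memory version is only an illustration: in production the same steps would need to run atomically inside Redis (for example in a Lua script), and the class name `SlidingWindowLimiter`, the role names, and the injectable clock are assumptions made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable

# Per-role limits taken from the list above (requests per minute).
LIMITS_PER_MINUTE = {"citizen": 10, "official": 1000}


class SlidingWindowLimiter:
    """Sliding-window log: remember request timestamps per key."""

    def __init__(self, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str, role: str) -> bool:
        key = f"rate_limit:{user_id}"  # same key format as in the tip
        now = self._clock()
        hits = self._hits[key]
        # Evict timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self._window:
            hits.popleft()
        if len(hits) >= LIMITS_PER_MINUTE[role]:
            return False  # over the limit; the caller would answer HTTP 429
        hits.append(now)
        return True
```

Injecting the clock keeps the limiter deterministic in tests; a Redis-based variant would typically use a sorted set per key with the timestamp as score.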
4.2 Caching Strategy [cite: 47]
- Routing Rules: Cache RoutingRules in Redis for 1 hour. Invalidate immediately on Admin updates.
- ETags: For GET /documents/{id}, generate an ETag based on the updated_at timestamp (a column the documents table in Section 3 does not yet define and would need to gain). Return 304 Not Modified to save bandwidth if the client has the latest version.
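A minimal sketch of the ETag handling, assuming a timezone-aware updated_at value is available for each document. The helper names `make_etag` and `conditional_status` are hypothetical, and hashing the id together with the timestamp is one possible choice, not a requirement of the architecture.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def make_etag(document_id: str, updated_at: datetime) -> str:
    """Derive a quoted ETag from the document id and its last-modified time."""
    stamp = updated_at.astimezone(timezone.utc).isoformat()
    digest = hashlib.sha256(f"{document_id}:{stamp}".encode()).hexdigest()[:16]
    return f'"{digest}"'


def conditional_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 if the client's If-None-Match header matches, else 200."""
    if if_none_match is None:
        return 200
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    if "*" in candidates:
        return 304
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    normalized = [c[2:] if c.startswith("W/") else c for c in candidates]
    return 304 if current_etag in normalized else 200
```

Because the ETag depends only on metadata, it can be computed without reading the encrypted blob.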
5. Async Export Workflow [cite: 16]
For the POST /exports endpoint:
- API Layer: Validate request -> Publish event ExportRequested to RabbitMQ/Redis Stream -> Return 202 Accepted + jobId.
- Worker:
  - Consume ExportRequested.
  - Fetch encryptedPayload from S3.
  - Fetch Message History from Postgres.
  - Note: The Worker might need a special "Authority Key" to re-encrypt the package for the target eAkte system, depending on the specific crypto-concept.
  - Push result to the Authority's Ingest Interface.
  - Update Job Status to COMPLETED.
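The workflow above can be sketched with asyncio, using an in-process `asyncio.Queue` as a stand-in for RabbitMQ or a Redis Stream. Everything except the ExportRequested event, the 202 response, and the COMPLETED status is an assumption made for illustration: the function names, the PENDING and FAILED statuses, the event fields, and the three injected coroutines that represent S3, Postgres, and the authority's ingest interface.

```python
import asyncio
import uuid

job_status: dict[str, str] = {}  # stand-in for a persistent jobs table


async def request_export(queue: asyncio.Queue, thread_id: str) -> tuple[int, dict]:
    """API layer: enqueue the event and answer immediately with 202."""
    job_id = str(uuid.uuid4())
    job_status[job_id] = "PENDING"  # assumed initial status
    await queue.put({"event": "ExportRequested", "jobId": job_id, "threadId": thread_id})
    return 202, {"jobId": job_id}


async def export_worker(queue: asyncio.Queue, fetch_payload, fetch_messages,
                        push_to_authority) -> None:
    """Worker: consume events until cancelled."""
    while True:
        event = await queue.get()
        job_id = event["jobId"]
        try:
            payload = await fetch_payload(event["threadId"])    # encrypted blob
            messages = await fetch_messages(event["threadId"])  # message history
            await push_to_authority(payload, messages)
            job_status[job_id] = "COMPLETED"
        except Exception:
            job_status[job_id] = "FAILED"  # assumed failure status
        finally:
            queue.task_done()
```

Clients would then poll a job-status endpoint with the returned jobId; retry and dead-letter handling are left out of this sketch.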
6. Definition of Done Checklist
Before deploying to the staging environment, ensure:
- [ ] Load Test: System handles 500 concurrent document uploads without error [cite: 24].
- [ ] Security Audit: Confirm no PII (Aktenzeichen) is logged in plaintext application logs.
- [ ] Cleanup: The "GDPR Reaper" job is active and deletes documents where retention_date < NOW() [cite: 14].
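As an illustration of the cleanup item, a "GDPR Reaper" pass might look like the sketch below. It uses SQLite from the standard library as a stand-in for PostgreSQL and assumes retention dates are stored as UTC ISO-8601 strings; the function name `reap_expired` and the `delete_blob` callback (representing an S3 delete) are hypothetical. Dependent threads and messages would also need to be removed, for example via cascading deletes, which this sketch does not cover.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable


def reap_expired(conn: sqlite3.Connection,
                 delete_blob: Callable[[str], None],
                 now: datetime) -> int:
    """Delete documents whose retention_date has passed; return the count."""
    cutoff = now.astimezone(timezone.utc).isoformat()
    expired = conn.execute(
        "SELECT id, storage_path FROM documents WHERE retention_date < ?",
        (cutoff,),
    ).fetchall()
    for doc_id, storage_path in expired:
        # Delete the blob first: if this fails, the row survives and the
        # next run retries, so no personal data is orphaned in storage.
        delete_blob(storage_path)
        conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
    conn.commit()
    return len(expired)
```

Rows with no retention_date are left untouched, since a NULL comparison never matches.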
Implementation Guide: DirektVermittlungDe
Status: Draft v1.1
Stack: Python / FastAPI
Context: Implements constraints from DvdArchitektur.txt and ADR-007.
1. Technology Stack
- Language: Python 3.11+
- Web Framework: FastAPI (with Uvicorn + Gunicorn)
- Validation: Pydantic V2 (Strict Mode)
- Database ORM: SQLAlchemy (Async) or Tortoise-ORM
- Task Queue: ARQ (Redis-based) or Celery
- Primary DB: PostgreSQL 15+
- Blob Store: MinIO / AWS S3
2. Project Structure & Patterns
Organize the monolithic repo or microservices using "Clean Architecture":
```
/src
    /domain     # Pydantic models & Business Rules (Pure Python)
    /adapters   # DB, S3, External APIs
    /service    # Application Logic
    /api        # FastAPI Routes
    /workers    # Background Job Definitions
```
## 3. The "Hybrid Concurrency" Pattern (Critical)
To meet NFR-1 (<300ms) and NFR-2 (10k sessions), you must not block the Event Loop.
### 3.1 The Rule
- NEVER use time.sleep, requests, or heavy computation (e.g., pypdf, cryptography) inside an async def.
- ALWAYS use await for I/O.
- ALWAYS use loop.run_in_executor for CPU tasks.
### 3.2 Implementation Snippet
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

from fastapi import APIRouter, UploadFile

import some_heavy_crypto_lib  # placeholder name for the actual CPU-bound library

router = APIRouter()

# Create a dedicated pool for CPU tasks
cpu_pool = ProcessPoolExecutor(max_workers=4)


def cpu_bound_decryption(payload: bytes) -> dict:
    # This runs in a separate process, bypassing the GIL.
    # It must stay a module-level function so it can be pickled for the pool.
    return some_heavy_crypto_lib.decrypt_and_parse(payload)


@router.post("/documents")
async def upload_document(file: UploadFile):
    content = await file.read()  # Non-blocking I/O

    # Offload CPU work to the pool
    loop = asyncio.get_running_loop()
    metadata = await loop.run_in_executor(
        cpu_pool,
        cpu_bound_decryption,
        content,
    )
    return {"status": "processed", "meta": metadata}
```
4. Security Implementation
4.1 "Blind" Ingest
- Stream uploads directly to S3 using aiobotocore to avoid loading 50MB PDFs into RAM.
- Do not attempt to read the encryptedPayload in the main web service process.
4.2 Auth Middleware
Use fastapi.security.OAuth2AuthorizationCodeBearer. Implement a dependency that validates the JWT signature using a cached JWKS (JSON Web Key Set) to avoid a network call on every request.
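The cached-JWKS idea can be sketched independently of FastAPI. The class below is a minimal illustration: `JwksCache`, its TTL default, and the injected `fetch` callable (which would perform the HTTP call to the identity provider) are assumptions, and signature verification itself is left to a JWT library. The refresh on an unknown `kid` is intended to cope with key rotation; a real deployment would likely rate-limit it.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Keep the identity provider's key set in memory for ttl_seconds."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: dict[str, dict] = {}
        self._expires_at = float("-inf")  # forces a fetch on first use

    def _refresh(self) -> None:
        jwks = self._fetch()  # expected shape: {"keys": [{"kid": ..., ...}]}
        self._keys = {k["kid"]: k for k in jwks.get("keys", []) if "kid" in k}
        self._expires_at = self._clock() + self._ttl

    def get_key(self, kid: str) -> Optional[dict]:
        """Return the JWK for this key id, or None if the IdP does not list it."""
        refreshed = False
        if self._clock() >= self._expires_at:
            self._refresh()
            refreshed = True
        if kid not in self._keys and not refreshed:
            self._refresh()  # unknown kid: the IdP may have rotated its keys
        return self._keys.get(kid)
```

In an async service the fetch would be awaited (or run in an executor) so that the refresh does not block the event loop.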
5. Database Schema (SQLAlchemy Async)
```python
import uuid
from datetime import datetime

from sqlalchemy import String, UUID, DateTime
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Document(Base):
    __tablename__ = "documents"

    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    reference_number: Mapped[str] = mapped_column(String(50), index=True)
    status: Mapped[str] = mapped_column(String(20), default="RECEIVED")
    # …
```
6. Testing Strategy (Agentic TDD)
- Framework: pytest + pytest-asyncio.
- Mocking: Use respx for mocking external HTTP calls (Authority Systems).
- Database: Use testcontainers-python to spin up a real Postgres for integration tests.
- Prompting the Agent: "Write an async pytest for POST /documents. Use ProcessPoolExecutor mock to verify CPU offloading."
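One way to verify CPU offloading without patching ProcessPoolExecutor is to inject a recording test double. The sketch below is self-contained and uses only the standard library; `RecordingExecutor`, `parse_metadata`, and `handle_upload` are hypothetical names, and the handler is a simplified stand-in for the real route. It relies on `loop.run_in_executor` calling `submit` on whatever executor it is given.

```python
import asyncio
from concurrent.futures import Executor, Future


class RecordingExecutor(Executor):
    """Test double: runs the callable inline and records what was submitted."""

    def __init__(self):
        self.submitted = []

    def submit(self, fn, /, *args, **kwargs):
        self.submitted.append(fn.__name__)
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def parse_metadata(payload: bytes) -> dict:
    """Hypothetical CPU-bound step."""
    return {"size": len(payload)}


async def handle_upload(payload: bytes, pool: Executor) -> dict:
    """Simplified handler: offloads the CPU-bound step to the given pool."""
    loop = asyncio.get_running_loop()
    meta = await loop.run_in_executor(pool, parse_metadata, payload)
    return {"status": "processed", "meta": meta}


def test_upload_offloads_cpu_work():
    pool = RecordingExecutor()
    result = asyncio.run(handle_upload(b"abc", pool))
    assert result == {"status": "processed", "meta": {"size": 3}}
    assert pool.submitted == ["parse_metadata"]  # work went through the pool
```

Under pytest-asyncio the test body would instead be an `async def` marked with `@pytest.mark.asyncio`, awaiting the handler directly.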