init: documentation and prototypes

This commit is contained in:
2025-12-01 22:01:19 +01:00
parent e936fb41fa
commit 45d60fc1a9
51 changed files with 2476 additions and 1 deletions


@@ -0,0 +1,290 @@
DvdArchitektur
Architecture document for Direkt Vermittlung Deutschland
# Direkt Vermittlung Deutschland Architecture
## 1. Functional Requirements
**Description: Core features and user stories**
### 1.1 Core Functions (document-focused)
**FR-1 Capture a letter (Document Intake)**
* Citizens can upload an official letter as a PDF or capture it with the camera.
* Alternative: entering a unique identifier (e.g., case reference (Aktenzeichen) or payment reference (Kassenzeichen) plus the authority).
* The system creates a **DocumentEnvelope** including metadata (case reference, document type, date, issuing authority).
**FR-2 Automatic routing to the responsible unit**
* The platform has a **Routing Engine** that uses metadata and configuration to determine
  * the responsible authority / organizational unit
  * and, ideally, the responsible case officer.
* Fallback rules (e.g., default team, central office) apply when no unambiguous assignment is possible.
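The routing behavior in FR-2 can be illustrated with a minimal Python sketch. It assumes that rules are keyed by authority and by a prefix of the case reference; `RoutingRule`, `route`, `FALLBACK_UNIT`, and the prefix-matching scheme are hypothetical names and choices for illustration, not part of the specification.

```python
from dataclasses import dataclass
from typing import Optional

FALLBACK_UNIT = "central-office"  # hypothetical default team


@dataclass(frozen=True)
class RoutingRule:
    authority_id: str
    reference_prefix: str          # assumed: leading part of the Aktenzeichen
    unit: str
    officer: Optional[str] = None  # the case officer may be unknown


def route(authority_id: str, reference: str, rules: list[RoutingRule]) -> tuple[str, Optional[str]]:
    """Return (unit, officer); fall back when no rule or several equal rules match."""
    matches = [
        r for r in rules
        if r.authority_id == authority_id and reference.startswith(r.reference_prefix)
    ]
    if matches:
        # Prefer the most specific (longest) prefix.
        best_len = max(len(r.reference_prefix) for r in matches)
        best = [r for r in matches if len(r.reference_prefix) == best_len]
        if len(best) == 1:
            return best[0].unit, best[0].officer
    return FALLBACK_UNIT, None


rules = [
    RoutingRule("court-xyz", "12 C", "civil-division-12", "officer-7"),
    RoutingRule("court-xyz", "12", "registry-12"),
]
print(route("court-xyz", "12 C 345/25", rules))
print(route("court-xyz", "99 X 1/25", rules))
```

Returning the fallback unit for ambiguous matches, rather than picking one arbitrarily, keeps a misconfigured rule set from silently sending inquiries to the wrong team.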
**FR-3 Starting an interaction about a letter (Interaction Thread)**
* For a specific letter, citizens can:
  * ask a follow-up question (text message),
  * request a callback,
  * start a direct call (if available),
  * request an appointment.
* For each such interaction, an **InteractionThread** is created for the **DocumentEnvelope**.
**FR-4 Communication & document exchange**
* Within a thread, citizens and the authority can:
  * exchange messages,
  * upload supplementary documents (e.g., evidence, replies),
  * create internal notes (authority-internal, not visible to citizens).
* All activities are logged chronologically.
**FR-5 Call / callback / appointment orchestration**
* The system supports:
  * forwarding incoming calls based on an identifier,
  * creating and managing callback requests,
  * optional appointment scheduling (integration with the authority's calendar).
**FR-6 Temporary retention & personal archive**
* Default: data (letters, threads, files) is kept only until the case is resolved plus a defined inactivity period, and is then deleted.
* Citizens and authorities can optionally book/activate **extended retention** (personal archive).
**FR-7 Data export to authority systems**
* Authorities can export the data of a case (letter, metadata, interaction history, attachments) to their eAkte (electronic case file) / specialist applications (pull or push).
* The export is configurable (which data, which format).
---
### 1.2 User Stories (Examples)
* *As a citizen*, I want to scan or upload an official letter so that I reach the right contact person directly, without researching responsibilities myself.
* *As a case officer*, I want to see all open cases for my authority's letters in one list so that I can handle and prioritize inquiries in a targeted way.
* *As an authority administrator*, I want to configure routing rules (e.g., by case-reference ranges) so that inquiries are automatically assigned to the right organizational unit.
* *As a citizen*, I want an optional longer-term digital archive of important letters and clarification histories so that I can access this information later.
**Why It Matters:**
These functional requirements clearly define **what** DVD does:
* They cleanly separate "document", "interaction", and "routing", which is a good basis for REST/API design.
* They make DVD deliberately **document-driven** (beleggesteuert) rather than generic communication, and they allow later extensions (e.g., AI extraction, other channels) without breaking the core model.
---
## 2. Non-Functional Requirements
**Description: Performance, security, usability SLAs**
### 2.1 Performance & Scaling
* **NFR-1:** 95th percentile of response times for core operations (open thread, send message, routing) < 300 ms under normal load.
* **NFR-2:** The system supports at least **10k concurrent sessions** (authorities + citizens) per region.
* **NFR-3:** Routing decisions are made in < 500 ms (including lookups).
### 2.2 Security & Data Protection
* **NFR-4:** Full **GDPR (DSGVO) compliance**, including
  * right of access, erasure, and rectification,
  * documented retention policies,
  * privacy by design and by default.
* **NFR-5:** End-to-end encryption in transit (TLS 1.2+) and encryption of sensitive data at rest (e.g., case references, PDF documents).
* **NFR-6:** Access only on the basis of role-specific permissions (least privilege, tenant-safe).
### 2.3 Availability & Resilience
* **NFR-7:** Target availability: ≥ 99.5 % annual average.
* **NFR-8:** Disaster recovery concept with RPO ≤ 15 min, RTO ≤ 4 h.
### 2.4 Usability
* **NFR-9:** Low-barrier citizen interface, oriented toward WCAG 2.1 AA.
* **NFR-10:** At most 3 steps from start to a submitted inquiry.
**Why:**
These NFRs drive architecture decisions:
* Asynchronous processing, caching, and horizontal scaling become necessary.
* Strict IAM and encryption layers are mandatory in GovTech environments.
* Clear SLAs prevent a "prototyping architecture" from collapsing in production.
---
## 3. Business Context
**Description: Use cases, target audience, integrations**
### 3.1 Use Cases (Business View)
* Reduce the average **phone time per clarification** (e.g., from 30 to 10 minutes).
* Relieve switchboards and concentrate clarification directly in the specialist departments.
* Create a digital return channel for official letters that is traceable and secure.
### 3.2 Target Groups
* **Primary business customers:**
  * Municipal administrations, court cashier's offices (Gerichtskassen), local courts (Amtsgerichte), state authorities, tax offices.
* **End users:**
  * Citizens (private individuals, companies) with notices / letters.
* **Secondary stakeholders:**
  * Municipal/regional IT service providers, vendors of specialist applications.
### 3.3 Integration Context
* Authorities' eAkte systems (supported through generic exports and custom adapters).
* Identity providers (e.g., BundID, eID, SAML/OIDC-based authority single sign-on).
* Telephone systems / call-center software (for direct connection, callback lists).
* Optional: document management / archive systems.
**Why:**
The business context ensures that you model the system's "nouns" (DocumentEnvelope, InteractionThread, AuthorityUnit) to match real structures. It prevents you from building an "abstract messaging platform" instead of a **document-centric GovTech service**.
---
## 4. Technical Constraints
**Description: Existing systems, tech stack, and standards**
*(I am deliberately formulating proposals here that you can confirm or change later.)*
### 4.1 Architecture Paradigms
* **TC-1:** Service-oriented / modular architecture with clear bounded contexts:
  * Citizen & Channels,
  * Document & Interaction,
  * Routing & Org,
  * Integration & Export.
* **TC-2:** Stateless APIs in the frontend layer, session handling via tokens (JWT/OAuth2) to enable horizontal scaling.
### 4.2 Tech Stack (Proposal / Example)
* **Backend:** e.g., Java (Spring Boot) or Go (widely used in government environments); Node.js is conceivable. Standard: REST APIs, optionally GraphQL.
* **Database:** Relational DB (PostgreSQL) for core objects; blob storage (e.g., S3-compatible) for PDF documents.
* **Caching:** Redis for sessions, routing cache, and frequent lookups.
* **Messaging:** Optional message bus/queue (e.g., Kafka/RabbitMQ) for export jobs and notifications.
### 4.3 Standards & Interfaces
* **TC-3:** REST/JSON APIs as the primary interface; descriptive resource names (e.g., `/documents`, `/threads`, `/exports`).
* **TC-4:** HTTPS/TLS mandatory, HSTS enabled.
* **TC-5:** OpenAPI specification for all external APIs (for authorities and integration partners).
**Why:**
These constraints help rule out wrong technology decisions early and align the system **with scalability and interoperability from the start**. OpenAPI-first makes LLM-assisted code generation and mock APIs easier.
---
## 5. Stakeholder Feedback
**Description: Early input from consumers (e.g., devs using the API)**
### 5.1 Stakeholder Groups for Feedback
* **Authority business side:** case officers and department heads
  * Focus: usability, process logic, responsibility model.
* **Authority IT:** architects, admins, data protection officers
  * Focus: integration capability, security, operating model.
* **Citizen perspective:** pilot test groups
  * Focus: comprehensibility, accessibility, friction at entry (uploading a letter, finding the identifier, etc.).
### 5.2 Feedback Mechanisms
* Interactive mock UIs / clickable prototypes (e.g., Figma) for early usability validation.
* **Mock APIs** (e.g., via Postman, Insomnia, or Swagger UI) for API validation with authority systems.
* Short surveys or interviews (e.g., "What is your biggest problem with follow-up questions about letters?").
### 5.3 Iteration Loops
* An early pilot phase with a small authority/department before the generic rollout.
* Translate change requests into **versioned API designs** (e.g., v1 → v1.1), avoiding "API chaos".
**Why:**
Stakeholder feedback prevents you from developing past the actual needs (e.g., overly complicated workflow control). It is essential for a **clean, *truly* extensible design**, especially because authority IT traditionally consists of complex, organically grown environments.
---
## 6. Existing Assets
**Description: Legacy docs, wireframes, or code snippets**
At the moment, DVD's assets are predominantly **still to be defined**, but here is what you should concretely collect or create (and how it helps the architecture):
### 6.1 Domain Assets
* Sample letters from various authorities (court letters, fee notices, tax assessments, etc.)
  * including the structure of the case references / payment references (masks, examples).
* Process descriptions of how follow-up questions are handled today (as-is process), e.g., BPMN or simple swimlane diagrams.
### 6.2 Model & Design Assets
* First **domain models / UML class diagrams** for:
  * `DocumentEnvelope`, `InteractionThread`, `RoutingRule`, `AuthorityUnit`, `User`.
* Wireframes/mockups for:
  * Citizen interface (upload letter, submit inquiry),
  * Case officer interface (inbox, cases, replies).
### 6.3 Technical Assets
* Sample JSON payloads for central API objects, e.g.:
```json
{
"documentId": "doc-123",
"organisationId": "court-xyz",
"caseNumber": "12 C 345/25",
"documentType": "PaymentNotice",
"issuedAt": "2025-12-01",
"citizenHint": "Bitte geben Sie dieses Aktenzeichen bei Rückfragen an."
}
```
```json
{
"threadId": "thr-789",
"documentId": "doc-123",
"messages": [
{
"messageId": "msg-1",
"from": "citizen",
"content": "Ich habe eine Frage zur Zahlungsfrist.",
"createdAt": "2025-12-02T09:15:00Z"
}
]
}
```
**Why:**
These assets are gold for:
* later LLM support (e.g., "Generate routing rules based on these sample letters"),
* reducing misunderstandings in the domain model and API design,
* faster implementation, because developers see concrete cases.
---


@@ -0,0 +1,22 @@
DirektVermittlungDe
Against scavenger hunts and the telephone game in public administration
# DirektVermittlungDe (DVD)
*Efficient, secure, and direct communication between citizens and authorities.*
Communication with authorities often suffers not from the content, but from overload and cumbersome coordination:
Wrong contact person → switchboard → forwarding → renewed identification → explaining everything again.
The answer itself takes 2 minutes; getting to it takes 30.
DirektVermittlungDe reduces this path to a single step.
The citizen submits the originating document or a unique identifier contained in it (e.g., case reference / payment reference) →
the system determines the responsible department → connects the citizen as directly as possible to the correct case officer.
Instead of parking citizens in hold queues and burdening case officers with forwarding calls to the right person, the system coordinates call transfer, callback, or appointment scheduling and, where needed, allows notes and supplementary documents to be attached.
**Solving problems instead of administering them!**
xxx

175
docs/api_docs.md Normal file

@@ -0,0 +1,175 @@
# API Documentation: Core Scenarios & Implementation Guide
**Version:** 1.0
**Target Audience:** Frontend Developers (Citizen App), Backend Integrators (Authority Systems)
**Base URL:** `https://api.direktvermittlung.de/v1`
## Overview
This API enables the Direktvermittlung workflow: receiving documents, routing them to the correct authority, and facilitating direct communication. It is designed to be **stateless**, **secure** (E2E encryption), and **scalable** (10k+ concurrent sessions).
-----
## 0\. Authentication
All endpoints require a valid OAuth2 Bearer Token.
* **Citizens** use `citizen:write` scope (via BundID/eID).
* **Officials** use `official:read`/`official:write` scopes (via Authority SSO).
-----
## Scenario 1: The "Digital Intake" (Document Submission)
**User Story:** A citizen scans a tax assessment letter to find the right contact person without manual searching.
### Implementation Logic
1. **Metadata Separation:** The client must extract non-sensitive routing data (Authority Name, Reference Number) as plaintext metadata.
2. **Encryption:** The actual PDF content (`encryptedPayload`) is encrypted on the client side before upload to meet NFR-5 (E2E Encryption).
3. **Routing:** The backend Routing Engine uses the plaintext metadata to assign the `DocumentEnvelope` to the correct unit.
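The three steps above can be sketched from the client's point of view. This is a minimal Python illustration that assumes the PDF has already been encrypted elsewhere; `build_document_request` and `ROUTING_FIELDS` are hypothetical names, and treating all four metadata fields as mandatory is an assumption rather than something the API defines.

```python
import base64
import json

# Assumed whitelist of plaintext routing fields (taken from the request example).
ROUTING_FIELDS = ("authorityId", "referenceNumber", "docType", "issuedAt")


def build_document_request(metadata: dict, ciphertext: bytes) -> str:
    """Build the POST /documents body from plaintext metadata and an already-encrypted PDF."""
    missing = [f for f in ROUTING_FIELDS if f not in metadata]
    if missing:
        raise ValueError(f"missing routing metadata: {missing}")
    body = {
        # Only whitelisted routing fields leave the client in plaintext.
        "metadata": {f: metadata[f] for f in ROUTING_FIELDS},
        # The ciphertext is opaque to the server; Base64 makes it JSON-safe.
        "encryptedPayload": base64.b64encode(ciphertext).decode("ascii"),
    }
    return json.dumps(body, ensure_ascii=False)


body = build_document_request(
    {
        "authorityId": "Finanzamt-München-I",
        "referenceNumber": "123/456/789",
        "docType": "NOTICE",
        "issuedAt": "2025-10-25T00:00:00Z",
        "note": "never sent in plaintext",  # dropped by the whitelist
    },
    b"\x00\x01already-encrypted-bytes",
)
```

Whitelisting the metadata fields on the client means that any extra, possibly sensitive attribute is dropped before the request is serialized.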
### API Request: `POST /documents`
```json
{
"metadata": {
"authorityId": "Finanzamt-München-I",
"referenceNumber": "123/456/789",
"docType": "NOTICE",
"issuedAt": "2025-10-25T00:00:00Z"
},
"encryptedPayload": "BASE64_ENCRYPTED_BLOB_..."
}
```
### Success Response (`201 Created`)
```json
{
"id": "doc-882291",
"status": "ROUTED",
"assignedUnit": "Steuerfestsetzung-Team-B"
}
```
> **Note:** A populated `assignedUnit` shows that the Routing Engine mapped the request synchronously; the target for this decision is \< 500 ms (NFR-3).
-----
## Scenario 2: Starting a Clarification Thread
**User Story:** The citizen has a question about the document and wants to start a chat or request a callback.
### Implementation Logic
1. **Context:** The thread is explicitly linked to the `documentId` (`doc-882291`), creating a "Subject-Context" binding.
2. **Type Selection:** The user selects the channel: `TEXT_CHAT`, `CALLBACK_REQUEST`, or `APPOINTMENT`.
### API Request: `POST /documents/doc-882291/threads`
```json
{
"type": "CALLBACK_REQUEST",
"initialMessage": "I do not understand the calculation on page 2. Please call me.",
"preferredTimeSlot": "2025-10-28T14:00:00Z"
}
```
### Success Response (`201 Created`)
```json
{
"threadId": "th-9912",
"status": "PENDING_OFFICIAL",
"estimatedWaitTime": "4h"
}
```
-----
## Scenario 3: Real-time Communication & History
**User Story:** An official replies to the inquiry, and the citizen views the chat history.
### Implementation Logic
1. **Performance:** To support fast loading (NFR-1 \< 300ms), we use **cursor-based pagination** for messages.
2. **Polling/Updates:** The frontend polls this endpoint (or uses a WebSocket subscription, if extended) to show new messages.
### API Request: `GET /threads/th-9912/messages`
*Query Parameters:* `?limit=20&before=2025-10-27T10:00:00Z`
### Success Response (`200 OK`)
```json
{
"data": [
{
"id": "msg-552",
"senderRole": "OFFICIAL",
"content": "encrypted_content_string...",
"timestamp": "2025-10-27T09:45:00Z"
},
{
"id": "msg-551",
"senderRole": "CITIZEN",
"content": "encrypted_content_string...",
"timestamp": "2025-10-27T09:30:00Z"
}
],
"paging": {
"nextCursor": "2025-10-27T09:30:00Z"
}
}
```
-----
## Scenario 4: Data Export (Authority Integration)
**User Story:** The issue is resolved, and the authority imports the chat history and document into their eAkte system.
### Implementation Logic
1. **Asynchronous Processing:** Since exports can be large (PDFs + Chat Logs), the API returns immediately with a `jobId`.
2. **Format:** The export format is configurable (e.g., PDF summary of chat + original attachments).
### API Request: `POST /exports`
```json
{
"caseId": "doc-882291",
"targetSystem": "eAkte-Standard-V2",
"includeAttachments": true
}
```
### Success Response (`202 Accepted`)
```json
{
"jobId": "job-5512",
"status": "QUEUED",
"statusUrl": "/exports/job-5512"
}
```
-----
## Error Handling Standards
To ensure clarity for developers, the API uses standard HTTP codes:
* `400 Bad Request`: Validation failed (e.g., missing Aktenzeichen).
* `404 Not Found`: Document or Thread ID does not exist.
* `429 Too Many Requests`: Rate limit exceeded (protects the performance NFRs).
* `503 Service Unavailable`: Maintenance or backend overload.
xxx

125
docs/decisions.md Normal file

@@ -0,0 +1,125 @@
# Decision Log: DirektVermittlungDe
**Status:** Approved
**Date:** 2025-12-01
**Context:** [DvdArchitektur.txt], [Introduction.txt]
---
## ADR-001: Separation of Metadata and Content Payload
**Context:**
The system requires an automated **Routing Engine** to assign documents to the correct authority. However, strict **End-to-End Encryption (E2E)** is required for sensitive data in transit and at rest. The server cannot route what it cannot read.
**Decision:**
We will implement a **Split-Payload Model**:
1. **Metadata (Plaintext):** A defined set of non-sensitive routing criteria (e.g., Authority ID, Reference/Aktenzeichen, Document Type) is sent as unencrypted JSON (over TLS).
2. **Payload (Encrypted):** The actual PDF document and sensitive message content are encrypted on the client side and stored as opaque blobs.
**Rationale:**
* **Compliance:** Satisfies NFR-5 (Privacy/Encryption) while enabling FR-2 (Auto-routing).
* **Performance:** The Routing Engine operates on lightweight JSON metadata (< 500ms target) without needing to decrypt/encrypt heavy files.
---
## ADR-002: Stateless Authentication via OAuth2/JWT
**Context:**
The system must support **10k+ concurrent sessions** per region and allow horizontal scaling (TC-2). Traditional server-side sessions (sticky sessions) would hinder scalability.
**Decision:**
Use **OAuth2 with OpenID Connect (OIDC)** and stateless **JWTs (JSON Web Tokens)** for session handling.
* **Citizens:** Authenticate via BundID/eID.
* **Officials:** Authenticate via Authority SSO (SAML/OIDC integration).
**Rationale:**
* **Scalability:** Allows the backend to be purely stateless; any instance can service any request (TC-2).
* **Security:** Scopes (e.g., `citizen:write`, `official:read`) map directly to the least-privilege NFR-6.
---
## ADR-003: Pagination Strategy for Interaction Threads
**Context:**
Interaction threads (FR-3) can grow long over time. The NFR-1 target is a response time of **< 300ms** for core operations. Standard page/offset pagination slows down as the offset grows, because the database must scan and discard all skipped rows, and it handles real-time updates poorly: rows shift between pages when new messages arrive.
**Decision:**
* **Cursor-based Pagination:** Used for `InteractionThreads` (chat history). The cursor will be the `timestamp` of the message.
* **Offset-based Pagination:** Retained for the **Officials Case List/Inbox** (case officer user story in section 1.2), where users expect to "jump to page 2".
**Rationale:**
* **Performance:** With an index on `(thread_id, created_at)`, a cursor seek is an index lookup whose cost does not depend on how deep into the thread the page lies, which keeps the 300ms target realistic regardless of thread length.
* **Usability:** Prevents "missing messages" or duplicates if new messages arrive while a user is scrolling (essential for chat).
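As an illustration of the cursor approach, here is a small in-memory Python sketch. `Message` and `page_before` are hypothetical names; a real implementation would most likely push the same filter into SQL (`WHERE created_at < :before ORDER BY created_at DESC LIMIT :limit`).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Message:
    id: str
    timestamp: datetime


def page_before(messages: list[Message], before: Optional[datetime], limit: int):
    """Return (page, next_cursor) with the newest messages first."""
    ordered = sorted(messages, key=lambda m: m.timestamp, reverse=True)
    if before is not None:
        ordered = [m for m in ordered if m.timestamp < before]
    page = ordered[:limit]
    # A full page may have more behind it; hand out the oldest timestamp as cursor.
    next_cursor = page[-1].timestamp if len(page) == limit else None
    return page, next_cursor


base = datetime(2025, 10, 27, 9, 0, tzinfo=timezone.utc)
thread = [Message(f"msg-{i}", base + timedelta(minutes=i)) for i in range(5)]
first, cursor = page_before(thread, None, 2)
second, _ = page_before(thread, cursor, 2)
```

Because the second page is defined relative to a timestamp rather than a row offset, a message that arrives between the two calls does not shift its contents. One caveat of a pure timestamp cursor: two messages with identical timestamps at a page boundary could be skipped, so a composite `(timestamp, id)` cursor may be worth considering.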
---
## ADR-004: Asynchronous Processing for Data Exports
**Context:**
Authorities need to export data (PDFs + History) to their eAkte systems (FR-7). Generating these packages is resource-intensive and unpredictable in duration, which risks timing out a synchronous HTTP request.
**Decision:**
Implement the **Asynchronous Request-Reply Pattern**.
1. Client POSTs to `/exports` and receives `202 Accepted` + `Job-ID`.
2. Background workers (via Message Queue) process the PDF assembly.
3. Client polls for completion or receives a webhook.
**Rationale:**
* **Resilience:** Prevents blocking the main API threads, protecting the availability goal of ≥ 99.5%.
* **User Experience:** Provides immediate feedback to the official instead of a loading spinner that might freeze.
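The request-reply flow can be sketched with `asyncio` alone. In this hypothetical illustration an in-process `asyncio.Queue` stands in for the message queue and a dictionary stands in for the job store; the `PROCESSING` and `FAILED` states and all function names are assumptions.

```python
import asyncio
import uuid

jobs: dict[str, str] = {}  # job_id -> status; stands in for a persistent job store


async def submit_export(queue: asyncio.Queue, case_id: str) -> dict:
    """Handler for POST /exports: enqueue and answer immediately (202 Accepted)."""
    job_id = f"job-{uuid.uuid4().hex[:8]}"
    jobs[job_id] = "QUEUED"
    await queue.put((job_id, case_id))
    return {"jobId": job_id, "status": "QUEUED", "statusUrl": f"/exports/{job_id}"}


async def export_worker(queue: asyncio.Queue) -> None:
    """Background worker: picks up jobs and records their outcome."""
    while True:
        job_id, case_id = await queue.get()
        jobs[job_id] = "PROCESSING"
        try:
            await asyncio.sleep(0)  # placeholder for the real package assembly
            jobs[job_id] = "COMPLETED"
        except Exception:
            jobs[job_id] = "FAILED"
        finally:
            queue.task_done()


async def demo() -> tuple[str, str]:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(export_worker(queue))
    accepted = await submit_export(queue, "doc-882291")
    status_at_accept = accepted["status"]
    await queue.join()  # a real client would poll statusUrl instead
    worker.cancel()
    return status_at_accept, jobs[accepted["jobId"]]
```

The handler returns before any export work has started, which is the property that keeps request latency independent of package size.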
---
## ADR-005: Resource Naming and Structure
**Context:**
The API must be intuitive ("Clean API") and extensible. The domain model includes "Documents", "Threads", and "Routing".
**Decision:**
Adopt a **Document-Centric REST hierarchy**:
* Root: `/documents` (The core "Envelope").
* Sub-resource: `/documents/{id}/threads` (The communication context).
* **Strict Nouns:** Use `/documents` instead of `/uploadDocument`.
**Rationale:**
* **Alignment:** Matches the architectural definition of DVD being "beleggesteuert" (document-driven), not just a generic chat app.
* **Extensibility:** Allows adding new sub-resources (e.g., `/documents/{id}/audit-log` or `/documents/{id}/appeals`) without breaking the root model.
---
## ADR-006: Data Retention & Deletion
**Context:**
Standard behavior requires data deletion after the case is closed (FR-6). However, users can opt for a "Personal Archive".
**Decision:**
Implement a **TTL (Time-To-Live) Engine** on the database rows.
* **Default:** `deletionDate` = `closedAt` + `gracePeriod`.
* **Archive Option:** If `personalArchive` is active, `deletionDate` is set to `null` or an extended timestamp.
**Rationale:**
* **GDPR Compliance:** Ensures Privacy-by-Design (NFR-4) by automating the "Right to Erasure" / data economy principles.
* **Automation:** Reduces administrative overhead for cleanup.
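A small Python sketch of the retention rule follows. The 30-day grace period and the 365-day archive extension are placeholder values, and the sketch picks the "extended timestamp" variant of the archive option; `deletion_date` and `is_expired` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(days=30)        # assumed inactivity period
ARCHIVE_EXTENSION = timedelta(days=365)  # assumed duration of the personal archive


def deletion_date(closed_at: Optional[datetime], personal_archive: bool) -> Optional[datetime]:
    """Return when a record becomes eligible for deletion, or None while the case is open."""
    if closed_at is None:
        return None
    extra = ARCHIVE_EXTENSION if personal_archive else timedelta(0)
    return closed_at + GRACE_PERIOD + extra


def is_expired(closed_at: Optional[datetime], personal_archive: bool, now: datetime) -> bool:
    """True when the cleanup job may delete the record."""
    due = deletion_date(closed_at, personal_archive)
    return due is not None and due <= now


closed = datetime(2025, 12, 1, tzinfo=timezone.utc)
print(deletion_date(closed, personal_archive=False))
```

Using an extended timestamp instead of `null` means even archived records keep a finite retention date, which may be easier to justify under documented retention policies.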
---
## ADR-007: Python & ProcessPoolExecutor for Backend Services
**Context:**
The implementation team utilizes Agentic Coding (LLM-driven TDD). While Go and Java offer strong native concurrency, the team expects higher development velocity with LLM assistants in Python. However, Python's GIL risks blocking the event loop during CPU-intensive tasks (Encryption, PDF merging).
**Decision:**
Implement the Service Layer in **Python (FastAPI)** with a strict **Hybrid Concurrency Pattern**:
1. **I/O (DB/Network):** Native `async`/`await`.
2. **CPU (Crypto/PDF):** Must be offloaded to a `ProcessPoolExecutor`.
**Rationale:**
* **Velocity:** Maximizes the efficiency of AI coding assistants (Claude/GPT).
* **Performance:** `ProcessPoolExecutor` bypasses the GIL, ensuring the main loop remains non-blocking for the 10k concurrent connections.
* **Ecosystem:** Access to superior Python-based libraries for potential future AI features (classification/extraction).
xxx


@@ -0,0 +1,260 @@
# Implementation Guide: DirektVermittlungDe
While the API Documentation explains *how to use* the API, this guide explains *how to build* it, focusing on the backend engineering challenges like encryption handling, database schema, and performance patterns defined in your architecture.
---
**Status:** Draft v1.0
**Target Audience:** Backend Engineering Team
**Context:** Implements constraints from `DvdArchitektur.txt`
---
## 1\. Technology Stack & Setup
Based on the architectural constraints, the recommended reference stack is:
* **Service Layer:** Java (Spring Boot 3.x) or Go (Gin/Echo) for high-concurrency performance.
* **Primary Database:** PostgreSQL 15+ (Relational data for Routing/Threads).
* **Blob Storage:** S3-Compatible Storage (AWS S3 / MinIO) for encrypted PDF payloads.
* **Cache/PubSub:** Redis 7.x (Session store, Rate limiting, Async Job queues).
### Project Structure (Bounded Contexts)
Organize the codebase into modules matching the architecture:
* `dvd-intake-service`: Handles `/documents` and Metadata extraction.
* `dvd-communication-service`: Handles `/threads` and `/messages`.
* `dvd-routing-engine`: The logic component for assigning units.
* `dvd-export-worker`: Async background workers for eAkte exports.
-----
## 2\. Security Implementation Details
### 2.1 Handling "Blind" E2E Encryption
The backend **must not** attempt to decrypt the `encryptedPayload`.
* **Ingest:** Receive the `encryptedPayload` (Base64 or binary multipart). Stream it directly to S3 Blob Storage. Do not load the full file into RAM to avoid OOM on large files.
* **Metadata:** Only persist the `metadata` JSON object to PostgreSQL for routing logic.
* **Validation:** Verify that the `encryptedPayload` is a well-formed encrypted container (e.g., check the OpenPGP packet header or the envelope format's version marker), but treat the content as opaque. Raw AES ciphertext carries no header, so this check presupposes a defined container format.
### 2.2 Stateless Authentication (OAuth2)
* **Gateway Layer:** Implement a centralized API Gateway (e.g., Spring Cloud Gateway / Nginx) that validates JWT signatures (JWKS) from BundID (Citizens) and Authority-IDP (Officials).
* **Context Propagation:** Extract the `sub` (User ID) and `scope` from the JWT and pass them to downstream microservices via HTTP Headers (e.g., `X-User-Id`, `X-User-Role`).
-----
## 3\. Database Schema Recommendations (PostgreSQL)
Map the domain objects to the following relational schema.
**Table: `documents`**
```sql
CREATE TABLE documents (
    id UUID PRIMARY KEY,
    reference_number VARCHAR(50) NOT NULL, -- "Aktenzeichen"
    authority_id VARCHAR(50) NOT NULL, -- Routing target
    status VARCHAR(20) DEFAULT 'RECEIVED', -- RECEIVED, ROUTED, ASSIGNED
    storage_path VARCHAR(255) NOT NULL, -- S3 Key for encrypted blob
    created_at TIMESTAMPTZ DEFAULT NOW(),
    retention_date TIMESTAMPTZ -- For GDPR auto-deletion
);
CREATE INDEX idx_docs_authority ON documents(authority_id, status);
```
**Table: `threads`**
```sql
CREATE TABLE threads (
    id UUID PRIMARY KEY,
    document_id UUID REFERENCES documents(id),
    type VARCHAR(20) NOT NULL, -- TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT
    assigned_official_id VARCHAR(100), -- Nullable until claimed
    last_activity_at TIMESTAMPTZ
);
```
**Table: `messages`**
```sql
CREATE TABLE messages (
id UUID PRIMARY KEY,
thread_id UUID REFERENCES threads(id),
sender_role VARCHAR(20) NOT NULL,
content_blob TEXT NOT NULL, -- Encrypted content
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Efficient Cursor Pagination: Index on (thread_id, created_at)
CREATE INDEX idx_msgs_thread_time ON messages(thread_id, created_at DESC);
```
-----
## 4\. Performance & Scalability Patterns
### 4.1 Rate Limiting (Redis Sliding Window)
To protect against DDoS and ensure fair usage (NFR-2), implement specific limits:
* **Citizens:** 10 requests/minute (prevent spamming threads).
* **Officials:** 1000 requests/minute (allow rapid batch processing).
*Implementation Tip:* Use a Redis-based "Sliding Window" Lua script. Key format: `rate_limit:{user_id}`.
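The sliding-window logic itself is independent of Redis and can be sketched in plain Python. The class below is a hypothetical in-memory stand-in for illustration only; it is not the Lua script, and a multi-instance deployment would still need the shared Redis state.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """In-memory stand-in for the Redis-backed limiter."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[f"rate_limit:{user_id}"]
        # Drop timestamps that have left the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would answer 429 Too Many Requests
        hits.append(now)
        return True


citizen_limiter = SlidingWindowLimiter(limit=10)
accepted = [citizen_limiter.allow("citizen-1", now=float(i)) for i in range(10)]
```

Unlike a fixed one-minute counter, the window moves with each request, so a burst straddling a minute boundary cannot double the effective limit.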
### 4.2 Caching Strategy
* **Routing Rules:** Cache `RoutingRules` in Redis for 1 hour. Invalidate immediately on Admin updates.
* **ETags:** For `GET /documents/{id}`, generate an ETag based on the `updated_at` timestamp. Return `304 Not Modified` to save bandwidth if the client has the latest version.
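The ETag handling could look roughly like the following Python sketch. `make_etag` and `conditional_status` are hypothetical helpers; the sketch assumes the document row carries an `updated_at` timestamp and, for brevity, ignores weak validators (`W/"..."`).

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def make_etag(document_id: str, updated_at: datetime) -> str:
    """Derive a quoted ETag from the document id and its last-modified timestamp."""
    raw = f"{document_id}:{updated_at.isoformat()}".encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest()[:16] + '"'


def conditional_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 when the client's cached version is current, else 200."""
    if if_none_match is None:
        return 200
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    if "*" in candidates or current_etag in candidates:
        return 304
    return 200


v1 = make_etag("doc-882291", datetime(2025, 10, 27, 9, 45, tzinfo=timezone.utc))
v2 = make_etag("doc-882291", datetime(2025, 10, 27, 10, 0, tzinfo=timezone.utc))
```

Hashing the timestamp rather than exposing it keeps the header opaque, and any change to `updated_at` yields a different tag, so a stale client copy falls through to a full `200` response.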
-----
## 5\. Async Export Workflow
For the `POST /exports` endpoint:
1. **API Layer:** Validate request -\> Publish event `ExportRequested` to RabbitMQ/Redis Stream -\> Return `202 Accepted` + `jobId`.
2. **Worker:**
* Consume `ExportRequested`.
* Fetch `encryptedPayload` from S3.
* Fetch Message History from Postgres.
* *Note:* The Worker might need a special "Authority Key" to re-encrypt the package for the target eAkte system, depending on the specific crypto-concept.
* Push result to the Authority's Ingest Interface.
* Update Job Status to `COMPLETED`.
-----
## 6\. Definition of Done Checklist
Before deploying to the staging environment, ensure:
* [ ] **Load Test:** System handles 500 concurrent document uploads without error.
* [ ] **Security Audit:** Confirm no PII (Aktenzeichen) is logged in plaintext application logs.
* [ ] **Cleanup:** The "GDPR Reaper" job is active and deletes `documents` where `retention_date < NOW()`.
# Implementation Guide: DirektVermittlungDe
**Status:** Draft v1.1
**Stack:** Python / FastAPI
**Context:** Implements constraints from `DvdArchitektur.txt` and ADR-007.
## 1. Technology Stack
* **Language:** Python 3.11+
* **Web Framework:** FastAPI (with Uvicorn + Gunicorn)
* **Validation:** Pydantic V2 (Strict Mode)
* **Database ORM:** SQLAlchemy (Async) or Tortoise-ORM
* **Task Queue:** ARQ (Redis-based) or Celery
* **Primary DB:** PostgreSQL 15+
* **Blob Store:** MinIO / AWS S3
## 2. Project Structure & Patterns
Organize the monolithic repo or microservices using "Clean Architecture":
```text
/src
  /domain      # Pydantic models & Business Rules (Pure Python)
  /adapters    # DB, S3, External APIs
  /service     # Application Logic
  /api         # FastAPI Routes
  /workers     # Background Job Definitions
```
## 3. The "Hybrid Concurrency" Pattern (Critical)
To meet NFR-1 (<300ms) and NFR-2 (10k sessions), you must not block the Event Loop.
### 3.1 The Rule
- NEVER use `time.sleep`, `requests`, or heavy computation (e.g., `pypdf`, `cryptography`) inside an `async def`.
- ALWAYS use `await` for I/O.
- ALWAYS use `loop.run_in_executor` for CPU tasks.
### 3.2 Implementation Snippet
```python
import asyncio
import hashlib
from concurrent.futures import ProcessPoolExecutor

from fastapi import APIRouter, UploadFile

router = APIRouter()

# Dedicated pool for CPU-bound tasks; worker processes are started lazily.
cpu_pool = ProcessPoolExecutor(max_workers=4)


def cpu_bound_inspect(payload: bytes) -> dict:
    # Runs in a separate process, so it does not hold the web process's GIL.
    # Placeholder for heavy work (container validation, PDF assembly);
    # the encrypted payload itself stays opaque to the backend.
    return {"sha256": hashlib.sha256(payload).hexdigest(), "size": len(payload)}


@router.post("/documents")
async def upload_document(file: UploadFile):
    content = await file.read()  # Non-blocking I/O (reads the whole upload into memory)

    # Offload CPU work to the pool so the event loop stays responsive
    loop = asyncio.get_running_loop()
    metadata = await loop.run_in_executor(cpu_pool, cpu_bound_inspect, content)
    return {"status": "processed", "meta": metadata}
```
## 4. Security Implementation
### 4.1 "Blind" Ingest
- Stream uploads directly to S3 using aiobotocore to avoid loading 50MB PDFs into RAM.
- Do not attempt to read the encryptedPayload in the main web service process.
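One way to keep memory bounded is to re-chunk the incoming stream into fixed-size parts and hand each part to a multipart upload as soon as it is full. The sketch below shows only that buffering logic: `PartSink`, `InMemorySink`, and `fake_upload` are hypothetical stand-ins (a real adapter would wrap an S3 multipart upload, for example via aiobotocore), and the 5 MiB default reflects S3's minimum size for every part except the last.

```python
import asyncio
from typing import AsyncIterator, Protocol

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last


class PartSink(Protocol):
    """Hypothetical interface; a real adapter would wrap an S3 multipart upload."""

    async def upload_part(self, part_number: int, data: bytes) -> None: ...


async def stream_in_parts(
    chunks: AsyncIterator[bytes],
    sink: PartSink,
    part_size: int = MIN_PART_SIZE,
) -> int:
    """Forward a byte stream as parts of part_size; returns the number of parts."""
    buffer = bytearray()
    part_number = 0
    async for chunk in chunks:
        buffer.extend(chunk)
        # Flush every full part so the buffer never grows far beyond one part.
        while len(buffer) >= part_size:
            part_number += 1
            await sink.upload_part(part_number, bytes(buffer[:part_size]))
            del buffer[:part_size]
    if buffer:
        # The final part may be smaller than part_size.
        part_number += 1
        await sink.upload_part(part_number, bytes(buffer))
    return part_number


class InMemorySink:
    """Test double that records part sizes instead of talking to S3."""

    def __init__(self) -> None:
        self.part_sizes: list = []

    async def upload_part(self, part_number: int, data: bytes) -> None:
        self.part_sizes.append(len(data))


async def fake_upload(total: int, chunk_size: int):
    # Simulates a request body arriving in small chunks.
    sent = 0
    while sent < total:
        n = min(chunk_size, total - sent)
        yield b"x" * n
        sent += n
```

With this shape, peak memory per upload is roughly one part plus one incoming chunk, regardless of the size of the PDF.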
### 4.2 Auth Middleware
Use `fastapi.security.OAuth2AuthorizationCodeBearer`. Implement a dependency that validates the JWT signature against a cached JWKS (JSON Web Key Set), so that token validation does not require a network call on every request.
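The caching part of such a dependency might look like the sketch below. It is an illustration under stated assumptions: `JwksCache`, `fetch_jwks`, and the TTL value are hypothetical, the fetch function is injected so that no particular HTTP client is implied, and signature verification itself is left to a JWT library.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


class JwksCache:
    """Caches JWKS entries by key ID (kid) for ttl_seconds."""

    def __init__(
        self,
        fetch_jwks: Callable[[], Awaitable[dict]],
        ttl_seconds: float = 3600.0,
    ) -> None:
        self._fetch_jwks = fetch_jwks  # hypothetical: performs the HTTP call
        self._ttl = ttl_seconds
        self._keys: dict = {}
        self._expires_at = 0.0
        self._lock = asyncio.Lock()

    async def get_key(self, kid: str) -> Optional[dict]:
        # Fast path: no network call while the cache is fresh and has the key.
        if time.monotonic() < self._expires_at and kid in self._keys:
            return self._keys[kid]
        async with self._lock:
            # Re-check after acquiring the lock; another task may have refreshed.
            if time.monotonic() >= self._expires_at or kid not in self._keys:
                jwks = await self._fetch_jwks()
                self._keys = {key["kid"]: key for key in jwks.get("keys", [])}
                self._expires_at = time.monotonic() + self._ttl
        return self._keys.get(kid)
```

An unknown `kid` triggers a refetch, which covers key rotation at the identity provider; a production version would likely rate-limit that path so that tokens with random key IDs cannot force a network call per request.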
## 5. Database Schema (SQLAlchemy Async)
```python
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy import String, UUID, DateTime
import uuid
from datetime import datetime


class Base(DeclarativeBase):
    pass


class Document(Base):
    __tablename__ = "documents"

    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    reference_number: Mapped[str] = mapped_column(String(50), index=True)
    status: Mapped[str] = mapped_column(String(20), default="RECEIVED")
    # …
```
## 6. Testing Strategy (Agentic TDD)
- Framework: pytest + pytest-asyncio.
- Mocking: Use respx for mocking external HTTP calls (Authority Systems).
- Database: Use testcontainers-python to spin up a real Postgres for integration tests.
- Prompting the Agent: "Write an async pytest for POST /documents. Use ProcessPoolExecutor mock to verify CPU offloading."
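One way to verify offloading without patching `ProcessPoolExecutor` is to inject the executor and pass a recording fake in tests. The sketch below uses only the standard library; `RecordingExecutor`, `extract_metadata`, and `process_upload` are hypothetical names, and in the real suite the assertions would live in a pytest-asyncio test.

```python
import asyncio
from concurrent.futures import Executor, Future


class RecordingExecutor(Executor):
    """Fake executor: runs the function inline and records what was submitted."""

    def __init__(self) -> None:
        self.submitted: list = []

    def submit(self, fn, /, *args, **kwargs) -> Future:
        self.submitted.append(fn.__name__)
        future: Future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def extract_metadata(payload: bytes) -> dict:
    # Hypothetical CPU-bound step.
    return {"size": len(payload)}


async def process_upload(payload: bytes, executor: Executor) -> dict:
    # The executor is injected, so tests can observe what gets offloaded.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, extract_metadata, payload)
```

A test can then assert both the result and that `extract_metadata` went through the executor rather than being called directly in the coroutine.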

189
docs/openapi.yaml Normal file
View File

@@ -0,0 +1,189 @@
openapi: 3.1.0
info:
  title: DirektVermittlungDe API
  version: 1.0.0
  description: |
    API for efficient communication between citizens and authorities.
    Key features: Document Intake, Auto-Routing, Interaction Threads.
servers:
  - url: https://api.direktvermittlung.de/v1
    description: Production Server
components:
  securitySchemes:
    OAuth2:
      type: oauth2
      flows:
        authorizationCode:
          authorizationUrl: https://id.bund.de/auth
          tokenUrl: https://api.direktvermittlung.de/oauth/token
          scopes:
            citizen:write: "Submit documents and inquiries"
            official:read: "View assigned cases"
            official:write: "Respond to inquiries"
  schemas:
    DocumentEnvelope:
      type: object
      required: [metadata]
      properties:
        id:
          type: string
          format: uuid
          readOnly: true
        metadata:
          description: Plaintext metadata required for the Routing Engine.
          type: object
          required: [authorityId, docType]
          properties:
            authorityId:
              type: string
              example: "Finanzamt-München-I"
            referenceNumber:
              type: string
              description: Case or payment reference number (Aktenzeichen/Kassenzeichen).
            docType:
              type: string
              enum: [NOTICE, COURT_ORDER, GENERAL_INQUIRY]
            issuedAt:
              type: string
              format: date-time
        encryptedPayload:
          type: string
          contentEncoding: base64
          description: E2E encrypted PDF content or scan.
        status:
          type: string
          enum: [RECEIVED, ROUTED, ASSIGNED, CLOSED]
          readOnly: true
    InteractionThread:
      type: object
      properties:
        id:
          type: string
          format: uuid
        documentId:
          type: string
          format: uuid
        type:
          type: string
          enum: [TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT]
        status:
          type: string
          enum: [OPEN, PENDING_OFFICIAL, PENDING_CITIZEN, RESOLVED]
    Message:
      type: object
      properties:
        id:
          type: string
        senderRole:
          type: string
          enum: [CITIZEN, OFFICIAL]
        content:
          type: string
          description: Encrypted message content.
        timestamp:
          type: string
          format: date-time
paths:
  /documents:
    post:
      summary: Upload a new document (FR-1)
      description: Creates a DocumentEnvelope. The Routing Engine will process the metadata asynchronously.
      security:
        - OAuth2: [citizen:write]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/DocumentEnvelope'
      responses:
        '201':
          description: Document Created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DocumentEnvelope'
  /documents/{id}/threads:
    post:
      summary: Start an interaction (FR-3)
      description: Initiate a callback request, text inquiry, or appointment.
      security:
        - OAuth2: [citizen:write]
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: string
      requestBody:
        content:
          application/json:
            schema:
              type: object
              required: [type]
              properties:
                type:
                  type: string
                  enum: [TEXT_CHAT, CALLBACK_REQUEST, APPOINTMENT]
                initialMessage:
                  type: string
      responses:
        '201':
          description: Thread Started
  /threads/{threadId}/messages:
    get:
      summary: Get message history (FR-4)
      description: Uses cursor-based pagination for chat history.
      parameters:
        - name: threadId
          in: path
          required: true
          schema:
            type: string
        - name: limit
          in: query
          schema:
            type: integer
            default: 20
        - name: before
          in: query
          schema:
            type: string
            format: date-time
      responses:
        '200':
          description: Message history
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Message'
  /exports:
    post:
      summary: Export data to authority system (FR-7)
      description: Triggers an async export job to the eAkte.
      security:
        - OAuth2: [official:read]
      responses:
        '202':
          description: Export Accepted
          content:
            application/json:
              schema:
                type: object
                properties:
                  jobId:
                    type: string
                  status:
                    type: string
                    example: "QUEUED"