# Initial Runtime Architecture
## Status
This is the first implementation architecture for the mailbox evidence scanner
slice. It is intentionally small and stdlib-only so the repo can run before a
larger service stack is chosen.
## Service Boundary
The first slice is a CLI scanner:
```text
email-connect scan-mailbox --config config/mailbox.example.yml --out reports/
```
It scans an inbound return mailbox source, classifies messages, stores scan
state in SQLite, updates endpoint-quality hints, and writes timestamped CSV
evidence reports.

The source layer supports deterministic fixture directories and a read-only IMAP
connector. IMAP scans select the configured folder with `readonly=True`, fetch
messages using `BODY.PEEK[]`, and reject `mark_seen` because mailbox write-back
actions are out of scope for this MVP.
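
The read-only behavior described above can be sketched with the standard-library `imaplib` module. This is a minimal illustration, not the connector in `mailbox.py`: the function name `fetch_readonly` and its parameters are hypothetical, and only the `readonly=True` select, the `BODY.PEEK[]` fetch, and the `mark_seen` rejection come from the text.

```python
import imaplib
from typing import List, Tuple


def fetch_readonly(
    conn: imaplib.IMAP4, folder: str = "INBOX", mark_seen: bool = False
) -> List[Tuple[str, bytes]]:
    """Return (uid, raw_message_bytes) pairs without changing mailbox state.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if mark_seen:
        # Write-back actions are out of scope, so refuse instead of ignoring.
        raise ValueError("mark_seen is not supported: the scanner is read-only")

    # readonly=True makes imaplib issue EXAMINE instead of SELECT.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    status, data = conn.uid("SEARCH", "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")

    messages = []
    for uid in data[0].decode().split():
        # BODY.PEEK[] returns the full message without setting the \Seen flag.
        status, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        if status != "OK":
            continue
        for part in parts:
            if isinstance(part, tuple):  # (response header, literal payload)
                messages.append((uid, part[1]))
    return messages
```

With a real server, `conn` would be an authenticated `imaplib.IMAP4_SSL` instance; any object exposing `select` and `uid` with the same shape works for tests.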
## Package Layout
```text
src/email_connect/
  adapter_contract.py  # coordination-engine descriptor and evidence ceiling
  cli.py               # command-line entry points
  config.py            # config loading
  evidence.py          # native class to normalized evidence mapping
  mailbox.py           # fixture and IMAP mailbox sources
  models.py            # mailbox, parse, evidence, endpoint quality dataclasses
  parser.py            # MIME/header parsing and conservative classification
  reporting.py         # CSV report generation
  scanner.py           # scan orchestration
  storage.py           # SQLite state store
## Persistence
SQLite is the MVP store. The initial schema includes:
- `mailbox_scans`
- `mailbox_messages`
- `parsed_messages`
- `evidence_candidates`
- `scan_cursors`
- `endpoint_quality`

Message deduplication is keyed by mailbox ID, IMAP UID when present, message ID,
received timestamp, sender, subject hash, and body hash. Evidence
deduplication follows the workplan fields: message, parser version, normalized
event, affected recipient, original message, SMTP/enhanced status, and reason.
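
One way to turn the message deduplication fields into a single stable key is to hash them together. The sketch below is an assumption about shape only: the function name, the SHA-256 choice, and the field separator are hypothetical, while the list of fields follows the text.

```python
import hashlib
from typing import Optional


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def message_dedup_key(
    mailbox_id: str,
    imap_uid: Optional[int],
    message_id: str,
    received_at: str,
    sender: str,
    subject: str,
    body: bytes,
) -> str:
    """Build a deterministic key from the deduplication fields (hypothetical helper)."""
    parts = [
        mailbox_id,
        "" if imap_uid is None else str(imap_uid),  # UID only when present
        message_id,
        received_at,
        sender,
        _sha256(subject),                  # subject hash
        hashlib.sha256(body).hexdigest(),  # body hash
    ]
    # The unit separator keeps adjacent fields from running together.
    return _sha256("\x1f".join(parts))
```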

Incremental scans use `scan_cursors` by mailbox and folder. Full rescans ignore
the cursor while preserving message and evidence deduplication. Endpoint-quality
rows are diagnostic hints derived from explicit evidence events; they are not
coordination outcomes.
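
The cursor behavior can be illustrated with an in-memory SQLite database. This is a sketch under stated assumptions: only the table name `scan_cursors` and its mailbox-and-folder keying come from the text, while the column names, the `last_uid` cursor value, and both helper functions are hypothetical.

```python
import sqlite3

# Assumed column layout; the real schema in storage.py may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scan_cursors (
    mailbox_id TEXT NOT NULL,
    folder     TEXT NOT NULL,
    last_uid   INTEGER NOT NULL,
    PRIMARY KEY (mailbox_id, folder)
);
"""


def start_uid(db: sqlite3.Connection, mailbox_id: str, folder: str,
              full_rescan: bool = False) -> int:
    """Return the UID to resume after; a full rescan ignores the stored cursor."""
    if full_rescan:
        return 0
    row = db.execute(
        "SELECT last_uid FROM scan_cursors WHERE mailbox_id = ? AND folder = ?",
        (mailbox_id, folder),
    ).fetchone()
    return row[0] if row else 0


def advance_cursor(db: sqlite3.Connection, mailbox_id: str, folder: str,
                   last_uid: int) -> None:
    """Store the highest UID seen so far; the cursor never moves backwards."""
    current = start_uid(db, mailbox_id, folder)
    db.execute(
        "INSERT OR REPLACE INTO scan_cursors (mailbox_id, folder, last_uid) "
        "VALUES (?, ?, ?)",
        (mailbox_id, folder, max(current, last_uid)),
    )
    db.commit()
```

Because a full rescan only bypasses the cursor read, rows already stored under their deduplication keys would still be recognized as duplicates rather than inserted again.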
## Evidence Mapping
Parser output is represented as `ParsedMailboxMessage`. The mapper converts it
to `EmailEvidenceCandidate` using coordination-engine event names and advisory
assessment classes.

Examples:

- `hard_bounce` -> `notification.endpoint.rejected_permanent`
- `soft_bounce` -> `notification.endpoint.rejected_temporary`
- `delayed_delivery_notice` -> `notification.endpoint.deferred`
- `complaint_or_abuse` -> `notification.channel.complaint_received`
- `unsubscribe_or_opt_out` -> `notification.channel.unsubscribe_received`
- `out_of_office` -> `interaction.out_of_office_received`
- `challenge_response` -> `interaction.unverified_actor_interaction`
- `human_reply` -> `interaction.reply_received`
- `parse_failed` -> `diagnostic.message.parse_failed`

The mapper does not emit evidence for unrelated messages. Unknown return
messages stay visible as `notification.endpoint.unknown`. Parse failures are
visible as diagnostics without claiming delivery, interaction, identity, or
endpoint quality.
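
The mapping rules above fit naturally into a lookup table. The sketch below is illustrative rather than the contents of `evidence.py`: the event names are taken from the list, but the function name and the `"unrelated"` native class label are assumptions.

```python
from typing import Optional

EVENT_BY_NATIVE_CLASS = {
    "hard_bounce": "notification.endpoint.rejected_permanent",
    "soft_bounce": "notification.endpoint.rejected_temporary",
    "delayed_delivery_notice": "notification.endpoint.deferred",
    "complaint_or_abuse": "notification.channel.complaint_received",
    "unsubscribe_or_opt_out": "notification.channel.unsubscribe_received",
    "out_of_office": "interaction.out_of_office_received",
    "challenge_response": "interaction.unverified_actor_interaction",
    "human_reply": "interaction.reply_received",
    "parse_failed": "diagnostic.message.parse_failed",
}

UNRELATED = "unrelated"  # hypothetical label for messages that yield no evidence
UNKNOWN_EVENT = "notification.endpoint.unknown"


def map_native_class(native_class: str) -> Optional[str]:
    """Return the normalized event name, or None when no evidence is emitted."""
    if native_class == UNRELATED:
        return None
    # Unrecognized return messages stay visible instead of being dropped.
    return EVENT_BY_NATIVE_CLASS.get(native_class, UNKNOWN_EVENT)
```

Falling back to `notification.endpoint.unknown` keeps unclassified return mail in the reports without asserting anything about delivery.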
## coordination-engine Alignment
The implementation keeps these coordination-engine concepts explicit:
- adapter descriptor
- adapter capability profile
- evidence ceiling
- advisory assessment
- endpoint quality update shape
- event observation and raw reference preservation
- golden tests for overclaim prevention

Email evidence remains below the coordination result layer. The scanner does
not infer inbox placement, human awareness, legal acceptance, payload access, or
case success.
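
A golden test for overclaim prevention could be as small as a set intersection. The sketch below assumes that each evidence candidate carries a collection of claim labels; that representation, the label spellings, and the function name are hypothetical, and only the five excluded inferences come from the text.

```python
from typing import Iterable, Set

# Inferences the scanner must never make, expressed as assumed labels.
ABOVE_CEILING = frozenset({
    "inbox_placement",
    "human_awareness",
    "legal_acceptance",
    "payload_access",
    "case_success",
})


def overclaims(claims: Iterable[str]) -> Set[str]:
    """Return the claims that exceed the evidence ceiling (empty set means OK)."""
    return set(claims) & ABOVE_CEILING
```

A test would then assert that `overclaims(...)` is empty for every candidate produced from the fixture mailboxes.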
## Provider Boundary
Provider webhook ingestion and outbound send APIs are deliberately outside this
slice. The mailbox scanner uses the same evidence model so future provider
events can enter through a parallel ingestion path and converge at the same
normalization layer.
## Development Commands
```bash
PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 -m email_connect.cli adapter-descriptor
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --out reports/
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --report-only-new --out reports/
```