# Initial Runtime Architecture

## Status

This is the first implementation architecture for the mailbox evidence scanner
slice. It is intentionally small and stdlib-only so the repo can run before a
larger service stack is chosen.

## Service Boundary

The first slice is a CLI scanner:

```text
email-connect scan-mailbox --config config/mailbox.example.yml --out reports/
```

It scans an inbound return mailbox source, classifies messages, stores scan
state in SQLite, updates endpoint-quality hints, and writes timestamped CSV
evidence reports.

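As a rough illustration of the command shape above, the following is a minimal `argparse` sketch. It is not the actual `cli.py`: the function name `build_parser`, the help strings, and the choice to make `--config` and `--out` required are assumptions. The subcommand names and the `--report-only-new` flag are taken from the commands listed in this document; the flag's exact behavior is not described here, so it is only parsed, not interpreted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command shape (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="email-connect")
    sub = parser.add_subparsers(dest="command", required=True)

    scan = sub.add_parser("scan-mailbox", help="scan a mailbox source and write reports")
    # Assumption: both options are required; the real CLI may supply defaults.
    scan.add_argument("--config", required=True, help="path to the mailbox config file")
    scan.add_argument("--out", required=True, help="directory for CSV evidence reports")
    scan.add_argument("--report-only-new", action="store_true",
                      help="flag name from the development commands; semantics assumed")

    sub.add_parser("adapter-descriptor", help="print the adapter descriptor")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["scan-mailbox", "--config", "config/mailbox.example.yml", "--out", "reports/"]
)
```

Passing an explicit list to `parse_args` keeps the parser easy to exercise from unit tests without touching the process arguments.
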
The source layer supports deterministic fixture directories and a read-only IMAP
connector. IMAP scans select the configured folder with `readonly=True`, fetch
messages using `BODY.PEEK[]`, and reject `mark_seen` because mailbox write-back
actions are out of scope for this MVP.

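The read-only behavior described above could look roughly like the sketch below, built on the standard-library `imaplib`. The function names `reject_write_back` and `fetch_raw_messages` and the `options` dictionary are hypothetical, not the contents of `mailbox.py`. For brevity the sketch iterates message sequence numbers from `SEARCH ALL`; a real connector would likely work with UIDs.

```python
import imaplib


def reject_write_back(options: dict) -> None:
    """Refuse options that would modify the mailbox (hypothetical config shape)."""
    if options.get("mark_seen"):
        raise ValueError("mark_seen is not supported: mailbox write-back is out of scope")


def fetch_raw_messages(conn: imaplib.IMAP4, folder: str) -> list:
    """Return raw message bytes from `folder` without changing any flags."""
    # readonly=True makes imaplib issue EXAMINE instead of SELECT.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r} read-only")

    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")

    raw_messages = []
    for num in data[0].split():
        # BODY.PEEK[] fetches the full message without setting the \Seen flag.
        status, parts = conn.fetch(num, "(BODY.PEEK[])")
        if status != "OK":
            continue
        for part in parts:
            # Message payloads arrive as (envelope, bytes) tuples.
            if isinstance(part, tuple):
                raw_messages.append(part[1])
    return raw_messages
```

Opening the folder read-only and fetching with `BODY.PEEK[]` are two independent safeguards: the first stops the session from changing mailbox state, the second avoids the implicit `\Seen` update that a plain `BODY[]` fetch would request.
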
## Package Layout

```text
src/email_connect/
  adapter_contract.py  # coordination-engine descriptor and evidence ceiling
  cli.py               # command line entry points
  config.py            # config loading
  evidence.py          # native class to normalized evidence mapping
  mailbox.py           # fixture and IMAP mailbox sources
  models.py            # mailbox, parse, evidence, endpoint quality dataclasses
  parser.py            # MIME/header parsing and conservative classification
  reporting.py         # CSV report generation
  scanner.py           # scan orchestration
  storage.py           # SQLite state store
```

## Persistence

SQLite is the MVP store. The initial schema includes:

- `mailbox_scans`
- `mailbox_messages`
- `parsed_messages`
- `evidence_candidates`
- `scan_cursors`
- `endpoint_quality`

Message deduplication is keyed by mailbox ID, IMAP UID when present, message ID,
received timestamp, sender, subject hash, and body hash. Evidence
deduplication follows the workplan fields: message, parser version, normalized
event, affected recipient, original message, SMTP/enhanced status, and reason.

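One way to picture the message deduplication key is as a single digest over the listed fields. The sketch below is an assumption about the shape, not the actual `storage.py` logic: the function name `message_dedup_key`, the parameter names, the use of SHA-256, and the unit-separator join are all hypothetical. The real store could equally use a composite unique constraint over separate columns.

```python
import hashlib
from typing import Optional


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def message_dedup_key(
    mailbox_id: str,
    imap_uid: Optional[int],
    message_id: str,
    received_at: str,
    sender: str,
    subject: str,
    body: bytes,
) -> str:
    """Combine the documented dedup fields into one stable key (hypothetical)."""
    fields = [
        mailbox_id,
        # The UID is only available for IMAP sources; fixtures leave it empty.
        "" if imap_uid is None else str(imap_uid),
        message_id,
        received_at,
        sender,
        _sha256(subject.encode("utf-8")),  # subject hash
        _sha256(body),                     # body hash
    ]
    # Join with the ASCII unit separator so adjacent fields cannot blur together.
    joined = "\x1f".join(fields)
    return _sha256(joined.encode("utf-8"))
```

Because every input is deterministic, rescanning the same message yields the same key, which is what lets a full rescan run without creating duplicate rows.
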
Incremental scans use `scan_cursors` by mailbox and folder. Full rescans ignore
the cursor while preserving message and evidence deduplication. Endpoint-quality
rows are diagnostic hints derived from explicit evidence events; they are not
coordination outcomes.

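The cursor behavior can be sketched with the standard-library `sqlite3` module. Only the table name `scan_cursors` and its keying by mailbox and folder come from this document; the column names, the `last_uid` cursor value, and the helper functions `start_uid` and `advance_cursor` are assumptions for illustration. The upsert syntax used here needs SQLite 3.24 or newer.

```python
import sqlite3

# Hypothetical column layout; the real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scan_cursors (
    mailbox_id TEXT NOT NULL,
    folder     TEXT NOT NULL,
    last_uid   INTEGER NOT NULL,
    PRIMARY KEY (mailbox_id, folder)
)
"""


def start_uid(conn: sqlite3.Connection, mailbox_id: str, folder: str,
              full_rescan: bool = False) -> int:
    """Return the UID to resume after; a full rescan ignores the stored cursor."""
    if full_rescan:
        return 0
    row = conn.execute(
        "SELECT last_uid FROM scan_cursors WHERE mailbox_id = ? AND folder = ?",
        (mailbox_id, folder),
    ).fetchone()
    return row[0] if row else 0


def advance_cursor(conn: sqlite3.Connection, mailbox_id: str, folder: str,
                   last_uid: int) -> None:
    """Record progress for one mailbox/folder pair without ever moving backwards."""
    conn.execute(
        """
        INSERT INTO scan_cursors (mailbox_id, folder, last_uid)
        VALUES (?, ?, ?)
        ON CONFLICT (mailbox_id, folder)
        DO UPDATE SET last_uid = MAX(scan_cursors.last_uid, excluded.last_uid)
        """,
        (mailbox_id, folder, last_uid),
    )
    conn.commit()
```

In this sketch a full rescan reads from the beginning but leaves the stored cursor alone, so deduplication, not the cursor, is what prevents repeated rows.
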
## Evidence Mapping

Parser output is represented as `ParsedMailboxMessage`. The mapper converts it
to `EmailEvidenceCandidate` using coordination-engine event names and advisory
assessment classes.

Examples:

- `hard_bounce` -> `notification.endpoint.rejected_permanent`
- `soft_bounce` -> `notification.endpoint.rejected_temporary`
- `delayed_delivery_notice` -> `notification.endpoint.deferred`
- `complaint_or_abuse` -> `notification.channel.complaint_received`
- `unsubscribe_or_opt_out` -> `notification.channel.unsubscribe_received`
- `out_of_office` -> `interaction.out_of_office_received`
- `challenge_response` -> `interaction.unverified_actor_interaction`
- `human_reply` -> `interaction.reply_received`
- `parse_failed` -> `diagnostic.message.parse_failed`

The mapper does not emit evidence for unrelated messages. Unknown return
messages stay visible as `notification.endpoint.unknown`. Parse failures are
visible as diagnostics without claiming delivery, interaction, identity, or
endpoint quality.

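The mapping rules above can be summarized as a lookup table with two fallbacks. This is a simplified sketch, not the actual `evidence.py`: it maps plain strings rather than `ParsedMailboxMessage` and `EmailEvidenceCandidate` objects, and the names `EVENT_BY_NATIVE_CLASS`, `map_native_class`, and the `is_return_message` flag are hypothetical. The event names themselves are copied from the list above.

```python
from typing import Optional

# Native parser class -> coordination-engine event name (from the examples above).
EVENT_BY_NATIVE_CLASS = {
    "hard_bounce": "notification.endpoint.rejected_permanent",
    "soft_bounce": "notification.endpoint.rejected_temporary",
    "delayed_delivery_notice": "notification.endpoint.deferred",
    "complaint_or_abuse": "notification.channel.complaint_received",
    "unsubscribe_or_opt_out": "notification.channel.unsubscribe_received",
    "out_of_office": "interaction.out_of_office_received",
    "challenge_response": "interaction.unverified_actor_interaction",
    "human_reply": "interaction.reply_received",
    "parse_failed": "diagnostic.message.parse_failed",
}

UNKNOWN_RETURN_EVENT = "notification.endpoint.unknown"


def map_native_class(native_class: str, is_return_message: bool) -> Optional[str]:
    """Return the normalized event name, or None when no evidence should be emitted.

    `is_return_message` is an assumed flag saying the parser recognized the
    message as some kind of return traffic even if it could not classify it.
    """
    event = EVENT_BY_NATIVE_CLASS.get(native_class)
    if event is not None:
        return event
    if is_return_message:
        # Unclassified return messages stay visible instead of being dropped.
        return UNKNOWN_RETURN_EVENT
    # Unrelated messages produce no evidence at all.
    return None
```

Keeping the table explicit makes the overclaim boundary easy to audit: a native class that is not listed can only ever become `notification.endpoint.unknown` or nothing.
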
## coordination-engine Alignment

The implementation keeps these coordination-engine concepts explicit:

- adapter descriptor
- adapter capability profile
- evidence ceiling
- advisory assessment
- endpoint quality update shape
- event observation and raw reference preservation
- golden tests for overclaim prevention

Email evidence remains below the coordination result layer. The scanner does
not infer inbox placement, human awareness, legal acceptance, payload access, or
case success.

## Provider Boundary

Provider webhook ingestion and outbound send APIs are deliberately outside this
slice. The mailbox scanner uses the same evidence model so future provider
events can enter through a parallel ingestion path and converge at the same
normalization layer.

## Development Commands

```bash
PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 -m email_connect.cli adapter-descriptor
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --out reports/
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --report-only-new --out reports/
```