Initial Runtime Architecture

Status

This is the first implementation architecture for the mailbox evidence scanner slice. It is intentionally small and stdlib-only so the repo can run before a larger service stack is chosen.

Service Boundary

The first slice is a CLI scanner:

email-connect scan-mailbox --config config/mailbox.example.yml --out reports/

It scans an inbound return mailbox source, classifies messages, stores scan state in SQLite, updates endpoint-quality hints, and writes timestamped CSV evidence reports.
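The report step can be pictured with a minimal stdlib sketch. The function name `write_report`, the file name pattern, and the three columns are assumptions for illustration, not the actual output of `reporting.py`.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def write_report(out_dir, rows, now=None):
    """Write one CSV evidence report named after the scan time (UTC)."""
    now = now or datetime.now(timezone.utc)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a new file per run, so earlier reports survive.
    path = out / f"mailbox-evidence-{now.strftime('%Y%m%dT%H%M%SZ')}.csv"
    # Hypothetical column set; the real report may carry more fields.
    fieldnames = ["message_id", "native_class", "normalized_event"]
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Putting the timestamp in the file name keeps every run's evidence on disk rather than overwriting a single report.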

The source layer supports deterministic fixture directories and a read-only IMAP connector. IMAP scans select the configured folder with readonly=True, fetch messages with BODY.PEEK[] so the \Seen flag is never set, and reject the mark_seen option because mailbox write-back is out of scope for this MVP.
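A minimal sketch of the read-only fetch, assuming `conn` is an already authenticated `imaplib.IMAP4` or `imaplib.IMAP4_SSL` instance. The function name `fetch_messages_readonly` and its return shape are hypothetical; only the `select(..., readonly=True)` call, the `BODY.PEEK[]` fetch, and the `mark_seen` rejection come from the description above.

```python
def fetch_messages_readonly(conn, folder, mark_seen=False):
    """Return (uid, raw_bytes) pairs without changing any mailbox flags."""
    if mark_seen:
        # Write-back is out of scope, so the option is refused up front.
        raise ValueError("mark_seen is not supported; the scanner never writes to the mailbox")

    # readonly=True makes imaplib issue EXAMINE instead of SELECT.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    status, data = conn.uid("SEARCH", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")

    messages = []
    for uid in data[0].split():
        # BODY.PEEK[] returns the full message and leaves \Seen untouched.
        status, parts = conn.uid("FETCH", uid.decode("ascii"), "(BODY.PEEK[])")
        if status != "OK":
            continue
        for part in parts:
            # Literal payloads arrive as (envelope, bytes) tuples.
            if isinstance(part, tuple):
                messages.append((int(uid), part[1]))
    return messages
```

The two safeguards overlap on purpose: a folder opened read-only cannot have its flags changed, and the PEEK fetch would not set \Seen even if the folder were writable.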

Package Layout

src/email_connect/
  adapter_contract.py  # coordination-engine descriptor and evidence ceiling
  cli.py               # command line entry points
  config.py            # config loading
  evidence.py          # native class to normalized evidence mapping
  mailbox.py           # fixture and IMAP mailbox sources
  models.py            # mailbox, parse, evidence, endpoint quality dataclasses
  parser.py            # MIME/header parsing and conservative classification
  reporting.py         # CSV report generation
  scanner.py           # scan orchestration
  storage.py           # SQLite state store

Persistence

SQLite is the MVP store. The initial schema includes:

  • mailbox_scans
  • mailbox_messages
  • parsed_messages
  • evidence_candidates
  • scan_cursors
  • endpoint_quality

Message deduplication is keyed by mailbox ID, IMAP UID when present, message ID, received timestamp, sender, subject hash, and body hash. Evidence deduplication follows the workplan fields: message, parser version, normalized event, affected recipient, original message, SMTP/enhanced status, and reason.
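One way to derive a message deduplication key from the fields listed above is sketched below. The function name, the field separator, the sender normalization, and the choice of SHA-256 are assumptions; the source only names the fields.

```python
import hashlib


def _sha256_text(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def message_dedup_key(mailbox_id, imap_uid, message_id, received_at, sender, subject, body):
    """Combine the dedup fields into one stable hex digest.

    received_at is expected as an ISO 8601 string and body as raw bytes.
    """
    parts = [
        mailbox_id,
        "" if imap_uid is None else str(imap_uid),  # the UID is absent for fixture sources
        message_id or "",
        received_at,
        sender.strip().lower(),                     # assumption: case-insensitive sender match
        _sha256_text(subject),
        hashlib.sha256(body).hexdigest(),
    ]
    # The unit separator keeps adjacent fields from running together.
    return _sha256_text("\x1f".join(parts))
```

Hashing the subject and body keeps the key a fixed size while still separating two messages that share a Message-ID but differ in content.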

Incremental scans use scan_cursors by mailbox and folder. Full rescans ignore the cursor while preserving message and evidence deduplication. Endpoint-quality rows are diagnostic hints derived from explicit evidence events; they are not coordination outcomes.
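The cursor behavior can be sketched with `sqlite3` as follows. The column names (`mailbox_id`, `folder`, `last_uid`) and both helper functions are assumptions; the source states only that `scan_cursors` is keyed by mailbox and folder and that full rescans ignore it.

```python
import sqlite3

# Hypothetical column layout for the scan_cursors table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scan_cursors (
    mailbox_id TEXT NOT NULL,
    folder     TEXT NOT NULL,
    last_uid   INTEGER NOT NULL,
    PRIMARY KEY (mailbox_id, folder)
)
"""


def start_uid(db, mailbox_id, folder, full_rescan=False):
    """Return the UID after which the scan should resume (0 means from the start)."""
    if full_rescan:
        return 0  # ignore the cursor; deduplication still filters known messages
    row = db.execute(
        "SELECT last_uid FROM scan_cursors WHERE mailbox_id = ? AND folder = ?",
        (mailbox_id, folder),
    ).fetchone()
    return row[0] if row else 0


def advance_cursor(db, mailbox_id, folder, last_uid):
    """Move the cursor forward only; an older UID never rewinds it."""
    current = start_uid(db, mailbox_id, folder)
    db.execute(
        "INSERT OR REPLACE INTO scan_cursors (mailbox_id, folder, last_uid) VALUES (?, ?, ?)",
        (mailbox_id, folder, max(current, last_uid)),
    )
    db.commit()
```

A full rescan reads from UID 0 but leaves the stored cursor in place, so the next incremental scan resumes where it left off. The sketch ignores UIDVALIDITY, which a real IMAP cursor would also need to track.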

Evidence Mapping

Parser output is represented as ParsedMailboxMessage. The mapper converts it to EmailEvidenceCandidate using coordination-engine event names and advisory assessment classes.

Examples:

  • hard_bounce -> notification.endpoint.rejected_permanent
  • soft_bounce -> notification.endpoint.rejected_temporary
  • delayed_delivery_notice -> notification.endpoint.deferred
  • complaint_or_abuse -> notification.channel.complaint_received
  • unsubscribe_or_opt_out -> notification.channel.unsubscribe_received
  • out_of_office -> interaction.out_of_office_received
  • challenge_response -> interaction.unverified_actor_interaction
  • human_reply -> interaction.reply_received
  • parse_failed -> diagnostic.message.parse_failed

The mapper does not emit evidence for unrelated messages. Unknown return messages stay visible as notification.endpoint.unknown. Parse failures are visible as diagnostics without claiming delivery, interaction, identity, or endpoint quality.
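The mapping rules above can be sketched as a lookup table with two special cases. The event names are taken from the list; the function name and the native class label `unrelated` are hypothetical, as is treating every unlisted class as an unknown return message.

```python
EVENT_BY_NATIVE_CLASS = {
    "hard_bounce": "notification.endpoint.rejected_permanent",
    "soft_bounce": "notification.endpoint.rejected_temporary",
    "delayed_delivery_notice": "notification.endpoint.deferred",
    "complaint_or_abuse": "notification.channel.complaint_received",
    "unsubscribe_or_opt_out": "notification.channel.unsubscribe_received",
    "out_of_office": "interaction.out_of_office_received",
    "challenge_response": "interaction.unverified_actor_interaction",
    "human_reply": "interaction.reply_received",
    "parse_failed": "diagnostic.message.parse_failed",
}

UNKNOWN_RETURN_EVENT = "notification.endpoint.unknown"


def normalized_event(native_class):
    """Map a parser class to an event name, or None when no evidence is emitted."""
    if native_class == "unrelated":  # hypothetical label for unrelated mail
        return None
    # Assumption: any class without an explicit mapping stays visible as unknown.
    return EVENT_BY_NATIVE_CLASS.get(native_class, UNKNOWN_RETURN_EVENT)
```

Returning `None` for unrelated mail, rather than a placeholder event, means such messages cannot produce evidence rows by accident.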

coordination-engine Alignment

The implementation keeps these coordination-engine concepts explicit:

  • adapter descriptor
  • adapter capability profile
  • evidence ceiling
  • advisory assessment
  • endpoint quality update shape
  • event observation and raw reference preservation
  • golden tests for overclaim prevention

Email evidence remains below the coordination result layer. The scanner does not infer inbox placement, human awareness, legal acceptance, payload access, or case success.
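A golden test for overclaim prevention might look like the sketch below. Using event-name prefixes as the evidence ceiling is an assumption for illustration; the real ceiling lives in `adapter_contract.py` and may be expressed differently. The event names are the ones listed under Evidence Mapping.

```python
import unittest

# Assumed ceiling: only these event families may be emitted by the scanner.
ALLOWED_PREFIXES = (
    "notification.endpoint.",
    "notification.channel.",
    "interaction.",
    "diagnostic.",
)

EMITTED_EVENTS = [
    "notification.endpoint.rejected_permanent",
    "notification.endpoint.rejected_temporary",
    "notification.endpoint.deferred",
    "notification.endpoint.unknown",
    "notification.channel.complaint_received",
    "notification.channel.unsubscribe_received",
    "interaction.out_of_office_received",
    "interaction.unverified_actor_interaction",
    "interaction.reply_received",
    "diagnostic.message.parse_failed",
]


def exceeds_ceiling(event_name):
    """True when an event name falls outside the allowed evidence families."""
    return not event_name.startswith(ALLOWED_PREFIXES)


class OverclaimGoldenTest(unittest.TestCase):
    def test_no_emitted_event_exceeds_ceiling(self):
        offenders = [e for e in EMITTED_EVENTS if exceeds_ceiling(e)]
        self.assertEqual(offenders, [])
```

Under this assumption, any mapper change that starts emitting a result-layer event fails the test, which points at the overclaim directly.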

Provider Boundary

Provider webhook ingestion and outbound send APIs are deliberately outside this slice. The mailbox scanner uses the same evidence model so future provider events can enter through a parallel ingestion path and converge at the same normalization layer.

Development Commands

PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 -m email_connect.cli adapter-descriptor
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --out reports/
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --report-only-new --out reports/