feat: expand mailbox evidence scanner

2026-06-02 02:07:50 +02:00
parent 8532583182
commit 226c045397
16 changed files with 670 additions and 33 deletions

@@ -46,3 +46,19 @@ coordination runtime decides whether those facts satisfy a coordination case.
- Human reply does not prove legal acceptance.
- Unknown return messages remain visible.
- Scanner and proxy interactions must stay below identity-bound interaction.
## Endpoint Quality Hints
Endpoint quality rows are diagnostic state, not verdicts:
| Evidence | Quality hint |
| --- | --- |
| Hard bounce or final failure | `reachability = unreachable`, `last_failure_at` |
| Soft bounce or delayed delivery | `reachability = degraded`, `last_failure_at` |
| Complaint | `suppression_state = suppressed` |
| Unsubscribe | `suppression_state = opted_out` |
| Human reply | `last_success_at` |
| Out-of-office | `reachability = uncertain`, `last_auto_reply_at` |
These hints can guide future suppression and review workflows, but they do not
prove human awareness, authority, payload access, or coordination success.
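
As a rough illustration of the table above, the sketch below applies one piece of evidence to a quality row. The `EndpointQuality` dataclass and `apply_hint` function are hypothetical names, not the project's actual code; the evidence keys reuse the native class names from the evidence mapping, and the field names come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EndpointQuality:
    # Hypothetical row shape; the real dataclass in models.py may differ.
    reachability: Optional[str] = None
    suppression_state: Optional[str] = None
    last_failure_at: Optional[datetime] = None
    last_success_at: Optional[datetime] = None
    last_auto_reply_at: Optional[datetime] = None


def apply_hint(row: EndpointQuality, evidence: str, seen_at: datetime) -> EndpointQuality:
    """Update diagnostic hints only; this never records a verdict."""
    if evidence == "hard_bounce":
        row.reachability = "unreachable"
        row.last_failure_at = seen_at
    elif evidence in ("soft_bounce", "delayed_delivery_notice"):
        row.reachability = "degraded"
        row.last_failure_at = seen_at
    elif evidence == "complaint_or_abuse":
        row.suppression_state = "suppressed"
    elif evidence == "unsubscribe_or_opt_out":
        row.suppression_state = "opted_out"
    elif evidence == "human_reply":
        row.last_success_at = seen_at
    elif evidence == "out_of_office":
        row.reachability = "uncertain"
        row.last_auto_reply_at = seen_at
    # Unrecognized evidence leaves the row unchanged.
    return row


row = apply_hint(EndpointQuality(), "soft_bounce", datetime.now(timezone.utc))
print(row.reachability)
```

Because the function only touches the fields named in the table, a human reply updates `last_success_at` without clearing an earlier suppression state, which matches the "hints, not verdicts" framing.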

@@ -15,11 +15,13 @@ email-connect scan-mailbox --config config/mailbox.example.yml --out reports/
```
It scans an inbound return mailbox source, classifies messages, stores scan
state in SQLite, and writes timestamped CSV evidence reports.
state in SQLite, updates endpoint-quality hints, and writes timestamped CSV
evidence reports.
The initial source implementation supports fixture directories. The IMAP
connector remains the next mailbox boundary to complete under
`EMAIL-WP-0002-T02`.
The source layer supports deterministic fixture directories and a read-only IMAP
connector. IMAP scans select the configured folder with `readonly=True`, fetch
messages using `BODY.PEEK[]`, and reject `mark_seen` because mailbox write-back
actions are out of scope for this MVP.
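
The read-only behaviour described above could look roughly like the following with the standard-library `imaplib`. This is a minimal sketch, not the connector in `mailbox.py`: the function name `fetch_readonly` and its parameters are assumptions, and `conn` is expected to be an already authenticated `imaplib.IMAP4` (or `IMAP4_SSL`) instance.

```python
import imaplib
from typing import List, Tuple


def fetch_readonly(conn, folder: str = "INBOX", mark_seen: bool = False) -> List[Tuple[bytes, bytes]]:
    """Return (uid, raw_message) pairs without changing mailbox state.

    `conn` is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL object.
    """
    if mark_seen:
        # Write-back is out of scope, so refuse instead of silently ignoring.
        raise ValueError("mark_seen is not supported: scans are read-only")

    # readonly=True makes imaplib issue EXAMINE rather than SELECT.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r} read-only")

    status, data = conn.uid("SEARCH", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")

    messages = []
    for uid in data[0].split():
        # BODY.PEEK[] fetches the full message without setting the \Seen flag.
        status, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        if status != "OK":
            continue
        for part in parts:
            if isinstance(part, tuple):  # (envelope, payload) pairs carry the body
                messages.append((uid, part[1]))
    return messages
```

A caller would typically create the connection with `imaplib.IMAP4_SSL(host)`, call `login(...)`, and pass the result in; keeping the connection outside the function also makes it easy to substitute a fake in tests.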
## Package Layout
@@ -29,6 +31,7 @@ src/email_connect/
cli.py # command line entry points
config.py # config loading
evidence.py # native class to normalized evidence mapping
mailbox.py # fixture and IMAP mailbox sources
models.py # mailbox, parse, evidence, endpoint quality dataclasses
parser.py # MIME/header parsing and conservative classification
reporting.py # CSV report generation
@@ -44,12 +47,19 @@ SQLite is the MVP store. The initial schema includes:
- `mailbox_messages`
- `parsed_messages`
- `evidence_candidates`
- `scan_cursors`
- `endpoint_quality`
Message deduplication is keyed by mailbox ID, IMAP UID when present, message ID,
received timestamp, sender, subject hash, and body hash. Evidence
deduplication follows the workplan fields: message, parser version, normalized
event, affected recipient, original message, SMTP/enhanced status, and reason.
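
One way to picture the message deduplication key is a single digest over the listed fields. The sketch below is an assumption about the shape only: the function name, argument types, and the choice of SHA-256 with a unit-separator delimiter are illustrative and not taken from the repository.

```python
import hashlib
from typing import Optional


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def message_dedup_key(
    mailbox_id: str,
    imap_uid: Optional[int],  # absent for fixture-directory sources
    message_id: str,
    received_at: str,         # e.g. an ISO 8601 timestamp
    sender: str,
    subject: str,
    body: str,
) -> str:
    """Build a stable key from the fields named in the paragraph above."""
    parts = [
        mailbox_id,
        "" if imap_uid is None else str(imap_uid),
        message_id,
        received_at,
        sender,
        _sha256(subject),  # subject hash
        _sha256(body),     # body hash
    ]
    # Join with a separator that is unlikely to occur in header values.
    return _sha256("\x1f".join(parts))


key = message_dedup_key("mb-1", 42, "<a@example.org>", "2026-06-02T00:07:50Z",
                        "mailer-daemon@example.org", "Undelivered", "550 5.1.1")
print(len(key))
```

Hashing the subject and body before joining keeps the key fixed-length and avoids storing message content in the key itself.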
Incremental scans use `scan_cursors` by mailbox and folder. Full rescans ignore
the cursor while preserving message and evidence deduplication. Endpoint-quality
rows are diagnostic hints derived from explicit evidence events; they are not
coordination outcomes.
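
The cursor behaviour for incremental and full scans might be sketched with `sqlite3` as follows. Only the table name `scan_cursors` comes from the text; the column names (`mailbox_id`, `folder`, `last_uid`) and the helper functions are assumptions made for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Hypothetical columns; one cursor row per (mailbox, folder) pair.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scan_cursors (
            mailbox_id TEXT NOT NULL,
            folder     TEXT NOT NULL,
            last_uid   INTEGER NOT NULL,
            PRIMARY KEY (mailbox_id, folder)
        )
        """
    )
    return conn


def start_uid(conn: sqlite3.Connection, mailbox_id: str, folder: str,
              full_rescan: bool = False) -> int:
    """Return the UID after which scanning resumes; 0 means scan everything."""
    if full_rescan:
        return 0  # ignore the cursor; dedup keys still prevent duplicate rows
    row = conn.execute(
        "SELECT last_uid FROM scan_cursors WHERE mailbox_id = ? AND folder = ?",
        (mailbox_id, folder),
    ).fetchone()
    return row[0] if row else 0


def advance_cursor(conn: sqlite3.Connection, mailbox_id: str, folder: str,
                   last_uid: int) -> None:
    """Move the cursor forward only; a lower UID never rewinds it."""
    current = start_uid(conn, mailbox_id, folder)
    conn.execute(
        "INSERT OR REPLACE INTO scan_cursors (mailbox_id, folder, last_uid) "
        "VALUES (?, ?, ?)",
        (mailbox_id, folder, max(current, last_uid)),
    )
    conn.commit()
```

Keeping the cursor monotonic means a full rescan that finishes on an older message does not cause the next incremental scan to reprocess the whole folder.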
## Evidence Mapping
Parser output is represented as `ParsedMailboxMessage`. The mapper converts it
@@ -60,6 +70,9 @@ Examples:
- `hard_bounce` -> `notification.endpoint.rejected_permanent`
- `soft_bounce` -> `notification.endpoint.rejected_temporary`
- `delayed_delivery_notice` -> `notification.endpoint.deferred`
- `complaint_or_abuse` -> `notification.channel.complaint_received`
- `unsubscribe_or_opt_out` -> `notification.channel.unsubscribe_received`
- `out_of_office` -> `interaction.out_of_office_received`
- `human_reply` -> `interaction.reply_received`
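
The examples above amount to a lookup from native class to normalized event. A minimal sketch, assuming a plain dictionary and a hypothetical `normalize_event` helper (the real mapper in `evidence.py` works on `ParsedMailboxMessage` and may carry more fields):

```python
from typing import Optional

# Pairs copied from the examples above; any other class is left unmapped.
NATIVE_TO_NORMALIZED = {
    "hard_bounce": "notification.endpoint.rejected_permanent",
    "soft_bounce": "notification.endpoint.rejected_temporary",
    "delayed_delivery_notice": "notification.endpoint.deferred",
    "complaint_or_abuse": "notification.channel.complaint_received",
    "unsubscribe_or_opt_out": "notification.channel.unsubscribe_received",
    "out_of_office": "interaction.out_of_office_received",
    "human_reply": "interaction.reply_received",
}


def normalize_event(native_class: str) -> Optional[str]:
    """Return the normalized event name, or None when no mapping is known."""
    return NATIVE_TO_NORMALIZED.get(native_class)


print(normalize_event("hard_bounce"))
```

Returning `None` rather than raising leaves it to the caller to decide how unmapped classes are surfaced.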
@@ -95,4 +108,5 @@ normalization layer.
PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 -m email_connect.cli adapter-descriptor
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --out reports/
PYTHONPATH=src python3 -m email_connect.cli scan-mailbox --config config/mailbox.example.yml --report-only-new --out reports/
```