---
id: EMAIL-WP-0002
type: workplan
title: "MVP Mailbox Evidence Scanner"
domain: custodian
repo: email-connect
status: finished
owner: codex
topic_slug: custodian
created: "2026-06-02"
updated: "2026-06-02"
state_hub_workstream_id: "c81788aa-0d0a-4493-bf41-ab6cc2068f2f"
---

# EMAIL-WP-0002 - MVP Mailbox Evidence Scanner

## Review Fixes Applied

This workplan was reviewed and registered against the local State Hub workplan
convention.

Fixes applied:

- Added ADR-001 frontmatter so State Hub can index the workstream.
- Converted the MVP work packages into `EMAIL-WP-0002-TNN` task blocks.
- Clarified that out-of-office replies are evidence signals, not proof of
  reachability or awareness.
- Aligned suggested repository paths with the current repo layout.

Implementation should preserve the product rule from `INTENT.md`: email events
are evidence, not result satisfaction.

## 1. MVP Name

**Mailbox Evidence Scanner MVP**

## 2. Purpose

This MVP establishes the first practical implementation slice of `email-connect`.

Given access to a mailbox that receives bounce mails, delivery-status notifications, out-of-office replies, human replies, complaints, unsubscribe messages, and other return messages from a previous batch of emails, the system shall scan and rescan the mailbox, classify inbound messages, extract email-channel evidence, and generate timestamped CSV evidence reports.

The MVP proves the core `email-connect` value without requiring outbound sending, provider webhook integration, template management, or a full UI.

## 3. Core MVP Hypothesis

`email-connect` can provide immediate standalone value by turning an inbound return mailbox into structured, timestamped email-channel evidence.

The MVP validates:

```text
mailbox access
message scanning
rescan safety
bounce parsing
reply classification
evidence normalization
deduplication
endpoint-quality hints
CSV reporting
```

The result should be useful to humans, scripts, and future `coordination-engine` integrations.

## 4. In Scope

The MVP shall support:

* Connecting to one IMAP mailbox.
* Scanning messages in a selected folder.
* Incremental scans using a stored cursor.
* Full rescans.
* Raw message metadata extraction.
* Basic MIME parsing.
* Bounce / DSN classification.
* Out-of-office classification.
* Human reply classification.
* Unknown/unparseable message classification.
* Extraction of affected recipient address where possible.
* Extraction of SMTP status code and enhanced status code where possible.
* Evidence event candidate generation.
* Deduplication of already-seen mailbox messages.
* Deduplication of already-emitted evidence.
* Timestamped CSV report generation.
* Basic local storage for scan state and parsed evidence.
* CLI entry point.
* Minimal configuration file.

## 5. Out of Scope for MVP

The MVP does not need to support:

* Outbound email sending.
* Provider-specific webhooks.
* Multiple email providers.
* Full UI.
* OAuth mailbox login.
* Advanced deliverability analytics.
* Advanced natural-language reply interpretation.
* Full suppression management UI.
* Full endpoint quality dashboard.
* coordination-engine live integration.
* Database server deployment.
* Multi-tenant operation.
* Complex mailbox write-back actions.
* Deleting, moving, or marking mailbox messages.
* Legal delivery assessment.

## 6. MVP User Story

As an operator or developer, I want to point `email-connect` at a mailbox containing return emails from a previous outbound batch, scan and rescan it safely, and receive a timestamped CSV report showing what email-channel evidence was found for each affected address or message.

## 7. Target Workflow

```text
1. Configure mailbox access.
2. Run scanner.
3. Scanner fetches messages from selected folder.
4. Scanner parses headers, body, MIME parts, and DSN attachments.
5. Scanner classifies each message.
6. Scanner extracts evidence fields.
7. Scanner deduplicates message and evidence records.
8. Scanner stores scan state.
9. Scanner writes timestamped CSV report.
10. User reviews report or imports it elsewhere.
```

## 8. CLI Target

Initial CLI command:

```text
email-connect scan-mailbox --config config/mailbox.yml --out reports/
```

Recommended CLI variants:

```text
email-connect scan-mailbox --config config/mailbox.yml --out reports/
email-connect scan-mailbox --config config/mailbox.yml --full-rescan --out reports/
email-connect scan-mailbox --config config/mailbox.yml --since 2026-01-01 --out reports/
email-connect scan-mailbox --config config/mailbox.yml --report-only-new --out reports/
email-connect scan-mailbox --config config/mailbox.yml --dry-run
```
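
The flags in the variants above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the project's actual CLI module: the function name `build_parser`, the attribute names argparse derives from the flags, and the choice to default `--out` to `reports/` (so that the `--dry-run` variant parses without it) are assumptions made for illustration.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the hypothetical `email-connect scan-mailbox` command."""
    parser = argparse.ArgumentParser(prog="email-connect")
    subcommands = parser.add_subparsers(dest="command", required=True)

    scan = subcommands.add_parser("scan-mailbox", help="Scan a return mailbox")
    scan.add_argument("--config", required=True, help="Path to the mailbox config file")
    # Assumption: default to reports/ so that --dry-run can omit --out.
    scan.add_argument("--out", default="reports/", help="Report output directory")
    scan.add_argument("--full-rescan", action="store_true")
    # ISO dates such as 2026-01-01 are converted to datetime.date objects.
    scan.add_argument("--since", type=date.fromisoformat, default=None)
    scan.add_argument("--report-only-new", action="store_true")
    scan.add_argument("--dry-run", action="store_true")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Parsing `--since` into a `date` at the CLI boundary means a malformed date is rejected before any mailbox connection is attempted.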

## 9. Configuration

Example `config/mailbox.yml`:

```yaml
mailbox:
  id: return-mailbox-default
  protocol: imap
  host: imap.example.com
  port: 993
  tls: true
  username_env: EMAIL_CONNECT_IMAP_USER
  password_env: EMAIL_CONNECT_IMAP_PASSWORD
  folder: INBOX

scan:
  mode: incremental
  max_messages_per_run: 5000
  since: null
  include_seen: true
  mark_seen: false
  store_raw_headers: true
  store_raw_body: false
  store_raw_message_ref: true

storage:
  path: .email-connect/state.sqlite

reports:
  output_dir: reports
  include_all_evidence: true
  include_unknown_messages: true
  timestamp_timezone: UTC
```
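
The example config names environment variables (`username_env`, `password_env`) instead of holding secrets. A minimal sketch of resolving them is shown below. It assumes the `mailbox` section has already been parsed into a dictionary by whatever YAML loader the project uses; the function name `resolve_credentials` and its error messages are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    mailbox_cfg: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Look up the IMAP username and password named by the *_env config keys."""
    env = os.environ if env is None else env
    values = []
    for key in ("username_env", "password_env"):
        var_name = mailbox_cfg.get(key)
        if not isinstance(var_name, str) or not var_name:
            raise ValueError(f"mailbox.{key} must name an environment variable")
        if var_name not in env:
            # Fail early rather than attempting an IMAP login with empty values.
            raise ValueError(f"environment variable {var_name} is not set")
        values.append(env[var_name])
    return values[0], values[1]
```

Passing `env` explicitly keeps the function testable without touching the real process environment.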

## 10. Minimal Data Model

### 10.1 MailboxScan

```yaml
MailboxScan:
  scan_id: string
  mailbox_id: string
  started_at: timestamp
  finished_at: timestamp?
  scan_mode: incremental | full_rescan
  since: timestamp?
  folder: string
  status: running | completed | failed
  messages_seen: integer
  messages_new: integer
  messages_parsed: integer
  evidence_events_created: integer
  report_path: string?
```

### 10.2 InboundMailboxMessage

```yaml
InboundMailboxMessage:
  mailbox_message_id: string
  mailbox_id: string
  imap_uid: string?
  message_id_header: string?
  received_at: timestamp?
  from_address: string?
  to_addresses:
    - string
  subject: string?
  raw_headers_ref: string?
  raw_message_ref: string?
  first_seen_at: timestamp
  last_seen_at: timestamp
  deduplication_key: string
```

### 10.3 ParsedMailboxMessage

```yaml
ParsedMailboxMessage:
  parsed_message_id: string
  mailbox_message_id: string
  parser_version: string
  message_class: hard_bounce | soft_bounce | delayed_delivery_notice | final_delivery_failure | out_of_office | human_reply | complaint_or_abuse | unsubscribe_or_opt_out | challenge_response | unknown_return_message | unrelated_message | parse_failed
  affected_email_address: string?
  original_message_id: string?
  original_recipient: string?
  smtp_status_code: string?
  enhanced_status_code: string?
  reason_code: string?
  confidence: low | medium | high
  parsed_at: timestamp
  notes:
    - string
```

### 10.4 EmailEvidenceCandidate

```yaml
EmailEvidenceCandidate:
  evidence_candidate_id: string
  mailbox_message_id: string
  parsed_message_id: string
  event_type: string
  assessment_category: success | fail | undef
  assessment_subclass: string
  affected_email_address: string?
  original_message_id: string?
  confidence: low | medium | high
  evidence_strength: none | weak | medium | strong | negative | ambiguous
  occurred_at: timestamp?
  observed_at: timestamp
  deduplication_key: string
  raw_message_ref: string?
  notes:
    - string
```
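
One way the `EmailEvidenceCandidate` record could be expressed in code is a dataclass that rejects values outside the enumerations listed above. This is an illustrative sketch only: the field names come from the model, but the use of a dataclass, plain strings for enumerated fields, and the `__post_init__` validation are assumptions, not the repository's actual types.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

ASSESSMENT_CATEGORIES = {"success", "fail", "undef"}
CONFIDENCE_LEVELS = {"low", "medium", "high"}
EVIDENCE_STRENGTHS = {"none", "weak", "medium", "strong", "negative", "ambiguous"}


@dataclass
class EmailEvidenceCandidate:
    # Required fields (no "?" in the data model).
    evidence_candidate_id: str
    mailbox_message_id: str
    parsed_message_id: str
    event_type: str
    assessment_category: str
    assessment_subclass: str
    confidence: str
    evidence_strength: str
    observed_at: datetime
    deduplication_key: str
    # Optional fields (marked "?" in the data model).
    affected_email_address: Optional[str] = None
    original_message_id: Optional[str] = None
    occurred_at: Optional[datetime] = None
    raw_message_ref: Optional[str] = None
    notes: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed in the data model.
        if self.assessment_category not in ASSESSMENT_CATEGORIES:
            raise ValueError(f"bad assessment_category: {self.assessment_category}")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"bad confidence: {self.confidence}")
        if self.evidence_strength not in EVIDENCE_STRENGTHS:
            raise ValueError(f"bad evidence_strength: {self.evidence_strength}")
```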

## 11. Message Classification Rules

### 11.1 Hard Bounce

Signals:

```text
Delivery failure notice
Permanent failure
Unknown user
Mailbox not found
Domain not found
5xx SMTP status
Enhanced status code 5.x.x
```

Normalized event:

```text
notification.endpoint.rejected_permanent
```

Assessment:

```text
category: fail
subclass: fail.hard_bounce
```

### 11.2 Soft Bounce

Signals:

```text
Temporary failure
Mailbox full
Greylisting
Temporary server failure
4xx SMTP status
Enhanced status code 4.x.x
```

Normalized event:

```text
notification.endpoint.rejected_temporary
```

Assessment:

```text
category: undef
subclass: undef.deferred
```
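
The status-code part of the hard/soft distinction in 11.1 and 11.2 can be sketched as a small function. It covers only the `5xx`/`5.x.x` and `4xx`/`4.x.x` signals; the textual signals (for example "Mailbox full") would need separate heuristics. The function name `classify_bounce` and the decision to prefer the enhanced status code over the basic SMTP code are assumptions for illustration.

```python
from typing import Optional


def classify_bounce(
    smtp_status_code: Optional[str],
    enhanced_status_code: Optional[str],
) -> Optional[str]:
    """Return 'hard_bounce', 'soft_bounce', or None when the codes do not decide."""
    # Assumption: check the enhanced status code (e.g. "5.1.1") first and fall
    # back to the basic SMTP reply code (e.g. "550").
    for code in (enhanced_status_code, smtp_status_code):
        if not code:
            continue
        first = code.strip()[:1]
        if first == "5":
            return "hard_bounce"
        if first == "4":
            return "soft_bounce"
    return None
```

Returning `None` instead of guessing keeps undecided messages available for the heuristic parsers, in line with the rule that the scanner must not overclaim.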

### 11.3 Delayed Delivery Notice

Signals:

```text
Delivery delayed
Will keep trying
Message not yet delivered
```

Normalized event:

```text
notification.endpoint.deferred
```

Assessment:

```text
category: undef
subclass: undef.deferred
```

### 11.4 Final Delivery Failure

Signals:

```text
Could not deliver after retry period
Final failure
Giving up
```

Normalized event:

```text
notification.endpoint.rejected_permanent
```

Assessment:

```text
category: fail
subclass: fail.expired_without_delivery
```

### 11.5 Out-of-Office Reply

Signals:

```text
Auto-reply
Out of office
Vacation
Abwesenheitsnotiz
Ich bin nicht im Büro
```

Normalized event:

```text
interaction.out_of_office_received
```

Assessment:

```text
category: undef
subclass: undef.out_of_office
```
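
A rough sketch of how the out-of-office signals in 11.5 could be checked is given below. It combines the `Auto-Submitted` header from RFC 3834 with a subject-line pattern built from the listed phrases. The function name `looks_like_out_of_office`, the plain-dict header lookup (which is case-sensitive), and the short phrase list are illustrative assumptions; a subject-only match such as "vacation" can produce false positives and would warrant a lower confidence.

```python
import re
from typing import Mapping

# Subject phrases taken from the signal list in 11.5; a real list would be longer.
_OOO_SUBJECT = re.compile(
    r"auto-?reply|out of office|vacation|abwesenheitsnotiz|nicht im büro",
    re.IGNORECASE,
)


def looks_like_out_of_office(headers: Mapping[str, str], subject: str) -> bool:
    """Heuristic check for an automatic out-of-office style reply."""
    # RFC 3834 defines the Auto-Submitted header; the value "auto-replied"
    # marks automatic responses.
    auto_submitted = headers.get("Auto-Submitted", "").strip().lower()
    if auto_submitted.startswith("auto-replied"):
        return True
    return bool(_OOO_SUBJECT.search(subject))
```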

### 11.6 Human Reply

Signals:

```text
Non-automated reply
No bounce markers
No auto-reply markers
Human-written body likely
```

Normalized event:

```text
interaction.reply_received
```

Assessment:

```text
category: success
subclass: success.reply_received
```

Note: this is email-channel success, not necessarily coordination success.

### 11.7 Complaint / Abuse

Signals:

```text
Abuse report
Spam complaint
Feedback loop
Complaint notification
```

Normalized event:

```text
notification.channel.complaint_received
```

Assessment:

```text
category: fail
subclass: fail.complaint_received
```

### 11.8 Unsubscribe / Opt-Out

Signals:

```text
Unsubscribe request
Opt-out
Remove me
STOP
```

Normalized event:

```text
notification.channel.unsubscribe_received
```

Assessment:

```text
category: fail
subclass: fail.unsubscribed
```

### 11.9 Unknown Return Message

Signals:

```text
Message appears related to return mailbox
but no reliable classification is possible
```

Normalized event:

```text
notification.endpoint.unknown
```

Assessment:

```text
category: undef
subclass: undef.conflicting_evidence or undef.no_signal
```

## 12. CSV Report Format

### 12.1 Required Columns

The timestamped CSV report shall include:

```text
report_generated_at
scan_id
mailbox_id
mailbox_message_id
mailbox_received_at
source_from
source_to
source_subject
message_id_header
detected_message_class
normalized_event_type
assessment_category
assessment_subclass
affected_email_address
original_message_id
original_recipient
smtp_status_code
enhanced_status_code
reason_code
confidence
evidence_strength
occurred_at
observed_at
first_seen_at
last_seen_at
deduplication_key
raw_message_ref
notes
```
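
Writing these columns in a fixed order could look like the sketch below, which uses the standard-library `csv.DictWriter`. The column list is copied from above; the function name `write_report`, leaving missing values as empty strings, and rejecting unexpected keys are assumptions for illustration.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO

REPORT_COLUMNS = [
    "report_generated_at", "scan_id", "mailbox_id", "mailbox_message_id",
    "mailbox_received_at", "source_from", "source_to", "source_subject",
    "message_id_header", "detected_message_class", "normalized_event_type",
    "assessment_category", "assessment_subclass", "affected_email_address",
    "original_message_id", "original_recipient", "smtp_status_code",
    "enhanced_status_code", "reason_code", "confidence", "evidence_strength",
    "occurred_at", "observed_at", "first_seen_at", "last_seen_at",
    "deduplication_key", "raw_message_ref", "notes",
]


def write_report(rows: Iterable[Mapping[str, object]], out: TextIO) -> None:
    """Write evidence rows as CSV with a fixed, deterministic header order."""
    writer = csv.DictWriter(
        out,
        fieldnames=REPORT_COLUMNS,
        restval="",            # missing values become empty cells
        extrasaction="raise",  # unknown keys are treated as a programming error
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_report([{"scan_id": "scan-1", "notes": "example row"}], buffer)
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, so that line endings are not translated twice.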

### 12.2 Filename Convention

```text
email-channel-evidence-report-YYYYMMDD-HHMMSS.csv
```

Example:

```text
email-channel-evidence-report-20260602-173000.csv
```
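
The filename convention can be produced with `strftime`. The sketch below assumes the timestamp is rendered in UTC, matching `timestamp_timezone: UTC` in the example configuration; the function name `report_filename` is hypothetical.

```python
from datetime import datetime, timezone


def report_filename(generated_at: datetime) -> str:
    """Build the evidence report filename for a timezone-aware timestamp."""
    if generated_at.tzinfo is None:
        # Naive timestamps are ambiguous, so refuse them instead of guessing.
        raise ValueError("generated_at must be timezone-aware")
    stamp = generated_at.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"email-channel-evidence-report-{stamp}.csv"


if __name__ == "__main__":
    print(report_filename(datetime.now(timezone.utc)))
```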

### 12.3 Optional Secondary Reports

The MVP may also generate:

```text
email-channel-summary-report-YYYYMMDD-HHMMSS.csv
email-endpoint-quality-report-YYYYMMDD-HHMMSS.csv
email-parse-failures-YYYYMMDD-HHMMSS.csv
```

## 13. Deduplication Strategy

### 13.1 Message Deduplication

Message deduplication should use:

```text
mailbox_id
imap_uid
message_id_header
received_at
from_address
subject hash
body hash if available
```

### 13.2 Evidence Deduplication

Evidence deduplication should use:

```text
mailbox_message_id
parser_version
normalized_event_type
affected_email_address
original_message_id
smtp_status_code
enhanced_status_code
reason_code
```
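
Both field lists above can be turned into a stable `deduplication_key` by hashing the values in a fixed order. The following is a sketch under assumptions: the workplan does not specify a hash function or encoding, so SHA-256, the length prefix, and the function name `deduplication_key` are illustrative choices.

```python
import hashlib
from typing import Iterable, Optional


def deduplication_key(fields: Iterable[Optional[str]]) -> str:
    """Hash an ordered sequence of field values into a stable hex key."""
    digest = hashlib.sha256()
    for value in fields:
        # Tag each value so None differs from the empty string, and
        # length-prefix it so ("ab", "c") and ("a", "bc") hash differently.
        encoded = b"\x00" if value is None else b"\x01" + value.encode("utf-8")
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()


if __name__ == "__main__":
    # Evidence key built from the 13.2 field order, with example values.
    print(deduplication_key([
        "msg-1", "parser-1", "notification.endpoint.rejected_permanent",
        "nobody@example.net", None, "550", "5.1.1", None,
    ]))
```

Because `parser_version` is part of the evidence key, a parser upgrade yields new keys and therefore new evidence versions rather than collisions with older results.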

### 13.3 Rescan Behavior

A rescan should not duplicate existing evidence.

Rescans should support:

```text
same parser version → preserve previous result unless raw message changed
new parser version → create new parse result version
report-only-new → export only newly discovered evidence
full report → export all current evidence
```
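
The first two rescan rules amount to a small per-message decision. The sketch below is one possible reading: comparing a stored hash of the raw message to detect changes, and the names `rescan_action`, `parse_new`, `reparse`, and `keep`, are assumptions rather than part of the workplan.

```python
from typing import Optional


def rescan_action(
    stored_parser_version: Optional[str],
    current_parser_version: str,
    stored_raw_hash: Optional[str],
    current_raw_hash: str,
) -> str:
    """Return 'parse_new', 'reparse', or 'keep' for one message during a rescan."""
    if stored_parser_version is None:
        return "parse_new"  # no parse result exists yet for this message
    if stored_parser_version != current_parser_version:
        return "reparse"    # new parser version: create a new parse result version
    if stored_raw_hash != current_raw_hash:
        return "reparse"    # same parser, but the raw message changed
    return "keep"           # same parser and same raw message: preserve the result
```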

## 14. Storage Plan

For MVP, use local SQLite.

Suggested tables:

```text
mailbox_scans
mailbox_messages
parsed_messages
evidence_candidates
endpoint_quality
scan_cursors
raw_event_refs
```

This keeps the MVP simple while preserving a path to a server database later.
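
As a sketch of how SQLite can enforce message deduplication, the example below creates a reduced `mailbox_messages` table with a `UNIQUE` constraint on `deduplication_key` and uses `INSERT OR IGNORE` on rescans. The column subset, the text timestamps, and the helper name `record_message` are assumptions for illustration, not the repository's actual schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS mailbox_messages (
    mailbox_message_id TEXT PRIMARY KEY,
    mailbox_id         TEXT NOT NULL,
    imap_uid           TEXT,
    message_id_header  TEXT,
    first_seen_at      TEXT NOT NULL,
    last_seen_at       TEXT NOT NULL,
    deduplication_key  TEXT NOT NULL UNIQUE
);
"""


def record_message(
    conn: sqlite3.Connection,
    mailbox_message_id: str,
    mailbox_id: str,
    dedup_key: str,
    seen_at: str,
) -> bool:
    """Insert a message, or bump last_seen_at if its key is already stored.

    Returns True when the message is new.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO mailbox_messages "
        "(mailbox_message_id, mailbox_id, first_seen_at, last_seen_at, deduplication_key) "
        "VALUES (?, ?, ?, ?, ?)",
        (mailbox_message_id, mailbox_id, seen_at, seen_at, dedup_key),
    )
    if cur.rowcount == 1:
        return True
    # The row already existed, so only refresh the last-seen timestamp.
    conn.execute(
        "UPDATE mailbox_messages SET last_seen_at = ? WHERE deduplication_key = ?",
        (seen_at, dedup_key),
    )
    return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    print(record_message(conn, "m1", "return-mailbox-default", "k1", "2026-06-02T17:30:00Z"))
```

Keeping `first_seen_at` untouched on the second sighting preserves the two timestamps the CSV report needs.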

## 15. Parser Implementation Plan

### 15.1 Parser Pipeline

```text
load raw message
  → parse headers
  → parse MIME parts
  → identify DSN/report parts
  → extract original recipient
  → extract final recipient
  → extract original message id
  → extract SMTP/enhanced status codes
  → classify message
  → create parsed message record
  → create evidence candidate
```
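
The DSN-related steps of the pipeline (identify report parts, extract recipients and status codes) can be sketched with Python's standard-library `email` package, which exposes each header block of a `message/delivery-status` part as its own message object. The function names `extract_dsn_recipients` and `_address` and the returned dictionary keys are hypothetical; real bounce mail is often less well formed than this assumes.

```python
import email
from typing import Dict, List, Optional


def _address(field_value: Optional[str]) -> Optional[str]:
    """Strip the address-type prefix from a value such as 'rfc822; a@b.example'."""
    if not field_value:
        return None
    return field_value.split(";", 1)[-1].strip()


def extract_dsn_recipients(raw_message: str) -> List[Dict[str, Optional[str]]]:
    """Return one dict per per-recipient block in message/delivery-status parts."""
    msg = email.message_from_string(raw_message)
    results: List[Dict[str, Optional[str]]] = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        payload = part.get_payload()
        if not isinstance(payload, list):
            continue  # malformed part: no parsed header blocks to inspect
        for block in payload:
            if block.get("Final-Recipient") is None:
                continue  # per-message block (Reporting-MTA etc.), not a recipient
            results.append({
                "final_recipient": _address(block.get("Final-Recipient")),
                "original_recipient": _address(block.get("Original-Recipient")),
                "action": block.get("Action"),
                "enhanced_status_code": block.get("Status"),
                "diagnostic_code": block.get("Diagnostic-Code"),
            })
    return results
```

Messages without a `message/delivery-status` part return an empty list and would fall through to the heuristic parsers.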

### 15.2 Parser Layers

Implement parsers in layers:

```text
HeaderParser
MimeParser
DsnParser
BounceHeuristicParser
AutoReplyParser
ComplaintParser
UnsubscribeParser
HumanReplyHeuristicParser
EvidenceMapper
```

### 15.3 Parser Versioning

Every parse result shall include:

```text
parser_version
```

This allows rescanning old messages after parser improvements.

## 16. Evidence Mapping

Initial mappings:

| Parsed class              | Normalized event                            | Assessment                      |
| ------------------------- | ------------------------------------------- | ------------------------------- |
| `hard_bounce`             | `notification.endpoint.rejected_permanent`  | `fail.hard_bounce`              |
| `soft_bounce`             | `notification.endpoint.rejected_temporary`  | `undef.deferred`                |
| `delayed_delivery_notice` | `notification.endpoint.deferred`            | `undef.deferred`                |
| `final_delivery_failure`  | `notification.endpoint.rejected_permanent`  | `fail.expired_without_delivery` |
| `out_of_office`           | `interaction.out_of_office_received`        | `undef.out_of_office`           |
| `human_reply`             | `interaction.reply_received`                | `success.reply_received`        |
| `complaint_or_abuse`      | `notification.channel.complaint_received`   | `fail.complaint_received`       |
| `unsubscribe_or_opt_out`  | `notification.channel.unsubscribe_received` | `fail.unsubscribed`             |
| `unknown_return_message`  | `notification.endpoint.unknown`             | `undef.conflicting_evidence`    |
| `parse_failed`            | no event or diagnostic event                | `undef`                         |
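
The mapping table translates directly into a lookup. In the sketch below the event types and subclasses are copied from the table; the names `EVIDENCE_MAP` and `map_to_evidence`, and the choice to return `None` for `parse_failed` (the "no event" option), are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

# (normalized event type, assessment subclass) per parsed class, copied from
# the mapping table. parse_failed is omitted because this sketch emits no
# event for it.
EVIDENCE_MAP: Dict[str, Tuple[str, str]] = {
    "hard_bounce": ("notification.endpoint.rejected_permanent", "fail.hard_bounce"),
    "soft_bounce": ("notification.endpoint.rejected_temporary", "undef.deferred"),
    "delayed_delivery_notice": ("notification.endpoint.deferred", "undef.deferred"),
    "final_delivery_failure": (
        "notification.endpoint.rejected_permanent",
        "fail.expired_without_delivery",
    ),
    "out_of_office": ("interaction.out_of_office_received", "undef.out_of_office"),
    "human_reply": ("interaction.reply_received", "success.reply_received"),
    "complaint_or_abuse": (
        "notification.channel.complaint_received",
        "fail.complaint_received",
    ),
    "unsubscribe_or_opt_out": (
        "notification.channel.unsubscribe_received",
        "fail.unsubscribed",
    ),
    "unknown_return_message": (
        "notification.endpoint.unknown",
        "undef.conflicting_evidence",
    ),
}


def map_to_evidence(message_class: str) -> Optional[Dict[str, str]]:
    """Return event type, category, and subclass for a parsed class, or None."""
    mapping = EVIDENCE_MAP.get(message_class)
    if mapping is None:
        return None
    event_type, subclass = mapping
    return {
        "event_type": event_type,
        # The category is the prefix of the subclass, e.g. "fail" in "fail.hard_bounce".
        "assessment_category": subclass.split(".", 1)[0],
        "assessment_subclass": subclass,
    }
```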

## 17. Endpoint Quality Updates

The scanner should update basic endpoint quality.

Examples:

| Evidence      | Endpoint quality update                                    |
| ------------- | ---------------------------------------------------------- |
| Hard bounce   | `reachability = unreachable`, `last_failure_at = now`      |
| Soft bounce   | `reachability = degraded`, `last_failure_at = now`         |
| Complaint     | `suppression_state = suppressed`                           |
| Unsubscribe   | `suppression_state = opted_out`                            |
| Human reply   | `last_success_at = now`                                    |
| Out of office | `last_auto_reply_at = now`, `reachability = uncertain`     |

Endpoint quality is diagnostic and must not be treated as coordination success.
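
The update rules in the table could be applied to a per-address state record as sketched below. The field names and values come from the table; representing the state as a plain dictionary, passing `now` as a preformatted timestamp string, and the function name `apply_endpoint_quality` are assumptions.

```python
from typing import Dict


def apply_endpoint_quality(
    state: Dict[str, str], message_class: str, now: str
) -> Dict[str, str]:
    """Return a copy of an endpoint's quality state updated for one evidence item."""
    updated = dict(state)
    if message_class == "hard_bounce":
        updated.update(reachability="unreachable", last_failure_at=now)
    elif message_class == "soft_bounce":
        updated.update(reachability="degraded", last_failure_at=now)
    elif message_class == "complaint_or_abuse":
        updated["suppression_state"] = "suppressed"
    elif message_class == "unsubscribe_or_opt_out":
        updated["suppression_state"] = "opted_out"
    elif message_class == "human_reply":
        updated["last_success_at"] = now
    elif message_class == "out_of_office":
        updated.update(last_auto_reply_at=now, reachability="uncertain")
    # Any other class leaves the state unchanged.
    return updated
```

Note that a human reply only records `last_success_at`; it does not reset `reachability`, because the table does not state that it should.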

## 18. Work Packages

## T01 - Repository Bootstrap

```task
id: EMAIL-WP-0002-T01
status: done
priority: high
state_hub_task_id: "3a17215d-62a9-48ef-877f-a6fbc7e95a22"
```

Tasks:

```text
Create repo structure
Add INTENT.md
Add or update spec/ProductRequirementsDocument.md
Add this MVP workplan
Set up basic build/test tooling
Add initial CLI entry point
Add config loading
```

Acceptance:

```text
Project can run a placeholder CLI command.
Config file is loaded and validated.
```

## T02 - Mailbox Connector

```task
id: EMAIL-WP-0002-T02
status: done
priority: high
state_hub_task_id: "25a4da12-1bcd-4c6d-a0eb-a2f525b9c4b9"
```

Tasks:

```text
Implement IMAP connection
Support TLS
Read credentials from environment variables
List/select configured folder
Fetch message metadata
Fetch full message source
Support max_messages_per_run
Support dry-run mode
```

Acceptance:

```text
CLI can connect to mailbox and list/fetch messages without modifying mailbox.
```

## T03 - Scan State and Storage

```task
id: EMAIL-WP-0002-T03
status: done
priority: high
state_hub_task_id: "16b95a6b-1375-4c91-8b78-0b75d51e0aeb"
```

Tasks:

```text
Add SQLite state store
Create schema
Store scan records
Store mailbox messages
Store scan cursor
Implement message deduplication
Implement incremental scan
Implement full rescan
```

Acceptance:

```text
Running scanner twice does not duplicate mailbox messages.
Full rescan can revisit all messages while preserving deduplication.
```

## T04 - MIME and Header Parsing

```task
id: EMAIL-WP-0002-T04
status: done
priority: high
state_hub_task_id: "5a50cd85-b0ab-4017-aba0-b2087068abb4"
```

Tasks:

```text
Parse RFC message headers
Parse Message-ID
Parse From/To/Subject/Date
Parse MIME body parts
Extract text/plain
Extract text/html fallback
Extract message/delivery-status parts where present
Store raw headers reference
Optionally store raw body reference
```

Acceptance:

```text
Scanner extracts basic metadata and text from representative bounce and reply messages.
```

## T05 - Bounce and DSN Parser

```task
id: EMAIL-WP-0002-T05
status: done
priority: high
state_hub_task_id: "8ea826d1-0add-4573-9bb4-2b73adefba55"
```

Tasks:

```text
Detect delivery-status notifications
Extract original recipient
Extract final recipient
Extract action
Extract status
Extract diagnostic code
Extract remote MTA if present
Classify hard vs soft bounce
Map SMTP/enhanced status codes
```

Acceptance:

```text
Representative hard and soft bounce samples are classified correctly.
```

## T06 - Auto-Reply and Human Reply Classifier

```task
id: EMAIL-WP-0002-T06
status: done
priority: high
state_hub_task_id: "4d94a332-173b-4787-8fb2-27aa63db6a8d"
```

Tasks:

```text
Detect out-of-office patterns
Detect auto-reply headers
Detect common German and English OOO phrases
Detect human reply fallback
Classify challenge-response as separate class if possible
```

Acceptance:

```text
Representative OOO and human reply samples are classified with confidence.
```

## T07 - Complaint and Unsubscribe Classifier

```task
id: EMAIL-WP-0002-T07
status: done
priority: high
state_hub_task_id: "8637d383-25f7-45b5-9680-427ed2ca87bf"
```

Tasks:

```text
Detect abuse/complaint messages
Detect unsubscribe/opt-out requests
Map to channel complaint/unsubscribe events
Create suppression candidates
```

Acceptance:

```text
Representative complaint and unsubscribe examples are classified.
```

## T08 - Evidence Candidate Generation

```task
id: EMAIL-WP-0002-T08
status: done
priority: high
state_hub_task_id: "6d62dea0-f416-4c0b-80a0-7c16422b8e5f"
```

Tasks:

```text
Map parsed classes to normalized event types
Generate EmailEvidenceCandidate records
Assign assessment category/subclass
Assign evidence strength
Assign confidence
Generate evidence deduplication key
```

Acceptance:

```text
Parsed messages produce evidence candidates according to the mapping table.
```

## T09 - Endpoint Quality Updates

```task
id: EMAIL-WP-0002-T09
status: done
priority: medium
state_hub_task_id: "0d110877-953f-4aa2-961b-eec81e0159d4"
```

Tasks:

```text
Create endpoint_quality table
Update endpoint quality from evidence
Track last failure/success
Track suppression state
Track reason code history
```

Acceptance:

```text
Hard bounce updates endpoint quality to unreachable.
Complaint/unsubscribe updates suppression state.
```

## T10 - CSV Report Generator

```task
id: EMAIL-WP-0002-T10
status: done
priority: medium
state_hub_task_id: "5ab35176-d6c2-4c73-b7b3-bde4c097e3ee"
```

Tasks:

```text
Generate timestamped CSV report
Include required columns
Support report-only-new
Support full evidence report
Support parse failure report
Write deterministic headers
```

Acceptance:

```text
Running scanner creates a CSV report with evidence rows.
Report can be opened in spreadsheet tools.
```

## T11 - Golden Test Corpus

```task
id: EMAIL-WP-0002-T11
status: done
priority: high
state_hub_task_id: "514fa099-781b-4590-aae4-c28970413b3f"
```

Tasks:

```text
Create test fixture directory
Add synthetic hard bounce
Add synthetic soft bounce
Add delayed delivery notice
Add final failure
Add out-of-office
Add human reply
Add unsubscribe
Add unknown return message
Add parse failure sample
```

Acceptance:

```text
Automated tests verify expected classification and normalized event output.
```

## T12 - Minimal Documentation

```task
id: EMAIL-WP-0002-T12
status: done
priority: medium
state_hub_task_id: "a5f7067e-87be-4438-ba35-b12d06a8181e"
```

Tasks:

```text
Add README quickstart
Document config file
Document CLI commands
Document CSV format
Document evidence mapping
Document limitations
```

Acceptance:

```text
A developer can run the scanner against a test mailbox or fixture directory.
```

## 19. MVP Milestones

### Milestone 1: Scan and Store

Goal:

```text
Connect to mailbox, fetch messages, store metadata, deduplicate.
```

Includes:

```text
T01
T02
T03
```

### Milestone 2: Parse and Classify

Goal:

```text
Parse messages and classify bounces, OOO, replies, complaints, unsubscribe.
```

Includes:

```text
T04
T05
T06
T07
```

### Milestone 3: Evidence and Reports

Goal:

```text
Generate normalized evidence candidates and timestamped CSV reports.
```

Includes:

```text
T08
T09
T10
```

### Milestone 4: Confidence and Repeatability

Goal:

```text
Add golden tests, parser versioning, and documentation.
```

Includes:

```text
T11
T12
```

## 20. MVP Acceptance Criteria

The MVP is complete when:

1. A user can configure access to one IMAP mailbox.
2. The scanner can run without modifying mailbox contents.
3. The scanner can perform incremental scans.
4. The scanner can perform full rescans.
5. Already-seen messages are deduplicated.
6. Hard bounces are classified.
7. Soft bounces are classified.
8. Delayed delivery notices are classified.
9. Out-of-office replies are classified.
10. Human replies are classified.
11. Complaints or unsubscribe messages are classified where detectable.
12. Unknown messages are preserved as unknown rather than ignored silently.
13. Evidence candidates are generated.
14. Endpoint quality is updated.
15. A timestamped CSV report is produced.
16. Golden tests cover representative sample messages.
17. The report format aligns with the `email-connect` evidence model.
18. The implementation does not overclaim email evidence.

## 21. Design Rules

### 21.1 Do Not Overclaim

The scanner must not infer more than the mailbox evidence supports.

Examples:

```text
No bounce found ≠ delivery success
Out-of-office ≠ recipient completed action
Human reply ≠ legally valid acceptance
Unknown message ≠ failure
```

### 21.2 Preserve Unknowns

Unknown and parse-failed messages should be visible in reports.

### 21.3 Prefer Evidence Over Status

The scanner should produce evidence rows, not only final statuses.

### 21.4 Make Rescans Safe

Rescans should be safe, deduplicated, and parser-version-aware.

### 21.5 Keep Raw References

Store enough raw reference data to allow later inspection.

## 22. Suggested Initial Repository Structure

```text
email-connect/
  INTENT.md
  spec/
    ProductRequirementsDocument.md
  workplans/
    EMAIL-WP-0002-mvp-mailbox-evidence-scanner.md
  docs/
    EmailAdapterSpecification.md
  src/
    email_connect/
      cli/
      config/
      mailbox/
      parsing/
      evidence/
      reporting/
      storage/
  tests/
    fixtures/
      hard_bounce/
      soft_bounce/
      delayed_delivery/
      out_of_office/
      human_reply/
      unsubscribe/
      unknown/
    test_mailbox_scanner.py
    test_bounce_parser.py
    test_evidence_mapping.py
  config/
    mailbox.example.yml
  reports/
    .gitkeep
```

## 23. Future Extensions After MVP

Possible next steps:

```text
Provider webhook ingestion
Outbound send API
Template manager
Minimal UI
Multi-mailbox scanning
OAuth mailbox access
Mailbox write-back actions
Advanced DSN parsing
Advanced German/English auto-reply classification
Natural-language reply intent extraction
Suppression export
coordination-engine evidence event push
Endpoint quality dashboard
Provider-specific bounce mappings
```

## 24. Summary

The Mailbox Evidence Scanner MVP is a strong first implementation slice for `email-connect`.

It delivers immediate practical value by converting a return mailbox into structured email-channel evidence reports. It also validates the core `email-connect` evidence model before implementing outbound sending, provider webhooks, or full coordination-engine integration.

The guiding rule is:

> Scan the mailbox, preserve the evidence, classify conservatively, report clearly, and never treat missing evidence as delivery success.