Token Evidence Model

State Hub token events distinguish source-backed measurements from inferred operational signals. Dashboards and reports should read quality and provenance from the structured fields; the note field remains human-readable context only and should not be parsed.

Measurement Kinds

Kind	Meaning	Default confidence
`measured`	Parsed from a source that reports usage metadata, such as Codex session logs or Claude transcript usage blocks.	`1.0`
`allocated`	A share of a larger known total, assigned to a task/workstream by a documented allocation method.	`0.70`
`estimated`	A fallback or operator-entered estimate without direct source evidence.	`0.35`
`superseded`	Historical rows retained for audit but excluded from active totals.	`0.0`
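
As a minimal sketch of how the kinds above might be applied, the following Python maps each kind to its default confidence and computes an active total that skips superseded rows. The `TokenEvent` class and the helper names are hypothetical and only illustrate the table; they are not State Hub's actual schema or code.

```python
from dataclasses import dataclass
from typing import List, Optional

# Default confidence per measurement kind, copied from the table above.
DEFAULT_CONFIDENCE = {
    "measured": 1.0,
    "allocated": 0.70,
    "estimated": 0.35,
    "superseded": 0.0,
}


@dataclass
class TokenEvent:
    # Hypothetical minimal row used only for this illustration.
    measurement_kind: str
    tokens_in: int
    tokens_out: int
    confidence: Optional[float] = None  # None means "use the kind's default"


def effective_confidence(event: TokenEvent) -> float:
    """Return the explicit confidence, or the default for the event's kind."""
    if event.confidence is not None:
        return event.confidence
    return DEFAULT_CONFIDENCE[event.measurement_kind]


def active_total(events: List[TokenEvent]) -> int:
    """Sum tokens_in + tokens_out, excluding superseded rows."""
    return sum(
        e.tokens_in + e.tokens_out
        for e in events
        if e.measurement_kind != "superseded"
    )
```

Keeping superseded rows in the input and filtering them at read time mirrors the intent of retaining them for audit while excluding them from active totals.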

Source Providers

Provider	Source
`codex_session`	Codex Desktop `.codex/sessions/` and `.codex/archived_sessions/` JSONL token_count events.
`claude_transcript`	Claude Code `.claude/projects/*/.jsonl` usage metadata. Transcript text is never stored.
`llm_connect`	Future llm-connect usage metadata.
`manual`	Explicit operator/API input.
`task_fallback`	Fixed task-completion fallback rows created when no source data is available.

Provenance Fields

Each source-backed row should include:

source_provider, source_id, source_path, source_created_at
parser_version, ingested_at, confidence
cached_input_tokens, reasoning_output_tokens, raw_total_tokens
raw_metadata with parser and attribution metadata, never transcript content

tokens_in + tokens_out remains the default active total. Cached input and reasoning output are preserved separately so dashboards can show both default and provider-style totals without rewriting history.
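
The two totals can be sketched as follows. This is an illustration under stated assumptions: the document does not define how a provider-style total is derived, so the sketch assumes raw_total_tokens holds the provider-reported figure and falls back to the default total when it is absent. The function names and the dict-shaped row are hypothetical.

```python
from typing import List

# Provenance fields listed above for source-backed rows.
PROVENANCE_FIELDS = (
    "source_provider",
    "source_id",
    "source_path",
    "source_created_at",
    "parser_version",
    "ingested_at",
    "confidence",
)


def default_total(row: dict) -> int:
    """Default active total: tokens_in + tokens_out."""
    return row["tokens_in"] + row["tokens_out"]


def provider_style_total(row: dict) -> int:
    """Assumption: use raw_total_tokens when recorded, else the default total."""
    raw = row.get("raw_total_tokens")
    return raw if raw is not None else default_total(row)


def missing_provenance(row: dict) -> List[str]:
    """List the provenance fields that are absent or null on a row."""
    return [name for name in PROVENANCE_FIELDS if row.get(name) is None]
```

Because both totals are computed from stored fields at read time, a dashboard could switch between them without modifying any historical row.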

Idempotency

Measured sources must be written with a stable source_id. State Hub enforces one row per (measurement_kind, source_provider, source_id) tuple, so POST /token-events/upsert updates the existing row for a growing live session rather than creating duplicates.
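
The upsert behaviour can be illustrated with an in-memory sketch keyed on the same tuple. The `TokenEventStore` class is hypothetical and stands in for whatever storage State Hub actually uses; it only shows why re-ingesting a growing session leaves a single row.

```python
from typing import Dict, Tuple

Key = Tuple[str, str, str]


class TokenEventStore:
    """Hypothetical in-memory store with one row per uniqueness key."""

    def __init__(self) -> None:
        self._rows: Dict[Key, dict] = {}

    def upsert(self, row: dict) -> bool:
        """Insert or update a row; return True if a new row was created."""
        key = (row["measurement_kind"], row["source_provider"], row["source_id"])
        created = key not in self._rows
        # Merge so later ingests overwrite counts but keep earlier fields.
        self._rows[key] = {**self._rows.get(key, {}), **row}
        return created

    def __len__(self) -> int:
        return len(self._rows)

    def get(self, key: Key) -> dict:
        return self._rows[key]
```

For example, ingesting the same session twice with a larger token count on the second pass updates the row in place:

```python
store = TokenEventStore()
base = {"measurement_kind": "measured", "source_provider": "codex_session", "source_id": "s1"}
store.upsert({**base, "tokens_in": 100, "tokens_out": 20})
store.upsert({**base, "tokens_in": 250, "tokens_out": 60})
print(len(store))  # one row for the session
```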

Migration Playbook

Run the token-event provenance migration.
Run python3 scripts/token_reconcile.py --since 2026-05-19 and inspect the dry-run report.
Run python3 scripts/token_reconcile.py --since 2026-05-19 --apply to upsert measured Codex/Claude source rows.
Run the same command with --zero-superseded-fallbacks only after measured source rows cover the affected window.
Check /token-events/quality/ or the Token Cost dashboard for fallback, missing-provenance, duplicate-source, and unattributed measured signals.
Keep historical fallback rows as superseded; do not delete them.
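
The reconcile steps above can be sketched as a small command builder. The script path and flags are taken from the playbook. The sketch assumes that "the same command" in the --zero-superseded-fallbacks step means the --apply invocation with the extra flag; that reading is not confirmed by the text. The helper names are hypothetical, and the sketch only prints the commands rather than running them.

```python
import shlex
from typing import List

SCRIPT = ["python3", "scripts/token_reconcile.py"]


def reconcile_command(
    since: str,
    apply: bool = False,
    zero_superseded_fallbacks: bool = False,
) -> List[str]:
    """Build the argv for one reconcile run."""
    argv = SCRIPT + ["--since", since]
    if apply:
        argv.append("--apply")
    if zero_superseded_fallbacks:
        argv.append("--zero-superseded-fallbacks")
    return argv


def playbook(since: str) -> List[List[str]]:
    """Return the reconcile runs in playbook order: dry run, apply, zero fallbacks."""
    return [
        reconcile_command(since),
        reconcile_command(since, apply=True),
        # Assumption: the final step reuses the --apply invocation.
        reconcile_command(since, apply=True, zero_superseded_fallbacks=True),
    ]


for argv in playbook("2026-05-19"):
    print(shlex.join(argv))
```

Printing the commands first keeps the ordering explicit, so the final step is only run by hand once measured source rows cover the affected window.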
