BatchEvaluator runs evaluation prompts over batches of items. It
supports incremental evaluation (items whose content digest is
unchanged are skipped), per-item error isolation, progress callbacks,
and aggregate token usage tracking.
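
A minimal sketch of the behavior described above, not the actual
BatchEvaluator code. The names content_digest and evaluate_batch, the
dict-based cache, and the "tokens" result key are all assumptions made
for illustration:

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Optional


def content_digest(item: Dict[str, Any]) -> str:
    """Return a stable SHA-256 digest of an item's JSON content."""
    payload = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def evaluate_batch(
    items: List[Dict[str, Any]],
    evaluate: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical per-item evaluator
    cache: Dict[str, Dict[str, Any]],  # digest -> previous result
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, int]:
    """Evaluate items, skipping any whose digest is already cached."""
    summary = {"evaluated": 0, "skipped": 0, "failed": 0, "total_tokens": 0}
    for index, item in enumerate(items, start=1):
        digest = content_digest(item)
        if digest in cache:
            # Incremental evaluation: content unchanged since the last run.
            summary["skipped"] += 1
        else:
            try:
                result = evaluate(item)
                cache[digest] = result
                summary["evaluated"] += 1
                # Assumed result shape: an optional integer "tokens" field.
                summary["total_tokens"] += result.get("tokens", 0)
            except Exception:
                # Per-item error isolation: one failure does not stop the batch.
                summary["failed"] += 1
        if on_progress is not None:
            on_progress(index, len(items))
    return summary
```

In this sketch failed items are not cached, so a later run retries
them while still skipping items that already succeeded.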

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>