fix(pipeline): retry on 0-entity response, save raw debug, improve template

- SourcePipeline: retry split_entities stage once when 0 entity delimiters
  are found (free-tier models intermittently return short non-formatted
  responses); save raw LLM response to <stage>-raw.md alongside prompts
- Return None (pausing the pipeline) rather than writing an empty view file
  when no entities are found after max retries
- _http.py: wrap json.JSONDecodeError in LLMAPIError with body preview
- extract-entities.md: add explicit H2-heading format example to Output
  Format section to prevent models from using inline "Section:" format
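
The retry-then-pause behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual SourcePipeline code: the names `split_with_retry`, `call_llm`, `save_raw`, and `ENTITY_DELIMITER` are hypothetical, and the H2-heading delimiter is assumed from the template change.

```python
import re
from typing import Callable, Optional

# Assumption: each entity section starts with an H2 heading ("## Name").
ENTITY_DELIMITER = re.compile(r"^## .+$", re.MULTILINE)


def split_with_retry(
    call_llm: Callable[[], str],
    save_raw: Callable[[str], None],
    max_retries: int = 1,
) -> Optional[list[str]]:
    """Return entity sections, or None to signal that the pipeline should pause."""
    for _attempt in range(max_retries + 1):
        raw = call_llm()
        save_raw(raw)  # keep the raw response for debugging, even on failure
        starts = [m.start() for m in ENTITY_DELIMITER.finditer(raw)]
        if starts:
            # Each section runs from its heading to the next heading (or end of text).
            ends = starts[1:] + [len(raw)]
            return [raw[s:e].strip() for s, e in zip(starts, ends)]
    # No delimiters after all attempts: the caller should not write an empty view.
    return None
```

Returning None instead of an empty list lets the caller tell "the model produced nothing usable" apart from a valid result, so it can pause rather than write an empty file.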

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:26:28 +01:00
parent 72d9904485
commit 5ede1de4b8
3 changed files with 70 additions and 6 deletions

@@ -40,7 +40,14 @@ def post_json(
     try:
         with urllib.request.urlopen(req, timeout=timeout) as resp:
             body = resp.read().decode()
-            return json.loads(body)
+            try:
+                return json.loads(body)
+            except json.JSONDecodeError as exc:
+                preview = body[:300].replace("\n", "\\n")
+                raise LLMAPIError(
+                    f"Invalid JSON response from {url}: {exc} — body preview: {preview!r}",
+                    cause=exc,
+                ) from exc
     except urllib.error.HTTPError as exc:
         body = ""
         try: