generated from coulomb/repo-seed
Compare commits
8 Commits
e489d614c2
...
main
| SHA1 |
|---|
| 0a7176bf2d |
| f6003bc4a1 |
| 7cf52213bf |
| 25fb6946bc |
| 720e46eef5 |
| 40b2f12797 |
| d37f22ac18 |
| d8a08d6032 |
20
.claude/rules/agents.md
Normal file
@@ -0,0 +1,20 @@
## Kaizen Agents

Specialized agent personas available on demand via the state-hub MCP.

**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them

Common agents:

| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |

All 17 agents: call `list_kaizen_agents()` for the full list.
8
.claude/rules/architecture.md
Normal file
@@ -0,0 +1,8 @@
## Architecture

<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->

## Quick Reference

`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference
50
.claude/rules/credential-routing.md
Normal file
@@ -0,0 +1,50 @@
# Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=evidence-source` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
38
.claude/rules/first-session.md
Normal file
@@ -0,0 +1,38 @@
## First Session Protocol

Triggered when `get_domain_summary("infotech")` shows **no workstreams**.
The project is registered but work has not yet been structured.

**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/infotech/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/infotech/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs

**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.

**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**

**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/ESRC-WP-NNNN-<slug>.md ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="cee7bedf-2b48-46ef-8601-006474f2ad7a", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```

**Step 5 — Record the setup**
```
add_progress_event(
    summary="First session: structured infotech into N workstreams, M tasks",
    event_type="milestone",
    topic_id="cee7bedf-2b48-46ef-8601-006474f2ad7a",
    detail={"workstreams": [...], "tasks_created": M}
)
```

<!-- Delete or archive this file once past first session -->
8
.claude/rules/repo-boundary.md
Normal file
@@ -0,0 +1,8 @@
## Repo boundary

This repo owns **evidence-source** only. It does not own:

<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->
5
.claude/rules/repo-identity.md
Normal file
@@ -0,0 +1,5 @@
**Purpose:** Document ingestion, extraction, fingerprinting, citation recovery. Depends only on citation-engine. INTENT-only during umbrella-first MVP.

**Domain:** infotech
**Repo slug:** evidence-source
**Topic ID:** cee7bedf-2b48-46ef-8601-006474f2ad7a
85
.claude/rules/session-protocol.md
Normal file
@@ -0,0 +1,85 @@
## Session Protocol

Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`

**Step 1 — Orient**

Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("infotech")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`

**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="evidence-source", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.

Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=evidence-source&unread_only=true" \
  | python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```
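For agents that script the non-MCP inbox check rather than shelling out to `curl`, the two requests above could be built roughly as follows. This is a minimal sketch, assuming the hub listens on `http://127.0.0.1:8000` and that the endpoints behave as the `curl` commands imply. The helper names (`inbox_url`, `mark_read_request`, `fetch_json`) are hypothetical, and the shape of the JSON response is not specified here, so the sketch does not interpret it.

```python
import json
import urllib.parse
import urllib.request

HUB = "http://127.0.0.1:8000"  # Dev Hub address stated in this protocol


def inbox_url(agent: str, base: str = HUB) -> str:
    """Build the unread-inbox URL used by the curl fallback."""
    query = urllib.parse.urlencode({"to_agent": agent, "unread_only": "true"})
    return f"{base}/messages/?{query}"


def mark_read_request(message_id: str, base: str = HUB) -> urllib.request.Request:
    """Build (but do not send) the PATCH that marks one message as read."""
    return urllib.request.Request(
        f"{base}/messages/{message_id}/read",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


def fetch_json(url: str):
    """Send a GET and decode the JSON body; this needs a running hub."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Building the requests separately from sending them keeps the URL and payload logic checkable even when the hub is offline.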

**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.
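The scan in Step 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the hub's own logic: it assumes each workplan starts with `---`-delimited frontmatter containing a `status:` line and that tasks live in fenced `task` blocks with `id:` and `status:` lines. The function names are hypothetical, and the regex-based parsing is a stand-in for a real YAML parser.

```python
from __future__ import annotations

import re
from pathlib import Path

OPEN_PLAN_STATUSES = {"ready", "active", "blocked"}
PENDING_TASK_STATUSES = {"wait", "todo", "progress"}

FENCE = "`" * 3  # built at runtime so this example nests safely in Markdown
TASK_BLOCK = re.compile(
    re.escape(FENCE) + r"task\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def field(block: str, key: str) -> str | None:
    """Return the first whitespace-free value of 'key:' in a block."""
    match = re.search(rf"^{key}:\s*(\S+)", block, re.MULTILINE)
    return match.group(1) if match else None


def frontmatter(text: str) -> str:
    """Return the text between the leading '---' markers, or '' if absent."""
    match = re.match(r"---\n(.*?)\n---", text, re.DOTALL)
    return match.group(1) if match else ""


def pending_tasks(text: str) -> list[str]:
    """List ids of wait/todo/progress tasks in a ready/active/blocked plan."""
    if field(frontmatter(text), "status") not in OPEN_PLAN_STATUSES:
        return []
    pending = []
    for block in TASK_BLOCK.findall(text):
        if field(block, "status") in PENDING_TASK_STATUSES:
            pending.append(field(block, "id") or "<missing id>")
    return pending


def scan(directory: str = "workplans") -> dict[str, list[str]]:
    """Map each workplan file name to its pending task ids."""
    result = {}
    for path in sorted(Path(directory).glob("*.md")):
        ids = pending_tasks(path.read_text(encoding="utf-8"))
        if ids:
            result[path.name] = ids
    return result
```

Calling `scan()` from the repo root would return only the files that still have open work, which is the list Step 4 needs.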

**Step 4 — Present brief**

1. **Active workstreams** for `infotech` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:evidence-source]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
   - `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
   - `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo

If no workstreams: follow First Session Protocol (`first-session.md`).

**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`

> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).

**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="cee7bedf-2b48-46ef-8601-006474f2ad7a", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{"topic_id":"cee7bedf-2b48-46ef-8601-006474f2ad7a","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
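When the session-close event is sent from a script, assembling the JSON body in code avoids quoting mistakes in the one-line `curl` payload. The sketch below mirrors the fields of the `curl` example; `progress_payload` and `progress_request` are hypothetical helper names, and leaving out unset ids is an assumption about what the endpoint accepts.

```python
import json
import urllib.request

TOPIC_ID = "cee7bedf-2b48-46ef-8601-006474f2ad7a"  # topic id used in this protocol


def progress_payload(summary, workstream_id=None, task_id=None,
                     event_type="note", author="codex"):
    """Assemble the progress event body, leaving out ids that are not set."""
    payload = {
        "topic_id": TOPIC_ID,
        "event_type": event_type,
        "summary": summary,
        "author": author,
    }
    if workstream_id:
        payload["workstream_id"] = workstream_id
    if task_id:
        payload["task_id"] = task_id
    return payload


def progress_request(payload, base="http://127.0.0.1:8000"):
    """Build (but do not send) the POST to /progress/."""
    return urllib.request.Request(
        f"{base}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send the event; that step needs a running hub and is left out here.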
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=evidence-source
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=evidence-source
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.
19
.claude/rules/stack-and-commands.md
Normal file
@@ -0,0 +1,19 @@
## Stack

<!-- TODO: Fill in language, frameworks, and key dependencies -->
- **Language:**
- **Key deps:**

## Dev Commands

```bash
# TODO: Fill in the standard commands for this repo

# Install dependencies

# Run tests

# Lint / type check

# Build / package (if applicable)
```
40
.claude/rules/workplan-convention.md
Normal file
@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)

File location: `workplans/ESRC-WP-NNNN-<slug>.md`
ID prefix: `ESRC-WP-`

Work items originate as files in this repo **before** being registered in the hub.

Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.

Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-ESRC-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
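The archive naming rule can be expressed as a small path helper. This is a sketch only: `archived_path` is a hypothetical name, and the slug `ingest` in the usage note is made up for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def archived_path(workplan: str, completed: date) -> PurePosixPath:
    """Map workplans/<name>.md to workplans/archived/YYMMDD-<name>.md."""
    src = PurePosixPath(workplan)
    # %y%m%d yields the two-digit year, month, and day of the completion date.
    return src.parent / "archived" / f"{completed:%y%m%d}-{src.name}"
```

For example, `archived_path("workplans/ESRC-WP-0001-ingest.md", date(2026, 6, 22))` gives `workplans/archived/260622-ESRC-WP-0001-ingest.md`. Only the file name changes; the frontmatter id inside the file is left as it was.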

Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.

Ecosystem todos from other agents arrive as `[repo:evidence-source]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.

Task blocks use this shape:

```task
id: ESRC-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
```

Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.
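A task block can be checked mechanically before `fix-consistency` runs. The sketch below parses the body of one block and reports violations of the shape above. It is an illustration, not the hub's validator: the function names are hypothetical, and reading `NNNN` as exactly four digits and `T01` as a two-digit suffix is an assumption.

```python
import re

STATUSES = {"wait", "todo", "progress", "done", "cancel"}
PRIORITIES = {"high", "medium", "low"}
# Assumption: NNNN is four digits and the task suffix is two digits.
TASK_ID = re.compile(r"^(ESRC-WP-\d{4}|ADHOC-\d{4}-\d{2}-\d{2})-T\d{2}$")


def parse_task_block(body: str) -> dict:
    """Split 'key: value' lines, dropping trailing ' # ...' comments."""
    fields = {}
    for line in body.splitlines():
        line = line.split(" #", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    return fields


def validate_task(fields: dict) -> list:
    """Return human-readable problems; an empty list means the block is valid."""
    problems = []
    if not TASK_ID.match(fields.get("id", "")):
        problems.append("id does not match ESRC-WP-NNNN-Tnn or ADHOC-YYYY-MM-DD-Tnn")
    if fields.get("status") not in STATUSES:
        problems.append("status must be one of " + ", ".join(sorted(STATUSES)))
    if fields.get("priority") not in PRIORITIES:
        problems.append("priority must be one of " + ", ".join(sorted(PRIORITIES)))
    return problems
```

Returning a list of problems rather than raising on the first one lets a session report every issue in a workplan in a single pass.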

<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
18
.custodian-brief.md
Normal file
@@ -0,0 +1,18 @@
<!-- custodian-brief: generated by fix-consistency — do not edit manually -->
# Custodian Brief — evidence-source

**Domain:** infotech
**Last synced:** 2026-06-22 18:28 UTC
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*

## Active Workstreams

*(none — repo may need first-session setup)*

---
## MCP Orientation (when available)

If the state-hub MCP server is reachable, call:
`get_domain_summary("infotech")`
This provides richer cross-domain context.
If the MCP call fails, use this file as your orientation source.
19
.repo-classification.yaml
Normal file
@@ -0,0 +1,19 @@
repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: agent
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags:
  - evidence
  - traceability
  - source-management
  business_stake:
  - technology
  - product
  - operations
  business_mechanics:
  - coordination
  - operation
219
AGENTS.md
Normal file
@@ -0,0 +1,219 @@
# evidence-source — Agent Instructions

## Repo Identity

**Purpose:** Document ingestion, extraction, fingerprinting, citation recovery. Depends only on citation-engine. INTENT-only during umbrella-first MVP.

**Domain:** infotech
**Repo slug:** evidence-source
**Topic ID:** `cee7bedf-2b48-46ef-8601-006474f2ad7a`
**Workplan prefix:** `ESRC-WP-`

---

## State Hub Integration

The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.

| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |

### Orient at session start

```bash
# Offline brief — works without hub connection
cat .custodian-brief.md

# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=cee7bedf-2b48-46ef-8601-006474f2ad7a&status=active" \
  | python3 -m json.tool

# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=evidence-source&unread_only=true" \
  | python3 -m json.tool
```

Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```

### Log progress (required at session close)

```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workstream_id": "<uuid>",
    "task_id": "<uuid>"
  }'
```

Omit `workstream_id` / `task_id` when not applicable.

### Update task status

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
```

### Flag a task for human review

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"needs_human": true, "intervention_note": "reason"}'
```

---

## Session Protocol

**Start:**
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
2. Check inbox: `GET /messages/?to_agent=evidence-source&unread_only=true`; mark read
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
4. Check human-needed tasks: `GET /tasks/?needs_human=true`

**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`

**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. Note for the custodian operator: after workplan file changes, run from
   `~/state-hub`:
   ```bash
   make fix-consistency REPO=evidence-source
   ```
   This syncs task status from files into the hub DB.

---

## Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=evidence-source` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`

<!-- REPO-AGENTS-EXTENSIONS -->
<!-- Append repo-specific agent instructions below this marker.
The state-hub template sync preserves content after this line. -->

---

## Workplan Convention (ADR-001)

Work items originate as files in this repo — not in the hub. The hub is a
read/cache/index layer that rebuilds from files.

**File location:** `workplans/ESRC-WP-NNNN-<slug>.md`

**Archived location:** finished workplans may move to
`workplans/archived/YYMMDD-ESRC-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
the completion/archive date; the frontmatter `id` does not change.

**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
this only for low-risk work completed directly; create a normal workplan for
anything needing analysis, design, approval, dependencies, or multiple phases.

**Frontmatter:**

```yaml
---
id: ESRC-WP-NNNN
type: workplan
title: "..."
domain: infotech
repo: evidence-source
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>" # written by fix-consistency — do not edit
---
```

Use `proposed` for a new draft, `ready` after review against current repo
state, and `finished` after implementation. `stalled` and `needs_review` are
derived health labels, not frontmatter statuses.

**Task block format** (one per `##` section):

```
## Task Title

` ` `task
id: ESRC-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
` ` `

Task description text.
```

Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.

To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=evidence-source`
   (or send a message to the hub agent via `POST /messages/`)
12
CLAUDE.md
Normal file
@@ -0,0 +1,12 @@
# evidence-source — Claude Code Instructions

@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/credential-routing.md
@.claude/rules/agents.md
492
INTENT.md
Normal file
@@ -0,0 +1,492 @@
# INTENT

## Purpose

This repository exists to provide the document source, ingestion, extraction, metadata, and citation recovery layer for the **citation-evidence** ecosystem.

**evidence-source** turns raw documents and source clues into usable, searchable, addressable document representations that can support annotations, evidence items, citation recovery, and source-backed workflows.

It is responsible for answering the source-side questions:

> What is this document?
> How can we extract usable text and structure from it?
> How can we find or recover a cited source passage?

---

## Primary Utility

The repository provides the source pipeline for citation-evidence.

It should make it possible to:

- import documents into a collection or workspace,
- identify document type and media type,
- compute stable document fingerprints,
- extract document metadata,
- extract canonical text,
- create document representations for PDFs, Markdown, HTML, and later other formats,
- build maps between text, pages, sections, and rendered views,
- support local full-text search,
- support source lookup and citation recovery,
- provide the document representations needed by **evidence-anchor** and **citation-work**.

This repository turns documents into evidence-ready sources.
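One way to read "stable document fingerprints" is a content hash over the raw bytes, so that the same file yields the same identifier regardless of its name or location. The sketch below takes that reading. SHA-256 and the `sha256:` prefix are assumptions made for illustration; this document does not name an algorithm or an identifier format, and `fingerprint` is a hypothetical helper.

```python
import hashlib
import io


def fingerprint(stream, chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks so large documents are not loaded whole."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


# Usage: identical bytes give identical fingerprints.
first = fingerprint(io.BytesIO(b"%PDF-1.7 example bytes"))
second = fingerprint(io.BytesIO(b"%PDF-1.7 example bytes"))
assert first == second
```

A byte-level hash changes whenever the file is re-saved, even if the text is unchanged. Whether the fingerprint should instead be computed over canonical text is a design question this sketch does not settle.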

---

## Intended Users

Primary users of this repository are developers and agents implementing source handling for citation-evidence.

They include:

- developers building document import workflows,
- developers building review collections,
- developers implementing PDF, Markdown, and HTML source handling,
- developers implementing citation recovery,
- developers integrating local or external source libraries,
- coding agents that need structured access to document text and metadata.

End users should experience this repository indirectly whenever they add a document, search source text, or recover a citation.

---

## Strategic Role

The strategic role of **evidence-source** is to make source documents usable as reliable evidence substrates.

Without this repository, the system would depend on whatever a viewer happens to show at runtime. That would make citation capture, re-opening, search, and recovery fragile.

**evidence-source** creates the normalized source representations that allow the rest of the system to operate consistently across document formats.

It enables the flow:

```text
Raw Source
→ Document Identity
→ Metadata
→ Canonical Text
→ Document Representation
→ Searchable Source
→ Anchorable Evidence Context
```

---

## Core Concept

The core concept of this repository is the **document representation**.

A document representation is a normalized, searchable, addressable view of a source document.

For a PDF, a representation may include:

```text
document fingerprint
metadata
page count
page text
global canonical text
page-local offset map
text item map
page dimensions
source-to-rendering hints
```
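The relationship between page text, global canonical text, and the page-local offset map can be illustrated with a small data structure. This is a sketch under assumptions: the class name `PdfRepresentation`, joining pages with a newline, and 1-based page numbers are illustrative choices, not a format defined by this repository.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class PdfRepresentation:
    page_texts: list
    separator: str = "\n"  # assumption: pages are joined by one newline
    page_starts: list = field(init=False)
    canonical_text: str = field(init=False)

    def __post_init__(self):
        # Record where each page begins inside the global canonical text.
        starts, offset = [], 0
        for text in self.page_texts:
            starts.append(offset)
            offset += len(text) + len(self.separator)
        self.page_starts = starts
        self.canonical_text = self.separator.join(self.page_texts)

    def locate(self, global_offset: int):
        """Return (1-based page number, page-local offset) for a global offset."""
        if not 0 <= global_offset < len(self.canonical_text):
            raise IndexError(global_offset)
        index = bisect_right(self.page_starts, global_offset) - 1
        return index + 1, global_offset - self.page_starts[index]
```

With `PdfRepresentation(["Hello", "World!"])`, the canonical text is `"Hello\nWorld!"` and `locate(6)` returns `(2, 0)`, the first character of page 2. An offset that falls on a separator maps to one position past the end of the preceding page's text, which callers would need to handle.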

For Markdown or HTML, a representation may include:

```text
canonical text
rendered HTML
sanitized content
heading map
section map
DOM or AST structure
offset-to-node map
source line map where available
```
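As an illustration of a heading map with source line information, the sketch below scans Markdown for ATX headings and records level, title, character offset, and line number. It is deliberately simplified: `heading_map` is a hypothetical helper, offsets refer to the raw Markdown source rather than to canonical text, and headings inside fenced code blocks or in Setext style are not handled.

```python
import re

# ATX headings only: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def heading_map(markdown: str) -> list:
    """Return one entry per heading with level, title, offset, and 1-based line."""
    return [
        {
            "level": len(match.group(1)),
            "title": match.group(2),
            "offset": match.start(),
            "line": markdown.count("\n", 0, match.start()) + 1,
        }
        for match in HEADING.finditer(markdown)
    ]
```

A section map could be derived from this by treating each heading's offset as the start of a section that runs to the next heading of the same or a higher level.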
|
|
||||||
|
These representations allow **evidence-anchor** to create and resolve selectors and allow **citation-work** to display and search documents efficiently.

---

## Scope

This repository should own:

* document import workflows,
* document source identification,
* media type detection,
* document fingerprinting,
* source URI handling,
* metadata extraction,
* canonical text extraction,
* PDF text extraction,
* Markdown normalization,
* HTML normalization and sanitization,
* document representation generation,
* representation caching,
* local source search support,
* quote search support,
* citation clue parsing,
* local citation recovery,
* external source discovery hooks,
* recovery state tracking,
* privacy boundaries for source lookup.

It should provide the source-side capabilities consumed by:

* **citation-engine** for creating `Document` and `DocumentRepresentation` records,
* **evidence-anchor** for selector creation and resolution,
* **citation-work** for document review workflows,
* **evidence-binder** when evidence needs source context,
* **citation-evidence** for the integrated product experience.

---

## Out of Scope

This repository should not own the broader evidence domain or user workflows.

Specifically, it should not own:

* the canonical evidence domain model,
* persistence policy beyond source and representation storage contracts,
* low-level anchor resolution algorithms,
* visual highlight rendering,
* review workspace UI,
* form-field binding semantics,
* visual guide overlay behavior,
* citation card rendering,
* application shell and deployment,
* final human validation of evidence quality.

Those responsibilities belong to the appropriate citation-evidence subsystem repositories.

---

## Architectural Position

```text
citation-evidence
  integrated product shell

citation-engine
  core domain model, services, persistence contracts

evidence-source
  document ingestion, extraction, metadata, representations, citation recovery

evidence-anchor
  selectors, anchor resolution, re-anchoring, highlighting contracts

citation-work
  review workspace and annotation UX

evidence-binder
  evidence-to-target binding and active evidence state
```

**evidence-source** should provide document representations, not define what evidence means.

It should feed reliable source material into the rest of the system.

---

## Primary Workflows

### 1. Import Document

A user or system adds a source document.

```text
Add Source
→ Identify Media Type
→ Compute Fingerprint
→ Extract Metadata
→ Extract Text
→ Build Representation
→ Register Document
```
```
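
The first two steps of the import flow above (media type identification and fingerprinting) might look like the following sketch. It assumes the media type can be guessed from the URI suffix and that the fingerprint is a SHA-256 digest of the raw bytes; neither choice is stated in this document, and `ImportedSource` and `import_source` are hypothetical names.

```python
import hashlib
import mimetypes
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportedSource:
    source_uri: str
    media_type: str
    fingerprint: str


def import_source(source_uri: str, content: bytes) -> ImportedSource:
    # Guess the media type from the URI suffix; fall back to a generic type.
    guessed, _encoding = mimetypes.guess_type(source_uri)
    media_type = guessed or "application/octet-stream"
    # Fingerprint the raw bytes so identical content yields the same value.
    digest = hashlib.sha256(content).hexdigest()
    return ImportedSource(source_uri, media_type, f"sha256:{digest}")


doc = import_source("file:///library/report.pdf", b"%PDF-1.7 example bytes")
```

A production importer would probably also sniff the leading bytes, since a file suffix can be missing or wrong.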

### 2. Generate PDF Representation

A PDF is converted into a representation suitable for review and anchoring.

```text
PDF Source
→ Load PDF
→ Extract Page Text
→ Normalize Text
→ Build Page Map
→ Build Offset Map
→ Store Representation
```
```
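
The "Build Page Map" and "Build Offset Map" steps can be sketched without any PDF library, starting from page texts that have already been extracted. The sketch assumes pages are joined with a single newline to form the global text; `build_page_map` and `locate` are hypothetical helper names.

```python
from bisect import bisect_right

PAGE_SEPARATOR = "\n"  # assumption: pages are joined with one newline


def build_page_map(page_texts: list[str]) -> tuple[str, list[int]]:
    """Return the global text and the global start offset of every page."""
    starts: list[int] = []
    offset = 0
    for text in page_texts:
        starts.append(offset)
        offset += len(text) + len(PAGE_SEPARATOR)
    return PAGE_SEPARATOR.join(page_texts), starts


def locate(starts: list[int], global_offset: int) -> tuple[int, int]:
    """Map a global offset to (1-based page number, page-local offset)."""
    index = bisect_right(starts, global_offset) - 1
    return index + 1, global_offset - starts[index]


text, starts = build_page_map(["alpha beta", "gamma delta"])
page, local = locate(starts, text.index("delta"))
```

Keeping only the page start offsets makes the lookup a binary search, which stays cheap for long documents.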

### 3. Generate Markdown / HTML Representation

A Markdown or HTML source is converted into a normalized rendered and searchable representation.

```text
Markdown / HTML Source
→ Parse / Sanitize
→ Render if needed
→ Extract Canonical Text
→ Build Heading / Section Map
→ Build Offset Map
→ Store Representation
```
```
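
For HTML input, the text extraction and heading map steps above could be prototyped with the standard library parser. This is a simplified sketch under stated assumptions: it drops `script` and `style` content from the text, collapses whitespace, and records the text offset of each heading. It is not a sanitizer for rendered HTML, and `CanonicalTextExtractor` and `extract` are hypothetical names.

```python
from html.parser import HTMLParser


class CanonicalTextExtractor(HTMLParser):
    SKIPPED = {"script", "style"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self.length = 0
        self.headings: list[tuple[str, int]] = []  # (tag, offset in text)
        self._skip_depth = 0
        self._pending_heading: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            self._pending_heading = tag

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        chunk = " ".join(data.split())  # collapse runs of whitespace
        if self._skip_depth or not chunk:
            return
        if self.parts:  # one space between consecutive text fragments
            self.parts.append(" ")
            self.length += 1
        if self._pending_heading:  # offset of the heading's first character
            self.headings.append((self._pending_heading, self.length))
            self._pending_heading = None
        self.parts.append(chunk)
        self.length += len(chunk)


def extract(html: str) -> tuple[str, list[tuple[str, int]]]:
    parser = CanonicalTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts), parser.headings


text, headings = extract(
    "<h1>Title</h1><script>alert(1)</script><p>Body  text</p><h2>Next</h2>"
)
```

Note that this sketch inserts a space between every pair of text fragments, including adjacent inline elements; a real extractor would need block-level awareness.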

### 4. Search Local Sources

A user or subsystem searches available source material.

```text
Search Query / Quote
→ Search Metadata
→ Search Full Text
→ Return Candidate Documents / Passages
```

### 5. Recover Citation

A user provides a citation, quote, or source clue.

```text
Citation Clue
→ Parse Source Metadata
→ Search Local Library
→ Optionally Search Configured External Sources
→ Load Candidate Source
→ Search Exact Quote
→ Search Fuzzy Quote
→ Present Candidate Passages
→ User Confirms
→ Create Source Context for Annotation
```
```
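
The "Search Exact Quote" and "Search Fuzzy Quote" steps above can be illustrated with a small function that tries an exact substring match first and falls back to a sliding-window similarity score. This is a sketch only: the use of `difflib.SequenceMatcher`, the `0.8` threshold, and the name `find_quote` are assumptions, and the quadratic scan would be too slow for large libraries.

```python
from difflib import SequenceMatcher


def find_quote(text: str, quote: str, threshold: float = 0.8):
    """Return (start, end, score, kind) for the best match, or None."""
    start = text.find(quote)
    if start != -1:
        return start, start + len(quote), 1.0, "exact"
    # Fuzzy fallback: slide a quote-sized window and keep the best ratio.
    best = None
    window = len(quote)
    for i in range(0, max(1, len(text) - window + 1)):
        score = SequenceMatcher(None, quote, text[i : i + window]).ratio()
        if best is None or score > best[2]:
            best = (i, i + window, score, "fuzzy")
    if best and best[2] >= threshold:
        return best
    return None


source_text = "The quick brown fox jumps over the lazy dog."
exact = find_quote(source_text, "brown fox")
fuzzy = find_quote(source_text, "quick brwon fox")  # transposed letters
missing = find_quote(source_text, "zzzzzz")
```

Returning the match kind and score alongside the offsets lets a caller present fuzzy hits as candidates for confirmation rather than as settled matches.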

---

## Initial Source Types

The first version should support or prepare for:

```text
PDF
Markdown
HTML
plain text
remote URL references
```

Later versions may support:

```text
DOCX
EPUB
scanned image documents
OCR-derived text
IIIF resources
TEI XML
structured datasets with source passages
```

---

## Citation Recovery States

Citation recovery should be modeled explicitly.

Initial recovery states may include:

```text
created
source-found-fulltext
source-found-preview-only
source-found-metadata-only
source-not-found
quote-found
quote-not-found
candidate-passages-found
manual-confirmation-needed
confirmed
annotation-created
failed
```

The system should distinguish between finding a source and finding the exact cited passage.
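
One way to make the states explicit is a string-valued enum. The sketch below copies the state names from the list above; the class name `RecoveryState` and the grouping into source-level and passage-level outcomes are assumptions made for illustration, and the real `CitationRecoveryAttempt` state enum is owned by the shared contracts.

```python
from enum import Enum


class RecoveryState(str, Enum):
    CREATED = "created"
    SOURCE_FOUND_FULLTEXT = "source-found-fulltext"
    SOURCE_FOUND_PREVIEW_ONLY = "source-found-preview-only"
    SOURCE_FOUND_METADATA_ONLY = "source-found-metadata-only"
    SOURCE_NOT_FOUND = "source-not-found"
    QUOTE_FOUND = "quote-found"
    QUOTE_NOT_FOUND = "quote-not-found"
    CANDIDATE_PASSAGES_FOUND = "candidate-passages-found"
    MANUAL_CONFIRMATION_NEEDED = "manual-confirmation-needed"
    CONFIRMED = "confirmed"
    ANNOTATION_CREATED = "annotation-created"
    FAILED = "failed"


# Hypothetical grouping: states that describe whether a source was located...
SOURCE_OUTCOMES = frozenset({
    RecoveryState.SOURCE_FOUND_FULLTEXT,
    RecoveryState.SOURCE_FOUND_PREVIEW_ONLY,
    RecoveryState.SOURCE_FOUND_METADATA_ONLY,
    RecoveryState.SOURCE_NOT_FOUND,
})

# ...versus states that describe whether the cited passage was located.
PASSAGE_OUTCOMES = frozenset({
    RecoveryState.QUOTE_FOUND,
    RecoveryState.QUOTE_NOT_FOUND,
    RecoveryState.CANDIDATE_PASSAGES_FOUND,
})
```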

---

## Privacy and Source Lookup Principles

Source lookup can create privacy risks, for example when a quote or other text from a private document is sent to an external service.

The repository should follow these principles:

* search local sources first,
* make external lookup explicit and configurable,
* avoid sending private document text to external services by default,
* record which external services were queried,
* distinguish public metadata lookup from full-text upload,
* allow deployments to disable external lookup completely,
* prefer deterministic local processing where possible.

External source discovery should be an extension point, not an unavoidable default behavior.
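
A few of these principles (local first, external lookup off by default, and an audit trail of queried services) can be expressed in a small policy object. Everything here is a hypothetical sketch: `LookupPolicy`, `SourceLookup`, and the provider callables are illustrative names, and only a metadata query string is ever passed to a provider.

```python
from dataclasses import dataclass, field
from typing import Callable

Search = Callable[[str], list[str]]


@dataclass
class LookupPolicy:
    external_lookup_enabled: bool = False  # off unless a deployment opts in


@dataclass
class SourceLookup:
    policy: LookupPolicy
    local_search: Search
    external_providers: dict[str, Search] = field(default_factory=dict)
    queried_services: list[str] = field(default_factory=list)  # audit trail

    def find(self, metadata_query: str) -> list[str]:
        hits = list(self.local_search(metadata_query))  # local sources first
        if hits or not self.policy.external_lookup_enabled:
            return hits
        for name, provider in self.external_providers.items():
            self.queried_services.append(name)  # record every external query
            hits.extend(provider(metadata_query))
        return hits


lookup = SourceLookup(
    policy=LookupPolicy(),
    local_search=lambda query: [],
    external_providers={"catalog": lambda query: ["remote-record"]},
)
default_hits = lookup.find("Some Title 1999")  # external providers not called
```

With the default policy the provider dictionary is never consulted, which keeps external discovery an opt-in extension point.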

---

## Design Principles

### Source Identity First

Every imported document should receive a stable identity based on available metadata, source URI, and fingerprint.

### Canonical Text Matters

Anchoring and search depend on canonical text. The repository should make text normalization explicit and repeatable.
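
A repeatable normalization step might look like the sketch below. The specific rules (Unicode NFKC, removal of soft hyphens, whitespace collapsing) and the version label are assumptions chosen for illustration; the actual canonical text normalization is defined by the shared contracts.

```python
import re
import unicodedata

NORMALIZATION_VERSION = "v1"  # hypothetical label stored with each representation

_SOFT_HYPHEN = "\u00ad"
_WHITESPACE = re.compile(r"\s+")


def normalize_text(raw: str) -> str:
    # NFKC folds compatibility forms such as ligatures and no-break spaces.
    text = unicodedata.normalize("NFKC", raw)
    # Soft hyphens are invisible line-break hints and would break quote search.
    text = text.replace(_SOFT_HYPHEN, "")
    # Collapse every run of whitespace to one space and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


sample = normalize_text("\ufb01nal\u00a0 re\u00adsult\n\n")
```

Recording the normalization version next to the text is one way to detect offsets that were computed under older rules.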

### Representation Is Not Source

The original source and generated representation are different things.

The system should preserve this distinction.

### Local Before External

Citation recovery should search local documents before looking elsewhere.

### Human Confirmation

Recovered citations should not silently become confirmed evidence. Candidate matches should be presented for confirmation when uncertainty exists.

### Format-Aware, Model-Neutral

The repository should understand document formats but should not own the broader evidence model.

### Cache Expensive Work

Text extraction, fingerprinting, and representation generation should be cacheable by source fingerprint and version.
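
Caching by fingerprint and version can be sketched with a key that combines the processing step, the source fingerprint, and the version of the extraction logic. The in-memory dictionary and the `cached` helper are hypothetical; a real cache would presumably be persistent.

```python
from typing import Callable

_cache: dict[tuple[str, str, str], object] = {}


def cached(step: str, fingerprint: str, version: str,
           compute: Callable[[], object]) -> object:
    """Run `compute` once per (step, fingerprint, version) and reuse the result."""
    key = (step, fingerprint, version)
    if key not in _cache:
        _cache[key] = compute()
    return _cache[key]


calls: list[str] = []


def extract_text() -> str:
    calls.append("extract")  # stands in for expensive extraction work
    return "canonical text"


first = cached("canonical-text", "sha256:abc", "v1", extract_text)
second = cached("canonical-text", "sha256:abc", "v1", extract_text)  # cache hit
bumped = cached("canonical-text", "sha256:abc", "v2", extract_text)  # recomputed
```

Including the version in the key means that a change to extraction or normalization rules invalidates old entries without any explicit purge.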

### Agent-Friendly Output

Extracted metadata, representations, and recovery candidates should be structured enough for agents to inspect, rank, and explain.

---

## Expected Dependencies

This repository is expected to depend on shared types and service contracts from:

```text
citation-engine
  Document, DocumentRepresentation, CitationRecoveryAttempt, source-related contracts
```

It may be consumed by:

```text
citation-work
  to load reviewable documents and document representations

evidence-anchor
  to resolve selectors against extracted representations

evidence-binder
  to retrieve source context for linked evidence

citation-evidence
  to provide integrated import and recovery workflows
```

It should avoid depending on review UI or form-binding implementation details.

---

## First Useful Version

A first useful version of **evidence-source** should provide:

* source import interface,
* media type detection,
* document fingerprinting,
* basic metadata extraction,
* PDF text extraction,
* Markdown text extraction,
* HTML sanitization and text extraction,
* canonical text normalization,
* document representation generation,
* simple local quote search,
* recovery attempt model or contract,
* examples showing how a document becomes a representation usable by **evidence-anchor**.

The first version does not need full external source discovery or OCR, but it should establish the ingestion and representation pattern.

---

## Success Criteria

The repository is successful when another subsystem can use it to:

1. import a source document,
2. identify and fingerprint it,
3. extract useful metadata,
4. generate canonical text,
5. generate a document representation,
6. search the source text,
7. provide representation data to **evidence-anchor**,
8. support a local citation recovery attempt from a quote or citation clue.

A developer or coding agent should be able to understand from this repository how raw documents become evidence-ready sources.

---

## Repository Character

This repository should be:

* source-focused,
* ingestion-oriented,
* privacy-conscious,
* format-aware,
* representation-centered,
* cache-friendly,
* suitable for local-first and server-side use,
* explicit about uncertainty in citation recovery,
* careful not to absorb review or binding responsibilities.

---

## MVP Coordination — Code Lives Upstream

During the umbrella-first MVP phase (decided 2026-05-24), **the source code
for this subsystem does not live in this repository yet**. It lives in the
umbrella repo at `citation-evidence/src/source/`.

This INTENT.md documents the *intended* responsibilities and boundaries.
When the ingestion and representation interfaces have stabilized through
actual MVP use, the corresponding code will be extracted into this repository.

**Shared contracts** (Document and DocumentRepresentation shapes,
CitationRecoveryAttempt state enum, canonical text normalization, allowed
dependency edges) are maintained in the umbrella repo:

* `citation-evidence/wiki/SharedContracts.md`
* `citation-evidence/wiki/DependencyMap.md`
* `citation-evidence/docs/decisions/` (ADRs)

This subsystem's eventual code must not contradict those documents. Changes
to shared contracts happen in the umbrella, not here.

Under the dependency map, **`evidence-source` may depend only on
`citation-engine`** — not on `evidence-anchor`. When ingestion needs to know
"could a selector resolve here?", the answer travels through events, not
direct calls.

---

## Guiding Statement

**evidence-source exists to turn documents and citation clues into reliable, searchable, anchorable source context.**

17
README.md
@@ -1,3 +1,16 @@
-# repo-seed
+# evidence-source

-A git repository template to bootstrap coulomb projects from.
+Document source, ingestion, extraction, metadata, and citation recovery:
+PDF/HTML/MD ingest, fingerprinting, page-/offset-map construction,
+canonical-text extraction, and the recovery behavior for stale selectors.
+
+## MVP status: INTENT only
+
+During the citation-evidence MVP, code lives upstream in
+[`citation-evidence`](../citation-evidence/) under `src/source/`. This repo
+currently holds `INTENT.md` describing what will move here. Contract
+changes belong in
+[`citation-evidence/wiki/SharedContracts.md`](../citation-evidence/wiki/SharedContracts.md),
+not here.
+
+Per the dependency map, source depends on `shared/` and `engine/` only.
137
SCOPE.md
Normal file
@@ -0,0 +1,137 @@
# SCOPE

> This file helps you quickly understand what this repository is about,
> when it is relevant, and when it is not.
> It is intentionally lightweight and may be incomplete.

---

## One-liner

<!-- Describe the purpose of this repository in one precise sentence. -->
<!-- Example: "Provides a lightweight event router for Kubernetes-native systems." -->

---

## Core Idea

<!-- What is the main capability or idea behind this repository? -->
<!-- What problem does it try to solve? -->

---

## In Scope

<!-- What this repository is responsible for. -->
<!-- Be explicit and concrete. -->

-
-
-

---

## Out of Scope

<!-- What this repository deliberately does NOT do. -->
<!-- This is often more important than "In Scope". -->

-
-
-

---

## Relevant When

<!-- When should someone consider using or exploring this repository? -->

-
-
-

---

## Not Relevant When

<!-- When should someone ignore this repository? -->

-
-
-

---

## Current State

<!-- Rough indication of maturity. No strict format required. -->

- Status: <!-- e.g. concept / experimental / active / stable / deprecated -->
- Implementation: <!-- e.g. idea / partial / substantial / complete -->
- Stability: <!-- e.g. unstable / evolving / stable -->
- Usage: <!-- e.g. none / personal / internal / production -->

<!-- Add any notes that help set expectations. -->

---

## How It Fits

<!-- Where does this repository sit in the bigger picture? -->

- Upstream dependencies:
- Downstream consumers:
- Often used with:

---

## Terminology

<!-- Terms that are important to understand this repo. -->
<!-- Especially useful if naming differs from other repos. -->

- Preferred terms:
- Also known as:
- Potentially confusing terms:

---

## Related / Overlapping Repositories

<!-- List repositories that have similar or adjacent responsibilities. -->
<!-- Helps detect duplication and navigate the ecosystem. -->

- <repo-name> — <!-- how it relates -->

---

## Getting Oriented

<!-- If someone decides to look deeper, where should they start? -->

- Start with:
- Key files / directories:
- Entry points:

---

## Provided Capabilities

<!-- What can this repo's domain provide to other domains on request? -->
<!-- Each capability block is parsed by the state-hub capability catalog ingest. -->
<!-- Remove the examples and add your own, or leave empty if none. -->

<!--
```capability
type: infrastructure
title: Example capability title
description: What this capability provides, in one or two sentences.
keywords: [keyword1, keyword2, keyword3]
```
-->

---

## Notes

<!-- Anything else worth knowing. Keep it short. -->
12
registry/README.md
Normal file
@@ -0,0 +1,12 @@
# Capability Registry

Markdown-first capability index for federation and reuse planning.

## Authoring

1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
2. Add the row to `indexes/capabilities.yaml`.
3. Run `reuse-surface validate` from a checkout with the CLI installed.
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.

Federation contract: reuse-surface `docs/RegistryFederation.md`.
0
registry/capabilities/.gitkeep
Normal file
4
registry/indexes/capabilities.yaml
Normal file
@@ -0,0 +1,4 @@
version: 1
updated: '2026-06-16'
domain: helix_forge
capabilities: []
19
workplans/ESRC-WP-0001-intent-placeholder.md
Normal file
@@ -0,0 +1,19 @@
---
id: ESRC-WP-0001
type: workplan
title: "INTENT placeholder — await extraction from citation-evidence"
domain: infotech
repo: evidence-source
status: backlog
owner: codex
topic_slug: citation_evidence_mvp
created: "2026-06-21"
updated: "2026-06-21"
state_hub_workstream_id: "64771b5d-4b83-4848-a562-4b00aad017b2"
---

# ESRC-WP-0001 — INTENT Placeholder

Umbrella-first MVP: source/ingestion code will be extracted from `citation-evidence`
when the subsystem boundary stabilizes. This file satisfies ADR-001 workplan
structure until then. See `INTENT.md`.