docs(adr): establish three foundational ADRs for Event Bridge architecture

ADR-001: activity-core as org-wide Event Bridge (boundaries, NATS as org infrastructure, state hub delegation, rules-core module-first, issue-core adapter interface, capabilities domain assignment).

ADR-002: markdown-as-definition format for event types and ActivityDefinitions (co-located intent/schema/logic/debugging, publisher-declared governance with environment-configurable curator gate, attribute type system, task template files).

ADR-003: Rule vs. Instruction model and expression DSL (sandboxed Python-like AST evaluator for Rules, trusted-fields prompt injection protection for Instructions, output schema enforcement, audit trail, testing strategy, rules-core module boundary).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 16:48:42 +02:00
parent 0818ce3eb1
commit 617b2420d3
3 changed files with 827 additions and 0 deletions
--- a/docs/adr/adr-001-event-bridge-architecture.md
+++ b/docs/adr/adr-001-event-bridge-architecture.md
@@ -0,0 +1,190 @@
 ---
 id: ACT-ADR-001
 type: architecture-decision-record
 title: "Activity-Core as Coulomb Org Event Bridge"
 status: accepted
 decided_by: Bernd Worsch
 date: "2026-05-14"
 scope: cross-repo
 affects:
  - activity-core
  - the-custodian/state-hub
  - issue-facade (→ issue-core)
  - repo-scoping
 tags: ["architecture", "event-bridge", "activity-core", "orchestration", "event-loop"]
 ---
 # ACT-ADR-001: Activity-Core as Coulomb Org Event Bridge
 ## Status
 Accepted.
 ## Context
 The Coulomb organization's set of repositories, services, and deployments is growing
 beyond what a single person can coordinate manually. The state hub tracks cross-domain
 state but has no mechanism to automatically respond to it. Recurring maintenance
 (dependency scans, SBOM staleness checks, consistency audits) is implemented as
 bespoke cron jobs baked into individual services — scattered, hard to audit, and
 impossible to govern from a single vantage point.
 Three forces drive the need for a dedicated orchestration layer:
 1. **Scale**: as the repo count grows, manual coordination becomes the bottleneck.
 2. **Reactivity**: org-level events (new repo registered, CVE published, deployment
   completed) should trigger coordinated responses without human intervention.
 3. **Separation of concerns**: the state hub is a read model and should remain one.
   If it accumulated automation logic, it would become a God object.
 ## Decision
 **activity-core is the org-wide Event Bridge for the Coulomb organization.**
 Its responsibility is exactly three things:
 1. **Receive events** — time-based (cron, one-off scheduled) and domain events (NATS,
   Gitea webhooks, state hub lifecycle signals).
 2. **Evaluate rules and instructions** — given event payload and resolved context,
   determine what work must be created.
 3. **Emit task sets** — publish structured task creation requests to issue-core.
 It does not execute work. It does not track task lifecycle. It does not manage projects.
 ### Boundary rules
 | Concern | Owner |
 |---|---|
 | Cross-org task scheduling and reactive automation | **activity-core** |
 | Task lifecycle (create, assign, track, close) | **issue-core** |
 | Project and initiative management (phased, completion-gated) | **project-core** (future) |
 | Repository capability profiling | **repo-scoping** |
 | Cross-domain coordination state | **state hub** |
 | Execution of automatable tasks | Temporal workers (per-repo) |
 ### Event type registry
 Event types are declared by publishers as markdown definition files (see ACT-ADR-002).
 Governance is **publisher-declared by default**: a publisher registers its event types
 by committing definition files to the event-types registry. In production environments,
 a curator gate can be enabled — registry entries must be reviewed before the runtime
 accepts events of that type. This is a configuration flag per runtime scope (dev,
 staging, prod), not a hard-coded rule.
 ### State hub relationship
 The state hub **delegates automation to activity-core** rather than implementing it
 internally. Concretely:
 - Maintenance jobs currently baked into the state hub (consistency sync, SBOM
  staleness checks) are migrated to ActivityDefinitions in activity-core.
 - The state hub becomes a **publisher** of lifecycle events on NATS
  (`org.workstream.created`, `org.decision.resolved`, `org.repo.registered`, etc.).
 - The state hub does not subscribe to activity-core's output directly; it reads
  task state from issue-core when needed.
 This preserves the state hub as a read model and makes activity-core the single
 home for automation policy.
 ### rules-core: module-first
 The rule and instruction evaluation engine starts as `src/activity_core/rules/` — a
 module with a clean internal boundary (no imports from Temporal, Postgres, or FastAPI
 within the module). Extraction to a standalone `rules-core` repository happens when a
 **second consumer** (e.g. state hub governance, project-core) needs the engine. This
 follows the same discipline as the task-flow-engine extraction plan (CUST-TFE-SCOPE).
 ### NATS as org infrastructure
 NATS JetStream is promoted from an activity-core internal component to **org-wide
 event bus infrastructure**. It runs as a standalone service (not bundled in
 activity-core's docker-compose) with its own lifecycle. All services that publish
 or subscribe to org events do so via NATS streams.
 ### issue-core integration
 activity-core communicates with issue-core via a **task emission adapter** — an
 abstraction layer that, in the initial implementation, calls issue-core's REST API.
 The adapter interface is defined now; the transport can migrate to NATS subscription
 (issue-core subscribes to `task.spawned` events) once issue-core adds that capability.
 This avoids hardcoding REST coupling throughout the codebase.
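A minimal sketch of what such an adapter boundary could look like. The names `TaskSpec`, `TaskSink`, `InMemorySink`, and `emit_tasks`, and the fields on `TaskSpec`, are hypothetical and not taken from the codebase; a REST-backed or NATS-backed sink would implement the same protocol.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task creation request; field names are assumptions."""
    title: str
    target_repo: str
    labels: tuple[str, ...] = ()


class TaskSink(Protocol):
    """Transport-agnostic adapter interface: callers never see REST or NATS."""

    def emit(self, task: TaskSpec) -> str:
        """Send one task creation request and return an issue-core reference."""
        ...


@dataclass
class InMemorySink:
    """Stand-in transport for tests; a REST-backed sink would have the same shape."""
    sent: list[TaskSpec] = field(default_factory=list)

    def emit(self, task: TaskSpec) -> str:
        self.sent.append(task)
        return f"mem-{len(self.sent)}"


def emit_tasks(sink: TaskSink, tasks: list[TaskSpec]) -> list[str]:
    """Emit every task through whichever sink is configured."""
    return [sink.emit(task) for task in tasks]
```

Because callers depend only on `TaskSink`, swapping the transport later would mean adding one new class rather than touching every call site.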
 ### Webhook receiver
 A new HTTP endpoint within activity-core accepts inbound webhooks from Gitea (and
 later GitHub, other services). It normalises payloads to the canonical EventEnvelope
 format, validates against the event type registry, and publishes to NATS. This runs
 alongside the existing FastAPI `api.py`.
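The normalisation step might look roughly like the sketch below. It is an illustration under assumptions: the event type `org.repo.pushed`, the publisher value, and the webhook fields `repository.full_name` and `ref` are placeholders, and the envelope keys mirror the example payload in ACT-ADR-002. Publishing to NATS is left out.

```python
import uuid
from datetime import datetime, timezone


def normalise_push(payload: dict, known_types: set[str]) -> dict:
    """Map an inbound push webhook body to an EventEnvelope-shaped dict."""
    event_type = "org.repo.pushed"  # hypothetical event type name
    if event_type not in known_types:
        # Registry validation: reject events whose type is not registered.
        raise ValueError(f"unregistered event type: {event_type}")
    return {
        "id": str(uuid.uuid4()),
        "type": event_type,
        "version": "1.0",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "publisher": "gitea",  # assumed publisher identifier
        "attributes": {
            # Assumed webhook fields; adjust to the real payload shape.
            "repo_slug": payload["repository"]["full_name"],
            "ref": payload["ref"],
        },
    }
```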
 ### Domain assignment
 activity-core and issue-core are assigned to the **`capabilities`** domain — the
 same domain as repo-scoping. These are org-wide infrastructure tools that serve all
 domains equally, not artefacts of any single project or custodian's personal workflow.
 issue-core is explicitly disassociated from the markitect domain.
 ## Trigger types
 Three trigger types are supported:
 | Type | Description | Temporal mechanism |
 |---|---|---|
 | `cron` | Recurring schedule (5-field cron + timezone + misfire policy) | Temporal Schedule (implemented in WP-0002) |
 | `event` | React to a named event type on NATS | Temporal workflow started by Event Router |
 | `scheduled` | One-off at a future datetime | Temporal Schedule with `remaining_actions: 1` |
 `scheduled` is a new trigger type added in WP-0003.
 ## Consequences
 ### Immediate
 - activity-core's `INTENT.md` and `SCOPE.md` are rewritten to reflect this architecture.
 - The `task_instances` Postgres table is reclassified as a **spawn audit trail** —
  it records the act of spawning (what was created, when, which issue-core reference)
  but is not the authoritative task record. Authoritative lifecycle state lives in
  issue-core.
 - A task emission adapter interface (`src/activity_core/issue_sink.py`) replaces any
  direct Postgres writes to `task_instances` with calls through the adapter.
 - The `TaskExecutorWorkflow` stub from WP-0001 is replaced with the actual adapter
  call in WP-0003.
 ### Medium term
 - State hub adds NATS publishing to its lifecycle operations.
 - Gitea webhook receiver added to activity-core as a new HTTP router.
 - Existing state hub maintenance crons are migrated to ActivityDefinitions.
 - issue-facade is renamed issue-core and re-registered under the `capabilities` domain.
 ### Long term
 - rules-core extracted as a standalone package when a second consumer appears.
 - project-core created (depends on task-flow-engine extraction) for multi-phase
  initiative management — explicitly out of scope for activity-core.
 - NATS gets its own operational runbook and monitoring as org infrastructure.
 ## Alternatives Considered
 **State hub absorbs activity-core functionality**: rejected — turns the state hub into
 a God object, violates the read-model boundary, and makes automation logic impossible
 to test independently.
 **Per-repo automation (GitHub Actions style)**: rejected — cross-repo coordination
 requires a single vantage point that can see all repos; per-repo actions can't express
 org-level triggers or context.
 **Activity-core as a thin Temporal wrapper only**: rejected — without the event type
 registry and rule model, it's just a scheduler. The governance and introspection
 properties are the point.
 **Separate rules-core from day one**: rejected — premature extraction adds dependency
 management overhead before a second consumer exists. Module-first with a clean boundary
 costs nothing and preserves the extraction option.
 ## Related
 - ACT-ADR-002 — Event type and ActivityDefinition definition format
 - ACT-ADR-003 — Rule vs. Instruction model and DSL
 - CUST-ADR-001 — Workplans as repository artefacts (canon/architecture/)
 - CUST-TFE-SCOPE-2026-000001 — task-flow-engine extraction plan (canon/projects/)
 - activity-core INTENT.md (to be written)
 - activity-core WP-0003 (to be written)
--- a/docs/adr/adr-002-definition-format.md
+++ b/docs/adr/adr-002-definition-format.md
@@ -0,0 +1,356 @@
 ---
 id: ACT-ADR-002
 type: architecture-decision-record
 title: "Markdown-as-Definition Format for Event Types and ActivityDefinitions"
 status: accepted
 decided_by: Bernd Worsch
 date: "2026-05-14"
 scope: cross-repo
 affects:
  - activity-core
  - any event publisher registering event types
 tags: ["architecture", "format", "event-type", "activity-definition", "markdown", "documentation"]
 ---
 # ACT-ADR-002: Markdown-as-Definition Format
 ## Status
 Accepted.
 ## Context
 Event type schemas and ActivityDefinition rules need to be understood and authored
 by three distinct audiences simultaneously: humans reviewing and debugging automation,
 agents creating and modifying definitions at runtime, and machines parsing and
 evaluating them. Traditional approaches split these concerns — schemas go in JSON
 Schema or YAML, documentation goes in a wiki, logic goes in code — and they drift
 apart. A bug in a rule requires cross-referencing three places to understand intent,
 check the schema, and read the condition.
 The Custodian ecosystem already uses markdown files with YAML frontmatter as the
 authoritative format for workplans, ADRs, SCOPE.md, and INTENT.md — all understood
 by humans and agents without additional tooling. The same pattern should apply here.
 ## Decision
 **Event type definitions and ActivityDefinitions are markdown files** where machine-
 parseable structure (frontmatter YAML and fenced definition blocks) is embedded within
 human-readable narrative. Intent, schema, logic, and debugging notes live in one file.
 ### Event Type Definition Files
 **Location**: `event-types/{namespace}.{event-name}.md` within the activity-core repo
 (or a registered event-types registry repo if volume justifies separation).
 **Naming convention**: `{publisher-domain}.{noun}.{verb}.md`, e.g.:
 - `org.repo.registered.md`
 - `org.security.cve.published.md`
 - `org.workstream.completed.md`
 **Structure**:
 ```markdown
 ---
 id: org.repo.registered
 type: event-type
 version: "1.0"
 publisher: the-custodian/state-hub
 governance: publisher-declared   # publisher-declared | curated
 status: active                   # active | deprecated | draft
 introduced: "2026-05-14"
 ---
 # Event: org.repo.registered
 ## Intent
 One-paragraph statement of why this event exists and what it signals.
 Written for an agent or human who has never seen it before.
 ## When Published
 Bulleted list of the exact conditions under which the publisher fires this event.
 Be precise — ambiguity here causes missed or duplicate activations.
 ## Attributes
 | Attribute | Type | Required | Description |
 |---|---|---|---|
 | `repo_slug` | string | yes | URL-safe repository identifier |
 | `domain` | string | yes | Domain slug the repo is assigned to |
 | `tags` | string[] | no | Capability tags set at registration time |
 | `registered_at` | datetime | yes | ISO 8601 UTC timestamp |
 ## Example Payload
 ```json
 {
  "id": "evt-7f3a1b2c",
  "type": "org.repo.registered",
  "version": "1.0",
  "timestamp": "2026-05-14T10:00:00Z",
  "publisher": "the-custodian/state-hub",
  "attributes": {
    "repo_slug": "new-python-service",
    "domain": "railiance",
    "tags": ["python-service", "fastapi"],
    "registered_at": "2026-05-14T10:00:00Z"
  }
 }
 ```
 ## Consumer Notes
 Guidance for agents and humans writing rules against this event type:
 - Which attributes are safe for instruction prompts (trusted fields)
 - Common misuses or gotchas
 - Related events that are often used together
 ## Debugging
 What to check when an activity that subscribes to this event does not fire:
 - How to verify the event was published (NATS subject, log entry)
 - How to inspect the event payload in the registry
 - Common schema validation failures
 ```
 ### Attribute Types
 The type system for event attributes is intentionally small:
 | Type | Notes |
 |---|---|
 | `string` | UTF-8 string |
 | `integer` | 64-bit signed integer |
 | `float` | 64-bit float |
 | `boolean` | true / false |
 | `datetime` | ISO 8601 UTC string in payload, parsed to datetime in evaluator |
 | `uuid` | String in payload, validated as UUID v4 |
 | `string[]` | JSON array of strings |
 | `integer[]` | JSON array of integers |
 | `object` | Freeform JSON object — cannot be used in rule conditions; instruction-only |
 `object` type attributes are available to instructions but excluded from rule
 conditions deliberately — rules must be deterministic and schema-validatable.
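As an illustration of how small this type system is, the table can be checked against decoded JSON values with a handful of predicates. This is a sketch, not the runtime's validator: the names `CHECKS`, `RULE_SAFE_TYPES`, and `matches_type` are hypothetical, and accepting integers where `float` is declared is an assumption.

```python
import uuid
from datetime import datetime


def _is_int(v) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly,
    # and keep the value inside the signed 64-bit range.
    return isinstance(v, int) and not isinstance(v, bool) and -2**63 <= v < 2**63


def _is_datetime(v) -> bool:
    if not isinstance(v, str):
        return False
    try:
        # fromisoformat() rejects a trailing "Z" before Python 3.11.
        datetime.fromisoformat(v.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def _is_uuid4(v) -> bool:
    try:
        return isinstance(v, str) and uuid.UUID(v).version == 4
    except ValueError:
        return False


CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": _is_int,
    "float": lambda v: isinstance(v, float) or _is_int(v),  # assumption: ints accepted
    "boolean": lambda v: isinstance(v, bool),
    "datetime": _is_datetime,
    "uuid": _is_uuid4,
    "string[]": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
    "integer[]": lambda v: isinstance(v, list) and all(_is_int(x) for x in v),
    "object": lambda v: isinstance(v, dict),
}

# Types usable in rule conditions: everything except freeform objects.
RULE_SAFE_TYPES = set(CHECKS) - {"object"}


def matches_type(value, type_name: str) -> bool:
    """Return True if a decoded JSON value conforms to the named attribute type."""
    return CHECKS[type_name](value)
```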
 ### ActivityDefinition Files
 **Location**: `activity-definitions/{slug}.md` within the repo that owns the automation.
 For org-wide automations: `activity-core/activity-definitions/`.
 For domain-specific automations: `{domain-repo}/activity-definitions/`.
 **Structure**:
 ```markdown
 ---
 id: ACT-DEF-onboard-python-repo
 type: activity-definition
 version: "1.0"
 status: active
 trigger:
  type: event                        # event | cron | scheduled
  event_type: org.repo.registered    # for type: event
  # cron: "0 9 * * 1"               # for type: cron (5-field, UTC)
  # timezone: "Europe/Berlin"        # optional, cron only
  # misfire_policy: skip             # skip | catchup | compress (cron only)
  # at: "2026-06-01T09:00:00Z"      # for type: scheduled (one-off)
 context_sources:
  - type: repo-scoping
    query: repo_profile
    bind_to: context.repo_profile
  - type: state-hub
    query: domain_summary
    bind_to: context.domain_summary
 governance: publisher-declared
 owner: custodian-agent
 created: "2026-05-14"
 ---
 # ActivityDefinition: Onboard New Python Service
 ## Purpose
 One paragraph. What does this automation do and why does it exist? What problem
 would accumulate if this automation were turned off?
 ## Trigger
 Which event type fires this activity, and under what conditions does it apply?
 Cross-reference the event type definition file.
 ## Context Sources
 What context is resolved before rules are evaluated? Explain what each source
 provides and why it is needed.
 ## Rules
 Each rule is a fenced block tagged `rule`. Rules are evaluated in order; all
 matching rules fire (not first-match-only). See ACT-ADR-003 for the expression
 language specification.
 ```rule
 id: create-sbom-scan
 condition: '"python-service" in event.attributes.tags'
 action:
  task_template: tasks/sbom-initial-scan.md
  target_repo: event.attributes.repo_slug
  priority: high
  labels: ["onboarding", "security"]
 ```
 ```rule
 id: create-scope-generation
 condition: '"python-service" in event.attributes.tags and context.repo_profile.scope_md_exists == False'
 action:
  task_template: tasks/generate-scope-md.md
  target_repo: event.attributes.repo_slug
  priority: medium
  labels: ["onboarding", "documentation"]
 ```
 ## Instructions
 Instructions are evaluated after all rules. An instruction asks an LLM to decide
 what additional tasks (if any) to create. See ACT-ADR-003 for safety requirements.
 ```instruction
 id: domain-specific-onboarding
 condition: 'event.attributes.domain != "test_domain_v2"'
 trusted_fields:
  - event.attributes.repo_slug
  - event.attributes.domain
  - event.attributes.tags
 model: claude-sonnet-4-6
 review_required: false
 prompt: |
  A new repository has been registered in the Coulomb organization.
  Repository: {event.attributes.repo_slug}
  Domain: {event.attributes.domain}
  Tags: {event.attributes.tags}
  Based on the domain's current standards and the repository profile above,
  determine what additional domain-specific onboarding tasks should be created
  beyond the standard SBOM scan and SCOPE.md generation. Return an empty list
  if no additional tasks are warranted.
 output_schema: tasks/task-template-list-schema.json
 ```
 ## Task Templates
 References to task template files used in rule actions. Each template is a
 separate markdown file under `tasks/` that defines the task title, description
 template, default labels, and default assignee logic.
 - `tasks/sbom-initial-scan.md`
 - `tasks/generate-scope-md.md`
 ## Notes
 Operational notes, edge cases, and context that does not fit elsewhere.
 ## Debugging
 Checklist for when this ActivityDefinition fires but produces unexpected output:
 1. Was the triggering event published with the correct type and attributes?
 2. Do the rule conditions evaluate as expected? (Use `make eval-rule` with a fixture)
 3. Is issue-core reachable and configured for the target domain?
 4. For instructions: check the audit log for the model response and output validation result.
 ## Change History
 - v1.0 (2026-05-14): Initial definition
 ```
 ### Governance model
 The `governance` field on an event type definition determines how the registry
 runtime handles it:
 | Value | Behaviour |
 |---|---|
 | `publisher-declared` | Accepted immediately on publish; no review required |
 | `curated` | Held in `pending` state until a curator approves via registry API |
 The runtime checks the **environment's curator gate configuration** — not just the
 file's governance field. An environment configured with `curator_gate: disabled`
 treats all event types as `publisher-declared` regardless of the field value.
 An environment with `curator_gate: required` treats all event types as `curated`
 regardless of the field value. The field is the publisher's declared preference;
 the environment config is the enforcement point.
 This means:
 - **Dev / integration**: `curator_gate: disabled` — developers and agents iterate
  freely; new event types take effect immediately.
 - **Staging / production**: `curator_gate: required` — all new event types queue
  for curator review before the runtime accepts events of that type.
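The resolution logic described above is small enough to sketch in a few lines. The function name `effective_governance` is hypothetical, and rejecting any `curator_gate` value other than the two named here is an assumption.

```python
def effective_governance(declared: str, curator_gate: str) -> str:
    """Resolve the governance mode the runtime enforces for one event type."""
    if declared not in ("publisher-declared", "curated"):
        raise ValueError(f"unknown governance value: {declared!r}")
    if curator_gate == "disabled":
        return "publisher-declared"
    if curator_gate == "required":
        return "curated"
    # Only these two settings are described; anything else is rejected here.
    raise ValueError(f"unknown curator_gate setting: {curator_gate!r}")
```

Note that `declared` is validated but never changes the result: with either gate setting, the environment wins over the publisher's preference.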
 ### File as source of truth
 Following CUST-ADR-001 (Workplans as Repository Artefacts), definition files are
 the canonical source of truth. The activity-core runtime indexes them into its
 database on startup and via a sync command. The database is a queryable cache,
 not the origin. A definition deleted from the filesystem is disabled at next sync.
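A sync pass under this model reduces to a set comparison between what is on disk and what is indexed. The sketch below assumes definitions are keyed by `id` and compared by `version`; the function name `plan_sync` and the return shape are hypothetical.

```python
def plan_sync(on_disk: dict[str, str], indexed: dict[str, str]) -> dict[str, list[str]]:
    """Compare definition ids (mapped to versions) on disk with the database cache."""
    return {
        # New or changed files are written to the cache.
        "upsert": sorted(i for i, v in on_disk.items() if indexed.get(i) != v),
        # Entries whose file has disappeared are disabled, not trusted.
        "disable": sorted(i for i in indexed if i not in on_disk),
    }
```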
 ### Task Templates
 Task templates are separate markdown files (`tasks/{slug}.md`) referenced from
 ActivityDefinition action blocks. Each defines a title template, a description
 template, default labels, and a default assignee:
 ```markdown
 ---
 id: tasks/sbom-initial-scan
 type: task-template
 ---
 # Task: Run Initial SBOM Scan
 ## Title template
 `Run SBOM scan — {target_repo}`
 ## Description template
 Initial SBOM scan required for newly registered repository `{target_repo}`.
 Run: `make ingest-sbom REPO={target_repo} SCAN=1`
 ## Default labels
 ["sbom", "security", "automated"]
 ## Default assignee
 None (unassigned)
 ```
 This keeps task content editable separately from the routing logic in
 ActivityDefinitions.
 ## Consequences
 - A new `event-types/` directory in activity-core (and eventually a shared registry)
  holds all org event type definitions.
 - A new `activity-definitions/` directory in activity-core holds org-wide automations.
 - Domain repos may hold their own `activity-definitions/` for domain-specific
  automations, scanned by activity-core at sync time.
 - The runtime requires a parser for the `rule` and `instruction` fenced blocks.
 - SCOPE.md for activity-core must be updated to list these directories.
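A first cut of the fenced-block parser could be a single regular expression that returns the raw body of each block in file order. This is a sketch: `extract_blocks` is a hypothetical name, the bodies are left as unparsed text (a YAML library would take over from there), and nested or indented fences beyond simple leading whitespace are not handled.

```python
import re

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3

BLOCK_RE = re.compile(
    rf"^[ \t]*{FENCE}(rule|instruction)[ \t]*\n(.*?)^[ \t]*{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (kind, raw_body) pairs for each rule/instruction block, in file order."""
    return [(m.group(1), m.group(2)) for m in BLOCK_RE.finditer(markdown)]
```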
 ## Alternatives Considered
 **Pure JSON Schema for event types, separate wiki for docs**: rejected — documentation
 and schema diverge immediately; agents must cross-reference two systems to author
 a rule correctly.
 **OpenAPI / AsyncAPI specification**: rejected — those formats are excellent for
 API and broker documentation but not designed for co-locating operational intent
 and debugging guidance. They are also less readable for non-specialists.
 **Code-only (Python dataclasses for event schemas, Python functions for rules)**:
 rejected — requires code deployment for any definition change; agents cannot modify
 definitions without write access to the codebase; non-technical stakeholders cannot
 review or understand automation policies.
 ## Related
 - ACT-ADR-001 — Event Bridge Architecture
 - ACT-ADR-003 — Rule vs. Instruction model and DSL
 - CUST-ADR-001 — Workplans as repository artefacts
--- a/docs/adr/adr-003-rule-instruction-model.md
+++ b/docs/adr/adr-003-rule-instruction-model.md
@@ -0,0 +1,281 @@
 ---
 id: ACT-ADR-003
 type: architecture-decision-record
 title: "Rule vs. Instruction Model and Expression DSL"
 status: accepted
 decided_by: Bernd Worsch
 date: "2026-05-14"
 scope: cross-repo
 affects:
  - activity-core
  - rules-core (future extraction)
 tags: ["architecture", "rules", "instructions", "dsl", "llm", "safety", "evaluation"]
 ---
 # ACT-ADR-003: Rule vs. Instruction Model and Expression DSL
 ## Status
 Accepted.
 ## Context
 ActivityDefinitions need two distinct evaluation modes to cover the full range
 of automation scenarios in the Coulomb org:
 **Deterministic cases**: "if this repo has tag `python-service` AND has no SBOM
 in the last 30 days, create a scan task." The condition is fully expressible as a
 boolean predicate over known attributes. The output is fixed by the template. No
 ambiguity, no LLM required, fully testable.
 **Judgement cases**: "a new repository has been registered — based on its domain
 and profile, determine what domain-specific onboarding tasks are appropriate." The
 right answer depends on context that is expensive to encode as explicit rules. An
 LLM is a better evaluator than a rule tree, but introduces non-determinism, cost,
 and a new attack surface (prompt injection via event payload).
 Conflating these two modes into one mechanism produces a system that is either
 too rigid (rules only) or too unpredictable (LLM everywhere). The two modes
 need different evaluation pipelines, testing strategies, and audit trails.
 ## Decision
 **Two named, distinct evaluation modes: Rule and Instruction.**
 Terminology is deliberate. A **Rule** is deterministic and mechanical — it applies
 or it does not. An **Instruction** is contextual and interpretive — it guides an
 LLM agent to make a judgement call. Both are expressed as fenced blocks in
 ActivityDefinition markdown files (see ACT-ADR-002).
 ### Rules
 A Rule has two parts: a **condition** (boolean predicate) and one or more
 **actions** (task template references).
 #### Condition expression language
 The condition is a single-line string expression evaluated by a sandboxed
 AST walker — never `exec()` or `eval()`. The evaluator walks the parsed AST
 and whitelist-checks every node type before executing. Unknown node types
 raise an `UnsafeExpression` error at parse time, not at evaluation time.
 **Available operations**:
 | Category | Syntax | Example |
 |---|---|---|
 | Equality | `==`, `!=` | `event.type == "org.repo.registered"` |
 | Comparison | `>`, `<`, `>=`, `<=` | `event.attributes.sbom_age_days > 30` |
 | Membership | `in`, `not in` | `"python-service" in event.attributes.tags` |
 | Boolean | `and`, `or`, `not` | `a and (b or not c)` |
 | Grouping | `( )` | `(a or b) and c` |
 | Length | `len(x)` | `len(event.attributes.affected_repos) > 0` |
 | Existence | `x is None`, `x is not None` | `event.attributes.domain is not None` |
 **Attribute access** follows dot notation on the `event` object and the `context`
 object (populated by context sources declared in the ActivityDefinition):
 - `event.id` — UUID string
 - `event.type` — event type identifier
 - `event.version` — event type version
 - `event.timestamp` — ISO 8601 datetime string
 - `event.publisher` — publisher identifier
 - `event.attributes.{name}` — typed attribute per event type schema
 - `context.{source}.{field}` — resolved context data
 **Explicitly forbidden** (evaluator rejects at parse time):
 - Function calls other than `len()`
 - Attribute access on arbitrary Python objects
 - String interpolation or formatting
 - Any control flow (`if`, `for`, `while`, `lambda`)
 - Import statements
 - Assignments
 **Design rationale**: the expression language is intentionally small. Anything
 complex enough to need more than this belongs in an Instruction, not a Rule.
 When a rule condition becomes difficult to express, that is a signal that the
 case requires LLM judgement, not a signal that the DSL needs more features.
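To make the allowlist idea concrete, here is a minimal sketch of such an evaluator built on Python's standard `ast` module. It is not the `RuleEvaluator` implementation: treating `event` and `context` as plain dicts, resolving missing keys to `None`, and wrapping syntax errors in `UnsafeExpression` are all assumptions made for the example.

```python
import ast
import operator


class UnsafeExpression(ValueError):
    """Raised at parse time when an expression leaves the allowlisted subset."""


COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Gt: operator.gt, ast.Lt: operator.lt,
    ast.GtE: operator.ge, ast.LtE: operator.le,
    ast.In: lambda a, b: a in b, ast.NotIn: lambda a, b: a not in b,
    ast.Is: operator.is_, ast.IsNot: operator.is_not,
}
ALLOWED = (ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
           ast.Compare, ast.Call, ast.Attribute, ast.Name, ast.Load,
           ast.Constant, *COMPARE)


def parse_condition(expr: str) -> ast.Expression:
    """Parse and allowlist-check an expression without evaluating anything."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:  # statements such as import or assignment land here
        raise UnsafeExpression(str(exc)) from exc
    callees = set()
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise UnsafeExpression(f"node not allowed: {type(node).__name__}")
        if isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id == "len"
                    and len(node.args) == 1 and not node.keywords):
                raise UnsafeExpression("only len(x) may be called")
            callees.add(id(func))
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and id(node) not in callees
                and node.id not in ("event", "context")):
            raise UnsafeExpression(f"unknown name: {node.id}")
    return tree


def _eval(node, scope):
    if isinstance(node, ast.Expression):
        return _eval(node.body, scope)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return scope[node.id]
    if isinstance(node, ast.Attribute):
        # Dict lookup only, never getattr(); missing keys resolve to None.
        return _eval(node.value, scope).get(node.attr)
    if isinstance(node, ast.Call):  # parse_condition guarantees this is len(x)
        return len(_eval(node.args[0], scope))
    if isinstance(node, ast.UnaryOp):  # only `not` passes the allowlist
        return not _eval(node.operand, scope)
    if isinstance(node, ast.BoolOp):
        values = (_eval(v, scope) for v in node.values)
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = _eval(node.left, scope)
        for op, right_node in zip(node.ops, node.comparators):
            right = _eval(right_node, scope)
            if not COMPARE[type(op)](left, right):
                return False
            left = right
        return True
    raise UnsafeExpression(type(node).__name__)


def evaluate_condition(expr: str, event: dict, context: dict) -> bool:
    """Evaluate a condition string against an event and its resolved context."""
    return bool(_eval(parse_condition(expr), {"event": event, "context": context}))
```

Everything not named in `ALLOWED` (subscripts, lambdas, f-strings, arithmetic) is rejected in `parse_condition` before any value is touched, which is what keeps errors at parse time rather than evaluation time.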
 #### Actions
 A Rule's action block specifies:
 ```yaml
 action:
  task_template: tasks/{template-slug}.md   # required
  target_repo: event.attributes.repo_slug    # expression — attribute access only
 priority: high                             # literal: high | medium | low
  labels: ["onboarding", "security"]        # literal list
  due_in_days: 7                             # optional, integer literal
 ```
 `target_repo` and similar fields accept simple attribute access expressions
 (no boolean logic — just path traversal). This allows dynamic routing to the
 correct issue-core instance without arbitrary expression evaluation in action
 fields.
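Path-only resolution for action fields can be sketched without any expression evaluator at all. The name `resolve_path` and the exact identifier pattern are assumptions for illustration.

```python
import re

# A root of event or context followed by one or more dotted identifiers.
PATH_RE = re.compile(r"(event|context)(\.[A-Za-z_][A-Za-z0-9_]*)+")


def resolve_path(path: str, scope: dict):
    """Resolve a dotted path such as event.attributes.repo_slug by dict lookups only."""
    if not PATH_RE.fullmatch(path):
        raise ValueError(f"not a plain attribute path: {path!r}")
    value = scope
    for part in path.split("."):
        value = value[part]  # KeyError if the payload lacks the field
    return value
```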
 #### Evaluation semantics
 - All rules in an ActivityDefinition are evaluated; **all matching rules fire**
  (not first-match-only). There is no implicit ordering beyond the file order,
  which is documented in the ActivityDefinition for human clarity.
 - A rule whose condition raises an error during evaluation is skipped and logged
  as `rule_error`; other rules still fire. This prevents a single malformed rule
  from silencing an entire ActivityDefinition.
 - An empty condition (omitted `condition` field) evaluates to `true` — the rule
  always fires when the trigger fires.
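The three semantics above (all matches fire, errors are isolated, an omitted condition is true) fit in one loop. In this sketch the condition evaluator is injected as a callable, and `fire_rules` and the dict-shaped rules are hypothetical.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("rules")


def fire_rules(rules: list[dict], evaluate: Callable[[str], bool]) -> list[str]:
    """Return the ids of all rules that match, in file order."""
    fired = []
    for rule in rules:
        condition: Optional[str] = rule.get("condition")
        try:
            # An omitted condition always matches.
            matched = True if condition is None else evaluate(condition)
        except Exception as exc:  # a broken rule must not silence the others
            log.warning("rule_error id=%s error=%s", rule["id"], exc)
            continue
        if matched:
            fired.append(rule["id"])
    return fired
```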
 ### Instructions
 An Instruction defers the task-creation decision to an LLM. It specifies what
 context to provide, how to frame the prompt, and what output schema to enforce.
 #### Structure
 ```yaml
 # in an instruction fenced block:
 id: {slug}
 condition: '{expression}'          # optional pre-filter (Rule DSL); runs before LLM
 trusted_fields:                    # REQUIRED — explicit allowlist of payload fields
  - event.attributes.repo_slug     # safe to interpolate into prompt
  - event.attributes.domain
  - event.attributes.tags
 model: claude-sonnet-4-6
 review_required: false             # true | false — curator gate for output
 prompt: |
  {prompt template — only trusted_fields may be interpolated}
 output_schema: {path to JSON schema file}
 ```
 #### Trusted fields and prompt injection protection
 The `trusted_fields` list is **required** and enforced at parse time. Any field
 not listed is unavailable to the prompt template. The template engine raises
 `UntrustedFieldError` if the prompt references a field not in `trusted_fields`.
 The rationale: event payloads may contain free-text from untrusted sources —
 commit messages, issue titles, CVE descriptions, repo descriptions. Interpolating
 these directly into a prompt creates a prompt injection surface. Trusted fields
 are those whose values are validated by the event type schema (typed attributes
 like slugs, domain names, tag lists) and cannot carry arbitrary instruction text
 by construction.
 Fields of type `object` (freeform JSON) are **never eligible** for `trusted_fields`
 even if listed — the evaluator rejects this at parse time.
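A sketch of how the allowlist check and rendering might fit together. The placeholder syntax follows the `{event.attributes.x}` form used in the instruction examples; `check_template`, `render_prompt`, the `field_types` map, and the flat `values` map are hypothetical.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{((?:event|context)\.[A-Za-z0-9_.]+)\}")


class UntrustedFieldError(ValueError):
    """Raised when a prompt template references a field outside trusted_fields."""


def check_template(prompt: str, trusted_fields: list[str], field_types: dict[str, str]) -> None:
    """Parse-time check: every placeholder must be trusted and not object-typed."""
    for name in trusted_fields:
        if field_types.get(name) == "object":
            raise UntrustedFieldError(f"object field cannot be trusted: {name}")
    for name in PLACEHOLDER_RE.findall(prompt):
        if name not in trusted_fields:
            raise UntrustedFieldError(f"not in trusted_fields: {name}")


def render_prompt(prompt: str, trusted_fields: list[str],
                  field_types: dict[str, str], values: dict[str, object]) -> str:
    """Substitute allowlisted placeholders; values maps dotted paths to values."""
    check_template(prompt, trusted_fields, field_types)
    return PLACEHOLDER_RE.sub(lambda m: str(values[m.group(1)]), prompt)
```

Untrusted payload text never reaches the prompt because the only substitution path runs through `check_template` first.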
 #### Output schema enforcement
 The LLM response is validated against `output_schema` using JSON Schema validation.
 If validation fails, the instruction retries once with the schema error appended
 to the prompt. If the second attempt also fails, the instruction records an
 `instruction_output_error` audit event and emits no tasks. Tasks are **never
 created from unvalidated output**.
 Structured output mode (tool_use / JSON mode) is used where the model supports
 it. The output schema must define `List[TaskSpec]` or a compatible envelope.
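The retry-once behaviour can be sketched with the model call and the schema validator injected as callables, so that no particular LLM client or JSON Schema library is assumed. `run_with_retry`, the wording of the appended error, and the list-based audit sink are hypothetical.

```python
from typing import Callable


def run_with_retry(
    prompt: str,
    call_llm: Callable[[str], object],
    validate: Callable[[object], None],
    audit: list[str],
) -> list:
    """Call the model, validate, retry once with the error appended, else emit nothing."""
    attempt_prompt = prompt
    for _ in range(2):  # first attempt plus exactly one retry
        output = call_llm(attempt_prompt)
        try:
            validate(output)  # expected to raise ValueError on a schema violation
            return list(output)
        except ValueError as exc:
            attempt_prompt = f"{prompt}\n\nPrevious output failed validation: {exc}"
    audit.append("instruction_output_error")
    return []  # never create tasks from unvalidated output
```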
 #### `review_required: true`
 When set, the instruction's proposed task list is written to a **pending review
 queue** in issue-core rather than directly created. A human or curator agent
 reviews and approves/rejects before tasks are materialised. This is the expected
 setting for instructions that create high-impact tasks (cross-repo changes,
 security responses, production operations).
 #### Evaluation semantics
 - Instructions are evaluated **after** all rules in the ActivityDefinition.
 - The optional `condition` field on an instruction uses the same Rule DSL as
  a first-pass filter — if the condition is false, the LLM is not called.
  This avoids LLM cost for events that clearly do not need instruction judgement.
 - Instructions are **not** first-match-only; all instructions whose conditions
  pass fire. An ActivityDefinition may have zero instructions.
 ### Audit trail
 Every task emission records:
 | Field | Rule | Instruction |
 |---|---|---|
 | `source_type` | `"rule"` | `"instruction"` |
 | `source_id` | rule `id` from definition | instruction `id` from definition |
 | `source_version` | ActivityDefinition version | ActivityDefinition version |
 | `triggering_event_id` | event UUID | event UUID |
 | `condition_matched` | expression string | expression string (pre-filter) |
 | `prompt_hash` | — | SHA-256 of rendered prompt |
 | `model` | — | model ID used |
 | `output_validated` | — | `true` / `false` |
 | `review_required` | — | `true` / `false` |
 The audit trail is written to the `task_spawn_log` table in activity-core's database
 and referenced from the task record in issue-core.
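One way to picture a row in this log is a dataclass whose instruction-only fields default to `None`. The class name `SpawnLogEntry` and the helper `hash_prompt` are hypothetical; the field names follow the table above.

```python
import hashlib
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SpawnLogEntry:
    source_type: str                          # "rule" or "instruction"
    source_id: str
    source_version: str
    triggering_event_id: str
    condition_matched: Optional[str]
    prompt_hash: Optional[str] = None         # instruction only
    model: Optional[str] = None               # instruction only
    output_validated: Optional[bool] = None   # instruction only
    review_required: Optional[bool] = None    # instruction only


def hash_prompt(rendered_prompt: str) -> str:
    """SHA-256 hex digest of the rendered prompt, as stored in prompt_hash."""
    return hashlib.sha256(rendered_prompt.encode("utf-8")).hexdigest()
```

Storing a hash rather than the prompt keeps the log compact while still letting a reviewer confirm which rendered prompt produced a given task.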
 ### Testing strategy
 **Rules**: every rule can and should be unit-tested with fixture event payloads.
 A test helper `evaluate_rule(condition_str, event_fixture)` returns `bool` and
 raises on syntax errors. Tests live alongside ActivityDefinition files:
 `activity-definitions/{slug}.test.json` — a list of `{event, expected_rules_fired}`
 fixtures.
 **Instructions**: instructions cannot be deterministically unit-tested. Instead:
 - Sample evaluations are collected: given a fixture event, record the LLM response.
 - Samples are committed to `activity-definitions/{slug}.samples/` for human review.
 - Output schema validation is unit-tested independently of the LLM call.
 - Prompt injection resistance is tested by including injection strings in fixture
  event payloads and asserting they do not appear in the rendered prompt.
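For the rule side, a fixture runner over the `{event, expected_rules_fired}` format could be as small as the sketch below. The function `check_fixtures` is hypothetical, and the code that maps an event to fired rule ids is passed in rather than assumed.

```python
import json
from typing import Callable


def check_fixtures(
    fixtures_json: str,
    rules_fired: Callable[[dict], list[str]],
) -> list[str]:
    """Return a failure message for each fixture whose fired rules differ."""
    failures = []
    for index, fixture in enumerate(json.loads(fixtures_json)):
        actual = sorted(rules_fired(fixture["event"]))
        expected = sorted(fixture["expected_rules_fired"])
        if actual != expected:
            failures.append(f"fixture {index}: expected {expected}, got {actual}")
    return failures
```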
 ### rules-core module boundary
 The rule evaluator and instruction executor live in `src/activity_core/rules/`.
 Within this module:
 - **No imports from** `temporalio`, `sqlalchemy`, `fastapi`, or any activity-core
  application code.
 - Public surface: `evaluate_condition(expr: str, event: EventEnvelope, context: dict) -> bool`
  and `execute_instruction(instr: InstructionDef, event: EventEnvelope, context: dict) -> List[TaskSpec]`.
 - The module is independently importable and testable without starting the Temporal
  worker or Postgres.
 This boundary makes future extraction to `rules-core` a packaging exercise, not a refactor.
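The import rule lends itself to an automated check. A sketch, assuming the check runs over each module's source text in a test; `forbidden_imports` is a hypothetical helper, and imports of activity-core application code would need an additional package name in the set.

```python
import ast

FORBIDDEN = {"temporalio", "sqlalchemy", "fastapi"}


def forbidden_imports(source: str) -> set[str]:
    """Return the forbidden top-level packages imported by a module's source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        found |= {name.split(".")[0] for name in names} & FORBIDDEN
    return found
```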
 ## Consequences
 - The `ActivityDefinition` Pydantic model gains `rules: List[RuleDef]` and
  `instructions: List[InstructionDef]` fields. The current implicit "always create
  tasks" behaviour is replaced by explicit rule blocks.
 - A new `RuleEvaluator` class (AST walker) is added to `src/activity_core/rules/`.
 - A new `InstructionExecutor` class handles prompt rendering, LLM call, output
  validation, and review queue routing.
 - Integration tests for rule evaluation use fixture JSON; no running Temporal required.
 - The `task_spawn_log` table is added to the Postgres schema (new Alembic migration).
 - ActivityDefinition files that omit both `rules` and `instructions` are valid
  (they fire with no output) — this supports future placeholder definitions.
 ## Alternatives Considered
 **OPA / Rego for rule conditions**: powerful, well-established policy language,
 supports complex logic. Rejected — Rego's learning curve is high for non-specialists;
 agents rarely produce correct Rego without fine-tuning; it adds a runtime dependency.
 The simple AST-walker DSL covers the realistic condition complexity for this org.
 **Rules as Python lambdas**: maximum expressiveness. Rejected — arbitrary code
 execution in a rule condition is a serious security surface, especially in an
 org-wide event loop. Code deployment required for any rule change; agents cannot
 write rules without code write access.
 **LLM for all conditions (no Rule/Instruction split)**: simpler model, more
 flexible. Rejected — non-deterministic for cases that are deterministic; expensive
 for high-frequency events like cron ticks; impossible to unit-test; audit trail
 for deterministic rules becomes murky.
 **Instructions only, no Rules**: allows arbitrary LLM judgement for everything.
 Rejected — LLM cost for every event, latency, and non-determinism are unacceptable
 for high-frequency maintenance automations. Many cases (SBOM staleness check,
 tag-based routing) are fully deterministic and should stay that way.
 ## Related
 - ACT-ADR-001 — Event Bridge Architecture
 - ACT-ADR-002 — Definition format (where rule/instruction blocks live)
 - CUST-TFE-SCOPE-2026-000001 — task-flow-engine extraction (analogue pattern)
 - `src/activity_core/rules/` — implementation home