guide-board/docs/EXTENSION-SDK.md

# Guide Board Extension SDK

Status: draft
Created: 2026-05-07

## Purpose

This document defines the first extension integration contract for `guide-board`.
It is intentionally small: extensions declare metadata in `extension.json`, the
core discovers them, and runners can produce normalized evidence through a stable
dictionary contract.

## Extension Layout

Bundled incubating extensions live under:

```text
extensions/<extension-id>/
  INTENT.md
  extension.json
  src/
  docs/
  schemas/
  evidence-requests/
  checks/
  mappings/
  profiles/
  runners/
  normalizers/
  reports/
  workplans/
```

Production extensions may also live in their own repositories. The repository
root is then the extension root and must contain `extension.json`:

```text
open-cmis-tck/
  INTENT.md
  extension.json
  src/
  mappings/
  profiles/
  runners/
  workplans/
```

Pass external extension repos to the core CLI with:

```sh
guide-board --extension-dir ../open-cmis-tck extensions list
```

Multiple `--extension-dir` values are allowed. `GUIDE_BOARD_EXTENSION_PATHS`
may also provide a list of extension roots, separated by the OS path separator
(`:` on POSIX, `;` on Windows), for local automation and containers.

Only `INTENT.md` and `extension.json` are required for discovery. Additional
folders appear as the extension grows.
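
As a rough illustration of the discovery inputs described above, the following sketch combines `--extension-dir` values with `GUIDE_BOARD_EXTENSION_PATHS` and checks the two required files. The function names, the CLI-first ordering, and the de-duplication are assumptions made for this example, not the core's actual behavior.

```python
import os
from pathlib import Path


def resolve_extension_dirs(cli_dirs, environ=None):
    """Combine --extension-dir values with GUIDE_BOARD_EXTENSION_PATHS entries.

    Ordering (CLI first) and de-duplication are assumptions of this sketch.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("GUIDE_BOARD_EXTENSION_PATHS", "")
    # os.pathsep is ":" on POSIX and ";" on Windows.
    env_dirs = [entry for entry in raw.split(os.pathsep) if entry]
    seen, ordered = set(), []
    for entry in [*cli_dirs, *env_dirs]:
        path = Path(entry).resolve()
        if path not in seen:
            seen.add(path)
            ordered.append(path)
    return ordered


def is_discoverable(extension_root):
    """An extension root needs both INTENT.md and extension.json."""
    root = Path(extension_root)
    return (root / "INTENT.md").is_file() and (root / "extension.json").is_file()
```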

## Manifest Contract

`extension.json` must validate against:

```text
docs/schemas/extension-manifest.schema.json
```

The key runtime fields are:

- `id`: must match the extension directory name.
- `extension_type`: one of the supported archetypes from the architecture
  blueprint.
- `supported_frameworks`: framework IDs this extension can contribute evidence
  for. Descriptor objects with `id`, `version`, `source_url`, and
  `authority_ref` may be used when source metadata is available.
- `authorities`: authority IDs or descriptor objects with optional source URL,
  version, license, and access notes.
- `metadata`: optional extension-level metadata such as adapter version or
  source URL. The core preserves it in source locks and evidence metadata.
- `check_groups`: named groups that assessment profiles can select.
- `preflight_runner`: optional runner ID used before selected check groups.
- `runner_entrypoints`: concrete runner declarations.
- `normalizers`: optional plug-ins that convert native runner output into the
  stable runner-result shape before evidence is written.
- `mappings`: mapping set IDs under `mappings/<mapping-id>.json`.
- `report_fragments`: optional Markdown file or Python module descriptors for
  extension-owned report content.
- `certification_boundary`: explicit statement of what the extension does not
  certify.

The `profile_schemas` manifest field may use the original string shorthand for core schemas:

```json
["target-profile", "assessment-profile"]
```

Extensions that need stricter domain-specific validation can add schema
descriptors:

```json
[
  "target-profile",
  "assessment-profile",
  {
    "id": "cmis-browser-target",
    "profile_kind": "target",
    "path": "schemas/cmis-browser-target.schema.json",
    "subject_type": "cmis-browser-binding-endpoint",
    "description": "Requires the target shape expected by the CMIS Browser Binding harness."
  }
]
```

Descriptor fields:

- `id`: stable schema descriptor ID used in validation errors.
- `profile_kind`: `target` or `assessment`.
- `path`: JSON schema path relative to the extension root.
- `subject_type`: optional target-profile selector. When present, the schema is
  applied only to targets with that `subject_type`.
- `description`: optional authoring note.
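
The descriptor fields above suggest a simple selection step. The sketch below shows one way to pick the extension-owned descriptors that apply to a given profile. `select_schema_descriptors` is a hypothetical helper, and it assumes the target profile carries a top-level `subject_type` key.

```python
def select_schema_descriptors(profile_schemas, profile_kind, profile):
    """Return the extension-owned descriptors that apply to one profile."""
    selected = []
    for entry in profile_schemas:
        if isinstance(entry, str):
            # String shorthand names a core schema; nothing extension-owned here.
            continue
        if entry.get("profile_kind") != profile_kind:
            continue
        subject_type = entry.get("subject_type")
        # subject_type is an optional selector: when absent, the schema applies
        # to every profile of the matching kind.
        if subject_type is not None and profile.get("subject_type") != subject_type:
            continue
        selected.append(entry)
    return selected
```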

The core validates the generic guide-board schema first, then applies matching
extension-owned schemas during `profile validate-*`, `plan`, and `run`.
Extension schema paths must stay inside the extension root. The baseline
validator intentionally supports the small JSON Schema subset used by
guide-board contracts: `type`, `enum`, `required`, `properties`,
`additionalProperties`, `items`, and `minItems`.
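
To make the supported subset concrete, here is a minimal validator sketch covering only the seven keywords listed above. It is an illustration written for this document, not the core's validator. The error message format and the handling of `additionalProperties` (only the boolean `false` form) are assumptions.

```python
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "boolean": bool,
    "number": (int, float),
    "integer": int,
    "null": type(None),
}


def _matches(value, name):
    # bool is a subclass of int in Python, so exclude it from numeric types.
    if name in ("number", "integer") and isinstance(value, bool):
        return False
    return isinstance(value, _TYPES[name])


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        names = expected if isinstance(expected, list) else [expected]
        if not any(_matches(instance, name) for name in names):
            return [f"{path}: expected type {expected}"]
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: value not in enum")
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, value in instance.items():
            if key in props:
                errors.extend(validate(value, props[key], f"{path}.{key}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unexpected property '{key}'")
    if isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer than {schema['minItems']} items")
        item_schema = schema.get("items")
        if isinstance(item_schema, dict):
            for index, item in enumerate(instance):
                errors.extend(validate(item, item_schema, f"{path}[{index}]"))
    return errors
```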

## Runner Entry Points

Runner entry points currently support these kinds:

- `python_module`: load the Python file at `module_path` from the extension
  directory and call the function named by `callable`.
- `command`: execute a manifest-declared argv without shell expansion. The core
  writes a context JSON file and expects the command to print a JSON runner
  result to stdout.
- `external`: declare an external harness that the baseline core cannot execute
  yet.

Example:

```json
{
  "id": "cmis-browser-preflight",
  "kind": "python_module",
  "module_path": "src/open_cmis_tck/preflight.py",
  "callable": "run",
  "command": null,
  "metadata": {
    "harness_id": "opencmis-tck",
    "harness_version": "extension-detected-or-declared",
    "source_url": "https://chemistry.apache.org/java/opencmis.html"
  },
  "description": "Checks whether the CMIS Browser Binding endpoint is reachable."
}
```

Command runner example:

```json
{
  "id": "opencmis-tck",
  "kind": "command",
  "module_path": null,
  "callable": null,
  "command": ["python3", "runners/opencmis_tck.py", "--context", "{context_json}"],
  "description": "Checks dependency posture and prepares OpenCMIS TCK execution."
}
```

Command placeholders:

- `{context_json}`: generated context file for the current step.
- `{root}`: guide-board core root.
- `{run_dir}`: current run directory.
- `{extension_path}`: current extension directory.

The command is executed with the extension directory as its working directory.
The core does not use a shell for command runners.
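
The command runner flow described above can be sketched as follows: expand the placeholders in the declared argv, run it without a shell from the extension directory, and parse stdout as the runner result. The helper names and the context file name are hypothetical. Handling of non-zero exit codes or invalid JSON is left out because this document does not specify it.

```python
import json
import subprocess
from pathlib import Path


def expand_argv(argv, values):
    """Replace known placeholders in each argv element; no shell is involved."""
    expanded = []
    for arg in argv:
        for name, value in values.items():
            arg = arg.replace("{" + name + "}", str(value))
        expanded.append(arg)
    return expanded


def run_command_runner(runner, context, root, run_dir, extension_path):
    """Write the context file, run the declared argv, and parse stdout as JSON."""
    # Hypothetical file name; the core chooses where the context file lives.
    context_file = Path(run_dir) / f"{runner['id']}.context.json"
    context_file.write_text(json.dumps(context), encoding="utf-8")
    argv = expand_argv(runner["command"], {
        "context_json": context_file,
        "root": root,
        "run_dir": run_dir,
        "extension_path": extension_path,
    })
    completed = subprocess.run(
        argv,                 # a list argv runs without a shell by default
        cwd=extension_path,   # working directory is the extension directory
        capture_output=True,
        text=True,
        check=False,
    )
    return json.loads(completed.stdout)
```

Because the argv is passed as a list, characters such as `;` or `$` inside a placeholder value reach the command as literal text rather than being interpreted by a shell.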

Runner context values are stable for bundled and external extensions:

- `root`: the guide-board core root.
- `extension_path`: the absolute path to the extension root.
- `run_dir`: the current run output directory.
- `plan`: the immutable run plan snapshot.

## Mapping Sets

Mapping sets connect normalized evidence requirement refs to capability groups,
controls, conformance classes, quality dimensions, or other assessment targets.

Each mapping set lives in the extension's `mappings/` folder. For a bundled extension, that is:

```text
extensions/<extension-id>/mappings/<mapping-id>.json
```

and validates against:

```text
docs/schemas/mapping-set.schema.json
```

The core does not embed domain policy. It only joins evidence `requirement_refs`
to extension-owned mappings and writes normalized mapping records to:

```text
runs/<run-id>/normalized/mappings.json
```
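
The join described above can be pictured with a small sketch. The mapping set shape used here (`mappings` entries with `requirement_ref` and `target_ref`) and the output record fields are hypothetical placeholders. The real shapes are defined by `docs/schemas/mapping-set.schema.json`.

```python
def join_mappings(evidence_items, mapping_sets):
    """Join evidence requirement_refs to mapping entries (hypothetical shapes)."""
    records = []
    for item in evidence_items:
        refs = set(item.get("requirement_refs", []))
        for mapping_set in mapping_sets:
            for entry in mapping_set.get("mappings", []):
                # No domain policy here: a record is emitted purely on ref equality.
                if entry["requirement_ref"] in refs:
                    records.append({
                        "evidence_id": item["id"],
                        "mapping_set_id": mapping_set["id"],
                        "requirement_ref": entry["requirement_ref"],
                        "target_ref": entry["target_ref"],
                    })
    return records
```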

## Report Fragments

Extensions can contribute report fragments through `report_fragments`.

Static Markdown file:

```json
{
  "id": "overview",
  "kind": "markdown_file",
  "path": "reports/overview.md",
  "title": "Overview"
}
```

Dynamic Python fragment:

```json
{
  "id": "sdk-fixture-summary",
  "kind": "python_module",
  "module_path": "reports/sdk_fixture_summary.py",
  "callable": "build_fragment",
  "path": null,
  "title": "SDK Fixture Summary"
}
```

Fragment paths are resolved relative to the extension root and must stay inside
that root. A Python fragment receives `root`, `run_dir`, `run_id`, `plan`,
`evidence`, `findings`, `mappings`, `assessment_package`, `policy_summary`,
`source_lock`, `extension_path`, and `report_fragment`.

It returns:

```python
def build_fragment(context: dict) -> dict:
    return {
        "markdown": "### Extension Summary\n\n- evidence items: 2",
        "structured": {"evidence_count": 2},
    }
```

Fragments are written to `reports/fragments.json`, embedded in the assessment
package, rendered in `reports/report.md`, and summarized in
`exports/export-manifest.json`.
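
The rule that fragment, schema, and module paths must stay inside the extension root can be checked with a few lines of `pathlib`. This is a minimal sketch of such a containment check. `resolve_inside_root` is a hypothetical helper name, not a function exported by the core.

```python
from pathlib import Path


def resolve_inside_root(extension_root, relative_path):
    """Resolve a manifest path and reject anything that escapes the root."""
    root = Path(extension_root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / relative_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes extension root: {relative_path}") from None
    return candidate
```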

## Evidence Request Sets

Procedural and hybrid compliance extensions may include evidence request sets
under:

```text
evidence-requests/<request-set-id>.json
```

These files validate against:

```text
docs/schemas/evidence-request-set.schema.json
```

Evidence request sets are for collection guidance and review workflow. They
should reference official requirements by stable IDs or user-held licensed
material, but they must not redistribute proprietary standard text. A starter
template lives at:

```text
extensions/_template/evidence-request-set.json
```

See `docs/COMPLIANCE-EVIDENCE-PACKS.md` for the compliance-pack strategy.

## Expectations And Waivers

Assessment profiles may reference expectation and waiver sets:

```json
{
  "expectations_ref": "profiles/expectations/example.json",
  "waivers_ref": "profiles/waivers/example.json"
}
```

Expectation sets mark known posture as expected. Waiver sets mark approved,
time-bounded exceptions. Both are applied after findings are generated, and the
assessment package records policy summary counts.
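
As a loose illustration of applying policy after findings exist, the sketch below annotates findings and counts outcomes. The record fields (`requirement_ref`, `expires_on`), the `policy` annotation, and the summary keys are all hypothetical. The actual expectation and waiver set formats are defined by the core schemas.

```python
from datetime import date


def apply_policy(findings, expectations, waivers, today):
    """Annotate already-generated findings and count policy outcomes."""
    expected_refs = {entry["requirement_ref"] for entry in expectations}
    # Waivers are time-bounded: an expired waiver no longer applies.
    active_waivers = {
        entry["requirement_ref"]
        for entry in waivers
        if date.fromisoformat(entry["expires_on"]) >= today
    }
    summary = {"expected": 0, "waived": 0, "unexpected": 0}
    annotated = []
    for finding in findings:
        ref = finding["requirement_ref"]
        if ref in expected_refs:
            policy = "expected"
        elif ref in active_waivers:
            policy = "waived"
        else:
            policy = "unexpected"
        summary[policy] += 1
        annotated.append({**finding, "policy": policy})
    return annotated, summary
```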

## Challenges And Authority Exclusions

Assessment profiles may also reference challenge and exclusion sets:

```json
{
  "challenges_ref": "profiles/challenges/example.json",
  "exclusions_ref": "profiles/exclusions/example.json"
}
```

Challenge sets validate against `docs/schemas/challenge-set.schema.json`.
Exclusion sets validate against `docs/schemas/exclusion-set.schema.json`.
Records can match findings by requirement refs, check refs, evidence refs,
result refs, or classification refs. They also carry owner, review status,
rationale, authority source refs, review dates, optional expiry, native IDs,
and free-form metadata.

Use challenges when an extension author or assessment team believes a finding
needs review because a check is invalid, a native harness result is disputed, or
a mapping is wrong. Use exclusions when an authority or program explicitly
removes a requirement, check, or result from the assessment scope. The core
preserves these distinctions in findings, evidence review annotations,
assessment packages, reports, and retained summaries, but default gate semantics
still count the underlying finding as unexpected unless it is separately
expected or waived.

## Python Runner Contract

A Python runner receives one context object and returns one result object.

```python
def run(context: dict) -> dict:
    return {
        "result": "pass",
        "observations": ["Observed the expected condition."],
        "facts": {"key": "value"},
        "artifact_refs": [],
    }
```

Context fields:

- `root`: guide-board core root path as a string.
- `run_dir`: output run directory path as a string.
- `run_id`: current run ID.
- `plan`: full run plan snapshot.
- `step`: the step being executed.
- `target_profile`: target profile snapshot.
- `assessment_profile`: assessment profile snapshot.
- `extension_path`: extension directory path as a string.
- `runner`: manifest runner declaration.

Result fields:

- `result`: one of the guide-board evidence result statuses.
- `observations`: human-readable observations.
- `facts`: structured facts extracted by the runner.
- `artifact_refs`: references to raw artifacts written by the runner.
- `requirement_refs`: optional requirement refs discovered by the runner.
- `metadata`: optional generic metadata such as `harness_version`,
  `test_suite_id`, `adapter_version`, `source_url`, or native result IDs.

Artifact refs must be paths relative to the run directory. After runner
execution, the core fingerprints existing artifact refs into the assessment
package `artifact_manifest`.
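
A fingerprinting pass over artifact refs might look like the sketch below. The choice of SHA-256 and the manifest entry fields (`ref`, `sha256`, `size`) are assumptions for illustration. The document does not state which digest or record shape the core uses.

```python
import hashlib
from pathlib import Path


def fingerprint_artifacts(run_dir, artifact_refs):
    """Hash each existing artifact ref; refs are relative to the run directory."""
    manifest = []
    for ref in artifact_refs:
        path = Path(run_dir) / ref
        if not path.is_file():
            continue  # only artifacts that exist on disk are fingerprinted
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        manifest.append({"ref": ref, "sha256": digest, "size": path.stat().st_size})
    return manifest
```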

Runner metadata is merged with manifest entrypoint metadata and preserved under
evidence `facts.source_metadata`. The same metadata is also summarized in the
submission package manifest, which lets reviewers distinguish the extension
version from the harness or native test-suite version without adding
domain-specific fields to the core.

If a Python runner raises an exception, the core converts that failure into
`infrastructure_error` evidence so the assessment package remains complete.

Preflight runners are gates. If an extension preflight returns `fail`, `blocked`,
or `infrastructure_error`, downstream check groups for that extension are not
executed. Instead, they receive `blocked` evidence with
`blocked_reason: preflight_failed`.
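
The two failure rules above (exceptions become `infrastructure_error`, and a failed preflight blocks downstream groups) can be sketched together. The helper names are hypothetical, and placing `blocked_reason` under `facts` is an assumption of this example.

```python
GATE_FAILURES = {"fail", "blocked", "infrastructure_error"}


def safe_run(runner_callable, context):
    """Call a runner; turn any exception into infrastructure_error evidence."""
    try:
        return runner_callable(context)
    except Exception as exc:  # deliberately broad: any runner failure counts
        return {
            "result": "infrastructure_error",
            "observations": [f"Runner raised {type(exc).__name__}: {exc}"],
            "facts": {},
            "artifact_refs": [],
        }


def run_with_preflight(preflight, check_groups, context):
    """Run the preflight gate, then either run or block each check group."""
    gate = safe_run(preflight, context)
    evidence = {"preflight": gate}
    for name, runner_callable in check_groups.items():
        if gate["result"] in GATE_FAILURES:
            # Downstream runners are never called when the gate fails.
            evidence[name] = {
                "result": "blocked",
                "observations": ["Preflight did not pass."],
                "facts": {"blocked_reason": "preflight_failed"},
                "artifact_refs": [],
            }
        else:
            evidence[name] = safe_run(runner_callable, context)
    return evidence
```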

## Normalizer Plug-ins

Runners can keep returning guide-board-ready result objects directly. When a
runner wraps a native harness or scanner that writes its own result format, the
extension can add a normalizer descriptor:

```json
{
  "id": "native-probe-normalizer",
  "kind": "python_module",
  "module_path": "normalizers/native_probe.py",
  "callable": "normalize",
  "runner_ref": "native-probe",
  "metadata": {
    "adapter_version": "0.1.0"
  },
  "description": "Converts native runner output into guide-board evidence."
}
```

Normalizers are declared in `extension.json` under `normalizers`. The original
string shorthand remains valid for descriptive-only entries, but only descriptor
objects are loaded and invoked by the core.

The first supported normalizer kind is `python_module`. Its module path is
resolved relative to the extension root and must stay inside that root. The
callable receives one context object:

- `root`: guide-board core root path as a string.
- `extension_path`: extension root path as a string.
- `run_dir`: output run directory path as a string.
- `run_id`: current run ID.
- `plan`: full run plan snapshot.
- `step`: the step being normalized.
- `target_profile`: target profile snapshot.
- `assessment_profile`: assessment profile snapshot.
- `normalizer`: manifest normalizer descriptor.
- `runner_result`: the current runner-result object.

A normalizer returns any subset of the runner-result fields:

```python
def normalize(context: dict) -> dict:
    return {
        "result": "pass",
        "observations": ["Native result was normalized."],
        "facts": {"native_status": "ok"},
        "artifact_refs": ["artifacts/native-result.json"],
        "requirement_refs": ["framework.requirement"],
    }
```

The core merges the normalizer output over the runner result:

- `result` replaces the previous result.
- `observations` are appended.
- `facts` are merged.
- `artifact_refs` and `requirement_refs` are deduplicated.
- `metadata` is merged.
- `normalizer_refs` is recorded in evidence facts when any normalizer runs.
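
The merge rules above can be written out as a short sketch. It assumes shallow dictionary merges in which the normalizer's keys win on conflict, and it assumes list fields are concatenated before de-duplication. It also records `normalizer_refs` directly under `facts`, which is an assumption about placement. `merge_normalizer_output` is a hypothetical name.

```python
def merge_normalizer_output(runner_result, normalized, normalizer_id):
    """Merge normalizer output over a runner result, field by field."""
    merged = dict(runner_result)
    if "result" in normalized:
        merged["result"] = normalized["result"]  # replaces the previous result
    merged["observations"] = [
        *runner_result.get("observations", []),
        *normalized.get("observations", []),     # appended, never replaced
    ]
    facts = {**runner_result.get("facts", {}), **normalized.get("facts", {})}
    for key in ("artifact_refs", "requirement_refs"):
        combined = [*runner_result.get(key, []), *normalized.get(key, [])]
        merged[key] = list(dict.fromkeys(combined))  # order-preserving dedupe
    merged["metadata"] = {
        **runner_result.get("metadata", {}),
        **normalized.get("metadata", {}),
    }
    facts["normalizer_refs"] = [*facts.get("normalizer_refs", []), normalizer_id]
    merged["facts"] = facts
    return merged
```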

If a normalizer raises an exception, the step becomes
`infrastructure_error` evidence and the run still produces its normal artifact
set.

The bundled `extensions/sdk-fixture` extension is the copyable reference path
for profile schemas, a native-output runner, a normalizer, mappings, and fixture
profiles.

## Source Lock And Submission Package

Every new run writes `sources.lock.json`, `reports/submission-package.json`,
and the generic portable export manifest at `exports/export-manifest.json`.
Extension authors should treat source metadata as part of the evidence contract:

- declare extension, authority, framework, runner, and normalizer metadata in
  `extension.json` when it is static;
- return runner or normalizer `metadata` when versions, native result IDs, or
  test-suite IDs are detected at runtime;
- keep mapping sets under `mappings/` so the core can checksum them in the
  source lock;
- keep restricted or licensed assets referenced by metadata or artifacts rather
  than vendored into the core.

The submission package manifest is generic guide-board output. Authority-specific
final submissions, trademark assertions, or certification conclusions remain
extension-owned or reviewer-owned.

## Result Statuses

Initial statuses:

- `pass`
- `fail`
- `warning`
- `manual`
- `not_applicable`
- `skipped`
- `expected_gap`
- `waiver_applied`
- `unsupported_by_design`
- `infrastructure_error`
- `blocked`
- `unknown`
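
Extension authors may find it useful to check a runner result against this list before returning it. The sketch below is a hypothetical author-side helper, not part of the core. It only checks the status value and the basic types of the documented fields when they are present.

```python
RESULT_STATUSES = frozenset({
    "pass", "fail", "warning", "manual", "not_applicable", "skipped",
    "expected_gap", "waiver_applied", "unsupported_by_design",
    "infrastructure_error", "blocked", "unknown",
})

# Documented runner-result fields and the container type each should have.
_FIELD_TYPES = {
    "observations": list,
    "facts": dict,
    "artifact_refs": list,
    "requirement_refs": list,
    "metadata": dict,
}


def check_runner_result(result):
    """Return a list of problems; an empty list means the shape looks fine."""
    problems = []
    if result.get("result") not in RESULT_STATUSES:
        problems.append(f"unknown result status: {result.get('result')!r}")
    for key, expected in _FIELD_TYPES.items():
        if key in result and not isinstance(result[key], expected):
            problems.append(f"{key} should be a {expected.__name__}")
    return problems
```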

## Current Extension Examples

- `sample-noop`: no runner, used to validate the core contracts.
- `sdk-fixture`: compact SDK fixture covering profile schemas, runner output,
  normalizer invocation, mapping, and fixture profiles.
- `open-cmis-tck`: provides a Python CMIS Browser Binding preflight runner and
  declares the future external OpenCMIS TCK runner.

## Next SDK Steps

- Broaden normalizer examples as real external extensions adopt native harness
  result formats.
- Add more extension-owned schema validation examples for assessment-specific
  domain constraints.