Guide Board Core Architecture Blueprint

Status: draft
Created: 2026-05-07

Purpose

This blueprint defines the first core architecture for guide-board: a certification and compliance preparation framework that can orchestrate extension-specific conformance harnesses, validators, repository-quality checks, and procedural evidence packs without embedding domain policy in the core.

The design is based on recurring patterns from official or authority-backed programs such as OGC TEAM Engine, OpenID Foundation Conformance Suite, CNCF Kubernetes Conformance, web-platform-tests, Khronos CTS, NIST ACVP, HL7/FHIR Inferno, Jakarta EE TCK, OPC UA CTT, NIST SCAP/OSCAL, CIS-CAT, and OpenSSF Scorecard.

Research Lessons

Suite Engine

Examples: OGC TEAM Engine, OpenCMIS TCK, Jakarta EE TCK.

Pattern:

  • installable suites with their own test definitions,
  • command-line execution and sometimes web/API execution,
  • target-specific input forms or profiles,
  • raw logs plus structured result formats,
  • conformance classes or capability areas,
  • certification boundary outside normal self-testing.

Architecture lesson:

guide-board needs a runner bridge that can call external harnesses, capture artifacts, and normalize tool-specific result formats without making the harness part of the core.

Sources:

Hosted Or Local Certification Suite

Examples: OpenID Foundation Conformance Suite, Inferno.

Pattern:

  • open source suite,
  • hosted public/staging environments,
  • local Docker execution,
  • named test plans or test kits,
  • logs and public result pages,
  • fee or accredited-review boundary for formal certification.

Architecture lesson:

guide-board should model execution environment tiers, test plans, and certification submission packages separately from normal development runs.

Sources:

Submit-Results Program

Example: CNCF Kubernetes Conformance.

Pattern:

  • vendors run the same open source conformance application used by the program,
  • result artifacts are submitted for review,
  • accepted results feed a public certification list,
  • users can rerun the same conformance application to confirm behavior.

Architecture lesson:

An assessment package should be a first-class artifact with source metadata, runner version, target identity, raw evidence, normalized results, and a review boundary suitable for downstream submission.

Source:

Protocol Validation Service

Example: NIST ACVP.

Pattern:

  • authority-operated demo and production services,
  • client authentication,
  • machine-to-machine protocol,
  • generated test vectors and submitted responses,
  • validation tied to an external authority process.

Architecture lesson:

Some extensions will not run a local test suite. They will coordinate a session with an authority service. The core must support credential references, remote session IDs, generated inputs, submitted responses, and external verdicts.

Source:

Web-Scale Shared Test Repository

Example: web-platform-tests.

Pattern:

  • shared specification-linked test repository,
  • canonical manifest generation,
  • multiple test types including automated, reference, and manual tests,
  • local and public execution surfaces.

Architecture lesson:

guide-board check discovery should be manifest-driven where possible. It must support heterogeneous check types instead of assuming every check is a simple pass/fail command.

Sources:

Conformance Submission Package

Examples: Khronos Vulkan CTS and OpenXR CTS.

Pattern:

  • automated and sometimes interactive test runs,
  • XML result files,
  • console output,
  • build and CTS version metadata,
  • explicit conformance statement,
  • trademark or adopter-program boundary.

Architecture lesson:

The guide-board assessment package should preserve both normalized evidence and the original submission-grade artifacts expected by an authority.

Sources:

Restricted Tool

Examples: OPC UA CTT, CIS-CAT Pro.

Pattern:

  • official tool may be restricted to members, licensees, or controlled access,
  • tests are organized by profiles, facets, conformance units, benchmarks, or controls,
  • command-line execution may exist for automation,
  • redistribution is not allowed or not appropriate.

Architecture lesson:

guide-board must represent restricted harnesses as externally supplied runtime assets. The registry can describe how to integrate them, but the core and extensions must not vendor restricted tools or proprietary standard text.

Sources:

Security Configuration And Assessment Content

Examples: NIST SCAP, OpenSCAP, CIS-CAT Pro.

Pattern:

  • machine-readable security configuration content,
  • profiles or tailored benchmarks,
  • local or remote system assessment,
  • automated and manual checks,
  • reports mapped to controls.

Architecture lesson:

guide-board must support content-driven validators where the extension supplies policy content and a scanner, not a fixed test suite. The evidence model must handle manual, automated, and partially automated checks.

Sources:

Assessment Data Interchange

Example: NIST OSCAL.

Pattern:

  • layered machine-readable models for controls, implementation, assessment plans, assessment results, and remediation milestones,
  • multiple serializations such as JSON, XML, and YAML,
  • assessment results expressed relative to a system and controls.

Architecture lesson:

guide-board should keep its internal evidence model small, but design it so later OSCAL export is natural for compliance packs that need formal assessment interchange.

Source:

Repository Quality And Supply Chain Scoring

Example: OpenSSF Scorecard.

Pattern:

  • automated checks over source repositories,
  • score and risk level per check,
  • aggregate posture score,
  • remediation prompts,
  • CI and API integration.

Architecture lesson:

Repository quality packs should be normal extensions. A score is not a certification verdict; it is a normalized finding and trend signal.

Quality gates should be core policy decisions over retained posture, not extension-specific verdicts. The first gate layer checks latest run status, unexpected finding count, and whether the latest trend regressed.
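The first gate layer described above can be sketched as a pure function over retained posture. This is only an illustration: `PostureSnapshot`, its field names, the `"completed"` status value, and the `max_unexpected` threshold are assumptions, not part of the guide-board contracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostureSnapshot:
    """Hypothetical retained posture for one target and assessment profile."""
    latest_run_status: str    # assumed values such as "completed" or "failed"
    unexpected_findings: int  # findings that are neither expected nor waived
    trend_regressed: bool     # True if the latest trend got worse


def evaluate_gate(posture: PostureSnapshot, max_unexpected: int = 0) -> list[str]:
    """Return the reasons the gate fails; an empty list means it passes."""
    reasons = []
    if posture.latest_run_status != "completed":
        reasons.append(f"latest run status is {posture.latest_run_status!r}")
    if posture.unexpected_findings > max_unexpected:
        reasons.append(
            f"{posture.unexpected_findings} unexpected findings exceed {max_unexpected}"
        )
    if posture.trend_regressed:
        reasons.append("latest trend regressed")
    return reasons
```

Returning reasons rather than a boolean keeps the gate decision in the core while still letting reports explain why a gate failed.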

Sources:

Architecture Principles

  • The core is extension-neutral.
  • Authority, framework, and harness versions are evidence, not prose.
  • Local CLI behavior is the execution source of truth.
  • Optional service APIs wrap the same contracts used by the CLI.
  • Restricted harnesses and proprietary standards are mounted or referenced, not redistributed.
  • Raw artifacts are preserved, but normalized evidence is the primary interface.
  • Every assessment package must state its certification boundary.
  • Manual, semi-automated, and fully automated checks all use the same evidence model.
  • Expected gaps and waivers never suppress unexpected failures silently.
  • Extension extraction to separate repositories should be possible without changing core contracts.

Core Components

Authority Catalog

Tracks source authorities, framework names, versions, official URLs, licensing posture, access constraints, certification boundaries, and lifecycle status.

Extension Registry

Discovers installed or incubating extensions. Each extension declares:

  • extension ID,
  • type,
  • supported frameworks,
  • source authority references,
  • profile schemas,
  • check groups,
  • runner or validator entry points,
  • normalizers,
  • mappings,
  • report fragments,
  • dependency and license posture.

Profile Registry

Loads and validates target profiles and assessment profiles.

Target profiles describe the subject being assessed: repository, service, cluster, product, API, data archive, host, organization, process, or policy set.

Assessment profiles select frameworks, controls, check groups, expectations, waivers, output policies, and retention policies.

Local Service Facade

Wraps the CLI/core contracts in a dependency-light local HTTP API. The service can list extensions, validate profiles, build plans, start assessment jobs, inspect job status, and fetch generated reports.

The first implementation stores job status in memory and leaves durable evidence in the normal run directory. It does not introduce separate execution semantics.

Assessment Planner

Resolves an assessment profile into an executable run plan:

  • selected extensions,
  • selected check groups,
  • required credentials,
  • preflight checks,
  • dependency checks,
  • execution order,
  • isolation and timeout policy,
  • artifact retention policy.

At execution time, a failing preflight blocks downstream check groups for the same extension so expensive or misleading harness steps are not invoked.
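A minimal sketch of the preflight rule, assuming each planned step is a dict with the hypothetical keys `extension_id`, `check_group`, and `kind`, and that `run_step` is a caller-supplied callable returning a result string:

```python
def execute_plan(steps, run_step):
    """Run ordered steps; a failed preflight blocks later steps of that extension.

    `steps` is a list of dicts with assumed keys "extension_id",
    "check_group", and "kind" ("preflight" or "check").
    `run_step` is a callable returning a result string such as "pass" or "fail".
    """
    blocked_extensions = set()
    results = []
    for step in steps:
        ext = step["extension_id"]
        if ext in blocked_extensions:
            # The harness is never invoked; the step is recorded as blocked.
            results.append((ext, step["check_group"], "blocked"))
            continue
        result = run_step(step)
        results.append((ext, step["check_group"], result))
        if step["kind"] == "preflight" and result != "pass":
            blocked_extensions.add(ext)
    return results
```

Blocked steps are still recorded, so the report can tell a check that never ran apart from one that ran and failed. Other extensions in the same plan are unaffected.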

Runner Bridge

Executes or coordinates extension checks.

Supported runner kinds:

  • local command,
  • container command,
  • in-process validator,
  • remote protocol session,
  • hosted test-suite interaction,
  • manual evidence request,
  • imported result package.
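One way to keep the runner kinds above behind a single entry point is a small registry keyed by kind. The sketch below is illustrative only: the enum values, the `runner` decorator, and the check keys `runner_kind` and `command` are assumptions, and only two of the seven kinds have handlers.

```python
import subprocess
from enum import Enum


class RunnerKind(Enum):
    LOCAL_COMMAND = "local_command"
    CONTAINER_COMMAND = "container_command"
    IN_PROCESS_VALIDATOR = "in_process_validator"
    REMOTE_PROTOCOL_SESSION = "remote_protocol_session"
    HOSTED_SUITE_INTERACTION = "hosted_suite_interaction"
    MANUAL_EVIDENCE_REQUEST = "manual_evidence_request"
    IMPORTED_RESULT_PACKAGE = "imported_result_package"


RUNNERS = {}  # RunnerKind -> handler callable


def runner(kind):
    """Decorator that registers a handler for one runner kind."""
    def register(func):
        RUNNERS[kind] = func
        return func
    return register


@runner(RunnerKind.LOCAL_COMMAND)
def run_local_command(check):
    # Run the command as an argument list (no shell) and capture its output.
    proc = subprocess.run(
        check["command"], capture_output=True, text=True,
        timeout=check.get("timeout", 60),
    )
    result = "pass" if proc.returncode == 0 else "fail"
    return {"check_id": check["id"], "result": result, "stdout": proc.stdout}


@runner(RunnerKind.MANUAL_EVIDENCE_REQUEST)
def request_manual_evidence(check):
    # Nothing runs; the evidence record waits for a human reviewer.
    return {"check_id": check["id"], "result": "manual"}


def dispatch(check):
    """Route a check to its handler; unregistered kinds surface as errors."""
    kind = RunnerKind(check["runner_kind"])
    handler = RUNNERS.get(kind)
    if handler is None:
        return {"check_id": check["id"], "result": "infrastructure_error"}
    return handler(check)
```

A manual evidence request goes through the same `dispatch` call as an executable check, so every kind produces a record the normalizer can consume.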

Artifact Store

Stores run artifacts by reference and checksum:

  • raw logs,
  • XML/JSON/HTML reports,
  • screenshots or rendered documents,
  • authority submission files,
  • request/response transcripts,
  • input forms,
  • profile snapshots,
  • source lockfiles.

The first implementation builds the assessment package artifact manifest from runner-emitted artifact refs and computes checksums for files inside the run directory. New runs also write a source lock and a submission package manifest that fingerprint reviewable run files and summarize runner or normalizer metadata reported by extensions.
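The checksum step might look like the sketch below. It assumes artifact refs are paths relative to the run directory and that SHA-256 is the digest. Neither is stated by this blueprint, and `build_artifact_manifest` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def build_artifact_manifest(run_dir, artifact_refs):
    """Checksum runner-emitted artifact refs that resolve inside run_dir.

    `artifact_refs` are relative paths (an assumption); refs that escape the
    run directory or do not exist are listed without a checksum.
    """
    root = Path(run_dir).resolve()
    manifest = []
    for ref in artifact_refs:
        path = (root / ref).resolve()
        entry = {"path": ref, "checksum": None}
        # Only files that really live under the run directory are hashed.
        if path.is_relative_to(root) and path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entry["checksum"] = f"sha256:{digest}"
        manifest.append(entry)
    return manifest
```

Resolving each path before the containment check keeps a ref such as `../outside.txt` from being fingerprinted as if it belonged to the run. `Path.is_relative_to` requires Python 3.9 or later.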

Normalizer

Converts extension output into guide-board evidence records.

The normalizer should preserve native identifiers such as test case IDs, conformance class IDs, control IDs, profile IDs, benchmark IDs, or requirement references.

Mapping Engine

Maps evidence to:

  • capabilities,
  • controls,
  • conformance classes,
  • requirements,
  • policy questions,
  • repository quality dimensions,
  • scorecard dimensions.

Mappings belong to extensions or assessment packs, not the core.

The first implementation loads extension-owned JSON mapping sets from extensions/<extension-id>/mappings/, joins them to evidence requirement_refs, and writes normalized mapping records under each run directory.
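The join between mapping sets and evidence can be sketched as follows. The mapping-set shape (`entries`, `requirement_ref`, `maps_to`) is an assumption made for illustration; the actual JSON layout is defined by the extension-owned mapping files.

```python
def join_mappings(evidence_items, mapping_sets):
    """Join evidence to mapping entries by shared requirement reference.

    Each mapping entry is assumed to look like
    {"requirement_ref": "...", "maps_to": {"kind": "control", "id": "..."}}.
    """
    by_requirement = {}
    for mapping_set in mapping_sets:
        for entry in mapping_set["entries"]:
            by_requirement.setdefault(entry["requirement_ref"], []).append(entry["maps_to"])

    records = []
    for item in evidence_items:
        for ref in item.get("requirement_refs", []):
            # Evidence whose refs match no mapping entry produces no record.
            for target in by_requirement.get(ref, []):
                records.append({
                    "evidence_id": item["id"],
                    "requirement_ref": ref,
                    "maps_to": target,
                    "result": item["result"],
                })
    return records
```

Because the join key is the native requirement reference, the core never needs to know what a control or conformance class means for a given extension.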

Expectation And Waiver Engine

Applies declared target posture after evidence normalization.

Use expectations for known optional behavior, unsupported-by-design features, and accepted gaps.

Use waivers for time-bounded exceptions with owner, reason, expiry, and review metadata.

The first implementation supports assessment-profile references to JSON expectation and waiver sets. These policies annotate findings as expected or waived after evidence normalization and finding creation.
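A hedged sketch of the annotation step: findings are never dropped, only marked. The input shapes are assumptions (a set of expected requirement refs, and waivers with an ISO-format `expires_at` date), and the sketch treats an expired waiver as no waiver.

```python
from datetime import date


def annotate_findings(findings, expectations, waivers, today):
    """Mark findings as expected or waived without dropping any of them.

    `expectations` is a set of requirement refs; `waivers` is a list of dicts
    with "id", "requirement_refs", and ISO-format "expires_at" (assumed shapes).
    """
    annotated = []
    for finding in findings:
        refs = set(finding["requirement_refs"])
        result = dict(finding, expected=False, waiver_ref=None)
        if refs & expectations:
            result["expected"] = True
        for waiver in waivers:
            active = date.fromisoformat(waiver["expires_at"]) >= today
            if active and refs & set(waiver["requirement_refs"]):
                result["waiver_ref"] = waiver["id"]
                break
        annotated.append(result)
    return annotated


def unexpected(annotated):
    """Findings that remain visible to gates: neither expected nor waived."""
    return [f for f in annotated if not f["expected"] and f["waiver_ref"] is None]
```

Passing `today` in explicitly makes expiry behavior testable and keeps a rerun of an old package deterministic.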

Report Builder

Builds human-readable and machine-readable outputs:

  • compact JSON assessment package,
  • Markdown summary,
  • extension-specific fragments,
  • submission package manifest,
  • trend summaries,
  • future OSCAL or other interchange exports.

Retention Index

Keeps compact summaries over time while allowing raw artifact retention to be bounded by policy. The first implementation writes retention-summary.json for each run and can build a trend summary grouped by target and assessment profile.
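A trend summary grouped by target and assessment profile could be built roughly like this. The summary field names (`target_id`, `assessment_profile_id`, `created_at`, `unexpected_findings`) are hypothetical and do not describe the actual retention-summary.json layout.

```python
from collections import defaultdict


def build_trend(summaries):
    """Group retention summaries by (target, assessment profile) and compare
    the two most recent runs. Field names here are hypothetical."""
    groups = defaultdict(list)
    for summary in summaries:
        groups[(summary["target_id"], summary["assessment_profile_id"])].append(summary)

    trend = {}
    for key, runs in groups.items():
        # ISO timestamps in one consistent format sort correctly as strings.
        runs.sort(key=lambda s: s["created_at"])
        latest = runs[-1]
        previous = runs[-2] if len(runs) > 1 else None
        trend[key] = {
            "runs": len(runs),
            "latest_unexpected": latest["unexpected_findings"],
            "regressed": previous is not None
            and latest["unexpected_findings"] > previous["unexpected_findings"],
        }
    return trend
```

Only the compact summaries are read here, so the trend survives even after raw artifacts have been pruned by retention policy.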

Extension Archetypes

Executable Harness Extension

Runs an external TCK, CTS, or conformance suite.

Examples: open-cmis-tck, OGC TEAM Engine, Jakarta EE TCK, Khronos CTS.

Validator Extension

Validates structured artifacts against schemas, profiles, or data-stream requirements.

Examples: SCAP content validation, FHIR resource validation.

Protocol Service Extension

Coordinates with an external authority-operated service.

Example: NIST ACVP.

Hosted Suite Extension

Uses a hosted or locally containerized suite with named test plans.

Examples: OpenID Conformance Suite, Inferno.

Repository Quality Extension

Runs checks against repository configuration, development process, supply chain signals, and release hygiene.

Example: OpenSSF Scorecard.

Procedural Evidence Extension

Guides collection of policy, process, and control evidence where no official executable harness exists.

Examples: GDPR, SOC 2, HIPAA, NF Z 42-013, NF 461, ISO 14641, ISO 15489.

Procedural packs use evidence request sets to describe artifact collection, review roles, acceptance criteria, confidentiality, renewal expectations, and waiver paths without reproducing restricted standard text. See docs/COMPLIANCE-EVIDENCE-PACKS.md.

Hybrid Extension

Combines automated checks, manual evidence, external auditor review, and imported result packages.

Core Data Contracts

The first implementation should define these as simple JSON/YAML schemas before building complex runtime code.

Authority

  • id
  • name
  • authority_type
  • source_urls
  • frameworks
  • license_posture
  • access_constraints
  • certification_boundary
  • lifecycle_status

ExtensionManifest

  • id
  • name
  • version
  • extension_type
  • supported_frameworks
  • profile_schemas
  • check_groups
  • runner_entrypoints
  • normalizers
  • mappings
  • report_fragments
  • dependencies
  • restricted_assets

Framework

  • id
  • authority_id
  • name
  • version
  • status
  • source_urls
  • requirement_index
  • profile_index
  • license_posture

TargetProfile

  • id
  • subject_type
  • subject_name
  • environment
  • scope
  • endpoints
  • artifacts
  • credentials_ref
  • declared_capabilities
  • known_gaps

AssessmentProfile

  • id
  • framework_refs
  • extension_refs
  • target_profile_ref
  • selected_check_groups
  • expectations_ref
  • waivers_ref
  • output_policy
  • retention_policy

CheckDefinition

  • id
  • extension_id
  • check_type
  • framework_refs
  • requirement_refs
  • inputs
  • preconditions
  • timeout
  • runner_ref
  • expected_artifacts

RunPlan

  • id
  • assessment_profile_snapshot
  • extension_snapshots
  • source_lock
  • ordered_steps
  • credential_refs
  • artifact_policy
  • runtime_policy

SourceLock

  • framework_refs
  • extension_refs
  • frameworks
  • extensions
  • mapping_sets
  • profiles
  • policy_refs
  • authorities
  • metadata_hooks

RawArtifact

  • id
  • run_id
  • path
  • media_type
  • producer
  • checksum
  • created_at
  • retention_class

EvidenceItem

  • id
  • run_id
  • extension_id
  • check_id
  • subject_ref
  • result
  • observations
  • facts
  • requirement_refs
  • artifact_refs
  • started_at
  • completed_at

Finding

  • id
  • run_id
  • status
  • severity
  • classification
  • requirement_refs
  • evidence_refs
  • expected
  • waiver_ref
  • remediation

Waiver

  • id
  • scope
  • requirement_refs
  • reason
  • owner
  • approved_by
  • created_at
  • expires_at
  • review_status

AssessmentPackage

  • id
  • run_id
  • target
  • frameworks
  • extensions
  • source_lock
  • summary
  • findings
  • evidence_refs
  • artifact_manifest
  • waivers
  • certification_boundary
  • created_at

SubmissionPackage

  • run_id
  • package_identity
  • source_lock_ref
  • source_lock
  • reports
  • normalized_outputs
  • profile_snapshots
  • artifact_manifest
  • reported_metadata
  • certification_boundary

Result Vocabulary

The evidence model should allow these statuses:

  • pass
  • fail
  • warning
  • manual
  • not_applicable
  • skipped
  • expected_gap
  • waiver_applied
  • unsupported_by_design
  • infrastructure_error
  • blocked
  • unknown

The reporting layer should distinguish at least:

  • conformant evidence,
  • nonconformant evidence,
  • expected limitation,
  • waived limitation,
  • missing evidence,
  • infrastructure failure,
  • human review required.

Proposed Repository Layout

guide-board/
  INTENT.md
  README.md
  docs/
    ARCHITECTURE-BLUEPRINT.md
    schemas/
  extensions/
    CANDIDATES.md
    _template/
    sample-noop/
  runs/
  reports/
  workplans/

runs/ and reports/ hold locally generated outputs and should be ignored by default. Production extensions should usually live in separate repositories and be attached with --extension-dir or GUIDE_BOARD_EXTENSION_PATHS.
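How the two attachment mechanisms combine is not specified here. The sketch below assumes GUIDE_BOARD_EXTENSION_PATHS is a PATH-style list separated by `os.pathsep` and that directories given with --extension-dir take precedence; both are assumptions, as is the function name.

```python
import os
from pathlib import Path


def extension_search_paths(cli_dirs, environ=None):
    """Combine --extension-dir values with GUIDE_BOARD_EXTENSION_PATHS.

    Assumes the variable holds an os.pathsep-separated list, like PATH.
    CLI directories come first; duplicates are dropped, order is kept.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("GUIDE_BOARD_EXTENSION_PATHS", "")
    env_dirs = [p for p in raw.split(os.pathsep) if p]
    seen, ordered = set(), []
    for entry in [*cli_dirs, *env_dirs]:
        path = Path(entry)
        if path not in seen:
            seen.add(path)
            ordered.append(path)
    return ordered
```

Accepting `environ` as a parameter keeps the function testable without mutating the process environment.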

Execution Flow

discover extensions
  -> load authority catalog
  -> validate target profile
  -> validate assessment profile
  -> plan run
  -> run preflight
  -> execute checks
  -> collect artifacts
  -> normalize evidence
  -> map findings
  -> apply expectations and waivers
  -> build assessment package
  -> write reports
  -> retain summaries

Run Directory Contract

Each run should be reproducible from captured metadata where possible.

runs/<run-id>/
  run.json
  retention-summary.json
  plan.json
  sources.lock.json
  target-profile.snapshot.json
  assessment-profile.snapshot.json
  artifacts/
  normalized/
    evidence.json
    findings.json
    mappings.json
  reports/
    report.md
    assessment-package.json
    submission-package.json
  exports/
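A run directory can be checked against this layout with a short helper. Treating every listed file as required is an assumption of the sketch; the contract above does not say which files are optional.

```python
from pathlib import Path

# Files taken from the run directory layout above; all assumed required.
REQUIRED_FILES = [
    "run.json",
    "retention-summary.json",
    "plan.json",
    "sources.lock.json",
    "target-profile.snapshot.json",
    "assessment-profile.snapshot.json",
    "normalized/evidence.json",
    "normalized/findings.json",
    "normalized/mappings.json",
    "reports/report.md",
    "reports/assessment-package.json",
    "reports/submission-package.json",
]


def missing_run_files(run_dir):
    """Return the relative paths from REQUIRED_FILES absent under run_dir."""
    root = Path(run_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

An empty return value means the run directory carries the files a reviewer needs to trace a report back to its plan, profiles, and source lock.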

Container And Service Model

The local CLI should come first. Containerization should preserve the same CLI contracts.

Recommended container model:

  • guide-board-core image contains the core CLI and schema tooling.
  • Extension dependencies are either installed by extension-specific images or mounted as external assets.
  • Profiles, credentials, runs, and reports are mounted explicitly.
  • Restricted tools are mounted from licensed local paths.
  • Network access is declared per extension and per assessment profile.

The baseline Containerfile packages the local CLI, schemas, sample profiles, and incubating extensions. See docs/CONTAINER.md for mount contracts and the extension-specific image path.

Optional service model:

  • service lists extensions and profiles,
  • service validates and plans runs,
  • service starts jobs that call the CLI contracts,
  • service streams status and exposes reports,
  • service does not invent separate execution semantics.

Candidate API resources:

  • GET /extensions
  • GET /authorities
  • POST /profiles/validate
  • POST /assessments/plan
  • POST /runs
  • GET /runs/{run_id}
  • GET /runs/{run_id}/artifacts
  • GET /runs/{run_id}/reports

Governance Model

Extension Lifecycle

  • candidate: researched and registered.
  • incubating: has an intent and workplan.
  • active: runnable through core contracts.
  • external: maintained outside the repo but compatible.
  • deprecated: retained for historical runs only.

Challenge And Exclusion Handling

Use separate concepts:

  • authority exclusion: imported from an official TCK or program process,
  • extension challenge: local claim that a check is invalid or mis-mapped,
  • target expectation: declared optional or unsupported behavior,
  • waiver: approved and time-bounded exception,
  • defect: unexpected product or process failure.

The report must make these visible separately. The current policy layer loads challenge and exclusion refs from assessment profiles, annotates findings and evidence, and keeps unexpected_findings visible for gate semantics unless a finding is separately expected or waived.

Source Locking

Each run should lock:

  • extension version,
  • framework version,
  • harness version,
  • authority source URLs,
  • test suite IDs,
  • mapping version,
  • target profile snapshot,
  • expectation and waiver refs.

The current source lock remains backward-compatible with the original framework_refs and extension_refs fields while adding checksummed profiles, mapping-set refs, optional policy refs, authority descriptors, and metadata hooks for runners and normalizers.
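Backward compatibility of this kind usually means a reader that tolerates missing newer fields. A minimal sketch, assuming the lock is a JSON object keyed by the SourceLock field names and that absent fields default to empty containers (the container types are guesses):

```python
def read_source_lock(lock):
    """Return a normalized view of an old- or new-style source lock dict."""
    return {
        # Original fields, present in every lock.
        "framework_refs": list(lock.get("framework_refs", [])),
        "extension_refs": list(lock.get("extension_refs", [])),
        # Newer fields default to empty so older locks still load.
        "profiles": lock.get("profiles", {}),
        "mapping_sets": lock.get("mapping_sets", []),
        "policy_refs": lock.get("policy_refs", []),
        "authorities": lock.get("authorities", []),
        "metadata_hooks": lock.get("metadata_hooks", {}),
    }
```

Consumers that only understand `framework_refs` and `extension_refs` keep working, while newer tooling can rely on the extra keys always being present.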

Implementation Sequence

  1. Create schema drafts for the core data contracts.
  2. Add an extension manifest format and a minimal sample extension.
  3. Build the CLI commands: extensions list, profile validate, plan, run, and report.
  4. Integrate open-cmis-tck through the same contracts.
  5. Add generated-output ignores for runs/ and reports/.
  6. Add container design after the CLI baseline is stable.
  7. Add an optional service API around the CLI job model.
  8. Add OSCAL export and procedural evidence-pack support after the internal evidence model proves itself with executable extensions.

The first extension SDK contract is documented in docs/EXTENSION-SDK.md.