Add challenge and exclusion review handling

2026-05-16 02:58:18 +02:00
parent c8ac42154c
commit b1dff0440d
16 changed files with 644 additions and 21 deletions

@@ -803,6 +803,9 @@ Use separate concepts:
- defect: unexpected product or process failure.
The report must make these visible separately.
The current policy layer loads challenge and exclusion refs from assessment
profiles, annotates findings and evidence, and keeps `unexpected_findings`
visible for gate semantics unless a finding is separately expected or waived.
### Source Locking

@@ -27,7 +27,8 @@ Every run needs:
The target profile describes the candidate system or artifact being assessed.
The assessment profile selects frameworks, extensions, check groups, runtime
policy, waivers, expectations, and output policy.
policy, expectations, waivers, challenges, authority exclusions, and output
policy.
## CLI Flow
@@ -99,10 +100,10 @@ artifacts/
```
`sources.lock.json` records the framework refs, extension versions, mapping
sets, profile snapshots, policy refs, authority refs, and extension metadata
hooks used for the run. `reports/submission-package.json` points at the
reviewable package files, includes checksums where files exist, carries the raw
artifact manifest, and repeats the certification boundary. It is a portable
sets, profile snapshots, policy and review refs, authority refs, and extension
metadata hooks used for the run. `reports/submission-package.json` points at
the reviewable package files, includes checksums where files exist, carries the
raw artifact manifest, and repeats the certification boundary. It is a portable
handoff manifest for preparation evidence, not an authority-specific final
submission.
@@ -200,6 +201,23 @@ Individual evidence items use:
- `expected_gap`
- `infrastructure_error`
## Review State
Assessment profiles may reference:
- `expectations_ref`: known target posture, optional scope, or accepted gaps,
- `waivers_ref`: approved, time-bounded exceptions,
- `challenges_ref`: review records that dispute a finding, check, mapping, or
native result,
- `exclusions_ref`: authority or program exclusions that apply to selected
findings.
Challenges and exclusions annotate findings and evidence. They do not silently
turn failures into passing evidence and they do not reduce the
`unexpected_findings` count used by default gates. Retained summaries expose
separate counts for expected findings, waived findings, challenged findings,
authority exclusions, unresolved defects, and unresolved review items.
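As a rough illustration of how these counts stay separate, the sketch below recomputes them from a list of finding dicts. The function name `summarize_review_state` and the sample findings are hypothetical. The field names and status values follow the finding schema and policy layer in this change, but this is a sketch under those assumptions, not the project's implementation.

```python
from typing import Any


def summarize_review_state(findings: list[dict[str, Any]]) -> dict[str, int]:
    """Hypothetical helper: count review states without hiding gate failures."""
    return {
        # A challenge or exclusion alone never removes a finding from this count;
        # only an expectation or a waiver does.
        "unexpected_findings": sum(
            1 for f in findings if not f.get("expected") and not f.get("waiver_ref")
        ),
        "expected_findings": sum(1 for f in findings if f.get("expected")),
        "waived_findings": sum(1 for f in findings if f.get("waiver_ref")),
        "challenged_findings": sum(1 for f in findings if f.get("challenge_ref")),
        "authority_exclusions": sum(1 for f in findings if f.get("exclusion_ref")),
        "unresolved_defects": sum(
            1 for f in findings if f.get("review_status") == "unresolved_defect"
        ),
        "unresolved_review_items": sum(
            1
            for f in findings
            if f.get("review_status") in {"challenged", "authority_excluded"}
        ),
    }


# Illustrative findings: one challenged, one waived.
findings = [
    {"expected": False, "waiver_ref": None, "challenge_ref": "c-1",
     "exclusion_ref": None, "review_status": "challenged"},
    {"expected": True, "waiver_ref": "w-1", "challenge_ref": None,
     "exclusion_ref": None, "review_status": "waived"},
]
summary = summarize_review_state(findings)
```

In this sample the challenged finding still counts as unexpected, which is the behavior the paragraph above describes for default gates.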
## Candidate System Checklist
Before starting a run against candidate software, confirm:

@@ -8,8 +8,8 @@ Created: 2026-05-07
Compliance evidence packs cover frameworks where guide-board cannot rely on an
official executable harness. They help prepare and perform assessments by
organizing evidence requests, expected artifacts, reviewer workflow, waivers,
and run reports. They do not replace auditors, accredited certification bodies,
legal counsel, or official standard text.
challenges, authority exclusions, and run reports. They do not replace auditors,
accredited certification bodies, legal counsel, or official standard text.
Examples include GDPR, SOC 2, HIPAA, NF Z 42-013, NF 461, ISO 14641, ISO 15489,
and similar procedural or control-oriented frameworks.
@@ -83,7 +83,7 @@ Each request should include:
Requests should be phrased as collection guidance, not as legal conclusions.
## Waivers And Expected Gaps
## Review Policy Records
Evidence packs use the same expectation and waiver model as executable
extensions.
@@ -103,6 +103,16 @@ Use waivers for:
Every waiver should include owner, reason, approval status, and expiry.
Use challenges for disputed checks, disputed mappings, questions about imported
native results, or evidence that needs a reviewer decision before it can be
treated as a defect. Use authority exclusions only when a program, standard, or
authorized reviewer excludes a requirement or check from the assessment scope.
Both records should cite stable requirement refs, check refs, evidence refs, or
authority source refs rather than reproducing restricted standard text.
Challenges and exclusions make review state visible; they do not by themselves
claim compliance or remove unexpected findings from default gates.
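A challenge record might look like the following when assembled in Python before being written to a challenge set file. The field names are the required properties from `docs/schemas/challenge-set.schema.json`. Every value, and the helper `missing_challenge_fields`, is illustrative only and not part of the project.

```python
# Required keys for one challenge record, copied from the challenge set schema.
REQUIRED_CHALLENGE_FIELDS = {
    "id", "requirement_refs", "check_refs", "evidence_refs", "result_refs",
    "classification_refs", "authority_source_refs", "owner", "review_status",
    "rationale", "created_at", "review_due_at", "expires_at",
    "native_challenge_id", "metadata",
}


def missing_challenge_fields(record: dict) -> set[str]:
    """Hypothetical pre-check: return required keys that the record lacks."""
    return REQUIRED_CHALLENGE_FIELDS - set(record)


# Illustrative record: it cites refs rather than reproducing standard text.
challenge = {
    "id": "challenge-example-1",
    "requirement_refs": ["example.requirement"],
    "check_refs": [],
    "evidence_refs": [],
    "result_refs": ["fail"],
    "classification_refs": [],
    "authority_source_refs": ["example-authority:section-1"],
    "owner": "assessment-team",
    "review_status": "open",
    "rationale": "The mapping to the cited requirement is disputed.",
    "created_at": "2026-05-16",
    "review_due_at": None,
    "expires_at": None,
    "native_challenge_id": None,
    "metadata": {},
}

missing = missing_challenge_fields(challenge)
```

A check like this is only a convenience before writing the file; validation against the JSON schema remains the authoritative test.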
## Framework Notes
GDPR packs should emphasize processing inventory, lawful basis records, data
@@ -129,6 +139,7 @@ extensions:
- normalized evidence,
- findings,
- review annotations for expectations, waivers, challenges, and exclusions,
- mapping records,
- assessment packages,
- retention summaries,

@@ -250,6 +250,33 @@ Expectation sets mark known posture as expected. Waiver sets mark approved,
time-bounded exceptions. Both are applied after findings are generated, and the
assessment package records policy summary counts.
## Challenges And Authority Exclusions
Assessment profiles may also reference challenge and exclusion sets:
```json
{
"challenges_ref": "profiles/challenges/example.json",
"exclusions_ref": "profiles/exclusions/example.json"
}
```
Challenge sets validate against `docs/schemas/challenge-set.schema.json`.
Exclusion sets validate against `docs/schemas/exclusion-set.schema.json`.
Records can match findings by requirement refs, check refs, evidence refs,
result refs, or classification refs. They also carry owner, review status,
rationale, authority source refs, review dates, optional expiry, native IDs,
and free-form metadata.
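The matching described above can be sketched as follows. The names `record_matches` and `_overlaps` are hypothetical. Two behaviors are assumptions rather than documented guarantees: refs are compared by exact string equality, and an empty ref list on a record places no constraint on that key (inferred from the test fixture in this change, whose records leave `evidence_refs` empty and still match).

```python
from typing import Any


def _overlaps(values: list[str], wanted: list[str]) -> bool:
    # Assumption: an empty list on the record means "do not filter on this key".
    return not wanted or any(value in wanted for value in values)


def record_matches(finding: dict[str, Any], record: dict[str, Any]) -> bool:
    """Hypothetical matcher: every non-empty ref list on the record must match."""
    return (
        _overlaps(finding.get("requirement_refs", []), record.get("requirement_refs", []))
        and _overlaps([finding.get("check_id", "")], record.get("check_refs", []))
        and _overlaps(finding.get("evidence_refs", []), record.get("evidence_refs", []))
        and _overlaps([finding.get("status", "")], record.get("result_refs", []))
        and _overlaps(
            [finding.get("classification", "")], record.get("classification_refs", [])
        )
    )


# Illustrative finding and record; the evidence ID is made up for the example.
finding = {
    "check_id": "check-group:review-noop:review",
    "status": "blocked",
    "classification": "runner_not_implemented",
    "requirement_refs": ["review.requirement"],
    "evidence_refs": ["evidence:example-1"],
}
record = {
    "requirement_refs": ["review.requirement"],
    "check_refs": ["check-group:review-noop:review"],
    "evidence_refs": [],
    "result_refs": ["blocked"],
    "classification_refs": ["runner_not_implemented"],
}
matched = record_matches(finding, record)
```

Because all keys are combined with `and`, adding more refs to a record narrows what it matches rather than widening it.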
Use challenges when an extension author or assessment team believes a finding
needs review because a check is invalid, a native harness result is disputed, or
a mapping is wrong. Use exclusions when an authority or program explicitly
removes a requirement, check, or result from the assessment scope. The core
preserves these distinctions in findings, evidence review annotations,
assessment packages, reports, and retained summaries, but default gate semantics
still count the underlying finding as unexpected unless it is separately
expected or waived.
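A minimal sketch of that distinction, assuming the ordering used by the policy layer in this change (exclusions are applied before challenges, and `review_status` only moves away from `unresolved_defect`). The helper `annotate_review` and the record IDs are hypothetical.

```python
from typing import Any, Optional


def annotate_review(
    finding: dict[str, Any],
    exclusion_id: Optional[str] = None,
    challenge_id: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: record review refs without touching `expected`."""
    annotated = dict(finding)
    if exclusion_id is not None:
        annotated["exclusion_ref"] = exclusion_id
        if annotated.get("review_status") == "unresolved_defect":
            annotated["review_status"] = "authority_excluded"
    if challenge_id is not None:
        annotated["challenge_ref"] = challenge_id
        # Only an unresolved defect becomes "challenged"; an earlier exclusion wins.
        if annotated.get("review_status") == "unresolved_defect":
            annotated["review_status"] = "challenged"
    return annotated


base = {"expected": False, "waiver_ref": None, "review_status": "unresolved_defect"}
both = annotate_review(base, exclusion_id="exclusion-1", challenge_id="challenge-1")
only_challenge = annotate_review(base, challenge_id="challenge-1")
```

In both results `expected` stays `False`, so a gate that counts unexpected findings still sees the finding even though its review state is now visible.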
## Python Runner Contract
A Python runner receives one context object and returns one result object.

@@ -17,6 +17,8 @@
"evidence_refs",
"artifact_manifest",
"waivers",
"challenges",
"exclusions",
"certification_boundary",
"created_at"
],
@@ -34,6 +36,8 @@
"evidence_refs": { "type": "array", "items": { "type": "string" } },
"artifact_manifest": { "type": "array", "items": { "type": "object" } },
"waivers": { "type": "array", "items": { "type": "object" } },
"challenges": { "type": "array", "items": { "type": "object" } },
"exclusions": { "type": "array", "items": { "type": "object" } },
"certification_boundary": { "type": "string" },
"created_at": { "type": "string" }
}

@@ -28,6 +28,8 @@
},
"expectations_ref": { "type": ["string", "null"] },
"waivers_ref": { "type": ["string", "null"] },
"challenges_ref": { "type": ["string", "null"] },
"exclusions_ref": { "type": ["string", "null"] },
"output_policy": { "type": "object" },
"retention_policy": { "type": "object" },
"runtime_policy": { "type": "object" }

@@ -0,0 +1,56 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Guide Board Challenge Set",
"type": "object",
"additionalProperties": false,
"required": [
"id",
"target_profile_ref",
"challenges"
],
"properties": {
"id": { "type": "string" },
"target_profile_ref": { "type": "string" },
"challenges": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"id",
"requirement_refs",
"check_refs",
"evidence_refs",
"result_refs",
"classification_refs",
"authority_source_refs",
"owner",
"review_status",
"rationale",
"created_at",
"review_due_at",
"expires_at",
"native_challenge_id",
"metadata"
],
"properties": {
"id": { "type": "string" },
"requirement_refs": { "type": "array", "items": { "type": "string" } },
"check_refs": { "type": "array", "items": { "type": "string" } },
"evidence_refs": { "type": "array", "items": { "type": "string" } },
"result_refs": { "type": "array", "items": { "type": "string" } },
"classification_refs": { "type": "array", "items": { "type": "string" } },
"authority_source_refs": { "type": "array", "items": { "type": "string" } },
"owner": { "type": "string" },
"review_status": { "type": "string" },
"rationale": { "type": "string" },
"created_at": { "type": "string" },
"review_due_at": { "type": ["string", "null"] },
"expires_at": { "type": ["string", "null"] },
"native_challenge_id": { "type": ["string", "null"] },
"metadata": { "type": "object" }
}
}
}
}
}

@@ -42,6 +42,7 @@
},
"observations": { "type": "array", "items": { "type": "string" } },
"facts": { "type": "object" },
"review": { "type": "object" },
"requirement_refs": { "type": "array", "items": { "type": "string" } },
"artifact_refs": { "type": "array", "items": { "type": "string" } },
"started_at": { "type": "string" },

@@ -0,0 +1,60 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Guide Board Authority Exclusion Set",
"type": "object",
"additionalProperties": false,
"required": [
"id",
"target_profile_ref",
"exclusions"
],
"properties": {
"id": { "type": "string" },
"target_profile_ref": { "type": "string" },
"exclusions": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"id",
"authority_ref",
"requirement_refs",
"check_refs",
"evidence_refs",
"result_refs",
"classification_refs",
"authority_source_refs",
"owner",
"approved_by",
"review_status",
"rationale",
"created_at",
"review_due_at",
"expires_at",
"native_exclusion_id",
"metadata"
],
"properties": {
"id": { "type": "string" },
"authority_ref": { "type": "string" },
"requirement_refs": { "type": "array", "items": { "type": "string" } },
"check_refs": { "type": "array", "items": { "type": "string" } },
"evidence_refs": { "type": "array", "items": { "type": "string" } },
"result_refs": { "type": "array", "items": { "type": "string" } },
"classification_refs": { "type": "array", "items": { "type": "string" } },
"authority_source_refs": { "type": "array", "items": { "type": "string" } },
"owner": { "type": "string" },
"approved_by": { "type": ["string", "null"] },
"review_status": { "type": "string" },
"rationale": { "type": "string" },
"created_at": { "type": "string" },
"review_due_at": { "type": ["string", "null"] },
"expires_at": { "type": ["string", "null"] },
"native_exclusion_id": { "type": ["string", "null"] },
"metadata": { "type": "object" }
}
}
}
}
}

@@ -14,7 +14,10 @@
"evidence_refs",
"expected",
"waiver_ref",
"challenge_ref",
"exclusion_ref",
"policy_ref",
"review_status",
"remediation"
],
"properties": {
@@ -28,7 +31,10 @@
"evidence_refs": { "type": "array", "items": { "type": "string" } },
"expected": { "type": "boolean" },
"waiver_ref": { "type": ["string", "null"] },
"challenge_ref": { "type": ["string", "null"] },
"exclusion_ref": { "type": ["string", "null"] },
"policy_ref": { "type": ["string", "null"] },
"review_status": { "type": "string" },
"remediation": { "type": ["string", "null"] }
}
}

@@ -35,7 +35,15 @@ def run_assessment(
assert_valid(item, "evidence-item")
findings = _findings_for_evidence(run_id, evidence)
findings, policy_summary, applied_waivers = apply_policy(root, plan, findings)
(
findings,
policy_summary,
applied_waivers,
applied_challenges,
applied_exclusions,
) = apply_policy(root, plan, evidence, findings)
for item in evidence:
assert_valid(item, "evidence-item")
for finding in findings:
assert_valid(finding, "finding")
@@ -52,6 +60,8 @@ def run_assessment(
mapping_summary,
policy_summary,
applied_waivers,
applied_challenges,
applied_exclusions,
created_at,
)
assert_valid(assessment_package, "assessment-package")
@@ -308,6 +318,7 @@ def _findings_for_evidence(run_id: str, evidence: list[dict[str, Any]]) -> list[
for item in evidence:
if item["result"] not in {"blocked", "fail", "infrastructure_error"}:
continue
expected = _expected_for_item(item)
findings.append(
{
"id": f"finding:{item['check_id']}",
@@ -318,9 +329,12 @@ def _findings_for_evidence(run_id: str, evidence: list[dict[str, Any]]) -> list[
"classification": _classification_for_item(item),
"requirement_refs": item["requirement_refs"],
"evidence_refs": [item["id"]],
"expected": _expected_for_item(item),
"expected": expected,
"waiver_ref": None,
"challenge_ref": None,
"exclusion_ref": None,
"policy_ref": None,
"review_status": "expected" if expected else "unresolved_defect",
"remediation": _remediation_for_item(item),
}
)
@@ -382,6 +396,8 @@ def _assessment_package(
mapping_summary: dict[str, Any],
policy_summary: dict[str, Any],
applied_waivers: list[dict[str, Any]],
applied_challenges: list[dict[str, Any]],
applied_exclusions: list[dict[str, Any]],
created_at: str,
) -> dict[str, Any]:
summary = dict(Counter(item["result"] for item in evidence))
@@ -401,6 +417,8 @@ def _assessment_package(
"evidence_refs": [item["id"] for item in evidence],
"artifact_manifest": artifact_manifest,
"waivers": applied_waivers,
"challenges": applied_challenges,
"exclusions": applied_exclusions,
"certification_boundary": "Guide Board produces preparation evidence only and does not issue certifications or audit assurance.",
"created_at": created_at,
}
@@ -452,6 +470,7 @@ def _markdown_report(run_metadata: dict[str, Any], package: dict[str, Any]) -> s
summary_lines = "- no evidence produced"
mapping_lines = _mapping_summary_lines(package)
policy_lines = _policy_summary_lines(package)
review_lines = _review_summary_lines(package)
return "\n".join(
[
@@ -473,6 +492,10 @@ def _markdown_report(run_metadata: dict[str, Any], package: dict[str, Any]) -> s
"",
policy_lines,
"",
"## Review",
"",
review_lines,
"",
"## Boundary",
"",
package["certification_boundary"],
@@ -502,10 +525,27 @@ def _policy_summary_lines(package: dict[str, Any]) -> str:
f"- applied expectations: {summary.get('applied_expectations', 0)}",
f"- applied waivers: {summary.get('applied_waivers', 0)}",
f"- unexpected findings: {summary.get('unexpected_findings', 0)}",
f"- challenged findings: {summary.get('challenged_findings', 0)}",
f"- authority exclusions: {summary.get('authority_exclusions', 0)}",
f"- unresolved defects: {summary.get('unresolved_defects', 0)}",
]
)


def _review_summary_lines(package: dict[str, Any]) -> str:
    findings = package.get("findings", [])
    if not findings:
        return "- no findings requiring review"
    counts = Counter(
        finding.get("review_status", "unreviewed")
        for finding in findings
        if isinstance(finding, dict)
    )
    return "\n".join(
        f"- {status}: {count}" for status, count in sorted(counts.items())
    )


def _run_status(evidence: list[dict[str, Any]]) -> str:
if any(item["result"] == "fail" for item in evidence):
return "failed"

@@ -262,6 +262,18 @@ def _build_source_lock(
assessment.get("waivers_ref"),
"waiver-set",
),
"challenges": _optional_policy_source_record(
root,
assessment_path,
assessment.get("challenges_ref"),
"challenge-set",
),
"exclusions": _optional_policy_source_record(
root,
assessment_path,
assessment.get("exclusions_ref"),
"exclusion-set",
),
},
"authorities": _authority_source_records(extensions),
"metadata_hooks": {

@@ -13,20 +13,36 @@ from guide_board.schema import assert_valid
def apply_policy(
root: Path,
plan: dict[str, Any],
evidence: list[dict[str, Any]],
findings: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], dict[str, Any], list[dict[str, Any]]]:
) -> tuple[
list[dict[str, Any]],
dict[str, Any],
list[dict[str, Any]],
list[dict[str, Any]],
list[dict[str, Any]],
]:
expectations = _load_optional_set(root, plan, "expectations_ref", "expectation-set")
waiver_set = _load_optional_set(root, plan, "waivers_ref", "waiver-set")
challenge_set = _load_optional_set(root, plan, "challenges_ref", "challenge-set")
exclusion_set = _load_optional_set(root, plan, "exclusions_ref", "exclusion-set")
waivers = waiver_set.get("waivers", []) if waiver_set else []
challenges = challenge_set.get("challenges", []) if challenge_set else []
exclusions = exclusion_set.get("exclusions", []) if exclusion_set else []
applied_expectations = 0
applied_waivers: list[dict[str, Any]] = []
applied_challenges: list[dict[str, Any]] = []
applied_exclusions: list[dict[str, Any]] = []
evidence_by_id = {item["id"]: item for item in evidence}
for finding in findings:
for expectation in expectations.get("expectations", []) if expectations else []:
if _matches_rule(finding, expectation):
finding["expected"] = expectation["expected"]
finding["policy_ref"] = expectation["id"]
finding["review_status"] = "expected" if expectation["expected"] else "unresolved_defect"
_annotate_evidence(evidence_by_id, finding, "expectation_refs", expectation["id"])
applied_expectations += 1
break
@@ -37,20 +53,60 @@ def apply_policy(
finding["waiver_ref"] = waiver["id"]
finding["expected"] = True
finding["policy_ref"] = waiver["id"]
finding["review_status"] = "waived"
finding["remediation"] = f"Waived: {waiver['reason']}"
applied_waivers.append(waiver)
_annotate_evidence(evidence_by_id, finding, "waiver_refs", waiver["id"])
break
for exclusion in exclusions:
if not _review_record_active(exclusion):
continue
if _matches_rule(finding, exclusion):
finding["exclusion_ref"] = exclusion["id"]
if finding.get("review_status") == "unresolved_defect":
finding["review_status"] = "authority_excluded"
applied_exclusions.append(exclusion)
_annotate_evidence(evidence_by_id, finding, "exclusion_refs", exclusion["id"])
break
for challenge in challenges:
if not _review_record_active(challenge):
continue
if _matches_rule(finding, challenge):
finding["challenge_ref"] = challenge["id"]
if finding.get("review_status") == "unresolved_defect":
finding["review_status"] = "challenged"
applied_challenges.append(challenge)
_annotate_evidence(evidence_by_id, finding, "challenge_refs", challenge["id"])
break
policy_summary = {
"expectations_ref": plan["assessment_profile_snapshot"].get("expectations_ref"),
"waivers_ref": plan["assessment_profile_snapshot"].get("waivers_ref"),
"challenges_ref": plan["assessment_profile_snapshot"].get("challenges_ref"),
"exclusions_ref": plan["assessment_profile_snapshot"].get("exclusions_ref"),
"applied_expectations": applied_expectations,
"applied_waivers": len(applied_waivers),
"challenged_findings": _unique_applied_count(findings, "challenge_ref"),
"authority_exclusions": _unique_applied_count(findings, "exclusion_ref"),
"unexpected_findings": sum(
1 for finding in findings if not finding.get("expected") and not finding.get("waiver_ref")
),
"unresolved_defects": sum(
1 for finding in findings if finding.get("review_status") == "unresolved_defect"
),
"unresolved_review_items": sum(
1 for finding in findings if finding.get("review_status") in {"challenged", "authority_excluded"}
),
}
return findings, policy_summary, applied_waivers
return (
findings,
policy_summary,
_dedupe_records(applied_waivers),
_dedupe_records(applied_challenges),
_dedupe_records(applied_exclusions),
)
def _load_optional_set(
@@ -94,6 +150,7 @@ def _matches_rule(finding: dict[str, Any], rule: dict[str, Any]) -> bool:
return (
_matches_any(finding.get("requirement_refs", []), rule.get("requirement_refs", []))
and _matches_any([finding.get("check_id", "")], rule.get("check_refs", []))
and _matches_any(finding.get("evidence_refs", []), rule.get("evidence_refs", []))
and _matches_scalar(finding.get("status"), rule.get("result_refs", []))
and _matches_scalar(finding.get("classification"), rule.get("classification_refs", []))
)
@@ -122,3 +179,57 @@ def _waiver_active(waiver: dict[str, Any]) -> bool:
except ValueError:
return False
return expiry >= date.today()


def _review_record_active(record: dict[str, Any]) -> bool:
    status = record.get("review_status")
    if status in {"rejected", "withdrawn", "closed", "expired"}:
        return False
    expires_at = record.get("expires_at")
    if not expires_at:
        return True
    try:
        expiry = date.fromisoformat(expires_at)
    except ValueError:
        return False
    return expiry >= date.today()


def _annotate_evidence(
    evidence_by_id: dict[str, dict[str, Any]],
    finding: dict[str, Any],
    ref_key: str,
    ref_value: str,
) -> None:
    for evidence_ref in finding.get("evidence_refs", []):
        item = evidence_by_id.get(evidence_ref)
        if item is None:
            continue
        review = item.setdefault(
            "review",
            {
                "expectation_refs": [],
                "waiver_refs": [],
                "challenge_refs": [],
                "exclusion_refs": [],
            },
        )
        refs = review.setdefault(ref_key, [])
        if ref_value not in refs:
            refs.append(ref_value)


def _unique_applied_count(findings: list[dict[str, Any]], ref_name: str) -> int:
    return sum(1 for finding in findings if finding.get(ref_name))


def _dedupe_records(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    seen = set()
    deduped = []
    for record in records:
        record_id = record.get("id")
        if not isinstance(record_id, str) or record_id in seen:
            continue
        seen.add(record_id)
        deduped.append(record)
    return deduped

@@ -37,6 +37,10 @@ def build_retention_summary(
"unexpected_findings": policy_summary.get("unexpected_findings", 0),
"expected_findings": sum(1 for finding in findings if finding.get("expected")),
"waived_findings": sum(1 for finding in findings if finding.get("waiver_ref")),
"challenged_findings": policy_summary.get("challenged_findings", 0),
"authority_exclusions": policy_summary.get("authority_exclusions", 0),
"unresolved_defects": policy_summary.get("unresolved_defects", 0),
"unresolved_review_items": policy_summary.get("unresolved_review_items", 0),
"mapping_target_count": len(
assessment_package.get("mapping_summary", {}).get("targets", [])
),
@@ -197,6 +201,10 @@ def _run_projection(run: dict[str, Any]) -> dict[str, Any]:
"unexpected_findings": _summary_int(summary, "unexpected_findings"),
"finding_count": _summary_int(summary, "finding_count"),
"artifact_count": _summary_int(summary, "artifact_count"),
"challenged_findings": _summary_int(summary, "challenged_findings"),
"authority_exclusions": _summary_int(summary, "authority_exclusions"),
"unresolved_defects": _summary_int(summary, "unresolved_defects"),
"unresolved_review_items": _summary_int(summary, "unresolved_review_items"),
"run_dir": run.get("run_dir"),
}
@@ -211,9 +219,10 @@ def _trend_between(
"status_changed": False,
"unexpected_findings_delta": 0,
"finding_count_delta": 0,
"artifact_count_delta": 0,
"evidence_result_deltas": {},
}
"artifact_count_delta": 0,
"unresolved_review_items_delta": 0,
"evidence_result_deltas": {},
}
previous_summary = previous.get("summary", {})
latest_summary = latest.get("summary", {})
@@ -230,6 +239,9 @@ def _trend_between(
artifact_delta = _summary_int(latest_summary, "artifact_count") - _summary_int(
previous_summary, "artifact_count"
)
review_delta = _summary_int(latest_summary, "unresolved_review_items") - _summary_int(
previous_summary, "unresolved_review_items"
)
previous_status = _status_for(previous)
latest_status = _status_for(latest)
@@ -239,6 +251,7 @@ def _trend_between(
"unexpected_findings_delta": unexpected_delta,
"finding_count_delta": finding_delta,
"artifact_count_delta": artifact_delta,
"unresolved_review_items_delta": review_delta,
"evidence_result_deltas": evidence_deltas,
}

@@ -334,6 +334,69 @@ class CoreArchitectureTests(unittest.TestCase):
self.assertEqual(len(mappings), 1)
self.assertEqual(mappings[0]["target_id"], "profile-readiness")
def test_applies_challenges_and_exclusions_without_hiding_gate_failures(self) -> None:
with TemporaryDirectory() as temporary_directory:
temp_root = Path(temporary_directory)
extension_dir = temp_root / "review-noop"
_write_review_extension(extension_dir)
target_path = temp_root / "review-target.json"
assessment_path = temp_root / "review-assessment.json"
challenge_path = temp_root / "review-challenges.json"
exclusion_path = temp_root / "review-exclusions.json"
_write_review_target(target_path)
_write_review_assessment(assessment_path)
_write_review_challenges(challenge_path)
_write_review_exclusions(exclusion_path)
result = run_assessment(
ROOT,
target_path,
assessment_path,
temp_root / "runs" / "review",
[extension_dir],
)
run_dir = Path(result["run_dir"])
evidence = json.loads(
(run_dir / "normalized" / "evidence.json").read_text(encoding="utf-8")
)["evidence"]
assessment_package = json.loads(
(run_dir / "reports" / "assessment-package.json").read_text(encoding="utf-8")
)
retention = json.loads(
(run_dir / "retention-summary.json").read_text(encoding="utf-8")
)
report = (run_dir / "reports" / "report.md").read_text(encoding="utf-8")
self.assertEqual(result["status"], "blocked")
finding = assessment_package["findings"][0]
self.assertEqual(finding["challenge_ref"], "challenge-review-blocked")
self.assertEqual(finding["exclusion_ref"], "exclusion-review-blocked")
self.assertEqual(finding["review_status"], "authority_excluded")
self.assertFalse(finding["expected"])
self.assertEqual(assessment_package["policy_summary"]["unexpected_findings"], 1)
self.assertEqual(assessment_package["policy_summary"]["challenged_findings"], 1)
self.assertEqual(assessment_package["policy_summary"]["authority_exclusions"], 1)
self.assertEqual(assessment_package["policy_summary"]["unresolved_defects"], 0)
self.assertEqual(
evidence[1]["review"]["challenge_refs"],
["challenge-review-blocked"],
)
self.assertEqual(
evidence[1]["review"]["exclusion_refs"],
["exclusion-review-blocked"],
)
self.assertEqual(assessment_package["challenges"][0]["owner"], "qa")
self.assertEqual(assessment_package["exclusions"][0]["authority_ref"], "review-authority")
self.assertEqual(retention["summary"]["challenged_findings"], 1)
self.assertEqual(retention["summary"]["authority_exclusions"], 1)
self.assertEqual(retention["summary"]["unresolved_review_items"], 1)
self.assertIn("- authority_excluded: 1", report)
gate = evaluate_trend_gates(build_trend_summary(temp_root / "runs"))
self.assertEqual(gate["status"], "failed")
checks = {check["id"]: check for check in gate["groups"][0]["checks"]}
self.assertEqual(checks["unexpected-findings"]["observed"], 1)
def test_serves_local_api_run_lifecycle(self) -> None:
with TemporaryDirectory() as temporary_directory:
service = start_service(ROOT, host="127.0.0.1", port=0)
@@ -742,5 +805,166 @@ def _write_schema_assessment(path: Path, runtime_policy: dict[str, object]) -> N
)
def _write_review_extension(extension_dir: Path) -> None:
extension_dir.mkdir(parents=True, exist_ok=True)
(extension_dir / "extension.json").write_text(
json.dumps(
{
"id": "review-noop",
"name": "Review No-op",
"version": "0.1.0",
"extension_type": "repository_quality",
"lifecycle_status": "incubating",
"supported_frameworks": ["review.framework.v1"],
"authorities": ["review-authority"],
"profile_schemas": ["target-profile", "assessment-profile"],
"check_groups": [
{
"id": "review",
"name": "Review",
"check_type": "repository_quality",
"requirement_refs": ["review.requirement"],
"runner_ref": "external-review",
}
],
"preflight_runner": None,
"runner_entrypoints": [
{
"id": "external-review",
"kind": "external",
"module_path": None,
"callable": None,
"command": None,
"metadata": {"test_suite_id": "review-suite"},
"description": "External runner used to produce reviewable blocked evidence.",
}
],
"normalizers": [],
"mappings": [],
"report_fragments": [],
"dependencies": [],
"restricted_assets": [],
"certification_boundary": "Review fixture only.",
}
),
encoding="utf-8",
)
def _write_review_target(path: Path) -> None:
path.write_text(
json.dumps(
{
"id": "review-target",
"subject_type": "repository",
"subject_name": "Review Target",
"environment": "test",
"scope": ["review"],
"endpoints": [],
"artifacts": [],
"credentials_ref": None,
"declared_capabilities": [],
"known_gaps": [],
}
),
encoding="utf-8",
)
def _write_review_assessment(path: Path) -> None:
path.write_text(
json.dumps(
{
"id": "review-assessment",
"framework_refs": ["review.framework.v1"],
"extension_refs": ["review-noop"],
"target_profile_ref": "review-target",
"selected_check_groups": {"review-noop": ["review"]},
"expectations_ref": None,
"waivers_ref": None,
"challenges_ref": "review-challenges.json",
"exclusions_ref": "review-exclusions.json",
"output_policy": {
"report_formats": ["json", "markdown"],
"artifact_retention": "summary-only",
},
"retention_policy": {
"summary_days": 365,
"raw_artifact_days": 0,
},
"runtime_policy": {
"offline": True,
"timeout_seconds": 2,
},
}
),
encoding="utf-8",
)
def _write_review_challenges(path: Path) -> None:
path.write_text(
json.dumps(
{
"id": "review-challenges",
"target_profile_ref": "review-target",
"challenges": [
{
"id": "challenge-review-blocked",
"requirement_refs": ["review.requirement"],
"check_refs": ["check-group:review-noop:review"],
"evidence_refs": [],
"result_refs": ["blocked"],
"classification_refs": ["runner_not_implemented"],
"authority_source_refs": ["review-authority:rule-1"],
"owner": "qa",
"review_status": "open",
"rationale": "The external suite is not wired in this fixture.",
"created_at": "2026-05-16",
"review_due_at": "2026-06-16",
"expires_at": None,
"native_challenge_id": "native-challenge-1",
"metadata": {"kind": "fixture"},
}
],
}
),
encoding="utf-8",
)
def _write_review_exclusions(path: Path) -> None:
path.write_text(
json.dumps(
{
"id": "review-exclusions",
"target_profile_ref": "review-target",
"exclusions": [
{
"id": "exclusion-review-blocked",
"authority_ref": "review-authority",
"requirement_refs": ["review.requirement"],
"check_refs": ["check-group:review-noop:review"],
"evidence_refs": [],
"result_refs": ["blocked"],
"classification_refs": ["runner_not_implemented"],
"authority_source_refs": ["review-authority:rule-1"],
"owner": "qa",
"approved_by": "authority-reviewer",
"review_status": "approved",
"rationale": "Fixture demonstrates authority exclusion annotation.",
"created_at": "2026-05-16",
"review_due_at": "2026-06-16",
"expires_at": None,
"native_exclusion_id": "native-exclusion-1",
"metadata": {"kind": "fixture"},
}
],
}
),
encoding="utf-8",
)
if __name__ == "__main__":
unittest.main()

@@ -4,12 +4,12 @@ type: workplan
title: "Challenge And Exclusion Handling"
repo: guide-board
domain: markitect
status: active
status: completed
owner: codex
planning_priority: high
planning_order: 5
created: "2026-05-15"
updated: "2026-05-15"
updated: "2026-05-16"
state_hub_workstream_id: "fb11e1c7-6c0c-4ec7-a163-da98b2fe9f8f"
---
@@ -42,7 +42,7 @@ but the core should preserve them without embedding domain policy.
```task
id: GUIDE-BOARD-WP-0005-T001
status: todo
status: done
priority: high
state_hub_task_id: "6ff4e6f7-bce6-4e7f-a5af-e0c67cfa7e55"
```
@@ -57,11 +57,21 @@ Acceptance:
- Keep the data contract usable by executable harnesses, hosted suites, and
procedural packs.
Progress:
- Added `docs/schemas/challenge-set.schema.json` and
`docs/schemas/exclusion-set.schema.json`.
- Added optional `challenges_ref` and `exclusions_ref` assessment profile
fields.
- Supported requirement, check, evidence, result, classification, authority
source, owner, review status, rationale, review date, expiry, native ID, and
metadata fields.
## D5.2 - Policy Application And Finding Annotation
```task
id: GUIDE-BOARD-WP-0005-T002
status: todo
status: done
priority: high
state_hub_task_id: "fd384bd3-40c4-4344-8b7d-cb123dbf2cac"
```
@@ -76,11 +86,20 @@ Acceptance:
- Add tests that prove challenge and exclusion records affect reporting without
corrupting gate semantics.
Progress:
- Loaded challenge and exclusion refs through the policy layer.
- Annotated findings with challenge refs, exclusion refs, and review status.
- Annotated matching evidence with review refs.
- Kept default `unexpected_findings` gate semantics visible unless a finding is
separately expected or waived.
- Added tests proving challenged and excluded findings remain gate-visible.
## D5.3 - Report Visibility And Review Workflow
```task
id: GUIDE-BOARD-WP-0005-T003
status: todo
status: done
priority: medium
state_hub_task_id: "791071c0-8a9a-462b-83b3-75548bb8524f"
```
@@ -94,11 +113,19 @@ Acceptance:
run.
- Document how an operator should treat challenged or excluded findings.
Progress:
- Added Markdown report review summaries.
- Added challenge, exclusion, unresolved defect, and unresolved review counts to
retention summaries and trend projections.
- Included applied challenge and exclusion records in JSON assessment packages.
- Exposed review counts through existing retained run helpers.
## D5.4 - Tests And Documentation
```task
id: GUIDE-BOARD-WP-0005-T004
status: todo
status: done
priority: medium
state_hub_task_id: "43b966da-af8d-479b-93bd-6b6741fdab37"
```
@@ -111,6 +138,14 @@ Acceptance:
- Update assessment operations, extension SDK, and compliance evidence pack docs.
- Keep certification boundary language explicit.
Progress:
- Added focused schema and policy tests through a fixture extension scenario.
- Updated assessment operations, extension SDK, compliance evidence pack, and
architecture docs.
- Kept boundary language explicit: challenges and exclusions are review state,
not certification conclusions.
## Definition Of Done
- The core has separate, tested concepts for expectations, waivers, challenges,