Follow-up workplan for more in-depth document information processing

2026-05-03 23:05:11 +02:00
parent 28043f491e
commit 910503701e
2 changed files with 265 additions and 0 deletions
--- a/docs/contract-framework.md
+++ b/docs/contract-framework.md
@@ -152,6 +152,22 @@ It should return:
 - model/provider metadata
 - diagnostics using the shared diagnostic model
 ## Deferred Runtime Work
 The deterministic contract framework is ready now. The runtime engines are
 deferred to `MKTT-WP-0005-runtime-context-and-assessment-engines.md`.
 Pick that work up when one of these becomes true:
 - contract checks need external user, project, or entity context
 - generation needs reliable field prefill before rendering
 - a UI or agent workflow needs form state, defaults, and dynamic requiredness
 - deterministic section assertions are not enough and rubric-based semantic
  assessment becomes necessary
 The intended order is context and form runtime first, deterministic dynamic
 rules second, LLM assessment execution third.
 ## CLI
 ```text
--- a/workplans/MKTT-WP-0005-runtime-context-and-assessment-engines.md
+++ b/workplans/MKTT-WP-0005-runtime-context-and-assessment-engines.md
@@ -0,0 +1,249 @@
 ---
 id: MKTT-WP-0005
 type: workplan
 title: "Runtime Context, Form, and Assessment Engines"
 domain: markitect
 status: todo
 owner: markitect-tool
 topic_slug: markitect
 created: "2026-05-03"
 updated: "2026-05-03"
 state_hub_workstream_id: "7918687e-2364-46b1-ab7e-65aa77cb8449"
 ---
 # MKTT-WP-0005: Runtime Context, Form, and Assessment Engines
 ## Purpose
 Turn the contract framework extension points into executable runtime engines:
 context loading, field prefill, form state evaluation, dynamic rules, and
 provider-neutral LLM assessment execution.
 This workplan picks up the deferred runtime scope from
 `MKTT-WP-0004-practical-contract-framework.md`.
 ## Decision
 Do not start this immediately unless one of these is true:
 - We are implementing template/generation flows that need reliable field
  prefill and pre-render validation.
 - We need document checks that depend on case-specific external context.
 - Deterministic assertions are no longer enough to assess whether sections do
  their semantic job.
 - A user-facing or agent-facing workflow needs structured form state, defaults,
  conditional requiredness, or guided repair.
 Recommended sequencing:
 1. Implement context and form runtime first.
 2. Add deterministic context-aware rules.
 3. Add LLM assessment execution only after the diagnostic/caching boundary is
   stable.
 This is probably not the next implementation step if core query/extraction and
 deterministic transform primitives are to be finished first. It should land
 before any serious generation pipeline or LLM review loop.
 ## Background
 The deterministic contract framework already supports:
 - field declarations
 - deterministic section assertions
 - metric bands
 - provider-neutral rubric declarations as contract vocabulary
 - one shared diagnostic model
 It does not yet execute:
 - context resolvers
 - form state evaluation
 - dynamic requiredness or visibility
 - calculated values
 - prefill
 - provider-neutral LLM assessment requests
 - assessment result caching
 ## P5.1 - Define runtime context model
 ```task
 id: MKTT-WP-0005-T001
 status: todo
 priority: high
 state_hub_task_id: "e24e6238-efef-41c4-9f1e-ca677c1be89b"
 ```
 Define how external context is supplied to contract checks and generation:
 - inline YAML/JSON files
 - named context objects
 - typed context schemas
 - explicit source paths
 - conflict behavior when frontmatter and context both provide a value
 - diagnostic behavior for missing or malformed context
 Expected output: design notes and tests for context loading.
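As a starting point for those design notes, the following is a minimal sketch of context loading and conflict reporting. The `Diagnostic` class, the diagnostic codes, and both function names are hypothetical, JSON stands in for YAML to keep the sketch stdlib-only, and letting frontmatter win on conflict is only a placeholder until the open question below is settled.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """Hypothetical stand-in for the shared diagnostic model."""
    code: str
    message: str


def load_context(text: str) -> tuple[dict, list[Diagnostic]]:
    """Parse a JSON context document; malformed input becomes a diagnostic."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return {}, [Diagnostic("context.malformed", str(exc))]
    if not isinstance(data, dict):
        return {}, [Diagnostic("context.malformed", "top level must be an object")]
    return data, []


def merge_values(frontmatter: dict, context: dict) -> tuple[dict, list[Diagnostic]]:
    """Merge both sources and report every disagreement instead of hiding it."""
    merged = dict(context)
    diagnostics = []
    for key, value in frontmatter.items():
        if key in context and context[key] != value:
            diagnostics.append(
                Diagnostic("context.conflict", f"'{key}' differs between frontmatter and context")
            )
        merged[key] = value  # assumption: frontmatter wins, pending the open question
    return merged, diagnostics
```

Returning diagnostics alongside the value, instead of raising, keeps missing or malformed context on the same reporting path as every other contract finding.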
 ## P5.2 - Implement context resolver API and CLI input
 ```task
 id: MKTT-WP-0005-T002
 status: todo
 priority: high
 state_hub_task_id: "d180bb6d-dae8-4305-88de-64c80b708b8a"
 ```
 Add a small runtime API and CLI option such as:
 ```text
 mkt contract check <document.md> --contract <contract.md> --context <context.yaml>
 ```
 Resolvers must be deterministic, local-first, and provider-neutral. Network or
 application-specific data access belongs in adapters outside the core package.
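One way the resolver API and the CLI surface could fit together is sketched below with `argparse`. The `ContextResolver` protocol, `DictResolver`, `build_parser`, and the choice to make `--contract` mandatory are assumptions for illustration, not the existing `mkt` implementation.

```python
import argparse
from typing import Any, Protocol


class ContextResolver(Protocol):
    """Hypothetical resolver interface: a pure, local lookup."""

    def resolve(self, path: str) -> Any: ...


class DictResolver:
    """Resolves dotted paths such as 'recipient.name' against a loaded mapping."""

    def __init__(self, data: dict) -> None:
        self._data = data

    def resolve(self, path: str) -> Any:
        node: Any = self._data
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                return None  # a real runtime would emit a diagnostic here
            node = node[part]
        return node


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `mkt contract check <document> --contract ... --context ...`."""
    parser = argparse.ArgumentParser(prog="mkt")
    commands = parser.add_subparsers(dest="command", required=True)
    contract = commands.add_parser("contract").add_subparsers(dest="action", required=True)
    check = contract.add_parser("check")
    check.add_argument("document")
    check.add_argument("--contract", required=True)  # assumption: mandatory
    check.add_argument("--context", default=None)    # optional context file
    return parser
```

Because the resolver only sees an already-loaded mapping, an adapter outside the core package can fetch data from a network or application source and hand it in without the core ever performing I/O beyond local files.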
 ## P5.3 - Implement field prefill and validation runtime
 ```task
 id: MKTT-WP-0005-T003
 status: todo
 priority: high
 state_hub_task_id: "b954984a-6f67-4e5b-8744-35e3c4fcc992"
 ```
 Evaluate field specs against document frontmatter and context:
 - required fields
 - defaults
 - source paths
 - enum/pattern/range validation
 - type coercion policy
 - diagnostics for missing, ambiguous, or conflicting values
 Expected utility: contracts become useful before generation, not only after a
 document exists.
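The evaluation order described above could look roughly like this. `FieldSpec`, its attribute names, and the precedence (frontmatter, then context source, then default) are assumptions made for the sketch; type coercion and range checks are left out.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """Hypothetical field spec; attribute names are illustrative."""
    id: str
    required: bool = False
    default: object = None
    source: Optional[str] = None   # key to read from the context mapping
    enum: tuple = ()
    pattern: Optional[str] = None


def resolve_field(spec: FieldSpec, frontmatter: dict, context: dict):
    """Return (value, origin, diagnostics) for a single field."""
    diagnostics = []
    if spec.id in frontmatter:
        value, origin = frontmatter[spec.id], "manual"
    elif spec.source and spec.source in context:
        value, origin = context[spec.source], "prefilled"
    elif spec.default is not None:
        value, origin = spec.default, "defaulted"
    else:
        value, origin = None, "missing"
        if spec.required:
            diagnostics.append(f"{spec.id}: required value is missing")
    if value is not None:
        if spec.enum and value not in spec.enum:
            diagnostics.append(f"{spec.id}: {value!r} is not an allowed value")
        if spec.pattern and not re.fullmatch(spec.pattern, str(value)):
            diagnostics.append(f"{spec.id}: {value!r} does not match {spec.pattern}")
    return value, origin, diagnostics
```

Since the function needs only a spec, frontmatter, and context, it can run before any document body exists, which is what makes prefill usable ahead of rendering.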
 ## P5.4 - Implement form state model
 ```task
 id: MKTT-WP-0005-T004
 status: todo
 priority: medium
 state_hub_task_id: "cccdf868-2308-42a1-b564-8b54fccd3c8b"
 ```
 Represent form state without binding to a UI framework:
 - field id
 - value
 - defaulted/prefilled/manual/calculated origin
 - visible/hidden
 - enabled/disabled
 - required/optional
 - validation diagnostics
 - display hints as metadata, not behavior
 This should support future UI adapters while remaining useful from the CLI.
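A framework-free representation of that state might be as simple as the dataclass below. The class and attribute names are assumptions chosen to mirror the list above; they are not an existing Markitect API.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum


class Origin(str, Enum):
    """How the current value was obtained."""
    DEFAULTED = "defaulted"
    PREFILLED = "prefilled"
    MANUAL = "manual"
    CALCULATED = "calculated"


@dataclass
class FieldState:
    field_id: str
    value: object = None
    origin: Origin = Origin.MANUAL
    visible: bool = True
    enabled: bool = True
    required: bool = False
    diagnostics: list = field(default_factory=list)
    display_hints: dict = field(default_factory=dict)  # metadata only, never behavior


def to_plain(state: FieldState) -> dict:
    """Serialize to plain types so a CLI or a UI adapter can consume the state."""
    data = asdict(state)
    data["origin"] = state.origin.value
    return data
```

Keeping the model as plain data means the CLI can print it as JSON or YAML today, and a later UI adapter can render the same structure without changes to the core.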
 ## P5.5 - Implement dynamic rules
 ```task
 id: MKTT-WP-0005-T005
 status: todo
 priority: medium
 state_hub_task_id: "6e420e1e-2465-40d3-8e64-d8681a294e63"
 ```
 Add deterministic dynamic rules for field and section behavior:
 - `if` / `then` / `else`
 - requiredness
 - visibility
 - allowed values
 - calculated values
 - context-dependent assertions
 Keep the expression language intentionally small. Prefer JSON/YAML paths and a
 small set of operators over embedding a general programming language.
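To illustrate how small such an expression language can stay, here is a sketch of an `if` / `then` / `else` evaluator over dotted paths. The rule shape, the operator names, and the effect dictionaries are all hypothetical; the open question about reusing JSON Schema conditionals is not decided by this example.

```python
import operator

# Assumption: a deliberately tiny operator set; names are illustrative.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "in": lambda left, right: left in right,
    "exists": lambda left, _right: left is not None,
}


def lookup(data: dict, path: str):
    """Follow a dotted path; return None when any segment is absent."""
    node = data
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def evaluate(rule: dict, data: dict) -> dict:
    """Evaluate one if/then/else rule and return the effects that apply."""
    cond = rule["if"]
    test = OPERATORS[cond["op"]]
    matched = test(lookup(data, cond["path"]), cond.get("value"))
    return rule.get("then", {}) if matched else rule.get("else", {})


# Usage: a task must carry `completed_at` once its status is "done".
rule = {
    "if": {"path": "task.status", "op": "eq", "value": "done"},
    "then": {"required": ["completed_at"]},
    "else": {"required": []},
}
effects = evaluate(rule, {"task": {"status": "done"}})
```

A closed operator table is easy to document, validate, and reject unknown entries from, which is the main advantage over embedding a general-purpose language.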
 ## P5.6 - Define LLM assessment execution interface
 ```task
 id: MKTT-WP-0005-T006
 status: todo
 priority: medium
 state_hub_task_id: "24b22b3a-e89e-4946-81f4-94f971a11979"
 ```
 Define provider-neutral request/response models for rubric execution:
 - contract id
 - rule id
 - scope: document, section, or field
 - text and structured inputs
 - context snapshot
 - rubric criteria
 - cache key material
 - pass/fail
 - score
 - reason
 - model/provider metadata
 - diagnostics
 The core package should define the protocol and result model, not a provider
 implementation.
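The list above mixes request and response fields; one possible split is sketched here. All class and attribute names are assumptions, structured inputs and cache key material are omitted for brevity, and `MockAdapter` is a test double, not a provider integration.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class AssessmentRequest:
    contract_id: str
    rule_id: str
    scope: str          # "document", "section", or "field"
    text: str
    criteria: tuple     # rubric criteria as plain strings
    context_snapshot: dict = field(default_factory=dict)


@dataclass(frozen=True)
class AssessmentResult:
    passed: bool
    score: float
    reason: str
    provider: str
    model: str
    diagnostics: tuple = ()


class AssessmentAdapter(Protocol):
    """What the core expects from any provider-specific adapter."""

    def assess(self, request: AssessmentRequest) -> AssessmentResult: ...


class MockAdapter:
    """Test double: passes whenever the text is non-empty."""

    def assess(self, request: AssessmentRequest) -> AssessmentResult:
        ok = bool(request.text.strip())
        reason = "non-empty text" if ok else "empty text"
        return AssessmentResult(ok, 1.0 if ok else 0.0, reason, "mock", "mock-0")
```

Using a structural `Protocol` lets external adapters satisfy the interface without importing or subclassing anything from the core package.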
 ## P5.7 - Add assessment runner and cache boundary
 ```task
 id: MKTT-WP-0005-T007
 status: todo
 priority: medium
 state_hub_task_id: "b09b77e2-59c0-4d31-b246-685b742d111f"
 ```
 Implement a runner that can invoke an injected assessment adapter and normalize
 results into diagnostics.
 Add deterministic cache key calculation but keep storage pluggable. The default
 cache may be local and file-based, provided it remains transparent and easy to
 reset.
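A deterministic key can be derived by hashing a canonical serialization of the key material, as in the sketch below. The function names and the dictionary-backed cache are assumptions; which fields belong in the key material is still an open question.

```python
import hashlib
import json


def cache_key(material: dict) -> str:
    """Hash canonical JSON so equal inputs give equal keys regardless of key order."""
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_assessment(material: dict, assess, cache: dict) -> dict:
    """Call the injected `assess` callable only on a cache miss."""
    key = cache_key(material)
    if key not in cache:
        cache[key] = assess(material)
    return cache[key]
```

Any mapping-like object can stand in for `cache`, so a file-backed store can be swapped in later without touching the runner. The key makes lookups reproducible; it does not make the underlying model judgment deterministic.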
 ## P5.8 - Add examples and failure diagnostics
 ```task
 id: MKTT-WP-0005-T008
 status: todo
 priority: high
 state_hub_task_id: "2efb8233-3154-4824-a898-6fcde37330c5"
 ```
 Create examples that show the practical value:
 - business letter with context-prefilled recipient/sender data
 - PRD/FRS with context-dependent product metadata
 - workplan where task requirements depend on status
 - concept note with LLM rubric declaration and mocked assessment output
 Each example should include expected diagnostics for missing context, ambiguous
 prefill, invalid dynamic rules, and assessment failures.
 ## Open Questions
 - Should context values override frontmatter, or should conflicts always be
  diagnostics until explicitly resolved?
 - Should the first dynamic rule syntax reuse JSON Schema conditionals or define
  a smaller Markitect-native rule vocabulary?
 - Should LLM assessment execution live behind an optional extra, or only in
  external adapters?
 - What cache invalidation metadata is sufficient for assessment reproducibility
  without pretending model judgments are deterministic?
 ## Exit Criteria
 - Runtime context can be supplied to contract checks.
 - Field prefill and validation produce unified diagnostics.
 - Form state can be rendered by a future adapter without changing core models.
 - Dynamic rules cover common requiredness, visibility, and context assertions.
 - LLM rubric execution has a provider-neutral protocol and mocked test adapter.
 - Examples demonstrate utility beyond static document validation.