# Practical Schema Framework Research

Date: 2026-05-03

## Purpose

This document reassesses `markitect-tool` schema utility before further
implementation. The concern is that pure structural validation, such as heading
counts and min/max depth constraints, is rarely enough to make markdown document
pipelines useful.

The practical opportunity is to define a stronger framework for markdown-native
document contracts: section specifications, content assertions, form fields,
context-aware rules, LLM-assisted assessments, and high-quality diagnostics.

## Research Signals

### Structured Authoring

DITA is the strongest analogue for typed, reusable textual units. It emphasizes
information typing, semantic markup, modularity, reuse, interchange, and
multiple deliverables from one source. A DITA topic is the unit of authoring and
reuse; topics may be generic or specialized into roles such as concept, task, or
reference.

Relevance for `markitect-tool`:

- A markdown document or section should have an explicit information type.
- Information type should imply expected structure and reader purpose.
- Reuse and composition need stable addressing of sections, not only files.
- Specialization is a better mental model than ad hoc schema forks.

Sources:

- https://dita-lang.org/dita/archspec/base/basic-concepts
- https://dita-lang.org/dita/archspec/base/introduction-to-dita

### Document Schemas With Assertions

DocBook remains relevant because it combines formal document schemas with
Schematron-style assertions. That is the missing layer in many simplistic JSON
Schema approaches: grammar says what may exist; assertions say what must be true
in context.

Relevance for `markitect-tool`:

- JSON Schema over `Document.to_dict()` is useful but insufficient.
- We need a second assertion layer for document-specific semantics.
- Diagnostics must point to the document location and rule intention.

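The split between grammar and assertions can be illustrated with a minimal sketch. The dictionary shape below is only an assumption about what `Document.to_dict()` might return, and `check_shape` and `check_assertions` are hypothetical names, not existing `markitect-tool` functions.

```python
def check_shape(doc: dict) -> list[str]:
    """Grammar layer: which keys exist and what type they have."""
    problems = []
    if not isinstance(doc.get("frontmatter"), dict):
        problems.append("frontmatter must be a mapping")
    if not isinstance(doc.get("sections"), list):
        problems.append("sections must be a list")
    return problems


def check_assertions(doc: dict) -> list[str]:
    """Assertion layer: what must be true in context."""
    problems = []
    status = doc["frontmatter"].get("status")
    bodies = {s["title"]: s["body"] for s in doc["sections"]}
    # Example rule: an accepted ADR must actually record its decision.
    if status == "accepted" and not bodies.get("Decision", "").strip():
        problems.append("status is 'accepted' but the Decision section is empty")
    return problems


# Assumed document shape, for illustration only.
doc = {
    "frontmatter": {"status": "accepted"},
    "sections": [
        {"title": "Context", "body": "We need a schema layer."},
        {"title": "Decision", "body": ""},
    ],
}
shape_problems = check_shape(doc)
# Assertions only run once the shape is known to be sound.
assertion_problems = check_assertions(doc) if not shape_problems else []
```

The document passes the shape check, yet the assertion layer still flags it, which is the gap a schema-only approach leaves open.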
Source:

- https://docbook.org/schemas/docbook/

### Dynamic Form Rules

JSON Schema supports conditional validation through `dependentRequired`,
`dependentSchemas`, and `if`/`then`/`else`. JSON Forms separates data schema
from UI schema and uses rules to show, hide, enable, or disable UI elements
based on JSON Schema conditions. Form.io’s architecture treats the form schema
as a single source of truth for validation and conditional logic across client
and server.

Relevance for `markitect-tool`:

- Forms should be first-class, not bolted onto document generation.
- Field definitions need static validation and dynamic rules.
- Prefill, visibility, requiredness, and calculated values should come from the
  same contract used for generation and validation.
- Context data must be explicit and typed.

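As a rough illustration of conditional requiredness, the sketch below hand-evaluates the JSON Schema `dependentRequired` keyword and nothing else. It is not a JSON Schema validator; the field names and the `missing_dependents` helper are invented for this example, and a real implementation would more likely delegate to an existing JSON Schema library.

```python
schema = {
    "type": "object",
    "properties": {
        "payment_method": {"type": "string"},
        "iban": {"type": "string"},
        "account_holder": {"type": "string"},
    },
    # JSON Schema keyword: if "iban" is present, "account_holder" must be too.
    "dependentRequired": {"iban": ["account_holder"]},
}


def missing_dependents(schema: dict, data: dict) -> list[str]:
    """Evaluate only `dependentRequired`; every other keyword is ignored."""
    missing = []
    for trigger, required in schema.get("dependentRequired", {}).items():
        if trigger in data:
            missing += [name for name in required if name not in data]
    return missing
```

Calling `missing_dependents(schema, {"iban": "DE00"})` reports `account_holder` as missing, while a record without `iban` passes. The same declaration could drive both a form's requiredness hints and post-generation validation.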
Sources:

- https://json-schema.org/understanding-json-schema/reference/conditionals
- https://jsonforms.io/docs/uischema/rules/
- https://form.io/features/form-conditional-logic-form-validation/

### LLM-Assisted Assessment

Modern evaluation frameworks treat LLM assessment as explicit graders or
rubrics. OpenAI graders return scores in a 0–1 range and can combine grader
types. Promptfoo’s `llm-rubric` uses explicit criteria and expects structured
judge output with reason, score, and pass/fail.

Relevance for `markitect-tool`:

- LLM checks should be declared as assessment rules, not hidden in prompts.
- Deterministic validation and LLM assessment should produce one diagnostic
  model.
- Section-level rubrics are more useful than whole-document vague grading.
- The LLM provider must remain external; `markitect-tool` defines contracts and
  reports.

Sources:

- https://developers.openai.com/api/docs/guides/graders
- https://www.promptfoo.dev/docs/configuration/expected-outputs/model-graded/llm-rubric/

### Markdown Structure

CommonMark gives markdown a well-defined block/inline model. mdast gives a
language-neutral tree vocabulary for Markdown nodes. Both point toward keeping
the parse layer separate from domain/schema layers.

Relevance for `markitect-tool`:

- The core document model should stay close to CommonMark/mdast concepts.
- Practical document contracts should sit above the parse model.
- Section addressing, source spans, and block identity are foundational for good
  diagnostics.

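To make the point about section addressing and source spans concrete, here is a minimal sketch. It uses a naive ATX-heading regex rather than a real CommonMark parser, so it ignores setext headings and would misread `#` lines inside fenced code blocks. `Section`, `slugify`, and `split_sections` are hypothetical names, not existing `markitect-tool` API.

```python
import re
from dataclasses import dataclass

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


@dataclass
class Section:
    path: str        # address built from ancestor slugs, e.g. "guide/setup"
    level: int       # heading level, 1-6
    title: str
    start_line: int  # 1-based line of the heading
    end_line: int    # last line before the next heading of same or higher level


def slugify(title: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def split_sections(text: str) -> list[Section]:
    lines = text.splitlines()
    sections: list[Section] = []
    stack: list[Section] = []  # currently open ancestor sections
    for number, line in enumerate(lines, start=1):
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        # Close every open section that this heading ends.
        while stack and stack[-1].level >= level:
            stack.pop().end_line = number - 1
        parent = stack[-1].path + "/" if stack else ""
        # Sections still open at the end of the file keep end_line = last line.
        section = Section(parent + slugify(match.group(2)), level,
                          match.group(2), number, len(lines))
        sections.append(section)
        stack.append(section)
    return sections
```

A path such as `guide/setup` gives diagnostics and composition a handle that survives edits elsewhere in the file, while the line span lets a report point at the exact region. Note that slug-based paths change when a heading is renamed, so explicit ids would be needed for truly stable identity.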
Sources:

- https://spec.commonmark.org/0.31.2/
- https://github.com/syntax-tree/mdast

## Terminology Proposal

| Term | Meaning |
| --- | --- |
| Document | A markdown artifact parsed into frontmatter, blocks, headings, sections, and source spans. |
| Section | A heading-led document region with content, children, source location, and stable identity. |
| Document Type | A named contract for a whole document, e.g. ADR, PRD, invoice letter, support reply, concept note. |
| Section Type | A reusable role for a section, e.g. Context, Decision, Risks, Procedure, Evidence, Conclusion. |
| Field | A typed value expected in frontmatter, inline matter, a section, or an external data record. |
| Form | A field collection with UI hints, validation rules, defaults, dynamic visibility, and calculations. |
| Context | External data available during validation/generation, such as user data, project data, dates, or related entities. |
| Rule | A deterministic condition evaluated against document, fields, context, or pipeline state. |
| Assertion | A claim that must hold for content, usually richer than shape validation. |
| Metric Band | A soft or hard target for size/complexity, such as word count, sentence count, section count, or reading level. |
| Assessment | A deterministic or LLM-assisted evaluation that returns pass/fail, score, reason, and diagnostics. |
| Rubric | A human-readable criterion for LLM-assisted assessment, scoped to a document or section type. |
| Diagnostic | A structured finding with severity, code, message, source location, rule id, and suggested repair. |
| Contract | The full specification for a document type: structure, sections, fields, rules, forms, assertions, rubrics, and outputs. |
| Pipeline | A repeatable sequence of parse, prefill, generate, validate, assess, transform, and compose operations. |

## Most Relevant Use Cases

### UC-001: Typed Document Contract

Define a document type such as ADR, PRD, FRS, workplan, customer letter, or
meeting brief. Specify required sections by semantic role, allowed alternatives,
field requirements, and diagnostics.

Practical value:

- Prevents missing critical content.
- Makes generated documents predictable.
- Creates an explicit contract for humans and agents.

Needed tooling:

- `mkt contract check <doc> --contract <contract.md>`
- Section matching by heading text, aliases, ids, or section type markers.
- Diagnostics that say which section/field/assertion failed and why.

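The section-matching part of this use case might look like the sketch below. It covers only matching by heading text and aliases, not ids or type markers, and it does not implement the proposed `mkt contract check` command. `SectionSpec` and `missing_sections` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SectionSpec:
    section_type: str
    required: bool = True
    aliases: list[str] = field(default_factory=list)  # other accepted heading texts

    def matches(self, heading: str) -> bool:
        accepted = {self.section_type.lower(), *(a.lower() for a in self.aliases)}
        return heading.strip().lower() in accepted


def missing_sections(specs: list[SectionSpec], headings: list[str]) -> list[str]:
    """Return the section types that are required but have no matching heading."""
    return [
        spec.section_type
        for spec in specs
        if spec.required and not any(spec.matches(h) for h in headings)
    ]


# A toy ADR contract: Context may appear under the alias "Background".
adr_contract = [
    SectionSpec("Context", aliases=["Background"]),
    SectionSpec("Decision"),
    SectionSpec("Risks", required=False),
]
```

For a document whose headings are `Background` and `Consequences`, `missing_sections(adr_contract, ...)` returns only `Decision`: the alias satisfies Context, and Risks is optional.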
### UC-002: Section-Level Content Expectations

Specify what a section is expected to contain: assertions, required evidence,
forbidden omissions, content patterns, examples, and reviewer prompts.

Practical value:

- Moves beyond “has a heading” toward “does the section do its job?”
- Enables review of generated or human-authored text.

Needed tooling:

- Deterministic assertions for regex, presence, references, counts, and field
  values.
- Optional LLM rubrics for semantic content checks.
- Per-section diagnostic reports.

### UC-003: Size and Complexity Bands

Define soft/hard bands for document and section size: words, characters,
sentences, paragraphs, sections, list items, code blocks, and nesting depth.

Practical value:

- Controls generation output size.
- Keeps templates from becoming bloated or underdeveloped.
- Helps compare intended vs actual document complexity.

Needed tooling:

- Metrics extractor.
- Rule severities: info, warning, error.
- “Too small/too large” diagnostics with actual and target values.

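One possible reading of soft and hard bands is sketched below: leaving the soft band yields a warning, leaving the hard band yields an error. That mapping, the `MetricBand` class, and the whitespace-based `word_count` are assumptions for illustration, not a settled design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricBand:
    metric: str
    soft_min: int
    soft_max: int
    hard_min: int
    hard_max: int

    def check(self, actual: int) -> Optional[tuple[str, str]]:
        """Return (severity, message), or None when the value is inside the soft band."""
        if self.soft_min <= actual <= self.soft_max:
            return None
        # Outside the soft band but inside the hard band is only a warning.
        severity = "warning" if self.hard_min <= actual <= self.hard_max else "error"
        direction = "too small" if actual < self.soft_min else "too large"
        target = f"{self.soft_min}-{self.soft_max}"
        return severity, f"{self.metric} {direction}: actual {actual}, target {target}"


def word_count(text: str) -> int:
    # Naive metric: whitespace-separated tokens, markdown syntax included.
    return len(text.split())
```

With `MetricBand("word_count", 50, 200, 20, 400)`, a 30-word section produces a warning that names both the actual value and the target range, and a 500-word section produces an error.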
### UC-004: Form-Backed Markdown Generation

Define forms that collect or prefill structured fields, then render markdown
documents. Fields may be static, calculated, conditional, or context-derived.

Practical value:

- Bridges structured data capture and prose generation.
- Supports repeatable business documents.
- Makes prefill from user/project/entity data explicit.

Needed tooling:

- Field schema.
- UI schema or form hints.
- Dynamic rules for requiredness, visibility, defaults, and calculations.
- Template rendering with validation before and after render.

### UC-005: Context-Aware Validation

Validate a document against external context: user data, project metadata,
related entities, dates, policy constraints, or canonical terminology.

Practical value:

- Checks whether a document is correct for this case, not only generally
  well-formed.
- Enables pipelines like personalized letters, compliance reports, and
  project-specific workplans.

Needed tooling:

- Context object schema.
- Resolvers for local files, JSON/YAML data, and later higher-layer systems.
- Rule expressions that can reference document and context paths.

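A rule expression that references both document and context paths could be as small as the following sketch. The dotted-path syntax, the rule dictionary layout, and the `resolve` and `check_equal` helpers are all assumptions made for illustration.

```python
from functools import reduce


def resolve(path: str, scope: dict):
    """Look up a dotted path such as 'context.project.owner'; None if absent."""
    def step(node, key):
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, path.split("."), scope)


def check_equal(rule: dict, scope: dict) -> list[str]:
    """Evaluate one hypothetical 'equals' rule that compares two paths."""
    left, right = resolve(rule["left"], scope), resolve(rule["right"], scope)
    if left == right:
        return []
    return [f"{rule['id']}: {rule['left']}={left!r} does not match {rule['right']}={right!r}"]


# Document and context live side by side in one explicit scope.
scope = {
    "document": {"frontmatter": {"owner": "alice"}},
    "context": {"project": {"owner": "bob"}},
}
rule = {
    "id": "owner-matches-project",
    "left": "document.frontmatter.owner",
    "right": "context.project.owner",
}
```

Here `check_equal(rule, scope)` returns one finding because the document names a different owner than the project record. A fuller design would also need to decide how two missing paths compare; in this sketch they both resolve to `None` and therefore pass.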
### UC-006: LLM-Assisted Section Assessment

Attach rubrics to section types. Use an external LLM adapter to assess whether a
section satisfies the rubric, returning score, reason, and pass/fail.

Practical value:

- Handles semantic checks that deterministic rules cannot.
- Supports review loops for generated text.
- Makes subjective requirements explicit and auditable.

Needed tooling:

- Rubric declaration format.
- Provider-neutral assessment request/response models.
- Caching and reproducibility metadata.
- Clear distinction between deterministic errors and model-judged findings.

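The provider-neutral request/response models could be sketched as follows. The class names, the `Judge` callable signature, and the 0.7 pass threshold are assumptions; `stub_judge` is a keyword check standing in for a real LLM adapter so that the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AssessmentRequest:
    rubric_id: str
    criterion: str      # human-readable rubric text
    section_text: str


@dataclass
class AssessmentResult:
    rubric_id: str
    score: float        # normalized to 0.0-1.0
    passed: bool
    reason: str
    judged_by: str      # adapter name, kept as reproducibility metadata


# An adapter is any callable returning (score, reason); the LLM call lives outside.
Judge = Callable[[AssessmentRequest], tuple[float, str]]


def assess(request: AssessmentRequest, judge: Judge, judge_name: str,
           threshold: float = 0.7) -> AssessmentResult:
    score, reason = judge(request)
    score = min(1.0, max(0.0, score))  # clamp whatever the adapter returns
    return AssessmentResult(request.rubric_id, score, score >= threshold,
                            reason, judge_name)


def stub_judge(request: AssessmentRequest) -> tuple[float, str]:
    """Stand-in for an LLM adapter: looks for the word 'because'."""
    found = "because" in request.section_text.lower()
    return (0.9, "states a rationale") if found else (0.2, "no rationale given")
```

Because `assess` only depends on the `Judge` signature, swapping the stub for a real provider adapter would not change the result model, and `judged_by` keeps model-judged findings distinguishable from deterministic ones.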
### UC-007: Pipeline Diagnostics and Repair Guidance

Run a document pipeline and get one coherent diagnostic report from parsing,
schema checks, field validation, assertions, generation, composition, and
LLM-assisted assessments.

Practical value:

- Makes failures debuggable.
- Helps humans and agents repair documents.
- Avoids scattered errors from unrelated subsystems.

Needed tooling:

- Common diagnostic model.
- Error codes and severities.
- Source spans and rule ids.
- Suggested repair text or structured patches when safe.

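A common diagnostic model might start from something like this sketch. The field set follows the Diagnostic entry in the terminology table; the code scheme (`MKT-...`), the single-line location, and the helper names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = {"error": 0, "warning": 1, "info": 2}


@dataclass
class Diagnostic:
    severity: str                     # "error", "warning", or "info"
    code: str                         # hypothetical scheme, e.g. "MKT-FLD-001"
    message: str
    rule_id: str
    line: Optional[int] = None        # source location, if known
    suggestion: Optional[str] = None  # repair guidance, only when safe to offer


def build_report(*stages: list[Diagnostic]) -> list[Diagnostic]:
    """Merge findings from every stage, most severe first, then by line."""
    merged = [d for stage in stages for d in stage]
    return sorted(merged, key=lambda d: (SEVERITY_ORDER[d.severity], d.line or 0))


def has_errors(report: list[Diagnostic]) -> bool:
    return any(d.severity == "error" for d in report)
```

Each subsystem (section checks, field validation, metric bands, assessments) would return its own list of `Diagnostic` objects, and `build_report` turns them into one ordered report instead of scattered errors.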
## Comparison With markitect-main

`markitect-main` had several useful seeds:

- `x-markitect-sections` for required/recommended/optional/discouraged/improper sections.
- `x-markitect-content-control` for required, discouraged, and forbidden patterns plus word-count metrics.
- Section and content validators with warnings/errors.
- Schema generation and validation experiments.
- Draft generation with `x-markitect-field-mapping`.
- Prompt quality gates with schema and pattern validators.
- Infospace entity parsing and LLM classification/evaluation.

The problem was not lack of ideas. The problem was that the ideas lived in
separate subsystems with different models:

- Schema validation compared generated schemas rather than validating a stable
  document contract.
- Semantic validation used `x-markitect-*` extensions but was not integrated
  into a unified contract framework.
- Field mapping existed in draft generation, not in a general form/context
  model.
- LLM quality gates existed inside prompt execution, not as provider-neutral
  document assessments.
- Infospace checks were domain/application layer behavior, not syntax-layer
  primitives.

## Strategic Direction

The successor should introduce a framework layer above parsing:

```text
Markdown parse model
-> document contract
-> section specifications
-> field/form specifications
-> deterministic rules/assertions
-> metric bands
-> optional LLM rubrics
-> unified diagnostics
```

This should not replace JSON Schema. JSON Schema remains useful for typed data
and machine validation. The new layer should make document-specific semantics
natural.

## Recommendation

Do not continue straight into generic query/transform work until this framework
direction is captured. The next implementation slice should be a small,
deterministic version of document contracts:

1. Define the contract schema and terminology.
2. Implement section specifications.
3. Implement metric bands.
4. Implement the unified diagnostic model.
5. Leave LLM rubrics and form dynamics as designed extension points for the next
   slice.

This is the utility inflection point. It will make `markitect-tool` practically
useful instead of merely structurally correct.