diff --git a/docs/template-generation.md b/docs/template-generation.md new file mode 100644 index 0000000..0835145 --- /dev/null +++ b/docs/template-generation.md @@ -0,0 +1,89 @@ +# Templates and Generation Hooks + +`markitect-tool` keeps generation deterministic in core. LLM-assisted +generation is represented as an external hook boundary, not as a provider +dependency. + +## Template Variables + +Templates use simple double-brace variables: + +```markdown +# {{title}} + +Owner: {{owner.name}} + +{{body}} +``` + +Variables can use dot paths into JSON/YAML data. + +```bash +mkt template inspect template.md +mkt template render template.md --data data.yaml --output rendered.md +mkt template render template.md --set title="Draft" --set owner.name=Ada +``` + +Strict rendering is the default. Use `--lenient` to keep unresolved +placeholders in place. + +## Contract Stubs + +Contracts can generate first-draft Markdown stubs: + +```bash +mkt generate stub --contract examples/contracts/adr.contract.md --set status=proposed +``` + +The stub generator creates frontmatter from contract fields and emits required +and recommended sections. Forbidden sections are skipped. + +## Rule-Based Generation + +A generation plan is a Markdown file with a fenced YAML block: + +````markdown +# Letter Generation + +```yaml generation +template: templates/letter.md +data_file: data/letter.yaml +output: out/letter.md +``` +```` + +Run it with: + +```bash +mkt generate rules rules.md --output-dir . +``` + +The `documents` key can render multiple outputs: + +```yaml +documents: + - template: templates/letter.md + data: + title: First + output: out/first.md + - template: templates/letter.md + data_file: data/second.yaml + output: out/second.md +``` + +## Assisted Generation Hook + +FR-042 is supported as a protocol boundary. 
External packages can implement a +hook with: + +```python +from markitect_tool.generation import GenerationHookRequest, GenerationHookResult + +class MyHook: + def generate(self, request: GenerationHookRequest) -> GenerationHookResult: + ... +``` + +Core does not call an LLM provider by itself. Higher layers can pass a hook into +`generate_with_hook(...)` after handling credentials, policy, and provider +selection. diff --git a/docs/workplan-planning-map.md b/docs/workplan-planning-map.md index 32c36f5..5245300 100644 --- a/docs/workplan-planning-map.md +++ b/docs/workplan-planning-map.md @@ -30,9 +30,9 @@ and descriptions mirror the operational view. | `MKTT-WP-0001` | complete | done | none | Repository foundation is complete. | | `MKTT-WP-0002` | complete | done | `MKTT-WP-0001` | Legacy scope extraction is complete. | | `MKTT-WP-0004` | complete | done | `MKTT-WP-0001`, `MKTT-WP-0002` | Contract framework is complete and informs later validation/generation work. | -| `MKTT-WP-0003` | P0 | active | `MKTT-WP-0001`, `MKTT-WP-0002`, `MKTT-WP-0004` | Mainline implementation. P3.5 is complete; continue with P3.6 templating/generation hooks. | +| `MKTT-WP-0003` | P0 | active | `MKTT-WP-0001`, `MKTT-WP-0002`, `MKTT-WP-0004` | Mainline implementation. P3.6 is complete; P3.7 caching remains. | | `MKTT-WP-0006` | P1 | todo | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T005` | Ready after transform/composition shape is clear; should account for future reference/provenance needs. | -| `MKTT-WP-0010` | P1 | todo | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Preserve richer content-reference, processor, explode/implode, and weave/tangle architecture after P3.6. | +| `MKTT-WP-0010` | P1 | todo | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Trigger is satisfied; keep as the richer content-reference, processor, explode/implode, and weave/tangle track. 
| | `MKTT-WP-0007` | P2 | todo | `MKTT-WP-0006` | First practical cache backend use case: AST/JSONPath/SQLite/FTS. | | `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. | | `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. | @@ -46,10 +46,9 @@ It should wait until `MKTT-WP-0003-T005` gives transform/composition enough shape to know what cached identities and invalidation rules must preserve. The second important nuance is `MKTT-WP-0010`: it captures richer content -reference, processor, explode/implode, and weave/tangle work. It should wait -until `MKTT-WP-0003-T006` defines the deterministic templating/generation hook -surface, but it should inform backend, index, context-memory, and access-control -architecture before those become rigid. +reference, processor, explode/implode, and weave/tangle work. Its task-level +trigger `MKTT-WP-0003-T006` is now satisfied. It should inform backend, index, +context-memory, and access-control architecture before those become rigid. These are mixed task/workstream dependencies. State Hub does not currently model them natively. diff --git a/examples/templates/adr-summary.data.yaml b/examples/templates/adr-summary.data.yaml new file mode 100644 index 0000000..b2a7160 --- /dev/null +++ b/examples/templates/adr-summary.data.yaml @@ -0,0 +1,4 @@ +title: Use Deterministic Templates +status: proposed +context: Authors need repeatable Markdown generation before assisted generation. +decision: Use small data-bound templates and keep provider hooks external. 
diff --git a/examples/templates/adr-summary.generation.md b/examples/templates/adr-summary.generation.md new file mode 100644 index 0000000..2fbbd80 --- /dev/null +++ b/examples/templates/adr-summary.generation.md @@ -0,0 +1,7 @@ +# ADR Summary Generation + +```yaml generation +template: adr-summary.template.md +data_file: adr-summary.data.yaml +output: adr-summary.generated.md +``` diff --git a/examples/templates/adr-summary.template.md b/examples/templates/adr-summary.template.md new file mode 100644 index 0000000..80a855c --- /dev/null +++ b/examples/templates/adr-summary.template.md @@ -0,0 +1,11 @@ +# {{title}} + +Status: {{status}} + +## Context + +{{context}} + +## Decision + +{{decision}} diff --git a/src/markitect_tool/__init__.py b/src/markitect_tool/__init__.py index 75597b3..32d797c 100644 --- a/src/markitect_tool/__init__.py +++ b/src/markitect_tool/__init__.py @@ -21,6 +21,17 @@ from markitect_tool.contract import ( validate_contract_file, ) from markitect_tool.diagnostics import Diagnostic, SourceLocation +from markitect_tool.generation import ( + GeneratedDocument, + GenerationHookRequest, + GenerationHookResult, + GenerationPlan, + GenerationResult, + generate_stub_from_contract, + generate_with_hook, + load_generation_plan_file, + run_generation_plan, +) from markitect_tool.ops import ( ComposeResult, IncludeError, @@ -44,6 +55,14 @@ from markitect_tool.schema import ( validate_document, validate_markdown_file, ) +from markitect_tool.template import ( + MissingTemplateVariable, + TemplateAnalysis, + TemplateError, + TemplateRenderResult, + analyze_template, + render_template, +) __all__ = [ "ContentBlock", @@ -70,6 +89,15 @@ __all__ = [ "validate_contract_file", "Diagnostic", "SourceLocation", + "GeneratedDocument", + "GenerationHookRequest", + "GenerationHookResult", + "GenerationPlan", + "GenerationResult", + "generate_stub_from_contract", + "generate_with_hook", + "load_generation_plan_file", + "run_generation_plan", "ComposeResult", 
"IncludeError", "IncludeResult", @@ -81,4 +109,10 @@ __all__ = [ "QueryMatch", "extract_document", "query_document", + "MissingTemplateVariable", + "TemplateAnalysis", + "TemplateError", + "TemplateRenderResult", + "analyze_template", + "render_template", ] diff --git a/src/markitect_tool/cli/main.py b/src/markitect_tool/cli/main.py index 9f76f51..c18d696 100644 --- a/src/markitect_tool/cli/main.py +++ b/src/markitect_tool/cli/main.py @@ -16,9 +16,22 @@ from markitect_tool.contract import ( load_contract_file, validate_contract, ) +from markitect_tool.generation import ( + GenerationPlanError, + generate_stub_from_contract, + load_data_file, + load_generation_plan_file, + run_generation_plan, +) from markitect_tool.ops import IncludeError, compose_files, resolve_includes, transform_markdown from markitect_tool.query import InvalidQueryError, extract_document, query_document from markitect_tool.schema import load_schema_file, validate_markdown_file, validate_schema +from markitect_tool.template import ( + MissingTemplateVariable, + TemplateError, + analyze_template, + render_template, +) @click.group() @@ -275,6 +288,179 @@ def include( _emit_markdown_result(result.to_dict(), output_format, output) +@main.group() +def template() -> None: + """Render and inspect deterministic Markdown templates.""" + + +@template.command("inspect") +@click.argument("template_file", type=click.Path(exists=True, dir_okay=False, path_type=Path)) +@click.option( + "--format", + "output_format", + type=click.Choice(["json", "yaml", "text"], case_sensitive=False), + default="text", + show_default=True, +) +def template_inspect(template_file: Path, output_format: str) -> None: + """Inspect variables required by a template.""" + + data = analyze_template(template_file.read_text(encoding="utf-8")).to_dict() | { + "template_path": str(template_file) + } + _emit_template_analysis(data, output_format) + raise click.exceptions.Exit(0 if data["valid"] else 1) + + +@template.command("render") 
+@click.argument("template_file", type=click.Path(exists=True, dir_okay=False, path_type=Path)) +@click.option( + "--data", + "data_file", + type=click.Path(exists=True, dir_okay=False, path_type=Path), + help="JSON, YAML, or CSV data file. CSV must contain one record for render.", +) +@click.option( + "--set", + "set_values", + multiple=True, + metavar="KEY=VALUE", + help="Set a template data value. Dot paths create nested mappings.", +) +@click.option("--lenient", is_flag=True, help="Keep unresolved placeholders instead of failing.") +@click.option( + "--output", + type=click.Path(dir_okay=False, path_type=Path), + help="Write rendered Markdown to a file.", +) +@click.option( + "--format", + "output_format", + type=click.Choice(["markdown", "json", "yaml"], case_sensitive=False), + default="markdown", + show_default=True, +) +def template_render( + template_file: Path, + data_file: Path | None, + set_values: tuple[str, ...], + lenient: bool, + output: Path | None, + output_format: str, +) -> None: + """Render a Markdown template with structured data.""" + + try: + data = _load_template_data(data_file) + data = _deep_merge_cli(data, _parse_key_value_options(set_values)) + result = render_template( + template_file.read_text(encoding="utf-8"), + data, + strict=not lenient, + ) + except (MissingTemplateVariable, TemplateError, ValueError, TypeError) as exc: + raise click.ClickException(str(exc)) from exc + _emit_markdown_result(result.to_dict(), output_format, output) + + +@main.group() +def generate() -> None: + """Generate Markdown from contracts, rules, or external hooks.""" + + +@generate.command("stub") +@click.option( + "--contract", + "contract_file", + required=True, + type=click.Path(exists=True, dir_okay=False, path_type=Path), + help="Markdown document contract to generate from.", +) +@click.option( + "--data", + "data_file", + type=click.Path(exists=True, dir_okay=False, path_type=Path), + help="Optional JSON/YAML data for frontmatter values.", +) 
+@click.option( + "--set", + "set_values", + multiple=True, + metavar="KEY=VALUE", + help="Set generation data. Dot paths create nested mappings.", +) +@click.option("--include-optional", is_flag=True, help="Include optional contract sections.") +@click.option( + "--output", + type=click.Path(dir_okay=False, path_type=Path), + help="Write generated Markdown to a file.", +) +@click.option( + "--format", + "output_format", + type=click.Choice(["markdown", "json", "yaml"], case_sensitive=False), + default="markdown", + show_default=True, +) +def generate_stub( + contract_file: Path, + data_file: Path | None, + set_values: tuple[str, ...], + include_optional: bool, + output: Path | None, + output_format: str, +) -> None: + """Generate a Markdown stub from a document contract.""" + + try: + data = _load_template_data(data_file) + data = _deep_merge_cli(data, _parse_key_value_options(set_values)) + result = generate_stub_from_contract( + load_contract_file(contract_file), + data=data, + include_optional=include_optional, + ) + except (ContractLoaderError, ValueError, TypeError) as exc: + raise click.ClickException(str(exc)) from exc + _emit_markdown_result(result.to_dict(), output_format, output) + + +@generate.command("rules") +@click.argument("rules_file", type=click.Path(exists=True, dir_okay=False, path_type=Path)) +@click.option( + "--output-dir", + type=click.Path(file_okay=False, path_type=Path), + help="Directory used for relative output paths in the plan.", +) +@click.option("--dry-run", is_flag=True, help="Render without writing output files.") +@click.option( + "--format", + "output_format", + type=click.Choice(["json", "yaml"], case_sensitive=False), + default="json", + show_default=True, +) +def generate_rules( + rules_file: Path, + output_dir: Path | None, + dry_run: bool, + output_format: str, +) -> None: + """Run a Markdown/YAML generation plan.""" + + try: + plan = load_generation_plan_file(rules_file) + result = run_generation_plan( + plan, + 
base_dir=rules_file.parent, + output_dir=output_dir, + dry_run=dry_run, + ) + except (GenerationPlanError, TemplateError, MissingTemplateVariable) as exc: + raise click.ClickException(str(exc)) from exc + _emit_jsonish(result.to_dict(), output_format) + + @main.command() @click.argument("file", type=click.Path(exists=True, dir_okay=False, path_type=Path)) @click.option( @@ -461,6 +647,27 @@ def _emit_markdown_result(data: dict, output_format: str, output: Path | None) - click.echo(markdown, nl=False) +def _emit_jsonish(data: dict, output_format: str) -> None: + if output_format == "yaml": + click.echo(yaml.safe_dump(data, sort_keys=False)) + else: + click.echo(json.dumps(data, indent=2, ensure_ascii=False)) + + +def _emit_template_analysis(data: dict, output_format: str) -> None: + if output_format == "json": + click.echo(json.dumps(data, indent=2, ensure_ascii=False)) + elif output_format == "yaml": + click.echo(yaml.safe_dump(data, sort_keys=False)) + else: + click.echo("valid" if data["valid"] else "invalid") + click.echo(f"variables: {data['unique_variables']}") + for variable in data["variables"]: + click.echo(f"- {variable}") + for error in data["syntax_errors"]: + click.echo(f"! 
{error}") + + def _parse_key_value_options(items: tuple[str, ...]) -> dict[str, object]: values: dict[str, object] = {} for item in items: @@ -484,5 +691,28 @@ def _set_path(mapping: dict[str, object], path: list[str], value: object) -> Non current[path[-1]] = value +def _load_template_data(data_file: Path | None) -> dict[str, object]: + if data_file is None: + return {} + data = load_data_file(data_file) + if isinstance(data, list): + if len(data) != 1: + raise ValueError("Template render expects exactly one CSV record") + data = data[0] + if not isinstance(data, dict): + raise ValueError("Template data must be a mapping") + return data + + +def _deep_merge_cli(left: dict[str, object], right: dict[str, object]) -> dict[str, object]: + merged = dict(left) + for key, value in right.items(): + if isinstance(merged.get(key), dict) and isinstance(value, dict): + merged[key] = _deep_merge_cli(merged[key], value) + else: + merged[key] = value + return merged + + if __name__ == "__main__": main() diff --git a/src/markitect_tool/generation/__init__.py b/src/markitect_tool/generation/__init__.py new file mode 100644 index 0000000..152a3c6 --- /dev/null +++ b/src/markitect_tool/generation/__init__.py @@ -0,0 +1,31 @@ +"""Deterministic Markdown generation primitives and hook boundaries.""" + +from markitect_tool.generation.engine import ( + GeneratedDocument, + GenerationHook, + GenerationHookRequest, + GenerationHookResult, + GenerationPlan, + GenerationPlanError, + GenerationResult, + generate_stub_from_contract, + generate_with_hook, + load_data_file, + load_generation_plan_file, + run_generation_plan, +) + +__all__ = [ + "GeneratedDocument", + "GenerationHook", + "GenerationHookRequest", + "GenerationHookResult", + "GenerationPlan", + "GenerationPlanError", + "GenerationResult", + "generate_stub_from_contract", + "generate_with_hook", + "load_data_file", + "load_generation_plan_file", + "run_generation_plan", +] diff --git a/src/markitect_tool/generation/engine.py 
b/src/markitect_tool/generation/engine.py new file mode 100644 index 0000000..3ddbea3 --- /dev/null +++ b/src/markitect_tool/generation/engine.py @@ -0,0 +1,339 @@ +"""Markdown generation from contracts, templates, rules, and external hooks.""" + +from __future__ import annotations + +import csv +import json +import re +from dataclasses import asdict, dataclass, field +from pathlib import Path +from typing import Any, Protocol + +import yaml + +from markitect_tool.contract import DocumentContract +from markitect_tool.core import parse_markdown +from markitect_tool.template import TemplateRenderResult, render_template + + +class GenerationPlanError(ValueError): + """Raised when a Markdown generation plan cannot be loaded or run.""" + + +@dataclass(frozen=True) +class GeneratedDocument: + """One generated Markdown document.""" + + markdown: str + output_path: str | None = None + source_template: str | None = None + data: dict[str, Any] = field(default_factory=dict) + missing_variables: list[str] = field(default_factory=list) + + def to_dict(self) -> dict[str, Any]: + data = asdict(self) + data["complete"] = not self.missing_variables + return {key: value for key, value in data.items() if value not in (None, [], {})} + + +@dataclass(frozen=True) +class GenerationResult: + """Result of a deterministic generation run.""" + + documents: list[GeneratedDocument] + plan_path: str | None = None + + def to_dict(self) -> dict[str, Any]: + data = { + "count": len(self.documents), + "documents": [document.to_dict() for document in self.documents], + "plan_path": self.plan_path, + } + return {key: value for key, value in data.items() if value is not None} + + +@dataclass(frozen=True) +class GenerationPlan: + """Markdown/YAML rule-based generation plan.""" + + documents: list[dict[str, Any]] + source_path: str | None = None + + def to_dict(self) -> dict[str, Any]: + data = {"documents": self.documents, "source_path": self.source_path} + return {key: value for key, value in 
data.items() if value is not None} + + +@dataclass(frozen=True) +class GenerationHookRequest: + """Provider-neutral request for optional assisted generation.""" + + prompt: str + data: dict[str, Any] = field(default_factory=dict) + template: str | None = None + contract_id: str | None = None + metadata: dict[str, Any] = field(default_factory=dict) + + +@dataclass(frozen=True) +class GenerationHookResult: + """Provider-neutral response from an assisted generation hook.""" + + markdown: str + provider: str | None = None + metadata: dict[str, Any] = field(default_factory=dict) + + def to_dict(self) -> dict[str, Any]: + data = asdict(self) + return {key: value for key, value in data.items() if value not in (None, {})} + + +class GenerationHook(Protocol): + """Protocol implemented by optional external generation providers.""" + + def generate(self, request: GenerationHookRequest) -> GenerationHookResult: + """Generate Markdown for a request.""" + + +def load_data_file(path: str | Path) -> Any: + """Load generation data from JSON, YAML, or CSV.""" + + file_path = Path(path) + suffix = file_path.suffix.lower() + if suffix == ".json": + return json.loads(file_path.read_text(encoding="utf-8")) + if suffix in {".yaml", ".yml"}: + return yaml.safe_load(file_path.read_text(encoding="utf-8")) or {} + if suffix == ".csv": + with file_path.open("r", encoding="utf-8", newline="") as handle: + return list(csv.DictReader(handle)) + raise GenerationPlanError(f"Unsupported data file format: {file_path.suffix}") + + +def generate_stub_from_contract( + contract: DocumentContract, + *, + data: dict[str, Any] | None = None, + include_optional: bool = False, +) -> GeneratedDocument: + """Generate a Markdown stub from a document contract.""" + + data = data or {} + frontmatter: dict[str, Any] = {} + if contract.document_type: + frontmatter["document_type"] = contract.document_type + + for field_spec in contract.fields: + path = field_spec.path or (f"frontmatter.{field_spec.id}" if 
field_spec.id else "") + if not path.startswith("frontmatter.") or not field_spec.id: + continue + key_path = path.removeprefix("frontmatter.").split(".") + value = _value_for_field(field_spec, data) + _set_nested(frontmatter, key_path, value) + + title = contract.title or contract.document_type or contract.id or "Generated Document" + parts = [_frontmatter_block(frontmatter), f"# {title}".strip()] + + for section in contract.sections: + if section.presence == "forbidden": + continue + if section.presence == "optional" and not include_optional: + continue + heading_title = section.title or section.id or "Section" + level = section.level or 2 + guidance = _section_guidance(section.raw.get("assertions")) + parts.extend(["", f"{'#' * level} {heading_title}", "", guidance or f"TODO: Add content for {heading_title}."]) + + markdown = "\n".join(part for part in parts if part is not None).rstrip() + "\n" + return GeneratedDocument(markdown=markdown, data=data) + + +def load_generation_plan_file(path: str | Path) -> GenerationPlan: + """Load a generation plan from a Markdown file with a fenced YAML block.""" + + file_path = Path(path) + document = parse_markdown(file_path.read_text(encoding="utf-8"), source_path=str(file_path)) + plan_data: dict[str, Any] | None = None + for token in document.tokens: + if token.get("type") != "fence": + continue + info = str(token.get("info", "")).strip().lower().split() + if "generation" not in info: + continue + if "yaml" not in info and "yml" not in info: + continue + loaded = yaml.safe_load(token.get("content", "")) or {} + if not isinstance(loaded, dict): + raise GenerationPlanError("Generation YAML block must be a mapping") + plan_data = loaded + break + if plan_data is None: + frontmatter_plan = document.frontmatter.get("generation") + if isinstance(frontmatter_plan, dict): + plan_data = frontmatter_plan + if not plan_data: + raise GenerationPlanError("No fenced ```yaml generation block found") + + documents = 
plan_data.get("documents") + if documents is None: + documents = [plan_data] + if not isinstance(documents, list) or not all(isinstance(item, dict) for item in documents): + raise GenerationPlanError("Generation `documents` must be a list of mappings") + return GenerationPlan(documents=documents, source_path=str(file_path)) + + +def run_generation_plan( + plan: GenerationPlan, + *, + base_dir: str | Path | None = None, + output_dir: str | Path | None = None, + dry_run: bool = False, +) -> GenerationResult: + """Render every document described by a generation plan.""" + + base = Path(base_dir or Path(plan.source_path or ".").parent).resolve() + output_base = Path(output_dir).resolve() if output_dir else base + documents: list[GeneratedDocument] = [] + + for raw_doc in plan.documents: + template_path = _required_path(raw_doc, "template", base) + template_text = template_path.read_text(encoding="utf-8") + data = _data_for_plan_doc(raw_doc, base) + strict = bool(raw_doc.get("strict", True)) + rendered = render_template(template_text, data, strict=strict) + output = raw_doc.get("output") + output_path: Path | None = None + if output: + output_path = (output_base / str(output)).resolve() + if not _is_within(output_path, output_base): + raise GenerationPlanError(f"Output path escapes output directory: {output}") + if not dry_run: + output_path.parent.mkdir(parents=True, exist_ok=True) + output_path.write_text(rendered.markdown, encoding="utf-8") + documents.append( + GeneratedDocument( + markdown=rendered.markdown, + output_path=str(output_path) if output_path else None, + source_template=str(template_path), + data=data, + missing_variables=rendered.missing_variables, + ) + ) + + return GenerationResult(documents=documents, plan_path=plan.source_path) + + +def generate_with_hook( + request: GenerationHookRequest, + hook: GenerationHook, +) -> GenerationHookResult: + """Run optional assisted generation through an external hook.""" + + return hook.generate(request) + + +def 
_data_for_plan_doc(raw_doc: dict[str, Any], base: Path) -> dict[str, Any]: + data: Any = {} + if "data_file" in raw_doc: + data = load_data_file((base / str(raw_doc["data_file"])).resolve()) + if "data" in raw_doc: + inline_data = raw_doc["data"] + if not isinstance(inline_data, dict): + raise GenerationPlanError("Inline generation `data` must be a mapping") + if isinstance(data, dict): + data = _deep_merge(data, inline_data) + elif data: + raise GenerationPlanError("Cannot merge inline data into non-mapping data file") + else: + data = inline_data + if not isinstance(data, dict): + raise GenerationPlanError("Generation template data must be a mapping") + return data + + +def _required_path(raw_doc: dict[str, Any], key: str, base: Path) -> Path: + raw_path = raw_doc.get(key) + if not raw_path: + raise GenerationPlanError(f"Generation document requires `{key}`") + path = (base / str(raw_path)).resolve() + if not path.exists() or not path.is_file(): + raise GenerationPlanError(f"Generation {key} not found: {path}") + return path + + +def _value_for_field(field_spec, data: dict[str, Any]) -> Any: + if field_spec.id and field_spec.id in data: + return data[field_spec.id] + if field_spec.path and field_spec.path.startswith("frontmatter."): + value = _get_nested(data, field_spec.path.removeprefix("frontmatter.").split(".")) + if value is not _MISSING: + return value + if field_spec.default is not None: + return field_spec.default + if field_spec.type == "boolean": + return False + if field_spec.type in {"number", "integer"}: + return 0 + if field_spec.type == "array": + return [] + if field_spec.type == "object": + return {} + return f"TODO: {field_spec.id or 'value'}" + + +def _section_guidance(raw_assertions: Any) -> str | None: + if not isinstance(raw_assertions, list): + return None + guidance = [] + for assertion in raw_assertions: + if isinstance(assertion, dict) and assertion.get("guidance"): + guidance.append(f"TODO: {assertion['guidance']}") + return 
"\n\n".join(guidance) if guidance else None + + +def _frontmatter_block(frontmatter: dict[str, Any]) -> str: + if not frontmatter: + return "" + return f"---\n{yaml.safe_dump(frontmatter, sort_keys=False).strip()}\n---\n" + + +def _set_nested(mapping: dict[str, Any], path: list[str], value: Any) -> None: + current = mapping + for part in path[:-1]: + nested = current.setdefault(part, {}) + if not isinstance(nested, dict): + nested = {} + current[part] = nested + current = nested + current[path[-1]] = value + + +_MISSING = object() + + +def _get_nested(mapping: dict[str, Any], path: list[str]) -> Any: + current: Any = mapping + for part in path: + if isinstance(current, dict) and part in current: + current = current[part] + else: + return _MISSING + return current + + +def _deep_merge(left: dict[str, Any], right: dict[str, Any]) -> dict[str, Any]: + merged = dict(left) + for key, value in right.items(): + if isinstance(merged.get(key), dict) and isinstance(value, dict): + merged[key] = _deep_merge(merged[key], value) + else: + merged[key] = value + return merged + + +def _is_within(path: Path, root: Path) -> bool: + try: + path.relative_to(root) + return True + except ValueError: + return False diff --git a/src/markitect_tool/template/__init__.py b/src/markitect_tool/template/__init__.py new file mode 100644 index 0000000..3b39681 --- /dev/null +++ b/src/markitect_tool/template/__init__.py @@ -0,0 +1,19 @@ +"""Deterministic Markdown template rendering.""" + +from markitect_tool.template.engine import ( + MissingTemplateVariable, + TemplateAnalysis, + TemplateError, + TemplateRenderResult, + analyze_template, + render_template, +) + +__all__ = [ + "MissingTemplateVariable", + "TemplateAnalysis", + "TemplateError", + "TemplateRenderResult", + "analyze_template", + "render_template", +] diff --git a/src/markitect_tool/template/engine.py b/src/markitect_tool/template/engine.py new file mode 100644 index 0000000..ec45ac7 --- /dev/null +++ 
b/src/markitect_tool/template/engine.py @@ -0,0 +1,179 @@ +"""Small deterministic template engine for Markdown generation.""" + +from __future__ import annotations + +import re +from dataclasses import asdict, dataclass +from typing import Any + +import yaml + + +class TemplateError(ValueError): + """Raised when a template cannot be parsed or rendered.""" + + +class MissingTemplateVariable(TemplateError): + """Raised when strict rendering cannot resolve a variable.""" + + +_IDENT = r"(?:_|[^\W\d])\w*" +_VARIABLE_RE = re.compile(r"\{\{\s*(?P<name>" + _IDENT + r"(?:\." + _IDENT + r")*)\s*\}\}", re.UNICODE) +_BRACE_RE = re.compile(r"\{\{(?P.*?)\}\}", re.DOTALL) + + +@dataclass(frozen=True) +class TemplateAnalysis: + """Variables and syntax diagnostics for one template.""" + + variables: list[str] + root_variables: list[str] + nested_variables: list[str] + syntax_errors: list[str] + max_nesting_depth: int = 0 + + @property + def total_variables(self) -> int: + return len(self.variables) + + @property + def unique_variables(self) -> int: + return len(set(self.variables)) + + @property + def valid(self) -> bool: + return not self.syntax_errors + + def to_dict(self) -> dict[str, Any]: + data = asdict(self) + data["total_variables"] = self.total_variables + data["unique_variables"] = self.unique_variables + data["valid"] = self.valid + return data + + +@dataclass(frozen=True) +class TemplateRenderResult: + """Rendered Markdown plus trace information.""" + + markdown: str + variables: list[str] + missing_variables: list[str] + strict: bool = True + + @property + def complete(self) -> bool: + return not self.missing_variables + + def to_dict(self) -> dict[str, Any]: + data = asdict(self) + data["complete"] = self.complete + return data + + +def analyze_template(template_text: str) -> TemplateAnalysis: + """Analyze variable usage and syntax in a template.""" + + variables = _unique(_VARIABLE_RE.findall(template_text)) + roots = _unique(variable.split(".", 1)[0] for variable in 
variables) + nested = [variable for variable in variables if "." in variable] + max_depth = max((len(variable.split(".")) for variable in variables), default=0) + return TemplateAnalysis( + variables=variables, + root_variables=roots, + nested_variables=nested, + syntax_errors=_syntax_errors(template_text), + max_nesting_depth=max_depth, + ) + + +def render_template( + template_text: str, + data: dict[str, Any], + *, + strict: bool = True, +) -> TemplateRenderResult: + """Render ``{{variable.path}}`` placeholders with data.""" + + if not isinstance(data, dict): + raise TypeError("Template data must be a mapping") + + analysis = analyze_template(template_text) + if analysis.syntax_errors: + raise TemplateError("; ".join(analysis.syntax_errors)) + + missing: list[str] = [] + + def replace(match: re.Match[str]) -> str: + variable = match.group("name") + value = _resolve_path(data, variable) + if value is _MISSING: + missing.append(variable) + if strict: + raise MissingTemplateVariable(f"Missing template variable `{variable}`") + return match.group(0) + return _format_value(value) + + markdown = _VARIABLE_RE.sub(replace, template_text) + return TemplateRenderResult( + markdown=markdown, + variables=analysis.variables, + missing_variables=_unique(missing), + strict=strict, + ) + + +def _syntax_errors(template_text: str) -> list[str]: + errors: list[str] = [] + opens = template_text.count("{{") + closes = template_text.count("}}") + if opens != closes: + errors.append(f"Unmatched template braces: {opens} opening, {closes} closing") + for match in _BRACE_RE.finditer(template_text): + raw = match.group(0) + if not _VARIABLE_RE.fullmatch(raw): + errors.append(f"Invalid template variable syntax: {raw}") + return errors + + +_MISSING = object() + + +def _resolve_path(data: dict[str, Any], path: str) -> Any: + current: Any = data + for part in path.split("."): + if isinstance(current, dict) and part in current: + current = current[part] + else: + return _MISSING + return 
current + + +def _format_value(value: Any) -> str: + if value is None: + return "" + if isinstance(value, str): + return value + if isinstance(value, bool): + return "true" if value else "false" + if isinstance(value, int | float): + return str(value) + if isinstance(value, list): + if not value: + return "" + return "\n".join(f"- {_format_scalar(item)}" for item in value) + if isinstance(value, dict): + return yaml.safe_dump(value, sort_keys=False).strip() + return str(value) + + +def _format_scalar(value: Any) -> str: + if isinstance(value, str): + return value + if isinstance(value, int | float | bool) or value is None: + return _format_value(value) + return yaml.safe_dump(value, sort_keys=False).strip() + + +def _unique(items) -> list: + return list(dict.fromkeys(items)) diff --git a/tests/test_template_generation.py b/tests/test_template_generation.py new file mode 100644 index 0000000..157db4b --- /dev/null +++ b/tests/test_template_generation.py @@ -0,0 +1,167 @@ +from pathlib import Path + +import pytest +from click.testing import CliRunner + +from markitect_tool.cli import main +from markitect_tool.contract import load_contract_file +from markitect_tool.generation import ( + GenerationHookRequest, + GenerationHookResult, + generate_stub_from_contract, + generate_with_hook, + load_generation_plan_file, + run_generation_plan, +) +from markitect_tool.template import ( + MissingTemplateVariable, + analyze_template, + render_template, +) + + +def test_analyze_template_extracts_nested_and_unicode_variables(): + analysis = analyze_template("Hello {{customer.name}}, café {{café.price}}") + + assert analysis.valid + assert analysis.variables == ["customer.name", "café.price"] + assert analysis.root_variables == ["customer", "café"] + assert analysis.nested_variables == ["customer.name", "café.price"] + assert analysis.max_nesting_depth == 2 + + +def test_analyze_template_reports_invalid_syntax(): + analysis = analyze_template("Valid {{name}} but invalid {{1bad}} 
and {{missing") + + assert not analysis.valid + assert "name" in analysis.variables + assert len(analysis.syntax_errors) == 2 + + +def test_render_template_strict_and_lenient_modes(): + template = "# {{title}}\n\nOwner: {{owner.name}}\n\n{{items}}" + data = {"title": "Plan", "owner": {"name": "Ada"}, "items": ["one", "two"]} + + result = render_template(template, data) + + assert result.complete + assert "# Plan" in result.markdown + assert "Owner: Ada" in result.markdown + assert "- one\n- two" in result.markdown + + with pytest.raises(MissingTemplateVariable): + render_template("{{missing}}", {}, strict=True) + + lenient = render_template("{{missing}}", {}, strict=False) + assert lenient.markdown == "{{missing}}" + assert lenient.missing_variables == ["missing"] + + +def test_generate_stub_from_contract_uses_sections_and_guidance(): + contract = load_contract_file("examples/contracts/adr.contract.md") + + result = generate_stub_from_contract(contract, data={"status": "proposed"}) + + assert "document_type: adr" in result.markdown + assert "status: proposed" in result.markdown + assert "# Architecture Decision Record" in result.markdown + assert "## Context" in result.markdown + assert "TODO: Explain why the decision exists." 
in result.markdown + assert "Deprecated Approach" not in result.markdown + + +def test_generation_plan_renders_and_writes_outputs(tmp_path: Path): + template = tmp_path / "letter.md" + data = tmp_path / "data.yaml" + rules = tmp_path / "rules.md" + template.write_text("# Hello {{person.name}}\n\n{{message}}", encoding="utf-8") + data.write_text("person:\n name: Ada\nmessage: Welcome.\n", encoding="utf-8") + rules.write_text( + """# Generation Rules + +```yaml generation +template: letter.md +data_file: data.yaml +output: out/letter.md +``` +""", + encoding="utf-8", + ) + + plan = load_generation_plan_file(rules) + result = run_generation_plan(plan, base_dir=tmp_path, output_dir=tmp_path) + + assert result.documents[0].markdown == "# Hello Ada\n\nWelcome." + assert (tmp_path / "out" / "letter.md").read_text(encoding="utf-8") == "# Hello Ada\n\nWelcome." + + +def test_generation_hook_boundary_accepts_external_provider(): + class FakeHook: + def generate(self, request: GenerationHookRequest) -> GenerationHookResult: + return GenerationHookResult( + markdown=f"# {request.data['title']}\n\n{request.prompt}", + provider="fake", + ) + + result = generate_with_hook( + GenerationHookRequest(prompt="Draft this deterministically in the test.", data={"title": "Hook"}), + FakeHook(), + ) + + assert result.provider == "fake" + assert result.markdown.startswith("# Hook") + + +def test_mkt_template_render_outputs_markdown(tmp_path: Path): + template = tmp_path / "template.md" + data = tmp_path / "data.json" + template.write_text("# {{title}}\n", encoding="utf-8") + data.write_text('{"title": "Rendered"}', encoding="utf-8") + + result = CliRunner().invoke(main, ["template", "render", str(template), "--data", str(data)]) + + assert result.exit_code == 0 + assert result.output == "# Rendered\n" + + +def test_mkt_template_inspect_outputs_text(tmp_path: Path): + template = tmp_path / "template.md" + template.write_text("# {{title}}\n\n{{owner.name}}", encoding="utf-8") + + result = 
CliRunner().invoke(main, ["template", "inspect", str(template)]) + + assert result.exit_code == 0 + assert "variables: 2" in result.output + assert "owner.name" in result.output + + +def test_mkt_generate_stub_outputs_contract_stub(): + result = CliRunner().invoke( + main, + ["generate", "stub", "--contract", "examples/contracts/adr.contract.md", "--set", "status=accepted"], + ) + + assert result.exit_code == 0 + assert "status: accepted" in result.output + assert "## Decision" in result.output + + +def test_mkt_generate_rules_writes_file(tmp_path: Path): + template = tmp_path / "template.md" + rules = tmp_path / "rules.md" + template.write_text("# {{title}}\n", encoding="utf-8") + rules.write_text( + """```yaml generation +template: template.md +data: + title: From Rules +output: generated.md +```""", + encoding="utf-8", + ) + + result = CliRunner().invoke(main, ["generate", "rules", str(rules), "--output-dir", str(tmp_path)]) + + assert result.exit_code == 0 + assert '"count": 1' in result.output + assert (tmp_path / "generated.md").read_text(encoding="utf-8") == "# From Rules\n" diff --git a/workplans/MKTT-WP-0003-core-toolkit-implementation.md b/workplans/MKTT-WP-0003-core-toolkit-implementation.md index 36b6234..12728f7 100644 --- a/workplans/MKTT-WP-0003-core-toolkit-implementation.md +++ b/workplans/MKTT-WP-0003-core-toolkit-implementation.md @@ -106,7 +106,7 @@ and `mkt transform`, `mkt compose`, and `mkt include`. ```task id: MKTT-WP-0003-T006 -status: todo +status: done priority: medium state_hub_task_id: "307fa072-b1ce-42e8-9309-e2a92e130ae1" ``` @@ -120,6 +120,12 @@ Keep this slice focused on deterministic templates and generation hooks. Rich processors, named chunks, weave/tangle, namespaces, and content-class inheritance are captured in `MKTT-WP-0010` after this hook surface is clear. 
+Initial implementation complete for deterministic `{{field.path}}` templates, +template inspection, strict/lenient rendering, JSON/YAML/CSV data loading, +contract-based stub generation, Markdown/YAML generation plans, provider-neutral +generation hook protocols, docs, examples, tests, `mkt template inspect`, +`mkt template render`, `mkt generate stub`, and `mkt generate rules`. + ## P3.7 - Add caching and incremental processing ```task