# MarkiTect Prompt Dependency Resolution Tutorial

**Example: Generating InfoTech Primers with Full Provenance Tracking**

This tutorial demonstrates how to use MarkiTect's Prompt Dependency Resolution infrastructure to systematically generate content with complete dependency tracking, quality validation, and traceability.

---

## Table of Contents

1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Setup](#setup)
4. [Core Concepts](#core-concepts)
5. [Step-by-Step Walkthrough](#step-by-step-walkthrough)
6. [Advanced Features](#advanced-features)
7. [CLI Usage](#cli-usage)
8. [Best Practices](#best-practices)

---

## Overview

### What This Example Does

This example shows how to generate **InfoTech Primers** (structured reference documents for IT concepts) using a prompt template system with:

- **Artifact Management**: Store and version all inputs (templates, topics, guidelines)
- **Dependency Resolution**: Automatically resolve macro references across information spaces
- **Provenance Tracking**: Trace any generated primer back to its inputs and template
- **Incremental Updates**: Detect when inputs change and regenerate affected primers
- **Quality Validation**: Apply quality gates to ensure output meets standards
- **Visualization**: View dependency graphs in DOT or Mermaid format

### Why Use Prompt Dependency Resolution?
**Before** (manual approach in `prepdr/`):

```markdown
# Template with manual macros
{{topic}}
{{AuthoringRules}}
{{ResearchPrompt}}
```

Problems:

- Manual macro substitution
- No version tracking
- No dependency awareness
- Can't detect when inputs change
- No provenance traceability

**After** (with infrastructure):

```markdown
# Template with resolved dependencies
@{topic}
@{authoring_rules}
@{research_prompt}
```

Benefits:

- Automatic macro resolution
- Content-based change detection (SHA-256 digests)
- Full dependency graph construction
- Incremental recomputation when inputs change
- Complete provenance: artifact → template → inputs → validation
- CLI commands for inspection and debugging

---

## Architecture

### Information Spaces

The system organizes artifacts into **information spaces** (logical namespaces):

```
primer-templates/          # PromptTemplates for generation
├─ generate-primer

primer-topics/             # Topic definitions (ETL, Microservices, OAuth, etc.)
├─ etl
├─ microservices
└─ ...

primer-guidelines/         # Authoring and research guidelines
├─ authoring-rules
└─ research-prompt

generated-primers/         # Output artifacts
├─ etl-primer
├─ microservices-primer
└─ ...
```

### Dependency Graph

When you generate a primer, the system creates a dependency graph:

```mermaid
graph LR
    A[etl topic] -->|requires| B[generate-primer template]
    C[authoring-rules] -->|requires| B
    D[research-prompt] -->|requires| B
    B -->|generates| E[etl-primer output]
```

This graph enables:

- **Impact analysis**: "What primers need regeneration if authoring-rules changes?"
- **Provenance tracing**: "What inputs produced this primer?"
- **Incremental execution**: "Only regenerate affected primers"

---

## Setup

### Prerequisites

```bash
# Ensure MarkiTect is installed
cd /path/to/markitect_project
pip install -e .
```

### Directory Structure

```
examples/content-generator/
├── TUTORIAL.md                 # This file
├── generate_primers.py         # Main example script
├── templates/
│   └── generate-primer.md      # PromptTemplate
├── artifacts/
│   ├── topics/
│   │   ├── etl.md              # Topic: ETL
│   │   └── microservices.md    # Topic: Microservices
│   └── guidelines/
│       ├── authoring-rules.md  # Authoring standards
│       └── research-prompt.md  # Research methodology
└── prepdr/                     # Original manual system (preserved)
    ├── README.md
    ├── ETL.md
    ├── AuthoringRules.md
    ├── AssistentPrompt.md
    └── GeneratePrimerTemplate.md
```

### Running the Example

```bash
cd examples/content-generator
python generate_primers.py
```

Expected output:

```
╔══════════════════════════════════════════════════════════════╗
║  MarkiTect Prompt Dependency Resolution Example              ║
║  InfoTech Primer Generation                                  ║
╚══════════════════════════════════════════════════════════════╝

=== Loading Artifacts ===
✓ Created artifact: generate-primer (digest: a7f3e2b1)
✓ Created artifact: etl (digest: 9c4d6e8a)
✓ Created artifact: microservices (digest: 5b2f1c9d)
✓ Created artifact: authoring-rules (digest: 3e7a9f2c)
✓ Created artifact: research-prompt (digest: 8d1b4e6f)

=== Generating Primer: etl ===
✓ Template created with 3 macro dependencies
✓ Resolved 3 macros
✓ Compiled prompt (digest: 4c9e2a7b)
✓ Persisted 3 dependency edges
✓ Generated primer: etl-primer

=== Provenance Trace ===
Artifact: abc-123-def-456
Producing Run: run-etl-001
Input Artifacts: 3
Dependency Chain: 5 artifacts

✓ Primer generation complete!
```

---

## Core Concepts

### 1. Artifacts

**Artifacts** are versioned content units with content-based addressing.

```python
from markitect.prompts.models import Artifact, ArtifactType

# Create an artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=topic_content,
    artifact_type=ArtifactType.CONTENT,
)

# Automatic SHA-256 digest generation
print(artifact.content_digest)  # "9c4d6e8a..."
```

**Key features:**

- **Content digest**: SHA-256 hash for change detection
- **Space isolation**: Artifacts in different spaces can have the same names
- **Type classification**: CONTENT, TEMPLATE, GENERATED, SCHEMA, CONFIG

### 2. PromptTemplates

**PromptTemplates** are artifacts with macro references.

```markdown
---
id: generate-primer-v1
artifact_type: template
---

# Generate Primer

Topic: @{topic}
Guidelines: @{authoring_rules}
```

**Macro syntax:**

- `@{macro_name}` - Resolved to artifact content
- Resolution happens at execution time
- Macros can reference artifacts in any information space

### 3. Resolution Strategy

**Resolution** finds artifacts to substitute for macros.

```python
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)
```

**Strategies:**

- `FIRST_MATCH`: Use first artifact found
- `LATEST_VERSION`: Use newest version (if artifacts have versions)
- `EXPLICIT_ONLY`: Require explicit space qualification

### 4. Dependency Tracking

**Dependency edges** are automatically created during resolution.

```python
# Edge types
EdgeType.REQUIRES   # Input dependency (template → topic)
EdgeType.GENERATES  # Output relationship (run → primer)
EdgeType.INCLUDES   # Composition (nested templates)
```

**Graph operations:**

```python
# Find all artifacts that depend on authoring-rules
dependents = query_service.find_transitive_dependents("authoring-rules-id")

# Find all inputs needed to regenerate a primer
dependencies = query_service.find_transitive_dependencies("etl-primer-id")

# Detect circular dependencies
cycles = query_service.detect_circular_dependencies()
```

### 5. Traceability

**ProvenanceTrace** captures complete lineage.
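Conceptually, lineage can be recovered by walking dependency edges backwards from an output. The sketch below is independent of MarkiTect's API and only illustrates the idea: it assumes edges are plain `(source_id, target_id, edge_type)` tuples shaped like the walkthrough's edge table, and `trace_lineage` is a hypothetical helper name.

```python
def trace_lineage(artifact_id, edges):
    """Walk edges backwards from an artifact to find its run and inputs.

    `edges` is an iterable of (source_id, target_id, edge_type) tuples.
    Returns (producing_run, input_ids); producing_run is None when no
    "generates" edge points at the artifact.
    """
    edges = list(edges)
    # The producing run is the source of a "generates" edge into the artifact.
    producing_run = next(
        (src for src, dst, kind in edges
         if dst == artifact_id and kind == "generates"),
        None,
    )
    if producing_run is None:
        return None, []
    # The run's inputs are the sources of "requires" edges into that run.
    input_ids = sorted(
        src for src, dst, kind in edges
        if dst == producing_run and kind == "requires"
    )
    return producing_run, input_ids


# Hypothetical IDs mirroring the ETL walkthrough.
edges = [
    ("etl-id", "run-etl-001", "requires"),
    ("authoring-rules-id", "run-etl-001", "requires"),
    ("research-prompt-id", "run-etl-001", "requires"),
    ("run-etl-001", "etl-primer-id", "generates"),
]
run_id, inputs = trace_lineage("etl-primer-id", edges)
print(run_id)
print(inputs)
```

In MarkiTect itself, this information is obtained from the trace service: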
```python
trace = trace_service.trace_artifact(artifact_id)

print(trace.producing_run)       # Run that generated this
print(trace.template)            # Template used
print(trace.input_artifacts)     # All input dependencies
print(trace.validation_results)  # Quality gate results
print(trace.impact_debt)         # Suppressed recomputations
```

---

## Step-by-Step Walkthrough

### Step 1: Initialize Repositories

```python
from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository
from markitect.prompts.dependencies.repository import SQLiteDependencyRepository

artifact_repo = SQLiteArtifactRepository("primers.db")
dep_repo = SQLiteDependencyRepository("primers.db")
```

**What this does:**

- Creates a SQLite database with artifact and dependency tables
- Artifact table: id, space_id, name, content_digest, metadata
- Dependency table: source_id, target_id, edge_type, run_id

### Step 2: Load Artifacts

```python
from pathlib import Path

from markitect.prompts.models import Artifact, ArtifactType

# Read artifact file
content = Path("artifacts/topics/etl.md").read_text()

# Create artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=content,
    artifact_type=ArtifactType.CONTENT,
)

# Store in repository
artifact = artifact_repo.create(artifact)
```

**Content-based addressing:**

```python
# If you modify the content
updated_content = content + "\n\n**New section added**"
artifact.update_content(updated_content)

# Digest changes automatically
print(artifact.content_digest)  # Different hash!
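
# --- Illustration only (an assumption, not MarkiTect's implementation) ---
# A minimal sketch of why the digest changes, assuming the digest is a
# SHA-256 hash of the UTF-8 encoded content. `sha256_digest` is a
# hypothetical helper, not part of the MarkiTect API.
import hashlib


def sha256_digest(text: str) -> str:
    """Return the hex SHA-256 digest of a string's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "ETL: extract, transform, load."
edited = original + "\n\n**New section added**"

# Identical content always hashes to the same digest; any edit changes it.
print(sha256_digest(original) == sha256_digest(original))  # True
print(sha256_digest(original) == sha256_digest(edited))    # False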
```

### Step 3: Create PromptTemplate

```python
from pathlib import Path

from markitect.prompts.templates.models import PromptTemplate, MacroReference

# Template text, read from templates/generate-primer.md
template_content = Path("templates/generate-primer.md").read_text()

template = PromptTemplate.create(
    id="generate-primer-v1",
    name="generate-primer",
    content=template_content,
    space_id="primer-templates",
)

# Add macro dependencies
template.add_macro(MacroReference(
    name="topic",
    source_space="primer-topics"
))
template.add_macro(MacroReference(
    name="authoring_rules",
    source_space="primer-guidelines"
))
# The template also references @{research_prompt}, so register it too
template.add_macro(MacroReference(
    name="research_prompt",
    source_space="primer-guidelines"
))
```

**Template content** (`templates/generate-primer.md`):

```markdown
# Generate InfoTech Primer

## Topic
@{topic}

## Guidelines
@{authoring_rules}

## Research Protocol
@{research_prompt}

Generate a complete primer following the authoring rules.
```

### Step 4: Resolve Dependencies

```python
from markitect.prompts.resolver.resolver import PromptResolver
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

resolver = PromptResolver(artifact_repo)

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)

resolution_result = resolver.resolve_template(template, config)

if resolution_result.success:
    for resolved in resolution_result.context.resolved_macros:
        print(f"{resolved.macro_name} → {resolved.artifact.name}")
else:
    print("Resolution failed:", resolution_result.context.errors)
```

**Resolution algorithm:**

1. Parse template to extract `@{macro_name}` references
2. For each macro:
   - Search configured spaces in order
   - Match by name (case-sensitive)
   - Return first match (FIRST_MATCH strategy)
3.
   Build ResolutionResult with all resolved artifacts

### Step 5: Compile Prompt

```python
from markitect.prompts.resolver.compiler import ContextCompiler

compiler = ContextCompiler()
compiled = compiler.compile(template, template_content, resolution_result)

print(compiled.content)             # Fully expanded prompt
print(compiled.content_digest)      # Hash for caching
print(compiled.dependency_digests)  # Map of macro → artifact digest
```

**Compiled output:**

```markdown
# Generate InfoTech Primer

## Topic
A three-phase computing process where data is extracted from source systems, transformed (including validation, cleaning, enrichment, and aggregation), and loaded into a target data store or data warehouse.
...

## Guidelines
[Full authoring-rules content]
...

## Research Protocol
[Full research-prompt content]
...
```

### Step 6: Track Dependencies

```python
from markitect.prompts.execution.manifest import RunManifest
from markitect.prompts.dependencies.graph import GraphBuilder

# Create run manifest
manifest = RunManifest.create(
    run_id="run-etl-001",
    template_id=template.id,
    template_name=template.name,
    template_digest=template.content_digest,
)

# Add resolved inputs
for resolved in resolution_result.context.resolved_macros:
    manifest.add_resolved_input(
        name=resolved.macro_name,
        artifact_id=resolved.artifact.id,
        space_id=resolved.space_id,
        digest=resolved.artifact.content_digest,
    )

    # Create one dependency edge per resolved input
    manifest.add_dependency_edge(
        source_id=resolved.artifact.id,
        target_id="run-etl-001",
        edge_type="requires",
    )

# Persist to database
builder = GraphBuilder(dep_repo)
edges = builder.persist_edges(manifest)
```

**Result:** Dependency edges stored in database:

```
source_artifact_id  | target_artifact_id | edge_type | run_id
--------------------|--------------------|-----------|------------
etl-id              | run-etl-001        | requires  | run-etl-001
authoring-rules-id  | run-etl-001        | requires  | run-etl-001
research-prompt-id  | run-etl-001        | requires  | run-etl-001
```

### Step 7: Generate Output

```python
# In real usage, this would call an LLM API
# For demo, we create a mock output
output_content = """
# ETL Primer

## Definition
ETL (Extract, Transform, Load) is a data integration pattern...

[Generated content]
"""

output_artifact = Artifact.create(
    space_id="generated-primers",
    name="etl-primer",
    content=output_content,
    artifact_type=ArtifactType.GENERATED,
)
output_artifact = artifact_repo.create(output_artifact)

# Add to manifest
manifest.add_output_artifact(
    artifact_id=output_artifact.id,
    name=output_artifact.name,
    digest=output_artifact.content_digest,
    artifact_type="generated",
)
manifest.add_dependency_edge(
    source_id="run-etl-001",
    target_id=output_artifact.id,
    edge_type="generates",
)

# Persist output edges
builder.persist_edges(manifest)
```

### Step 8: Trace Provenance

```python
from markitect.prompts.traceability.service import TraceabilityService

trace_service = TraceabilityService(artifact_repo, dep_repo, db_path="primers.db")

# Trace the generated primer
trace = trace_service.trace_artifact(output_artifact.id)

# Inspect provenance
print("Template:", trace.template.name if trace.template else "None")
print("Producing run:", trace.producing_run.run_id if trace.producing_run else "None")

print("Input artifacts:")
for inp in trace.input_artifacts:
    print(f"  - {inp.name} ({inp.artifact_type})")

print("Dependency chain:")
for dep_id in trace.dependency_chain:
    artifact = artifact_repo.get_by_id(dep_id)
    print(f"  - {artifact.name if artifact else dep_id}")
```

---

## Advanced Features

### Incremental Recomputation

When an input changes, the system can automatically detect the affected outputs:

```python
from markitect.prompts.incremental.detector import ChangeDetector
from markitect.prompts.incremental.engine import IncrementalExecutionEngine
from markitect.prompts.incremental.models import RecomputeConfig

# Detect change
detector = ChangeDetector("primers.db")
authoring_rules = artifact_repo.get_by_name("primer-guidelines", "authoring-rules")

# User updates
# the file
new_content = Path("artifacts/guidelines/authoring-rules.md").read_text()
change = detector.detect_change(authoring_rules, new_content)

if change:
    detector.record_change(change)

    # Find affected primers
    # (query_service is a DependencyQueryService over dep_repo,
    # constructed as in the Visualization section)
    engine = IncrementalExecutionEngine("primers.db", query_service)
    result = engine.recompute(
        change,
        config=RecomputeConfig(max_depth=2, impact_threshold=0.1),
        old_content=authoring_rules.content,
        new_content=new_content,
    )

    print(f"Total dependents: {result.total_dependents}")
    print(f"Recomputed: {result.recomputed_count}")
    print(f"Suppressed: {result.suppressed_count}")
```

**Recomputation strategies:**

- **max_depth**: Traverse the dependency graph at most N levels deep
- **impact_threshold**: Only recompute if the change magnitude exceeds the threshold
- **max_recomputes**: Budget limit to prevent runaway execution

### Quality Validation

Apply quality gates to generated primers:

```python
from markitect.prompts.quality.validator import QualityValidator
from markitect.prompts.quality.gates.pattern_gate import PatternValidationGate

# Create validation gate
gate = PatternValidationGate(
    required_patterns=[
        r"## Definition",
        r"## Context",
        r"## Core Concepts",
        r"## Scope and Non-Scope",
    ],
    forbidden_patterns=[
        r"TODO",
        r"FIXME",
    ],
    gate_id="primer-structure-check",
    name="Primer Structure Validator",
)

validator = QualityValidator(gates=[gate], db_path="primers.db")

# Validate output
results = validator.validate_artifact(
    content=output_content,
    artifact_id=output_artifact.id,
    run_id="run-etl-001",
)

if validator.all_passed(results):
    print("✓ All quality gates passed")
else:
    failed = validator.get_failed_gates(results)
    for result in failed:
        print(f"✗ {result.gate_id} failed")
        for diag in result.diagnostics:
            print(f"  {diag.message}")
```

### Visualization

Generate dependency graphs:

```python
from markitect.prompts.visualization.graph import GraphExporter
from markitect.prompts.dependencies.queries import DependencyQueryService

query_service = DependencyQueryService(dep_repo)

# Find all
# related artifacts
deps = query_service.find_transitive_dependencies(output_artifact.id)
dependents = query_service.find_transitive_dependents(output_artifact.id)
all_ids = deps | dependents | {output_artifact.id}

# Build graph
builder = GraphBuilder(dep_repo)
graph = builder.build_graph(all_ids)

# Export to Mermaid
mermaid = GraphExporter.to_mermaid(graph, "Primer Dependencies")
Path("dependencies.mermaid").write_text(mermaid)

# Export to DOT (Graphviz)
dot = GraphExporter.to_dot(graph, "Primer Dependencies")
Path("dependencies.dot").write_text(dot)
```

**Mermaid output:**

```mermaid
%%{ title: Primer Dependencies }%%
graph LR
    etl-id-->|requires|run-etl-001
    authoring-rules-id-->|requires|run-etl-001
    research-prompt-id-->|requires|run-etl-001
    run-etl-001-.->|generates|etl-primer-id
```

---

## CLI Usage

The Prompt Dependency Resolution infrastructure includes CLI commands:

### Trace Provenance

```bash
markitect prompt trace --database primers.db
```

Output (JSON):

```json
{
  "artifact_id": "abc-123-def-456",
  "producing_run": {
    "run_id": "run-etl-001",
    "template_id": "generate-primer-v1",
    "status": "success"
  },
  "input_artifacts": [
    {
      "artifact_id": "...",
      "name": "etl",
      "role": "input"
    }
  ],
  "dependency_chain": ["...", "..."]
}
```

### Visualize Graph

```bash
markitect prompt graph --format mermaid --database primers.db
```

### List Runs

```bash
# All runs
markitect prompt runs --database primers.db

# Filter by template
markitect prompt runs --template generate-primer-v1 --database primers.db

# Filter by status
markitect prompt runs --status success --limit 10 --database primers.db
```

### Show Impact Debt

```bash
# All stale artifacts
markitect prompt debt --database primers.db

# Specific artifact
markitect prompt debt --artifact authoring-rules-id --database primers.db
```

### Graph Statistics

```bash
markitect prompt stats --database primers.db
```

Output:

```json
{
  "total_nodes": 12,
  "total_edges": 18,
  "root_count": 3,
  "leaf_count": 2,
  "has_cycles": false
}
```

---
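The fields reported under Graph Statistics can be reasoned about with a small standalone sketch. It does not use MarkiTect's API: it assumes edges are plain `(source, target)` pairs, that a root is a node with no incoming edges, and that a leaf is a node with no outgoing edges (the CLI's own definitions may differ). The function name `graph_stats` is hypothetical.

```python
from collections import defaultdict


def graph_stats(edges):
    """Summarize a directed graph given as (source, target) pairs.

    Assumption: a root has no incoming edges and a leaf has no outgoing
    edges; MarkiTect's own definitions may differ.
    """
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    outgoing = defaultdict(set)
    indegree = {n: 0 for n in nodes}
    for src, dst in edge_set:
        outgoing[src].add(dst)
        indegree[dst] += 1

    roots = [n for n in nodes if indegree[n] == 0]
    leaves = [n for n in nodes if not outgoing[n]]

    # Kahn's algorithm: repeatedly remove nodes with no remaining incoming
    # edges. If some node is never removed, the graph contains a cycle.
    remaining = dict(indegree)
    queue = list(roots)
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for nxt in outgoing[node]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                queue.append(nxt)

    return {
        "total_nodes": len(nodes),
        "total_edges": len(edge_set),
        "root_count": len(roots),
        "leaf_count": len(leaves),
        "has_cycles": removed != len(nodes),
    }


# Edges from the ETL walkthrough: three inputs feed one run,
# and the run generates one primer.
walkthrough_edges = [
    ("etl-id", "run-etl-001"),
    ("authoring-rules-id", "run-etl-001"),
    ("research-prompt-id", "run-etl-001"),
    ("run-etl-001", "etl-primer-id"),
]
stats = graph_stats(walkthrough_edges)
print(stats)
```

For the single ETL run this yields five nodes, four edges, three roots (the inputs), one leaf (the primer), and no cycles. The cycle check relies on a topological ordering existing only for acyclic graphs.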
## Best Practices

### 1. Organize Artifacts by Space

```
Clear separation of concerns:
- templates/   ← Reusable PromptTemplates
- topics/      ← Domain-specific content
- guidelines/  ← Standards and methodologies
- output/      ← Generated artifacts
```

### 2. Use Content Digests for Change Detection

```python
# Don't compare content strings
if old_content != new_content:        # ✗ Inefficient
    ...

# Do compare digests
if artifact.has_changed(new_digest):  # ✓ Fast, hash-based
    ...
```

### 3. Apply Quality Gates

```python
# Define quality standards as code
gates = [
    PatternValidationGate(required_patterns=[...]),
    SchemaValidationGate(schema={...}),
]

# Fail fast if quality checks fail
if not validator.all_passed(results):
    raise QualityError("Output does not meet standards")
```

### 4. Track All Dependencies

```python
# Always persist dependency edges
manifest.add_dependency_edge(source, target, edge_type)
builder.persist_edges(manifest)

# This enables:
# - Impact analysis
# - Incremental recomputation
# - Provenance tracing
```

### 5. Use Incremental Execution

```python
# Don't regenerate everything on every change
config = RecomputeConfig(
    max_depth=2,           # Limit blast radius
    impact_threshold=0.1,  # Skip minor changes
    max_recomputes=10,     # Budget limit
)
```

### 6. Version Your Templates

```yaml
# Include version in template metadata
---
id: generate-primer-v1
version: 1.0.0
---

# When template changes significantly, create v2
---
id: generate-primer-v2
version: 2.0.0
---
```

### 7. Leverage Traceability

```python
# Use provenance traces for debugging
trace = trace_service.trace_artifact(failed_output_id)

print("Inputs used:")
for inp in trace.input_artifacts:
    print(f"  {inp.name} @ {inp.content_digest[:8]}")

# This helps identify which input caused the issue
```

---

## Comparison with Original System

### Original (`prepdr/`)

**GeneratePrimerTemplate.md:**

```markdown
{{ETL}}
{{AuthoringRules}}
```

**Process:**

1. Manually copy-paste content
2. Replace `{{...}}` markers by hand
3.
   Run through LLM
4. No record of what inputs were used
5. No change detection

**Limitations:**

- No automation
- No version control on inputs
- Can't regenerate from history
- No impact analysis when guidelines change

### With Infrastructure

**templates/generate-primer.md:**

```markdown
@{topic}
@{authoring_rules}
```

**Process:**

1. Define artifacts once
2. Create template with `@{...}` macros
3. Run resolver → compiler → executor
4. Full dependency graph persisted
5. Complete provenance trace available

**Benefits:**

- Fully automated resolution
- Content-based change detection (SHA-256)
- Reproducible: "same inputs → same output"
- Impact analysis: "what needs regeneration?"
- Traceability: "how was this generated?"
- Quality validation: automated checks
- Visualization: see dependency relationships

---

## Next Steps

1. **Extend the example:**
   - Add more topics (OAuth, Docker, Kubernetes)
   - Create topic-specific quality gates
   - Implement actual LLM integration

2. **Build a workflow:**
   - Git hooks to detect artifact changes
   - CI/CD pipeline to regenerate affected primers
   - Dashboard to show primer freshness

3. **Add advanced features:**
   - Version conflict resolution
   - A/B testing different templates
   - Batch generation with parallelization

4. **Integrate with MarkiTect:**
   - Use MarkiTect ingestion for artifact storage
   - Query relationships with relational metadata
   - Generate documentation sites from primers

---

## References

- [Prompt Dependency Resolution Specification](../../roadmap/prompt-dependency-resolution/)
- [MarkiTect Documentation](../../README.md)
- [Phase 8 Implementation](../../markitect/prompts/)

---

**Questions or feedback?** File an issue or reach out to the maintainers.