MarkiTect Prompt Dependency Resolution Tutorial
Example: Generating InfoTech Primers with Full Provenance Tracking
This tutorial demonstrates how to use MarkiTect's Prompt Dependency Resolution infrastructure to systematically generate content with complete dependency tracking, quality validation, and traceability.
Table of Contents
- Overview
- Architecture
- Setup
- Core Concepts
- Step-by-Step Walkthrough
- Advanced Features
- CLI Usage
- Best Practices
Overview
What This Example Does
This example shows how to generate InfoTech Primers (structured reference documents for IT concepts) using a prompt template system with:
- Artifact Management: Store and version all inputs (templates, topics, guidelines)
- Dependency Resolution: Automatically resolve macro references across information spaces
- Provenance Tracking: Trace any generated primer back to its inputs and template
- Incremental Updates: Detect when inputs change and regenerate affected primers
- Quality Validation: Apply quality gates to ensure output meets standards
- Visualization: View dependency graphs in DOT or Mermaid format
Why Use Prompt Dependency Resolution?
Before (manual approach in prepdr/):
# Template with manual macros
{{topic}}
{{AuthoringRules}}
{{ResearchPrompt}}
Problems:
- Manual macro substitution
- No version tracking
- No dependency awareness
- Can't detect when inputs change
- No provenance traceability
After (with infrastructure):
# Template with resolved dependencies
@{topic}
@{authoring_rules}
@{research_prompt}
Benefits:
- Automatic macro resolution
- Content-based change detection (SHA-256 digests)
- Full dependency graph construction
- Incremental recomputation when inputs change
- Complete provenance: artifact → template → inputs → validation
- CLI commands for inspection and debugging
Architecture
Information Spaces
The system organizes artifacts into information spaces (logical namespaces):
primer-templates/ # PromptTemplates for generation
├─ generate-primer
primer-topics/ # Topic definitions (ETL, Microservices, OAuth, etc.)
├─ etl
├─ microservices
└─ ...
primer-guidelines/ # Authoring and research guidelines
├─ authoring-rules
└─ research-prompt
generated-primers/ # Output artifacts
├─ etl-primer
├─ microservices-primer
└─ ...
Dependency Graph
When you generate a primer, the system creates a dependency graph:
graph LR
A[etl topic] -->|requires| B[generate-primer template]
C[authoring-rules] -->|requires| B
D[research-prompt] -->|requires| B
B -->|generates| E[etl-primer output]
This graph enables:
- Impact analysis: "What primers need regeneration if authoring-rules changes?"
- Provenance tracing: "What inputs produced this primer?"
- Incremental execution: "Only regenerate affected primers"
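The impact-analysis question above comes down to a forward traversal of the dependency graph. The following is a minimal, library-independent sketch: the edge list and the `transitive_dependents` helper are hypothetical illustrations, not MarkiTect's own implementation.

```python
from collections import deque

# Hypothetical edge list: (source, target) means "target depends on source".
EDGES = [
    ("etl", "run-etl-001"),
    ("authoring-rules", "run-etl-001"),
    ("research-prompt", "run-etl-001"),
    ("run-etl-001", "etl-primer"),
    ("microservices", "run-ms-001"),
    ("authoring-rules", "run-ms-001"),
    ("run-ms-001", "microservices-primer"),
]


def transitive_dependents(edges, start):
    """Return every node reachable from `start` by following edges forward."""
    forward = {}
    for source, target in edges:
        forward.setdefault(source, set()).add(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in forward.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


affected = transitive_dependents(EDGES, "authoring-rules")
print(sorted(affected))
# ['etl-primer', 'microservices-primer', 'run-etl-001', 'run-ms-001']
```

Changing `authoring-rules` touches both primers, while changing `etl` only touches `etl-primer`.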
Setup
Prerequisites
# Ensure MarkiTect is installed
cd /path/to/markitect-main
pip install -e .
Directory Structure
examples/content-generator/
├── TUTORIAL.md # This file
├── generate_primers.py # Main example script
├── templates/
│ └── generate-primer.md # PromptTemplate
├── artifacts/
│ ├── topics/
│ │ ├── etl.md # Topic: ETL
│ │ └── microservices.md # Topic: Microservices
│ └── guidelines/
│ ├── authoring-rules.md # Authoring standards
│ └── research-prompt.md # Research methodology
└── prepdr/ # Original manual system (preserved)
├── README.md
├── ETL.md
├── AuthoringRules.md
├── AssistentPrompt.md
└── GeneratePrimerTemplate.md
Running the Example
cd examples/content-generator
python generate_primers.py
Expected output:
╔══════════════════════════════════════════════════════════════╗
║ MarkiTect Prompt Dependency Resolution Example ║
║ InfoTech Primer Generation ║
╚══════════════════════════════════════════════════════════════╝
=== Loading Artifacts ===
✓ Created artifact: generate-primer (digest: a7f3e2b1)
✓ Created artifact: etl (digest: 9c4d6e8a)
✓ Created artifact: microservices (digest: 5b2f1c9d)
✓ Created artifact: authoring-rules (digest: 3e7a9f2c)
✓ Created artifact: research-prompt (digest: 8d1b4e6f)
=== Generating Primer: etl ===
✓ Template created with 3 macro dependencies
✓ Resolved 3 macros
✓ Compiled prompt (digest: 4c9e2a7b)
✓ Persisted 3 dependency edges
✓ Generated primer: etl-primer
=== Provenance Trace ===
Artifact: abc-123-def-456
Producing Run: run-etl-001
Input Artifacts: 3
Dependency Chain: 5 artifacts
✓ Primer generation complete!
Core Concepts
1. Artifacts
Artifacts are versioned content units with content-based addressing.
from markitect.prompts.models import Artifact, ArtifactType
# Create an artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=topic_content,
    artifact_type=ArtifactType.CONTENT,
)
# Automatic SHA-256 digest generation
print(artifact.content_digest) # "9c4d6e8a..."
Key features:
- Content digest: SHA-256 hash for change detection
- Space isolation: Artifacts in different spaces can share the same name
- Type classification: CONTENT, TEMPLATE, GENERATED, SCHEMA, CONFIG
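To make content-based addressing concrete, here is a small standalone sketch using only Python's `hashlib`. The `content_digest` function is a hypothetical stand-in for whatever `Artifact.create` computes internally; the tutorial only states that the digest is SHA-256.

```python
import hashlib


def content_digest(content: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


original = "ETL: extract, transform, load."
edited = original + "\n\n**New section added**"

# Identical content always yields the same digest.
print(content_digest(original) == content_digest(original))  # True
# Any edit, however small, yields a different digest.
print(content_digest(original) == content_digest(edited))    # False
# Short prefixes (as in the example output above) are only for display.
print(content_digest(original)[:8])
```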
2. PromptTemplates
PromptTemplates are artifacts with macro references.
---
id: generate-primer-v1
artifact_type: template
---
# Generate Primer
Topic: @{topic}
Guidelines: @{authoring_rules}
Macro syntax:
- @{macro_name}: Resolved to artifact content
- Resolution happens at execution time
- Macros can reference artifacts in any information space
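As an illustration of the macro syntax, the sketch below extracts `@{...}` references with a regular expression. The exact grammar MarkiTect accepts is not specified here, so the identifier pattern and the `extract_macros` helper are assumptions.

```python
import re

# Assumed grammar: "@{" + Python-style identifier + "}".
MACRO_PATTERN = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def extract_macros(template_text: str) -> list[str]:
    """Return macro names in order of first appearance, without duplicates."""
    names = []
    for match in MACRO_PATTERN.finditer(template_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names


template_text = """# Generate Primer
Topic: @{topic}
Guidelines: @{authoring_rules}
Topic again: @{topic}
"""
print(extract_macros(template_text))  # ['topic', 'authoring_rules']
```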
3. Resolution Strategy
Resolution finds artifacts to substitute for macros.
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy
config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)
Strategies:
- FIRST_MATCH: Use first artifact found
- LATEST_VERSION: Use newest version (if artifacts have versions)
- EXPLICIT_ONLY: Require explicit space qualification
4. Dependency Tracking
Dependency edges are automatically created during resolution.
# Edge types
EdgeType.REQUIRES # Input dependency (template → topic)
EdgeType.GENERATES # Output relationship (run → primer)
EdgeType.INCLUDES # Composition (nested templates)
Graph operations:
# Find all artifacts that depend on authoring-rules
dependents = query_service.find_transitive_dependents("authoring-rules-id")
# Find all inputs needed to regenerate a primer
dependencies = query_service.find_transitive_dependencies("etl-primer-id")
# Detect circular dependencies
cycles = query_service.detect_circular_dependencies()
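For intuition about what circular-dependency detection involves, here is a standalone depth-first search sketch. It is not the implementation behind `detect_circular_dependencies()`; the `find_cycle` helper and its return shape are hypothetical.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: nxt is already on the current path.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle([("a", "b"), ("b", "c")]))              # None
print(find_cycle([("a", "b"), ("b", "c"), ("c", "a")]))  # ['a', 'b', 'c', 'a']
```

The recursive form keeps the sketch short; a very deep graph would need an explicit stack to stay under Python's recursion limit.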
5. Traceability
ProvenanceTrace captures complete lineage.
trace = trace_service.trace_artifact(artifact_id)
print(trace.producing_run) # Run that generated this
print(trace.template) # Template used
print(trace.input_artifacts) # All input dependencies
print(trace.validation_results) # Quality gate results
print(trace.impact_debt) # Suppressed recomputations
Step-by-Step Walkthrough
Step 1: Initialize Repositories
from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository
from markitect.prompts.dependencies.repository import SQLiteDependencyRepository
artifact_repo = SQLiteArtifactRepository("primers.db")
dep_repo = SQLiteDependencyRepository("primers.db")
What this does:
- Creates SQLite database with artifact and dependency tables
- Artifact table: id, space_id, name, content_digest, metadata
- Dependency table: source_id, target_id, edge_type, run_id
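The two tables can be pictured with plain `sqlite3`. The column names below come from the list above, but the table names, column types, and constraints are assumptions; the actual schema created by the repositories may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
conn.executescript(
    """
    CREATE TABLE artifacts (
        id TEXT PRIMARY KEY,
        space_id TEXT NOT NULL,
        name TEXT NOT NULL,
        content_digest TEXT NOT NULL,
        metadata TEXT
    );
    CREATE TABLE dependencies (
        source_id TEXT NOT NULL,
        target_id TEXT NOT NULL,
        edge_type TEXT NOT NULL,
        run_id TEXT
    );
    """
)
conn.execute(
    "INSERT INTO artifacts VALUES (?, ?, ?, ?, ?)",
    ("etl-id", "primer-topics", "etl", "9c4d6e8a", "{}"),
)
conn.execute(
    "INSERT INTO dependencies VALUES (?, ?, ?, ?)",
    ("etl-id", "run-etl-001", "requires", "run-etl-001"),
)

# Which named artifacts fed a given run?
rows = conn.execute(
    "SELECT a.name, d.edge_type FROM dependencies d "
    "JOIN artifacts a ON a.id = d.source_id WHERE d.target_id = ?",
    ("run-etl-001",),
).fetchall()
print(rows)  # [('etl', 'requires')]
```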
Step 2: Load Artifacts
from pathlib import Path
# Read artifact file
content = Path("artifacts/topics/etl.md").read_text()
# Create artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=content,
    artifact_type=ArtifactType.CONTENT,
)
# Store in repository
artifact = artifact_repo.create(artifact)
Content-based addressing:
# If you modify the content
updated_content = content + "\n\n**New section added**"
artifact.update_content(updated_content)
# Digest changes automatically
print(artifact.content_digest) # Different hash!
Step 3: Create PromptTemplate
from markitect.prompts.templates.models import PromptTemplate, MacroReference
template = PromptTemplate.create(
    id="generate-primer-v1",
    name="generate-primer",
    content=template_content,
    space_id="primer-templates",
)
# Add macro dependencies (one per @{...} reference in the template)
template.add_macro(MacroReference(
    name="topic",
    source_space="primer-topics"
))
template.add_macro(MacroReference(
    name="authoring_rules",
    source_space="primer-guidelines"
))
template.add_macro(MacroReference(
    name="research_prompt",
    source_space="primer-guidelines"
))
Template content (templates/generate-primer.md):
# Generate InfoTech Primer
## Topic
@{topic}
## Guidelines
@{authoring_rules}
## Research Protocol
@{research_prompt}
Generate a complete primer following the authoring rules.
Step 4: Resolve Dependencies
from markitect.prompts.resolver.resolver import PromptResolver
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy
resolver = PromptResolver(artifact_repo)
config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)
resolution_result = resolver.resolve_template(template, config)
if resolution_result.success:
    for resolved in resolution_result.context.resolved_macros:
        print(f"{resolved.macro_name} → {resolved.artifact.name}")
else:
    print("Resolution failed:", resolution_result.context.errors)
Resolution algorithm:
- Parse template to extract @{macro_name} references
- For each macro:
  - Search configured spaces in order
  - Match by name (case-sensitive)
  - Return first match (FIRST_MATCH strategy)
- Build ResolutionResult with all resolved artifacts
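The steps above can be sketched in a few lines of standalone Python. This is a simplified illustration, not `PromptResolver` itself: the `SPACES` dictionary stands in for the artifact repository, artifact names are assumed to equal macro names, and `resolve_first_match` is a hypothetical helper.

```python
import re

MACRO_PATTERN = re.compile(r"@\{(\w+)\}")

# Hypothetical in-memory stand-in for the repository: space -> name -> content.
SPACES = {
    "primer-topics": {"topic": "ETL topic text"},
    "primer-guidelines": {
        "authoring_rules": "Authoring rules text",
        "topic": "Shadowed entry that a first-match search never reaches",
    },
}


def resolve_first_match(template_text, spaces, search_order):
    """Return (resolved, errors); resolved maps macro -> (space_id, content)."""
    resolved, errors = {}, []
    # dict.fromkeys removes duplicate macro names while keeping their order.
    for name in dict.fromkeys(MACRO_PATTERN.findall(template_text)):
        for space_id in search_order:
            if name in spaces.get(space_id, {}):  # case-sensitive name match
                resolved[name] = (space_id, spaces[space_id][name])
                break  # first match wins
        else:
            errors.append(f"unresolved macro: {name}")
    return resolved, errors


resolved, errors = resolve_first_match(
    "@{topic} @{authoring_rules} @{research_prompt}",
    SPACES,
    ["primer-topics", "primer-guidelines"],
)
print(resolved["topic"][0])  # primer-topics
print(errors)                # ['unresolved macro: research_prompt']
```

Because spaces are searched in the configured order, reordering `search_order` changes which `topic` entry wins.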
Step 5: Compile Prompt
from markitect.prompts.resolver.compiler import ContextCompiler
compiler = ContextCompiler()
compiled = compiler.compile(template, template_content, resolution_result)
print(compiled.content) # Fully expanded prompt
print(compiled.content_digest) # Hash for caching
print(compiled.dependency_digests) # Map of macro → artifact digest
Compiled output:
# Generate InfoTech Primer
## Topic
A three-phase computing process where data is extracted from source systems,
transformed (including validation, cleaning, enrichment, and aggregation),
and loaded into a target data store or data warehouse.
...
## Guidelines
[Full authoring-rules content]
...
## Research Protocol
[Full research-prompt content]
...
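Conceptually, compilation is macro substitution followed by hashing. The sketch below shows that idea with the standard library only; `compile_prompt` and `sha256_hex` are hypothetical helpers, and the real `ContextCompiler` may format or order content differently.

```python
import hashlib
import re

MACRO_PATTERN = re.compile(r"@\{(\w+)\}")


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compile_prompt(template_text, resolved):
    """Expand macros; return (content, content_digest, dependency_digests)."""

    def substitute(match):
        # Raises KeyError if a macro was left unresolved.
        return resolved[match.group(1)]

    content = MACRO_PATTERN.sub(substitute, template_text)
    dependency_digests = {name: sha256_hex(text) for name, text in resolved.items()}
    return content, sha256_hex(content), dependency_digests


template_text = "Topic: @{topic}\nRules: @{authoring_rules}"
inputs = {"topic": "ETL", "authoring_rules": "Be brief"}
content, digest, dep_digests = compile_prompt(template_text, inputs)
print(content)
print(sorted(dep_digests))  # ['authoring_rules', 'topic']
```

Since the compiled digest covers the expanded text, changing any input changes the digest, which is what makes it usable as a cache key.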
Step 6: Track Dependencies
from markitect.prompts.execution.manifest import RunManifest
from markitect.prompts.dependencies.graph import GraphBuilder
# Create run manifest
manifest = RunManifest.create(
    run_id="run-etl-001",
    template_id=template.id,
    template_name=template.name,
    template_digest=template.content_digest,
)
# Add resolved inputs
for resolved in resolution_result.context.resolved_macros:
    manifest.add_resolved_input(
        name=resolved.macro_name,
        artifact_id=resolved.artifact.id,
        space_id=resolved.space_id,
        digest=resolved.artifact.content_digest,
    )
    # Create dependency edge
    manifest.add_dependency_edge(
        source_id=resolved.artifact.id,
        target_id="run-etl-001",
        edge_type="requires",
    )
# Persist to database
builder = GraphBuilder(dep_repo)
edges = builder.persist_edges(manifest)
Result: Dependency edges stored in database:
source_artifact_id | target_artifact_id | edge_type | run_id
--------------------|--------------------|-----------|-----------
etl-id | run-etl-001 | requires | run-etl-001
authoring-rules-id | run-etl-001 | requires | run-etl-001
research-prompt-id | run-etl-001 | requires | run-etl-001
Step 7: Generate Output
# In real usage, this would call an LLM API
# For demo, we create a mock output
output_content = """
# ETL Primer
## Definition
ETL (Extract, Transform, Load) is a data integration pattern...
[Generated content]
"""
output_artifact = Artifact.create(
    space_id="generated-primers",
    name="etl-primer",
    content=output_content,
    artifact_type=ArtifactType.GENERATED,
)
output_artifact = artifact_repo.create(output_artifact)
# Add to manifest
manifest.add_output_artifact(
    artifact_id=output_artifact.id,
    name=output_artifact.name,
    digest=output_artifact.content_digest,
    artifact_type="generated",
)
manifest.add_dependency_edge(
    source_id="run-etl-001",
    target_id=output_artifact.id,
    edge_type="generates",
)
# Persist output edges
builder.persist_edges(manifest)
Step 8: Trace Provenance
from markitect.prompts.traceability.service import TraceabilityService
trace_service = TraceabilityService(artifact_repo, dep_repo, db_path="primers.db")
# Trace the generated primer
trace = trace_service.trace_artifact(output_artifact.id)
# Inspect provenance
print("Template:", trace.template.name if trace.template else "None")
print("Producing run:", trace.producing_run.run_id if trace.producing_run else "None")
print("Input artifacts:")
for inp in trace.input_artifacts:
    print(f" - {inp.name} ({inp.artifact_type})")
print("Dependency chain:")
for dep_id in trace.dependency_chain:
    artifact = artifact_repo.get_by_id(dep_id)
    print(f" - {artifact.name if artifact else dep_id}")
Advanced Features
Incremental Recomputation
When an input changes, automatically detect affected outputs:
from pathlib import Path
from markitect.prompts.dependencies.queries import DependencyQueryService
from markitect.prompts.incremental.detector import ChangeDetector
from markitect.prompts.incremental.engine import IncrementalExecutionEngine
from markitect.prompts.incremental.models import RecomputeConfig
# Detect change
detector = ChangeDetector("primers.db")
query_service = DependencyQueryService(dep_repo)
authoring_rules = artifact_repo.get_by_name("primer-guidelines", "authoring-rules")
# User updates the file
new_content = Path("artifacts/guidelines/authoring-rules.md").read_text()
change = detector.detect_change(authoring_rules, new_content)
if change:
    detector.record_change(change)
    # Find affected primers
    engine = IncrementalExecutionEngine("primers.db", query_service)
    result = engine.recompute(
        change,
        config=RecomputeConfig(max_depth=2, impact_threshold=0.1),
        old_content=authoring_rules.content,
        new_content=new_content,
    )
    print(f"Total dependents: {result.total_dependents}")
    print(f"Recomputed: {result.recomputed_count}")
    print(f"Suppressed: {result.suppressed_count}")
Recomputation strategies:
- max_depth: Traverse dependency graph N levels
- impact_threshold: Only recompute if change magnitude > threshold
- max_recomputes: Budget limit to prevent runaway execution
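How the three limits might interact can be sketched as a bounded breadth-first traversal. Everything here is an assumption made for illustration: the tutorial does not define how change magnitude is measured (this sketch uses `difflib`), and `plan_recompute` is a hypothetical helper, not the `IncrementalExecutionEngine` algorithm.

```python
from collections import deque
from difflib import SequenceMatcher


def change_magnitude(old, new):
    """Assumed metric: 0.0 for identical text, 1.0 for text with nothing in common."""
    return 1.0 - SequenceMatcher(None, old, new).ratio()


def plan_recompute(forward, changed, old, new,
                   max_depth=2, impact_threshold=0.1, max_recomputes=10):
    """Return (to_recompute, suppressed) lists of dependents of `changed`."""
    # Collect dependents breadth-first, stopping at max_depth levels.
    reachable, seen = [], {changed}
    queue = deque([(changed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in forward.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                reachable.append(nxt)
                queue.append((nxt, depth + 1))

    if change_magnitude(old, new) <= impact_threshold:
        return [], reachable  # change too small: suppress every dependent
    # Spend the budget on the nearest dependents; the rest are suppressed.
    return reachable[:max_recomputes], reachable[max_recomputes:]


forward = {
    "authoring-rules": ["run-etl-001", "run-ms-001"],
    "run-etl-001": ["etl-primer"],
    "run-ms-001": ["microservices-primer"],
    "etl-primer": ["index"],
}
to_run, suppressed = plan_recompute(
    forward, "authoring-rules", "aaaa", "bbbb", max_recomputes=3
)
print(to_run)      # ['run-etl-001', 'run-ms-001', 'etl-primer']
print(suppressed)  # ['microservices-primer']
```

In this sketch, nodes beyond `max_depth` (here `index`) are not reported at all, while nodes cut by the threshold or the budget are returned as suppressed so they can be tracked as impact debt.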
Quality Validation
Apply quality gates to generated primers:
from markitect.prompts.quality.validator import QualityValidator
from markitect.prompts.quality.gates.pattern_gate import PatternValidationGate
# Create validation gate
gate = PatternValidationGate(
    required_patterns=[
        r"## Definition",
        r"## Context",
        r"## Core Concepts",
        r"## Scope and Non-Scope",
    ],
    forbidden_patterns=[
        r"TODO",
        r"FIXME",
    ],
    gate_id="primer-structure-check",
    name="Primer Structure Validator",
)
validator = QualityValidator(gates=[gate], db_path="primers.db")
# Validate output
results = validator.validate_artifact(
    content=output_content,
    artifact_id=output_artifact.id,
    run_id="run-etl-001",
)
if validator.all_passed(results):
    print("✓ All quality gates passed")
else:
    failed = validator.get_failed_gates(results)
    for result in failed:
        print(f"✗ {result.gate_id} failed")
        for diag in result.diagnostics:
            print(f" {diag.message}")
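The core of a pattern gate is a pair of regular-expression checks. The following standalone sketch illustrates that idea; `check_patterns` and its diagnostic strings are hypothetical and are not the `PatternValidationGate` implementation.

```python
import re


def check_patterns(content, required_patterns, forbidden_patterns):
    """Return diagnostic messages; an empty list means the check passed."""
    diagnostics = []
    for pattern in required_patterns:
        if not re.search(pattern, content, flags=re.MULTILINE):
            diagnostics.append(f"missing required pattern: {pattern}")
    for pattern in forbidden_patterns:
        if re.search(pattern, content):
            diagnostics.append(f"found forbidden pattern: {pattern}")
    return diagnostics


draft = "# ETL Primer\n## Definition\nETL is a data integration pattern.\nTODO: expand\n"
for message in check_patterns(
    draft,
    required_patterns=[r"^## Definition", r"^## Context"],
    forbidden_patterns=[r"TODO", r"FIXME"],
):
    print(message)
# missing required pattern: ^## Context
# found forbidden pattern: TODO
```

Anchoring required headings with `^` under `re.MULTILINE` avoids accepting a heading that only appears mid-sentence.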
Visualization
Generate dependency graphs:
from markitect.prompts.visualization.graph import GraphExporter
from markitect.prompts.dependencies.queries import DependencyQueryService
query_service = DependencyQueryService(dep_repo)
# Find all related artifacts
deps = query_service.find_transitive_dependencies(output_artifact.id)
dependents = query_service.find_transitive_dependents(output_artifact.id)
all_ids = deps | dependents | {output_artifact.id}
# Build graph
builder = GraphBuilder(dep_repo)
graph = builder.build_graph(all_ids)
# Export to Mermaid
mermaid = GraphExporter.to_mermaid(graph, "Primer Dependencies")
Path("dependencies.mermaid").write_text(mermaid)
# Export to DOT (Graphviz)
dot = GraphExporter.to_dot(graph, "Primer Dependencies")
Path("dependencies.dot").write_text(dot)
Mermaid output:
%%{ title: Primer Dependencies }%%
graph LR
etl-id-->|requires|run-etl-001
authoring-rules-id-->|requires|run-etl-001
research-prompt-id-->|requires|run-etl-001
run-etl-001-.->|generates|etl-primer-id
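Output of this shape can be produced from a plain edge list. The sketch below is a hypothetical `to_mermaid` function that mirrors the text above (solid arrows for requires, dotted arrows for generates); it is not `GraphExporter.to_mermaid`, whose details may differ.

```python
def to_mermaid(edges, title):
    """Render (source, target, edge_type) triples as Mermaid flowchart text."""
    lines = [f"%%{{ title: {title} }}%%", "graph LR"]
    for source, target, edge_type in edges:
        # Assumed convention: dotted arrows mark "generates" edges.
        arrow = "-.->" if edge_type == "generates" else "-->"
        lines.append(f"{source}{arrow}|{edge_type}|{target}")
    return "\n".join(lines)


edges = [
    ("etl-id", "run-etl-001", "requires"),
    ("authoring-rules-id", "run-etl-001", "requires"),
    ("run-etl-001", "etl-primer-id", "generates"),
]
print(to_mermaid(edges, "Primer Dependencies"))
```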
CLI Usage
The Prompt Dependency Resolution infrastructure includes CLI commands:
Trace Provenance
markitect prompt trace <artifact-id> --database primers.db
Output (JSON):
{
  "artifact_id": "abc-123-def-456",
  "producing_run": {
    "run_id": "run-etl-001",
    "template_id": "generate-primer-v1",
    "status": "success"
  },
  "input_artifacts": [
    {
      "artifact_id": "...",
      "name": "etl",
      "role": "input"
    }
  ],
  "dependency_chain": ["...", "..."]
}
Visualize Graph
markitect prompt graph <artifact-id> --format mermaid --database primers.db
List Runs
# All runs
markitect prompt runs --database primers.db
# Filter by template
markitect prompt runs --template generate-primer-v1 --database primers.db
# Filter by status
markitect prompt runs --status success --limit 10 --database primers.db
Show Impact Debt
# All stale artifacts
markitect prompt debt --database primers.db
# Specific artifact
markitect prompt debt --artifact authoring-rules-id --database primers.db
Graph Statistics
markitect prompt stats --database primers.db
Output:
{
  "total_nodes": 12,
  "total_edges": 18,
  "root_count": 3,
  "leaf_count": 2,
  "has_cycles": false
}
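For a sense of what these fields mean, the sketch below computes the same five values from an edge list. The definitions are assumptions (a root has no incoming edges, a leaf has no outgoing edges), and `graph_stats` is a hypothetical helper rather than the code behind the CLI command.

```python
def graph_stats(edges):
    """Compute basic statistics for a directed graph given (source, target) pairs."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    outgoing = {node: [] for node in nodes}
    for source, target in edges:
        indegree[target] += 1
        outgoing[source].append(target)

    roots = [node for node in nodes if indegree[node] == 0]
    leaves = [node for node in nodes if not outgoing[node]]

    # Kahn's algorithm: if some nodes can never be removed, a cycle exists.
    remaining = dict(indegree)
    ready = list(roots)
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for nxt in outgoing[node]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    return {
        "total_nodes": len(nodes),
        "total_edges": len(edges),
        "root_count": len(roots),
        "leaf_count": len(leaves),
        "has_cycles": removed != len(nodes),
    }


stats = graph_stats([
    ("etl-id", "run-etl-001"),
    ("authoring-rules-id", "run-etl-001"),
    ("research-prompt-id", "run-etl-001"),
    ("run-etl-001", "etl-primer-id"),
])
print(stats)
```

For the single-run graph from the walkthrough this gives 5 nodes, 4 edges, 3 roots, 1 leaf, and no cycles.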
Best Practices
1. Organize Artifacts by Space
Clear separation of concerns:
- templates/ ← Reusable PromptTemplates
- topics/ ← Domain-specific content
- guidelines/ ← Standards and methodologies
- output/ ← Generated artifacts
2. Use Content Digests for Change Detection
# Don't compare full content strings
if old_content != new_content:  # ✗ Inefficient
    ...
# Do compare digests
if artifact.has_changed(new_digest):  # ✓ Fast, hash-based
    ...
3. Apply Quality Gates
# Define quality standards as code
gates = [
    PatternValidationGate(required_patterns=[...]),
    SchemaValidationGate(schema={...}),
]
# Fail fast if quality checks fail
if not validator.all_passed(results):
    raise QualityError("Output does not meet standards")
4. Track All Dependencies
# Always persist dependency edges
manifest.add_dependency_edge(source, target, edge_type)
builder.persist_edges(manifest)
# This enables:
# - Impact analysis
# - Incremental recomputation
# - Provenance tracing
5. Use Incremental Execution
# Don't regenerate everything on every change
config = RecomputeConfig(
    max_depth=2,            # Limit blast radius
    impact_threshold=0.1,   # Skip minor changes
    max_recomputes=10,      # Budget limit
)
6. Version Your Templates
# Include version in template metadata
---
id: generate-primer-v1
version: 1.0.0
---
# When template changes significantly, create v2
---
id: generate-primer-v2
version: 2.0.0
---
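Reading the id and version back out of a template file only needs a small front-matter splitter. The sketch below handles just flat `key: value` pairs between `---` fences; `split_front_matter` is a hypothetical helper, and a real project would more likely use a YAML parser.

```python
def split_front_matter(text):
    """Return (metadata, body) for text that may start with a '---' fenced header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            metadata = {}
            for line in lines[1:index]:
                if ":" in line:
                    key, value = line.split(":", 1)
                    metadata[key.strip()] = value.strip()
            return metadata, "\n".join(lines[index + 1:])
    return {}, text  # no closing fence: treat everything as body


template_file = "---\nid: generate-primer-v1\nversion: 1.0.0\n---\n# Generate Primer\n"
metadata, body = split_front_matter(template_file)
print(metadata)  # {'id': 'generate-primer-v1', 'version': '1.0.0'}
print(body)      # # Generate Primer
```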
7. Leverage Traceability
# Use provenance traces for debugging
trace = trace_service.trace_artifact(failed_output_id)
print("Inputs used:")
for inp in trace.input_artifacts:
    print(f" {inp.name} @ {inp.content_digest[:8]}")
# This helps identify which input caused the issue
Comparison with Original System
Original (prepdr/)
GeneratePrimerTemplate.md:
<topic>
{{ETL}}
</topic>
<guidance>
{{AuthoringRules}}
</guidance>
Process:
- Manually copy-paste content
- Replace {{...}} markers by hand
- Run through LLM
- No record of what inputs were used
- No change detection
Limitations:
- No automation
- No version control on inputs
- Can't regenerate from history
- No impact analysis when guidelines change
With Infrastructure
templates/generate-primer.md:
@{topic}
@{authoring_rules}
Process:
- Define artifacts once
- Create template with @{...} macros
- Run resolver → compiler → executor
- Full dependency graph persisted
- Complete provenance trace available
Benefits:
- Fully automated resolution
- Content-based change detection (SHA-256)
- Reproducible: "same inputs → same output"
- Impact analysis: "what needs regeneration?"
- Traceability: "how was this generated?"
- Quality validation: automated checks
- Visualization: see dependency relationships
Next Steps
- Extend the example:
  - Add more topics (OAuth, Docker, Kubernetes)
  - Create topic-specific quality gates
  - Implement actual LLM integration
- Build a workflow:
  - Git hooks to detect artifact changes
  - CI/CD pipeline to regenerate affected primers
  - Dashboard to show primer freshness
- Add advanced features:
  - Version conflict resolution
  - A/B testing different templates
  - Batch generation with parallelization
- Integrate with MarkiTect:
  - Use MarkiTect ingestion for artifact storage
  - Query relationships with relational metadata
  - Generate documentation sites from primers
Questions or feedback? File an issue or reach out to the maintainers.