markitect-main/examples/content-generator/TUTORIAL.md
tegwick 360c3b1de2
feat(examples): add content-generator example demonstrating Prompt Dependency Resolution
This example demonstrates the full workflow of generating InfoTech primers
using MarkiTect's Prompt Dependency Resolution infrastructure.

Features demonstrated:
- Artifact creation and storage with content-based addressing
- PromptTemplate with @{macro} resolution across multiple spaces
- Automatic dependency tracking and graph construction
- Provenance tracing from outputs back to inputs
- Visualization export (Mermaid format)
- Incremental execution with change detection

Files added:
- generate_primers.py: Complete working example
- README.md: Quick start guide and architecture overview
- TUTORIAL.md: Comprehensive 500+ line tutorial
- templates/generate-primer.md: Template with macros
- artifacts/topics/: ETL and Microservices topic definitions
- artifacts/guidelines/: Authoring rules and research protocol
- prepdr/: Original manual system (preserved for reference)

Example output:
- Generates 2 primers (ETL, Microservices)
- Creates 8 artifacts across 4 information spaces
- Records 8 dependency edges in SQLite database
- Exports dependency graph visualization

Run with: cd examples/content-generator && python generate_primers.py

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-09 23:50:07 +01:00


MarkiTect Prompt Dependency Resolution Tutorial

Example: Generating InfoTech Primers with Full Provenance Tracking

This tutorial demonstrates how to use MarkiTect's Prompt Dependency Resolution infrastructure to systematically generate content with complete dependency tracking, quality validation, and traceability.


Table of Contents

  1. Overview
  2. Architecture
  3. Setup
  4. Core Concepts
  5. Step-by-Step Walkthrough
  6. Advanced Features
  7. CLI Usage
  8. Best Practices
  9. Comparison with Original System
  10. Next Steps

Overview

What This Example Does

This example shows how to generate InfoTech Primers (structured reference documents for IT concepts) using a prompt template system with:

  • Artifact Management: Store and version all inputs (templates, topics, guidelines)
  • Dependency Resolution: Automatically resolve macro references across information spaces
  • Provenance Tracking: Trace any generated primer back to its inputs and template
  • Incremental Updates: Detect when inputs change and regenerate affected primers
  • Quality Validation: Apply quality gates to ensure output meets standards
  • Visualization: View dependency graphs in DOT or Mermaid format

Why Use Prompt Dependency Resolution?

Before (manual approach in prepdr/):

# Template with manual macros
{{topic}}
{{AuthoringRules}}
{{ResearchPrompt}}

Problems:

  • Manual macro substitution
  • No version tracking
  • No dependency awareness
  • Can't detect when inputs change
  • No provenance traceability

After (with infrastructure):

# Template with resolved dependencies
@{topic}
@{authoring_rules}
@{research_prompt}

Benefits:

  • Automatic macro resolution
  • Content-based change detection (SHA-256 digests)
  • Full dependency graph construction
  • Incremental recomputation when inputs change
  • Complete provenance: artifact → template → inputs → validation
  • CLI commands for inspection and debugging

Architecture

Information Spaces

The system organizes artifacts into information spaces (logical namespaces):

primer-templates/        # PromptTemplates for generation
  ├─ generate-primer

primer-topics/           # Topic definitions (ETL, Microservices, OAuth, etc.)
  ├─ etl
  ├─ microservices
  └─ ...

primer-guidelines/       # Authoring and research guidelines
  ├─ authoring-rules
  └─ research-prompt

generated-primers/       # Output artifacts
  ├─ etl-primer
  ├─ microservices-primer
  └─ ...

Dependency Graph

When you generate a primer, the system creates a dependency graph:

graph LR
    A[etl topic] -->|requires| B[generate-primer template]
    C[authoring-rules] -->|requires| B
    D[research-prompt] -->|requires| B
    B -->|generates| E[etl-primer output]

This graph enables:

  • Impact analysis: "What primers need regeneration if authoring-rules changes?"
  • Provenance tracing: "What inputs produced this primer?"
  • Incremental execution: "Only regenerate affected primers"
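
The impact-analysis question above amounts to a reachability query over the edge list. The following is a minimal sketch that does not use MarkiTect's own API; the `EDGES` list and the `transitive_dependents` helper are hypothetical names chosen for illustration, and the edges simply mirror the diagram above.

```python
from collections import defaultdict, deque

# Hypothetical edge list mirroring the diagram above: (source, target) pairs.
EDGES = [
    ("etl", "generate-primer"),
    ("authoring-rules", "generate-primer"),
    ("research-prompt", "generate-primer"),
    ("generate-primer", "etl-primer"),
]


def transitive_dependents(edges, start):
    """Return every node reachable from `start` by following edges forward."""
    forward = defaultdict(list)
    for source, target in edges:
        forward[source].append(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in forward[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# "What needs regeneration if authoring-rules changes?"
affected = transitive_dependents(EDGES, "authoring-rules")
print(sorted(affected))
```

Reversing each pair before calling the helper would answer the provenance question instead (which inputs a primer was built from).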

Setup

Prerequisites

# Ensure MarkiTect is installed
cd /path/to/markitect_project
pip install -e .

Directory Structure

examples/content-generator/
├── TUTORIAL.md                          # This file
├── generate_primers.py                  # Main example script
├── templates/
│   └── generate-primer.md               # PromptTemplate
├── artifacts/
│   ├── topics/
│   │   ├── etl.md                       # Topic: ETL
│   │   └── microservices.md             # Topic: Microservices
│   └── guidelines/
│       ├── authoring-rules.md           # Authoring standards
│       └── research-prompt.md           # Research methodology
└── prepdr/                              # Original manual system (preserved)
    ├── README.md
    ├── ETL.md
    ├── AuthoringRules.md
    ├── AssistentPrompt.md
    └── GeneratePrimerTemplate.md

Running the Example

cd examples/content-generator
python generate_primers.py

Expected output:

╔══════════════════════════════════════════════════════════════╗
║  MarkiTect Prompt Dependency Resolution Example             ║
║  InfoTech Primer Generation                                  ║
╚══════════════════════════════════════════════════════════════╝

=== Loading Artifacts ===
✓ Created artifact: generate-primer (digest: a7f3e2b1)
✓ Created artifact: etl (digest: 9c4d6e8a)
✓ Created artifact: microservices (digest: 5b2f1c9d)
✓ Created artifact: authoring-rules (digest: 3e7a9f2c)
✓ Created artifact: research-prompt (digest: 8d1b4e6f)

=== Generating Primer: etl ===
✓ Template created with 3 macro dependencies
✓ Resolved 3 macros
✓ Compiled prompt (digest: 4c9e2a7b)
✓ Persisted 3 dependency edges
✓ Generated primer: etl-primer

=== Provenance Trace ===
Artifact: abc-123-def-456
Producing Run: run-etl-001
Input Artifacts: 3
Dependency Chain: 5 artifacts

✓ Primer generation complete!

Core Concepts

1. Artifacts

Artifacts are versioned content units with content-based addressing.

from markitect.prompts.models import Artifact, ArtifactType

# Create an artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=topic_content,
    artifact_type=ArtifactType.CONTENT,
)

# Automatic SHA-256 digest generation
print(artifact.content_digest)  # "9c4d6e8a..."

Key features:

  • Content digest: SHA-256 hash for change detection
  • Space isolation: Artifacts in different spaces can have same names
  • Type classification: CONTENT, TEMPLATE, GENERATED, SCHEMA, CONFIG
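
To make the first two features concrete, here is a small standard-library sketch of content digests and space-scoped names. It is an illustration only: `content_digest` and the `store` dictionary are hypothetical and say nothing about how `Artifact` is implemented internally.

```python
import hashlib


def content_digest(content: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


# Hypothetical in-memory store keyed by (space_id, name), so the same name
# can exist in two spaces without colliding.
store = {
    ("primer-topics", "etl"): content_digest("ETL topic definition"),
    ("generated-primers", "etl"): content_digest("# ETL Primer"),
}

original = content_digest("ETL topic definition")
edited = content_digest("ETL topic definition\n\nNew section")

# Unchanged content maps to the stored digest; any edit yields a new one.
print(original == store[("primer-topics", "etl")])
print(original == edited)
```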

2. PromptTemplates

PromptTemplates are artifacts with macro references.

---
id: generate-primer-v1
artifact_type: template
---

# Generate Primer

Topic: @{topic}
Guidelines: @{authoring_rules}

Macro syntax:

  • @{macro_name} - Resolved to artifact content
  • Resolution happens at execution time
  • Macros can reference artifacts in any information space
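
As a rough illustration of the macro syntax, the sketch below extracts and expands `@{...}` references with a regular expression. The grammar (an identifier between `@{` and `}`) and the helper names `extract_macros` and `expand` are assumptions; the actual MarkiTect parser may accept a different syntax.

```python
import re

# Assumed grammar: "@{" + identifier + "}". The real parser may accept more.
MACRO_PATTERN = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def extract_macros(template: str) -> list[str]:
    """Return macro names in first-appearance order, without duplicates."""
    names = []
    for name in MACRO_PATTERN.findall(template):
        if name not in names:
            names.append(name)
    return names


def expand(template: str, values: dict[str, str]) -> str:
    """Substitute each macro; raise KeyError if a macro has no value."""
    return MACRO_PATTERN.sub(lambda m: values[m.group(1)], template)


template = "Topic: @{topic}\nGuidelines: @{authoring_rules}\n"
print(extract_macros(template))
print(expand(template, {"topic": "ETL", "authoring_rules": "Be concise."}))
```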

3. Resolution Strategy

Resolution finds artifacts to substitute for macros.

from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)

Strategies:

  • FIRST_MATCH: Use first artifact found
  • LATEST_VERSION: Use newest version (if artifacts have versions)
  • EXPLICIT_ONLY: Require explicit space qualification

4. Dependency Tracking

Dependency edges are automatically created during resolution.

# Edge types
EdgeType.REQUIRES    # Input dependency (stored as input artifact → run; see Step 6)
EdgeType.GENERATES   # Output relationship (run → primer)
EdgeType.INCLUDES    # Composition (nested templates)

Graph operations:

# Find all artifacts that depend on authoring-rules
dependents = query_service.find_transitive_dependents("authoring-rules-id")

# Find all inputs needed to regenerate a primer
dependencies = query_service.find_transitive_dependencies("etl-primer-id")

# Detect circular dependencies
cycles = query_service.detect_circular_dependencies()

5. Traceability

ProvenanceTrace captures complete lineage.

trace = trace_service.trace_artifact(artifact_id)

print(trace.producing_run)      # Run that generated this
print(trace.template)            # Template used
print(trace.input_artifacts)    # All input dependencies
print(trace.validation_results) # Quality gate results
print(trace.impact_debt)        # Suppressed recomputations

Step-by-Step Walkthrough

Step 1: Initialize Repositories

from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository
from markitect.prompts.dependencies.repository import SQLiteDependencyRepository

artifact_repo = SQLiteArtifactRepository("primers.db")
dep_repo = SQLiteDependencyRepository("primers.db")

What this does:

  • Creates SQLite database with artifact and dependency tables
  • Artifact table: id, space_id, name, content_digest, metadata
  • Dependency table: source_id, target_id, edge_type, run_id

Step 2: Load Artifacts

from pathlib import Path

from markitect.prompts.models import Artifact, ArtifactType

# Read artifact file
content = Path("artifacts/topics/etl.md").read_text()

# Create artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=content,
    artifact_type=ArtifactType.CONTENT,
)

# Store in repository
artifact = artifact_repo.create(artifact)

Content-based addressing:

# If you modify the content
updated_content = content + "\n\n**New section added**"
artifact.update_content(updated_content)

# Digest changes automatically
print(artifact.content_digest)  # Different hash!

Step 3: Create PromptTemplate

from markitect.prompts.templates.models import PromptTemplate, MacroReference

template = PromptTemplate.create(
    id="generate-primer-v1",
    name="generate-primer",
    content=template_content,
    space_id="primer-templates",
)

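# Add macro dependencies (one per @{...} reference in the template)
template.add_macro(MacroReference(
    name="topic",
    source_space="primer-topics"
))
template.add_macro(MacroReference(
    name="authoring_rules",
    source_space="primer-guidelines"
))
template.add_macro(MacroReference(
    name="research_prompt",
    source_space="primer-guidelines"
))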

Template content (templates/generate-primer.md):

# Generate InfoTech Primer

## Topic
@{topic}

## Guidelines
@{authoring_rules}

## Research Protocol
@{research_prompt}

Generate a complete primer following the authoring rules.

Step 4: Resolve Dependencies

from markitect.prompts.resolver.resolver import PromptResolver
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

resolver = PromptResolver(artifact_repo)

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)

resolution_result = resolver.resolve_template(template, config)

if resolution_result.success:
    for resolved in resolution_result.context.resolved_macros:
        print(f"{resolved.macro_name} → {resolved.artifact.name}")
else:
    print("Resolution failed:", resolution_result.context.errors)

Resolution algorithm:

  1. Parse template to extract @{macro_name} references
  2. For each macro:
    • Search configured spaces in order
    • Match by name (case-sensitive)
    • Return first match (FIRST_MATCH strategy)
  3. Build ResolutionResult with all resolved artifacts
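
The steps above can be sketched with plain dictionaries. This is a simplified stand-in, not the `PromptResolver` implementation: the `SPACES` mapping and `resolve_first_match` are hypothetical, and the sketch matches an artifact by the macro's own name, which may be simpler than what the real resolver does.

```python
import re

MACRO_PATTERN = re.compile(r"@\{(\w+)\}")

# Hypothetical stand-in for the artifact repository: space -> name -> content.
SPACES = {
    "primer-topics": {"topic": "ETL definition"},
    "primer-guidelines": {
        "authoring_rules": "Write in plain language.",
        "topic": "shadowed entry, never reached",
    },
}


def resolve_first_match(template, space_order):
    """Map each macro to (space, content) from the first space that has it."""
    resolved, errors = {}, []
    # dict.fromkeys keeps first-appearance order while dropping duplicates.
    for name in dict.fromkeys(MACRO_PATTERN.findall(template)):
        for space in space_order:
            if name in SPACES.get(space, {}):  # exact, case-sensitive match
                resolved[name] = (space, SPACES[space][name])
                break
        else:
            errors.append(f"unresolved macro: {name}")
    return resolved, errors


resolved, errors = resolve_first_match(
    "@{topic} / @{authoring_rules} / @{research_prompt}",
    ["primer-topics", "primer-guidelines"],
)
print(resolved)
print(errors)
```

Because spaces are searched in the configured order, the `topic` entry in `primer-guidelines` is never used; swapping the order of the two spaces would change which one wins.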

Step 5: Compile Prompt

from markitect.prompts.resolver.compiler import ContextCompiler

compiler = ContextCompiler()
compiled = compiler.compile(template, template_content, resolution_result)

print(compiled.content)  # Fully expanded prompt
print(compiled.content_digest)  # Hash for caching
print(compiled.dependency_digests)  # Map of macro → artifact digest

Compiled output:

# Generate InfoTech Primer

## Topic
A three-phase computing process where data is extracted from source systems,
transformed (including validation, cleaning, enrichment, and aggregation),
and loaded into a target data store or data warehouse.
...

## Guidelines
[Full authoring-rules content]
...

## Research Protocol
[Full research-prompt content]
...
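
The idea behind `content_digest` and `dependency_digests` can be sketched as follows. This is an assumption-laden illustration rather than the `ContextCompiler` code: `compile_prompt` and its returned dictionary are hypothetical, and only the three field names are taken from the snippet above.

```python
import hashlib
import re

MACRO_PATTERN = re.compile(r"@\{(\w+)\}")


def sha256(text):
    """SHA-256 hex digest of UTF-8 text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compile_prompt(template, inputs):
    """Expand macros and record the digest of every input that was used."""
    used = {}

    def substitute(match):
        name = match.group(1)
        used[name] = sha256(inputs[name])
        return inputs[name]

    content = MACRO_PATTERN.sub(substitute, template)
    return {
        "content": content,
        "content_digest": sha256(content),
        "dependency_digests": used,
    }


first = compile_prompt("Topic: @{topic}", {"topic": "ETL"})
second = compile_prompt("Topic: @{topic}", {"topic": "ETL"})
changed = compile_prompt("Topic: @{topic}", {"topic": "ELT"})

# Identical inputs give an identical digest, so it can serve as a cache key.
print(first["content_digest"] == second["content_digest"])
print(first["content_digest"] == changed["content_digest"])
```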

Step 6: Track Dependencies

from markitect.prompts.execution.manifest import RunManifest
from markitect.prompts.dependencies.graph import GraphBuilder

# Create run manifest
manifest = RunManifest.create(
    run_id="run-etl-001",
    template_id=template.id,
    template_name=template.name,
    template_digest=template.content_digest,
)

# Add resolved inputs
for resolved in resolution_result.context.resolved_macros:
    manifest.add_resolved_input(
        name=resolved.macro_name,
        artifact_id=resolved.artifact.id,
        space_id=resolved.space_id,
        digest=resolved.artifact.content_digest,
    )

    # Create dependency edge
    manifest.add_dependency_edge(
        source_id=resolved.artifact.id,
        target_id="run-etl-001",
        edge_type="requires",
    )

# Persist to database
builder = GraphBuilder(dep_repo)
edges = builder.persist_edges(manifest)

Result: Dependency edges stored in database:

source_artifact_id  | target_artifact_id | edge_type | run_id
--------------------|--------------------|-----------|-----------
etl-id              | run-etl-001        | requires  | run-etl-001
authoring-rules-id  | run-etl-001        | requires  | run-etl-001
research-prompt-id  | run-etl-001        | requires  | run-etl-001

Step 7: Generate Output

# In real usage, this would call an LLM API
# For demo, we create a mock output
output_content = """
# ETL Primer

## Definition
ETL (Extract, Transform, Load) is a data integration pattern...
[Generated content]
"""

output_artifact = Artifact.create(
    space_id="generated-primers",
    name="etl-primer",
    content=output_content,
    artifact_type=ArtifactType.GENERATED,
)
output_artifact = artifact_repo.create(output_artifact)

# Add to manifest
manifest.add_output_artifact(
    artifact_id=output_artifact.id,
    name=output_artifact.name,
    digest=output_artifact.content_digest,
    artifact_type="generated",
)

manifest.add_dependency_edge(
    source_id="run-etl-001",
    target_id=output_artifact.id,
    edge_type="generates",
)

# Persist output edges
builder.persist_edges(manifest)

Step 8: Trace Provenance

from markitect.prompts.traceability.service import TraceabilityService

trace_service = TraceabilityService(artifact_repo, dep_repo, db_path="primers.db")

# Trace the generated primer
trace = trace_service.trace_artifact(output_artifact.id)

# Inspect provenance
print("Template:", trace.template.name if trace.template else "None")
print("Producing run:", trace.producing_run.run_id if trace.producing_run else "None")
print("Input artifacts:")
for inp in trace.input_artifacts:
    print(f"  - {inp.name} ({inp.artifact_type})")

print("Dependency chain:")
for dep_id in trace.dependency_chain:
    artifact = artifact_repo.get_by_id(dep_id)
    print(f"  - {artifact.name if artifact else dep_id}")

Advanced Features

Incremental Recomputation

When an input changes, automatically detect affected outputs:

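from pathlib import Path

from markitect.prompts.dependencies.queries import DependencyQueryService
from markitect.prompts.incremental.detector import ChangeDetector
from markitect.prompts.incremental.engine import IncrementalExecutionEngine
from markitect.prompts.incremental.models import RecomputeConfig

# Query service over the dependency repository from Step 1
query_service = DependencyQueryService(dep_repo)

# Detect change
detector = ChangeDetector("primers.db")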
authoring_rules = artifact_repo.get_by_name("primer-guidelines", "authoring-rules")

# User updates the file
new_content = Path("artifacts/guidelines/authoring-rules.md").read_text()
change = detector.detect_change(authoring_rules, new_content)

if change:
    detector.record_change(change)

    # Find affected primers
    engine = IncrementalExecutionEngine("primers.db", query_service)
    result = engine.recompute(
        change,
        config=RecomputeConfig(max_depth=2, impact_threshold=0.1),
        old_content=authoring_rules.content,
        new_content=new_content,
    )

    print(f"Total dependents: {result.total_dependents}")
    print(f"Recomputed: {result.recomputed_count}")
    print(f"Suppressed: {result.suppressed_count}")

Recomputation strategies:

  • max_depth: Traverse dependency graph N levels
  • impact_threshold: Only recompute if change magnitude > threshold
  • max_recomputes: Budget limit to prevent runaway execution
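
One way to read these three limits is sketched below. It is not the `IncrementalExecutionEngine` algorithm: `change_magnitude` (based on `difflib`), `plan_recompute`, and the example graph are all hypothetical, and the real engine may measure change magnitude and apply the limits differently.

```python
from collections import deque
from difflib import SequenceMatcher


def change_magnitude(old: str, new: str) -> float:
    """0.0 for identical text, approaching 1.0 for unrelated text."""
    return 1.0 - SequenceMatcher(None, old, new).ratio()


def plan_recompute(dependents, start, magnitude,
                   max_depth=2, impact_threshold=0.1, max_recomputes=10):
    """Return (to_recompute, suppressed) for the dependents of `start`."""
    to_recompute, suppressed = [], []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in dependents.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            within_limits = (
                depth + 1 <= max_depth
                and magnitude > impact_threshold
                and len(to_recompute) < max_recomputes
            )
            (to_recompute if within_limits else suppressed).append(child)
            queue.append((child, depth + 1))
    return to_recompute, suppressed


# Hypothetical dependents map: node -> nodes that depend on it.
graph = {
    "authoring-rules": ["run-a", "run-b"],
    "run-a": ["primer-a"],
    "primer-a": ["site-index"],
}
magnitude = change_magnitude("Use short sentences.", "Use short, active sentences.")
print(plan_recompute(graph, "authoring-rules", magnitude))
```

In this sketch, anything that falls outside a limit lands in the suppressed list, which corresponds loosely to the suppressed recomputations reported by `result.suppressed_count`.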

Quality Validation

Apply quality gates to generated primers:

from markitect.prompts.quality.validator import QualityValidator
from markitect.prompts.quality.gates.pattern_gate import PatternValidationGate

# Create validation gate
gate = PatternValidationGate(
    required_patterns=[
        r"## Definition",
        r"## Context",
        r"## Core Concepts",
        r"## Scope and Non-Scope",
    ],
    forbidden_patterns=[
        r"TODO",
        r"FIXME",
    ],
    gate_id="primer-structure-check",
    name="Primer Structure Validator",
)

validator = QualityValidator(gates=[gate], db_path="primers.db")

# Validate output
results = validator.validate_artifact(
    content=output_content,
    artifact_id=output_artifact.id,
    run_id="run-etl-001",
)

if validator.all_passed(results):
    print("✓ All quality gates passed")
else:
    failed = validator.get_failed_gates(results)
    for result in failed:
        print(f"✗ {result.gate_id} failed")
        for diag in result.diagnostics:
            print(f"  {diag.message}")
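
Conceptually, a pattern gate reduces to a handful of regular-expression searches. The sketch below shows that idea with the standard library only; `check_patterns` is a hypothetical helper and does not reproduce the behavior or result objects of `PatternValidationGate`.

```python
import re


def check_patterns(content, required, forbidden):
    """Return diagnostic strings; an empty list means the gate passes."""
    diagnostics = []
    for pattern in required:
        if not re.search(pattern, content, flags=re.MULTILINE):
            diagnostics.append(f"missing required pattern: {pattern}")
    for pattern in forbidden:
        if re.search(pattern, content, flags=re.MULTILINE):
            diagnostics.append(f"found forbidden pattern: {pattern}")
    return diagnostics


draft = "# ETL Primer\n\n## Definition\nETL is ...\n\nTODO: add context\n"
problems = check_patterns(
    draft,
    required=[r"^## Definition", r"^## Context"],
    forbidden=[r"TODO", r"FIXME"],
)
for line in problems:
    print(line)
```

Anchoring the required patterns with `^` (together with `re.MULTILINE`) avoids accepting a heading that only appears in the middle of a sentence.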

Visualization

Generate dependency graphs:

from pathlib import Path

from markitect.prompts.dependencies.graph import GraphBuilder
from markitect.prompts.dependencies.queries import DependencyQueryService
from markitect.prompts.visualization.graph import GraphExporter

query_service = DependencyQueryService(dep_repo)

# Find all related artifacts
deps = query_service.find_transitive_dependencies(output_artifact.id)
dependents = query_service.find_transitive_dependents(output_artifact.id)
all_ids = deps | dependents | {output_artifact.id}

# Build graph
builder = GraphBuilder(dep_repo)
graph = builder.build_graph(all_ids)

# Export to Mermaid
mermaid = GraphExporter.to_mermaid(graph, "Primer Dependencies")
Path("dependencies.mermaid").write_text(mermaid)

# Export to DOT (Graphviz)
dot = GraphExporter.to_dot(graph, "Primer Dependencies")
Path("dependencies.dot").write_text(dot)

Mermaid output:

%%{ title: Primer Dependencies }%%
graph LR
  etl-id-->|requires|run-etl-001
  authoring-rules-id-->|requires|run-etl-001
  research-prompt-id-->|requires|run-etl-001
  run-etl-001-.->|generates|etl-primer-id
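
For readers who want to produce similar text without the exporter, here is a minimal sketch that renders edge triples as a Mermaid flowchart. `to_mermaid` below is a hypothetical helper, not `GraphExporter.to_mermaid`; unlike the output above, it derives sanitized node ids and keeps the original ids as quoted labels.

```python
def to_mermaid(edges, direction="LR"):
    """Render (source, target, edge_type) triples as a Mermaid flowchart."""

    def node(identifier):
        # Keep the original id as the visible label and derive a plain
        # alphanumeric/underscore id from it for Mermaid to parse.
        safe = "".join(ch if ch.isalnum() else "_" for ch in identifier)
        return f'{safe}["{identifier}"]'

    lines = [f"graph {direction}"]
    for source, target, edge_type in edges:
        # Dotted arrows for outputs, solid arrows for everything else.
        arrow = "-.->" if edge_type == "generates" else "-->"
        lines.append(f"  {node(source)} {arrow}|{edge_type}| {node(target)}")
    return "\n".join(lines)


edges = [
    ("etl-id", "run-etl-001", "requires"),
    ("run-etl-001", "etl-primer-id", "generates"),
]
print(to_mermaid(edges))
```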

CLI Usage

The Prompt Dependency Resolution infrastructure includes CLI commands:

Trace Provenance

markitect prompt trace <artifact-id> --database primers.db

Output (JSON):

{
  "artifact_id": "abc-123-def-456",
  "producing_run": {
    "run_id": "run-etl-001",
    "template_id": "generate-primer-v1",
    "status": "success"
  },
  "input_artifacts": [
    {
      "artifact_id": "...",
      "name": "etl",
      "role": "input"
    }
  ],
  "dependency_chain": ["...", "..."]
}

Visualize Graph

markitect prompt graph <artifact-id> --format mermaid --database primers.db

List Runs

# All runs
markitect prompt runs --database primers.db

# Filter by template
markitect prompt runs --template generate-primer-v1 --database primers.db

# Filter by status
markitect prompt runs --status success --limit 10 --database primers.db

Show Impact Debt

# All stale artifacts
markitect prompt debt --database primers.db

# Specific artifact
markitect prompt debt --artifact authoring-rules-id --database primers.db

Graph Statistics

markitect prompt stats --database primers.db

Output:

{
  "total_nodes": 12,
  "total_edges": 18,
  "root_count": 3,
  "leaf_count": 2,
  "has_cycles": false
}
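
The fields in this report can be derived from the edge list alone. The sketch below assumes that a root is a node with no incoming edges and a leaf is a node with no outgoing edges, and it detects cycles with Kahn's algorithm; `graph_stats` is a hypothetical helper, and the CLI may define these terms differently.

```python
from collections import defaultdict, deque


def graph_stats(edges):
    """Summarize a directed graph given as (source, target) pairs."""
    unique_edges = set(edges)
    outgoing = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for source, target in unique_edges:
        nodes.update((source, target))
        outgoing[source].add(target)
        indegree[target] += 1

    roots = [n for n in nodes if indegree[n] == 0]
    leaves = [n for n in nodes if not outgoing[n]]

    # Kahn's algorithm: repeatedly remove nodes with no remaining inputs.
    # Nodes that are never removed sit on, or downstream of, a cycle.
    degree = {n: indegree[n] for n in nodes}
    queue = deque(roots)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in outgoing[node]:
            degree[nxt] -= 1
            if degree[nxt] == 0:
                queue.append(nxt)

    return {
        "total_nodes": len(nodes),
        "total_edges": len(unique_edges),
        "root_count": len(roots),
        "leaf_count": len(leaves),
        "has_cycles": visited != len(nodes),
    }


stats = graph_stats([
    ("etl-id", "run-etl-001"),
    ("authoring-rules-id", "run-etl-001"),
    ("research-prompt-id", "run-etl-001"),
    ("run-etl-001", "etl-primer-id"),
])
print(stats)
```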

Best Practices

1. Organize Artifacts by Space

Clear separation of concerns:
- templates/     ← Reusable PromptTemplates
- topics/        ← Domain-specific content
- guidelines/    ← Standards and methodologies
- output/        ← Generated artifacts

2. Use Content Digests for Change Detection

# Don't compare content strings
if old_content != new_content:  # ✗ Inefficient
    ...  # regenerate

# Do compare digests
if artifact.has_changed(new_digest):  # ✓ Fast, hash-based
    ...  # regenerate

3. Apply Quality Gates

# Define quality standards as code
gates = [
    PatternValidationGate(required_patterns=[...]),
    SchemaValidationGate(schema={...}),
]

# Fail fast if quality checks fail
if not validator.all_passed(results):
    raise QualityError("Output does not meet standards")

4. Track All Dependencies

# Always persist dependency edges
manifest.add_dependency_edge(source, target, edge_type)
builder.persist_edges(manifest)

# This enables:
# - Impact analysis
# - Incremental recomputation
# - Provenance tracing

5. Use Incremental Execution

# Don't regenerate everything on every change
config = RecomputeConfig(
    max_depth=2,              # Limit blast radius
    impact_threshold=0.1,     # Skip minor changes
    max_recomputes=10,        # Budget limit
)

6. Version Your Templates

# Include version in template metadata
---
id: generate-primer-v1
version: 1.0.0
---

# When template changes significantly, create v2
---
id: generate-primer-v2
version: 2.0.0
---

7. Leverage Traceability

# Use provenance traces for debugging
trace = trace_service.trace_artifact(failed_output_id)

print("Inputs used:")
for inp in trace.input_artifacts:
    print(f"  {inp.name} @ {inp.content_digest[:8]}")

# This helps identify which input caused the issue

Comparison with Original System

Original (prepdr/)

GeneratePrimerTemplate.md:

<topic>
{{ETL}}
</topic>

<guidance>
{{AuthoringRules}}
</guidance>

Process:

  1. Manually copy-paste content
  2. Replace {{...}} markers by hand
  3. Run through LLM
  4. No record of what inputs were used
  5. No change detection

Limitations:

  • No automation
  • No version control on inputs
  • Can't regenerate from history
  • No impact analysis when guidelines change

With Infrastructure

templates/generate-primer.md:

@{topic}
@{authoring_rules}

Process:

  1. Define artifacts once
  2. Create template with @{...} macros
  3. Run resolver → compiler → executor
  4. Full dependency graph persisted
  5. Complete provenance trace available

Benefits:

  • Fully automated resolution
  • Content-based change detection (SHA-256)
  • Reproducible: "same inputs → same output"
  • Impact analysis: "what needs regeneration?"
  • Traceability: "how was this generated?"
  • Quality validation: automated checks
  • Visualization: see dependency relationships

Next Steps

  1. Extend the example:

    • Add more topics (OAuth, Docker, Kubernetes)
    • Create topic-specific quality gates
    • Implement actual LLM integration
  2. Build a workflow:

    • Git hooks to detect artifact changes
    • CI/CD pipeline to regenerate affected primers
    • Dashboard to show primer freshness
  3. Add advanced features:

    • Version conflict resolution
    • A/B testing different templates
    • Batch generation with parallelization
  4. Integrate with MarkiTect:

    • Use MarkiTect ingestion for artifact storage
    • Query relationships with relational metadata
    • Generate documentation sites from primers

Questions or feedback? File an issue or reach out to the maintainers.