tegwick b7e11461f4 chore: rename markitect_project to markitect-main across project

Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.

Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-21 01:57:35 +02:00

MarkiTect Prompt Dependency Resolution Tutorial

Example: Generating InfoTech Primers with Full Provenance Tracking

This tutorial demonstrates how to use MarkiTect's Prompt Dependency Resolution infrastructure to systematically generate content with complete dependency tracking, quality validation, and traceability.

Overview
Architecture
Setup
Core Concepts
Step-by-Step Walkthrough
Advanced Features
CLI Usage
Best Practices

Overview

What This Example Does

This example shows how to generate InfoTech Primers (structured reference documents for IT concepts) using a prompt template system with:

Artifact Management: Store and version all inputs (templates, topics, guidelines)
Dependency Resolution: Automatically resolve macro references across information spaces
Provenance Tracking: Trace any generated primer back to its inputs and template
Incremental Updates: Detect when inputs change and regenerate affected primers
Quality Validation: Apply quality gates to ensure output meets standards
Visualization: View dependency graphs in DOT or Mermaid format

Why Use Prompt Dependency Resolution?

Before (manual approach in prepdr/):

# Template with manual macros
{{topic}}
{{AuthoringRules}}
{{ResearchPrompt}}

Problems:

Manual macro substitution
No version tracking
No dependency awareness
Can't detect when inputs change
No provenance traceability

After (with infrastructure):

# Template with resolved dependencies
@{topic}
@{authoring_rules}
@{research_prompt}

Benefits:

Automatic macro resolution
Content-based change detection (SHA-256 digests)
Full dependency graph construction
Incremental recomputation when inputs change
Complete provenance: artifact → template → inputs → validation
CLI commands for inspection and debugging

Architecture

Information Spaces

The system organizes artifacts into information spaces (logical namespaces):

primer-templates/        # PromptTemplates for generation
  ├─ generate-primer

primer-topics/           # Topic definitions (ETL, Microservices, OAuth, etc.)
  ├─ etl
  ├─ microservices
  └─ ...

primer-guidelines/       # Authoring and research guidelines
  ├─ authoring-rules
  └─ research-prompt

generated-primers/       # Output artifacts
  ├─ etl-primer
  ├─ microservices-primer
  └─ ...

Dependency Graph

When you generate a primer, the system creates a dependency graph:

graph LR
    A[etl topic] -->|requires| B[generate-primer template]
    C[authoring-rules] -->|requires| B
    D[research-prompt] -->|requires| B
    B -->|generates| E[etl-primer output]

This graph enables:

Impact analysis: "What primers need regeneration if authoring-rules changes?"
Provenance tracing: "What inputs produced this primer?"
Incremental execution: "Only regenerate affected primers"
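The impact-analysis question above comes down to a forward traversal of the edge list. The following is a minimal sketch in plain Python, not MarkiTect's implementation; the edge list, the run ids, and the `transitive_dependents` helper are hypothetical and only mirror the shape of the graph above.

```python
from collections import defaultdict, deque

# Hypothetical edges (source, target): inputs point at the run that
# consumes them, and each run points at the primer it produced.
EDGES = [
    ("etl", "run-etl-001"),
    ("authoring-rules", "run-etl-001"),
    ("research-prompt", "run-etl-001"),
    ("run-etl-001", "etl-primer"),
    ("microservices", "run-ms-001"),
    ("authoring-rules", "run-ms-001"),
    ("run-ms-001", "microservices-primer"),
]


def transitive_dependents(edges, start):
    """Return every node reachable from `start` by following edges forward."""
    forward = defaultdict(set)
    for source, target in edges:
        forward[source].add(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in forward[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


affected = transitive_dependents(EDGES, "authoring-rules")
primers = sorted(n for n in affected if n.endswith("-primer"))
print(primers)  # ['etl-primer', 'microservices-primer']
```

Changing `etl` alone would reach only `run-etl-001` and `etl-primer`, which is what makes incremental regeneration possible.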

Setup

Prerequisites

# Ensure MarkiTect is installed
cd /path/to/markitect-main
pip install -e .

Directory Structure

examples/content-generator/
├── TUTORIAL.md                          # This file
├── generate_primers.py                  # Main example script
├── templates/
│   └── generate-primer.md               # PromptTemplate
├── artifacts/
│   ├── topics/
│   │   ├── etl.md                       # Topic: ETL
│   │   └── microservices.md             # Topic: Microservices
│   └── guidelines/
│       ├── authoring-rules.md           # Authoring standards
│       └── research-prompt.md           # Research methodology
└── prepdr/                              # Original manual system (preserved)
    ├── README.md
    ├── ETL.md
    ├── AuthoringRules.md
    ├── AssistentPrompt.md
    └── GeneratePrimerTemplate.md

Running the Example

cd examples/content-generator
python generate_primers.py

Expected output:

╔══════════════════════════════════════════════════════════════╗
║  MarkiTect Prompt Dependency Resolution Example             ║
║  InfoTech Primer Generation                                  ║
╚══════════════════════════════════════════════════════════════╝

=== Loading Artifacts ===
✓ Created artifact: generate-primer (digest: a7f3e2b1)
✓ Created artifact: etl (digest: 9c4d6e8a)
✓ Created artifact: microservices (digest: 5b2f1c9d)
✓ Created artifact: authoring-rules (digest: 3e7a9f2c)
✓ Created artifact: research-prompt (digest: 8d1b4e6f)

=== Generating Primer: etl ===
✓ Template created with 3 macro dependencies
✓ Resolved 3 macros
✓ Compiled prompt (digest: 4c9e2a7b)
✓ Persisted 3 dependency edges
✓ Generated primer: etl-primer

=== Provenance Trace ===
Artifact: abc-123-def-456
Producing Run: run-etl-001
Input Artifacts: 3
Dependency Chain: 5 artifacts

✓ Primer generation complete!

Core Concepts

1. Artifacts

Artifacts are versioned content units with content-based addressing.

from markitect.prompts.models import Artifact, ArtifactType

# Create an artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=topic_content,
    artifact_type=ArtifactType.CONTENT,
)

# Automatic SHA-256 digest generation
print(artifact.content_digest)  # "9c4d6e8a..."

Key features:

Content digest: SHA-256 hash for change detection
Space isolation: Artifacts in different spaces can have same names
Type classification: CONTENT, TEMPLATE, GENERATED, SCHEMA, CONFIG
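To make the content-digest idea concrete, here is a small standalone sketch using only `hashlib`. It assumes the digest is the hex SHA-256 of the UTF-8 encoded content; the exact encoding MarkiTect uses is not stated in this tutorial, and `content_digest` is a hypothetical helper name.

```python
import hashlib


def content_digest(content: str) -> str:
    """Hex SHA-256 of the content (UTF-8 encoding is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


original = "ETL: extract, transform, load."
edited = original + "\n\n**New section added**"

d1 = content_digest(original)
d2 = content_digest(edited)

print(len(d1))                         # 64 hex characters
print(d1 == content_digest(original))  # True: same content, same digest
print(d1 != d2)                        # True: any edit changes the digest
```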

2. PromptTemplates

PromptTemplates are artifacts with macro references.

---
id: generate-primer-v1
artifact_type: template
---

# Generate Primer

Topic: @{topic}
Guidelines: @{authoring_rules}

Macro syntax:

@{macro_name} - Resolved to artifact content
Resolution happens at execution time
Macros can reference artifacts in any information space
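As an illustration of the macro syntax, the sketch below extracts and substitutes `@{macro_name}` references with a regular expression. The grammar (letters, digits, underscores) and the helper names `find_macros` and `expand` are assumptions made for this example, not MarkiTect's parser.

```python
import re

# Assumed macro grammar: @{name}, where name is an identifier-like token.
MACRO_PATTERN = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_macros(template: str) -> list[str]:
    """Return macro names in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in MACRO_PATTERN.finditer(template):
        seen.setdefault(match.group(1), None)
    return list(seen)


def expand(template: str, values: dict[str, str]) -> str:
    """Substitute each macro in one pass; raise KeyError if a value is missing."""
    return MACRO_PATTERN.sub(lambda m: values[m.group(1)], template)


template = "Topic: @{topic}\nGuidelines: @{authoring_rules}\nAgain: @{topic}"
names = find_macros(template)
expanded = expand(template, {"topic": "ETL", "authoring_rules": "Be concise."})

print(names)  # ['topic', 'authoring_rules']
print(expanded)
```

Substituting in a single pass means that macro-like text inside an artifact's content is not expanded a second time.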

3. Resolution Strategy

Resolution finds artifacts to substitute for macros.

from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)

Strategies:

FIRST_MATCH: Use first artifact found
LATEST_VERSION: Use newest version (if artifacts have versions)
EXPLICIT_ONLY: Require explicit space qualification

4. Dependency Tracking

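Dependency edges record which artifacts a run consumed and which it produced. One edge is added to the run manifest per resolved macro, then persisted to the dependency repository.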

# Edge types
EdgeType.REQUIRES    # Input dependency (stored as input artifact → run)
EdgeType.GENERATES   # Output relationship (run → primer)
EdgeType.INCLUDES    # Composition (nested templates)

Graph operations:

# Find all artifacts that depend on authoring-rules
dependents = query_service.find_transitive_dependents("authoring-rules-id")

# Find all inputs needed to regenerate a primer
dependencies = query_service.find_transitive_dependencies("etl-primer-id")

# Detect circular dependencies
cycles = query_service.detect_circular_dependencies()
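Circular-dependency detection is commonly done with a depth-first search that watches for back edges. The sketch below shows that general technique on a plain edge list; it is not the code behind `detect_circular_dependencies`, and `find_cycle` is a hypothetical helper.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE
    stack = []                 # nodes on the current DFS path

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge closes a cycle
                return stack[stack.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle([("a", "b"), ("b", "c")]))              # None
print(find_cycle([("a", "b"), ("b", "c"), ("c", "a")]))  # ['a', 'b', 'c', 'a']
```

The recursion is fine for small graphs; a very deep dependency chain would call for an explicit stack instead.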

5. Traceability

ProvenanceTrace captures complete lineage.

trace = trace_service.trace_artifact(artifact_id)

print(trace.producing_run)      # Run that generated this
print(trace.template)            # Template used
print(trace.input_artifacts)    # All input dependencies
print(trace.validation_results) # Quality gate results
print(trace.impact_debt)        # Suppressed recomputations

Step-by-Step Walkthrough

Step 1: Initialize Repositories

from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository
from markitect.prompts.dependencies.repository import SQLiteDependencyRepository

artifact_repo = SQLiteArtifactRepository("primers.db")
dep_repo = SQLiteDependencyRepository("primers.db")

What this does:

Creates SQLite database with artifact and dependency tables
Artifact table: id, space_id, name, content_digest, metadata
Dependency table: source_id, target_id, edge_type, run_id

Step 2: Load Artifacts

from pathlib import Path

from markitect.prompts.models import Artifact, ArtifactType

# Read artifact file
content = Path("artifacts/topics/etl.md").read_text()

# Create artifact
artifact = Artifact.create(
    space_id="primer-topics",
    name="etl",
    content=content,
    artifact_type=ArtifactType.CONTENT,
)

# Store in repository
artifact = artifact_repo.create(artifact)

Content-based addressing:

# If you modify the content
updated_content = content + "\n\n**New section added**"
artifact.update_content(updated_content)

# Digest changes automatically
print(artifact.content_digest)  # Different hash!

Step 3: Create PromptTemplate

from markitect.prompts.templates.models import PromptTemplate, MacroReference

template = PromptTemplate.create(
    id="generate-primer-v1",
    name="generate-primer",
    content=template_content,
    space_id="primer-templates",
)

# Add macro dependencies
template.add_macro(MacroReference(
    name="topic",
    source_space="primer-topics"
))
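template.add_macro(MacroReference(
    name="authoring_rules",
    source_space="primer-guidelines"
))
# The template below also references @{research_prompt}
template.add_macro(MacroReference(
    name="research_prompt",
    source_space="primer-guidelines"
))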

Template content (templates/generate-primer.md):

# Generate InfoTech Primer

## Topic
@{topic}

## Guidelines
@{authoring_rules}

## Research Protocol
@{research_prompt}

Generate a complete primer following the authoring rules.

Step 4: Resolve Dependencies

from markitect.prompts.resolver.resolver import PromptResolver
from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy

resolver = PromptResolver(artifact_repo)

config = ResolutionConfig(
    strategy=ResolutionStrategy.FIRST_MATCH,
    spaces=["primer-topics", "primer-guidelines"],
)

resolution_result = resolver.resolve_template(template, config)

if resolution_result.success:
    for resolved in resolution_result.context.resolved_macros:
        print(f"{resolved.macro_name} → {resolved.artifact.name}")
else:
    print("Resolution failed:", resolution_result.context.errors)

Resolution algorithm:

1. Parse template to extract @{macro_name} references
2. For each macro:
   - Search configured spaces in order
   - Match by name (case-sensitive)
   - Return first match (FIRST_MATCH strategy)
3. Build ResolutionResult with all resolved artifacts
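The steps above can be sketched with a plain dictionary standing in for the artifact repository. This is an illustration of first-match lookup only: the `SPACES` store, the helper names, and the assumption that a macro name equals the artifact name are all hypothetical.

```python
from typing import Optional

# Hypothetical in-memory store: space id -> {artifact name -> content}.
SPACES = {
    "primer-topics": {"topic": "ETL topic text"},
    "primer-guidelines": {
        "authoring_rules": "Rules text",
        "topic": "Shadowed by primer-topics",
    },
}


def first_match(name: str, search_order: list[str]) -> Optional[tuple[str, str]]:
    """Return (space_id, content) from the first space that contains `name`."""
    for space_id in search_order:
        content = SPACES.get(space_id, {}).get(name)  # case-sensitive lookup
        if content is not None:
            return space_id, content
    return None


def resolve_all(macros, search_order):
    """Split macro names into resolved hits and unresolved names."""
    resolved, errors = {}, []
    for name in macros:
        hit = first_match(name, search_order)
        if hit is None:
            errors.append(name)
        else:
            resolved[name] = hit
    return resolved, errors


order = ["primer-topics", "primer-guidelines"]
resolved, errors = resolve_all(["topic", "authoring_rules", "research_prompt"], order)
print(resolved["topic"][0])  # primer-topics
print(errors)                # ['research_prompt']
```

Because spaces are searched in order, the `topic` entry in `primer-guidelines` is never reached; reordering `order` would change which artifact wins.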

Step 5: Compile Prompt

from markitect.prompts.resolver.compiler import ContextCompiler

compiler = ContextCompiler()
compiled = compiler.compile(template, template_content, resolution_result)

print(compiled.content)  # Fully expanded prompt
print(compiled.content_digest)  # Hash for caching
print(compiled.dependency_digests)  # Map of macro → artifact digest

Compiled output:

# Generate InfoTech Primer

## Topic
A three-phase computing process where data is extracted from source systems,
transformed (including validation, cleaning, enrichment, and aggregation),
and loaded into a target data store or data warehouse.
...

## Guidelines
[Full authoring-rules content]
...

## Research Protocol
[Full research-prompt content]
...
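One way to picture what the compiled prompt carries is a triple of expanded content, a digest of that content, and a per-macro digest map. The sketch below is a standalone approximation under that assumption; `compile_prompt` and `sha256_hex` are hypothetical helpers, not `ContextCompiler` itself.

```python
import hashlib
import re

MACRO = re.compile(r"@\{(\w+)\}")  # assumed macro grammar


def sha256_hex(text: str) -> str:
    """Hex SHA-256 of UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compile_prompt(template: str, inputs: dict[str, str]):
    """Expand macros and return (content, content_digest, dependency_digests)."""
    content = MACRO.sub(lambda m: inputs[m.group(1)], template)
    dependency_digests = {name: sha256_hex(value) for name, value in inputs.items()}
    return content, sha256_hex(content), dependency_digests


template = "## Topic\n@{topic}\n\n## Guidelines\n@{authoring_rules}\n"
v1 = compile_prompt(template, {"topic": "ETL", "authoring_rules": "Be concise."})
v2 = compile_prompt(template, {"topic": "ETL", "authoring_rules": "Be precise."})

# Comparing the digest maps pinpoints which input changed between two runs.
changed = [name for name in v1[2] if v1[2][name] != v2[2][name]]
print(changed)         # ['authoring_rules']
print(v1[1] == v2[1])  # False: the compiled digest differs as well
```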

Step 6: Track Dependencies

from markitect.prompts.execution.manifest import RunManifest
from markitect.prompts.dependencies.graph import GraphBuilder

# Create run manifest
manifest = RunManifest.create(
    run_id="run-etl-001",
    template_id=template.id,
    template_name=template.name,
    template_digest=template.content_digest,
)

# Add resolved inputs
for resolved in resolution_result.context.resolved_macros:
    manifest.add_resolved_input(
        name=resolved.macro_name,
        artifact_id=resolved.artifact.id,
        space_id=resolved.space_id,
        digest=resolved.artifact.content_digest,
    )

    # Create dependency edge
    manifest.add_dependency_edge(
        source_id=resolved.artifact.id,
        target_id="run-etl-001",
        edge_type="requires",
    )

# Persist to database
builder = GraphBuilder(dep_repo)
edges = builder.persist_edges(manifest)

Result: Dependency edges stored in database:

source_artifact_id  | target_artifact_id | edge_type | run_id
--------------------|--------------------|-----------|-----------
etl-id              | run-etl-001        | requires  | run-etl-001
authoring-rules-id  | run-etl-001        | requires  | run-etl-001
research-prompt-id  | run-etl-001        | requires  | run-etl-001

Step 7: Generate Output

# In real usage, this would call an LLM API
# For demo, we create a mock output
output_content = """
# ETL Primer

## Definition
ETL (Extract, Transform, Load) is a data integration pattern...
[Generated content]
"""

output_artifact = Artifact.create(
    space_id="generated-primers",
    name="etl-primer",
    content=output_content,
    artifact_type=ArtifactType.GENERATED,
)
output_artifact = artifact_repo.create(output_artifact)

# Add to manifest
manifest.add_output_artifact(
    artifact_id=output_artifact.id,
    name=output_artifact.name,
    digest=output_artifact.content_digest,
    artifact_type="generated",
)

manifest.add_dependency_edge(
    source_id="run-etl-001",
    target_id=output_artifact.id,
    edge_type="generates",
)

# Persist output edges
builder.persist_edges(manifest)

Step 8: Trace Provenance

from markitect.prompts.traceability.service import TraceabilityService

trace_service = TraceabilityService(artifact_repo, dep_repo, db_path="primers.db")

# Trace the generated primer
trace = trace_service.trace_artifact(output_artifact.id)

# Inspect provenance
print("Template:", trace.template.name if trace.template else "None")
print("Producing run:", trace.producing_run.run_id if trace.producing_run else "None")
print("Input artifacts:")
for inp in trace.input_artifacts:
    print(f"  - {inp.name} ({inp.artifact_type})")

print("Dependency chain:")
for dep_id in trace.dependency_chain:
    artifact = artifact_repo.get_by_id(dep_id)
    print(f"  - {artifact.name if artifact else dep_id}")
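Conceptually, tracing an output means walking the stored edges backwards: from the primer to the run that generated it, then from the run to its inputs. The sketch below shows that walk over hypothetical edge rows shaped like those from Steps 6 and 7; `trace` here is an illustrative function, not `TraceabilityService`.

```python
# Hypothetical edge rows: (source_id, target_id, edge_type).
EDGES = [
    ("etl-id", "run-etl-001", "requires"),
    ("authoring-rules-id", "run-etl-001", "requires"),
    ("research-prompt-id", "run-etl-001", "requires"),
    ("run-etl-001", "etl-primer-id", "generates"),
]


def trace(edges, artifact_id):
    """Return (producing_run, inputs); (None, []) if no run generated the artifact."""
    run = next(
        (src for src, dst, kind in edges if dst == artifact_id and kind == "generates"),
        None,
    )
    if run is None:
        return None, []
    inputs = [src for src, dst, kind in edges if dst == run and kind == "requires"]
    return run, inputs


run_id, input_ids = trace(EDGES, "etl-primer-id")
print(run_id)     # run-etl-001
print(input_ids)  # ['etl-id', 'authoring-rules-id', 'research-prompt-id']
```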

Advanced Features

Incremental Recomputation

When an input changes, automatically detect affected outputs:

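from pathlib import Path

from markitect.prompts.dependencies.queries import DependencyQueryService
from markitect.prompts.incremental.detector import ChangeDetector
from markitect.prompts.incremental.engine import IncrementalExecutionEngine
from markitect.prompts.incremental.models import RecomputeConfig

# Query service the engine uses to walk the dependency graph
query_service = DependencyQueryService(dep_repo)

# Detect change
detector = ChangeDetector("primers.db")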
authoring_rules = artifact_repo.get_by_name("primer-guidelines", "authoring-rules")

# User updates the file
new_content = Path("artifacts/guidelines/authoring-rules.md").read_text()
change = detector.detect_change(authoring_rules, new_content)

if change:
    detector.record_change(change)

    # Find affected primers
    engine = IncrementalExecutionEngine("primers.db", query_service)
    result = engine.recompute(
        change,
        config=RecomputeConfig(max_depth=2, impact_threshold=0.1),
        old_content=authoring_rules.content,
        new_content=new_content,
    )

    print(f"Total dependents: {result.total_dependents}")
    print(f"Recomputed: {result.recomputed_count}")
    print(f"Suppressed: {result.suppressed_count}")

Recomputation controls:

max_depth: Traverse the dependency graph at most N levels from the changed artifact
impact_threshold: Recompute only if the change magnitude exceeds the threshold
max_recomputes: Budget limit to prevent runaway execution
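How the three settings could interact is sketched below. This is one plausible reading, not the engine's actual algorithm: the change magnitude is approximated with `difflib`, and `plan_recompute` and its return shape are hypothetical.

```python
from collections import defaultdict, deque
from difflib import SequenceMatcher


def change_magnitude(old: str, new: str) -> float:
    """0.0 for identical text, 1.0 for nothing in common (assumed metric)."""
    return 1.0 - SequenceMatcher(None, old, new).ratio()


def plan_recompute(edges, changed, magnitude, max_depth, impact_threshold, max_recomputes):
    """Return (to_recompute, suppressed) lists of dependent node ids."""
    forward = defaultdict(list)
    for source, target in edges:
        forward[source].append(target)

    # Collect dependents breadth-first, going no deeper than max_depth.
    dependents, seen = [], {changed}
    queue = deque([(changed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in forward[node]:
            if nxt not in seen:
                seen.add(nxt)
                dependents.append(nxt)
                queue.append((nxt, depth + 1))

    if magnitude <= impact_threshold:
        return [], dependents  # change too small: suppress everything
    return dependents[:max_recomputes], dependents[max_recomputes:]


edges = [
    ("rules", "run-1"), ("rules", "run-2"),
    ("run-1", "primer-1"), ("run-2", "primer-2"),
    ("primer-1", "digest-page"),
]
todo, suppressed = plan_recompute(edges, "rules", 0.5, 2, 0.1, 3)
print(todo)        # ['run-1', 'run-2', 'primer-1']
print(suppressed)  # ['primer-2']
```

In this sketch `digest-page` sits three levels away, so a `max_depth` of 2 never reaches it, and the budget of 3 pushes `primer-2` into the suppressed list.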

Quality Validation

Apply quality gates to generated primers:

from markitect.prompts.quality.validator import QualityValidator
from markitect.prompts.quality.gates.pattern_gate import PatternValidationGate

# Create validation gate
gate = PatternValidationGate(
    required_patterns=[
        r"## Definition",
        r"## Context",
        r"## Core Concepts",
        r"## Scope and Non-Scope",
    ],
    forbidden_patterns=[
        r"TODO",
        r"FIXME",
    ],
    gate_id="primer-structure-check",
    name="Primer Structure Validator",
)

validator = QualityValidator(gates=[gate], db_path="primers.db")

# Validate output
results = validator.validate_artifact(
    content=output_content,
    artifact_id=output_artifact.id,
    run_id="run-etl-001",
)

if validator.all_passed(results):
    print("✓ All quality gates passed")
else:
    failed = validator.get_failed_gates(results)
    for result in failed:
        print(f"✗ {result.gate_id} failed")
        for diag in result.diagnostics:
            print(f"  {diag.message}")
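The idea behind a pattern gate can be shown with the standard `re` module alone. The sketch below is a simplified stand-in, not `PatternValidationGate`; the `GateResult` class and `check_patterns` function are hypothetical names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GateResult:
    passed: bool
    diagnostics: list[str] = field(default_factory=list)


def check_patterns(content, required, forbidden):
    """Fail if a required pattern is missing or a forbidden one is present."""
    diagnostics = []
    for pattern in required:
        if not re.search(pattern, content, flags=re.MULTILINE):
            diagnostics.append(f"missing required pattern: {pattern}")
    for pattern in forbidden:
        if re.search(pattern, content, flags=re.MULTILINE):
            diagnostics.append(f"found forbidden pattern: {pattern}")
    return GateResult(passed=not diagnostics, diagnostics=diagnostics)


draft = "# ETL Primer\n\n## Definition\nETL is...\n\nTODO: add context\n"
result = check_patterns(
    draft,
    required=[r"^## Definition$", r"^## Context$"],
    forbidden=[r"TODO", r"FIXME"],
)
print(result.passed)  # False
for message in result.diagnostics:
    print(message)
```

Anchoring the required headings with `^` and `$` avoids a false pass when the heading text merely appears inside a sentence.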

Visualization

Generate dependency graphs:

from pathlib import Path

from markitect.prompts.dependencies.graph import GraphBuilder
from markitect.prompts.dependencies.queries import DependencyQueryService
from markitect.prompts.visualization.graph import GraphExporter

query_service = DependencyQueryService(dep_repo)

# Find all related artifacts
deps = query_service.find_transitive_dependencies(output_artifact.id)
dependents = query_service.find_transitive_dependents(output_artifact.id)
all_ids = deps | dependents | {output_artifact.id}

# Build graph
builder = GraphBuilder(dep_repo)
graph = builder.build_graph(all_ids)

# Export to Mermaid
mermaid = GraphExporter.to_mermaid(graph, "Primer Dependencies")
Path("dependencies.mermaid").write_text(mermaid)

# Export to DOT (Graphviz)
dot = GraphExporter.to_dot(graph, "Primer Dependencies")
Path("dependencies.dot").write_text(dot)

Mermaid output:

%%{ title: Primer Dependencies }%%
graph LR
  etl-id-->|requires|run-etl-001
  authoring-rules-id-->|requires|run-etl-001
  research-prompt-id-->|requires|run-etl-001
  run-etl-001-.->|generates|etl-primer-id
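Producing text in this shape from an edge list takes only a few lines. The sketch below imitates the edge format shown above (solid arrows for `requires`, dotted for `generates`); it is an assumption about the mapping, not `GraphExporter.to_mermaid`, and it omits the title line.

```python
def to_mermaid(edges, direction="LR"):
    """Render (source, target, edge_type) tuples as Mermaid flowchart text."""
    lines = [f"graph {direction}"]
    for source, target, edge_type in edges:
        # Assumed convention: dotted arrow for generated outputs.
        arrow = "-.->" if edge_type == "generates" else "-->"
        lines.append(f"  {source}{arrow}|{edge_type}|{target}")
    return "\n".join(lines)


edges = [
    ("etl-id", "run-etl-001", "requires"),
    ("run-etl-001", "etl-primer-id", "generates"),
]
mermaid_text = to_mermaid(edges)
print(mermaid_text)
```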

CLI Usage

The Prompt Dependency Resolution infrastructure includes CLI commands:

Trace Provenance

markitect prompt trace <artifact-id> --database primers.db

Output (JSON):

{
  "artifact_id": "abc-123-def-456",
  "producing_run": {
    "run_id": "run-etl-001",
    "template_id": "generate-primer-v1",
    "status": "success"
  },
  "input_artifacts": [
    {
      "artifact_id": "...",
      "name": "etl",
      "role": "input"
    }
  ],
  "dependency_chain": ["...", "..."]
}

Visualize Graph

markitect prompt graph <artifact-id> --format mermaid --database primers.db

List Runs

# All runs
markitect prompt runs --database primers.db

# Filter by template
markitect prompt runs --template generate-primer-v1 --database primers.db

# Filter by status
markitect prompt runs --status success --limit 10 --database primers.db

Show Impact Debt

# All stale artifacts
markitect prompt debt --database primers.db

# Specific artifact
markitect prompt debt --artifact authoring-rules-id --database primers.db

Graph Statistics

markitect prompt stats --database primers.db

Output:

{
  "total_nodes": 12,
  "total_edges": 18,
  "root_count": 3,
  "leaf_count": 2,
  "has_cycles": false
}
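Most of these figures follow directly from the edge list. The sketch below shows one way to derive them, assuming a root is a node with no incoming edge and a leaf is a node with no outgoing edge; `graph_stats` is a hypothetical helper and `has_cycles` is left out.

```python
def graph_stats(edges):
    """Count nodes, edges, roots (no incoming), and leaves (no outgoing)."""
    sources = {source for source, _ in edges}
    targets = {target for _, target in edges}
    nodes = sources | targets
    return {
        "total_nodes": len(nodes),
        "total_edges": len(edges),
        "root_count": len(nodes - targets),
        "leaf_count": len(nodes - sources),
    }


edges = [
    ("etl-id", "run-etl-001"),
    ("authoring-rules-id", "run-etl-001"),
    ("research-prompt-id", "run-etl-001"),
    ("run-etl-001", "etl-primer-id"),
]
stats = graph_stats(edges)
print(stats)
```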

Best Practices

1. Organize Artifacts by Space

Clear separation of concerns:
- templates/     ← Reusable PromptTemplates
- topics/        ← Domain-specific content
- guidelines/    ← Standards and methodologies
- output/        ← Generated artifacts

2. Use Content Digests for Change Detection

# Don't compare full content strings
if old_content != new_content:  # ✗ Needs both versions of the content at hand
    ...

# Do compare digests
if artifact.has_changed(new_digest):  # ✓ Compares against the stored SHA-256 digest
    ...

3. Apply Quality Gates

# Define quality standards as code
gates = [
    PatternValidationGate(required_patterns=[...]),
    SchemaValidationGate(schema={...}),
]

# Fail fast if quality checks fail
if not validator.all_passed(results):
    raise QualityError("Output does not meet standards")

4. Track All Dependencies

# Always persist dependency edges
manifest.add_dependency_edge(source, target, edge_type)
builder.persist_edges(manifest)

# This enables:
# - Impact analysis
# - Incremental recomputation
# - Provenance tracing

5. Use Incremental Execution

# Don't regenerate everything on every change
config = RecomputeConfig(
    max_depth=2,              # Limit blast radius
    impact_threshold=0.1,     # Skip minor changes
    max_recomputes=10,        # Budget limit
)

6. Version Your Templates

# Include version in template metadata
---
id: generate-primer-v1
version: 1.0.0
---

# When template changes significantly, create v2
---
id: generate-primer-v2
version: 2.0.0
---

7. Leverage Traceability

# Use provenance traces for debugging
trace = trace_service.trace_artifact(failed_output_id)

print("Inputs used:")
for inp in trace.input_artifacts:
    print(f"  {inp.name} @ {inp.content_digest[:8]}")

# This helps identify which input caused the issue

Comparison with Original System

Original (`prepdr/`)

GeneratePrimerTemplate.md:

<topic>
{{ETL}}
</topic>

<guidance>
{{AuthoringRules}}
</guidance>

Process:

Manually copy-paste content
Replace {{...}} markers by hand
Run through LLM
No record of what inputs were used
No change detection

Limitations:

No automation
No version control on inputs
Can't regenerate from history
No impact analysis when guidelines change

With Infrastructure

templates/generate-primer.md:

@{topic}
@{authoring_rules}