diff --git a/examples/content-generator/.gitignore b/examples/content-generator/.gitignore new file mode 100644 index 00000000..c3827d7c --- /dev/null +++ b/examples/content-generator/.gitignore @@ -0,0 +1,8 @@ +# Generated files +*.db +*.mermaid +*.dot + +# Python cache +__pycache__/ +*.pyc diff --git a/examples/content-generator/README.md b/examples/content-generator/README.md new file mode 100644 index 00000000..ccebadde --- /dev/null +++ b/examples/content-generator/README.md @@ -0,0 +1,258 @@ +# Content Generator Example + +**Demonstrates: Prompt Dependency Resolution for Systematic Content Generation** + +This example shows how to use MarkiTect's Prompt Dependency Resolution infrastructure to generate InfoTech primers with full dependency tracking, provenance tracing, and quality validation. + +## Quick Start + +```bash +# Run the example +python generate_primers.py + +# Or if executable: +./generate_primers.py +``` + +## What This Example Demonstrates + +1. **Artifact Management**: Store templates, topics, and guidelines as versioned artifacts +2. **Dependency Resolution**: Automatically resolve `@{macro}` references in templates +3. **Provenance Tracking**: Trace generated content back to its inputs +4. **Incremental Updates**: Detect changes and regenerate only affected outputs +5. **Quality Validation**: Apply gates to ensure output meets standards +6. 
**Visualization**: Export dependency graphs to Mermaid or DOT format + +## Files + +``` +content-generator/ +├── README.md # This file +├── TUTORIAL.md # Comprehensive tutorial (START HERE) +├── generate_primers.py # Example implementation +│ +├── templates/ +│ └── generate-primer.md # PromptTemplate for primer generation +│ +├── artifacts/ +│ ├── topics/ # Topic definitions +│ │ ├── etl.md +│ │ └── microservices.md +│ │ +│ └── guidelines/ # Authoring standards +│ ├── authoring-rules.md +│ └── research-prompt.md +│ +└── prepdr/ # Original manual system (preserved) + ├── README.md + ├── ETL.md + ├── AuthoringRules.md + ├── AssistentPrompt.md + └── GeneratePrimerTemplate.md +``` + +## Before vs After + +### Before (prepdr/ - Manual System) + +```markdown +# Manual macro substitution +{{ETL}} +{{AuthoringRules}} +``` + +Problems: +- No automation +- No version tracking +- No dependency awareness +- Can't detect changes +- No traceability + +### After (With Infrastructure) + +```markdown +# Automatic resolution +@{topic} +@{authoring_rules} +``` + +Benefits: +- ✅ Automatic macro resolution +- ✅ SHA-256 content digests for change detection +- ✅ Full dependency graph +- ✅ Incremental recomputation +- ✅ Complete provenance traces +- ✅ CLI tools for inspection + +## Tutorial + +**[Read the full tutorial →](TUTORIAL.md)** + +The tutorial covers: +- Architecture and core concepts +- Step-by-step implementation walkthrough +- Advanced features (incremental execution, quality gates, visualization) +- CLI usage examples +- Best practices + +## CLI Commands + +After running the example, you can use CLI commands to inspect the system: + +```bash +# Trace provenance +markitect prompt trace --database primers.db + +# Visualize dependencies +markitect prompt graph --format mermaid --database primers.db + +# List runs +markitect prompt runs --database primers.db + +# Show impact debt (stale artifacts) +markitect prompt debt --database primers.db + +# Graph statistics +markitect 
prompt stats --database primers.db +``` + +## Architecture + +```mermaid +graph TB + A[Topic: ETL] -->|requires| D[Run: generate-etl] + B[Authoring Rules] -->|requires| D + C[Research Prompt] -->|requires| D + D -->|generates| E[Output: ETL Primer] + + style E fill:#90EE90 + style D fill:#87CEEB +``` + +## Key Features + +### 1. Content-Based Addressing + +Every artifact has a SHA-256 digest that changes when content changes: + +```python +artifact.content_digest # "9c4d6e8a..." +artifact.has_changed(new_digest) # True if modified +``` + +### 2. Dependency Graph + +Full graph construction enables: +- **Impact analysis**: "What needs regeneration?" +- **Provenance**: "How was this produced?" +- **Build order**: Topological sort for correct execution + +### 3. Incremental Execution + +Only regenerate what's affected by changes: + +```python +config = RecomputeConfig( + max_depth=2, # Traverse 2 levels + impact_threshold=0.1, # Skip minor changes + max_recomputes=10, # Budget limit +) +``` + +### 4. Quality Validation + +Apply automated checks: + +```python +gate = PatternValidationGate( + required_patterns=[r"## Definition", r"## Context"], + forbidden_patterns=[r"TODO", r"FIXME"], +) +``` + +## Extending the Example + +### Add New Topics + +```bash +# 1. Create artifact file +cat > artifacts/topics/oauth.md << 'EOF' +--- +id: topic-oauth +name: OAuth +artifact_type: content +--- + +# OAuth 2.0 + +An authorization framework that enables... +EOF + +# 2. 
Run generator +python generate_primers.py +# Automatically picks up new topic +``` + +### Custom Quality Gates + +```python +from markitect.prompts.quality.gates.schema_gate import SchemaValidationGate + +# Validate primer structure +schema = { + "type": "object", + "required": ["definition", "context", "scope"], +} + +gate = SchemaValidationGate(schema=schema, gate_id="primer-schema") +``` + +### Integrate with LLM + +```python +# Replace mock generation with real LLM call +from openai import OpenAI + +client = OpenAI() +response = client.chat.completions.create( + model="gpt-4", + messages=[{"role": "user", "content": compiled.content}] +) + +output_content = response.choices[0].message.content +``` + +## Performance + +Database operations are optimized: +- Indexed lookups on artifact ID, space, digest +- Indexed dependency queries by source, target, run +- Efficient graph traversal with BFS/DFS +- Content digest comparison (no full content comparison) + +## Troubleshooting + +### "Artifact not found" +- Ensure artifact files exist in correct directories +- Check file extension is `.md` +- Verify space IDs match between template and config + +### "Resolution failed" +- Check macro names match artifact names +- Verify spaces are configured in ResolutionConfig +- Use `ResolutionStrategy.FIRST_MATCH` for simplicity + +### "Circular dependency detected" +- Review dependency edges in database +- Use `detect_circular_dependencies()` to find cycles +- Refactor template structure to break cycles + +## Related Documentation + +- **[TUTORIAL.md](TUTORIAL.md)** - Comprehensive walkthrough +- **[Prompt Dependency Resolution Spec](../../roadmap/prompt-dependency-resolution/)** - Design documentation +- **[Phase 8 Implementation](../../markitect/prompts/)** - Source code + +## License + +Same as MarkiTect project. 
diff --git a/examples/content-generator/TUTORIAL.md b/examples/content-generator/TUTORIAL.md new file mode 100644 index 00000000..772b6890 --- /dev/null +++ b/examples/content-generator/TUTORIAL.md @@ -0,0 +1,923 @@ +# MarkiTect Prompt Dependency Resolution Tutorial + +**Example: Generating InfoTech Primers with Full Provenance Tracking** + +This tutorial demonstrates how to use MarkiTect's Prompt Dependency Resolution infrastructure to systematically generate content with complete dependency tracking, quality validation, and traceability. + +--- + +## Table of Contents + +1. [Overview](#overview) +2. [Architecture](#architecture) +3. [Setup](#setup) +4. [Core Concepts](#core-concepts) +5. [Step-by-Step Walkthrough](#step-by-step-walkthrough) +6. [Advanced Features](#advanced-features) +7. [CLI Usage](#cli-usage) +8. [Best Practices](#best-practices) + +--- + +## Overview + +### What This Example Does + +This example shows how to generate **InfoTech Primers** (structured reference documents for IT concepts) using a prompt template system with: + +- **Artifact Management**: Store and version all inputs (templates, topics, guidelines) +- **Dependency Resolution**: Automatically resolve macro references across information spaces +- **Provenance Tracking**: Trace any generated primer back to its inputs and template +- **Incremental Updates**: Detect when inputs change and regenerate affected primers +- **Quality Validation**: Apply quality gates to ensure output meets standards +- **Visualization**: View dependency graphs in DOT or Mermaid format + +### Why Use Prompt Dependency Resolution? 
+ +**Before** (manual approach in `prepdr/`): +```markdown +# Template with manual macros +{{topic}} +{{AuthoringRules}} +{{ResearchPrompt}} +``` + +Problems: +- Manual macro substitution +- No version tracking +- No dependency awareness +- Can't detect when inputs change +- No provenance traceability + +**After** (with infrastructure): +```markdown +# Template with resolved dependencies +@{topic} +@{authoring_rules} +@{research_prompt} +``` + +Benefits: +- Automatic macro resolution +- Content-based change detection (SHA-256 digests) +- Full dependency graph construction +- Incremental recomputation when inputs change +- Complete provenance: artifact → template → inputs → validation +- CLI commands for inspection and debugging + +--- + +## Architecture + +### Information Spaces + +The system organizes artifacts into **information spaces** (logical namespaces): + +``` +primer-templates/ # PromptTemplates for generation + ├─ generate-primer + +primer-topics/ # Topic definitions (ETL, Microservices, OAuth, etc.) + ├─ etl + ├─ microservices + └─ ... + +primer-guidelines/ # Authoring and research guidelines + ├─ authoring-rules + └─ research-prompt + +generated-primers/ # Output artifacts + ├─ etl-primer + ├─ microservices-primer + └─ ... +``` + +### Dependency Graph + +When you generate a primer, the system creates a dependency graph: + +```mermaid +graph LR + A[etl topic] -->|requires| B[generate-primer template] + C[authoring-rules] -->|requires| B + D[research-prompt] -->|requires| B + B -->|generates| E[etl-primer output] +``` + +This graph enables: +- **Impact analysis**: "What primers need regeneration if authoring-rules changes?" +- **Provenance tracing**: "What inputs produced this primer?" +- **Incremental execution**: "Only regenerate affected primers" + +--- + +## Setup + +### Prerequisites + +```bash +# Ensure MarkiTect is installed +cd /path/to/markitect_project +pip install -e . 
+``` + +### Directory Structure + +``` +examples/content-generator/ +├── TUTORIAL.md # This file +├── generate_primers.py # Main example script +├── templates/ +│ └── generate-primer.md # PromptTemplate +├── artifacts/ +│ ├── topics/ +│ │ ├── etl.md # Topic: ETL +│ │ └── microservices.md # Topic: Microservices +│ └── guidelines/ +│ ├── authoring-rules.md # Authoring standards +│ └── research-prompt.md # Research methodology +└── prepdr/ # Original manual system (preserved) + ├── README.md + ├── ETL.md + ├── AuthoringRules.md + ├── AssistentPrompt.md + └── GeneratePrimerTemplate.md +``` + +### Running the Example + +```bash +cd examples/content-generator +python generate_primers.py +``` + +Expected output: +``` +╔══════════════════════════════════════════════════════════════╗ +║ MarkiTect Prompt Dependency Resolution Example ║ +║ InfoTech Primer Generation ║ +╚══════════════════════════════════════════════════════════════╝ + +=== Loading Artifacts === +✓ Created artifact: generate-primer (digest: a7f3e2b1) +✓ Created artifact: etl (digest: 9c4d6e8a) +✓ Created artifact: microservices (digest: 5b2f1c9d) +✓ Created artifact: authoring-rules (digest: 3e7a9f2c) +✓ Created artifact: research-prompt (digest: 8d1b4e6f) + +=== Generating Primer: etl === +✓ Template created with 3 macro dependencies +✓ Resolved 3 macros +✓ Compiled prompt (digest: 4c9e2a7b) +✓ Persisted 3 dependency edges +✓ Generated primer: etl-primer + +=== Provenance Trace === +Artifact: abc-123-def-456 +Producing Run: run-etl-001 +Input Artifacts: 3 +Dependency Chain: 5 artifacts + +✓ Primer generation complete! +``` + +--- + +## Core Concepts + +### 1. Artifacts + +**Artifacts** are versioned content units with content-based addressing. 
+ +```python +from markitect.prompts.models import Artifact, ArtifactType + +# Create an artifact +artifact = Artifact.create( + space_id="primer-topics", + name="etl", + content=topic_content, + artifact_type=ArtifactType.CONTENT, +) + +# Automatic SHA-256 digest generation +print(artifact.content_digest) # "9c4d6e8a..." +``` + +**Key features:** +- **Content digest**: SHA-256 hash for change detection +- **Space isolation**: Artifacts in different spaces can share the same name +- **Type classification**: CONTENT, TEMPLATE, GENERATED, SCHEMA, CONFIG + +### 2. PromptTemplates + +**PromptTemplates** are artifacts with macro references. + +```markdown +--- +id: generate-primer-v1 +artifact_type: template +--- + +# Generate Primer + +Topic: @{topic} +Guidelines: @{authoring_rules} +``` + +**Macro syntax:** +- `@{macro_name}` - Resolved to artifact content +- Resolution happens at execution time +- Macros can reference artifacts in any information space + +### 3. Resolution Strategy + +**Resolution** finds artifacts to substitute for macros. + +```python +from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy + +config = ResolutionConfig( + strategy=ResolutionStrategy.FIRST_MATCH, + spaces=["primer-topics", "primer-guidelines"], +) +``` + +**Strategies:** +- `FIRST_MATCH`: Use the first artifact found +- `LATEST_VERSION`: Use the newest version (if artifacts have versions) +- `EXPLICIT_ONLY`: Require explicit space qualification + +### 4. Dependency Tracking + +**Dependency edges** are automatically created during resolution. 
+ +```python +# Edge types +EdgeType.REQUIRES # Input dependency (template → topic) +EdgeType.GENERATES # Output relationship (run → primer) +EdgeType.INCLUDES # Composition (nested templates) +``` + +**Graph operations:** +```python +# Find all artifacts that depend on authoring-rules +dependents = query_service.find_transitive_dependents("authoring-rules-id") + +# Find all inputs needed to regenerate a primer +dependencies = query_service.find_transitive_dependencies("etl-primer-id") + +# Detect circular dependencies +cycles = query_service.detect_circular_dependencies() +``` + +### 5. Traceability + +**ProvenanceTrace** captures complete lineage. + +```python +trace = trace_service.trace_artifact(artifact_id) + +print(trace.producing_run) # Run that generated this +print(trace.template) # Template used +print(trace.input_artifacts) # All input dependencies +print(trace.validation_results) # Quality gate results +print(trace.impact_debt) # Suppressed recomputations +``` + +--- + +## Step-by-Step Walkthrough + +### Step 1: Initialize Repositories + +```python +from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository +from markitect.prompts.dependencies.repository import SQLiteDependencyRepository + +artifact_repo = SQLiteArtifactRepository("primers.db") +dep_repo = SQLiteDependencyRepository("primers.db") +``` + +**What this does:** +- Creates SQLite database with artifact and dependency tables +- Artifact table: id, space_id, name, content_digest, metadata +- Dependency table: source_id, target_id, edge_type, run_id + +### Step 2: Load Artifacts + +```python +# Read artifact file +content = Path("artifacts/topics/etl.md").read_text() + +# Create artifact +artifact = Artifact.create( + space_id="primer-topics", + name="etl", + content=content, + artifact_type=ArtifactType.CONTENT, +) + +# Store in repository +artifact = artifact_repo.create(artifact) +``` + +**Content-based addressing:** +```python +# If you modify the content +updated_content = 
content + "\n\n**New section added**" +artifact.update_content(updated_content) + +# Digest changes automatically +print(artifact.content_digest) # Different hash! +``` + +### Step 3: Create PromptTemplate + +```python +from markitect.prompts.templates.models import PromptTemplate, MacroReference + +template = PromptTemplate.create( + id="generate-primer-v1", + name="generate-primer", + content=template_content, + space_id="primer-templates", +) + +# Add macro dependencies +template.add_macro(MacroReference( + name="topic", + source_space="primer-topics" +)) +template.add_macro(MacroReference( + name="authoring_rules", + source_space="primer-guidelines" +)) +``` + +**Template content** (`templates/generate-primer.md`): +```markdown +# Generate InfoTech Primer + +## Topic +@{topic} + +## Guidelines +@{authoring_rules} + +## Research Protocol +@{research_prompt} + +Generate a complete primer following the authoring rules. +``` + +### Step 4: Resolve Dependencies + +```python +from markitect.prompts.resolver.resolver import PromptResolver +from markitect.prompts.resolver.strategy import ResolutionConfig, ResolutionStrategy + +resolver = PromptResolver(artifact_repo) + +config = ResolutionConfig( + strategy=ResolutionStrategy.FIRST_MATCH, + spaces=["primer-topics", "primer-guidelines"], +) + +resolution_result = resolver.resolve_template(template, config) + +if resolution_result.success: + for resolved in resolution_result.context.resolved_macros: + print(f"{resolved.macro_name} → {resolved.artifact.name}") +else: + print("Resolution failed:", resolution_result.context.errors) +``` + +**Resolution algorithm:** +1. Parse template to extract `@{macro_name}` references +2. For each macro: + - Search configured spaces in order + - Match by name (case-sensitive) + - Return first match (FIRST_MATCH strategy) +3. 
Build ResolutionResult with all resolved artifacts + +### Step 5: Compile Prompt + +```python +from markitect.prompts.resolver.compiler import ContextCompiler + +compiler = ContextCompiler() +compiled = compiler.compile(template, template_content, resolution_result) + +print(compiled.content) # Fully expanded prompt +print(compiled.content_digest) # Hash for caching +print(compiled.dependency_digests) # Map of macro → artifact digest +``` + +**Compiled output:** +```markdown +# Generate InfoTech Primer + +## Topic +A three-phase computing process where data is extracted from source systems, +transformed (including validation, cleaning, enrichment, and aggregation), +and loaded into a target data store or data warehouse. +... + +## Guidelines +[Full authoring-rules content] +... + +## Research Protocol +[Full research-prompt content] +... +``` + +### Step 6: Track Dependencies + +```python +from markitect.prompts.execution.manifest import RunManifest +from markitect.prompts.dependencies.graph import GraphBuilder + +# Create run manifest +manifest = RunManifest.create( + run_id="run-etl-001", + template_id=template.id, + template_name=template.name, + template_digest=template.content_digest, +) + +# Add resolved inputs +for resolved in resolution_result.context.resolved_macros: + manifest.add_resolved_input( + name=resolved.macro_name, + artifact_id=resolved.artifact.id, + space_id=resolved.space_id, + digest=resolved.artifact.content_digest, + ) + + # Create dependency edge + manifest.add_dependency_edge( + source_id=resolved.artifact.id, + target_id="run-etl-001", + edge_type="requires", + ) + +# Persist to database +builder = GraphBuilder(dep_repo) +edges = builder.persist_edges(manifest) +``` + +**Result:** Dependency edges stored in database: +``` +source_artifact_id | target_artifact_id | edge_type | run_id +--------------------|--------------------|-----------|----------- +etl-id | run-etl-001 | requires | run-etl-001 +authoring-rules-id | run-etl-001 | 
requires | run-etl-001 +research-prompt-id | run-etl-001 | requires | run-etl-001 +``` + +### Step 7: Generate Output + +```python +# In real usage, this would call an LLM API +# For demo, we create a mock output +output_content = """ +# ETL Primer + +## Definition +ETL (Extract, Transform, Load) is a data integration pattern... +[Generated content] +""" + +output_artifact = Artifact.create( + space_id="generated-primers", + name="etl-primer", + content=output_content, + artifact_type=ArtifactType.GENERATED, +) +output_artifact = artifact_repo.create(output_artifact) + +# Add to manifest +manifest.add_output_artifact( + artifact_id=output_artifact.id, + name=output_artifact.name, + digest=output_artifact.content_digest, + artifact_type="generated", +) + +manifest.add_dependency_edge( + source_id="run-etl-001", + target_id=output_artifact.id, + edge_type="generates", +) + +# Persist output edges +builder.persist_edges(manifest) +``` + +### Step 8: Trace Provenance + +```python +from markitect.prompts.traceability.service import TraceabilityService + +trace_service = TraceabilityService(artifact_repo, dep_repo, db_path="primers.db") + +# Trace the generated primer +trace = trace_service.trace_artifact(output_artifact.id) + +# Inspect provenance +print("Template:", trace.template.name if trace.template else "None") +print("Producing run:", trace.producing_run.run_id if trace.producing_run else "None") +print("Input artifacts:") +for inp in trace.input_artifacts: + print(f" - {inp.name} ({inp.artifact_type})") + +print("Dependency chain:") +for dep_id in trace.dependency_chain: + artifact = artifact_repo.get_by_id(dep_id) + print(f" - {artifact.name if artifact else dep_id}") +``` + +--- + +## Advanced Features + +### Incremental Recomputation + +When an input changes, automatically detect affected outputs: + +```python +from markitect.prompts.incremental.detector import ChangeDetector +from markitect.prompts.incremental.engine import IncrementalExecutionEngine +from 
markitect.prompts.incremental.models import RecomputeConfig + +# Detect change +detector = ChangeDetector("primers.db") +authoring_rules = artifact_repo.get_by_name("primer-guidelines", "authoring-rules") + +# User updates the file +new_content = Path("artifacts/guidelines/authoring-rules.md").read_text() +change = detector.detect_change(authoring_rules, new_content) + +if change: + detector.record_change(change) + + # Find affected primers + engine = IncrementalExecutionEngine("primers.db", query_service) + result = engine.recompute( + change, + config=RecomputeConfig(max_depth=2, impact_threshold=0.1), + old_content=authoring_rules.content, + new_content=new_content, + ) + + print(f"Total dependents: {result.total_dependents}") + print(f"Recomputed: {result.recomputed_count}") + print(f"Suppressed: {result.suppressed_count}") +``` + +**Recomputation strategies:** +- **max_depth**: Traverse dependency graph N levels +- **impact_threshold**: Only recompute if change magnitude > threshold +- **max_recomputes**: Budget limit to prevent runaway execution + +### Quality Validation + +Apply quality gates to generated primers: + +```python +from markitect.prompts.quality.validator import QualityValidator +from markitect.prompts.quality.gates.pattern_gate import PatternValidationGate + +# Create validation gate +gate = PatternValidationGate( + required_patterns=[ + r"## Definition", + r"## Context", + r"## Core Concepts", + r"## Scope and Non-Scope", + ], + forbidden_patterns=[ + r"TODO", + r"FIXME", + ], + gate_id="primer-structure-check", + name="Primer Structure Validator", +) + +validator = QualityValidator(gates=[gate], db_path="primers.db") + +# Validate output +results = validator.validate_artifact( + content=output_content, + artifact_id=output_artifact.id, + run_id="run-etl-001", +) + +if validator.all_passed(results): + print("✓ All quality gates passed") +else: + failed = validator.get_failed_gates(results) + for result in failed: + print(f"✗ {result.gate_id} 
failed") + for diag in result.diagnostics: + print(f" {diag.message}") +``` + +### Visualization + +Generate dependency graphs: + +```python +from markitect.prompts.visualization.graph import GraphExporter +from markitect.prompts.dependencies.queries import DependencyQueryService + +query_service = DependencyQueryService(dep_repo) + +# Find all related artifacts +deps = query_service.find_transitive_dependencies(output_artifact.id) +dependents = query_service.find_transitive_dependents(output_artifact.id) +all_ids = deps | dependents | {output_artifact.id} + +# Build graph +builder = GraphBuilder(dep_repo) +graph = builder.build_graph(all_ids) + +# Export to Mermaid +mermaid = GraphExporter.to_mermaid(graph, "Primer Dependencies") +Path("dependencies.mermaid").write_text(mermaid) + +# Export to DOT (Graphviz) +dot = GraphExporter.to_dot(graph, "Primer Dependencies") +Path("dependencies.dot").write_text(dot) +``` + +**Mermaid output:** +```mermaid +%%{ title: Primer Dependencies }%% +graph LR + etl-id-->|requires|run-etl-001 + authoring-rules-id-->|requires|run-etl-001 + research-prompt-id-->|requires|run-etl-001 + run-etl-001-.->|generates|etl-primer-id +``` + +--- + +## CLI Usage + +The Prompt Dependency Resolution infrastructure includes CLI commands: + +### Trace Provenance + +```bash +markitect prompt trace --database primers.db +``` + +Output (JSON): +```json +{ + "artifact_id": "abc-123-def-456", + "producing_run": { + "run_id": "run-etl-001", + "template_id": "generate-primer-v1", + "status": "success" + }, + "input_artifacts": [ + { + "artifact_id": "...", + "name": "etl", + "role": "input" + } + ], + "dependency_chain": ["...", "..."] +} +``` + +### Visualize Graph + +```bash +markitect prompt graph --format mermaid --database primers.db +``` + +### List Runs + +```bash +# All runs +markitect prompt runs --database primers.db + +# Filter by template +markitect prompt runs --template generate-primer-v1 --database primers.db + +# Filter by status +markitect 
prompt runs --status success --limit 10 --database primers.db +``` + +### Show Impact Debt + +```bash +# All stale artifacts +markitect prompt debt --database primers.db + +# Specific artifact +markitect prompt debt --artifact authoring-rules-id --database primers.db +``` + +### Graph Statistics + +```bash +markitect prompt stats --database primers.db +``` + +Output: +```json +{ + "total_nodes": 12, + "total_edges": 18, + "root_count": 3, + "leaf_count": 2, + "has_cycles": false +} +``` + +--- + +## Best Practices + +### 1. Organize Artifacts by Space + +``` +Clear separation of concerns: +- templates/ ← Reusable PromptTemplates +- topics/ ← Domain-specific content +- guidelines/ ← Standards and methodologies +- output/ ← Generated artifacts +``` + +### 2. Use Content Digests for Change Detection + +```python +# Don't compare content strings +if old_content != new_content: ... # ✗ Inefficient + +# Do compare digests +if artifact.has_changed(new_digest): ... # ✓ Fast, hash-based +``` + +### 3. Apply Quality Gates + +```python +# Define quality standards as code +gates = [ + PatternValidationGate(required_patterns=[...]), + SchemaValidationGate(schema={...}), +] + +# Fail fast if quality checks fail +if not validator.all_passed(results): + raise QualityError("Output does not meet standards") +``` + +### 4. Track All Dependencies + +```python +# Always persist dependency edges +manifest.add_dependency_edge(source, target, edge_type) +builder.persist_edges(manifest) + +# This enables: +# - Impact analysis +# - Incremental recomputation +# - Provenance tracing +``` + +### 5. Use Incremental Execution + +```python +# Don't regenerate everything on every change +config = RecomputeConfig( + max_depth=2, # Limit blast radius + impact_threshold=0.1, # Skip minor changes + max_recomputes=10, # Budget limit +) +``` + +### 6. 
Version Your Templates + +```yaml +# Include version in template metadata +--- +id: generate-primer-v1 +version: 1.0.0 +--- + +# When template changes significantly, create v2 +--- +id: generate-primer-v2 +version: 2.0.0 +--- +``` + +### 7. Leverage Traceability + +```python +# Use provenance traces for debugging +trace = trace_service.trace_artifact(failed_output_id) + +print("Inputs used:") +for inp in trace.input_artifacts: + print(f" {inp.name} @ {inp.content_digest[:8]}") + +# This helps identify which input caused the issue +``` + +--- + +## Comparison with Original System + +### Original (`prepdr/`) + +**GeneratePrimerTemplate.md:** +```markdown + +{{ETL}} + + + +{{AuthoringRules}} + +``` + +**Process:** +1. Manually copy-paste content +2. Replace `{{...}}` markers by hand +3. Run through LLM +4. No record of what inputs were used +5. No change detection + +**Limitations:** +- No automation +- No version control on inputs +- Can't regenerate from history +- No impact analysis when guidelines change + +### With Infrastructure + +**templates/generate-primer.md:** +```markdown +@{topic} +@{authoring_rules} +``` + +**Process:** +1. Define artifacts once +2. Create template with `@{...}` macros +3. Run resolver → compiler → executor +4. Full dependency graph persisted +5. Complete provenance trace available + +**Benefits:** +- Fully automated resolution +- Content-based change detection (SHA-256) +- Reproducible: "same inputs → same output" +- Impact analysis: "what needs regeneration?" +- Traceability: "how was this generated?" +- Quality validation: automated checks +- Visualization: see dependency relationships + +--- + +## Next Steps + +1. **Extend the example:** + - Add more topics (OAuth, Docker, Kubernetes) + - Create topic-specific quality gates + - Implement actual LLM integration + +2. **Build a workflow:** + - Git hooks to detect artifact changes + - CI/CD pipeline to regenerate affected primers + - Dashboard to show primer freshness + +3. 
**Add advanced features:** + - Version conflict resolution + - A/B testing different templates + - Batch generation with parallelization + +4. **Integrate with MarkiTect:** + - Use MarkiTect ingestion for artifact storage + - Query relationships with relational metadata + - Generate documentation sites from primers + +--- + +## References + +- [Prompt Dependency Resolution Specification](../../roadmap/prompt-dependency-resolution/) +- [MarkiTect Documentation](../../README.md) +- [Phase 8 Implementation](../../markitect/prompts/) + +--- + +**Questions or feedback?** File an issue or reach out to the maintainers. diff --git a/examples/content-generator/artifacts/guidelines/authoring-rules.md b/examples/content-generator/artifacts/guidelines/authoring-rules.md new file mode 100644 index 00000000..0ce07201 --- /dev/null +++ b/examples/content-generator/artifacts/guidelines/authoring-rules.md @@ -0,0 +1,200 @@ +--- +id: primer-authoring-rules-v1 +name: AuthoringRules +artifact_type: content +description: Comprehensive guidelines for writing effective InfoTech primers +version: 1.0.0 +tags: + - guidelines + - authoring + - quality-standards +--- + +# Primer Authoring Rules + +**Status:** Draft +**Intended Audience:** Human authors and AI systems generating or validating primers +**Purpose:** Ensure primers are precise, stable, and suitable as shared context for humans and AI agents + +--- + +## 1. What a Primer Is (Normative) + +An **InfoTechPrimer** is a **short, structured reference document** that establishes a **shared understanding** of a specific IT term, standard, method, or concept. + +A primer: +* Defines **what the topic is** +* Explains **where it fits** +* Clarifies **scope boundaries** +* Points to **authoritative sources** + +A primer does **not**: +* Teach step-by-step usage +* Advocate tools or vendors +* Explore implementation details beyond what is normatively defined + +--- + +## 2. 
Target Audience + +Primers are written for: +* Humans with solid general IT knowledge +* Readers who are *not specialists* in the specific topic +* AI systems that consume structured context for reasoning and coding + +Authors must assume: +* Conceptual literacy +* Familiarity with basic IT terminology +* No prior deep knowledge of the topic + +--- + +## 3. Required Structure (Mandatory) + +Every primer **MUST** contain the following sections **in this order**: + +1. **Definition** +2. **Context** +3. **Core Concepts** +4. **Scope and Non-Scope** +5. **Practical Implications** +6. **Formal Standards and Authoritative Sources** +7. **Related Concepts** + +No section may be omitted. Empty sections are not allowed. + +--- + +## 4. Section Authoring Rules + +### 4.1 Definition +**Purpose:** Establish an unambiguous baseline meaning. + +Rules: +* 2–4 sentences maximum +* Declarative, precise language +* No metaphors, examples, or analogies +* No historical narrative + +### 4.2 Context +**Purpose:** Position the concept within the IT landscape. + +Rules: +* Describe the domain(s) the concept belongs to +* Explain *why it exists*, not *how to use it* +* Historical notes allowed only if they clarify intent or constraints + +### 4.3 Core Concepts +**Purpose:** Identify the irreducible ideas that define the topic. + +Rules: +* Bullet points only +* Each bullet describes one concept +* No nested lists +* Avoid redundancy with Definition + +### 4.4 Scope and Non-Scope +**Purpose:** Prevent conceptual drift and misuse. + +Rules: +* Explicitly list inclusions and exclusions +* Use parallel structure +* Address common misconceptions + +Format: +```markdown +**In Scope** +- ... + +**Out of Scope** +- ... +``` + +This section is **critical** for AI agent correctness. + +### 4.5 Practical Implications +**Purpose:** Describe consequences of adopting or interacting with the concept. 
+ +Rules: +* Focus on effects, not instructions +* No step-by-step guidance +* Include tradeoffs where relevant + +### 4.6 Formal Standards and Authoritative Sources +**Purpose:** Anchor the primer in canonical truth. + +Rules: +* Prefer primary sources +* Include direct links +* Avoid blogs unless widely recognized and necessary + +Acceptable sources: +* RFCs, W3C Recommendations +* ISO / IEC standards +* NIST publications +* Official specifications +* Foundational academic papers + +At least **one** authoritative source is required. + +### 4.7 Related Concepts +**Purpose:** Enable semantic navigation. + +Rules: +* Short descriptions only (one line per concept) +* No deep explanations +* Avoid circular definitions + +--- + +## 5. Language and Style Rules + +Mandatory: +* Present tense +* Declarative sentences +* Neutral, technical tone + +Avoid: +* First-person language ("we", "you") +* Rhetorical questions +* Marketing language +* Informal phrasing +* Emojis + +--- + +## 6. Length Constraints + +A primer should typically be: +* **600–1,000 words total** +* Short enough to be read in one sitting +* Long enough to define boundaries clearly + +Exceeding this range requires justification. + +--- + +## 7. AI Optimization Rules (Explicit) + +Authors **SHOULD**: +* Use consistent terminology +* Avoid synonyms for core terms once defined +* Prefer explicit over implicit assumptions +* State constraints clearly + +Authors **MUST NOT**: +* Rely on context outside the document +* Assume tool- or framework-specific defaults +* Leave ambiguity where standards are explicit + +--- + +## 8. 
Validation Criteria (Checklist) + +A primer is valid if: +* [ ] All required sections are present +* [ ] Definition is precise and unambiguous +* [ ] Scope boundaries are explicit +* [ ] At least one authoritative source is linked +* [ ] No tutorial or marketing content exists +* [ ] Language follows declarative style rules diff --git a/examples/content-generator/artifacts/guidelines/research-prompt.md b/examples/content-generator/artifacts/guidelines/research-prompt.md new file mode 100644 index 00000000..b4dc20b2 --- /dev/null +++ b/examples/content-generator/artifacts/guidelines/research-prompt.md @@ -0,0 +1,104 @@ +--- +id: research-protocol-v1 +name: ResearchPrompt +artifact_type: content +description: Systematic research protocol for InfoTech topic investigation +version: 1.0.0 +tags: + - research + - methodology + - guidelines +--- + +# InfoTech Research Protocol + +Below is a systematic research protocol to thoroughly investigate any InfoTech topic before writing an InfoTechPrimer. + +**Purpose:** Produce a factually grounded, scope-aware, source-anchored research brief suitable as direct input for primer authoring. + +--- + +## Research Sections + +### 1. Canonical Definition +- Provide the most widely accepted definition(s) of the topic +- If multiple definitions exist, explain why and in which contexts they differ +- Prefer definitions from standards bodies, original designers, or official specifications + +### 2. Domain Context and Classification +- Which technical domain(s) does this topic belong to? + (e.g. systems programming, distributed systems, security, AI, quantum computing) +- What *type* of thing is it? + (e.g. protocol, framework, architectural style, API standard, SDK, language, library) +- At which abstraction level does it primarily operate? + +### 3. Historical Origin and Motivation +- Who introduced it and when? +- What concrete problem(s) was it created to solve? +- What existing approaches did it replace, extend, or formalize? 
+ +(Only include history that explains intent or constraints.) + +### 4. Core Concepts and Invariants +- List the essential concepts without which the topic would not make sense +- For each concept, explain its role in one or two sentences +- Identify any invariants, guarantees, or formal assumptions + +### 5. Scope Boundaries +- Clearly state what the topic explicitly covers +- Clearly state what it explicitly does NOT cover +- Identify common misconceptions or misuses + +This section should prevent overextension by AI systems. + +### 6. Practical Implications (Non-Tutorial) +- What design or architectural consequences follow from using this? +- What tradeoffs are inherent? +- What kinds of systems typically depend on it? + +Do NOT include step-by-step usage. + +### 7. Relationship to Adjacent Concepts +- List closely related standards, technologies, or terms +- For each, explain the relationship (complementary, layered on top, alternative, predecessor) + +### 8. Authoritative Sources +- List primary, authoritative references: + - Standards (RFCs, ISO, W3C, IEEE, etc.) + - Official specifications or documentation + - Foundational papers +- Include direct links +- Clearly distinguish primary sources from secondary explanations + +### 9. Stability and Maturity Assessment +- Is this topic considered stable, evolving, or experimental? +- Are there competing standards or dominant implementations? +- Is backward compatibility a concern? + +### 10. 
Notes for Primer Authoring +- Highlight points that MUST be stated clearly in a primer +- Highlight areas where ambiguity must be avoided +- Identify terminology that must be used consistently + +--- + +## Research Constraints + +- Use precise, declarative language +- No metaphors or analogies +- No marketing or opinionated statements +- Assume a technically literate audience +- Prefer explicit statements over implied assumptions + +--- + +## Why This Protocol Works + +This protocol is intentionally shaped to: +* **Force scope clarity** (critical for AI agents) +* **Surface invariants and constraints** +* **Separate definition from implementation** +* **Anchor everything in primary sources** +* **Produce output that maps 1:1 to Primer Authoring Rules** + +Think of it as: *A pre-primer that de-risks the primer.* diff --git a/examples/content-generator/artifacts/topics/etl.md b/examples/content-generator/artifacts/topics/etl.md new file mode 100644 index 00000000..f094f491 --- /dev/null +++ b/examples/content-generator/artifacts/topics/etl.md @@ -0,0 +1,31 @@ +--- +id: topic-etl +name: ETL +artifact_type: content +description: Topic definition for ETL (Extract, Transform, Load) +version: 1.0.0 +tags: + - data-engineering + - data-integration + - topic +--- + +# ETL (Extract, Transform, Load) + +A three-phase computing process where data is extracted from source systems, transformed (including validation, cleaning, enrichment, and aggregation), and loaded into a target data store or data warehouse. + +ETL is a fundamental pattern in data integration and analytics pipelines, enabling organizations to consolidate data from heterogeneous sources into a unified format suitable for analysis and reporting. 
+ +**Key Characteristics:** +- Sequential batch-oriented processing +- Data quality enforcement during transformation +- Schema mapping and normalization +- Support for diverse source and target systems +- Typically scheduled and automated + +**Common Use Cases:** +- Data warehouse population +- Business intelligence reporting +- Data migration projects +- Master data management +- Regulatory compliance reporting diff --git a/examples/content-generator/artifacts/topics/microservices.md b/examples/content-generator/artifacts/topics/microservices.md new file mode 100644 index 00000000..e3fe2896 --- /dev/null +++ b/examples/content-generator/artifacts/topics/microservices.md @@ -0,0 +1,31 @@ +--- +id: topic-microservices +name: Microservices +artifact_type: content +description: Topic definition for Microservices architecture +version: 1.0.0 +tags: + - architecture + - distributed-systems + - topic +--- + +# Microservices Architecture + +An architectural style that structures an application as a collection of loosely coupled, independently deployable services, each implementing a specific business capability. + +Microservices represent a departure from monolithic architecture, emphasizing service autonomy, bounded contexts, and decentralized data management. 
+ +**Key Characteristics:** +- Independent deployment and scaling +- Service-oriented API contracts (typically REST or gRPC) +- Decentralized data management (database-per-service) +- Polyglot persistence and technology diversity +- Failure isolation and resilience patterns + +**Common Use Cases:** +- Large-scale web applications +- Cloud-native applications +- Continuous delivery environments +- Organizations requiring team autonomy +- Systems requiring differential scaling diff --git a/examples/content-generator/generate_primers.py b/examples/content-generator/generate_primers.py new file mode 100755 index 00000000..f4e61210 --- /dev/null +++ b/examples/content-generator/generate_primers.py @@ -0,0 +1,400 @@ +#!/usr/bin/env python3 +""" +Primer Generation using Prompt Dependency Resolution + +This script demonstrates the full workflow of generating InfoTech primers +using MarkiTect's Prompt Dependency Resolution infrastructure. + +Features demonstrated: +- Artifact creation and storage +- PromptTemplate with macro resolution +- Dependency tracking +- Quality validation +- Incremental recomputation +- Full provenance tracing +""" + +import sys +from pathlib import Path +from typing import Optional + +# Add project root to path +project_root = Path(__file__).parent.parent.parent +sys.path.insert(0, str(project_root)) + +from markitect.prompts.models import Artifact, ArtifactType +from markitect.prompts.repositories.sqlite import SQLiteArtifactRepository +from markitect.prompts.dependencies.repository import SQLiteDependencyRepository +from markitect.prompts.services.artifact_service import ArtifactService +from markitect.prompts.templates.models import PromptTemplate, ContentMacro, MacroKind +from markitect.prompts.templates.analyzer import TemplateAnalyzer +from markitect.prompts.resolver.resolver import PromptResolver +from markitect.prompts.resolver.compiler import ContextCompiler +from markitect.prompts.resolver.strategy import ResolutionConfig, 
MultiSpaceResolutionStrategy +from markitect.prompts.execution.models import PromptRun, RunConfig +from markitect.prompts.execution.manifest import RunManifest +from markitect.prompts.dependencies.graph import GraphBuilder +from markitect.prompts.traceability.service import TraceabilityService +from markitect.prompts.queries.operations import PromptQueryService +from markitect.prompts.visualization.graph import GraphExporter + + +class PrimerGenerator: + """Generates InfoTech primers using prompt dependency resolution.""" + + def __init__(self, db_path: str = "primers.db"): + """Initialize with database path.""" + self.db_path = db_path + self.artifact_repo = SQLiteArtifactRepository(db_path) + self.artifact_service = ArtifactService(self.artifact_repo) + self.dep_repo = SQLiteDependencyRepository(db_path) + self.graph_builder = GraphBuilder(self.dep_repo) + self.trace_service = TraceabilityService( + self.artifact_repo, self.dep_repo, db_path=db_path + ) + self.query_service = PromptQueryService( + self.artifact_repo, self.dep_repo, db_path=db_path + ) + + # Create information spaces + self.spaces = { + "templates": "primer-templates", + "topics": "primer-topics", + "guidelines": "primer-guidelines", + "output": "generated-primers", + } + + def load_or_create_artifact( + self, space: str, filepath: Path, artifact_type: ArtifactType, name: Optional[str] = None + ) -> tuple[Artifact, str]: + """Load artifact from file or fetch from repository. 
Returns (artifact, content).""" + if name is None: + name = filepath.stem + + # Load content from file + if not filepath.exists(): + raise FileNotFoundError(f"Artifact file not found: {filepath}") + + content = filepath.read_text() + + # Check if artifact exists + existing = self.artifact_repo.get_by_name(space, name) + if existing: + print(f"✓ Found existing artifact: {name}") + return existing, content + + # Create and store artifact + artifact = Artifact.create( + space_id=space, + name=name, + content=content, + artifact_type=artifact_type, + ) + artifact = self.artifact_repo.create(artifact) + print(f"✓ Created artifact: {name} (digest: {artifact.content_digest[:8]})") + return artifact, content + + def setup_artifacts(self, example_dir: Path): + """Load all artifacts into the repository.""" + print("\n=== Loading Artifacts ===") + + # Cache for artifact content (repository doesn't store content) + self.artifact_content = {} + + # Load template + template_file = example_dir / "templates" / "generate-primer.md" + self.template_artifact, content = self.load_or_create_artifact( + self.spaces["templates"], + template_file, + ArtifactType.TEMPLATE, + ) + self.artifact_content[self.template_artifact.id] = content + + # Load topic artifacts + topics_dir = example_dir / "artifacts" / "topics" + self.topic_artifacts = {} + for topic_file in topics_dir.glob("*.md"): + artifact, content = self.load_or_create_artifact( + self.spaces["topics"], + topic_file, + ArtifactType.CONTENT, + ) + self.topic_artifacts[artifact.name] = artifact + self.artifact_content[artifact.id] = content + + # Load guideline artifacts (rename to match macro expectations) + guidelines_dir = example_dir / "artifacts" / "guidelines" + self.guideline_artifacts = {} + guideline_name_map = { + "authoring-rules.md": "authoring_rules", + "research-prompt.md": "research_prompt", + } + for guide_file in guidelines_dir.glob("*.md"): + # Use mapped name if available, otherwise use stem + artifact_name = 
guideline_name_map.get(guide_file.name, guide_file.stem) + artifact, content = self.load_or_create_artifact( + self.spaces["guidelines"], + guide_file, + ArtifactType.CONTENT, + name=artifact_name, + ) + self.guideline_artifacts[artifact.name] = artifact + self.artifact_content[artifact.id] = content + + def create_template(self) -> PromptTemplate: + """Create PromptTemplate from template artifact.""" + print("\n=== Parsing Template ===") + + # Create PromptTemplate from artifact + template = PromptTemplate.from_artifact(self.template_artifact) + + # For demo: manually add macros since analyzer would extract from {{...}} + # In real usage, analyzer would parse the content automatically + template.macros = [ + ContentMacro(kind=MacroKind.REQUIRED, target="topic"), + ContentMacro(kind=MacroKind.REQUIRED, target="authoring_rules"), + ContentMacro(kind=MacroKind.REQUIRED, target="research_prompt"), + ] + template.analyzed = True + + print(f"✓ Template created with {len(template.macros)} macro dependencies") + return template + + def generate_primer(self, topic_name: str, output_name: Optional[str] = None) -> Optional[Artifact]: + """Generate a primer for the given topic; returns None if resolution fails.""" + print(f"\n=== Generating Primer: {topic_name} ===") + + # Get topic artifact + topic_artifact = self.topic_artifacts.get(topic_name) + if not topic_artifact: + raise ValueError(f"Topic not found: {topic_name}") + + # Get the topic's content from cache + topic_content = self.artifact_content.get(topic_artifact.id) + if not topic_content: + raise ValueError(f"Topic content not found for: {topic_name}") + + # Create a temporary "topic" artifact with this specific topic's content + # This allows the template's @{topic} macro to resolve + temp_topic = Artifact.create( + space_id=self.spaces["topics"], + name="topic", + content=topic_content, + artifact_type=ArtifactType.CONTENT, + ) + + # Check if "topic" artifact already exists and delete it + existing_topic =
self.artifact_repo.get_by_name(self.spaces["topics"], "topic") + if existing_topic: + self.artifact_repo.delete(existing_topic.id) + + temp_topic = self.artifact_repo.create(temp_topic) + self.artifact_content[temp_topic.id] = topic_content + print(f"✓ Bound topic macro to {topic_name}") + + # Create template + template = self.create_template() + + # Configure resolution + resolution_config = ResolutionConfig( + space_id=self.spaces["templates"], + included_spaces=[ + self.spaces["topics"], + self.spaces["guidelines"], + ], + ) + + print("\n=== Resolving Dependencies ===") + strategy = MultiSpaceResolutionStrategy() + resolver = PromptResolver(self.artifact_service, strategy) + resolution_result = resolver.resolve_template(template, resolution_config) + + if not resolution_result.success: + print(f"✗ Resolution failed: {resolution_result.context.errors}") + return None + + print(f"✓ Resolved {len(resolution_result.context.resolved_macros)} macros") + for resolved in resolution_result.context.resolved_macros: + print(f" - {resolved.macro.target} → {resolved.artifact.name}") + + # Compile the prompt + print("\n=== Compiling Prompt ===") + compiler = ContextCompiler() + + # For demo, use mock template content with macro markers + template_content = f""" +# Generate InfoTech Primer + +Topic: @{{topic}} +Guidelines: @{{authoring_rules}} +Research Protocol: @{{research_prompt}} + +Generate a complete primer following the authoring rules. 
+""" + + compiled = compiler.compile(template, template_content, resolution_result) + print(f"✓ Compiled prompt (digest: {compiled.content_digest[:8]})") + + # Create run manifest + run_id = f"run-{topic_name}-001" + manifest = RunManifest.create( + run_id=run_id, + template_id=template.artifact.id, + template_name=template.artifact.name, + template_digest=self.template_artifact.content_digest, + ) + + # Add resolved inputs + for resolved in resolution_result.context.resolved_macros: + manifest.add_resolved_input( + name=resolved.macro.target, + artifact_id=resolved.artifact.id, + space_id=resolved.space_id, + digest=resolved.artifact.content_digest, + ) + + # Create dependency edge + manifest.add_dependency_edge( + source_id=resolved.artifact.id, + target_id=run_id, + edge_type="requires", + ) + + # For demo: create output artifact (would normally come from LLM) + output_name = output_name or f"{topic_name}-primer" + output_content = f"""# {topic_name.upper()} Primer + +[Generated primer content would go here...] + +This primer was generated using the Prompt Dependency Resolution infrastructure. 
+ +**Dependencies:** +- Template: {template.artifact.name} +- Topic: {topic_artifact.name} +- Guidelines: authoring-rules, research-prompt + +**Run ID:** {run_id} +""" + + output_artifact = Artifact.create( + space_id=self.spaces["output"], + name=output_name, + content=output_content, + artifact_type=ArtifactType.GENERATED, + ) + output_artifact = self.artifact_repo.create(output_artifact) + + # Add output to manifest + manifest.add_output_artifact( + artifact_id=output_artifact.id, + name=output_artifact.name, + digest=output_artifact.content_digest, + artifact_type=output_artifact.artifact_type.value, + ) + + manifest.add_dependency_edge( + source_id=run_id, + target_id=output_artifact.id, + edge_type="generates", + ) + + # Persist all dependency edges at once + edges = self.graph_builder.persist_edges(manifest) + print(f"✓ Persisted {len(edges)} dependency edges") + + print(f"✓ Generated primer: {output_artifact.name}") + print(f" ID: {output_artifact.id}") + print(f" Digest: {output_artifact.content_digest[:8]}") + + return output_artifact + + def show_provenance(self, artifact_id: str): + """Display full provenance trace for an artifact.""" + print(f"\n=== Provenance Trace ===") + + trace = self.trace_service.trace_artifact(artifact_id) + + print(f"Artifact: {trace.artifact_id}") + + if trace.producing_run: + print(f"\nProducing Run: {trace.producing_run.run_id}") + print(f" Template: {trace.producing_run.template_id}") + print(f" Status: {trace.producing_run.status}") + + if trace.input_artifacts: + print(f"\nInput Artifacts ({len(trace.input_artifacts)}):") + for inp in trace.input_artifacts: + print(f" - {inp.name} ({inp.artifact_type})") + + if trace.dependency_chain: + print(f"\nDependency Chain ({len(trace.dependency_chain)} artifacts):") + for dep_id in trace.dependency_chain[:5]: # Show first 5 + print(f" - {dep_id[:8]}...") + + def visualize_dependencies(self, artifact_id: str, output_file: str = "dependencies.mermaid"): + """Generate dependency 
graph visualization.""" + print(f"\n=== Dependency Visualization ===") + + from markitect.prompts.dependencies.queries import DependencyQueryService + + query_svc = DependencyQueryService(self.dep_repo) + deps = query_svc.find_transitive_dependencies(artifact_id) + dependents = query_svc.find_transitive_dependents(artifact_id) + all_ids = deps | dependents | {artifact_id} + + graph = self.graph_builder.build_graph(all_ids) + mermaid = GraphExporter.to_mermaid(graph, f"Primer Generation Dependencies") + + Path(output_file).write_text(mermaid) + print(f"✓ Saved dependency graph to {output_file}") + + +def main(): + """Main execution flow.""" + print("╔══════════════════════════════════════════════════════════════╗") + print("║ MarkiTect Prompt Dependency Resolution Example ║") + print("║ InfoTech Primer Generation ║") + print("╚══════════════════════════════════════════════════════════════╝") + + example_dir = Path(__file__).parent + + # Initialize generator + generator = PrimerGenerator(db_path=str(example_dir / "primers.db")) + + # Setup artifacts + generator.setup_artifacts(example_dir) + + # Generate primers for multiple topics + topics = ["etl", "microservices"] + + generated_artifacts = [] + for topic in topics: + try: + artifact = generator.generate_primer(topic) + if artifact: + generated_artifacts.append(artifact) + except Exception as e: + print(f"✗ Failed to generate primer for {topic}: {e}") + + # Show provenance for first generated primer + if generated_artifacts: + artifact = generated_artifacts[0] + generator.show_provenance(artifact.id) + generator.visualize_dependencies(artifact.id) + + # Show statistics + print("\n=== Dependency Statistics ===") + stats = generator.query_service.get_dependency_stats() + print(f"Total nodes: {stats['total_nodes']}") + print(f"Total edges: {stats['total_edges']}") + print(f"Root artifacts: {stats['root_count']}") + print(f"Leaf artifacts: {stats['leaf_count']}") + print(f"Has cycles: {stats['has_cycles']}") + + 
print("\n✓ Primer generation complete!") + print(f" Database: {generator.db_path}") + print(f" Generated: {len(generated_artifacts)} primers") + + +if __name__ == "__main__": + main() diff --git a/examples/content-generator/prepdr/AssistentPrompt.md b/examples/content-generator/prepdr/AssistentPrompt.md new file mode 100644 index 00000000..7e50c5c1 --- /dev/null +++ b/examples/content-generator/prepdr/AssistentPrompt.md @@ -0,0 +1,154 @@ +ResearchPrompt + +*Research a topic...* + +# InfoTechPrimer – ResearchPrompt + +Below is a **single, reusable, high-precision research prompt** you can use to *systematically get a grip on any InfoTech topic* before writing an **InfoTechPrimer**. + +> **Purpose:** +> Produce a *factually grounded, scope-aware, source-anchored research brief* suitable as the direct input for authoring an InfoTechPrimer. + +It is designed to work well with: + +* General-purpose LLMs +* Research-oriented agents +* Human-in-the-loop review + +The prompt is **topic-agnostic**, but forces rigor, boundaries, and source grounding. + +--- + +## ResearchPrompt + +``` +You are conducting foundational research for an InfoTechPrimer. + +Topic: <$topic> + +The goal is NOT to teach or promote, but to establish a precise, shared understanding +of the topic for experienced IT professionals and AI systems. + +Produce a structured research brief that answers the following sections. +Be concise, factual, and source-driven. +Avoid tutorials, opinions, and vendor marketing language. + +--- + +1. Canonical Definition +- Provide the most widely accepted definition(s) of the topic. +- If multiple definitions exist, explain why and in which contexts they differ. +- Prefer definitions from standards bodies, original designers, or official specifications. + +--- + +2. Domain Context and Classification +- Which technical domain(s) does this topic belong to? + (e.g. systems programming, distributed systems, security, AI, quantum computing) +- What *type* of thing is it? 
+ (e.g. protocol, framework, architectural style, API standard, SDK, language, library) +- At which abstraction level does it primarily operate? + +--- + +3. Historical Origin and Motivation +- Who introduced it and when? +- What concrete problem(s) was it created to solve? +- What existing approaches did it replace, extend, or formalize? + +(Only include history that explains intent or constraints.) + +--- + +4. Core Concepts and Invariants +- List the essential concepts without which the topic would not make sense. +- For each concept, explain its role in one or two sentences. +- Identify any invariants, guarantees, or formal assumptions. + +--- + +5. Scope Boundaries +- Clearly state what the topic explicitly covers. +- Clearly state what it explicitly does NOT cover. +- Identify common misconceptions or misuses. + +This section should prevent overextension by AI systems. + +--- + +6. Practical Implications (Non-Tutorial) +- What design or architectural consequences follow from using this? +- What tradeoffs are inherent? +- What kinds of systems typically depend on it? + +Do NOT include step-by-step usage. + +--- + +7. Relationship to Adjacent Concepts +- List closely related standards, technologies, or terms. +- For each, explain the relationship (complementary, layered on top, alternative, predecessor). + +--- + +8. Authoritative Sources +- List primary, authoritative references: + - Standards (RFCs, ISO, W3C, IEEE, etc.) + - Official specifications or documentation + - Foundational papers +- Include direct links. +- Clearly distinguish primary sources from secondary explanations. + +--- + +9. Stability and Maturity Assessment +- Is this topic considered stable, evolving, or experimental? +- Are there competing standards or dominant implementations? +- Is backward compatibility a concern? + +--- + +10. Notes for Primer Authoring +- Highlight points that MUST be stated clearly in a primer. +- Highlight areas where ambiguity must be avoided. 
+- Identify terminology that must be used consistently. + +--- + +Constraints: +- Use precise, declarative language. +- No metaphors or analogies. +- No marketing or opinionated statements. +- Assume a technically literate audience. +- Prefer explicit statements over implied assumptions. +``` + +--- + +## Why This Prompt Works Well + +This prompt is intentionally shaped to: + +* **Force scope clarity** (critical for AI agents) +* **Surface invariants and constraints** +* **Separate definition from implementation** +* **Anchor everything in primary sources** +* **Produce output that maps 1:1 to your Primer Authoring Rules** + +You can think of it as: + +> *A pre-primer that de-risks the primer.* + +--- + +## How You’ll Likely Use It in Practice + +Typical flow: + +1. Run this prompt on a topic (e.g. CUDA, OAuth 2.0, OpenQASM) +2. Review and correct factual issues +3. Collapse the research brief into the Primer schema +4. Apply brand tone + structural rules +5. Publish as an InfoTechPrimer + +xxx diff --git a/examples/content-generator/prepdr/AuthoringRules.md b/examples/content-generator/prepdr/AuthoringRules.md new file mode 100644 index 00000000..a47f27f8 --- /dev/null +++ b/examples/content-generator/prepdr/AuthoringRules.md @@ -0,0 +1,302 @@ +AuthoringRules + +*How to write effective primers* + +## Primer Authoring Rules + +**Status:** Draft +**Intended Audience:** Human authors and AI systems generating or validating primers +**Purpose:** Ensure primers are precise, stable, and suitable as shared context for humans and AI agents + +--- + +## 1. What a Primer Is (Normative) + +An **InfoTechPrimer** is a **short, structured reference document** that establishes a **shared understanding** of a specific IT term, standard, method, or concept. 
+ +A primer: + +* Defines **what the topic is** +* Explains **where it fits** +* Clarifies **scope boundaries** +* Points to **authoritative sources** + +A primer does **not**: + +* Teach step-by-step usage +* Advocate tools or vendors +* Explore implementation details beyond what is normatively defined + +--- + +## 2. Target Audience + +Primers are written for: + +* Humans with solid general IT knowledge +* Readers who are *not specialists* in the specific topic +* AI systems that consume structured context for reasoning and coding + +Authors must assume: + +* Conceptual literacy +* Familiarity with basic IT terminology +* No prior deep knowledge of the topic + +--- + +## 3. Required Structure (Mandatory) + +Every primer **MUST** contain the following sections **in this order**: + +1. **Definition** +2. **Context** +3. **Core Concepts** +4. **Scope and Non-Scope** +5. **Practical Implications** +6. **Formal Standards and Authoritative Sources** +7. **Related Concepts** + +No section may be omitted. +Empty sections are not allowed. + +--- + +## 4. Section Authoring Rules + +### 4.1 Definition + +**Purpose:** Establish an unambiguous baseline meaning. + +Rules: + +* 2–4 sentences maximum +* Declarative, precise language +* No metaphors, examples, or analogies +* No historical narrative + +Good: + +> “OAuth 2.0 is an authorization framework that enables a third-party application to obtain limited access to an HTTP service on behalf of a resource owner.” + +Bad: + +> “OAuth is basically a way to let apps log in without sharing passwords.” + +--- + +### 4.2 Context + +**Purpose:** Position the concept within the IT landscape. + +Rules: + +* Describe the domain(s) the concept belongs to +* Explain *why it exists*, not *how to use it* +* Historical notes allowed only if they clarify intent or constraints + +Include: + +* Typical environments +* Architectural level (protocol, pattern, framework, etc.) 
+ +--- + +### 4.3 Core Concepts + +**Purpose:** Identify the irreducible ideas that define the topic. + +Rules: + +* Bullet points only +* Each bullet describes one concept +* No nested lists +* Avoid redundancy with Definition + +Think: *“What must be true for this concept to exist?”* + +--- + +### 4.4 Scope and Non-Scope + +**Purpose:** Prevent conceptual drift and misuse. + +Rules: + +* Explicitly list inclusions and exclusions +* Use parallel structure +* Address common misconceptions + +Format: + +```markdown +**In Scope** +- ... + +**Out of Scope** +- ... +``` + +This section is **critical** for AI agent correctness. + +--- + +### 4.5 Practical Implications + +**Purpose:** Describe consequences of adopting or interacting with the concept. + +Rules: + +* Focus on effects, not instructions +* No step-by-step guidance +* Include tradeoffs where relevant + +Examples of acceptable content: + +* Design constraints +* Operational complexity +* Security or scalability implications + +--- + +### 4.6 Formal Standards and Authoritative Sources + +**Purpose:** Anchor the primer in canonical truth. + +Rules: + +* Prefer primary sources +* Include direct links +* Avoid blogs unless widely recognized and necessary + +Acceptable sources: + +* RFCs +* W3C Recommendations +* ISO / IEC standards +* NIST publications +* Official specifications +* Foundational academic papers + +At least **one** authoritative source is required. + +--- + +### 4.7 Related Concepts + +**Purpose:** Enable semantic navigation. + +Rules: + +* Short descriptions only (one line per concept) +* No deep explanations +* Avoid circular definitions + +Example: + +> **OpenID Connect** – An identity layer built on top of OAuth 2.0. + +--- + +## 5. Language and Style Rules + +Mandatory: + +* Present tense +* Declarative sentences +* Neutral, technical tone + +Avoid: + +* First-person language (“we”, “you”) +* Rhetorical questions +* Marketing language +* Informal phrasing +* Emojis + +--- + +## 6. 
Length Constraints + +A primer should typically be: + +* **600–1,000 words total** +* Short enough to be read in one sitting +* Long enough to define boundaries clearly + +Exceeding this range requires justification. + +--- + +## 7. Stability and Versioning + +Primers are intended to be **stable reference artifacts**. + +Rules: + +* Do not chase trends +* Avoid speculative content +* Update only when: + + * Standards change + * Definitions evolve materially + * Authoritative sources are superseded + +When updating: + +* Preserve conceptual continuity +* Avoid rewriting without necessity + +--- + +## 8. AI Optimization Rules (Explicit) + +Authors **SHOULD**: + +* Use consistent terminology +* Avoid synonyms for core terms once defined +* Prefer explicit over implicit assumptions +* State constraints clearly + +Authors **MUST NOT**: + +* Rely on context outside the document +* Assume tool- or framework-specific defaults +* Leave ambiguity where standards are explicit + +--- + +## 9. Validation Criteria (Checklist) + +A primer is valid if: + +* [ ] All required sections are present +* [ ] Definition is precise and unambiguous +* [ ] Scope boundaries are explicit +* [ ] At least one authoritative source is linked +* [ ] No tutorial or marketing content exists +* [ ] Language follows declarative style rules + +--- + +## 10. Non-Goals + +InfoTechPrimers are **not**: + +* Documentation replacements +* Training material +* Opinionated best-practice guides +* Tool comparisons + +Those belong elsewhere. + +--- + +### Version Note + +This is **Primer Authoring Rules v0.1**. +Expect tightening, not loosening, as real primers are written and validated. 
+ + + +xxx diff --git a/examples/content-generator/prepdr/ETL.md b/examples/content-generator/prepdr/ETL.md new file mode 100644 index 00000000..f91e7513 --- /dev/null +++ b/examples/content-generator/prepdr/ETL.md @@ -0,0 +1,7 @@ +ETL + +*Extract Transform Load* + +A three-phase computing process where data is extracted from an input source, transformed (including cleaning), and loaded into an output data container. + +xxx diff --git a/examples/content-generator/prepdr/GeneratePrimerTemplate.md b/examples/content-generator/prepdr/GeneratePrimerTemplate.md new file mode 100644 index 00000000..f509a924 --- /dev/null +++ b/examples/content-generator/prepdr/GeneratePrimerTemplate.md @@ -0,0 +1,23 @@ +GeneratePrimerTemplate + +*Program to research infotech topics* + +# Generate Primer + +Assist me to the best of your ability in addressing the $topic, $guidance and $protoprompt I will be providing. +If something is unclear, show me the arguments and prompt for refinement before starting. + + +{{ETL}} + + + +{{AuthoringRules}} + + + +{{ResearchPrompt}} + + + +xxx diff --git a/examples/content-generator/prepdr/README.md b/examples/content-generator/prepdr/README.md new file mode 100644 index 00000000..808c7dd5 --- /dev/null +++ b/examples/content-generator/prepdr/README.md @@ -0,0 +1 @@ +I used these files to generate infotech primers for topics like ETL in a consistent way before formalizing the concept of prompt dependency resolution.
diff --git a/examples/content-generator/templates/generate-primer.md b/examples/content-generator/templates/generate-primer.md new file mode 100644 index 00000000..e5a13e8d --- /dev/null +++ b/examples/content-generator/templates/generate-primer.md @@ -0,0 +1,59 @@ +--- +id: generate-primer-v1 +name: GeneratePrimerTemplate +artifact_type: template +description: PromptTemplate for generating InfoTech primers with dependency resolution +version: 1.0.0 +--- + +# Generate InfoTech Primer + +You are conducting foundational research and authoring an InfoTech Primer. + +**Your Task:** Produce a complete, publication-ready InfoTech Primer for the given topic, following the authoring rules exactly. + +## Topic Specification + +@{topic} + +## Authoring Guidelines + +@{authoring_rules} + +## Research Protocol + +When researching this topic, follow this systematic approach: + +@{research_prompt} + +--- + +## Output Requirements + +Generate a complete primer document with the following structure: + +1. **Definition** (2-4 sentences, precise and unambiguous) +2. **Context** (domain positioning and architectural level) +3. **Core Concepts** (bullet points, one concept per line) +4. **Scope and Non-Scope** (explicit inclusions and exclusions) +5. **Practical Implications** (design consequences, not tutorials) +6. **Formal Standards and Authoritative Sources** (with links) +7. **Related Concepts** (one-line descriptions) + +**Constraints:** +- Follow declarative, neutral tone +- No metaphors, marketing language, or tutorials +- Target length: 600-1,000 words +- Include at least one authoritative source +- Use consistent terminology throughout + +**Validation:** +Before finalizing, verify: +- [ ] All required sections are present +- [ ] Definition is precise and unambiguous +- [ ] Scope boundaries are explicit +- [ ] At least one authoritative source is linked +- [ ] No tutorial or marketing content +- [ ] Language follows declarative style rules + +Generate the primer now.
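The template above relies on `@{topic}`, `@{authoring_rules}`, and `@{research_prompt}` being replaced with artifact content before the prompt is sent. As a rough illustration of that substitution step only, here is a minimal sketch in plain Python. It does not use or reproduce MarkiTect's actual API: the function name `resolve_macros`, the in-memory `artifacts` dictionary, and the choice to raise `KeyError` on an unknown macro are all assumptions made for this example.

```python
import re

# Matches @{name} macros in the style used by templates/generate-primer.md.
MACRO = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_macros(template: str, artifacts: dict[str, str]) -> tuple[str, list[str]]:
    """Replace each @{name} with artifacts[name].

    Returns the resolved text and the macro names used, in order of first
    appearance, which can serve as a simple dependency list.
    """
    used: list[str] = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in artifacts:
            # Hypothetical policy: fail loudly instead of leaving the macro in place.
            raise KeyError(f"unresolved macro: @{{{name}}}")
        if name not in used:
            used.append(name)
        return artifacts[name]

    return MACRO.sub(substitute, template), used


if __name__ == "__main__":
    template = "## Topic Specification\n\n@{topic}\n\n## Authoring Guidelines\n\n@{authoring_rules}\n"
    artifacts = {
        "topic": "ETL: Extract Transform Load",
        "authoring_rules": "Present tense. Declarative sentences.",
    }
    text, deps = resolve_macros(template, artifacts)
    print(deps)
    print(text)
```

Returning the list of macro names alongside the resolved text is one simple way to record which artifacts a generated prompt depends on; a real implementation would also need versioning and change detection, which this sketch leaves out.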