
tegwick cbde1dabc4 docs(prompts): add comprehensive implementation workplan

Create detailed 26-week workplan for Prompt Dependency Resolution system
implementing all 11 functional requirements across 8 phases:

- Phase 1-2: Foundation (artifacts, templates, macros)
- Phase 3-4: Resolution and execution engine with idempotent runs
- Phase 5-6: Dependency tracking and incremental recomputation
- Phase 7-8: Quality validation and observability/traceability

Includes database schemas, verification strategies, risk management,
and complete file structure for ~60 new modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-08 22:09:20 +01:00


Prompt Dependency Resolution - Implementation Workplan

Overview

This workplan details the implementation phases for building the Prompt Dependency Resolution infrastructure within MarkiTect. This system enables structured execution of PromptTemplates with deterministic dependency resolution, incremental recomputation, and quality validation across InformationSpaces.

The system transforms MarkiTect from a static markdown tool into an executable knowledge infrastructure that supports:

Template-driven content generation with LLMs
Automatic dependency tracking and resolution
Idempotent execution with content-based caching
Incremental recomputation with change impact analysis
Quality gate validation with halting policies

Functional Requirements Mapping

The implementation is organized into 8 phases that together cover all 11 functional requirements from the Functional Requirements Specification (FRS):

FR ID	Requirement	Implementation Phase
FR-1	InformationSpace Addressability	Phase 1: Foundation
FR-2	PromptTemplate Definition	Phase 2: Templates & Macros
FR-3	PromptResolver Behavior	Phase 3: Resolver Engine
FR-4	PromptRun Lifecycle	Phase 4: Execution Engine
FR-5	RunManifest Persistence	Phase 4: Execution Engine
FR-6	Dependency Graph Construction	Phase 5: Dependency Tracking
FR-7	Incremental Recompute	Phase 6: Incremental Execution
FR-8	Change Impact Assessment	Phase 6: Incremental Execution
FR-9	QualityGate Validation	Phase 7: Quality & Validation
FR-10	Halting and Refinement Policy	Phase 7: Quality & Validation
FR-11	Traceability and Auditability	Phase 8: Observability

Phase 1: Foundation - Addressable Artifacts (FR-1)

Capability Requirements

ID	Capability	Description	Priority
CAP-101	Artifact Identity	Persistent identifiers for content artifacts	Critical
CAP-102	Content Digest	SHA-256 content hashing for change detection	Critical
CAP-103	Artifact Registry	Lookup artifacts by name or ID within spaces	Critical
CAP-104	Cross-Space References	Reference artifacts across space boundaries	High
CAP-105	Artifact Metadata	Store artifact metadata (type, created, modified)	High

Implementation Tasks

Week 1: Core Models

Create markitect/prompts/models.py
- Artifact dataclass with id, name, space_id, content_digest, metadata
- ArtifactReference dataclass for cross-space addressing
- Content digest calculation utilities (SHA-256)
Create markitect/prompts/repositories/interfaces.py
- IArtifactRepository interface
Unit tests for artifact models and digest calculation
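
The Week 1 models could look roughly like the sketch below. It is a minimal illustration, not the planned implementation: the field names follow the task list above, while `compute_content_digest`, the UUID-based default `id`, and the `<space_id>/<name>` string form of `ArtifactReference` are assumptions made for this example.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


def compute_content_digest(content: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Artifact:
    # Field names follow the Week 1 task list; the defaults are assumptions.
    name: str
    space_id: str
    content_digest: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ArtifactReference:
    """Cross-space address, rendered as '<space_id>/<name>' (assumed format)."""
    space_id: str
    name: str

    def __str__(self) -> str:
        return f"{self.space_id}/{self.name}"


glossary = Artifact(
    name="glossary",
    space_id="my-project",
    content_digest=compute_content_digest("# Glossary\n"),
)
```

Because the digest depends only on content, two artifacts with identical text share a digest regardless of name or space, which is what the idx_artifacts_digest index in the schema below is meant to exploit.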

Week 2: Repository Implementation

Create markitect/prompts/repositories/sqlite.py
- SQLiteArtifactRepository implementing IArtifactRepository
- CRUD operations with content digest tracking
- Cross-space artifact lookup
Database migration scripts
Repository unit tests

Week 3: Artifact Service

Create markitect/prompts/services/artifact_service.py
- Register artifacts with automatic digest calculation
- Query artifacts by name, ID, or digest
- Track artifact modifications with digest updates
Integration tests with existing InformationSpace service

Database Schema

CREATE TABLE prompt_artifacts (
    id TEXT PRIMARY KEY,
    space_id TEXT NOT NULL REFERENCES spaces(id),
    name TEXT NOT NULL,
    artifact_type TEXT NOT NULL,
    content_digest TEXT NOT NULL,
    content_size INTEGER,
    metadata JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(space_id, name)
);

CREATE INDEX idx_artifacts_digest ON prompt_artifacts(content_digest);
CREATE INDEX idx_artifacts_space ON prompt_artifacts(space_id);

Verification

pytest tests/unit/prompts/test_artifact_models.py
pytest tests/unit/prompts/test_artifact_repository.py
pytest tests/integration/prompts/test_artifact_service.py

Phase 2: Templates & Macros (FR-2)

Capability Requirements

ID	Capability	Description	Priority
CAP-201	PromptTemplate Model	Template definition with content and metadata	Critical
CAP-202	ContentMacro Detection	Parse and extract macros from template content	Critical
CAP-203	Macro Types	Support Required, Optional, Generate macro kinds	Critical
CAP-204	Template Analysis	Analyze templates to extract macro dependencies	High
CAP-205	Template Validation	Validate template syntax and macro references	Medium

Implementation Tasks

Week 4: Template Models

Create markitect/prompts/templates/models.py
- PromptTemplate dataclass extending Artifact
- ContentMacro dataclass with kind, target, parameters
- MacroKind enum: REQUIRED, OPTIONAL, GENERATE
- TemplateMetadata for template-specific metadata
Unit tests for template models

Week 5: Macro Parser

Create markitect/prompts/templates/parser.py
- Regex-based macro extraction from markdown content
- Support macro syntax: {{require:artifact}}, {{optional:artifact}}, {{generate:template}}
- Parameter parsing for macro arguments
Create markitect/prompts/templates/analyzer.py
- TemplateAnalyzer class for dependency extraction
- Identify all macros and their types
- Build initial dependency list
Parser and analyzer unit tests

Week 6: Template Service

Create markitect/prompts/services/template_service.py
- Register templates with automatic analysis
- Query templates by ID or name
- Retrieve template with analyzed macro list
Integration tests

Template Syntax

# Example PromptTemplate

## Context

{{require:project-overview}}
{{optional:technical-constraints}}

## Task Description

Generate a technical design for {{require:feature-name}}.

## Previous Designs

{{generate:related-designs-collector}}

Macro Format

{{<kind>:<target>[|<param1>=<value1>|<param2>=<value2>...]}}

Examples:
{{require:glossary/authentication}}
{{optional:standards/api-design}}
{{generate:code-examples|language=python|framework=fastapi}}
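
A regex-based parser for this macro format might be sketched as follows. The pattern, the `parse_macros` function name, and the choice to store all parameter values as strings are assumptions for illustration; the planned parser.py may differ.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class MacroKind(Enum):
    REQUIRED = "require"
    OPTIONAL = "optional"
    GENERATE = "generate"


@dataclass
class ContentMacro:
    kind: MacroKind
    target: str
    parameters: Dict[str, str] = field(default_factory=dict)


# {{<kind>:<target>[|<param>=<value>...]}}
MACRO_PATTERN = re.compile(
    r"\{\{(require|optional|generate):([^|}]+)((?:\|[^|}=]+=[^|}]*)*)\}\}"
)


def parse_macros(content: str) -> List[ContentMacro]:
    """Extract every macro from template content, in document order."""
    macros = []
    for match in MACRO_PATTERN.finditer(content):
        kind, target, raw_params = match.groups()
        params = {}
        # raw_params looks like "|a=1|b=2"; drop the empty leading piece
        for pair in filter(None, raw_params.split("|")):
            key, _, value = pair.partition("=")
            params[key.strip()] = value.strip()
        macros.append(ContentMacro(MacroKind(kind), target.strip(), params))
    return macros
```

Text that does not match the pattern (for example an unknown kind such as `{{include:x}}`) is ignored by this sketch; reporting it would be the job of the template validation capability (CAP-205).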

Verification

pytest tests/unit/prompts/test_template_models.py
pytest tests/unit/prompts/test_macro_parser.py
pytest tests/unit/prompts/test_template_analyzer.py
pytest tests/integration/prompts/test_template_service.py

Phase 3: Resolver Engine (FR-3)

Capability Requirements

ID	Capability	Description	Priority
CAP-301	Resolution Strategy	Deterministic multi-space resolution order	Critical
CAP-302	Required Macro Resolution	Fail on missing required artifacts	Critical
CAP-303	Optional Macro Resolution	Graceful fallback for missing optional artifacts	Critical
CAP-304	Generate Macro Detection	Identify generator templates for nested execution	High
CAP-305	Resolution Context	Track resolution state and errors	High

Implementation Tasks

Week 7: Resolver Core

Create markitect/prompts/resolver/models.py
- ResolutionContext with resolution order, resolved artifacts, errors
- ResolutionResult with success status, resolved content, unresolved macros
- ResolutionError for missing required artifacts
Create markitect/prompts/resolver/strategy.py
- ResolutionStrategy base class
- MultiSpaceResolutionStrategy implementing FR-3.1 order:
  1. Local InformationSpace
  2. Explicitly included InformationSpaces
  3. Default InformationSpace
  4. Team/Shared InformationSpace (if configured)
Unit tests for resolution strategy

Week 8: PromptResolver Implementation

Create markitect/prompts/resolver/resolver.py
- PromptResolver class
- resolve_template(template, context) -> ResolutionResult
- Handle Required macros: fail if not found (FR-3.2)
- Handle Optional macros: resolve to empty (FR-3.3)
- Detect Generate macros for deferred resolution (FR-3.4)
- Track resolution errors and warnings
Resolver unit tests

Week 9: Context Compilation

Create markitect/prompts/resolver/compiler.py
- ContextCompiler class
- Compile resolved artifacts into single prompt context
- Substitute macros with resolved content
- Generate CompiledPrompt with full context
Integration tests for full resolution flow
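
The substitution step of the Week 9 compiler can be illustrated with a small sketch. It assumes resolved content is passed in as a plain dict keyed by macro target, and that generate macros are left in place for deferred handling; `compile_prompt` is a hypothetical name, not the planned ContextCompiler API.

```python
import re
from typing import Dict

# Only require/optional macros are substituted here; generate macros are
# left untouched for deferred resolution (assumption for this sketch).
SUBSTITUTABLE = re.compile(r"\{\{(require|optional):([^|}]+)(?:\|[^}]*)?\}\}")


def compile_prompt(template: str, resolved: Dict[str, str]) -> str:
    """Replace each macro with its resolved content."""
    def substitute(match: "re.Match[str]") -> str:
        kind, target = match.group(1), match.group(2).strip()
        if target in resolved:
            return resolved[target]
        if kind == "optional":
            return ""  # missing optional artifacts resolve to empty
        raise KeyError(f"required artifact not resolved: {target}")

    return SUBSTITUTABLE.sub(substitute, template)
```

For example, compiling `"Context: {{require:overview}}{{optional:constraints}}"` with only `overview` resolved yields the overview text followed by nothing for the optional macro.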

Resolution Order Example

# Given template in space "my-project" referencing {{require:glossary}}
# Resolution search order:
1. my-project/glossary
2. <included-space-1>/glossary
3. <included-space-2>/glossary
4. default-space/glossary
5. shared-space/glossary  # if configured
# If not found: ResolutionError(MacroKind.REQUIRED, "glossary")
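
The search order above can be expressed as a short sketch. The in-memory `spaces` dict, the function names, and the single-argument `ResolutionError` are stand-ins chosen for this example; the planned MultiSpaceResolutionStrategy would query the artifact repository instead.

```python
from typing import Dict, List, Optional


class ResolutionError(Exception):
    """Raised when a required artifact is not found in any searched space."""


def build_search_order(
    local: str,
    included: List[str],
    default: str,
    shared: Optional[str] = None,
) -> List[str]:
    """Local, then included spaces, then default, then shared (if configured)."""
    order = [local, *included, default]
    if shared is not None:
        order.append(shared)
    # Drop duplicates while keeping the first occurrence of each space
    return list(dict.fromkeys(order))


def resolve_macro(
    target: str,
    search_order: List[str],
    spaces: Dict[str, Dict[str, str]],
    required: bool = True,
) -> str:
    """Return the content of the first match along the search order."""
    for space_id in search_order:
        content = spaces.get(space_id, {}).get(target)
        if content is not None:
            return content
    if required:
        raise ResolutionError(f"required artifact not found: {target}")
    return ""  # optional macros resolve to empty content
```

Because the order is a plain list built once per run, the same inputs always produce the same lookup sequence, which is the determinism that CAP-301 asks for.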

Database Schema Additions

CREATE TABLE prompt_resolution_config (
    space_id TEXT PRIMARY KEY REFERENCES spaces(id),
    included_spaces JSON,  -- Array of space IDs to search
    default_space_id TEXT REFERENCES spaces(id),
    shared_space_id TEXT REFERENCES spaces(id),
    max_generation_depth INTEGER DEFAULT 3,
    config JSON,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Verification

pytest tests/unit/prompts/test_resolution_strategy.py
pytest tests/unit/prompts/test_prompt_resolver.py
pytest tests/unit/prompts/test_context_compiler.py
pytest tests/integration/prompts/test_resolution_flow.py

Phase 4: Execution Engine (FR-4, FR-5)

Capability Requirements

ID	Capability	Description	Priority
CAP-401	PromptRun Lifecycle	Three-stage execution: Analysis, Compilation, Processing	Critical
CAP-402	InputBundleHash	Content-based execution fingerprinting	Critical
CAP-403	Idempotent Execution	Skip re-execution for identical input bundles	Critical
CAP-404	LLM Integration	Execute compiled prompts via LLM provider	Critical
CAP-405	RunManifest Persistence	Store complete execution provenance	Critical
CAP-406	Nested Generator Runs	Execute generate macros recursively	High

Implementation Tasks

Week 10: Execution Models

Create markitect/prompts/execution/models.py
- PromptRun dataclass with id, template_id, input_bundle_hash, status
- ExecutionStage enum: ANALYSIS, COMPILATION, PROCESSING, COMPLETE, FAILED
- RunConfig with model settings, depth limits, options
- InputBundle with template digest, dependency digests, config hash
- InputBundleHash calculation (SHA-256 of sorted input bundle)
Create markitect/prompts/execution/manifest.py
- RunManifest comprehensive execution record
- Template metadata, resolved inputs, compiled prompt digest
- Model configuration, output artifacts, validation results
- Dependency edges, timing metadata
Unit tests for execution models

Week 11: Execution Engine

Create markitect/prompts/execution/engine.py
- PromptExecutionEngine class
- execute(template, config) -> PromptRun
- Stage 1: Template analysis (use TemplateAnalyzer)
- Stage 2: Context compilation (use ContextCompiler)
- Stage 3: Prompt processing (LLM invocation)
- Calculate InputBundleHash before execution
- Check for existing run with same hash (FR-4.4)
- Store RunManifest on completion
Engine unit tests

Week 12: LLM Integration Layer

Create markitect/prompts/execution/llm_adapter.py
- LLMAdapter abstract base class
- execute_prompt(compiled_prompt, config) -> LLMResponse
- Mock implementation for testing
- OpenAI/Anthropic adapter stubs (to be implemented)
Create markitect/prompts/execution/generator.py
- GeneratorExecutor for nested generate macro execution
- Enforce max depth limit (FR-3.5)
- Track parent-child run relationships
- Link generator runs in RunManifest (FR-5.3)
Integration tests for full execution flow

Database Schema Additions

CREATE TABLE prompt_runs (
    id TEXT PRIMARY KEY,
    template_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
    input_bundle_hash TEXT NOT NULL,
    status TEXT NOT NULL,
    stage TEXT NOT NULL,
    parent_run_id TEXT REFERENCES prompt_runs(id),
    depth INTEGER DEFAULT 0,
    started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT,
    UNIQUE(input_bundle_hash)  -- Idempotency constraint
);

CREATE TABLE run_manifests (
    run_id TEXT PRIMARY KEY REFERENCES prompt_runs(id),
    template_metadata JSON NOT NULL,
    resolved_inputs JSON NOT NULL,
    compiled_prompt_digest TEXT NOT NULL,
    model_config JSON NOT NULL,
    output_artifacts JSON,
    dependency_edges JSON,
    validation_results JSON,
    impact_debt JSON,
    timing_metadata JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_runs_template ON prompt_runs(template_id);
-- No separate index on input_bundle_hash: SQLite indexes UNIQUE columns automatically
CREATE INDEX idx_runs_parent ON prompt_runs(parent_run_id);
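
The idempotency check (FR-4.4) against this table might be sketched as follows. The sketch uses a reduced in-memory copy of prompt_runs without foreign keys, and the `get_or_create_run` helper and the 'RUNNING' status value are assumptions for illustration.

```python
import sqlite3
import uuid
from typing import Tuple


def get_or_create_run(
    conn: sqlite3.Connection, template_id: str, input_bundle_hash: str
) -> Tuple[str, bool]:
    """Return (run_id, created). An existing run with the same hash is reused."""
    row = conn.execute(
        "SELECT id FROM prompt_runs WHERE input_bundle_hash = ?",
        (input_bundle_hash,),
    ).fetchone()
    if row is not None:
        return row[0], False
    run_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO prompt_runs (id, template_id, input_bundle_hash, status, stage) "
        "VALUES (?, ?, ?, 'RUNNING', 'ANALYSIS')",  # status value is assumed
        (run_id, template_id, input_bundle_hash),
    )
    conn.commit()
    return run_id, True


# Reduced schema for the example: foreign keys and timestamps omitted
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE prompt_runs (
        id TEXT PRIMARY KEY,
        template_id TEXT NOT NULL,
        input_bundle_hash TEXT NOT NULL UNIQUE,
        status TEXT NOT NULL,
        stage TEXT NOT NULL
    )
    """
)
```

The UNIQUE constraint remains the final safeguard: if two processes race past the SELECT, the second INSERT fails with sqlite3.IntegrityError rather than creating a duplicate run.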

InputBundleHash Calculation

import hashlib
import json
from typing import Any, Dict


def calculate_input_bundle_hash(
    template_digest: str,
    dependency_digests: Dict[str, str],  # {artifact_name: digest}
    config_hash: str,
    model_settings: Dict[str, Any],
) -> str:
    """
    Deterministic hash of complete input context.

    Components (sorted for determinism):
    1. Template content digest
    2. Sorted dependency digests by name
    3. Resolution configuration hash
    4. Model settings (name, temperature, etc.)

    Compilation options are not yet a parameter here; until they are,
    they must be folded into config_hash to affect the result.
    """
    bundle = {
        'template': template_digest,
        # Sort by name so dict insertion order cannot change the hash
        'dependencies': sorted(dependency_digests.items()),
        'config': config_hash,
        'model': sorted(model_settings.items()),
    }
    # sort_keys=True fixes the order of the top-level keys as well
    return hashlib.sha256(
        json.dumps(bundle, sort_keys=True).encode()
    ).hexdigest()

Verification

pytest tests/unit/prompts/test_execution_models.py
pytest tests/unit/prompts/test_execution_engine.py
pytest tests/unit/prompts/test_llm_adapter.py
pytest tests/unit/prompts/test_generator_executor.py
pytest tests/integration/prompts/test_prompt_execution.py
pytest tests/integration/prompts/test_idempotent_execution.py

Phase 5: Dependency Tracking (FR-6)

Capability Requirements

ID	Capability	Description	Priority
CAP-501	Dependency Edge Recording	Track input → output relationships	Critical
CAP-502	Dependency Graph Construction	Build queryable dependency graph	Critical
CAP-503	Circular Dependency Detection	Identify cycles in dependency chains	High
CAP-504	Dependency Query	Find dependents of any artifact	High
CAP-505	Cross-Space Dependencies	Track dependencies across spaces	Medium

Implementation Tasks

Week 13: Dependency Models

Create markitect/prompts/dependencies/models.py
- DependencyEdge with source_id, target_id, run_id, edge_type
- EdgeType enum: REQUIRES, GENERATES, INCLUDES
- DependencyGraph class for graph operations
- CircularDependencyError exception
Unit tests for dependency models

Week 14: Graph Builder

Create markitect/prompts/dependencies/graph.py
- GraphBuilder class
- Extract dependencies from RunManifest
- Add edges: artifact → run (input), run → artifact (output)
- Build adjacency list representation
- Cycle detection using DFS
Create markitect/prompts/dependencies/repository.py
- SQLiteDependencyRepository
- Store and query dependency edges
- Efficient dependent lookup queries
Graph builder and repository tests

Week 15: Query Operations

Create markitect/prompts/dependencies/queries.py
- find_dependents(artifact_id, depth=1) -> List[Artifact]
- find_dependencies(artifact_id) -> List[Artifact]
- get_dependency_chain(source_id, target_id) -> List[Edge]
- detect_circular_dependencies(artifact_id) -> List[Cycle]
Integration tests for dependency queries

Database Schema Additions

CREATE TABLE prompt_dependencies (
    id TEXT PRIMARY KEY,
    source_artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
    target_artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
    run_id TEXT NOT NULL REFERENCES prompt_runs(id),
    edge_type TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(source_artifact_id, target_artifact_id, run_id)
);

CREATE INDEX idx_deps_source ON prompt_dependencies(source_artifact_id);
CREATE INDEX idx_deps_target ON prompt_dependencies(target_artifact_id);
CREATE INDEX idx_deps_run ON prompt_dependencies(run_id);

Dependency Graph Example

Template: design-doc (ID: t1)
  → requires: glossary (ID: a1)
  → requires: requirements (ID: a2)
  → generates: api-spec (ID: a3)
    PromptRun: r1
      Edges:
        a1 → r1 (REQUIRES)
        a2 → r1 (REQUIRES)
        r1 → a3 (GENERATES)

When a1 (glossary) changes:
  Dependents(a1, depth=1) = [r1]
  Affected outputs = [a3]  (need recomputation)
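
The example graph can be reproduced with a small adjacency-list sketch covering the dependent lookup and DFS cycle detection planned for Weeks 14 and 15. Node IDs are plain strings here and the class is intentionally minimal; the method names and the choice to treat runs and artifacts as nodes of one graph are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set


class DependencyGraph:
    def __init__(self) -> None:
        self._out: Dict[str, List[str]] = defaultdict(list)

    def add_edge(self, source: str, target: str) -> None:
        self._out[source].append(target)

    def find_dependents(self, node: str, depth: int = 1) -> List[str]:
        """Nodes reachable from `node` in at most `depth` hops, in BFS order."""
        seen: Set[str] = {node}
        frontier = [node]
        result: List[str] = []
        for _ in range(depth):
            next_frontier = []
            for current in frontier:
                for neighbor in self._out.get(current, []):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        result.append(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        return result

    def has_cycle(self) -> bool:
        """Three-color DFS: reaching a GRAY node again means a back edge."""
        WHITE, GRAY, BLACK = 0, 1, 2
        color: Dict[str, int] = defaultdict(int)

        def visit(node: str) -> bool:
            color[node] = GRAY
            for neighbor in self._out.get(node, []):
                if color[neighbor] == GRAY:
                    return True
                if color[neighbor] == WHITE and visit(neighbor):
                    return True
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(self._out))


graph = DependencyGraph()
graph.add_edge("a1", "r1")  # glossary is an input of run r1
graph.add_edge("a2", "r1")  # requirements is an input of run r1
graph.add_edge("r1", "a3")  # run r1 produces api-spec
```

With these edges, a depth of 1 from a1 reaches only the run r1, while a depth of 2 also reaches the output a3. The recursive DFS is fine for small graphs; a large graph would need an iterative version to stay within Python's recursion limit.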

Verification

pytest tests/unit/prompts/test_dependency_models.py
pytest tests/unit/prompts/test_graph_builder.py
pytest tests/unit/prompts/test_dependency_repository.py
pytest tests/unit/prompts/test_dependency_queries.py
pytest tests/integration/prompts/test_dependency_graph.py
pytest tests/integration/prompts/test_circular_detection.py

Phase 6: Incremental Execution (FR-7, FR-8)

Capability Requirements

ID	Capability	Description	Priority
CAP-601	Change Detection	Detect artifact modifications via digest comparison	Critical
CAP-602	Incremental Recompute	Recompute direct dependents on change	Critical
CAP-603	Depth Control	Configurable recomputation depth (default=1)	High
CAP-604	Circular Suppression	Suppress recompute to prevent cycles	High
CAP-605	Change Impact Analysis	Calculate change magnitude metrics	High
CAP-606	Impact Debt Tracking	Record suppressed recomputations	Medium

Implementation Tasks

Week 16: Change Detection

Create markitect/prompts/incremental/models.py
- ArtifactChange with old_digest, new_digest, change_type
- ChangeType enum: CREATED, MODIFIED, DELETED
- ImpactDebt for suppressed recomputations
- RecomputeConfig with depth, circular handling, budget limits
Create markitect/prompts/incremental/detector.py
- ChangeDetector class
- Compare current digest with stored digest
- Identify change type and magnitude
Unit tests for change detection

Week 17: Impact Analysis

Create markitect/prompts/incremental/impact.py
- ImpactAnalyzer class
- Calculate change magnitude (FR-8.2):
  - Structural diff ratio (default)
  - Content diff ratio (character-level)
  - Optional: embedding distance
  - Optional: LLM-assessed impact
- Generate impact score (0.0 to 1.0)
Create markitect/prompts/incremental/metrics.py
- Diff calculation utilities
- Similarity scoring algorithms
Impact analyzer tests
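
Two of the change-magnitude metrics could be approximated with the standard library's difflib, as in the sketch below. Treating "structure" as the sequence of markdown heading lines is an assumption made for this example, as are the function names; embedding distance and LLM-assessed impact are out of scope here.

```python
import difflib


def content_diff_ratio(old: str, new: str) -> float:
    """Character-level change magnitude in [0.0, 1.0]; 0.0 means identical."""
    if old == new:
        return 0.0
    similarity = difflib.SequenceMatcher(None, old, new).ratio()
    return 1.0 - similarity


def structural_diff_ratio(old: str, new: str) -> float:
    """Change magnitude of the heading outline only (assumed notion of structure)."""
    old_headings = [ln.strip() for ln in old.splitlines() if ln.lstrip().startswith("#")]
    new_headings = [ln.strip() for ln in new.splitlines() if ln.lstrip().startswith("#")]
    if not old_headings and not new_headings:
        return 0.0
    similarity = difflib.SequenceMatcher(None, old_headings, new_headings).ratio()
    return 1.0 - similarity
```

Under this definition, rewriting a paragraph under an unchanged heading gives a structural score of 0.0 and a non-zero content score, so the two metrics can feed different recompute thresholds.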

Week 18: Incremental Recompute Engine

Create markitect/prompts/incremental/engine.py
- IncrementalExecutionEngine class
- recompute_dependents(artifact_id, config) -> RecomputeResult
- Find direct dependents via dependency graph (depth=1 default)
- Check for circular dependencies
- Execute prompt runs for affected dependents
- Track suppressed recomputations as ImpactDebt
- Record impact assessments in RunManifest (FR-8.3)
Integration tests for incremental execution

Recomputation Logic

def recompute_dependents(artifact_id: str, config: RecomputeConfig):
    """
    FR-7: Incremental recompute with depth control.

    1. Detect change in artifact
    2. Find dependents up to specified depth (default=1)
    3. For each dependent:
       - Check if recompute would create cycle → suppress if yes
       - Calculate change impact
       - If impact > threshold and budget available:
         - Recompute (re-execute PromptRun)
       - Else:
         - Record as ImpactDebt in RunManifest
    4. Return RecomputeResult with executed/suppressed counts
    """
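
The decision logic in step 3 might be sketched as below. To keep the example self-contained, the dependency lookup, cycle check, impact calculation, and execution are passed in as callables; that signature, the `impact_threshold` and `budget` fields, and the suppression reason strings are all assumptions rather than the planned API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RecomputeConfig:
    impact_threshold: float = 0.1  # assumed default
    budget: int = 10               # maximum runs to execute (assumed)


@dataclass
class RecomputeResult:
    executed: List[str] = field(default_factory=list)
    suppressed: Dict[str, str] = field(default_factory=dict)  # run id -> reason


def recompute_dependents_sketch(
    dependents: List[str],
    would_cycle: Callable[[str], bool],
    impact_of: Callable[[str], float],
    execute: Callable[[str], None],
    config: RecomputeConfig,
) -> RecomputeResult:
    """Execute or suppress each dependent run, recording why it was suppressed."""
    result = RecomputeResult()
    for run_id in dependents:
        if would_cycle(run_id):
            result.suppressed[run_id] = "circular"
        elif impact_of(run_id) <= config.impact_threshold:
            result.suppressed[run_id] = "below_threshold"
        elif len(result.executed) >= config.budget:
            result.suppressed[run_id] = "budget_exhausted"
        else:
            execute(run_id)
            result.executed.append(run_id)
    return result
```

Each entry in `suppressed` corresponds to one ImpactDebt record, with the reason string mapping to the suppression_reason column in the schema below.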

Database Schema Additions

CREATE TABLE artifact_changes (
    id TEXT PRIMARY KEY,
    artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
    old_digest TEXT,
    new_digest TEXT NOT NULL,
    change_type TEXT NOT NULL,
    detected_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE impact_debt (
    id TEXT PRIMARY KEY,
    artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
    dependent_run_id TEXT NOT NULL REFERENCES prompt_runs(id),
    change_magnitude REAL NOT NULL,
    suppression_reason TEXT NOT NULL,
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_changes_artifact ON artifact_changes(artifact_id);
CREATE INDEX idx_debt_artifact ON impact_debt(artifact_id);
CREATE INDEX idx_debt_run ON impact_debt(dependent_run_id);

Verification

pytest tests/unit/prompts/test_change_detector.py
pytest tests/unit/prompts/test_impact_analyzer.py
pytest tests/unit/prompts/test_incremental_engine.py
pytest tests/integration/prompts/test_incremental_recompute.py
pytest tests/integration/prompts/test_circular_suppression.py
pytest tests/integration/prompts/test_impact_debt.py

Phase 7: Quality & Validation (FR-9, FR-10)

Capability Requirements

ID	Capability	Description	Priority
CAP-701	Schema Validation	Validate generated artifacts against JSON schemas	High
CAP-702	QualityGate Framework	Pluggable validation framework	High
CAP-703	Validation Results	Record pass/fail with diagnostics	High
CAP-704	Halting Policy	Configurable execution halting rules	Medium
CAP-705	Refinement Loop	Iterative improvement with quality checks	Medium

Implementation Tasks

Week 19: QualityGate Framework

Create markitect/prompts/quality/models.py
- QualityGate abstract base class
- ValidationResult with status, diagnostics, score
- QualityPolicy with halting rules
- GateType enum: SCHEMA, PATTERN, CUSTOM
Create markitect/prompts/quality/gates/schema_gate.py
- SchemaValidationGate using existing schema validator
- Validate generated artifacts against JSON schemas
Create markitect/prompts/quality/gates/pattern_gate.py
- PatternValidationGate for regex-based checks
Unit tests for quality gates

Week 20: Validation Integration

Create markitect/prompts/quality/validator.py
- QualityValidator class
- Apply multiple gates to generated artifacts
- Aggregate validation results
- Record results in RunManifest (FR-9.3)
Integrate with execution engine
- Run quality gates after prompt processing
- Store validation results in RunManifest
Integration tests

Week 21: Halting Policy Engine

Create markitect/prompts/quality/policy.py
- HaltingPolicyEngine class
- Evaluate halting conditions (FR-10.2):
  - QualityGate failures
  - Marginal improvement below threshold
  - Iteration limit reached
  - Resource budget exhausted
- Record halting decisions in RunManifest (FR-10.3)
Create markitect/prompts/quality/refinement.py
- RefinementLoop for iterative improvement
- Execute → Validate → Halt or Refine
Policy engine and refinement loop tests
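
The halting conditions listed for Week 21 could be evaluated along the lines of the following sketch. It covers three of the four conditions (resource budgets are omitted), and the `halting_reason` function, the reason strings, and the use of one quality score per completed iteration are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HaltingRules:
    max_iterations: int = 3
    min_improvement: float = 0.05
    fail_on_validation_error: bool = True


def halting_reason(
    scores: List[float], gates_passed: bool, rules: HaltingRules
) -> Optional[str]:
    """Return why the refinement loop should stop, or None to continue.

    `scores` holds one quality score per completed iteration, oldest first.
    """
    if not gates_passed and rules.fail_on_validation_error:
        return "quality_gate_failure"
    if len(scores) >= rules.max_iterations:
        return "iteration_limit"
    if len(scores) >= 2 and scores[-1] - scores[-2] < rules.min_improvement:
        return "marginal_improvement"
    return None
```

Returning a reason string rather than a bare boolean makes it straightforward to record the halting decision in the RunManifest, as FR-10.3 requires.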

QualityGate Example

# Schema validation gate
schema_gate = SchemaValidationGate(
    schema_path="schemas/api-spec-schema.json"
)

# Pattern validation gate
pattern_gate = PatternValidationGate(
    required_patterns=[r"## Endpoints", r"### Authentication"],
    forbidden_patterns=[r"TODO", r"FIXME"]
)

# Quality policy
policy = QualityPolicy(
    gates=[schema_gate, pattern_gate],
    halting_rules={
        'max_iterations': 3,
        'min_improvement': 0.05,
        'fail_on_validation_error': True
    }
)
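
One possible shape for the pattern gate used above is sketched here. The constructor arguments match the example; the `validate` method, the minimal `ValidationResult`, and the diagnostic message wording are assumptions, and the PASS/FAIL status values follow the validation_results schema below.

```python
import re
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class ValidationResult:
    status: str  # "PASS" or "FAIL"
    diagnostics: List[str] = field(default_factory=list)


class PatternValidationGate:
    def __init__(
        self,
        required_patterns: Sequence[str],
        forbidden_patterns: Sequence[str] = (),
    ) -> None:
        self.required = [re.compile(p) for p in required_patterns]
        self.forbidden = [re.compile(p) for p in forbidden_patterns]

    def validate(self, content: str) -> ValidationResult:
        """Fail if any required pattern is absent or any forbidden one is present."""
        diagnostics = [
            f"missing required pattern: {p.pattern}"
            for p in self.required
            if not p.search(content)
        ]
        diagnostics += [
            f"found forbidden pattern: {p.pattern}"
            for p in self.forbidden
            if p.search(content)
        ]
        return ValidationResult("FAIL" if diagnostics else "PASS", diagnostics)
```

Collecting every violation before deciding the status, instead of stopping at the first, gives a refinement loop the full list of problems to address in one pass.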

Database Schema Additions

CREATE TABLE quality_gates (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    gate_type TEXT NOT NULL,
    config JSON NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE validation_results (
    id TEXT PRIMARY KEY,
    run_id TEXT NOT NULL REFERENCES prompt_runs(id),
    gate_id TEXT NOT NULL REFERENCES quality_gates(id),
    artifact_id TEXT REFERENCES prompt_artifacts(id),
    status TEXT NOT NULL,  -- PASS, FAIL, WARNING
    score REAL,
    diagnostics JSON,
    validated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_validations_run ON validation_results(run_id);
CREATE INDEX idx_validations_artifact ON validation_results(artifact_id);

Verification

pytest tests/unit/prompts/test_quality_gates.py
pytest tests/unit/prompts/test_quality_validator.py
pytest tests/unit/prompts/test_halting_policy.py
pytest tests/unit/prompts/test_refinement_loop.py
pytest tests/integration/prompts/test_quality_validation.py
pytest tests/integration/prompts/test_halting_execution.py

Phase 8: Observability & Traceability (FR-11)

Capability Requirements

ID	Capability	Description	Priority
CAP-801	Provenance Tracing	Trace any artifact to its producing run	High
CAP-802	Dependency Visualization	Visualize dependency graph	Medium
CAP-803	Run History	Query execution history	High
CAP-804	Audit Logging	Complete audit trail of all operations	Medium
CAP-805	GraphQL API	Query interface for all prompt operations	High
CAP-806	CLI Commands	Command-line tools for management	High

Implementation Tasks

Week 22: Traceability Service

Create markitect/prompts/traceability/service.py
- TraceabilityService class
- trace_artifact(artifact_id) -> ProvenanceTrace
- get_producing_run(artifact_id) -> PromptRun
- get_input_artifacts(run_id) -> List[Artifact]
- get_generator_runs(run_id) -> List[PromptRun]
- get_validation_history(artifact_id) -> List[ValidationResult]
Unit and integration tests

Week 23: Query & Visualization

Create markitect/prompts/visualization/graph.py
- Export dependency graph in DOT format
- Generate Mermaid diagrams
Create markitect/prompts/queries/
- Complex query operations
- Run history queries
- Impact analysis queries
Visualization and query tests
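
Exporting edges to DOT and Mermaid is mostly string formatting, as the sketch below suggests. It assumes edges arrive as (source_id, target_id, edge_type) tuples and that node IDs are simple alphanumeric strings; the function names are hypothetical, and IDs containing spaces or punctuation would need escaping in the Mermaid output.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str, str]  # (source_id, target_id, edge_type)


def to_dot(edges: Iterable[Edge]) -> str:
    """Render edges as a Graphviz DOT digraph with labeled edges."""
    lines = ["digraph dependencies {"]
    for source, target, edge_type in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{edge_type}"];')
    lines.append("}")
    return "\n".join(lines)


def to_mermaid(edges: Iterable[Edge]) -> str:
    """Render edges as a left-to-right Mermaid flowchart."""
    lines = ["graph LR"]
    for source, target, edge_type in edges:
        lines.append(f"  {source} -->|{edge_type}| {target}")
    return "\n".join(lines)


example_edges = [
    ("a1", "r1", "REQUIRES"),
    ("a2", "r1", "REQUIRES"),
    ("r1", "a3", "GENERATES"),
]
```

Calling `to_mermaid(example_edges)` produces a three-edge flowchart for the design-doc example from Phase 5.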

Week 24: API Layer - GraphQL

Create markitect/prompts/graphql/schema.py
- Extend existing GraphQL schema with prompt types
- PromptTemplate, PromptRun, Artifact, DependencyEdge types
- Queries: template, templates, run, runs, artifact, dependencies
- Mutations: executeTemplate, recomputeDependents
- Subscriptions: onRunComplete, onArtifactChange
Create markitect/prompts/graphql/resolvers.py
- Implement all query and mutation resolvers
GraphQL integration tests

Week 25: CLI Commands

Extend markitect/cli.py with prompt commands:
- markitect prompt template create/list/show/delete
- markitect prompt execute <template> [--config CONFIG]
- markitect prompt recompute <artifact> [--depth N]
- markitect prompt trace <artifact>
- markitect prompt graph <artifact> [--format dot|mermaid]
- markitect prompt runs [--template TEMPLATE] [--status STATUS]
- markitect prompt validate <artifact> [--gates GATES]
CLI integration tests

Week 26: Documentation & Polish

User guide for prompt dependency resolution
API documentation
Example prompt templates and workflows
Performance optimization
Final integration testing

GraphQL Schema Extensions

type PromptTemplate {
  id: ID!
  name: String!
  spaceId: ID!
  content: String!
  contentDigest: String!
  macros: [ContentMacro!]!
  metadata: JSON
  createdAt: DateTime!
  updatedAt: DateTime!
}

type ContentMacro {
  kind: MacroKind!
  target: String!
  parameters: JSON
}

enum MacroKind {
  REQUIRED
  OPTIONAL
  GENERATE
}

type PromptRun {
  id: ID!
  template: PromptTemplate!
  inputBundleHash: String!
  status: RunStatus!
  stage: ExecutionStage!
  parentRun: PromptRun
  depth: Int!
  manifest: RunManifest!
  startedAt: DateTime!
  completedAt: DateTime
}

type RunManifest {
  runId: ID!
  templateMetadata: JSON!
  resolvedInputs: [ResolvedInput!]!
  compiledPromptDigest: String!
  modelConfig: JSON!
  outputArtifacts: [Artifact!]!
  dependencyEdges: [DependencyEdge!]!
  validationResults: [ValidationResult!]!
  impactDebt: [ImpactDebt!]!
}

type DependencyEdge {
  id: ID!
  source: Artifact!
  target: Artifact!
  run: PromptRun!
  edgeType: EdgeType!
}

type Query {
  promptTemplate(id: ID!): PromptTemplate
  promptTemplates(spaceId: ID): [PromptTemplate!]!
  promptRun(id: ID!): PromptRun
  promptRuns(templateId: ID, status: RunStatus): [PromptRun!]!
  artifact(id: ID!): Artifact
  dependencies(artifactId: ID!, depth: Int): [DependencyEdge!]!
  traceArtifact(id: ID!): ProvenanceTrace!
}

type Mutation {
  createTemplate(input: CreateTemplateInput!): PromptTemplate!
  executeTemplate(templateId: ID!, config: ExecutionConfig): PromptRun!
  recomputeDependents(artifactId: ID!, config: RecomputeConfig): RecomputeResult!
}

type Subscription {
  onRunComplete(templateId: ID): PromptRun!
  onArtifactChange(spaceId: ID): ArtifactChange!
}

CLI Examples

# Create and execute a template
markitect prompt template create design-doc \
  --space my-project \
  --content @templates/design-template.md

markitect prompt execute design-doc \
  --config '{"model": "gpt-4", "temperature": 0.7}'

# Trace artifact provenance
markitect prompt trace api-spec
# Output:
# Artifact: api-spec (a3)
# Produced by: PromptRun r1
# Template: design-doc (t1)
# Input artifacts:
#   - glossary (a1)
#   - requirements (a2)
# Generated at: 2026-02-08 10:30:00

# Visualize dependencies
markitect prompt graph api-spec --format mermaid > deps.mmd

# Recompute after change
markitect prompt recompute glossary --depth 2
# Recomputing dependents of glossary...
# ✓ design-doc run r1 (api-spec regenerated)
# ✓ implementation-guide run r2 (guide regenerated)
# Summary: 2 runs executed, 0 suppressed

Verification

pytest tests/unit/prompts/test_traceability_service.py
pytest tests/unit/prompts/test_visualization.py
pytest tests/integration/prompts/test_graphql_api.py
pytest tests/integration/prompts/test_cli_commands.py
pytest tests/e2e/prompts/test_complete_workflow.py

Timeline Summary

Phase	Focus	Duration	Cumulative
1	Foundation - Addressable Artifacts	3 weeks	3 weeks
2	Templates & Macros	3 weeks	6 weeks
3	Resolver Engine	3 weeks	9 weeks
4	Execution Engine	3 weeks	12 weeks
5	Dependency Tracking	3 weeks	15 weeks
6	Incremental Execution	3 weeks	18 weeks
7	Quality & Validation	3 weeks	21 weeks
8	Observability & Traceability	5 weeks	26 weeks

Total: 26 weeks (~6 months)

Parallel Work Opportunities

Phases 7 (Quality) and 8 (Observability) can partially overlap
Documentation can be written incrementally throughout
CLI commands can start in parallel with Phase 7
GraphQL schema can be drafted early and implemented incrementally

Files to Create

Core Modules

markitect/prompts/
├── __init__.py
├── models.py                           # Phase 1: Artifact models
├── repositories/
│   ├── __init__.py
│   ├── interfaces.py                   # Phase 1: Repository interfaces
│   └── sqlite.py                       # Phase 1: SQLite implementations
├── templates/
│   ├── __init__.py
│   ├── models.py                       # Phase 2: Template models
│   ├── parser.py                       # Phase 2: Macro parser
│   └── analyzer.py                     # Phase 2: Template analyzer
├── resolver/
│   ├── __init__.py
│   ├── models.py                       # Phase 3: Resolution models
│   ├── strategy.py                     # Phase 3: Resolution strategies
│   ├── resolver.py                     # Phase 3: PromptResolver
│   └── compiler.py                     # Phase 3: Context compiler
├── execution/
│   ├── __init__.py
│   ├── models.py                       # Phase 4: Execution models
│   ├── manifest.py                     # Phase 4: RunManifest
│   ├── engine.py                       # Phase 4: Execution engine
│   ├── llm_adapter.py                  # Phase 4: LLM integration
│   └── generator.py                    # Phase 4: Generator executor
├── dependencies/
│   ├── __init__.py
│   ├── models.py                       # Phase 5: Dependency models
│   ├── graph.py                        # Phase 5: Graph builder
│   ├── repository.py                   # Phase 5: Dependency storage
│   └── queries.py                      # Phase 5: Graph queries
├── incremental/
│   ├── __init__.py
│   ├── models.py                       # Phase 6: Change models
│   ├── detector.py                     # Phase 6: Change detector
│   ├── impact.py                       # Phase 6: Impact analyzer
│   ├── metrics.py                      # Phase 6: Diff metrics
│   └── engine.py                       # Phase 6: Incremental engine
├── quality/
│   ├── __init__.py
│   ├── models.py                       # Phase 7: Quality models
│   ├── gates/
│   │   ├── __init__.py
│   │   ├── schema_gate.py             # Phase 7: Schema validation
│   │   └── pattern_gate.py            # Phase 7: Pattern validation
│   ├── validator.py                    # Phase 7: Quality validator
│   ├── policy.py                       # Phase 7: Halting policy
│   └── refinement.py                   # Phase 7: Refinement loop
├── traceability/
│   ├── __init__.py
│   └── service.py                      # Phase 8: Traceability
├── visualization/
│   ├── __init__.py
│   └── graph.py                        # Phase 8: Graph visualization
├── queries/
│   ├── __init__.py
│   └── operations.py                   # Phase 8: Complex queries
├── graphql/
│   ├── __init__.py
│   ├── schema.py                       # Phase 8: GraphQL schema
│   └── resolvers.py                    # Phase 8: Resolvers
└── services/
    ├── __init__.py
    ├── artifact_service.py             # Phase 1: Artifact operations
    └── template_service.py             # Phase 2: Template operations

Test Files

tests/unit/prompts/
├── test_artifact_models.py             # Phase 1
├── test_artifact_repository.py         # Phase 1
├── test_template_models.py             # Phase 2
├── test_macro_parser.py                # Phase 2
├── test_template_analyzer.py           # Phase 2
├── test_resolution_strategy.py         # Phase 3
├── test_prompt_resolver.py             # Phase 3
├── test_context_compiler.py            # Phase 3
├── test_execution_models.py            # Phase 4
├── test_execution_engine.py            # Phase 4
├── test_llm_adapter.py                 # Phase 4
├── test_generator_executor.py          # Phase 4
├── test_dependency_models.py           # Phase 5
├── test_graph_builder.py               # Phase 5
├── test_dependency_repository.py       # Phase 5
├── test_dependency_queries.py          # Phase 5
├── test_change_detector.py             # Phase 6
├── test_impact_analyzer.py             # Phase 6
├── test_incremental_engine.py          # Phase 6
├── test_quality_gates.py               # Phase 7
├── test_quality_validator.py           # Phase 7
├── test_halting_policy.py              # Phase 7
├── test_refinement_loop.py             # Phase 7
├── test_traceability_service.py        # Phase 8
└── test_visualization.py               # Phase 8

tests/integration/prompts/
├── test_artifact_service.py            # Phase 1
├── test_template_service.py            # Phase 2
├── test_resolution_flow.py             # Phase 3
├── test_prompt_execution.py            # Phase 4
├── test_idempotent_execution.py        # Phase 4
├── test_dependency_graph.py            # Phase 5
├── test_circular_detection.py          # Phase 5
├── test_incremental_recompute.py       # Phase 6
├── test_circular_suppression.py        # Phase 6
├── test_impact_debt.py                 # Phase 6
├── test_quality_validation.py          # Phase 7
├── test_halting_execution.py           # Phase 7
├── test_graphql_api.py                 # Phase 8
└── test_cli_commands.py                # Phase 8

tests/e2e/prompts/
└── test_complete_workflow.py           # Phase 8

Documentation Files

docs/prompts/
├── GETTING_STARTED.md
├── TEMPLATE_GUIDE.md
├── EXECUTION_GUIDE.md
├── DEPENDENCY_MANAGEMENT.md
├── QUALITY_GATES.md
└── API_REFERENCE.md

Database Migrations

migrations/prompts/
├── 001_create_artifacts_table.sql      # Phase 1
├── 002_create_resolution_config.sql    # Phase 3
├── 003_create_runs_and_manifests.sql   # Phase 4
├── 004_create_dependencies.sql         # Phase 5
├── 005_create_changes_and_debt.sql     # Phase 6
└── 006_create_quality_tables.sql       # Phase 7

Success Criteria

The implementation is considered complete when all of the following acceptance criteria from the FRS are met:

✅ FR-2 & FR-3: A PromptTemplate referencing Required, Optional, and Generate macros can be executed
✅ FR-3.4 & FR-4: Missing Generate dependencies are automatically generated and persisted
✅ FR-4.4: Re-running an unchanged PromptRun with identical InputBundleHash results in skipped execution
✅ FR-7: Changing an upstream artifact triggers recomputation of direct dependents
✅ FR-7.3: Circular recomputation is suppressed and logged
✅ FR-5: RunManifest contains complete provenance and dependency information
✅ FR-9: Schema validation failures are correctly recorded and influence halting policy

Additional Quality Metrics

Test Coverage: >85% for all prompt modules
Performance: Execute simple template in <500ms (excluding LLM call)
Performance: Build dependency graph for 1000 artifacts in <2s
Performance: Incremental recompute for 100 dependents in <5s
Documentation: Complete user guides for all major workflows
Integration: Zero regressions in existing MarkiTect functionality

Design Decisions

1. LLM Provider Abstraction

Decision: Abstract LLM integration behind LLMAdapter interface
Rationale: FRS explicitly does not prescribe LLM provider. Adapter pattern allows pluggable providers (OpenAI, Anthropic, local models).
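As a rough illustration of the adapter pattern described here, the sketch below defines a provider-agnostic interface and a mock implementation. Only the name LLMAdapter comes from this workplan; LLMRequest, the complete method, MockLLMAdapter, and run_prompt are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class LLMRequest:
    """Hypothetical request shape; the real model may carry more fields."""
    prompt: str
    model: str
    temperature: float = 0.0


class LLMAdapter(Protocol):
    """Provider-agnostic interface. The method name is an assumption."""

    def complete(self, request: LLMRequest) -> str: ...


class MockLLMAdapter:
    """Deterministic stand-in for tests; performs no network calls."""

    def complete(self, request: LLMRequest) -> str:
        return f"[mock:{request.model}] {request.prompt[:40]}"


def run_prompt(adapter: LLMAdapter, prompt: str, model: str = "test-model") -> str:
    # Callers depend only on the interface, so providers can be swapped freely.
    return adapter.complete(LLMRequest(prompt=prompt, model=model))
```

A production adapter would implement the same complete method against a real provider SDK, leaving callers such as run_prompt unchanged.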

2. Storage Backend

Decision: SQLite for persistence, in-memory graph for queries
Rationale: Consistent with existing InformationSpace implementation. SQLite provides ACID guarantees. In-memory graph enables fast traversal.
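One way to picture this split is to keep edges in a SQLite table and load them into an adjacency map for traversal. The sketch below assumes a hypothetical dependencies table with upstream_id and downstream_id columns; the actual schema is defined in the Phase 5 migration and may differ.

```python
import sqlite3
from collections import defaultdict


def open_edge_store() -> sqlite3.Connection:
    # Hypothetical table and column names, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dependencies (upstream_id TEXT NOT NULL, downstream_id TEXT NOT NULL)"
    )
    return conn


def load_dependents(conn: sqlite3.Connection) -> dict[str, set[str]]:
    """Build an in-memory map: artifact id -> ids of its direct dependents."""
    graph: dict[str, set[str]] = defaultdict(set)
    rows = conn.execute("SELECT upstream_id, downstream_id FROM dependencies")
    for upstream, downstream in rows:
        graph[upstream].add(downstream)
    return dict(graph)


conn = open_edge_store()
conn.executemany(
    "INSERT INTO dependencies VALUES (?, ?)",
    [("spec", "summary"), ("spec", "faq"), ("summary", "digest")],
)
graph = load_dependents(conn)
```

Traversal queries then run against graph without touching the database, while SQLite remains the durable source of truth.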

3. Macro Syntax

Decision: Use {{kind:target|param=value}} syntax in markdown
Rationale: Non-invasive in markdown source. Compatible with existing transclusion syntax. Easy to parse with regex.
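To make the syntax concrete, here is a minimal regex-based parser sketch. It assumes lowercase kind names such as required and generate, and that neither targets nor parameter values contain the characters | or }; the Macro class and parse_macros function are hypothetical names, not the Phase 2 parser itself.

```python
import re
from dataclasses import dataclass, field

# Matches {{kind:target}} followed by zero or more |key=value segments.
MACRO_RE = re.compile(
    r"\{\{(?P<kind>\w+):(?P<target>[^|}]+)(?P<params>(?:\|[^|}]+)*)\}\}"
)


@dataclass
class Macro:
    kind: str
    target: str
    params: dict[str, str] = field(default_factory=dict)


def parse_macros(text: str) -> list[Macro]:
    """Extract every macro from markdown text, in document order."""
    macros = []
    for match in MACRO_RE.finditer(text):
        params = {}
        for part in match.group("params").split("|"):
            if not part:
                continue  # skip the empty piece before the first '|'
            key, _, value = part.partition("=")
            params[key.strip()] = value.strip()
        macros.append(Macro(match.group("kind"), match.group("target").strip(), params))
    return macros
```

For example, parse_macros("See {{generate:summary|depth=2}}") yields one Macro with kind "generate", target "summary", and params {"depth": "2"}. Parameter values stay as strings; any type coercion would belong in a later validation step.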

4. Incremental Recompute Default Depth

Decision: Default depth=1 (direct dependents only)
Rationale: Per FR-7.2. Prevents cascading recomputation storms. User can increase depth explicitly when needed.
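The depth limit can be read as a bounded breadth-first traversal over the dependents graph. The following sketch assumes the graph is a plain mapping from an artifact id to the ids of its direct dependents; dependents_within_depth is a hypothetical helper, not a function from the planned modules.

```python
from collections import deque


def dependents_within_depth(
    graph: dict[str, set[str]], changed: str, depth: int = 1
) -> list[str]:
    """Return dependents of `changed` reachable in at most `depth` hops."""
    seen = {changed}
    order: list[str] = []
    queue = deque([(changed, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # depth budget exhausted; do not expand further
        for dep in sorted(graph.get(node, ())):  # sorted for deterministic order
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append((dep, dist + 1))
    return order
```

With the default depth of 1, only direct dependents are returned; passing a larger depth widens the recomputation set one hop at a time.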

5. Circular Dependency Handling

Decision: Suppress recomputation, record as ImpactDebt
Rationale: Per FR-7.3. Avoids infinite loops. Debt tracking ensures visibility into suppressed updates.
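One possible reading of this decision: while walking dependents, an edge that leads back to an artifact already on the current propagation path is not followed, and a debt record is written instead. The sketch below is an assumption about how that could look; the ImpactDebt fields and plan_recompute are hypothetical, and the recursive walk favors clarity over efficiency on large graphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactDebt:
    source: str      # artifact whose change would have propagated
    suppressed: str  # artifact whose recomputation was skipped


def plan_recompute(graph: dict[str, set[str]], changed: str, depth: int = 1):
    """Return (artifacts to recompute, debts recorded for suppressed cycles)."""
    scheduled: list[str] = []
    debts: list[ImpactDebt] = []

    def visit(node: str, path: frozenset, remaining: int) -> None:
        if remaining == 0:
            return
        for dep in sorted(graph.get(node, ())):
            if dep in path:
                # Edge closes a cycle: suppress it and keep it visible as debt.
                debt = ImpactDebt(source=node, suppressed=dep)
                if debt not in debts:
                    debts.append(debt)
                continue
            if dep not in scheduled:
                scheduled.append(dep)
            visit(dep, path | {dep}, remaining - 1)

    visit(changed, frozenset({changed}), depth)
    return scheduled, debts
```

Tracking the path rather than a global visited set keeps diamond-shaped graphs (two routes to the same dependent) from being misreported as cycles.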

6. Change Impact Default Method

Decision: Structural diff ratio
Rationale: Per FR-8.2. Fast, deterministic, no external dependencies. Embedding and LLM methods are optional enhancements.
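A dependency-free way to approximate such a ratio is Python's standard difflib. The sketch below compares two artifact versions line by line and reports the changed fraction; treating lines as the structural unit is an assumption made for illustration, and the function name is hypothetical.

```python
import difflib


def structural_diff_ratio(old: str, new: str) -> float:
    """Change magnitude in [0, 1]: 0.0 means identical, 1.0 means no shared lines."""
    matcher = difflib.SequenceMatcher(
        a=old.splitlines(), b=new.splitlines(), autojunk=False
    )
    # SequenceMatcher.ratio() is a similarity score, so invert it.
    return 1.0 - matcher.ratio()
```

For two four-line documents that differ in one line, three of the eight total lines match on each side, giving a similarity of 0.75 and a change ratio of 0.25. A threshold on this value could decide whether a change is significant enough to propagate.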

7. InputBundleHash Components

Decision: Template digest + dependency digests + config + model settings
Rationale: Per FR-4.3. Captures all factors affecting prompt output. Ensures idempotent execution.
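The four components can be combined into a single digest by hashing a canonical serialization. The sketch below uses SHA-256 over sorted-key JSON; the choice of hash, the payload field names, and the function name are assumptions for illustration, not the FRS-defined format.

```python
import hashlib
import json


def input_bundle_hash(
    template_digest: str,
    dependency_digests: dict[str, str],
    config: dict,
    model_settings: dict,
) -> str:
    """Hash every input that can affect the prompt output."""
    payload = {
        "template": template_digest,
        "dependencies": dependency_digests,
        "config": config,
        "model": model_settings,
    }
    # sort_keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

An execution engine could compare this value against the hash stored with the previous run and skip execution when they match, which is the idempotency behavior the decision targets.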

8. RunManifest Storage Format

Decision: JSON columns in SQLite
Rationale: Flexible schema for manifest evolution. Queryable via SQLite JSON functions. Easy to export for analysis.
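The sketch below shows the general idea with Python's sqlite3 module: the manifest is stored as JSON text and individual fields are read back with json_extract. The run_manifests table, its columns, and the manifest keys are hypothetical, and the example assumes the SQLite build includes the JSON functions, which is the case for most current builds.

```python
import json
import sqlite3


def open_manifest_store() -> sqlite3.Connection:
    # Hypothetical table layout; the real one is defined in the Phase 4 migration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_manifests (run_id TEXT PRIMARY KEY, manifest TEXT NOT NULL)"
    )
    return conn


def save_manifest(conn: sqlite3.Connection, run_id: str, manifest: dict) -> None:
    conn.execute(
        "INSERT INTO run_manifests (run_id, manifest) VALUES (?, ?)",
        (run_id, json.dumps(manifest, sort_keys=True)),
    )


def manifest_field(conn: sqlite3.Connection, run_id: str, json_path: str):
    """Read one field from a stored manifest using a SQLite JSON path."""
    row = conn.execute(
        "SELECT json_extract(manifest, ?) FROM run_manifests WHERE run_id = ?",
        (json_path, run_id),
    ).fetchone()
    return None if row is None else row[0]


conn = open_manifest_store()
save_manifest(conn, "run-1", {"input_bundle_hash": "abc123", "dependencies": ["spec", "faq"]})
```

New manifest keys can be added later without a schema migration, since the column holds the whole document.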

Risk Management

Risk 1: LLM Integration Complexity

Impact: High | Probability: Medium
Mitigation:

Start with mock LLM adapter for testing
Well-defined adapter interface
Implement one production adapter (e.g., OpenAI) in Phase 4
Additional adapters can be added incrementally

Risk 2: Performance at Scale

Impact: Medium | Probability: Medium
Mitigation:

Index all foreign keys and frequently queried columns
Use in-memory graph for dependency traversal
Implement pagination for large result sets
Performance testing with 10K+ artifacts in Phase 6

Risk 3: Circular Dependency Complexity

Impact: Medium | Probability: Low
Mitigation:

Thorough cycle detection testing
Clear documentation on when cycles occur
ImpactDebt provides visibility
Users can manually break cycles if needed

Risk 4: Quality Gate Extensibility

Impact: Low | Probability: Low
Mitigation:

Plugin-based architecture for gates
Well-defined QualityGate interface
Ship with schema and pattern gates
Document custom gate creation
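As a sketch of what a plugin-style gate interface might look like, the example below defines a QualityGate protocol, a simple pattern gate, and a registry. Only the QualityGate name and the idea of a pattern gate come from this workplan; GateResult, the check method, and the registry functions are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool
    message: str = ""


class QualityGate(Protocol):
    """Interface a custom gate would implement (method name is an assumption)."""
    name: str

    def check(self, content: str) -> GateResult: ...


class PatternGate:
    """Passes when the regex is found anywhere in the content."""

    def __init__(self, name: str, pattern: str) -> None:
        self.name = name
        self._regex = re.compile(pattern, re.MULTILINE)

    def check(self, content: str) -> GateResult:
        ok = self._regex.search(content) is not None
        message = "" if ok else f"pattern {self._regex.pattern!r} not found"
        return GateResult(self.name, ok, message)


GATES: dict[str, QualityGate] = {}


def register_gate(gate: QualityGate) -> None:
    GATES[gate.name] = gate


def run_gates(content: str) -> list[GateResult]:
    return [gate.check(content) for gate in GATES.values()]


register_gate(PatternGate("has_heading", r"^# "))
```

A custom gate only needs a name attribute and a check method, so third-party gates can be registered without changes to the validator.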

Risk 5: Integration with Existing InformationSpace

Impact: High | Probability: Low
Mitigation:

Build on top of existing space infrastructure
Reuse space repositories and services
Comprehensive integration tests
Incremental rollout per phase

Future Enhancements (Out of Scope)

The following capabilities are valuable but explicitly out of scope for initial implementation:

Distributed Execution: Execute prompt runs across multiple workers
Real-time Collaboration: Multiple users editing templates simultaneously
Version Control Integration: Git-based template versioning
Advanced Visualization: Interactive dependency graph UI
Cost Tracking: Track LLM API costs per run
Prompt Optimization: Automatic prompt refinement based on results
Multi-modal Artifacts: Support for images, audio in artifacts
External Data Sources: Pull artifacts from APIs, databases
Scheduling: Cron-based automatic recomputation
A/B Testing: Compare multiple template variations

Implementation Status

Status: Planning Complete
Next Step: Begin Phase 1 implementation
Target Start: TBD
Estimated Completion: 26 weeks from start

Conclusion

This workplan provides a comprehensive roadmap for implementing Prompt Dependency Resolution infrastructure in MarkiTect. The 8-phase approach ensures:

Incremental Delivery: Each phase delivers working functionality
Risk Mitigation: Complex features built on solid foundations
Testability: Comprehensive test coverage at every phase
Extensibility: Clean architecture supports future enhancements
Compliance: Full coverage of all FR-1 through FR-11 requirements

The implementation will transform MarkiTect into an executable knowledge infrastructure, enabling deterministic, traceable, and incremental execution of prompt-based content generation across InformationSpaces.

Status: Ready for Implementation 🚀
