Create detailed 26-week workplan for Prompt Dependency Resolution system implementing all 11 functional requirements across 8 phases:
- Phase 1-2: Foundation (artifacts, templates, macros)
- Phase 3-4: Resolution and execution engine with idempotent runs
- Phase 5-6: Dependency tracking and incremental recomputation
- Phase 7-8: Quality validation and observability/traceability

Includes database schemas, verification strategies, risk management, and complete file structure for ~60 new modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prompt Dependency Resolution - Implementation Workplan
Overview
This workplan details the implementation phases for building the Prompt Dependency Resolution infrastructure within MarkiTect. This system enables structured execution of PromptTemplates with deterministic dependency resolution, incremental recomputation, and quality validation across InformationSpaces.
The system transforms MarkiTect from a static markdown tool into an executable knowledge infrastructure that supports:
- Template-driven content generation with LLMs
- Automatic dependency tracking and resolution
- Idempotent execution with content-based caching
- Incremental recomputation with change impact analysis
- Quality gate validation with halting policies
Functional Requirements Mapping
The implementation is organized into 8 phases covering all 11 functional requirements from the FRS:
| FR ID | Requirement | Implementation Phase |
|---|---|---|
| FR-1 | InformationSpace Addressability | Phase 1: Foundation |
| FR-2 | PromptTemplate Definition | Phase 2: Templates & Macros |
| FR-3 | PromptResolver Behavior | Phase 3: Resolver Engine |
| FR-4 | PromptRun Lifecycle | Phase 4: Execution Engine |
| FR-5 | RunManifest Persistence | Phase 4: Execution Engine |
| FR-6 | Dependency Graph Construction | Phase 5: Dependency Tracking |
| FR-7 | Incremental Recompute | Phase 6: Incremental Execution |
| FR-8 | Change Impact Assessment | Phase 6: Incremental Execution |
| FR-9 | QualityGate Validation | Phase 7: Quality & Validation |
| FR-10 | Halting and Refinement Policy | Phase 7: Quality & Validation |
| FR-11 | Traceability and Auditability | Phase 8: Observability |
Phase 1: Foundation - Addressable Artifacts (FR-1)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-101 | Artifact Identity | Persistent identifiers for content artifacts | Critical |
| CAP-102 | Content Digest | SHA-256 content hashing for change detection | Critical |
| CAP-103 | Artifact Registry | Lookup artifacts by name or ID within spaces | Critical |
| CAP-104 | Cross-Space References | Reference artifacts across space boundaries | High |
| CAP-105 | Artifact Metadata | Store artifact metadata (type, created, modified) | High |
Implementation Tasks
Week 1: Core Models
- Create `markitect/prompts/models.py`
  - `Artifact` dataclass with id, name, space_id, content_digest, metadata
  - `ArtifactReference` dataclass for cross-space addressing
  - Content digest calculation utilities (SHA-256)
- Create `markitect/prompts/repositories/interfaces.py`
  - `IArtifactRepository` interface
- Unit tests for artifact models and digest calculation
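The digest utilities and the `Artifact` model from Week 1 could be sketched as follows. This is a minimal illustration, not the planned implementation: the field types, the `compute_digest` helper, and the `has_changed` function are assumptions introduced here; only the field names and the use of SHA-256 come from the task list.

```python
import hashlib
from dataclasses import dataclass, field


def compute_digest(content: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Artifact:
    # Field names follow the Week 1 task list; the types are assumptions.
    id: str
    name: str
    space_id: str
    content_digest: str
    metadata: dict = field(default_factory=dict)


def has_changed(artifact: Artifact, current_content: str) -> bool:
    """Detect a modification by comparing the stored digest with a fresh one."""
    return artifact.content_digest != compute_digest(current_content)


glossary = Artifact("a1", "glossary", "my-project", compute_digest("v1"))
print(has_changed(glossary, "v1"), has_changed(glossary, "v2"))
```

Storing only the digest keeps change detection cheap: an artifact is re-read and re-hashed, but no previous copy of its content needs to be retained.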
Week 2: Repository Implementation
- Create `markitect/prompts/repositories/sqlite.py`
  - `SQLiteArtifactRepository` implementing `IArtifactRepository`
  - CRUD operations with content digest tracking
  - Cross-space artifact lookup
- Database migration scripts
- Repository unit tests
Week 3: Artifact Service
- Create `markitect/prompts/services/artifact_service.py`
  - Register artifacts with automatic digest calculation
  - Query artifacts by name, ID, or digest
  - Track artifact modifications with digest updates
- Integration tests with existing InformationSpace service
Database Schema
CREATE TABLE prompt_artifacts (
id TEXT PRIMARY KEY,
space_id TEXT NOT NULL REFERENCES spaces(id),
name TEXT NOT NULL,
artifact_type TEXT NOT NULL,
content_digest TEXT NOT NULL,
content_size INTEGER,
metadata JSON,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(space_id, name)
);
CREATE INDEX idx_artifacts_digest ON prompt_artifacts(content_digest);
CREATE INDEX idx_artifacts_space ON prompt_artifacts(space_id);
Verification
pytest tests/unit/prompts/test_artifact_models.py
pytest tests/unit/prompts/test_artifact_repository.py
pytest tests/integration/prompts/test_artifact_service.py
Phase 2: Templates & Macros (FR-2)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-201 | PromptTemplate Model | Template definition with content and metadata | Critical |
| CAP-202 | ContentMacro Detection | Parse and extract macros from template content | Critical |
| CAP-203 | Macro Types | Support Required, Optional, Generate macro kinds | Critical |
| CAP-204 | Template Analysis | Analyze templates to extract macro dependencies | High |
| CAP-205 | Template Validation | Validate template syntax and macro references | Medium |
Implementation Tasks
Week 4: Template Models
- Create `markitect/prompts/templates/models.py`
  - `PromptTemplate` dataclass extending `Artifact`
  - `ContentMacro` dataclass with kind, target, parameters
  - `MacroKind` enum: REQUIRED, OPTIONAL, GENERATE
  - `TemplateMetadata` for template-specific metadata
- Unit tests for template models
Week 5: Macro Parser
- Create `markitect/prompts/templates/parser.py`
  - Regex-based macro extraction from markdown content
  - Support macro syntax: `{{require:artifact}}`, `{{optional:artifact}}`, `{{generate:template}}`
  - Parameter parsing for macro arguments
- Create `markitect/prompts/templates/analyzer.py`
  - `TemplateAnalyzer` class for dependency extraction
  - Identify all macros and their types
  - Build initial dependency list
- Parser and analyzer unit tests
Week 6: Template Service
- Create `markitect/prompts/services/template_service.py`
  - Register templates with automatic analysis
  - Query templates by ID or name
  - Retrieve template with analyzed macro list
- Integration tests
Template Syntax
# Example PromptTemplate
## Context
{{require:project-overview}}
{{optional:technical-constraints}}
## Task Description
Generate a technical design for {{require:feature-name}}.
## Previous Designs
{{generate:related-designs-collector}}
Macro Format
{{<kind>:<target>[|<param1>=<value1>|<param2>=<value2>...]}}
Examples:
{{require:glossary/authentication}}
{{optional:standards/api-design}}
{{generate:code-examples|language=python|framework=fastapi}}
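A regex-based parser for the macro format above might look like this. It is a sketch under assumptions: the `MACRO_RE` pattern, the `ContentMacro` field types, and the `parse_macros` name are illustrative, and the pattern assumes targets and parameter values contain no `|` or `}` characters.

```python
import re
from dataclasses import dataclass, field

# {{<kind>:<target>[|<param>=<value>...]}} with kind in require/optional/generate.
MACRO_RE = re.compile(
    r"\{\{(require|optional|generate):([^|}]+)((?:\|[^|}=]+=[^|}]*)*)\}\}"
)


@dataclass
class ContentMacro:
    kind: str
    target: str
    parameters: dict = field(default_factory=dict)


def parse_macros(content: str) -> list[ContentMacro]:
    """Extract every macro from template content, in document order."""
    macros = []
    for match in MACRO_RE.finditer(content):
        kind, target, raw_params = match.groups()
        params = {}
        # raw_params looks like "|language=python|framework=fastapi" or "".
        for pair in filter(None, raw_params.split("|")):
            key, _, value = pair.partition("=")
            params[key.strip()] = value.strip()
        macros.append(ContentMacro(kind, target.strip(), params))
    return macros


for macro in parse_macros("{{generate:code-examples|language=python|framework=fastapi}}"):
    print(macro.kind, macro.target, macro.parameters)
```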
Verification
pytest tests/unit/prompts/test_template_models.py
pytest tests/unit/prompts/test_macro_parser.py
pytest tests/unit/prompts/test_template_analyzer.py
pytest tests/integration/prompts/test_template_service.py
Phase 3: Resolver Engine (FR-3)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-301 | Resolution Strategy | Deterministic multi-space resolution order | Critical |
| CAP-302 | Required Macro Resolution | Fail on missing required artifacts | Critical |
| CAP-303 | Optional Macro Resolution | Graceful fallback for missing optional artifacts | Critical |
| CAP-304 | Generate Macro Detection | Identify generator templates for nested execution | High |
| CAP-305 | Resolution Context | Track resolution state and errors | High |
Implementation Tasks
Week 7: Resolver Core
- Create `markitect/prompts/resolver/models.py`
  - `ResolutionContext` with resolution order, resolved artifacts, errors
  - `ResolutionResult` with success status, resolved content, unresolved macros
  - `ResolutionError` for missing required artifacts
- Create `markitect/prompts/resolver/strategy.py`
  - `ResolutionStrategy` base class
  - `MultiSpaceResolutionStrategy` implementing FR-3.1 order:
    - Local InformationSpace
    - Explicitly included InformationSpaces
    - Default InformationSpace
    - Team/Shared InformationSpace (if configured)
- Unit tests for resolution strategy
Week 8: PromptResolver Implementation
- Create `markitect/prompts/resolver/resolver.py`
  - `PromptResolver` class
  - `resolve_template(template, context) -> ResolutionResult`
  - Handle Required macros: fail if not found (FR-3.2)
  - Handle Optional macros: resolve to empty (FR-3.3)
  - Detect Generate macros for deferred resolution (FR-3.4)
  - Track resolution errors and warnings
- Resolver unit tests
Week 9: Context Compilation
- Create `markitect/prompts/resolver/compiler.py`
  - `ContextCompiler` class
  - Compile resolved artifacts into single prompt context
  - Substitute macros with resolved content
  - Generate `CompiledPrompt` with full context
- Integration tests for full resolution flow
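The macro substitution step of the Week 9 compiler could be sketched as below. The `compile_context` function and its `resolved` mapping are hypothetical; the sketch assumes Generate macros are left in place for deferred handling and that a missing Required target is an error, as described for the resolver.

```python
import re

# Only require/optional macros are substituted here; generate macros are deferred.
SUBSTITUTABLE_RE = re.compile(r"\{\{(require|optional):([^|}]+)(?:\|[^}]*)?\}\}")


def compile_context(template: str, resolved: dict[str, str]) -> str:
    """Replace require/optional macros with resolved artifact content."""

    def substitute(match: re.Match) -> str:
        kind, target = match.group(1), match.group(2).strip()
        if target in resolved:
            return resolved[target]
        if kind == "require":
            raise KeyError(f"unresolved required macro: {target}")
        return ""  # a missing optional artifact compiles to empty content

    return SUBSTITUTABLE_RE.sub(substitute, template)


print(compile_context("Context: {{require:glossary}}", {"glossary": "API = ..."}))
```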
Resolution Order Example
# Given template in space "my-project" referencing {{require:glossary}}
# Resolution search order:
1. my-project/glossary
2. <included-space-1>/glossary
3. <included-space-2>/glossary
4. default-space/glossary
5. shared-space/glossary # if configured
# If not found: ResolutionError(MacroKind.REQUIRED, "glossary")
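The search order above can be expressed as a short lookup loop. This is a sketch only: `search_order`, `resolve`, and the in-memory `registry` keyed by `(space_id, name)` are assumptions standing in for the planned `MultiSpaceResolutionStrategy` and artifact repository.

```python
class ResolutionError(Exception):
    """Raised when a required artifact is missing from every searched space."""


def search_order(local, included, default, shared=None):
    """Build the deterministic space order: local, included, default, shared."""
    order = [local, *included, default]
    if shared is not None:
        order.append(shared)
    return order


def resolve(name, order, registry, required=True):
    """Return content from the first space in `order` that holds `name`."""
    for space in order:
        if (space, name) in registry:
            return registry[(space, name)]
    if required:
        raise ResolutionError(f"required artifact not found: {name}")
    return ""  # optional macros resolve to empty content


order = search_order("my-project", ["inc-1", "inc-2"], "default-space", "shared-space")
registry = {("default-space", "glossary"): "default glossary"}
print(resolve("glossary", order, registry))
```

Because the order is a plain list built once per run, two resolutions with the same configuration always search spaces in the same sequence.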
Database Schema Additions
CREATE TABLE prompt_resolution_config (
space_id TEXT PRIMARY KEY REFERENCES spaces(id),
included_spaces JSON, -- Array of space IDs to search
default_space_id TEXT REFERENCES spaces(id),
shared_space_id TEXT REFERENCES spaces(id),
max_generation_depth INTEGER DEFAULT 3,
config JSON,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Verification
pytest tests/unit/prompts/test_resolution_strategy.py
pytest tests/unit/prompts/test_prompt_resolver.py
pytest tests/unit/prompts/test_context_compiler.py
pytest tests/integration/prompts/test_resolution_flow.py
Phase 4: Execution Engine (FR-4, FR-5)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-401 | PromptRun Lifecycle | Three-stage execution: Analysis, Compilation, Processing | Critical |
| CAP-402 | InputBundleHash | Content-based execution fingerprinting | Critical |
| CAP-403 | Idempotent Execution | Skip re-execution for identical input bundles | Critical |
| CAP-404 | LLM Integration | Execute compiled prompts via LLM provider | Critical |
| CAP-405 | RunManifest Persistence | Store complete execution provenance | Critical |
| CAP-406 | Nested Generator Runs | Execute generate macros recursively | High |
Implementation Tasks
Week 10: Execution Models
- Create `markitect/prompts/execution/models.py`
  - `PromptRun` dataclass with id, template_id, input_bundle_hash, status
  - `ExecutionStage` enum: ANALYSIS, COMPILATION, PROCESSING, COMPLETE, FAILED
  - `RunConfig` with model settings, depth limits, options
  - `InputBundle` with template digest, dependency digests, config hash
  - `InputBundleHash` calculation (SHA-256 of sorted input bundle)
- Create `markitect/prompts/execution/manifest.py`
  - `RunManifest` comprehensive execution record
  - Template metadata, resolved inputs, compiled prompt digest
  - Model configuration, output artifacts, validation results
  - Dependency edges, timing metadata
- Unit tests for execution models
Week 11: Execution Engine
- Create `markitect/prompts/execution/engine.py`
  - `PromptExecutionEngine` class
  - `execute(template, config) -> PromptRun`
  - Stage 1: Template analysis (use TemplateAnalyzer)
  - Stage 2: Context compilation (use ContextCompiler)
  - Stage 3: Prompt processing (LLM invocation)
  - Calculate InputBundleHash before execution
  - Check for existing run with same hash (FR-4.4)
  - Store RunManifest on completion
- Engine unit tests
Week 12: LLM Integration Layer
- Create `markitect/prompts/execution/llm_adapter.py`
  - `LLMAdapter` abstract base class
  - `execute_prompt(compiled_prompt, config) -> LLMResponse`
  - Mock implementation for testing
  - OpenAI/Anthropic adapter stubs (to be implemented)
- Create `markitect/prompts/execution/generator.py`
  - `GeneratorExecutor` for nested generate macro execution
  - Enforce max depth limit (FR-3.5)
  - Track parent-child run relationships
  - Link generator runs in RunManifest (FR-5.3)
- Integration tests for full execution flow
Database Schema Additions
CREATE TABLE prompt_runs (
id TEXT PRIMARY KEY,
template_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
input_bundle_hash TEXT NOT NULL,
status TEXT NOT NULL,
stage TEXT NOT NULL,
parent_run_id TEXT REFERENCES prompt_runs(id),
depth INTEGER DEFAULT 0,
started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
completed_at TIMESTAMP,
error_message TEXT,
UNIQUE(input_bundle_hash) -- Idempotency constraint
);
CREATE TABLE run_manifests (
run_id TEXT PRIMARY KEY REFERENCES prompt_runs(id),
template_metadata JSON NOT NULL,
resolved_inputs JSON NOT NULL,
compiled_prompt_digest TEXT NOT NULL,
model_config JSON NOT NULL,
output_artifacts JSON,
dependency_edges JSON,
validation_results JSON,
impact_debt JSON,
timing_metadata JSON,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_runs_template ON prompt_runs(template_id);
CREATE INDEX idx_runs_bundle_hash ON prompt_runs(input_bundle_hash);
CREATE INDEX idx_runs_parent ON prompt_runs(parent_run_id);
InputBundleHash Calculation
import hashlib
import json
from typing import Dict

def calculate_input_bundle_hash(
    template_digest: str,
    dependency_digests: Dict[str, str],  # {artifact_name: digest}
    config_hash: str,
    model_settings: Dict
) -> str:
    """
    Deterministic hash of complete input context.
    Components (sorted for determinism):
    1. Template content digest
    2. Sorted dependency digests by name
    3. Resolution configuration hash
    4. Model settings (name, temperature, etc.)
    5. Compilation options
    """
    # Note: compilation options (item 5) are not yet passed to this function.
    bundle = {
        'template': template_digest,
        'dependencies': sorted(dependency_digests.items()),
        'config': config_hash,
        'model': sorted(model_settings.items())
    }
    # sort_keys plus pre-sorted item lists keep the serialization stable.
    return hashlib.sha256(
        json.dumps(bundle, sort_keys=True).encode()
    ).hexdigest()
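The idempotency check that consumes this hash (FR-4.4) could be sketched as follows. The `execute_idempotent` function and the in-memory `completed` dictionary are hypothetical stand-ins for the engine and the `prompt_runs` table; the point is only that a known hash short-circuits the LLM call.

```python
from typing import Callable


def execute_idempotent(
    bundle_hash: str,
    run: Callable[[], str],
    completed: dict[str, str],
) -> tuple[str, bool]:
    """Return (output, was_cached); `run` is invoked only for unseen hashes."""
    if bundle_hash in completed:
        return completed[bundle_hash], True
    output = run()
    completed[bundle_hash] = output
    return output, False


store: dict[str, str] = {}
print(execute_idempotent("abc123", lambda: "generated text", store))
print(execute_idempotent("abc123", lambda: "generated text", store))
```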
Verification
pytest tests/unit/prompts/test_execution_models.py
pytest tests/unit/prompts/test_execution_engine.py
pytest tests/unit/prompts/test_llm_adapter.py
pytest tests/unit/prompts/test_generator_executor.py
pytest tests/integration/prompts/test_prompt_execution.py
pytest tests/integration/prompts/test_idempotent_execution.py
Phase 5: Dependency Tracking (FR-6)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-501 | Dependency Edge Recording | Track input → output relationships | Critical |
| CAP-502 | Dependency Graph Construction | Build queryable dependency graph | Critical |
| CAP-503 | Circular Dependency Detection | Identify cycles in dependency chains | High |
| CAP-504 | Dependency Query | Find dependents of any artifact | High |
| CAP-505 | Cross-Space Dependencies | Track dependencies across spaces | Medium |
Implementation Tasks
Week 13: Dependency Models
- Create `markitect/prompts/dependencies/models.py`
  - `DependencyEdge` with source_id, target_id, run_id, edge_type
  - `EdgeType` enum: REQUIRES, GENERATES, INCLUDES
  - `DependencyGraph` class for graph operations
  - `CircularDependencyError` exception
- Unit tests for dependency models
Week 14: Graph Builder
- Create `markitect/prompts/dependencies/graph.py`
  - `GraphBuilder` class
  - Extract dependencies from RunManifest
  - Add edges: artifact → run (input), run → artifact (output)
  - Build adjacency list representation
  - Cycle detection using DFS
- Create `markitect/prompts/dependencies/repository.py`
  - `SQLiteDependencyRepository`
  - Store and query dependency edges
  - Efficient dependent lookup queries
- Graph builder and repository tests
Week 15: Query Operations
- Create `markitect/prompts/dependencies/queries.py`
  - `find_dependents(artifact_id, depth=1) -> List[Artifact]`
  - `find_dependencies(artifact_id) -> List[Artifact]`
  - `get_dependency_chain(source_id, target_id) -> List[Edge]`
  - `detect_circular_dependencies(artifact_id) -> List[Cycle]`
- Integration tests for dependency queries
Database Schema Additions
CREATE TABLE prompt_dependencies (
id TEXT PRIMARY KEY,
source_artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
target_artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
run_id TEXT NOT NULL REFERENCES prompt_runs(id),
edge_type TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(source_artifact_id, target_artifact_id, run_id)
);
CREATE INDEX idx_deps_source ON prompt_dependencies(source_artifact_id);
CREATE INDEX idx_deps_target ON prompt_dependencies(target_artifact_id);
CREATE INDEX idx_deps_run ON prompt_dependencies(run_id);
Dependency Graph Example
Template: design-doc (ID: t1)
→ requires: glossary (ID: a1)
→ requires: requirements (ID: a2)
→ generates: api-spec (ID: a3)
PromptRun: r1
Edges:
a1 → r1 (REQUIRES)
a2 → r1 (REQUIRES)
r1 → a3 (GENERATES)
When a1 (glossary) changes:
Dependents(a1, depth=1) = [r1]
Affected outputs = [a3] (need recomputation)
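The example above maps onto an adjacency-list graph with a depth-limited dependent lookup and DFS cycle detection. The class below is an illustrative sketch, not the planned `DependencyGraph`: it stores plain string IDs, treats artifacts and runs as nodes of the same graph, and its method names are assumptions.

```python
from collections import defaultdict, deque


class DependencyGraph:
    def __init__(self):
        self._out = defaultdict(set)  # node -> set of direct dependents

    def add_edge(self, source, target):
        self._out[source].add(target)

    def find_dependents(self, node, depth=1):
        """Nodes reachable from `node` in at most `depth` hops (breadth-first)."""
        seen, frontier = set(), deque([(node, 0)])
        while frontier:
            current, dist = frontier.popleft()
            if dist == depth:
                continue
            for nxt in self._out.get(current, ()):
                if nxt not in seen and nxt != node:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
        return seen

    def has_cycle(self):
        """Three-color DFS: reaching a GRAY node again means a back edge."""
        WHITE, GRAY, BLACK = 0, 1, 2
        color = defaultdict(int)

        def visit(n):
            color[n] = GRAY
            for m in self._out.get(n, ()):
                if color[m] == GRAY or (color[m] == WHITE and visit(m)):
                    return True
            color[n] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(self._out))


graph = DependencyGraph()
for source, target in [("a1", "r1"), ("a2", "r1"), ("r1", "a3")]:
    graph.add_edge(source, target)
print(graph.find_dependents("a1", depth=1), graph.has_cycle())
```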
Verification
pytest tests/unit/prompts/test_dependency_models.py
pytest tests/unit/prompts/test_graph_builder.py
pytest tests/unit/prompts/test_dependency_repository.py
pytest tests/unit/prompts/test_dependency_queries.py
pytest tests/integration/prompts/test_dependency_graph.py
pytest tests/integration/prompts/test_circular_detection.py
Phase 6: Incremental Execution (FR-7, FR-8)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-601 | Change Detection | Detect artifact modifications via digest comparison | Critical |
| CAP-602 | Incremental Recompute | Recompute direct dependents on change | Critical |
| CAP-603 | Depth Control | Configurable recomputation depth (default=1) | High |
| CAP-604 | Circular Suppression | Suppress recompute to prevent cycles | High |
| CAP-605 | Change Impact Analysis | Calculate change magnitude metrics | High |
| CAP-606 | Impact Debt Tracking | Record suppressed recomputations | Medium |
Implementation Tasks
Week 16: Change Detection
- Create `markitect/prompts/incremental/models.py`
  - `ArtifactChange` with old_digest, new_digest, change_type
  - `ChangeType` enum: CREATED, MODIFIED, DELETED
  - `ImpactDebt` for suppressed recomputations
  - `RecomputeConfig` with depth, circular handling, budget limits
- Create `markitect/prompts/incremental/detector.py`
  - `ChangeDetector` class
  - Compare current digest with stored digest
  - Identify change type and magnitude
- Unit tests for change detection
Week 17: Impact Analysis
- Create `markitect/prompts/incremental/impact.py`
  - `ImpactAnalyzer` class
  - Calculate change magnitude (FR-8.2):
    - Structural diff ratio (default)
    - Content diff ratio (character-level)
    - Optional: embedding distance
    - Optional: LLM-assessed impact
  - Generate impact score (0.0 to 1.0)
- Create `markitect/prompts/incremental/metrics.py`
  - Diff calculation utilities
  - Similarity scoring algorithms
- Impact analyzer tests
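The two default change-magnitude metrics could be approximated with the standard library's `difflib`. This is a sketch under assumptions: the function names are hypothetical, and the line-level comparison is only a rough stand-in for whatever "structural diff" ends up meaning in FR-8.2.

```python
import difflib


def content_diff_ratio(old: str, new: str) -> float:
    """Character-level change magnitude in [0.0, 1.0]; 0.0 means identical."""
    return 1.0 - difflib.SequenceMatcher(None, old, new).ratio()


def structural_diff_ratio(old: str, new: str) -> float:
    """Line-level change magnitude, a rough proxy for a structural diff."""
    matcher = difflib.SequenceMatcher(None, old.splitlines(), new.splitlines())
    return 1.0 - matcher.ratio()


before = "# Glossary\nAPI: interface\nSDK: toolkit"
after = "# Glossary\nAPI: application programming interface\nSDK: toolkit"
print(content_diff_ratio(before, after), structural_diff_ratio(before, after))
```

Both functions return a score on the same 0.0 to 1.0 scale as the planned impact score, so either can be compared directly against a recompute threshold.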
Week 18: Incremental Recompute Engine
- Create `markitect/prompts/incremental/engine.py`
  - `IncrementalExecutionEngine` class
  - `recompute_dependents(artifact_id, config) -> RecomputeResult`
  - Find direct dependents via dependency graph (depth=1 default)
  - Check for circular dependencies
  - Execute prompt runs for affected dependents
  - Track suppressed recomputations as ImpactDebt
  - Record impact assessments in RunManifest (FR-8.3)
- Integration tests for incremental execution
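The decision loop of the Week 18 engine could be sketched as below. Everything here is illustrative: `impact_threshold` and `budget` are hypothetical `RecomputeConfig` fields, the suppression reasons are invented labels, and the callables stand in for the impact analyzer, cycle check, and execution engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class RecomputeConfig:
    depth: int = 1
    impact_threshold: float = 0.1  # hypothetical: minimum impact to recompute
    budget: int = 10               # hypothetical: maximum runs per recompute


@dataclass
class RecomputeResult:
    executed: list = field(default_factory=list)
    suppressed: list = field(default_factory=list)  # (run_id, reason) pairs


def recompute(
    dependents: Iterable[str],
    impact: Callable[[str], float],
    creates_cycle: Callable[[str], bool],
    execute: Callable[[str], None],
    config: RecomputeConfig,
) -> RecomputeResult:
    """Re-execute dependents that pass every check; record the rest as debt."""
    result = RecomputeResult()
    for run_id in dependents:
        if creates_cycle(run_id):
            result.suppressed.append((run_id, "cycle"))
        elif impact(run_id) <= config.impact_threshold:
            result.suppressed.append((run_id, "below_threshold"))
        elif len(result.executed) >= config.budget:
            result.suppressed.append((run_id, "budget_exhausted"))
        else:
            execute(run_id)
            result.executed.append(run_id)
    return result


outcome = recompute(["r1"], lambda r: 0.8, lambda r: False, print, RecomputeConfig())
print(outcome.executed, outcome.suppressed)
```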
Recomputation Logic
def recompute_dependents(artifact_id: str, config: RecomputeConfig):
    """
    FR-7: Incremental recompute with depth control.
    1. Detect change in artifact
    2. Find dependents up to specified depth (default=1)
    3. For each dependent:
       - Check if recompute would create cycle → suppress if yes
       - Calculate change impact
       - If impact > threshold and budget available:
         - Recompute (re-execute PromptRun)
       - Else:
         - Record as ImpactDebt in RunManifest
    4. Return RecomputeResult with executed/suppressed counts
    """
Database Schema Additions
CREATE TABLE artifact_changes (
id TEXT PRIMARY KEY,
artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
old_digest TEXT,
new_digest TEXT NOT NULL,
change_type TEXT NOT NULL,
detected_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE impact_debt (
id TEXT PRIMARY KEY,
artifact_id TEXT NOT NULL REFERENCES prompt_artifacts(id),
dependent_run_id TEXT NOT NULL REFERENCES prompt_runs(id),
change_magnitude REAL NOT NULL,
suppression_reason TEXT NOT NULL,
recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_changes_artifact ON artifact_changes(artifact_id);
CREATE INDEX idx_debt_artifact ON impact_debt(artifact_id);
CREATE INDEX idx_debt_run ON impact_debt(dependent_run_id);
Verification
pytest tests/unit/prompts/test_change_detector.py
pytest tests/unit/prompts/test_impact_analyzer.py
pytest tests/unit/prompts/test_incremental_engine.py
pytest tests/integration/prompts/test_incremental_recompute.py
pytest tests/integration/prompts/test_circular_suppression.py
pytest tests/integration/prompts/test_impact_debt.py
Phase 7: Quality & Validation (FR-9, FR-10)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-701 | Schema Validation | Validate generated artifacts against JSON schemas | High |
| CAP-702 | QualityGate Framework | Pluggable validation framework | High |
| CAP-703 | Validation Results | Record pass/fail with diagnostics | High |
| CAP-704 | Halting Policy | Configurable execution halting rules | Medium |
| CAP-705 | Refinement Loop | Iterative improvement with quality checks | Medium |
Implementation Tasks
Week 19: QualityGate Framework
- Create `markitect/prompts/quality/models.py`
  - `QualityGate` abstract base class
  - `ValidationResult` with status, diagnostics, score
  - `QualityPolicy` with halting rules
  - `GateType` enum: SCHEMA, PATTERN, CUSTOM
- Create `markitect/prompts/quality/gates/schema_gate.py`
  - `SchemaValidationGate` using existing schema validator
  - Validate generated artifacts against JSON schemas
- Create `markitect/prompts/quality/gates/pattern_gate.py`
  - `PatternValidationGate` for regex-based checks
- Unit tests for quality gates
Week 20: Validation Integration
- Create `markitect/prompts/quality/validator.py`
  - `QualityValidator` class
  - Apply multiple gates to generated artifacts
  - Aggregate validation results
  - Record results in RunManifest (FR-9.3)
- Integrate with execution engine
  - Run quality gates after prompt processing
  - Store validation results in RunManifest
- Integration tests
Week 21: Halting Policy Engine
- Create `markitect/prompts/quality/policy.py`
  - `HaltingPolicyEngine` class
  - Evaluate halting conditions (FR-10.2):
    - QualityGate failures
    - Marginal improvement below threshold
    - Iteration limit reached
    - Resource budget exhausted
  - Record halting decisions in RunManifest (FR-10.3)
- Create `markitect/prompts/quality/refinement.py`
  - `RefinementLoop` for iterative improvement
  - Execute → Validate → Halt or Refine
- Policy engine and refinement loop tests
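The four halting conditions from FR-10.2 could be evaluated by a single function like the one below. This is a sketch, not the planned `HaltingPolicyEngine`: the `should_halt` signature, the returned reason strings, and the `budget_remaining` parameter are assumptions, while the rule keys mirror the `halting_rules` dictionary used in this workplan's QualityGate example.

```python
from typing import Optional


def should_halt(
    iteration: int,
    scores: list,
    gate_failed: bool,
    rules: dict,
    budget_remaining: int = 1,
) -> Optional[str]:
    """Return the halting reason, or None if refinement should continue."""
    if gate_failed and rules.get("fail_on_validation_error", True):
        return "quality_gate_failure"
    if budget_remaining <= 0:
        return "budget_exhausted"
    if iteration >= rules.get("max_iterations", 3):
        return "iteration_limit"
    # Marginal improvement: compare the two most recent quality scores.
    if len(scores) >= 2 and scores[-1] - scores[-2] < rules.get("min_improvement", 0.05):
        return "marginal_improvement"
    return None


rules = {"max_iterations": 3, "min_improvement": 0.05, "fail_on_validation_error": True}
print(should_halt(2, [0.50, 0.52], False, rules))
```

Returning the reason rather than a bare boolean makes it straightforward to record the halting decision in the RunManifest.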
QualityGate Example
# Schema validation gate
schema_gate = SchemaValidationGate(
    schema_path="schemas/api-spec-schema.json"
)
# Pattern validation gate
pattern_gate = PatternValidationGate(
    required_patterns=[r"## Endpoints", r"### Authentication"],
    forbidden_patterns=[r"TODO", r"FIXME"]
)
# Quality policy
policy = QualityPolicy(
    gates=[schema_gate, pattern_gate],
    halting_rules={
        'max_iterations': 3,
        'min_improvement': 0.05,
        'fail_on_validation_error': True
    }
)
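One possible shape for the pattern gate used above is sketched here. The constructor arguments match the example, but the `validate` method, the simplified `ValidationResult` (a `passed` flag instead of a status and score), and the diagnostic wording are assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    passed: bool
    diagnostics: list = field(default_factory=list)


class PatternValidationGate:
    """Regex-based gate: all required patterns present, no forbidden ones."""

    def __init__(self, required_patterns=(), forbidden_patterns=()):
        self.required = [re.compile(p) for p in required_patterns]
        self.forbidden = [re.compile(p) for p in forbidden_patterns]

    def validate(self, content: str) -> ValidationResult:
        diagnostics = [
            f"missing required pattern: {p.pattern}"
            for p in self.required
            if not p.search(content)
        ]
        diagnostics += [
            f"found forbidden pattern: {p.pattern}"
            for p in self.forbidden
            if p.search(content)
        ]
        return ValidationResult(passed=not diagnostics, diagnostics=diagnostics)


gate = PatternValidationGate([r"## Endpoints"], [r"TODO"])
print(gate.validate("## Endpoints\nGET /users  TODO: document"))
```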
Database Schema Additions
CREATE TABLE quality_gates (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
gate_type TEXT NOT NULL,
config JSON NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE validation_results (
id TEXT PRIMARY KEY,
run_id TEXT NOT NULL REFERENCES prompt_runs(id),
gate_id TEXT NOT NULL REFERENCES quality_gates(id),
artifact_id TEXT REFERENCES prompt_artifacts(id),
status TEXT NOT NULL, -- PASS, FAIL, WARNING
score REAL,
diagnostics JSON,
validated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_validations_run ON validation_results(run_id);
CREATE INDEX idx_validations_artifact ON validation_results(artifact_id);
Verification
pytest tests/unit/prompts/test_quality_gates.py
pytest tests/unit/prompts/test_quality_validator.py
pytest tests/unit/prompts/test_halting_policy.py
pytest tests/unit/prompts/test_refinement_loop.py
pytest tests/integration/prompts/test_quality_validation.py
pytest tests/integration/prompts/test_halting_execution.py
Phase 8: Observability & Traceability (FR-11)
Capability Requirements
| ID | Capability | Description | Priority |
|---|---|---|---|
| CAP-801 | Provenance Tracing | Trace any artifact to its producing run | High |
| CAP-802 | Dependency Visualization | Visualize dependency graph | Medium |
| CAP-803 | Run History | Query execution history | High |
| CAP-804 | Audit Logging | Complete audit trail of all operations | Medium |
| CAP-805 | GraphQL API | Query interface for all prompt operations | High |
| CAP-806 | CLI Commands | Command-line tools for management | High |
Implementation Tasks
Week 22: Traceability Service
- Create `markitect/prompts/traceability/service.py`
  - `TraceabilityService` class
  - `trace_artifact(artifact_id) -> ProvenanceTrace`
  - `get_producing_run(artifact_id) -> PromptRun`
  - `get_input_artifacts(run_id) -> List[Artifact]`
  - `get_generator_runs(run_id) -> List[PromptRun]`
  - `get_validation_history(artifact_id) -> List[ValidationResult]`
- Unit and integration tests
Week 23: Query & Visualization
- Create `markitect/prompts/visualization/graph.py`
  - Export dependency graph in DOT format
  - Generate Mermaid diagrams
- Create `markitect/prompts/queries/`
  - Complex query operations
  - Run history queries
  - Impact analysis queries
- Visualization and query tests
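The DOT and Mermaid exports planned for Week 23 are mostly string formatting. The sketch below assumes edges arrive as `(source, target, edge_type)` tuples and that node IDs are simple alphanumeric identifiers (Mermaid would need quoting or escaping otherwise); the function names are hypothetical.

```python
def to_dot(edges):
    """Render edges as a Graphviz DOT digraph with labelled edges."""
    lines = ["digraph dependencies {"]
    for source, target, edge_type in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{edge_type}"];')
    lines.append("}")
    return "\n".join(lines)


def to_mermaid(edges):
    """Render edges as a left-to-right Mermaid flowchart."""
    lines = ["graph LR"]
    for source, target, edge_type in edges:
        lines.append(f"  {source} -->|{edge_type}| {target}")
    return "\n".join(lines)


edges = [("a1", "r1", "REQUIRES"), ("a2", "r1", "REQUIRES"), ("r1", "a3", "GENERATES")]
print(to_dot(edges))
print(to_mermaid(edges))
```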
Week 24: API Layer - GraphQL
- Create `markitect/prompts/graphql/schema.py`
  - Extend existing GraphQL schema with prompt types
  - `PromptTemplate`, `PromptRun`, `Artifact`, `DependencyEdge` types
  - Queries: template, templates, run, runs, artifact, dependencies
  - Mutations: executeTemplate, recomputeDependents
  - Subscriptions: onRunComplete, onArtifactChange
- Create `markitect/prompts/graphql/resolvers.py`
  - Implement all query and mutation resolvers
- GraphQL integration tests
Week 25: CLI Commands
- Extend `markitect/cli.py` with prompt commands:
  - `markitect prompt template create/list/show/delete`
  - `markitect prompt execute <template> [--config CONFIG]`
  - `markitect prompt recompute <artifact> [--depth N]`
  - `markitect prompt trace <artifact>`
  - `markitect prompt graph <artifact> [--format dot|mermaid]`
  - `markitect prompt runs [--template TEMPLATE] [--status STATUS]`
  - `markitect prompt validate <artifact> [--gates GATES]`
- CLI integration tests
Week 26: Documentation & Polish
- User guide for prompt dependency resolution
- API documentation
- Example prompt templates and workflows
- Performance optimization
- Final integration testing
GraphQL Schema Extensions
type PromptTemplate {
id: ID!
name: String!
spaceId: ID!
content: String!
contentDigest: String!
macros: [ContentMacro!]!
metadata: JSON
createdAt: DateTime!
updatedAt: DateTime!
}
type ContentMacro {
kind: MacroKind!
target: String!
parameters: JSON
}
enum MacroKind {
REQUIRED
OPTIONAL
GENERATE
}
type PromptRun {
id: ID!
template: PromptTemplate!
inputBundleHash: String!
status: RunStatus!
stage: ExecutionStage!
parentRun: PromptRun
depth: Int!
manifest: RunManifest!
startedAt: DateTime!
completedAt: DateTime
}
type RunManifest {
runId: ID!
templateMetadata: JSON!
resolvedInputs: [ResolvedInput!]!
compiledPromptDigest: String!
modelConfig: JSON!
outputArtifacts: [Artifact!]!
dependencyEdges: [DependencyEdge!]!
validationResults: [ValidationResult!]!
impactDebt: [ImpactDebt!]!
}
type DependencyEdge {
id: ID!
source: Artifact!
target: Artifact!
run: PromptRun!
edgeType: EdgeType!
}
type Query {
promptTemplate(id: ID!): PromptTemplate
promptTemplates(spaceId: ID): [PromptTemplate!]!
promptRun(id: ID!): PromptRun
promptRuns(templateId: ID, status: RunStatus): [PromptRun!]!
artifact(id: ID!): Artifact
dependencies(artifactId: ID!, depth: Int): [DependencyEdge!]!
traceArtifact(id: ID!): ProvenanceTrace!
}
type Mutation {
createTemplate(input: CreateTemplateInput!): PromptTemplate!
executeTemplate(templateId: ID!, config: ExecutionConfig): PromptRun!
recomputeDependents(artifactId: ID!, config: RecomputeConfig): RecomputeResult!
}
type Subscription {
onRunComplete(templateId: ID): PromptRun!
onArtifactChange(spaceId: ID): ArtifactChange!
}
CLI Examples
# Create and execute a template
markitect prompt template create design-doc \
--space my-project \
--content @templates/design-template.md
markitect prompt execute design-doc \
--config '{"model": "gpt-4", "temperature": 0.7}'
# Trace artifact provenance
markitect prompt trace api-spec
# Output:
# Artifact: api-spec (a3)
# Produced by: PromptRun r1
# Template: design-doc (t1)
# Input artifacts:
# - glossary (a1)
# - requirements (a2)
# Generated at: 2026-02-08 10:30:00
# Visualize dependencies
markitect prompt graph api-spec --format mermaid > deps.mmd
# Recompute after change
markitect prompt recompute glossary --depth 2
# Recomputing dependents of glossary...
# ✓ design-doc run r1 (api-spec regenerated)
# ✓ implementation-guide run r2 (guide regenerated)
# Summary: 2 runs executed, 0 suppressed
Verification
pytest tests/unit/prompts/test_traceability_service.py
pytest tests/unit/prompts/test_visualization.py
pytest tests/integration/prompts/test_graphql_api.py
pytest tests/integration/prompts/test_cli_commands.py
pytest tests/e2e/prompts/test_complete_workflow.py
Timeline Summary
| Phase | Focus | Duration | Cumulative |
|---|---|---|---|
| 1 | Foundation - Addressable Artifacts | 3 weeks | 3 weeks |
| 2 | Templates & Macros | 3 weeks | 6 weeks |
| 3 | Resolver Engine | 3 weeks | 9 weeks |
| 4 | Execution Engine | 3 weeks | 12 weeks |
| 5 | Dependency Tracking | 3 weeks | 15 weeks |
| 6 | Incremental Execution | 3 weeks | 18 weeks |
| 7 | Quality & Validation | 3 weeks | 21 weeks |
| 8 | Observability & Traceability | 5 weeks | 26 weeks |
Total: 26 weeks (~6 months)
Parallel Work Opportunities
- Phases 7 (Quality) and 8 (Observability) can partially overlap
- Documentation can be written incrementally throughout
- CLI commands can start in parallel with Phase 7
- GraphQL schema can be drafted early and implemented incrementally
Files to Create
Core Modules
markitect/prompts/
├── __init__.py
├── models.py # Phase 1: Artifact models
├── repositories/
│ ├── __init__.py
│ ├── interfaces.py # Phase 1: Repository interfaces
│ └── sqlite.py # Phase 1: SQLite implementations
├── templates/
│ ├── __init__.py
│ ├── models.py # Phase 2: Template models
│ ├── parser.py # Phase 2: Macro parser
│ └── analyzer.py # Phase 2: Template analyzer
├── resolver/
│ ├── __init__.py
│ ├── models.py # Phase 3: Resolution models
│ ├── strategy.py # Phase 3: Resolution strategies
│ ├── resolver.py # Phase 3: PromptResolver
│ └── compiler.py # Phase 3: Context compiler
├── execution/
│ ├── __init__.py
│ ├── models.py # Phase 4: Execution models
│ ├── manifest.py # Phase 4: RunManifest
│ ├── engine.py # Phase 4: Execution engine
│ ├── llm_adapter.py # Phase 4: LLM integration
│ └── generator.py # Phase 4: Generator executor
├── dependencies/
│ ├── __init__.py
│ ├── models.py # Phase 5: Dependency models
│ ├── graph.py # Phase 5: Graph builder
│ ├── repository.py # Phase 5: Dependency storage
│ └── queries.py # Phase 5: Graph queries
├── incremental/
│ ├── __init__.py
│ ├── models.py # Phase 6: Change models
│ ├── detector.py # Phase 6: Change detector
│ ├── impact.py # Phase 6: Impact analyzer
│ ├── metrics.py # Phase 6: Diff metrics
│ └── engine.py # Phase 6: Incremental engine
├── quality/
│ ├── __init__.py
│ ├── models.py # Phase 7: Quality models
│ ├── gates/
│ │ ├── __init__.py
│ │ ├── schema_gate.py # Phase 7: Schema validation
│ │ └── pattern_gate.py # Phase 7: Pattern validation
│ ├── validator.py # Phase 7: Quality validator
│ ├── policy.py # Phase 7: Halting policy
│ └── refinement.py # Phase 7: Refinement loop
├── traceability/
│ ├── __init__.py
│ └── service.py # Phase 8: Traceability
├── visualization/
│ ├── __init__.py
│ └── graph.py # Phase 8: Graph visualization
├── queries/
│ ├── __init__.py
│ └── operations.py # Phase 8: Complex queries
├── graphql/
│ ├── __init__.py
│ ├── schema.py # Phase 8: GraphQL schema
│ └── resolvers.py # Phase 8: Resolvers
└── services/
├── __init__.py
├── artifact_service.py # Phase 1: Artifact operations
└── template_service.py # Phase 2: Template operations
Test Files
tests/unit/prompts/
├── test_artifact_models.py # Phase 1
├── test_artifact_repository.py # Phase 1
├── test_template_models.py # Phase 2
├── test_macro_parser.py # Phase 2
├── test_template_analyzer.py # Phase 2
├── test_resolution_strategy.py # Phase 3
├── test_prompt_resolver.py # Phase 3
├── test_context_compiler.py # Phase 3
├── test_execution_models.py # Phase 4
├── test_execution_engine.py # Phase 4
├── test_llm_adapter.py # Phase 4
├── test_generator_executor.py # Phase 4
├── test_dependency_models.py # Phase 5
├── test_graph_builder.py # Phase 5
├── test_dependency_repository.py # Phase 5
├── test_dependency_queries.py # Phase 5
├── test_change_detector.py # Phase 6
├── test_impact_analyzer.py # Phase 6
├── test_incremental_engine.py # Phase 6
├── test_quality_gates.py # Phase 7
├── test_quality_validator.py # Phase 7
├── test_halting_policy.py # Phase 7
├── test_refinement_loop.py # Phase 7
├── test_traceability_service.py # Phase 8
└── test_visualization.py # Phase 8
tests/integration/prompts/
├── test_artifact_service.py # Phase 1
├── test_template_service.py # Phase 2
├── test_resolution_flow.py # Phase 3
├── test_prompt_execution.py # Phase 4
├── test_idempotent_execution.py # Phase 4
├── test_dependency_graph.py # Phase 5
├── test_circular_detection.py # Phase 5
├── test_incremental_recompute.py # Phase 6
├── test_circular_suppression.py # Phase 6
├── test_impact_debt.py # Phase 6
├── test_quality_validation.py # Phase 7
├── test_halting_execution.py # Phase 7
├── test_graphql_api.py # Phase 8
└── test_cli_commands.py # Phase 8
tests/e2e/prompts/
└── test_complete_workflow.py # Phase 8
Documentation Files
docs/prompts/
├── GETTING_STARTED.md
├── TEMPLATE_GUIDE.md
├── EXECUTION_GUIDE.md
├── DEPENDENCY_MANAGEMENT.md
├── QUALITY_GATES.md
└── API_REFERENCE.md
Database Migrations
migrations/prompts/
├── 001_create_artifacts_table.sql # Phase 1
├── 002_create_resolution_config.sql # Phase 3
├── 003_create_runs_and_manifests.sql # Phase 4
├── 004_create_dependencies.sql # Phase 5
├── 005_create_changes_and_debt.sql # Phase 6
└── 006_create_quality_tables.sql # Phase 7
Success Criteria
The implementation is considered complete when all of the following acceptance criteria from the FRS are met:
- ✅ FR-2 & FR-3: A PromptTemplate referencing Required, Optional, and Generate macros can be executed
- ✅ FR-3.4 & FR-4: Missing Generate dependencies are automatically generated and persisted
- ✅ FR-4.4: Re-running an unchanged PromptRun with identical InputBundleHash results in skipped execution
- ✅ FR-7: Changing an upstream artifact triggers recomputation of direct dependents
- ✅ FR-7.3: Circular recomputation is suppressed and logged
- ✅ FR-5: RunManifest contains complete provenance and dependency information
- ✅ FR-9: Schema validation failures are correctly recorded and influence halting policy
Additional Quality Metrics
- Test Coverage: >85% for all prompt modules
- Performance: Execute simple template in <500ms (excluding LLM call)
- Performance: Build dependency graph for 1000 artifacts in <2s
- Performance: Incremental recompute for 100 dependents in <5s
- Documentation: Complete user guides for all major workflows
- Integration: Zero regressions in existing MarkiTect functionality
Design Decisions
1. LLM Provider Abstraction
Decision: Abstract LLM integration behind LLMAdapter interface
Rationale: The FRS explicitly does not prescribe an LLM provider. The adapter pattern allows pluggable providers (OpenAI, Anthropic, local models).
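A minimal sketch of how the adapter boundary might look, with a mock adapter of the kind Risk 1 suggests starting with. The method name `complete`, its parameters, and the `MockLLMAdapter` and `run_prompt` names are assumptions for illustration; the FRS does not define them.

```python
from __future__ import annotations

from typing import Protocol


class LLMAdapter(Protocol):
    """Hypothetical provider-agnostic interface (method name is an assumption)."""

    def complete(self, prompt: str, *, model: str, temperature: float = 0.0) -> str:
        ...


class MockLLMAdapter:
    """Deterministic stand-in for tests: returns a canned reply or echoes the prompt."""

    def __init__(self, canned: dict[str, str] | None = None) -> None:
        self.canned = canned or {}
        self.calls: list[str] = []  # recorded prompts, useful for assertions

    def complete(self, prompt: str, *, model: str, temperature: float = 0.0) -> str:
        self.calls.append(prompt)
        return self.canned.get(prompt, f"[mock:{model}] {prompt}")


def run_prompt(adapter: LLMAdapter, prompt: str) -> str:
    # The engine depends only on the protocol, never on a concrete provider.
    return adapter.complete(prompt, model="mock-model")
```

Because `LLMAdapter` is a structural protocol, a production adapter only needs a matching `complete` method; it does not have to inherit from anything.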
2. Storage Backend
Decision: SQLite for persistence, in-memory graph for queries
Rationale: Consistent with existing InformationSpace implementation. SQLite provides ACID guarantees. In-memory graph enables fast traversal.
3. Macro Syntax
Decision: Use {{kind:target|param=value}} syntax in markdown
Rationale: Non-invasive in markdown source. Compatible with existing transclusion syntax. Easy to parse with regex.
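To illustrate the "easy to parse with regex" claim, here is a sketch of a parser for the `{{kind:target|param=value}}` form. The `Macro` dataclass, the `parse_macros` function, the lower-casing of `kind`, and the support for several `|`-separated parameters are assumptions, not part of the specified syntax.

```python
import re
from dataclasses import dataclass, field

# kind, then ':', then a target without '|' or '}', then optional '|'-separated params.
MACRO_RE = re.compile(
    r"\{\{\s*(?P<kind>\w+)\s*:\s*(?P<target>[^|}]+?)\s*(?:\|(?P<params>[^}]*))?\}\}"
)


@dataclass
class Macro:
    kind: str
    target: str
    params: dict = field(default_factory=dict)


def parse_macros(text: str) -> list:
    """Return every macro found in a markdown string, in document order."""
    macros = []
    for match in MACRO_RE.finditer(text):
        params = {}
        for pair in (match.group("params") or "").split("|"):
            key, sep, value = pair.partition("=")
            if sep:  # ignore empty or malformed segments
                params[key.strip()] = value.strip()
        macros.append(Macro(match.group("kind").lower(), match.group("target"), params))
    return macros


found = parse_macros("See {{required:specs/api.md}} and {{generate:summary|depth=2}}.")
print([(m.kind, m.target, m.params) for m in found])
```

A regex of this shape cannot handle nested braces or `}` inside parameter values; if such cases are needed, a small hand-written tokenizer would be the safer choice.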
4. Incremental Recompute Default Depth
Decision: Default depth=1 (direct dependents only)
Rationale: Per FR-7.2. Prevents cascading recomputation storms. User can increase depth explicitly when needed.
5. Circular Dependency Handling
Decision: Suppress recomputation, record as ImpactDebt
Rationale: Per FR-7.3. Avoids infinite loops. Debt tracking ensures visibility into suppressed updates.
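Decisions 4 and 5 can be sketched together as a depth-limited breadth-first walk over the dependents graph. The function name `plan_recompute`, the adjacency-dict representation, and the choice to record debt as `(source, target)` edge tuples are assumptions; the actual ImpactDebt model may carry more fields.

```python
from collections import deque


def plan_recompute(dependents, changed, depth=1):
    """Plan recomputation after `changed` is modified.

    dependents maps an artifact id to the ids that depend on it.
    Returns (artifacts to recompute in BFS order, suppressed cyclic edges).
    """
    to_recompute = []
    impact_debt = []          # edges that would loop back onto the current path
    scheduled = {changed}
    queue = deque([(changed, 0, frozenset([changed]))])
    while queue:
        node, level, path = queue.popleft()
        if level >= depth:    # depth=1 stops after direct dependents
            continue
        for dep in dependents.get(node, []):
            if dep in path:
                impact_debt.append((node, dep))   # cycle: suppress and record
            elif dep not in scheduled:
                scheduled.add(dep)
                to_recompute.append(dep)
                queue.append((dep, level + 1, path | {dep}))
    return to_recompute, impact_debt


graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"]}
print(plan_recompute(graph, "a"))           # (['b', 'c'], [])
print(plan_recompute(graph, "a", depth=2))  # (['b', 'c', 'd'], [('b', 'a')])
```

Tracking the path per queue entry distinguishes a true cycle (`b` depends back on `a`) from a diamond (`d` reachable through both `b` and `c`), which is scheduled once and not treated as debt.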
6. Change Impact Default Method
Decision: Structural diff ratio
Rationale: Per FR-8.2. Fast, deterministic, no external dependencies. Embedding and LLM methods are optional enhancements.
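One way such a ratio could be computed with only the standard library is a line-level comparison using `difflib.SequenceMatcher`. Treating lines as the structural unit and defining the change score as `1 - similarity` are assumptions here; the FRS may define the metric differently.

```python
import difflib


def change_ratio(old: str, new: str) -> float:
    """Return a change score in [0, 1]: 0.0 means identical, 1.0 means no shared lines."""
    matcher = difflib.SequenceMatcher(
        a=old.splitlines(), b=new.splitlines(), autojunk=False
    )
    return 1.0 - matcher.ratio()


print(change_ratio("a\nb\nc\nd", "a\nb\nc\nx"))  # 0.25
```

`SequenceMatcher.ratio()` returns `2 * M / T`, where `M` is the number of matched elements and `T` the total length of both sequences, so replacing one of four lines gives a similarity of 0.75 and a change score of 0.25.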
7. InputBundleHash Components
Decision: Template digest + dependency digests + config + model settings
Rationale: Per FR-4.3. Captures all factors affecting prompt output. Ensures idempotent execution.
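A sketch of how the four components might be combined into a single hash. The use of SHA-256, canonical JSON serialization, and the function and key names are assumptions; the important property is that logically identical inputs always serialize to the same bytes.

```python
import hashlib
import json


def input_bundle_hash(template_digest, dependency_digests, config, model_settings):
    """Combine all inputs that affect a run into one deterministic hex digest."""
    payload = {
        "template": template_digest,
        "dependencies": dependency_digests,  # artifact id -> content digest
        "config": config,
        "model": model_settings,
    }
    # sort_keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = input_bundle_hash("t1", {"a": "d1", "b": "d2"}, {"depth": 1}, {"temperature": 0})
again = input_bundle_hash("t1", {"b": "d2", "a": "d1"}, {"depth": 1}, {"temperature": 0})
print(first == again)  # True: key order does not affect the hash
```

An execution engine could compare this value against the hash stored for the previous run and skip execution when they match, which is the behaviour the FR-4.4 success criterion describes.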
8. RunManifest Storage Format
Decision: JSON columns in SQLite
Rationale: Flexible schema for manifest evolution. Queryable via SQLite JSON functions. Easy to export for analysis.
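The following sketch shows a manifest stored as a JSON text column and queried with `json_extract`. The table name, column names, and manifest fields are hypothetical and do not reflect the actual migration; it also assumes the SQLite build bundled with Python includes the JSON functions (built in by default since SQLite 3.38).

```python
import json
import sqlite3


def create_store(conn):
    # Hypothetical schema: one row per run, manifest kept as JSON text.
    conn.execute(
        "CREATE TABLE run_manifests (run_id TEXT PRIMARY KEY, manifest TEXT NOT NULL)"
    )


def save_manifest(conn, run_id, manifest):
    conn.execute(
        "INSERT INTO run_manifests (run_id, manifest) VALUES (?, ?)",
        (run_id, json.dumps(manifest, sort_keys=True)),
    )


def runs_with_status(conn, status):
    # json_extract reads a field out of the stored JSON without a fixed column for it.
    rows = conn.execute(
        "SELECT run_id FROM run_manifests "
        "WHERE json_extract(manifest, '$.status') = ? ORDER BY run_id",
        (status,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_store(conn)
save_manifest(conn, "run-1", {"status": "completed", "dependencies": ["a", "b"]})
save_manifest(conn, "run-2", {"status": "skipped", "dependencies": ["a"]})
print(runs_with_status(conn, "skipped"))  # ['run-2']
```

New manifest fields can be added without a migration, at the cost of weaker schema enforcement than dedicated columns would give.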
Risk Management
Risk 1: LLM Integration Complexity
Impact: High | Probability: Medium
Mitigation:
- Start with mock LLM adapter for testing
- Well-defined adapter interface
- Implement one production adapter (e.g., OpenAI) in Phase 4
- Additional adapters can be added incrementally
Risk 2: Performance at Scale
Impact: Medium | Probability: Medium
Mitigation:
- Index all foreign keys and frequently queried columns
- Use in-memory graph for dependency traversal
- Implement pagination for large result sets
- Performance testing with 10K+ artifacts in Phase 6
Risk 3: Circular Dependency Complexity
Impact: Medium | Probability: Low
Mitigation:
- Thorough cycle detection testing
- Clear documentation on when cycles occur
- ImpactDebt provides visibility
- Users can manually break cycles if needed
Risk 4: Quality Gate Extensibility
Impact: Low | Probability: Low
Mitigation:
- Plugin-based architecture for gates
- Well-defined QualityGate interface
- Ship with schema and pattern gates
- Document custom gate creation
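As a rough illustration of what a plugin-style gate interface could look like, the sketch below defines a gate protocol, a pattern gate, and a custom gate added without touching the runner. The names `GateResult`, `check`, `PatternGate`, `MaxLengthGate`, and `run_gates` are assumptions; only the idea of a QualityGate interface with schema and pattern gates comes from the plan.

```python
import re
from dataclasses import dataclass
from typing import Protocol


@dataclass
class GateResult:
    gate: str
    passed: bool
    message: str = ""


class QualityGate(Protocol):
    """Hypothetical gate contract: a name plus a check over generated content."""

    name: str

    def check(self, content: str) -> GateResult:
        ...


class PatternGate:
    """Passes when the content contains a match for the given regex."""

    def __init__(self, name: str, pattern: str) -> None:
        self.name = name
        self._regex = re.compile(pattern)

    def check(self, content: str) -> GateResult:
        ok = self._regex.search(content) is not None
        return GateResult(self.name, ok, "" if ok else f"pattern {self._regex.pattern!r} not found")


class MaxLengthGate:
    """Example of a custom gate: any object with `name` and `check` qualifies."""

    def __init__(self, limit: int) -> None:
        self.name = "max_length"
        self.limit = limit

    def check(self, content: str) -> GateResult:
        ok = len(content) <= self.limit
        return GateResult(self.name, ok, "" if ok else f"content exceeds {self.limit} characters")


def run_gates(gates, content: str) -> list:
    """Run every gate and collect results; a halting policy would consume this list."""
    return [gate.check(content) for gate in gates]
```

A halting policy could then be expressed as a function over the returned `GateResult` list, for example halting when any result has `passed` set to `False`.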
Risk 5: Integration with Existing InformationSpace
Impact: High | Probability: Low
Mitigation:
- Build on top of existing space infrastructure
- Reuse space repositories and services
- Comprehensive integration tests
- Incremental rollout per phase
Future Enhancements (Out of Scope)
The following capabilities are valuable but explicitly out of scope for initial implementation:
- Distributed Execution: Execute prompt runs across multiple workers
- Real-time Collaboration: Multiple users editing templates simultaneously
- Version Control Integration: Git-based template versioning
- Advanced Visualization: Interactive dependency graph UI
- Cost Tracking: Track LLM API costs per run
- Prompt Optimization: Automatic prompt refinement based on results
- Multi-modal Artifacts: Support for images, audio in artifacts
- External Data Sources: Pull artifacts from APIs, databases
- Scheduling: Cron-based automatic recomputation
- A/B Testing: Compare multiple template variations
Implementation Status
Status: Planning Complete
Next Step: Begin Phase 1 implementation
Target Start: TBD
Estimated Completion: 26 weeks from start
Conclusion
This workplan provides a comprehensive roadmap for implementing Prompt Dependency Resolution infrastructure in MarkiTect. The 8-phase approach ensures:
- Incremental Delivery: Each phase delivers working functionality
- Risk Mitigation: Complex features built on solid foundations
- Testability: Comprehensive test coverage at every phase
- Extensibility: Clean architecture supports future enhancements
- Compliance: Full coverage of all FR-1 through FR-11 requirements
The implementation will transform MarkiTect into an executable knowledge infrastructure, enabling deterministic, traceable, and incremental execution of prompt-based content generation across InformationSpaces.
Status: Ready for Implementation 🚀