feat(prompts): implement Phase 4 - Execution Engine (FR-4, FR-5)

Implement three-stage execution lifecycle with idempotent runs and
complete provenance tracking via RunManifest.

Core Features:
- PromptRun model with execution lifecycle stages:
  1. Analysis: Template analysis and macro extraction
  2. Compilation: Macro resolution and context compilation
  3. Processing: LLM execution and output generation
- InputBundleHash for deterministic idempotency (FR-4.3)
- RunManifest for complete execution provenance (FR-5)
- LLMAdapter interface for pluggable model providers
- MockLLMAdapter for testing without API calls
- PromptExecutionEngine orchestrating full lifecycle

Idempotent Execution (FR-4.4):
- Calculate SHA-256 hash of complete input context
- Skip execution if identical hash exists
- Cache successful runs by hash
- Support force re-execution via config flag

RunManifest Tracking (FR-5.2):
- Template metadata (id, name, digest)
- Resolved input artifacts and digests
- Compiled prompt digest
- Model configuration
- Output artifacts
- Dependency edges for graph construction
- Timing metadata for performance analysis

Tests (27 passing):
- 17 execution model tests (config, bundle, runs, stages)
- 10 engine tests (execution, idempotency, errors, caching)

Implements:
- FR-4.1: Three-stage execution lifecycle
- FR-4.2: CompiledPrompt during compilation
- FR-4.3: InputBundleHash calculation
- FR-4.4: Skip execution for identical hashes
- FR-5.1: RunManifest persistence
- FR-5.2: Complete manifest contents
- FR-5.3: Nested run linking (foundation)

Files Created:
- markitect/prompts/execution/models.py
- markitect/prompts/execution/manifest.py
- markitect/prompts/execution/llm_adapter.py
- markitect/prompts/execution/engine.py
- migrations/prompts/003_create_runs_and_manifests.sql
- tests/unit/prompts/test_execution_models.py
- tests/unit/prompts/test_execution_engine.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 23:15:33 +01:00
parent 5f463e5b20
commit c56c92c815
8 changed files with 1739 additions and 0 deletions
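
The idempotency rule in the commit message (FR-4.3, FR-4.4) is implemented in engine.py and models.py, which are not shown in this view. As a rough illustration only, it could be sketched as follows. The names `input_bundle_hash` and `RunCache` are hypothetical and are not taken from the commit, and the sketch assumes the input context is JSON-serializable.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def input_bundle_hash(context: Dict[str, Any]) -> str:
    """Return a SHA-256 hex digest of a JSON-serializable context (hypothetical helper)."""
    # sort_keys plus fixed separators makes the encoding independent of dict ordering,
    # so logically identical contexts always produce the same digest.
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunCache:
    """In-memory cache of successful outputs keyed by bundle hash (illustrative only)."""

    def __init__(self) -> None:
        self._outputs: Dict[str, str] = {}

    def execute(
        self,
        context: Dict[str, Any],
        run: Callable[[Dict[str, Any]], str],
        force: bool = False,
    ) -> str:
        key = input_bundle_hash(context)
        # Skip execution when an identical input bundle already succeeded,
        # unless the caller explicitly forces a re-run.
        if not force and key in self._outputs:
            return self._outputs[key]
        # Exceptions propagate before the assignment below, so failed runs are never cached.
        output = run(context)
        self._outputs[key] = output
        return output


if __name__ == "__main__":
    cache = RunCache()
    context = {"template": "summary", "inputs": {"doc": "abc123"}}
    first = cache.execute(context, lambda ctx: "generated text")
    second = cache.execute(context, lambda ctx: "never called")
    print(first == second)
```

Canonical JSON is one possible way to get a deterministic digest. The actual InputBundleHash may hash different fields or use a different encoding.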
--- a/markitect/prompts/execution/manifest.py
+++ b/markitect/prompts/execution/manifest.py
@@ -0,0 +1,291 @@
+"""
+RunManifest for execution provenance tracking.
+
+Implements FR-5: RunManifest Persistence
+Complete record of execution with all inputs, outputs, and metadata.
+"""
+
+from dataclasses import dataclass, field
+from datetime import datetime
+from typing import Dict, Any, List, Optional
+
+
+@dataclass
+class ResolvedInput:
+    """
+    Record of a resolved input artifact.
+
+    Attributes:
+        name: Artifact name
+        artifact_id: Artifact ID
+        space_id: Space where artifact was found
+        digest: Content digest
+    """
+    name: str
+    artifact_id: str
+    space_id: str
+    digest: str
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert to dictionary."""
+        return {
+            "name": self.name,
+            "artifact_id": self.artifact_id,
+            "space_id": self.space_id,
+            "digest": self.digest,
+        }
+
+    @classmethod
+    def from_dict(cls, data: Dict[str, Any]) -> "ResolvedInput":
+        """Create from dictionary."""
+        return cls(
+            name=data["name"],
+            artifact_id=data["artifact_id"],
+            space_id=data["space_id"],
+            digest=data["digest"],
+        )
+
+
+@dataclass
+class DependencyEdge:
+    """
+    Dependency edge in execution graph.
+
+    Attributes:
+        source_id: Source artifact/run ID
+        target_id: Target artifact/run ID
+        edge_type: Type of dependency (requires, generates, includes)
+    """
+    source_id: str
+    target_id: str
+    edge_type: str
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert to dictionary."""
+        return {
+            "source_id": self.source_id,
+            "target_id": self.target_id,
+            "edge_type": self.edge_type,
+        }
+
+
+@dataclass
+class OutputArtifact:
+    """
+    Artifact produced by execution.
+
+    Attributes:
+        artifact_id: Artifact ID
+        name: Artifact name
+        digest: Content digest
+        artifact_type: Type of artifact
+    """
+    artifact_id: str
+    name: str
+    digest: str
+    artifact_type: str
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert to dictionary."""
+        return {
+            "artifact_id": self.artifact_id,
+            "name": self.name,
+            "digest": self.digest,
+            "artifact_type": self.artifact_type,
+        }
+
+
+@dataclass
+class RunManifest:
+    """
+    Complete execution manifest with provenance.
+
+    Implements FR-5: RunManifest Persistence
+
+    The RunManifest provides complete traceability for a prompt execution,
+    capturing all inputs, outputs, configuration, and metadata.
+
+    Implements FR-5.2: RunManifest contents:
+    - Template metadata
+    - Resolved inputs and their digests
+    - CompiledPrompt digest
+    - Model configuration
+    - Output artifacts and digests
+    - Dependency edges
+    - Validation results
+    - Impact debt records (if applicable)
+
+    Attributes:
+        run_id: ID of associated run
+        template_metadata: Template information
+        resolved_inputs: List of resolved input artifacts
+        compiled_prompt_digest: Digest of compiled prompt
+        model_config: Model configuration used
+        output_artifacts: List of produced artifacts
+        dependency_edges: Dependency graph edges
+        validation_results: Quality validation results
+        impact_debt: Suppressed recomputation records
+        timing_metadata: Execution timing information
+        created_at: Manifest creation time
+    """
+    run_id: str
+    template_metadata: Dict[str, Any]
+    resolved_inputs: List[ResolvedInput] = field(default_factory=list)
+    compiled_prompt_digest: str = ""
+    model_config: Dict[str, Any] = field(default_factory=dict)
+    output_artifacts: List[OutputArtifact] = field(default_factory=list)
+    dependency_edges: List[DependencyEdge] = field(default_factory=list)
+    validation_results: Dict[str, Any] = field(default_factory=dict)
+    impact_debt: List[Dict[str, Any]] = field(default_factory=list)
+    timing_metadata: Dict[str, float] = field(default_factory=dict)
+    created_at: datetime = field(default_factory=datetime.utcnow)
+
+    @classmethod
+    def create(
+        cls,
+        run_id: str,
+        template_id: str,
+        template_name: str,
+        template_digest: str,
+    ) -> "RunManifest":
+        """
+        Create a new manifest.
+
+        Args:
+            run_id: Run ID
+            template_id: Template ID
+            template_name: Template name
+            template_digest: Template content digest
+
+        Returns:
+            New RunManifest instance
+        """
+        return cls(
+            run_id=run_id,
+            template_metadata={
+                "template_id": template_id,
+                "template_name": template_name,
+                "template_digest": template_digest,
+            },
+        )
+
+    def add_resolved_input(
+        self,
+        name: str,
+        artifact_id: str,
+        space_id: str,
+        digest: str,
+    ) -> None:
+        """
+        Add a resolved input artifact.
+
+        Args:
+            name: Artifact name
+            artifact_id: Artifact ID
+            space_id: Space ID
+            digest: Content digest
+        """
+        self.resolved_inputs.append(
+            ResolvedInput(
+                name=name,
+                artifact_id=artifact_id,
+                space_id=space_id,
+                digest=digest,
+            )
+        )
+
+    def add_output_artifact(
+        self,
+        artifact_id: str,
+        name: str,
+        digest: str,
+        artifact_type: str,
+    ) -> None:
+        """
+        Add an output artifact.
+
+        Args:
+            artifact_id: Artifact ID
+            name: Artifact name
+            digest: Content digest
+            artifact_type: Artifact type
+        """
+        self.output_artifacts.append(
+            OutputArtifact(
+                artifact_id=artifact_id,
+                name=name,
+                digest=digest,
+                artifact_type=artifact_type,
+            )
+        )
+
+    def add_dependency_edge(
+        self,
+        source_id: str,
+        target_id: str,
+        edge_type: str,
+    ) -> None:
+        """
+        Add a dependency edge.
+
+        Args:
+            source_id: Source ID
+            target_id: Target ID
+            edge_type: Edge type
+        """
+        self.dependency_edges.append(
+            DependencyEdge(
+                source_id=source_id,
+                target_id=target_id,
+                edge_type=edge_type,
+            )
+        )
+
+    def set_timing(self, stage: str, duration_seconds: float) -> None:
+        """
+        Record timing for a stage.
+
+        Args:
+            stage: Stage name
+            duration_seconds: Duration in seconds
+        """
+        self.timing_metadata[stage] = duration_seconds
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert to dictionary for serialization."""
+        return {
+            "run_id": self.run_id,
+            "template_metadata": self.template_metadata,
+            "resolved_inputs": [inp.to_dict() for inp in self.resolved_inputs],
+            "compiled_prompt_digest": self.compiled_prompt_digest,
+            "model_config": self.model_config,
+            "output_artifacts": [out.to_dict() for out in self.output_artifacts],
+            "dependency_edges": [edge.to_dict() for edge in self.dependency_edges],
+            "validation_results": self.validation_results,
+            "impact_debt": self.impact_debt,
+            "timing_metadata": self.timing_metadata,
+            "created_at": self.created_at.isoformat(),
+        }
+
+    @classmethod
+    def from_dict(cls, data: Dict[str, Any]) -> "RunManifest":
+        """Create from dictionary."""
+        return cls(
+            run_id=data["run_id"],
+            template_metadata=data["template_metadata"],
+            resolved_inputs=[
+                ResolvedInput.from_dict(inp) for inp in data.get("resolved_inputs", [])
+            ],
+            compiled_prompt_digest=data.get("compiled_prompt_digest", ""),
+            model_config=data.get("model_config", {}),
+            output_artifacts=[
+                OutputArtifact(**out) for out in data.get("output_artifacts", [])
+            ],
+            dependency_edges=[
+                DependencyEdge(**edge) for edge in data.get("dependency_edges", [])
+            ],
+            validation_results=data.get("validation_results", {}),
+            impact_debt=data.get("impact_debt", []),
+            timing_metadata=data.get("timing_metadata", {}),
+            created_at=datetime.fromisoformat(data["created_at"]),
+        )
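
The `to_dict` / `from_dict` pair in the diff above is what lets a manifest be written to storage and read back. The sketch below shows that round trip through JSON using reduced stand-in classes (`Edge` and `MiniManifest`, both hypothetical) so that it runs without the markitect package. It mirrors the pattern in the diff and is not the actual `RunManifest`. One deliberate difference: the stand-in uses a timezone-aware `datetime.now(timezone.utc)`, whereas the diff uses `datetime.utcnow`, which is deprecated as of Python 3.12.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class Edge:
    """Stand-in for DependencyEdge with the same three fields."""
    source_id: str
    target_id: str
    edge_type: str


@dataclass
class MiniManifest:
    """Reduced stand-in for RunManifest; not the class from the diff."""
    run_id: str
    dependency_edges: List[Edge] = field(default_factory=list)
    timing_metadata: Dict[str, float] = field(default_factory=dict)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def to_dict(self) -> Dict[str, Any]:
        # Nested dataclasses become plain dicts and the timestamp becomes an
        # ISO 8601 string, so the result can be passed directly to json.dumps.
        return {
            "run_id": self.run_id,
            "dependency_edges": [vars(e).copy() for e in self.dependency_edges],
            "timing_metadata": self.timing_metadata,
            "created_at": self.created_at.isoformat(),
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MiniManifest":
        # Optional collections fall back to empty defaults, as in the diff.
        return cls(
            run_id=data["run_id"],
            dependency_edges=[Edge(**e) for e in data.get("dependency_edges", [])],
            timing_metadata=data.get("timing_metadata", {}),
            created_at=datetime.fromisoformat(data["created_at"]),
        )


manifest = MiniManifest(run_id="run-1")
manifest.dependency_edges.append(Edge("artifact-a", "run-1", "requires"))
manifest.timing_metadata["analysis"] = 0.25

payload = json.dumps(manifest.to_dict())
restored = MiniManifest.from_dict(json.loads(payload))
```

Because dataclasses compare field by field, `restored == manifest` is a convenient round-trip check in a unit test.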