---
id: task-flow-engine-spec
type: design-spec
title: "Task Flow Engine Specification"
status: draft
created: "2026-05-01"
updated: "2026-05-01"
---
# Task Flow Engine Specification
## Purpose
The task flow engine is a lightweight, declarative workflow substrate for
information objects that move through named workstations. It replaces local
status enums and hardcoded transition tables with pure assertions over an
object's observable properties.
The engine is intentionally small: it receives a plain dictionary plus a flow
definition, evaluates assertions, and returns a machine-readable result. It
does not know about SQLAlchemy, FastAPI, State Hub routers, or Custodian domain
rules.
## Core Terms
### Information Object
An information object is any entity with:
- a current workstation label, usually exposed as `workstation` or `status`
- a bag of observable properties
- optional nested collections of related entities
Examples include workstreams, tasks, contributions, capability requests, and
future interface changes. The engine treats all of them as plain dictionaries.
### WorkstationDef
A workstation is a named position an information object can occupy.
```yaml
name: active
description: Work is underway.
entry_assertions: []
exit_assertions:
  - id: tasks.all_done
    target: tasks.*.status
    op: all_eq
    value: [done, cancel]
    description: All child tasks are done or canceled.
```
Schema:
- `name: str`
- `entry_assertions: list[AssertionDef]`
- `exit_assertions: list[AssertionDef]`
- `description: str`
A workstation with no entry assertions is always reachable. A workstation with
no exit assertions is always exitable.
### AssertionDef
An assertion is a pure predicate over object data.
Schema:
- `id: str`
- `target: str`
- `op: str`
- `value: Any`
- `description: str`
The `target` is a dot path into the information object. It supports normal dict
and attribute traversal plus `*` for collection expansion:
- `tasks.*.status`
- `dependencies.*.workstation`
- `metadata.approved_by`
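Target resolution as described above can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: the function name `resolve_target` is hypothetical, and returning an empty list for a missing path is an assumption the spec does not state.

```python
from types import SimpleNamespace
from typing import Any


def resolve_target(obj: Any, target: str) -> list[Any]:
    """Resolve a dot path against obj and return every value it reaches."""
    current = [obj]
    for segment in target.split("."):
        next_values = []
        for item in current:
            if segment == "*":
                # Collection expansion: fan out over list items or dict values.
                if isinstance(item, dict):
                    next_values.extend(item.values())
                elif isinstance(item, (list, tuple)):
                    next_values.extend(item)
            elif isinstance(item, dict):
                if segment in item:
                    next_values.append(item[segment])
            elif hasattr(item, segment):
                # Attribute traversal for non-dict objects.
                next_values.append(getattr(item, segment))
        current = next_values
    return current


workstream = {
    "tasks": [{"status": "done"}, {"status": "todo"}],
    "metadata": {"approved_by": "alice"},
}
statuses = resolve_target(workstream, "tasks.*.status")
approver = resolve_target(workstream, "metadata.approved_by")
via_attr = resolve_target(SimpleNamespace(metadata={"approved_by": "bob"}), "metadata.approved_by")
missing = resolve_target(workstream, "dependencies.*.workstation")
```

Returning a list in every case, even for a single-valued path, lets each operation treat its input uniformly.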
The built-in operations are:
- `all_eq`: every resolved value equals the expected value, or is included in
the expected list
- `any_eq`: at least one resolved value equals the expected value, or is
included in the expected list
- `none_eq`: no resolved values equal the expected value, or are included in
the expected list
- `exists`: at least one non-empty value resolves
- `count_gte`: the number of resolved values is greater than or equal to the
expected integer
- `custom`: delegates evaluation to a host-injected callable
Assertions never mutate state.
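The built-in operations can be sketched as one pure function over the list of resolved values. The names `evaluate_op` and `custom`, and the `(values, expected)` signature of the host callable, are assumptions made for illustration. So is the treatment of an empty value list, where `all_eq` and `none_eq` pass vacuously.

```python
from typing import Any, Callable, Optional


def _matches(actual: Any, expected: Any) -> bool:
    # A list-like expectation means "is included in"; otherwise plain equality.
    if isinstance(expected, (list, tuple, set)):
        return actual in expected
    return actual == expected


def _is_empty(value: Any) -> bool:
    return value is None or (isinstance(value, (str, list, tuple, dict, set)) and len(value) == 0)


def evaluate_op(
    op: str,
    values: list,
    expected: Any = None,
    custom: Optional[Callable[[list, Any], bool]] = None,
) -> bool:
    """Evaluate one built-in operation without mutating its inputs."""
    if op == "all_eq":
        return all(_matches(v, expected) for v in values)
    if op == "any_eq":
        return any(_matches(v, expected) for v in values)
    if op == "none_eq":
        return not any(_matches(v, expected) for v in values)
    if op == "exists":
        return any(not _is_empty(v) for v in values)
    if op == "count_gte":
        return len(values) >= int(expected)
    if op == "custom" and custom is not None:
        # Delegate to the host-injected callable (hypothetical signature).
        return bool(custom(values, expected))
    raise ValueError(f"unsupported operation: {op}")


tasks_closed = evaluate_op("all_eq", ["done", "todo"], ["done", "cancel"])
has_approver = evaluate_op("exists", ["", None])
enough_reviews = evaluate_op("count_gte", ["r1", "r2", "r3"], 2)
host_rule = evaluate_op("custom", [3, 4], 6, custom=lambda vals, exp: sum(vals) > exp)
```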
### FlowDef
A flow definition is a named workstation graph for one entity type.
Schema:
- `id: str`
- `entity_type: str`
- `workstations: list[WorkstationDef]`
Multiple flows may exist for the same entity type, for example a lightweight
workstream flow and a governance-heavy workstream flow.
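The three schemas above map naturally onto dataclasses. The sketch below is one possible shape, not the engine's confirmed definitions: the defaults and the `workstation()` lookup helper are assumptions, and the flow id `workstream.lightweight` is an invented example.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AssertionDef:
    id: str
    target: str
    op: str
    value: Any = None
    description: str = ""


@dataclass
class WorkstationDef:
    name: str
    entry_assertions: list[AssertionDef] = field(default_factory=list)
    exit_assertions: list[AssertionDef] = field(default_factory=list)
    description: str = ""


@dataclass
class FlowDef:
    id: str
    entity_type: str
    workstations: list[WorkstationDef] = field(default_factory=list)

    def workstation(self, name: str) -> Optional[WorkstationDef]:
        # Hypothetical convenience lookup; returns None for unknown names.
        return next((ws for ws in self.workstations if ws.name == name), None)


tasks_all_done = AssertionDef(
    id="tasks.all_done",
    target="tasks.*.status",
    op="all_eq",
    value=["done", "cancel"],
    description="All child tasks are done or canceled.",
)
flow = FlowDef(
    id="workstream.lightweight",  # illustrative id, not taken from the spec
    entity_type="workstream",
    workstations=[
        WorkstationDef(name="active", exit_assertions=[tasks_all_done], description="Work is underway."),
    ],
)
```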
### Transition
Transitions are not a first-class model. The engine derives reachable
workstations by evaluating every workstation's entry assertions against the
current object state. If the assertions for a target workstation are satisfied,
that workstation is reachable from the current workstation.
The current workstation's exit assertions determine whether the object is
blocked where it is. Unsatisfied exit assertions become blocking reasons.
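The derivation can be sketched as a single pass over the flow's workstations. This is a simplified illustration using plain dictionaries: `derive` and `check` are hypothetical names, `check` supports only `all_eq` and `any_eq`, and failed assertions are recorded by id only. The sketch reports reachability independently of whether the exit is blocked, which is an assumption about how the two results combine.

```python
def check(assertion: dict, obj: dict) -> bool:
    """Simplified stand-in evaluator: handles only all_eq and any_eq."""
    values = [obj]
    for segment in assertion["target"].split("."):
        if segment == "*":
            values = [x for v in values for x in v]
        else:
            values = [v[segment] for v in values if segment in v]
    expected = assertion["value"]
    allowed = expected if isinstance(expected, list) else [expected]
    hits = [v in allowed for v in values]
    return all(hits) if assertion["op"] == "all_eq" else any(hits)


def derive(flow: dict, obj: dict, state_key: str = "workstation") -> dict:
    by_name = {ws["name"]: ws for ws in flow["workstations"]}
    current = obj[state_key]
    # Unsatisfied exit assertions of the current workstation block the object.
    blocking = [a["id"] for a in by_name[current].get("exit_assertions", []) if not check(a, obj)]
    reachable, unreachable = [], []
    for ws in flow["workstations"]:
        failed = [a["id"] for a in ws.get("entry_assertions", []) if not check(a, obj)]
        if failed:
            unreachable.append({"workstation": ws["name"], "blocking": failed})
        else:
            reachable.append(ws["name"])
    return {
        "current_workstation": current,
        "exit_blocked": bool(blocking),
        "blocking_assertions": blocking,
        "reachable": reachable,
        "unreachable": unreachable,
    }


ALL_DONE = {"id": "tasks.all_done", "target": "tasks.*.status", "op": "all_eq", "value": ["done", "cancel"]}
flow = {
    "workstations": [
        {"name": "ready"},
        {"name": "active", "exit_assertions": [ALL_DONE]},
        {"name": "finished", "entry_assertions": [ALL_DONE]},
    ]
}
result = derive(flow, {"workstation": "active", "tasks": [{"status": "done"}, {"status": "todo"}]})
```

With one task still `todo`, the object is blocked in `active`, and `finished` is unreachable for the same reason.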
### FlowResult
Evaluation returns:
```yaml
current_workstation: active
exit_blocked: true
blocking_assertions:
  - id: tasks.all_done
    passed: false
    reason: "Expected all values at tasks.*.status to be in ['done', 'cancel']; got ['done', 'todo']."
reachable:
  - ready
  - active
unreachable:
  - workstation: finished
    blocking:
      id: tasks.all_done
      passed: false
      reason: "Expected all values at tasks.*.status to be in ['done', 'cancel']; got ['done', 'todo']."
```
```
Schema:
- `current_workstation: str`
- `exit_blocked: bool`
- `blocking_assertions: list[AssertionResult]`
- `reachable: list[str]`
- `unreachable: list[UnreachableWorkstation]`
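The result schema could be modeled as dataclasses that serialize to plain dictionaries. The field layouts of `AssertionResult` and `UnreachableWorkstation` are inferred from the example output rather than stated in the schema, so treat them as assumptions.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class AssertionResult:
    id: str
    passed: bool
    reason: str = ""


@dataclass
class UnreachableWorkstation:
    workstation: str
    blocking: AssertionResult  # assumed: a single blocking result per workstation


@dataclass
class FlowResult:
    current_workstation: str
    exit_blocked: bool
    blocking_assertions: list[AssertionResult] = field(default_factory=list)
    reachable: list[str] = field(default_factory=list)
    unreachable: list[UnreachableWorkstation] = field(default_factory=list)


failed = AssertionResult(id="tasks.all_done", passed=False, reason="not all tasks are done")
result = FlowResult(
    current_workstation="active",
    exit_blocked=True,
    blocking_assertions=[failed],
    reachable=["ready", "active"],
    unreachable=[UnreachableWorkstation(workstation="finished", blocking=failed)],
)
# asdict() recurses into nested dataclasses, giving a machine-readable payload.
payload = asdict(result)
```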
## Expressiveness Across Existing Entities
### Workstreams
Workstreams can express readiness for closure by asserting that child tasks
are `done` or `cancel`. They can express dependency blocking by checking that
all dependency workstreams have reached `finished` or `archived`.
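As an illustration, both workstream conditions reduce to `all_eq` assertions over a nested collection. The assertion id `dependencies.all_settled`, the sample data, and the simplified `check_all_eq` helper (which only understands `collection.*.field` targets) are assumptions for this sketch.

```python
def check_all_eq(assertion: dict, obj: dict) -> bool:
    # Simplified: expects a three-part target of the form "collection.*.field".
    collection, _, field_name = assertion["target"].split(".")
    values = [item[field_name] for item in obj[collection]]
    return all(v in assertion["value"] for v in values)


tasks_all_done = {
    "id": "tasks.all_done",
    "target": "tasks.*.status",
    "op": "all_eq",
    "value": ["done", "cancel"],
}
dependencies_settled = {
    "id": "dependencies.all_settled",  # hypothetical id
    "target": "dependencies.*.workstation",
    "op": "all_eq",
    "value": ["finished", "archived"],
}

workstream = {
    "workstation": "active",
    "tasks": [{"status": "done"}, {"status": "cancel"}],
    "dependencies": [{"workstation": "finished"}, {"workstation": "active"}],
}
ready_to_close = check_all_eq(tasks_all_done, workstream)
unblocked = check_all_eq(dependencies_settled, workstream)
```

Here the workstream's own tasks are closed, but one dependency is still `active`, so the dependency assertion fails.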
### Tasks
Tasks can express human intervention with the existing `needs_human` flag.
Returning from `wait` to `progress` is an entry assertion over that same
flag. Lightweight completion remains unconstrained because curator intent is
the deciding signal.
### Contributions
Contributions can reproduce the current draft, submitted, acknowledged,
accepted, merged, rejected, and withdrawn lifecycle by giving each workstation
entry assertions that describe which previous statuses may enter it. This keeps
the current lifecycle readable without baking domain transitions into engine
code.
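One way to express "which previous statuses may enter" is an `any_eq` entry assertion over the object's current `status`. The transition sets below are invented for illustration and are not the actual contribution lifecycle rules; `entry_assertion` and `reachable_from` are hypothetical helpers.

```python
# Hypothetical transition sets, chosen only to demonstrate the pattern.
ALLOWED_PREVIOUS = {
    "submitted": ["draft"],
    "acknowledged": ["submitted"],
    "withdrawn": ["draft", "submitted", "acknowledged"],
}


def entry_assertion(workstation: str, allowed_previous: list) -> dict:
    return {
        "id": f"{workstation}.entered_from",
        "target": "status",
        "op": "any_eq",
        "value": list(allowed_previous),
        "description": f"May enter {workstation} from: {', '.join(allowed_previous)}.",
    }


workstations = [
    {"name": name, "entry_assertions": [entry_assertion(name, previous)], "exit_assertions": []}
    for name, previous in ALLOWED_PREVIOUS.items()
]


def reachable_from(obj: dict) -> list:
    # any_eq over a single top-level value: the current status must be allowed.
    return [
        ws["name"]
        for ws in workstations
        if all(obj.get(a["target"]) in a["value"] for a in ws["entry_assertions"])
    ]


from_draft = reachable_from({"status": "draft"})
from_submitted = reachable_from({"status": "submitted"})
```

The transition table lives entirely in flow data, so changing the lifecycle means editing definitions rather than engine code.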
### Capability Requests
Capability requests can reproduce the existing requested, routing disputed,
accepted, in progress, ready for review, completed, rejected, and withdrawn
lifecycle the same way. Host-specific effects such as notifications remain in
the State Hub router; the flow engine only answers whether the target
workstation is reachable.
## Host Boundary
The engine owns:
- dataclasses for flow definitions and results
- target path resolution
- built-in predicate evaluation
- host-injected custom predicate dispatch
- reachable and blocked derivation
State Hub owns:
- loading domain-specific YAML flow definitions
- converting ORM entities into plain dictionaries
- migrations from enum-backed status fields to strings
- router side effects such as timestamps and notifications
- MCP tools and user-facing explanations
This boundary keeps the first implementation extractable into a standalone
`task-flow-engine` package once the API stabilizes.
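On the host side, the conversion step might look like the sketch below. The row classes are plain stand-ins for ORM entities, not SQLAlchemy models, and every name here (`TaskRow`, `WorkstreamRow`, `to_flow_object`) is hypothetical. It assumes a workstream whose status is already a string and tasks whose status is still enum-backed.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskStatus(Enum):
    # Stand-in for an enum-backed status column.
    TODO = "todo"
    DONE = "done"


@dataclass
class TaskRow:
    title: str
    status: TaskStatus


@dataclass
class WorkstreamRow:
    name: str
    status: str
    tasks: list[TaskRow] = field(default_factory=list)


def to_flow_object(row: WorkstreamRow) -> dict:
    """Flatten a host entity into the plain dictionary the engine consumes."""
    return {
        "workstation": row.status,
        "name": row.name,
        # Enum members are reduced to their string values before evaluation.
        "tasks": [{"title": t.title, "status": t.status.value} for t in row.tasks],
    }


row = WorkstreamRow(
    name="docs-refresh",
    status="active",
    tasks=[TaskRow("outline", TaskStatus.DONE), TaskRow("draft", TaskStatus.TODO)],
)
flow_object = to_flow_object(row)
```

Keeping this adapter in the host means the engine never imports ORM or enum types.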