TL;DR: After months exploring multi-agent orchestration with OpenClaw and Lobster, I hit a wall: no existing tool offered simple declarative spec + runtime-agnostic execution + first-class control flow. So I designed duckflux — a minimal YAML-based workflow DSL with loops, conditionals, parallelism, and events built in. The spec is done (v0.2), a Go CLI runner is working, and the next step is integrating it as the orchestration engine inside OpenClaw.
## Table of Contents
- Previously, on this series
- The gap that remained
- What is duckflux
- Alternatives considered
- The spec at a glance
- The Go runner
- What's next: duckflux meets OpenClaw
## Previously, on this series
This article is the third in a series about building deterministic multi-agent development pipelines. If you're joining now, here's the short version.
In the first article, I documented two months of trial and error trying to build a code → review → test pipeline with autonomous AI agents. The core thesis: LLMs are unreliable routers — they forget steps, miscount iterations, skip transitions. Orchestration must be deterministic and implemented in code, not delegated to inference. After five failed attempts (Ralph Orchestrator, OpenClaw sub-agents, a custom event bus, skill-driven self-orchestration, and plugin hooks), I found Lobster — OpenClaw's built-in workflow engine. It was close, but lacked native loop support. I contributed a pull request adding sub-workflow steps with loops.
In the second article, I zoomed out. The problem wasn't just orchestration — it was multi-agents × multi-projects × multi-providers × multi-channels. I compiled a dataset of agent configuration formats across providers, proposed the Monoswarm pattern (a monorepo layout for managing agent swarms), and identified the still-missing piece: an orchestration layer that ties agent events to workflow transitions across projects.
Both articles ended with the same conclusion: we need a proper workflow DSL.
## The gap that remained
Lobster was the closest thing to what I needed, but it was designed for linear pipelines with approval gates. My pull request added loops, but the deeper issues remained:
- No conditional branching (`if`/`then`/`else`).
- No parallel execution of multiple agents.
- No event system for inter-agent coordination.
- No typed expressions — conditions were shell commands returning exit codes.
- Tied to OpenClaw's runtime — not portable to other environments.
I looked at the broader landscape:
| Tool | Where it falls short |
|---|---|
| Argo Workflows | Turing-complete YAML disguised as config. A conditional loop requires template recursion, manual iteration counters, and string-interpolated type casting. |
| GitHub Actions | No conditional loops. Workarounds require unrolling or recursive reusable workflows. |
| Temporal / Inngest | Code-first — Go/TS/Python SDKs. The code IS the spec. No declarative layer. |
| Airflow / Prefect | DAGs are acyclic by definition — conditional loops are architecturally impossible. |
| n8n / Make | Visual-first, JSON-heavy specs. Loop constructs require JavaScript function nodes. Specs are unreadable as text. |
| Lobster | Linear pipelines with approval gates. No native loops, no parallelism, no conditionals. |
The gap was clear: no existing tool combines a simple declarative spec + runtime-agnostic execution + first-class control flow (loops, conditionals, parallelism) + events.
So I built one.
## What is duckflux
duckflux is a minimal, deterministic, runtime-agnostic DSL for orchestrating workflows through declarative YAML.
The design principles are deliberate:
- **Readable in 5 seconds** — any developer understands the flow by glancing at the YAML.
- **Minimal by default** — features are only added when absolutely necessary.
- **Convention over configuration** — sensible defaults everywhere.
- **Runtime-agnostic** — the DSL defines *what* happens and in *what order*; the runtime decides *how*.
- **Reuse proven standards** — expressions use Google CEL (used in Kubernetes, Firebase, Envoy), schemas use JSON Schema, the format is YAML.
The simplest possible workflow:
```yaml
flow:
  - as: greet
    type: exec
    run: echo "Hello, duckflux!"
```
That's it. One flow, one step. No boilerplate, no mandatory fields beyond what's needed.
A more realistic example — a code review pipeline with a retry loop, parallel checks, conditional deployment, and event notification:
```yaml
id: code-review-pipeline
name: Code Review Pipeline

defaults:
  timeout: 10m

inputs:
  repoUrl:
    type: string
    format: uri
    required: true
  maxRounds:
    type: integer
    default: 3

participants:
  coder:
    type: agent
    model: claude-sonnet-4-20250514
    tools: [read, write, bash]
    onError: retry
    retry:
      max: 2
      backoff: 5s
  reviewer:
    type: agent
    model: claude-sonnet-4-20250514
    tools: [read]
    output:
      approved:
        type: boolean
        required: true
      score:
        type: integer

flow:
  - coder

  - loop:
      until: reviewer.output.approved == true
      max: input.maxRounds
      steps:
        - reviewer
        - coder:
            when: reviewer.output.approved == false

  - parallel:
      - as: tests
        type: exec
        run: npm test
        onError: skip
      - as: lint
        type: exec
        run: npm run lint
        onError: skip

  - if:
      condition: tests.status == "success" && lint.status == "success"
      then:
        - as: deploy
          type: exec
          run: ./deploy.sh
        - as: notify
          type: emit
          event: "deploy.completed"
          payload:
            approved: reviewer.output.approved
            score: reviewer.output.score
      else:
        - as: notifyFailure
          type: emit
          event: "deploy.failed"
          payload:
            tests: tests.status
            lint: lint.status

output:
  approved: reviewer.output.approved
  score: reviewer.output.score
```
Compare this to the same scenario in Argo Workflows (~40 lines of template recursion), GitHub Actions (~50+ lines with unrolled iterations), or Temporal (~35 lines of Go code that requires compilation and a server).
## Alternatives considered
Before landing on a custom YAML format, I evaluated two other approaches:
**Extending Argo Workflows.** Argo's YAML is expressive, but its power comes from 6+ years of incremental feature additions. A conditional loop in Argo requires template recursion, manual iteration counters, and string-interpolated type casting — 13+ lines for what should be 6. The complexity is the feature, not a bug, and that's the problem.

**Mermaid as executable spec.** Mermaid sequence diagrams already have `loop`, `par`, and `alt` constructs. The DX for reading and writing is excellent, and diagrams render natively in GitHub. However, extending Mermaid for real workflow concerns (retry policies, timeouts, error handling, typed variables) requires hacking `Note` blocks for config and `$var` for expressions — creating a custom parser as proprietary as a new YAML format, just disguised as something familiar.

**Custom minimal YAML (chosen).** A new format, intentionally constrained, inspired by Mermaid's visual clarity but with the extensibility and tooling ecosystem of YAML. The tradeoff: a new DSL to learn, but one designed to be readable in 5 seconds and writable in 5 minutes.
## The spec at a glance
The full spec is at github.com/duckflux/spec. Here's a walkthrough of the key features.
### Participants

Participants are the building blocks. Each has a `type` that determines its behavior:
| Type | Description |
|---|---|
| `exec` | Shell command |
| `http` | HTTP request |
| `mcp` | MCP server delegation |
| `workflow` | Sub-workflow (composition) |
| `emit` | Fire an event to the event hub |
| `wait` | Pause execution until an event, timeout, or polling condition |
Participants can be defined in a reusable `participants` block or inline in the flow:
```yaml
# Reusable
participants:
  build:
    type: exec
    run: npm run build

flow:
  - build

  # Inline (single-use)
  - as: notify
    type: http
    url: https://hooks.slack.com/services/...
    method: POST
```
### Control flow

**Loops** — repeat until a CEL condition is true or a maximum number of iterations is reached:
```yaml
- loop:
    until: reviewer.output.approved == true
    max: 3
    steps:
      - coder
      - reviewer
```
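On the runner side, a loop like this reduces to ordinary deterministic code — the counting and the exit check live in the runtime, never in an LLM. A minimal Go sketch (hypothetical `Loop` type, not the actual runner internals):

```go
package main

import "fmt"

// Loop mirrors the DSL's loop block: a bounded iteration with an
// exit condition checked after each pass over the steps.
type Loop struct {
	Max   int
	Until func() bool // stands in for the CEL `until` expression
	Steps []func() error
}

// Run executes the body until Until reports true or Max iterations
// have completed, returning how many iterations actually ran.
func (l *Loop) Run() (iterations int, err error) {
	for i := 0; i < l.Max; i++ {
		for _, step := range l.Steps {
			if err := step(); err != nil {
				return i + 1, err
			}
		}
		iterations = i + 1
		if l.Until() {
			break
		}
	}
	return iterations, nil
}

func main() {
	approved, round := false, 0
	l := &Loop{
		Max:   3,
		Until: func() bool { return approved },
		Steps: []func() error{
			func() error { round++; return nil },               // "coder"
			func() error { approved = round >= 2; return nil }, // "reviewer"
		},
	}
	n, _ := l.Run()
	fmt.Println(n) // prints 2: the reviewer approves on round 2, under the max of 3
}
```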
**Parallel** — run steps concurrently:
```yaml
- parallel:
    - as: lint
      type: exec
      run: npm run lint
    - as: test
      type: exec
      run: npm test
```
**Conditionals** — branch based on CEL expressions:
```yaml
- if:
    condition: tests.status == "success"
    then:
      - deploy
    else:
      - rollback
```
**Guards** — skip a single step conditionally:
```yaml
- deploy:
    when: reviewer.output.approved == true
```
**Wait** — pause for an event, a timeout, or a polling condition:
```yaml
- wait:
    event: "approval.received"
    match: event.requestId == submitForApproval.output.id
    timeout: 24h
```
### Expressions with Google CEL
All conditions, input mappings, and output mappings use Google CEL. CEL is non-Turing-complete, sandboxed (no I/O, no side effects), type-checked at parse time, and has a familiar C/JS/Python-like syntax:
```yaml
- if:
    condition: reviewer.output.approved == false && loop.iteration < 3
```
CEL was chosen over JavaScript eval (security surface, runtime dependency), custom mini-DSLs (implementation burden), and JSONPath/JMESPath (poor logic support).
### Events

`emit` publishes events, `wait` subscribes. Events propagate both internally (within the workflow) and externally:
```yaml
- as: notifyProgress
  type: emit
  event: "task.progress"
  payload:
    taskId: input.taskId
    status: coder.output.status
  ack: true # block until delivery confirmed
```
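Under the hood, emit/wait is plain publish/subscribe. A toy in-process hub in Go makes the semantics concrete — illustrative only, since the spec deliberately leaves the transport to the runtime:

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a minimal in-process event hub: Emit publishes a payload,
// Wait returns a channel that a workflow step can block on.
type Hub struct {
	mu   sync.Mutex
	subs map[string][]chan map[string]any
}

func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan map[string]any)}
}

// Wait registers interest in an event name and returns a channel
// that receives the next matching payload.
func (h *Hub) Wait(event string) <-chan map[string]any {
	ch := make(chan map[string]any, 1)
	h.mu.Lock()
	h.subs[event] = append(h.subs[event], ch)
	h.mu.Unlock()
	return ch
}

// Emit delivers the payload to every current waiter; with ack: true
// a real runner would block here until each delivery is confirmed.
func (h *Hub) Emit(event string, payload map[string]any) {
	h.mu.Lock()
	waiters := h.subs[event]
	h.subs[event] = nil
	h.mu.Unlock()
	for _, ch := range waiters {
		ch <- payload
	}
}

func main() {
	hub := NewHub()
	done := hub.Wait("task.progress")
	hub.Emit("task.progress", map[string]any{"status": "working"})
	fmt.Println((<-done)["status"]) // prints working
}
```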
### Error handling

Error handling is configurable per participant or per flow-step invocation, with four strategies:
```yaml
participants:
  coder:
    type: agent
    onError: retry
    retry:
      max: 3
      backoff: 2s
      factor: 2 # exponential: 2s, 4s, 8s
  deploy:
    type: exec
    onError: notify # redirect to another participant as fallback
```
### Inputs and outputs

Everything is a string by default, like stdin/stdout. Schema is opt-in via JSON Schema (written in YAML):
```yaml
inputs:
  repoUrl:
    type: string
    format: uri
    required: true
  branch:
    type: string
    default: "main"

output:
  approved: reviewer.output.approved
  score: reviewer.output.score
```
### JSON Schema for editor support
A JSON Schema ships with the spec, giving you autocomplete and validation in VS Code for free:
```json
{
  "yaml.schemas": {
    "./duckflux.schema.json": "*.flow.yaml"
  }
}
```
## The Go runner
The spec is useless without an executor. The duckflux runner is a cross-platform CLI written in Go.
Why Go: official Google CEL implementation (cel-go), single static binary with zero runtime dependencies, native concurrency via goroutines (maps directly to parallel:), and ecosystem fit — virtually every workflow and infrastructure tool is written in Go (Argo, Temporal, Docker, Kubernetes, Terraform).
### Installation
```bash
git clone https://github.com/duckflux/runner.git
cd runner
make build
./bin/duckflux version
```
### Commands
Run a workflow:
```bash
duckflux run deploy.flow.yaml --input branch=main --input env=staging
```
Lint (validate without executing):
```bash
duckflux lint deploy.flow.yaml
```
Validate inputs against schema:
```bash
duckflux validate deploy.flow.yaml --input branch=main
```
## What's next: duckflux meets OpenClaw
The entire journey — from Protoagent to Lobster to Monoswarm — has been converging toward one goal: a deterministic orchestration engine for multi-agent workflows inside OpenClaw.
### The architecture
```
┌─────────────────────────────────────────────┐
│                Orchestrator                 │
│                                             │
│      /work [workflow] [project] [task]      │
└──────────────────────┬──────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────┐
│           Canonical Agents Plugin           │
│                                             │
│       Watch + hot-reload of AGENTS.md       │
│    Dynamically generates OpenClaw config    │
└──────────────────────┬──────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────┐
│              OpenClaw Gateway               │
│                                             │
│        Webhooks + Sandboxing + Tools        │
└──────────────────────┬──────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────┐
│        Per-Task Containers (Docker)         │
│                                             │
│         Git worktrees as filesystem         │
└─────────────────────────────────────────────┘
```
### Two plugins
The integration relies on two OpenClaw plugins:
**Canonical Agents Plugin** — watches a directory of `AGENTS.md` files (YAML frontmatter for model/tools/sandbox config + a markdown body for the system prompt) and dynamically generates OpenClaw's agent configuration, with hot-reload on changes. This is the Monoswarm pattern's `.ai/` directory made executable.
**Orchestrator Plugin** — the duckflux runner embedded as an OpenClaw plugin. Triggered by a command like `/work code-review project-a TASK-123`, it reads a duckflux workflow file, clones canonical agents per project, manages git worktrees, and executes the workflow — where agent participants map to OpenClaw webhook calls with isolated session keys.
The details of each plugin's implementation will be a future article. For now, the important thing is how this changes the picture.
### What this replaces
With duckflux as the orchestration engine:
- Lobster is replaced by a more expressive workflow DSL with native loops, conditionals, parallelism, and events.
- Plugin hooks for routing are replaced by declarative `emit`/`wait` in the workflow spec.
- Shell exit codes for conditions are replaced by type-checked CEL expressions.
- The custom orchestration plugin described in article two becomes the duckflux runner itself, embedded in OpenClaw.
The LLMs do what they're good at: writing code, analyzing code, making decisions. duckflux does what code is good at: sequencing, counting, routing, retrying.
Links:
- duckflux spec — Full DSL specification
- duckflux runner — Go CLI runner
- Article 1 — Building a deterministic pipeline with Lobster
- Article 2 — Multi-agents × multi-projects × multi-providers × multi-channels