Agentic Dataflow-Oriented Robotic Architecture
100% Rust framework for real-time robotics and AI. 10-17x faster than ROS2 Python, with zero-copy shared memory, 4 communication patterns, and production-grade fault tolerance.
Core Capabilities
Why dora 1.0
Zero-Copy Performance
10-17x faster than ROS2 Python via shared memory IPC and the Apache Arrow columnar format. Latency stays flat for payloads from 4KB to 4MB with the Zenoh SHM data plane.
4 Communication Patterns
Beyond pub/sub. Built-in support for Topic, Service (request/reply), Action (goal/feedback/result), and Streaming (session/segment/chunk) via metadata conventions.
Multi-Language SDK
Write nodes in Rust, Python, C, or C++ with native APIs — not wrappers. Mix languages freely in a single dataflow with zero interop overhead.
Fault Tolerance
Per-node restart policies (never/on-failure/always), exponential backoff, health monitoring, circuit breakers, and coordinator state persistence via redb.
Record & Replay
Capture dataflow messages to .dora files. Replay offline at any speed with node substitution for regression testing and offline debugging.
Cluster Management
Distributed deployment via SSH with label scheduling, rolling upgrades, and auto-recovery. Single CLI manages local dev and multi-machine production.
Dynamic Topology
Add and remove nodes from running dataflows via CLI without restarting. Connect and disconnect node ports on the fly for live reconfiguration.
Comprehensive Observability
Built-in OpenTelemetry for structured logging, metrics, and distributed tracing. Zero-setup trace viewing, live topic inspection, and resource monitoring TUI.
Agentic Engineering
Built and maintained with autonomous AI agents driving code generation, reviews, refactoring, testing, and commits.
Architecture
Designed for scale
Four-layer architecture: CLI orchestrates the Coordinator via WebSocket, which manages Daemons across machines, each running Nodes and Operators. Zenoh provides the zero-copy data plane.
dora-cli — build, run, manage
dora-coordinator — WebSocket control
dora-daemon — per-machine manager
dora-runtime — operator engine
zenoh — zero-copy data plane
Communication Patterns
Four ways to communicate
Beyond pub/sub. dora provides Topic, Service, Action, and Streaming patterns via well-known metadata keys. No daemon or YAML changes required.
Topic
Fire-and-forget pub/sub for sensor data, periodic status, and events
Service
Request/reply with UUID correlation. Client sends request, server returns exactly one response
Action
Long-running goals with periodic feedback and cancellation support
Streaming
Continuous data with session/segment/chunk metadata and queue flushing for real-time interruption
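The UUID correlation behind the Service pattern can be sketched in plain Python without any dora APIs. This is a minimal illustration only: the metadata key `request_id` and the helper names are hypothetical, and the well-known keys dora actually uses may differ.

```python
import uuid

# Hypothetical metadata key; the real key name used by dora may differ.
REQUEST_ID = "request_id"


def make_request(payload):
    """Client side: tag an outgoing request with a fresh UUID."""
    return {"metadata": {REQUEST_ID: str(uuid.uuid4())}, "value": payload}


def make_response(request, result):
    """Server side: copy the request's UUID into the single response."""
    request_id = request["metadata"][REQUEST_ID]
    return {"metadata": {REQUEST_ID: request_id}, "value": result}


class PendingRequests:
    """Client-side table that matches responses to outstanding requests."""

    def __init__(self):
        self._pending = {}

    def track(self, request):
        self._pending[request["metadata"][REQUEST_ID]] = request

    def resolve(self, response):
        # Returns the original request, or None for an unknown or repeated reply.
        return self._pending.pop(response["metadata"][REQUEST_ID], None)
```

Because the identifier travels in message metadata, a correlation scheme like this needs nothing beyond ordinary message passing, which fits the claim that no daemon or YAML changes are required.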
Distributed
Run anywhere, scale everywhere
Local shared memory between co-located nodes, automatic Zenoh pub-sub for cross-machine communication. SSH-based cluster management with label scheduling and rolling upgrades.
WebSocket control plane
Coordinator manages all daemons via persistent WebSocket connections
Cluster deployment
Deploy across machines with dora cluster up — no Kubernetes needed
Zero-copy IPC
Shared memory for co-located nodes, Zenoh for cross-machine traffic; the shared Arrow format avoids serialization overhead
Observability
See everything
Built-in OpenTelemetry for structured logging, metrics, and distributed tracing. Monitor your entire fleet from the CLI.
Live monitoring
dora top
Per-node CPU, memory, queue depth, network I/O, restart count, and health status across all machines in a terminal TUI.
Distributed tracing
dora trace list/view
Zero-setup trace inspection. View distributed traces across nodes without deploying Jaeger or Zipkin.
Data inspection
dora topic echo/hz/info
Print live data, monitor publish frequencies, and inspect schemas and bandwidth — all from the command line.
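As a rough idea of what a publish-frequency monitor computes, the sketch below derives an average rate from message arrival times. It is an assumption about the general technique, not a description of how `dora topic hz` is implemented; the function name is hypothetical.

```python
def publish_rate_hz(arrival_times_s):
    """Average message rate over a window of arrival timestamps (seconds)."""
    if len(arrival_times_s) < 2:
        return 0.0  # a rate needs at least one interval
    span = arrival_times_s[-1] - arrival_times_s[0]
    # N messages span N-1 intervals.
    return (len(arrival_times_s) - 1) / span if span > 0 else 0.0
```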
Record & Replay
Capture, replay, debug
Record all dataflow messages to .dora files. Replay offline at any speed with node substitution for regression testing.
Capture live messages from any running dataflow to .dora files. Selective per-node or full-graph recording.
Replay recorded data at any speed — 0.1x for step-through debugging or 10x for fast regression tests.
Swap out nodes during replay to test new logic against recorded real-world inputs.
Time-Travel Debugging
Record once, debug forever
Capture data in the field, replay endlessly in the lab. Step through exact message sequences, swap individual nodes to test new algorithms against real-world data, and run regression suites without deploying to hardware.
Offline hardware debugging
Debug entire sensor-to-actuator pipelines without running physical hardware. Replay exact camera frames, lidar scans, and IMU data from recordings.
CI regression testing
Record golden datasets, replay in CI. Automatically detect when algorithm changes produce different outputs. Integrate with dora replay --assert.
Timestamp-accurate replay
Messages replayed with original timing relationships preserved. Variable speed playback from 0.1x slow-motion to 100x fast-forward.
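Variable-speed replay that preserves timing relationships can be pictured as scaling the gaps between recorded timestamps. The sketch below assumes timestamps in seconds and a hypothetical helper name; it does not read .dora files or reflect dora's actual replay code.

```python
def replay_schedule(timestamps_s, speed=1.0):
    """Return the wait before each message when replaying at `speed`.

    Gaps between recorded timestamps are divided by `speed`, so their
    relative proportions are the same at any playback rate.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    waits = [0.0]  # the first message is emitted immediately
    for previous, current in zip(timestamps_s, timestamps_s[1:]):
        waits.append((current - previous) / speed)
    return waits[: len(timestamps_s)]
```

At `speed=0.1` every gap becomes ten times longer, which suits step-through debugging; at `speed=100` the same recording finishes in one hundredth of the time.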
Fault Tolerance
Built to recover
Per-node restart policies, exponential backoff, health monitoring, and persistent coordinator state. Your dataflow keeps running even when individual nodes fail.
Restart policies
Choose never, on-failure, or always per node. Configurable in YAML with max retries and time windows.
Coordinator persistence
State persisted via redb. Restart the coordinator and all daemons reconnect automatically with their prior state intact.
Graceful degradation
Individual node failures are isolated. The rest of your dataflow continues processing while failed nodes recover.
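The interaction between restart policies, retry limits, and exponential backoff can be sketched as two small functions. The policy names come from this page; the parameter names, default delays, and the exact retry semantics are assumptions for illustration.

```python
def should_restart(policy, exit_code, restarts_so_far, max_retries):
    """Decide whether a node is restarted under never/on-failure/always."""
    if policy == "never" or restarts_so_far >= max_retries:
        return False
    if policy == "on-failure":
        return exit_code != 0  # restart only after a non-zero exit
    if policy == "always":
        return True
    raise ValueError(f"unknown restart policy: {policy!r}")


def backoff_delay(attempt, base_s=1.0, factor=2.0, cap_s=30.0):
    """Delay before restart number `attempt` (0-based), growing up to a cap."""
    return min(cap_s, base_s * factor ** attempt)
```

With these assumed defaults the delays run 1 s, 2 s, 4 s, 8 s and so on until they reach the 30 s cap, so a crash-looping node does not monopolize the machine.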
Performance
Fast where it matters
Sub-millisecond latency with shared memory, zero-copy Arrow buffers, and four communication patterns. The same zero-copy speed as dora 0.x, with full-featured orchestration on top.
| Dimension | dora 1.0 | dora 0.x | ROS2 |
|---|---|---|---|
| Latency (SHM) | ~500 µs p50 | ~500 µs p50 | 1-10 ms |
| Copy overhead | Zero (Arrow) | Zero (Arrow) | CDR serialization |
| Comm patterns | 4 | 1 (Topic) | 4 |
| Fault tolerance | Built-in | None | Basic |
| Record/Replay | Built-in | None | rosbag |
| Cluster mgmt | Built-in | None | Manual |
| Dynamic topology | Yes | No | Partial |
| Observability | OpenTelemetry | Basic logs | ROS2 logging |
| Languages | Rust/Py/C/C++ | Rust/Py/C/C++ | C++/Py |
Languages
Write nodes in your language
Each node is an independent process. Use the language that fits your task best — Rust for performance, Python for AI models, C/C++ for hardware drivers.
Rust
use dora_node::prelude::*;

#[dora_main]
fn main() -> eyre::Result<()> {
    let node = DoraNode::init()?;
    for event in node {
        match event {
            Event::Input { id, data, .. } => {
                // process data
            }
            _ => {}
        }
    }
    Ok(())
}

Python
from dora import Node

node = Node()
for event in node:
    if event["type"] == "INPUT":
        data = event["value"]
        node.send_output("out", data)

C
#include "dora/node_api.h"

int main() {
    void *node = dora_init_node();
    void *event = dora_next_event(node);
    while (event != NULL) {
        char *id = dora_event_id(event);
        /* process event */
        dora_free_event(event);
        event = dora_next_event(node);
    }
    dora_drop_node(node);
}

C++
#include "dora/node.hpp"

int main() {
    auto node = dora::Node();
    for (auto event : node) {
        if (event.type == dora::INPUT) {
            auto data = event.data();
            node.send("out", data);
        }
    }
    return 0;
}

Data Layer
Communication patterns
Four built-in patterns for every inter-node communication need. All patterns work identically across local shared memory and distributed Zenoh transport.
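Queue flushing in the Streaming pattern can be pictured as a consumer-side queue that discards stale chunks when a new session begins. This is a hedged sketch: the metadata keys `session_id`, `segment_id`, and `chunk_id`, the class name, and the rule that a new session triggers the flush are all assumptions, not dora's documented behavior.

```python
from collections import deque


class StreamQueue:
    """Pending chunks for one consumer, flushed when a new session starts."""

    def __init__(self):
        self._chunks = deque()
        self._session = None

    def push(self, metadata, value):
        # A chunk from a different session interrupts the current one:
        # drop everything still queued for the old session.
        if metadata["session_id"] != self._session:
            self._chunks.clear()
            self._session = metadata["session_id"]
        self._chunks.append((metadata["segment_id"], metadata["chunk_id"], value))

    def pop(self):
        """Return the oldest queued chunk, or None when the queue is empty."""
        return self._chunks.popleft() if self._chunks else None

    def __len__(self):
        return len(self._chunks)
```

In a voice assistant, for example, a user speaking over the current reply would open a new session, and any speech chunks still waiting to be played would be dropped rather than played late.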
Comparison
dora 1.0 vs dora 0.x vs ROS2
dora 1.0 builds on dora's zero-copy data plane and adds the orchestration, fault tolerance, and observability you need for production deployments.
| Dimension | dora 1.0 | dora 0.x | ROS2 |
|---|---|---|---|
| What it is | Dataflow framework | Dataflow framework | Robotics middleware |
| Latency (SHM) | ~500 µs p50 | ~500 µs p50 | 1-10 ms |
| Comm patterns | 4 (Topic/Service/Action/Stream) | 1 (Topic) | 4 |
| Fault tolerance | Built-in restart policies | None | Basic |
| Record/Replay | Built-in (.dora) | None | rosbag |
| Cluster management | SSH + labels | None | Manual |
| Dynamic topology | Yes | No | Partial |
| Observability | OpenTelemetry | Basic logs | ROS2 logging |
| Languages | Rust/Py/C/C++ | Rust/Py/C/C++ | C++/Py |
| Zero-copy | Arrow + Zenoh SHM | Arrow + Zenoh SHM | CDR serialization |
Use Cases
Built for real workloads
Autonomous Robots
Perception, planning, and control nodes connected via zero-copy shared memory. Add new sensors or actuators without rewriting existing nodes.
AI Pipelines
Chain camera, object detection, LLM reasoning, and action nodes into a single dataflow. Python for AI models, Rust for real-time control.
Voice Assistants
Streaming ASR, LLM generation, and TTS with queue flushing for real-time interruption. Action pattern for long-running conversations.
Distributed Simulation
Run simulation nodes across a cluster. Record and replay scenarios at any speed for regression testing without physical hardware.
Industrial Automation
PLC interfaces, sensor fusion, and control logic with fault tolerance. Exponential backoff and health monitoring keep production lines running.
Research & Prototyping
Define your dataflow in YAML, mix languages freely, and iterate fast. Record real data once and replay offline for algorithm development.
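A dataflow definition is a graph, so a simple consistency check is possible before running it. The sketch below works on an already-parsed definition (a Python dict) and assumes inputs are written as `node/output`, the form used in this page's Get Started example; the function name is hypothetical and this is not part of the dora CLI.

```python
def find_dangling_inputs(dataflow):
    """Return (node_id, input_name, source) for inputs with no matching output."""
    declared = {
        f"{node['id']}/{output}"
        for node in dataflow["nodes"]
        for output in node.get("outputs", [])
    }
    return [
        (node["id"], name, source)
        for node in dataflow["nodes"]
        for name, source in node.get("inputs", {}).items()
        if source not in declared
    ]


# Same shape as the webcam/detector/plot example on this page.
example = {
    "nodes": [
        {"id": "webcam", "outputs": ["image"]},
        {"id": "detector", "inputs": {"image": "webcam/image"}, "outputs": ["bbox"]},
        {"id": "plot", "inputs": {"image": "webcam/image", "bbox": "detector/bbox"}},
    ]
}
dangling = find_dangling_inputs(example)
```

An empty result means every input is wired to an output that some node declares.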
Get Started
Deploy in minutes
Install the CLI, define your nodes in YAML, and start processing data. No Docker, no Kubernetes, no complex setup.
# Install
cargo install dora-cli

# Create a dataflow
cat > dataflow.yml << 'EOF'
nodes:
  - id: webcam
    path: dora-hub/webcam
    outputs: [image]
  - id: detector
    path: dora-hub/yolov8
    inputs:
      image: webcam/image
    outputs: [bbox]
  - id: plot
    path: dora-hub/opencv-plot
    inputs:
      image: webcam/image
      bbox: detector/bbox
EOF

# Run it
dora start dataflow.yml

Questions
FAQ
What's new in dora 1.0?
dora 1.0 is a major release that adds service (request/reply), action (goal/feedback/result), and streaming communication patterns, a WebSocket-based coordinator, fault tolerance with automatic restart and state persistence, record/replay, cluster management, dynamic topology, and comprehensive observability. dora 0.x supported only topic pub/sub with a TCP coordinator.
How is dora different from ROS2?
dora is 10-17x faster than ROS2 Python thanks to zero-copy shared memory and Apache Arrow. It uses simple YAML for dataflow definitions instead of code generation, supports Rust/Python/C/C++ without IDL files, and provides built-in cluster management, record/replay, and a single CLI for the full lifecycle.
What languages does dora support?
dora provides native APIs for Rust, Python (via PyO3), C, and C++ (via CXX bridge). All languages share the same Apache Arrow data format with zero serialization overhead, and you can mix languages freely in a single dataflow.
What communication patterns are available?
dora supports four patterns: Topic (fire-and-forget pub/sub), Service (request/reply with UUID correlation), Action (long-running goals with periodic feedback and cancellation), and Streaming (continuous data with session/segment/chunk metadata and queue flushing for interruption).
Is dora 1.0 production-ready?
Yes. dora 1.0 includes fault tolerance with configurable restart policies, persistent coordinator state via redb, record/replay for debugging, OpenTelemetry observability, distributed deployment with cluster management, and comprehensive CI including format, lint, test, and E2E checks on all platforms.
Can I upgrade from dora 0.x to 1.0?
Existing topic-based dataflows are largely backward compatible. The core pub/sub model is the same, and YAML dataflow definitions follow the same structure. You can incrementally adopt new features like services, actions, and fault tolerance without rewriting existing nodes.
Build your first dataflow in 5 minutes
Install the CLI, define your nodes in YAML, and start processing data.