dora

Agentic Dataflow-Oriented Robotic Architecture

100% Rust framework for real-time robotics and AI. 10-17x faster than ROS2 with zero-copy shared memory, 4 communication patterns, and production-grade fault tolerance.

10-17x faster | 4 patterns | 4 languages | zero-copy | open source

Core Capabilities

Why dora 1.0

Zero-Copy Performance

10-17x faster than ROS2 via shared memory IPC and the Apache Arrow columnar format. Latency stays flat from 4 KB to 4 MB payloads with the Zenoh SHM data plane.

4 Communication Patterns

Beyond pub/sub. Built-in support for Topic, Service (request/reply), Action (goal/feedback/result), and Streaming (session/segment/chunk) via metadata conventions.

Multi-Language SDK

Write nodes in Rust, Python, C, or C++ with native APIs — not wrappers. Mix languages freely in a single dataflow with zero interop overhead.

Fault Tolerance

Per-node restart policies (never/on-failure/always), exponential backoff, health monitoring, circuit breakers, and coordinator state persistence via redb.

Record & Replay

Capture dataflow messages to .dora files. Replay offline at any speed with node substitution for regression testing and offline debugging.

Cluster Management

Distributed deployment via SSH with label scheduling, rolling upgrades, and auto-recovery. Single CLI manages local dev and multi-machine production.

Dynamic Topology

Add and remove nodes from running dataflows via CLI without restarting. Connect and disconnect node ports on the fly for live reconfiguration.

Comprehensive Observability

Built-in OpenTelemetry for structured logging, metrics, and distributed tracing. Zero-setup trace viewing, live topic inspection, and resource monitoring TUI.

Agentic Engineering

Built and maintained with autonomous AI agents driving code generation, reviews, refactoring, testing, and commits.

Architecture

Designed for scale

Four-layer architecture: CLI orchestrates the Coordinator via WebSocket, which manages Daemons across machines, each running Nodes and Operators. Zenoh provides the zero-copy data plane.

dora-cli — build, run, manage

dora-coordinator — WebSocket control

dora-daemon — per-machine manager

dora-runtime — operator engine

zenoh — zero-copy data plane

Communication Patterns

Four ways to communicate

Beyond pub/sub. dora provides Topic, Service, Action, and Streaming patterns via well-known metadata keys. No daemon or YAML changes required.

Topic

Fire-and-forget pub/sub for sensor data, periodic status, and events

Service

Request/reply with UUID correlation. Client sends request, server returns exactly one response
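
The correlation step can be sketched in plain Python, without any dora API. The `request_id` metadata key and all helper names here are hypothetical and chosen only for illustration; dora's actual well-known keys may differ.

```python
import uuid

def make_request(payload):
    """Client side: tag the request with a fresh UUID (hypothetical `request_id` key)."""
    return {"metadata": {"request_id": str(uuid.uuid4())}, "payload": payload}

def handle_request(request):
    """Server side: build one response and echo the request's correlation ID."""
    return {
        "metadata": {"request_id": request["metadata"]["request_id"]},
        "payload": request["payload"] * 2,
    }

class PendingRequests:
    """Tracks in-flight requests so responses can arrive in any order."""

    def __init__(self):
        self._pending = {}

    def send(self, payload):
        request = make_request(payload)
        self._pending[request["metadata"]["request_id"]] = payload
        return request

    def resolve(self, response):
        # pop() enforces "exactly one response": a second reply raises KeyError.
        original = self._pending.pop(response["metadata"]["request_id"])
        return original, response["payload"]

client = PendingRequests()
first, second = client.send(10), client.send(21)
# Responses arrive out of order but still match the right request.
results = [client.resolve(handle_request(second)), client.resolve(handle_request(first))]
print(results)  # [(21, 42), (10, 20)]
```

Keying the pending table by UUID is what lets several requests stay in flight on the same channel at once.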

Action

Long-running goals with periodic feedback and cancellation support
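
A minimal sketch of the goal/feedback/result lifecycle, assuming a generator stands in for the server node and a `threading.Event` stands in for the cancel message. None of these names come from dora; they only illustrate the message sequence.

```python
import threading

def run_action(goal_steps, cancel_event):
    """Yield ("feedback", progress) per step, then exactly one ("result", status)."""
    for step in range(1, goal_steps + 1):
        if cancel_event.is_set():
            yield ("result", "cancelled")
            return
        yield ("feedback", step / goal_steps)
    yield ("result", "succeeded")

cancel = threading.Event()
messages = []
for kind, value in run_action(4, cancel):
    messages.append((kind, value))
    if kind == "feedback" and value >= 0.5:
        cancel.set()  # the client cancels once the goal is half done

print(messages)  # [('feedback', 0.25), ('feedback', 0.5), ('result', 'cancelled')]
```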

Streaming

Continuous data with session/segment/chunk metadata and queue flushing for real-time interruption
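
Queue flushing can be pictured as dropping every queued chunk that belongs to an older session. The sketch below assumes integer `session`, `segment`, and `chunk` fields and a hypothetical `StreamQueue` class; it is not dora's implementation.

```python
from collections import deque

class StreamQueue:
    """Holds streamed chunks tagged with session/segment/chunk metadata."""

    def __init__(self):
        self._chunks = deque()

    def push(self, session, segment, chunk, data):
        self._chunks.append(
            {"session": session, "segment": segment, "chunk": chunk, "data": data}
        )

    def flush_before(self, session):
        """Drop queued chunks from sessions older than `session`; return the count."""
        before = len(self._chunks)
        self._chunks = deque(c for c in self._chunks if c["session"] >= session)
        return before - len(self._chunks)

    def pop(self):
        return self._chunks.popleft()

queue = StreamQueue()
for i in range(3):
    queue.push(session=1, segment=0, chunk=i, data=f"old-{i}")

# The user interrupts: a new session starts and stale chunks are discarded.
dropped = queue.flush_before(session=2)
queue.push(session=2, segment=0, chunk=0, data="new-0")
print(dropped, queue.pop()["data"])  # 3 new-0
```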

Distributed

Run anywhere, scale everywhere

Local shared memory between co-located nodes, automatic Zenoh pub-sub for cross-machine communication. SSH-based cluster management with label scheduling and rolling upgrades.
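
Label scheduling boils down to matching each node's required labels against the labels each machine advertises. The sketch below assumes a simple first-fit rule and made-up machine and label names; the scheduler dora actually uses may rank candidates differently.

```python
def schedule(nodes, machines):
    """Place each node on the first machine whose labels include all required labels."""
    placement = {}
    for node_id, required in nodes.items():
        for machine_id, labels in machines.items():
            if required <= labels:  # subset test: every required label is present
                placement[node_id] = machine_id
                break
        else:
            raise LookupError(f"no machine satisfies {sorted(required)} for {node_id}")
    return placement

# Hypothetical cluster: one edge box with a camera, one GPU server.
machines = {"edge-1": {"camera", "arm64"}, "gpu-1": {"gpu", "x86_64"}}
nodes = {"webcam": {"camera"}, "detector": {"gpu"}, "plot": set()}

placement = schedule(nodes, machines)
print(placement)  # {'webcam': 'edge-1', 'detector': 'gpu-1', 'plot': 'edge-1'}
```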

6013

WebSocket control plane

Coordinator manages all daemons via persistent WebSocket connections

SSH

Cluster deployment

Deploy across machines with dora cluster up — no Kubernetes needed

Zero-copy IPC

Shared memory between co-located nodes avoids copying and serialization overhead; cross-machine traffic travels over Zenoh

Observability

See everything

Built-in OpenTelemetry for structured logging, metrics, and distributed tracing. Monitor your entire fleet from the CLI.

Live monitoring

dora top

Per-node CPU, memory, queue depth, network I/O, restart count, and health status across all machines in a terminal TUI.

Distributed tracing

dora trace list/view

Zero-setup trace inspection. View distributed traces across nodes without deploying Jaeger or Zipkin.

Data inspection

dora topic echo/hz/info

Print live data, monitor publish frequencies, and inspect schemas and bandwidth — all from the command line.
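
Measuring a publish frequency amounts to counting message intervals over the observed time span. A minimal sketch, assuming arrival times in seconds; the function name is hypothetical and this is not how `dora topic hz` is necessarily implemented.

```python
def publish_rate_hz(arrival_times_s):
    """Average publish rate: number of intervals divided by the elapsed time."""
    if len(arrival_times_s) < 2:
        return 0.0
    span = arrival_times_s[-1] - arrival_times_s[0]
    return (len(arrival_times_s) - 1) / span if span > 0 else 0.0

rate = publish_rate_hz([0.0, 0.5, 1.0, 1.5, 2.0])
print(rate)  # 2.0
```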

Record & Replay

Capture, replay, debug

Record all dataflow messages to .dora files. Replay offline at any speed with node substitution for regression testing.

dora record

Capture live messages from any running dataflow to .dora files. Selective per-node or full-graph recording.

dora replay

Replay recorded data at any speed — 0.1x for step-through debugging or 10x for fast regression tests.

Node substitution

Swap out nodes during replay to test new logic against recorded real-world inputs.

Time-Travel Debugging

Record once, debug forever

Capture data in the field, replay endlessly in the lab. Step through exact message sequences, swap individual nodes to test new algorithms against real-world data, and run regression suites without deploying to hardware.

Offline hardware debugging

Debug entire sensor-to-actuator pipelines without running physical hardware. Replay exact camera frames, lidar scans, and IMU data from recordings.

CI regression testing

Record golden datasets, replay in CI. Automatically detect when algorithm changes produce different outputs. Integrate with dora replay --assert.
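
The comparison at the heart of a regression check can be sketched as an element-wise diff between a golden recording and a fresh run. The function name, the tolerance, and the sample values are assumptions for illustration; this does not describe what `dora replay --assert` does internally.

```python
import math

def diff_outputs(golden, candidate, rel_tol=1e-6):
    """Return the indices where the candidate deviates from the golden recording."""
    if len(golden) != len(candidate):
        raise ValueError("recordings differ in length")
    return [
        i
        for i, (g, c) in enumerate(zip(golden, candidate))
        if not math.isclose(g, c, rel_tol=rel_tol)
    ]

golden = [0.10, 0.25, 0.90]            # outputs captured from the known-good build
mismatches = diff_outputs(golden, [0.10, 0.30, 0.90])
print(mismatches)  # [1]
```

A CI job would fail the build whenever the returned list is non-empty.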

Timestamp-accurate replay

Messages replayed with original timing relationships preserved. Variable speed playback from 0.1x slow-motion to 100x fast-forward.
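
Preserving timing relationships at a different speed means dividing every recorded offset by the speed factor. A minimal sketch under that assumption, with a hypothetical function name and timestamps in seconds:

```python
def replay_schedule(timestamps_s, speed):
    """Offsets from replay start that keep relative timing, scaled by `speed`."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not timestamps_s:
        return []
    start = timestamps_s[0]
    return [(t - start) / speed for t in timestamps_s]

recorded = [10.0, 10.5, 12.0]
print(replay_schedule(recorded, speed=1.0))   # [0.0, 0.5, 2.0]
print(replay_schedule(recorded, speed=10.0))  # [0.0, 0.05, 0.2]
```

The ratio between any two gaps is unchanged at every speed, which is what keeps message ordering and relative spacing intact.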

.dora format, variable speed, node substitution, offline mode, regression testing, CI integration, data inspection, timestamp replay, selective recording, multi-node capture

Fault Tolerance

Built to recover

Per-node restart policies, exponential backoff, health monitoring, and persistent coordinator state. Your dataflow keeps running even when individual nodes fail.
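
Exponential backoff spaces out restart attempts so a crashing node does not spin in a tight loop. The sketch below shows the usual capped schedule; the base delay, factor, and cap are illustrative values, not dora defaults.

```python
def backoff_delays(base_s, factor, max_s, attempts):
    """Delay before each restart attempt: base * factor**n, capped at max_s."""
    return [min(base_s * factor ** n, max_s) for n in range(attempts)]

delays = backoff_delays(base_s=0.5, factor=2, max_s=5.0, attempts=6)
print(delays)  # [0.5, 1.0, 2.0, 4.0, 5.0, 5.0]
```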

Restart policies

Choose never, on-failure, or always per node. Configurable in YAML with max retries and time windows.
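
One way to read "max retries and time windows" is as a sliding-window limit on recent restarts. The decision logic below is a sketch under that assumption; the parameter names are hypothetical and do not mirror dora's YAML keys.

```python
def should_restart(policy, exit_code, restart_times, now, max_retries, window_s):
    """Decide whether to restart a node that has just exited."""
    if policy == "never":
        return False
    if policy == "on-failure" and exit_code == 0:
        return False  # a clean exit is not a failure
    # Count only the restarts that happened inside the sliding window.
    recent = [t for t in restart_times if now - t < window_s]
    return len(recent) < max_retries

history = [50.0, 70.0, 90.0]  # times of earlier restarts, in seconds
print(should_restart("always", 1, history, now=100.0, max_retries=3, window_s=60.0))  # False
print(should_restart("always", 1, history, now=120.0, max_retries=3, window_s=60.0))  # True
```

In the second call the restart at 50.0 has aged out of the 60-second window, so the node gets another attempt.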

Coordinator persistence

State persisted via redb. Restart the coordinator and all daemons reconnect automatically with their prior state intact.

Graceful degradation

Individual node failures are isolated. The rest of your dataflow continues processing while failed nodes recover.

Performance

Fast where it matters

Sub-millisecond latency with shared memory, zero-copy Arrow buffers, and four communication patterns. The same zero-copy speed as dora 0.x, with full-featured orchestration on top.
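
The difference between sharing a buffer and copying it can be shown with nothing but the Python standard library. This is an analogy only: a `bytearray` stands in for a shared-memory region and a `memoryview` for a zero-copy reader. It does not use Arrow, Zenoh, or any dora API.

```python
buffer = bytearray(b"frame-0000" * 4)  # stands in for a shared-memory region
view = memoryview(buffer)[10:20]       # zero-copy: a window onto the same bytes
copy = bytes(buffer[10:20])            # copy: an independent 10-byte object

buffer[10:15] = b"FRAME"               # the producer overwrites the region in place

print(bytes(view))  # b'FRAME-0000'
print(copy)         # b'frame-0000'
```

The view observes the producer's write because no bytes were duplicated, while the copy still holds the old contents.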

Dimension        | dora 1.0      | dora 0.x      | ROS2
Latency (SHM)    | ~500 µs p50   | ~500 µs p50   | 1-10 ms
Copy overhead    | Zero (Arrow)  | Zero (Arrow)  | CDR serialization
Comm patterns    | 4             | 1 (Topic)     | 4
Fault tolerance  | Built-in      | None          | Basic
Record/Replay    | Built-in      | None          | rosbag
Cluster mgmt     | Built-in      | None          | Manual
Dynamic topology | Yes           | No            | Partial
Observability    | OpenTelemetry | Basic logs    | ROS2 logging
Languages        | Rust/Py/C/C++ | Rust/Py/C/C++ | C++/Py

Languages

Write nodes in your language

Each node is an independent process. Use the language that fits your task best — Rust for performance, Python for AI models, C/C++ for hardware drivers.

Rust

use dora_node::prelude::*;

#[dora_main]
fn main() -> eyre::Result<()> {
    let node = DoraNode::init()?;
    for event in node {
        match event {
            Event::Input { id, data, .. } => {
                // process data
            }
            _ => {}
        }
    }
    Ok(())
}

Python

from dora import Node

node = Node()

for event in node:
    if event["type"] == "INPUT":
        data = event["value"]
        node.send_output("out", data)

C

#include "dora/node_api.h"

int main() {
    void *node = dora_init_node();
    void *event = dora_next_event(node);
    while (event != NULL) {
        char *id = dora_event_id(event);
        /* process event */
        dora_free_event(event);
        event = dora_next_event(node);
    }
    dora_drop_node(node);
}

C++

#include "dora/node.hpp"

int main() {
    auto node = dora::Node();
    for (auto event : node) {
        if (event.type == dora::INPUT) {
            auto data = event.data();
            node.send("out", data);
        }
    }
    return 0;
}

Data Layer

Communication patterns

Four built-in patterns for every inter-node communication need. All patterns work identically across local shared memory and distributed Zenoh transport.

Topic
Service
Action
Streaming

Comparison

dora 1.0 vs dora 0.x vs ROS2

dora 1.0 builds on dora's zero-copy data plane and adds the orchestration, fault tolerance, and observability you need for production deployments.

Dimension          | dora 1.0                        | dora 0.x           | ROS2
What it is         | Dataflow framework              | Dataflow framework | Robotics middleware
Latency (SHM)      | ~500 µs p50                     | ~500 µs p50        | 1-10 ms
Comm patterns      | 4 (Topic/Service/Action/Stream) | 1 (Topic)          | 4
Fault tolerance    | Built-in restart policies       | None               | Basic
Record/Replay      | Built-in (.dora)                | None               | rosbag
Cluster management | SSH + labels                    | None               | Manual
Dynamic topology   | Yes                             | No                 | Partial
Observability      | OpenTelemetry                   | Basic logs         | ROS2 logging
Languages          | Rust/Py/C/C++                   | Rust/Py/C/C++      | C++/Py
Zero-copy          | Arrow + Zenoh SHM               | Arrow + Zenoh SHM  | CDR serialization

Use Cases

Built for real workloads

01

Autonomous Robots

Perception, planning, and control nodes connected via zero-copy shared memory. Add new sensors or actuators without rewriting existing nodes.

Zero-copy SHM, Hot-swap nodes, Multi-language

02

AI Pipelines

Chain camera, object detection, LLM reasoning, and action nodes into a single dataflow. Python for AI models, Rust for real-time control.

YOLO + LLM, GPU scheduling, Arrow buffers

03

Voice Assistants

Streaming ASR, LLM generation, and TTS with queue flushing for real-time interruption. Action pattern for long-running conversations.

Streaming pattern, Queue flushing, Sub-second latency

04

Distributed Simulation

Run simulation nodes across a cluster. Record and replay scenarios at any speed for regression testing without physical hardware.

Record/Replay, Cluster deploy, Variable speed

05

Industrial Automation

PLC interfaces, sensor fusion, and control logic with fault tolerance. Exponential backoff and health monitoring keep production lines running.

Fault tolerance, Health checks, Rolling upgrades

06

Research & Prototyping

Define your dataflow in YAML, mix languages freely, and iterate fast. Record real data once and replay offline for algorithm development.

YAML dataflows, Offline replay, Fast iteration

Get Started

Deploy in minutes

Install the CLI, define your nodes in YAML, and start processing data. No Docker, no Kubernetes, no complex setup.

terminal
# Install
cargo install dora-cli

# Create a dataflow
cat > dataflow.yml << 'EOF'
nodes:
  - id: webcam
    path: dora-hub/webcam
    outputs: [image]
  - id: detector
    path: dora-hub/yolov8
    inputs:
      image: webcam/image
    outputs: [bbox]
  - id: plot
    path: dora-hub/opencv-plot
    inputs:
      image: webcam/image
      bbox: detector/bbox
EOF

# Run it
dora start dataflow.yml
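
Before starting a dataflow it can help to check that every input points at an output some node actually declares. The sketch below mirrors the YAML above as a plain Python dict, so it needs no YAML parser, and uses a hypothetical `validate` helper; it is not a dora CLI feature.

```python
def validate(dataflow):
    """Return the inputs that reference a node/output pair nobody declares."""
    declared = {
        (node["id"], output)
        for node in dataflow["nodes"]
        for output in node.get("outputs", [])
    }
    dangling = []
    for node in dataflow["nodes"]:
        for input_name, source in node.get("inputs", {}).items():
            source_node, _, source_output = source.partition("/")
            if (source_node, source_output) not in declared:
                dangling.append(f"{node['id']}.{input_name} -> {source}")
    return dangling

# Same structure as dataflow.yml above, written as a dict.
dataflow = {
    "nodes": [
        {"id": "webcam", "path": "dora-hub/webcam", "outputs": ["image"]},
        {"id": "detector", "path": "dora-hub/yolov8",
         "inputs": {"image": "webcam/image"}, "outputs": ["bbox"]},
        {"id": "plot", "path": "dora-hub/opencv-plot",
         "inputs": {"image": "webcam/image", "bbox": "detector/bbox"}},
    ]
}

print(validate(dataflow))  # []
```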

Questions

FAQ

What's new in dora 1.0?

dora 1.0 is a major release that adds service (request/reply), action (goal/feedback/result), and streaming communication patterns, a WebSocket-based coordinator, fault tolerance with automatic restart and state persistence, record/replay, cluster management, dynamic topology, and comprehensive observability. dora 0.x supported only topic pub/sub with a TCP coordinator.

How is dora different from ROS2?

dora is 10-17x faster than ROS2 Python thanks to zero-copy shared memory and Apache Arrow. It uses simple YAML for dataflow definitions instead of code generation, supports Rust/Python/C/C++ without IDL files, and provides built-in cluster management, record/replay, and a single CLI for the full lifecycle.

What languages does dora support?

dora provides native APIs for Rust, Python (via PyO3), C, and C++ (via CXX bridge). All languages share the same Apache Arrow data format with zero serialization overhead, and you can mix languages freely in a single dataflow.

What communication patterns are available?

dora supports four patterns: Topic (fire-and-forget pub/sub), Service (request/reply with UUID correlation), Action (long-running goals with periodic feedback and cancellation), and Streaming (continuous data with session/segment/chunk metadata and queue flushing for interruption).

Is dora 1.0 production-ready?

Yes. dora 1.0 includes fault tolerance with configurable restart policies, persistent coordinator state via redb, record/replay for debugging, OpenTelemetry observability, distributed deployment with cluster management, and comprehensive CI including format, lint, test, and E2E checks on all platforms.

Can I upgrade from dora 0.x to 1.0?

Existing topic-based dataflows are largely backward compatible. The core pub/sub model is the same, and YAML dataflow definitions follow the same structure. You can incrementally adopt new features like services, actions, and fault tolerance without rewriting existing nodes.

Build your first dataflow in 5 minutes

Install the CLI, define your nodes in YAML, and start processing data.