Comparison
What's new in dora 1.0
dora 1.0 is a major release that adds production-grade features on top of dora's zero-copy core: services, actions, streaming, fault tolerance, cluster management, record/replay, dynamic topology, and comprehensive observability. Here's everything that's new.
Architecture
TCP to WebSocket, single to distributed
dora 0.x used a plain TCP protocol for coordinator-daemon communication and supported only single-machine deployment. dora 1.0 moves that control channel to WebSocket, adds a multi-machine Zenoh data plane, and introduces persistent coordinator state for high availability.
Communication
From 1 pattern to 4
dora 0.x supported only topic-based pub/sub. dora 1.0 adds Service, Action, and Streaming patterns via well-known metadata keys — no protocol or daemon changes needed.
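To illustrate what metadata-based correlation means, here is a minimal sketch in plain Python that does not use the dora API. It assumes a request carries a `request_id` metadata key that the server copies unchanged into its response, so the client can pair replies with outstanding requests. The `Message`, `make_request`, `serve`, and `PendingRequests` names are hypothetical and exist only for this illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical stand-in for a dataflow message: payload plus metadata."""
    data: object
    metadata: dict = field(default_factory=dict)


def make_request(data):
    # Client side: tag the request with a fresh correlation id.
    return Message(data, {"request_id": str(uuid.uuid4())})


def serve(request, handler):
    # Server side: compute the result and pass the request metadata through
    # unchanged, so the response carries the same request_id.
    return Message(handler(request.data), dict(request.metadata))


class PendingRequests:
    """Client-side table that matches responses to outstanding requests."""

    def __init__(self):
        self._pending = {}

    def track(self, request):
        self._pending[request.metadata["request_id"]] = request

    def resolve(self, response):
        # Returns the original request, or None for an unknown or repeated id.
        return self._pending.pop(response.metadata.get("request_id"), None)


pending = PendingRequests()
first, second = make_request(2), make_request(5)
pending.track(first)
pending.track(second)

# Responses may arrive out of order; the id still pairs them correctly.
reply_second = serve(second, lambda x: x * 10)
reply_first = serve(first, lambda x: x * 10)
matched_second = pending.resolve(reply_second)
matched_first = pending.resolve(reply_first)
```

Because the correlation lives entirely in message metadata, a pattern like this can ride on ordinary pub/sub outputs, which is consistent with the claim that no protocol or daemon changes are needed.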
Topic (pub/sub)
nodes:
  - id: sensor
    outputs: [data]
  - id: processor
    inputs:
      data: sensor/data
Service (request/reply)
# dora 1.0: metadata-based correlation
node.send_service_request("request", params, data)
# Server passes through request_id
node.send_service_response("response", metadata.parameters, result)
Action (goal/feedback/result)
# Goal with periodic feedback + final result
# Supports cancellation via cancel output
# Metadata keys: goal_id, goal_status
Streaming (session/segment/chunk)
# Real-time pipelines with interruption
# flush: true clears downstream queues
# Enables voice/video barge-in
Features
Complete feature matrix
| Dimension | dora 1.0 | dora 0.x | ROS2 |
|---|---|---|---|
| Coordinator protocol | WebSocket (port 6013) | | DDS |
| Data plane | Zenoh SHM + shared memory | | DDS serialization |
| Communication patterns | 4 (Topic, Service, Action, Streaming) | | 4 (Topic, Service, Action, Timer) |
| Latency (SHM path) | ~500 µs p50 | | 1-10 ms |
| Copy overhead | Zero (Apache Arrow) | | CDR serialization |
| Fault tolerance | Built-in (restart policies, health checks, circuit breakers) | | Basic lifecycle |
| Record / Replay | Built-in (.drec files) | | rosbag2 |
| Dynamic topology | Yes (add/remove/connect/disconnect at runtime) | | Partial (lifecycle nodes) |
| Cluster management | Built-in SSH clusters with label scheduling | | Manual or external tools |
| Observability | OpenTelemetry (logs, metrics, traces) | | ROS2 logging + third-party |
| Real-time support | SCHED_FIFO + mlockall + CPU affinity | | rclcpp Executor (partial) |
| Type annotations | Static validation via dora validate | | IDL / .msg files |
| Reusable modules | YAML module composition with typed ports | | ROS2 packages |
| Log management | Structured + rotation + routing + aggregation | | rosout |
| Resource monitoring | dora top TUI (CPU, mem, queues, network) | | External tools |
| Topic inspection | topic echo / hz / info | | ros2 topic echo / hz / info |
| Schema validation | dora validate with wiring checks | | colcon build (compile-time) |
| WebSocket control | Full control + topic data channels | | rosbridge (third-party) |
| Python CLI API | PyO3 native bindings | | rclpy |
| Coordinator HA | Persistent redb state + daemon reconnect | | N/A (no coordinator) |
| Source files | 772 | | ~50,000+ |
| Examples | 45 | | Hundreds (community) |
| CI templates | Reusable downstream CI workflows | | N/A |
CLI
CLI commands side by side
dora 1.0 extends the dora CLI with record/replay, topic inspection, resource monitoring, dynamic topology, cluster management, parameter management, and more.
| Operation | dora 1.0 | dora 0.x |
|---|---|---|
| Start dataflow | dora start dataflow.yml | |
| Stop dataflow | dora stop | |
| View logs | dora logs --node webcam | |
| Record messages | dora record | |
| Replay recording | dora replay recording.drec | |
| Live topic data | dora topic echo webcam/image | |
| Frequency monitor | dora topic hz webcam/image | |
| Resource monitor | dora top | |
| Add node at runtime | dora node add --from-yaml node.yml | |
| Cluster deploy | dora cluster up | |
| Validate dataflow | dora validate dataflow.yml | |
| View traces | dora trace list | |
| Parameter management | dora param get/set/delete | |
| Dataflow visualization | dora graph dataflow.yml | |
Production
Production readiness
Fault Tolerance
Per-node restart policies (never/on-failure/always), exponential backoff, health monitoring, circuit breakers with configurable input timeouts.
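As a rough illustration of how the three restart policies and exponential backoff interact, the following plain-Python sketch does not use dora itself. The policy names come from the text above; the base delay, growth factor, cap, and the `should_restart` and `backoff_delays` helpers are assumptions chosen for the example, not dora's actual defaults or API.

```python
def should_restart(policy, exit_code):
    """Decide whether a node is restarted under the never/on-failure/always policies."""
    if policy == "never":
        return False
    if policy == "on-failure":
        return exit_code != 0
    if policy == "always":
        return True
    raise ValueError(f"unknown restart policy: {policy}")


def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=6):
    """Exponential backoff: base * factor**n seconds, clamped to cap.

    All numeric defaults here are illustrative assumptions.
    """
    return [min(cap, base * factor ** n) for n in range(attempts)]


# Delay before each successive restart attempt of a crashing node.
delays = backoff_delays()

# A clean exit (code 0) is only restarted under the "always" policy.
restart_on_clean_exit = {
    policy: should_restart(policy, exit_code=0)
    for policy in ("never", "on-failure", "always")
}
```

The cap keeps a persistently failing node from waiting arbitrarily long between attempts, while the growing delay avoids a tight crash loop.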
Coordinator HA
Persistent redb-backed state store. Daemons auto-reconnect with exponential backoff. Dataflow state reconstructed on coordinator restart.
Observability
OpenTelemetry integration for structured logging, metrics, and distributed tracing. Zero-setup trace viewing via CLI.
Record & Replay
Capture all messages to .drec files. Replay at any speed with node substitution for regression testing.
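Replaying "at any speed" can be pictured as rescaling the gaps between recorded timestamps. The sketch below is a plain-Python illustration under that assumption. It does not read .drec files or model node substitution, and `replay_schedule` is a hypothetical helper.

```python
def replay_schedule(timestamps, speed=1.0):
    """Map recorded timestamps (seconds) to offsets from the start of replay.

    speed=2.0 halves every gap; speed=0.5 doubles it.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not timestamps:
        return []
    start = timestamps[0]
    return [(t - start) / speed for t in timestamps]


recorded = [10.0, 10.5, 12.0]                  # capture times of three messages
normal = replay_schedule(recorded)             # original pacing
double = replay_schedule(recorded, speed=2.0)  # twice as fast
```

Message order and relative spacing are preserved at every speed, which is what makes a recording usable as a deterministic input for regression tests.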
Cluster Management
SSH-based deployment with label scheduling, rolling upgrades, auto-recovery, and centralized management from the CLI.
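Label scheduling generally means placing a node on a machine whose labels satisfy the node's requirements. The following sketch shows one simple matching rule in plain Python. The label keys, machine names, and first-match rule are assumptions for illustration and may differ from how dora's scheduler actually behaves.

```python
def schedule(required_labels, machines):
    """Return the first machine whose labels contain every required key/value pair."""
    for name, labels in machines.items():
        if all(labels.get(key) == value for key, value in required_labels.items()):
            return name
    return None  # no machine satisfies the requirements


# Hypothetical cluster inventory: machine name -> labels.
machines = {
    "edge-1": {"arch": "arm64", "gpu": "none"},
    "gpu-1": {"arch": "x86_64", "gpu": "nvidia"},
}

placement = schedule({"gpu": "nvidia"}, machines)
unplaceable = schedule({"gpu": "amd"}, machines)
```

A node with no label requirements matches any machine, so in this sketch it lands on the first one listed.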
Real-Time Support
Optional --rt flag for mlockall + SCHED_FIFO. Per-node CPU affinity pinning. Comprehensive tuning guide.
Upgrade
Upgrading from dora 0.x
Backward compatible for pub/sub
Existing topic-based dataflows work with minimal changes. The core pub/sub model is unchanged, and YAML dataflow definitions follow the same structure; your existing dora CLI commands keep working.
Incremental adoption
New features are additive. Add service patterns to specific nodes, enable fault tolerance per-node with restart_policy, or start recording with dora record. There is no need to rewrite your entire pipeline.
Same crates, new capabilities
Your existing dora-* crate dependencies keep working: dora-node-api, dora-operator-api, and the rest are unchanged. dora 1.0 adds new APIs for services, actions, and streaming without breaking existing ones.