Comparison
What's new in dora 1.0
dora 1.0 is a major release that adds production-grade features on top of dora's zero-copy core: services, actions, streaming, fault tolerance, cluster management, record/replay, dynamic topology, and comprehensive observability. Here's everything that's new.
Architecture
TCP to WebSocket, single to distributed
dora 0.x used a raw TCP protocol for coordinator-daemon communication and supported only single-machine deployment. dora 1.0 replaces that protocol with WebSocket, adds a Zenoh data plane for multi-machine deployment, and introduces persistent coordinator state for high availability.
Communication
From 1 pattern to 4
dora 0.x supported only topic-based pub/sub. dora 1.0 adds Service, Action, and Streaming patterns via well-known metadata keys — no protocol or daemon changes needed.
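To make the idea of metadata-based correlation concrete, here is a minimal sketch in plain Python of request/reply layered on ordinary messages. It does not use the dora API: the `request_id` key follows the Service example on this page, while the function names and the dict-based message layout are hypothetical and chosen only for illustration.

```python
import uuid


def make_request(payload, pending):
    """Client side: tag an outgoing message with a fresh request_id."""
    request_id = str(uuid.uuid4())
    pending[request_id] = None  # remember that a reply is expected
    return {"metadata": {"request_id": request_id}, "data": payload}


def serve(request, handler):
    """Server side: compute a result and pass the metadata through unchanged."""
    return {"metadata": dict(request["metadata"]), "data": handler(request["data"])}


def accept_response(response, pending):
    """Client side: match a reply to its request by request_id."""
    request_id = response["metadata"].get("request_id")
    if request_id not in pending:
        return False  # unknown request_id; ignore the reply
    pending[request_id] = response["data"]
    return True


pending = {}
req_a = make_request(2, pending)
req_b = make_request(5, pending)

# Replies may arrive in any order; the id keeps them apart.
for req in (req_b, req_a):
    accept_response(serve(req, lambda x: x * 10), pending)

results = [pending[r["metadata"]["request_id"]] for r in (req_a, req_b)]
print(results)
```

Because the correlation lives entirely in the message metadata, the transport underneath can stay plain pub/sub, which is consistent with the claim that no protocol or daemon changes are needed.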
Topic (pub/sub)
nodes:
  - id: sensor
    outputs: [data]
  - id: processor
    inputs:
      data: sensor/data
Service (request/reply)
# dora 1.0: metadata-based correlation
node.send_service_request("request", params, data)
# Server passes through request_id
node.send_service_response("response", metadata.parameters, result)
Action (goal/feedback/result)
# Goal with periodic feedback + final result
# Supports cancellation via cancel output
# Metadata keys: goal_id, goal_status
Streaming (session/segment/chunk)
# Real-time pipelines with interruption
# flush: true clears downstream queues
# Enables voice/video barge-in
Features
Complete feature matrix
| Dimension | dora 1.0 | dora 0.x | ROS2 |
|---|---|---|---|
| Coordinator protocol | WebSocket (port 6013) | TCP | DDS |
| Data plane | Zenoh SHM + shared memory | Shared memory only | DDS serialization |
| Communication patterns | 4 (Topic, Service, Action, Streaming) | 1 (Topic only) | 3 (Topic, Service, Action) |
| Latency (SHM path) | ~500us p50 | ~500us p50 | 1-10ms |
| Copy overhead | Zero (Apache Arrow) | Zero (Apache Arrow) | CDR serialization |
| Fault tolerance | Built-in (restart policies, health checks, circuit breakers) | None | Basic lifecycle |
| Record / Replay | Built-in (.dora files) | None | rosbag2 |
| Dynamic topology | Yes (add/remove/connect/disconnect at runtime) | No | Partial (lifecycle nodes) |
| Cluster management | Built-in SSH clusters with label scheduling | None | Manual or external tools |
| Observability | OpenTelemetry (logs, metrics, traces) | Basic logging | ROS2 logging + third-party |
| Real-time support | SCHED_FIFO + mlockall + CPU affinity | None | rclcpp Executor (partial) |
| Type annotations | Static validation via dora validate | None | IDL / .msg files |
| Reusable modules | YAML module composition with typed ports | None | ROS2 packages |
| Log management | Structured + rotation + routing + aggregation | Basic stdout | rosout |
| Resource monitoring | dora top TUI (CPU, mem, queues, network) | None | External tools |
| Topic inspection | topic echo / hz / info | None | ros2 topic echo / hz / info |
| Schema validation | dora validate with wiring checks | None | colcon build (compile-time) |
| WebSocket control | Full control + topic data channels | None | rosbridge (third-party) |
| Python CLI API | PyO3 native bindings | None | rclpy |
| Coordinator HA | Persistent redb state + daemon reconnect | None | N/A (no coordinator) |
| Source files | 772 | 488 | ~50,000+ |
| Examples | 37 | 20 | Hundreds (community) |
| CI templates | Reusable downstream CI workflows | None | N/A |
CLI
CLI commands side by side
dora 1.0 extends the dora CLI with record/replay, topic inspection, resource monitoring, dynamic topology, cluster management, parameter management, and more.
| Operation | dora 1.0 | dora 0.x |
|---|---|---|
| Start dataflow | dora start dataflow.yml | dora start dataflow.yml |
| Stop dataflow | dora stop | dora destroy |
| View logs | dora logs --node webcam | (stdout only) |
| Record messages | dora record | (not available) |
| Replay recording | dora replay recording.dora | (not available) |
| Live topic data | dora topic echo webcam/image | (not available) |
| Frequency monitor | dora topic hz webcam/image | (not available) |
| Resource monitor | dora top | (not available) |
| Add node at runtime | dora node add --path ./detector | (not available) |
| Cluster deploy | dora cluster up | (not available) |
| Validate dataflow | dora validate dataflow.yml | (not available) |
| View traces | dora trace list | (not available) |
| Parameter management | dora param get/set/delete | (not available) |
| Dataflow visualization | dora graph dataflow.yml | dora graph dataflow.yml |
Production
Production readiness
Fault Tolerance
Per-node restart policies (never/on-failure/always), exponential backoff, health monitoring, circuit breakers with configurable input timeouts.
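As a rough illustration of how the three restart policies and exponential backoff interact, the sketch below models the decision in plain Python. It is not dora's implementation: the `base`, `factor`, and `cap` values are hypothetical placeholders, not dora defaults.

```python
def should_restart(policy, exit_code):
    """Decide whether a node that exited should be restarted."""
    if policy == "always":
        return True
    if policy == "on-failure":
        return exit_code != 0
    if policy == "never":
        return False
    raise ValueError(f"unknown restart policy: {policy!r}")


def backoff_delay(attempt, base=1.0, factor=2.0, cap=30.0):
    """Seconds to wait before restart number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * factor ** attempt)


# Delay grows geometrically until it reaches the cap.
delays = [backoff_delay(n) for n in range(7)]
print(delays)
```

Capping the delay keeps a repeatedly crashing node from waiting indefinitely long between attempts, while the geometric growth avoids tight restart loops.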
Coordinator HA
Persistent redb-backed state store. Daemons auto-reconnect with exponential backoff. Dataflow state reconstructed on coordinator restart.
Observability
OpenTelemetry integration for structured logging, metrics, and distributed tracing. Zero-setup trace viewing via CLI.
Record & Replay
Capture all messages to .dora files. Replay at any speed with node substitution for regression testing.
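Replaying "at any speed" amounts to rescaling the gaps between recorded timestamps. The following sketch shows that arithmetic under simple assumptions: a recording is modeled as a list of `(timestamp_seconds, payload)` tuples, which is a hypothetical stand-in and not the actual .dora file format.

```python
def replay_schedule(messages, speed=1.0):
    """Return (offset_seconds, payload) pairs for replaying a recording.

    `messages` is a list of (timestamp_seconds, payload) sorted by time.
    Offsets are measured from the start of the replay.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not messages:
        return []
    start = messages[0][0]
    return [((ts - start) / speed, payload) for ts, payload in messages]


recording = [(100.0, "frame0"), (100.5, "frame1"), (102.0, "frame2")]
schedule = replay_schedule(recording, speed=2.0)
print(schedule)
```

A real replayer would wait until each offset before emitting the payload; for regression testing, the emitted stream would feed the substituted node instead of the original one.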
Cluster Management
SSH-based deployment with label scheduling, rolling upgrades, auto-recovery, and centralized management from the CLI.
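Label scheduling can be pictured as a set-matching problem: a node may run on any machine whose labels include everything the node requires. The sketch below is a simplified first-fit version in plain Python; the machine names, label names, and the first-fit strategy are assumptions for illustration and may differ from what dora actually does.

```python
def schedule(nodes, machines):
    """Assign each node to the first machine whose labels cover its requirements.

    nodes:    {node_id: set of required labels}
    machines: {machine_id: set of labels the machine offers}
    """
    placement = {}
    for node_id, required in nodes.items():
        for machine_id, labels in machines.items():
            if required <= labels:  # subset test: all required labels present
                placement[node_id] = machine_id
                break
        else:
            raise LookupError(
                f"no machine satisfies labels {sorted(required)} for {node_id}"
            )
    return placement


machines = {"robot-1": {"arm64", "camera"}, "gpu-box": {"x86_64", "gpu"}}
nodes = {"webcam": {"camera"}, "detector": {"gpu"}, "logger": set()}
placement = schedule(nodes, machines)
print(placement)
```

A node with no required labels, such as `logger` here, fits anywhere and lands on the first machine listed.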
Real-Time Support
Optional --rt flag for mlockall + SCHED_FIFO. Per-node CPU affinity pinning. Comprehensive tuning guide.
Upgrade
Upgrading from dora 0.x
Backward compatible for pub/sub
Existing topic-based dataflows work with minimal changes. The core pub/sub model is unchanged, and YAML dataflow definitions follow the same structure; your existing dora CLI commands keep working.
Incremental adoption
New features are additive. Add service patterns to specific nodes, enable fault tolerance per-node with restart_policy, or start recording with dora record. There is no need to rewrite your entire pipeline.
Same crates, new capabilities
Your existing dora-* crate dependencies keep working: dora-node-api, dora-operator-api, and the rest are unchanged. dora 1.0 adds new APIs for services, actions, and streaming without breaking existing ones.