Reachy2 Pick and Place

Haixuan Xavier Tao
Maintainer of dora-rs
Using QwenVL 2.5's multi bounding box capabilities to pick and place multiple items with very low latency.

Rerun

In case Rerun does not work on your phone, you'll find the video below:

In the above iframe, the important pieces of information are:

  • /text_whisper: corresponds to the Whisper audio transcription.
  • /text_response: corresponds to the bounding boxes returned as plain text by QwenVL 2.5.
  • camera_torso: corresponds to the RGB image from the Orbbec Gemini 336 depth camera.
  • camera_torso bounding box: corresponds to the QwenVL bounding box projected onto the image, which is used to grasp the object. Predictions are made at regular intervals and the drawn boxes do not disappear between predictions, which can be a bit confusing.
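
To make the /text_response to bounding box step more concrete, here is a minimal sketch of how plain-text model output could be turned into pixel boxes. It is not taken from the demo code. It assumes the response contains a JSON list of objects with "bbox_2d" (pixel coordinates x1, y1, x2, y2) and "label" keys; the function name parse_bounding_boxes and the sample response are hypothetical.

```python
import json
import re


def parse_bounding_boxes(text_response: str) -> list[dict]:
    """Extract labelled boxes from a plain-text model response.

    Assumption (not confirmed by the demo code): the response contains a JSON
    list of objects shaped like {"bbox_2d": [x1, y1, x2, y2], "label": "..."}.
    """
    # Grab the outermost JSON array, ignoring any prose around it.
    match = re.search(r"\[.*\]", text_response, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []

    boxes = []
    for item in items:
        if not isinstance(item, dict):
            continue
        bbox = item.get("bbox_2d")
        # Skip entries that are not four numeric coordinates.
        if not isinstance(bbox, list) or len(bbox) != 4:
            continue
        if not all(isinstance(v, (int, float)) for v in bbox):
            continue
        x1, y1, x2, y2 = (int(v) for v in bbox)
        boxes.append(
            {
                "label": str(item.get("label", "")),
                # Normalise so that (x_min, y_min) is always the top-left corner.
                "bbox": (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)),
            }
        )
    return boxes


if __name__ == "__main__":
    # Hypothetical response text, for illustration only.
    sample = 'Found: [{"bbox_2d": [10, 20, 110, 220], "label": "apple"}]'
    for box in parse_bounding_boxes(sample):
        print(box["label"], box["bbox"])
```

Returning an empty list on malformed output lets a caller keep the previous prediction instead of failing, which fits a setup where predictions arrive at regular intervals.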

Code

The code is in this pull request: https://github.com/dora-rs/dora/pull/793

Demo