Why Two Arms?
A single robot arm is powerful for tasks where one end-effector is enough: pick-and-place, sorting, inspection. But a large class of real-world manipulation tasks fundamentally require two hands — the same way humans use both hands as a matter of course. Holding a container while pouring. Assembling parts that require one hand to stabilize and one to insert. Folding cloth, peeling packaging, passing an object from one hand to the other mid-task.
These tasks are not just "harder" with one arm — they are architecturally incompatible with a single-arm setup. The DK1's bimanual architecture gives you access to this entire task class. And because both arms operate in a shared workspace with synchronized joint states, the imitation learning setup — leader/follower teleoperation feeding into a single policy — is cleaner than you might expect.
The Leader/Follower Concept
The DK1 uses a leader/follower architecture for teleoperation. The concept is straightforward:
Leader arm (what you move): A lightweight, back-drivable controller arm that you physically manipulate with your hands. It has no payload capacity; its only job is to sense and transmit your intended motion at high frequency.
Follower arms (what executes the task): The two full-strength DK1 arms that mirror the leader's joint angles in real time. They interact with the actual workspace and objects. These are the arms that run the trained policy during deployment.
When you teleoperate, you physically move the leader arm. The follower arms replicate that motion within milliseconds. When you record data, the follower arm joint states — not the leader's — are what gets saved. When you train a policy, you are training the follower arms to reproduce the motion patterns your leader captured. The leader arm drops out entirely at inference time.
This architecture is more natural than keyboard or VR controller teleoperation because the motion mapping is direct: moving the leader 30° maps to 30° on the follower. Your body's proprioception transfers directly to the robot.
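The mirroring and recording behavior described above can be sketched in a few lines. This is a minimal illustration only, not the DK1 SDK: `SimulatedArm`, `read_joints`, `command_joints`, and `teleop_step` are hypothetical names, the three-joint arm is an arbitrary assumption, and the sketch shows a single leader/follower pair rather than the full bimanual setup.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SimulatedArm:
    """Hypothetical stand-in for an arm. This is NOT the DK1 SDK API."""
    joints: List[float]  # joint angles in degrees

    def read_joints(self) -> List[float]:
        return list(self.joints)

    def command_joints(self, target: Sequence[float]) -> None:
        # A real arm would move toward the target; the stand-in jumps there.
        self.joints = list(target)


def teleop_step(leader: SimulatedArm, follower: SimulatedArm,
                log: List[List[float]]) -> List[float]:
    """One control tick: mirror the leader 1:1, then record the follower."""
    target = leader.read_joints()        # sense the operator's motion
    follower.command_joints(target)      # direct joint-to-joint mapping
    log.append(follower.read_joints())   # the follower state is what is saved
    return log[-1]


leader = SimulatedArm([0.0, 0.0, 0.0])
follower = SimulatedArm([0.0, 0.0, 0.0])
episode: List[List[float]] = []

leader.joints[0] = 30.0                  # operator rotates the first joint by 30 degrees
teleop_step(leader, follower, episode)
print(episode)                           # [[30.0, 0.0, 0.0]]
```

Because the log holds follower states only, a policy trained on it never needs the leader as an input, which is why the leader can be removed at inference time.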
Hardware Checklist
Verify every item below before beginning Unit 1.
- DK1 follower arm ×2 — both arms from the kit. Verify both arrived undamaged and all joints move freely when unpowered.
- DK1 leader controller arm — the lighter, back-drivable teleoperation controller. Should feel easy to move by hand.
- Power supplies ×2 — one per follower arm. Included in the kit. Verify voltage spec matches your wall outlet (see label on supply).
- USB-C cables ×3 — one per arm (both followers + leader) for initial connection. Shorter cables (0.5–1m) are easier to manage in a bimanual workspace.
- Cameras ×2 — one wide-angle workspace camera (top-down or front-facing) and one wrist camera on the primary follower arm. A third camera on the secondary arm is optional but recommended for contact-rich tasks.
- Mounting hardware — the DK1 requires fixed mounting for both follower arms. The kit includes bolt-down plates. A rigid table or lab bench is required — a folding table will introduce vibration that degrades your data.
- Bimanual workspace — at least 80cm × 60cm of clear flat surface between the two arms. Mark the arm reach boundaries with tape during Unit 1 to define the safe operating envelope.
No physical hardware? You can complete most of this path in the MuJoCo bimanual simulation. See the DK1 simulation setup guide before Unit 1.
Software Checklist
- Ubuntu 22.04 or 24.04 — same requirement as OpenArm. A VM works for sim; real hardware requires native Linux for real-time CAN bus performance.
- Python 3.10 or higher: run `python3 --version` to check.
- ROS 2 Humble or Jazzy: if you completed the OpenArm path, this is already installed. Run `ros2 --version` to verify.
- DK1 SDK (separate from OpenArm SDK): installation covered in Unit 2. Do not install now, because the pairing configuration must happen after both arms are physically mounted.
- LeRobot — if you have it installed from the OpenArm path, it will work here. The bimanual dataset format uses the same structure with two joint-state arrays. Version ≥0.3.0 required for bimanual support.
- ~25 GB free disk space — bimanual datasets are larger than single-arm datasets (two joint state streams, two camera feeds). Training checkpoints add another 5–10 GB.
- GPU with 10GB+ VRAM — strongly recommended. Bimanual ACT training on CPU is feasible but will take 8–12h for a good training run. An RTX 3080 or better cuts this to under 2h.
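A few of the checklist items above can be checked automatically. The following is a rough sketch using only the Python standard library; the function names are made up for illustration, the disk check treats 25 GB as 25 × 1024³ bytes, and finding `nvidia-smi` on the PATH is only a hint that an NVIDIA driver is installed. It says nothing about available VRAM and does not check Ubuntu, ROS 2, or LeRobot versions.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 10)   # from the checklist: Python 3.10 or higher
REQUIRED_FREE_GB = 25       # from the checklist: ~25 GB free disk space


def python_ok(version=None, minimum=REQUIRED_PYTHON):
    """True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def disk_ok(free_bytes, required_gb=REQUIRED_FREE_GB):
    """True if free_bytes covers required_gb (counted as 1024**3 bytes each)."""
    return free_bytes >= required_gb * 1024**3


def run_checks(path="."):
    """Run the checks against the current interpreter and the given path."""
    free = shutil.disk_usage(path).free
    return {
        "python >= 3.10": python_ok(),
        "free disk >= 25 GB": disk_ok(free),
        # Presence of the tool only; this does not measure VRAM.
        "nvidia-smi on PATH": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, passed in run_checks().items():
        print(f"{'ok  ' if passed else 'FAIL'}  {name}")
```

Pass the directory where you plan to store datasets as `path`, since that is the volume that needs the space. Training checkpoints need a further 5–10 GB on top of the 25 GB checked here.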
Time Estimates
Bimanual setup takes longer than single-arm setup, so the estimates below include time for mounting, alignment, and sync verification.
| Unit | What You Do | Time |
|---|---|---|
| 0 | This orientation | 30 min |
| 1 | Mount and wire two arms, cameras | ~3 h |
| 2 | SDK, leader/follower pairing, sync test | ~2 h |
| 3 | First bimanual teleoperation session | ~2 h |
| 4 | Record 100 synchronized demos | ~3 h |
| 5 | Train ACT bimanual policy | ~4 h |
| 6 | Deploy, evaluate, improve | ~2 h |
| Total | | ~16 h 30 min |
Plan 4–5 sessions. Units 1 and 2 go together naturally (hardware setup + software config in one session). Units 3 and 4 are best done together once you are fluent with bimanual teleoperation. Unit 5 training can run overnight.
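To adapt the schedule to your own calendar, the arithmetic behind the table is easy to reproduce. The sketch below uses the per-unit estimates from the table; the `SESSIONS` grouping is one possible five-session split that follows the pairing suggested above, not a prescribed plan, and all names are illustrative.

```python
# Per-unit estimates in hours, copied from the table above.
UNIT_HOURS = {0: 0.5, 1: 3.0, 2: 2.0, 3: 2.0, 4: 3.0, 5: 4.0, 6: 2.0}

# One possible grouping (an assumption): Units 1+2 and Units 3+4 are paired.
SESSIONS = [[0], [1, 2], [3, 4], [5], [6]]


def format_hours(hours):
    """Render fractional hours as 'H h MM min', or 'H h' for whole hours."""
    whole = int(hours)
    minutes = round((hours - whole) * 60)
    return f"{whole} h {minutes:02d} min" if minutes else f"{whole} h"


total = sum(UNIT_HOURS.values())
session_hours = [sum(UNIT_HOURS[u] for u in units) for units in SESSIONS]

print("Total:", format_hours(total))  # Total: 16 h 30 min
for units, hours in zip(SESSIONS, session_hours):
    print(f"Units {units}: {format_hours(hours)}")
```

With this grouping the two paired sessions run about 5 h each. If that is too long for one sitting, splitting a pair back into single units raises the session count without changing the total.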
How to Get Help
- Check the completion check at the bottom of whatever unit you're in — it defines exactly what "done" looks like.
- Post in the DK1 forum thread — include your Ubuntu version, SDK version, exact error message, and which unit you're in. Bimanual-specific issues often have arm-specific error codes; include both.
- Check the troubleshooting section in Unit 2 — it covers the most common leader/follower sync errors.
- Join the SVRC Discord in #dk1-path — faster response during PST daytime hours.
Simulation Alternative
The DK1 path supports a MuJoCo bimanual simulation that replicates the leader/follower architecture, synchronized joint states, and camera feeds. You can complete Units 0 through 5 entirely in simulation. Unit 6 (real hardware deployment) requires physical arms. The simulation setup guide is at hardware/dk1/simulation.
Orientation Complete When...
You have checked every item in the hardware and software checklists, you understand the leader/follower concept and can explain it in one sentence, you know where to ask for help, and you have set aside your first 3-hour session for Unit 1 hardware setup.