AutoDex

An Automated Real-World System for Dexterous Grasping Data Collection

Mingi Choi¹, Gunhee Kim¹, Jisoo Kim¹, Taeksoo Kim¹, Taeyun Ha¹, Jongbin Lim¹, Hanbyul Joo¹,²
¹Seoul National University  ·  ²RLWRLD

Abstract

AutoDex pipeline overview

AutoDex executes generated dexterous-grasp candidates on real hardware, records physical success and failure outcomes, and builds a reusable grasp-trial database for retrieval-based execution in new scenes.

AutoDex is an automated real-world data-collection system for dexterous grasping. It bridges the gap between scalable candidate generation and slow human teleoperation by physically testing generated grasps with a multi-finger robot hand and recording the outcome of each attempt.

The system runs unattended collection end to end: multi-view object pose estimation, candidate filtering and motion planning, collision-monitored execution, lift-and-hold outcome labeling, and object reset between trials. Across 100 household objects, AutoDex collects 3,593 physically labeled grasp trials with Allegro and Inspire hands. In our experiments, it achieves 4.8× higher collection throughput than teleoperation, and grasps retrieved from the AutoDex database achieve 76% real-world success, compared with 34% for grasps without real-world validation.

3,593 grasp trials  ·  100 diverse objects  ·  4.8× faster than teleoperation  ·  2 robot hands (Allegro, Inspire)

System Overview

AutoDex runs in a calibrated robot-capture workcell with a 6-DoF xArm, a swappable multi-finger hand, and 20 synchronized RGB cameras surrounding the tabletop workspace. The cameras are hardware-synchronized and calibrated into a shared robot-camera coordinate system, so object poses, robot motion, and multi-view videos are aligned in both space and time. This setup gives AutoDex dense visual coverage for pose estimation and tracking, even when the hand occludes the object during grasp execution.
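Calibrating every camera into the robot's coordinate system means a pose estimated in any camera view can be expressed in the robot base frame by composing homogeneous transforms. The sketch below illustrates that composition only. The names `T_base_cam` and `T_cam_obj` and all numeric values are hypothetical, and the page does not describe AutoDex's actual calibration format.

```python
from typing import List

Matrix = List[List[float]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 homogeneous transforms (b is applied first, then a)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


# Hypothetical extrinsics: the camera frame is rotated 90 degrees about z
# and shifted 1 m along x relative to the robot base.
T_base_cam = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical per-view estimate: the object sits 0.2 m along the camera's x axis.
T_cam_obj = [
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Object pose in the robot base frame.
T_base_obj = compose(T_base_cam, T_cam_obj)
object_position_in_base = [row[3] for row in T_base_obj[:3]]
```

With all 20 cameras expressed this way, per-view estimates of the same object become directly comparable in one frame.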

AutoDex multi-camera capture workcell

Candidate Generation

AutoDex starts with grasp candidates generated from an object mesh and a scene specification, such as a wall, shelf, or box. These scene constraints restrict the hand’s approach directions, so the candidates reflect different surrounding environments.

Grasp Validation Loop

AutoDex tests generated grasp candidates on the real robot. Each trial localizes the object, selects a feasible candidate, executes the grasp, and records the physical outcome. When the current pose runs out of candidates, AutoDex resets the object and continues.

1 Perceive  →  2 Filter  →  3 Select  →  4 Execute  →  5 Reset ↻
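The control flow of the loop can be sketched as follows. This is a minimal illustration, not the AutoDex implementation: `run_collection` and its callable parameters (`perceive`, `is_feasible`, `execute`, `reset_to`) are hypothetical stand-ins for the real perception, planning, and robot components, and selection is reduced to taking the first feasible candidate.

```python
from typing import Callable, Dict, List, Tuple


def run_collection(
    candidates_by_pose: Dict[str, List[str]],
    perceive: Callable[[], str],
    is_feasible: Callable[[str], bool],
    execute: Callable[[str], bool],
    reset_to: Callable[[str], None],
    max_iterations: int = 1000,
) -> List[Tuple[str, str, bool]]:
    """Perceive, filter, select, execute; reset when the current pose is exhausted."""
    remaining = {pose: list(c) for pose, c in candidates_by_pose.items()}
    log: List[Tuple[str, str, bool]] = []
    for _ in range(max_iterations):  # hard bound so the sketch always terminates
        pose = perceive()  # 1. current stable pose of the object
        feasible = [g for g in remaining.get(pose, []) if is_feasible(g)]  # 2. filter
        if not feasible:
            remaining[pose] = []  # nothing executable is left for this pose
            next_pose = next((p for p, c in remaining.items() if c), None)
            if next_pose is None:
                break  # every stable pose is exhausted
            reset_to(next_pose)  # 5. move the object to a pose with candidates
            continue
        grasp = feasible[0]  # 3. placeholder selection rule (assumption)
        remaining[pose].remove(grasp)
        log.append((pose, grasp, execute(grasp)))  # 4. execute and record the outcome
    return log
```

Each log entry is a `(stable pose, grasp, success)` tuple, so failures are recorded alongside successes.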

1 · Perceive

AutoDex starts each trial by estimating the object’s current 6D pose from the multi-view camera rig. In the video, object masks are predicted across synchronized views, an initial pose is selected from per-view estimates, and silhouette refinement aligns the rendered object with the observed mask contours. The final pose is overlaid on the camera views and used to determine the object’s current stable tabletop pose.
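One plausible way to pick an initial pose from per-view estimates is to score each hypothesis by how well its rendered silhouette overlaps the observed masks across all views. The sketch below assumes that criterion (mean mask IoU), which the page does not confirm. Masks are represented as sets of pixel coordinates, and `mask_iou` and `select_initial_pose` are hypothetical names.

```python
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]


def mask_iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel-set masks (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_initial_pose(
    rendered: Sequence[Sequence[Set[Pixel]]],
    observed: Sequence[Set[Pixel]],
) -> int:
    """Return the index of the hypothesis with the highest mean IoU over views.

    rendered[h][v] is the silhouette of pose hypothesis h rendered into view v;
    observed[v] is the predicted object mask in view v.
    """

    def mean_iou(h: int) -> float:
        scores = [mask_iou(r, o) for r, o in zip(rendered[h], observed)]
        return sum(scores) / len(scores)

    return max(range(len(rendered)), key=mean_iou)


# Toy example with two views and two pose hypotheses.
observed_masks = [{(0, 0), (0, 1)}, {(1, 1)}]
rendered_masks = [
    [{(0, 0)}, {(2, 2)}],                  # hypothesis 0: partial overlap, then none
    [{(0, 0), (0, 1)}, {(1, 1), (1, 2)}],  # hypothesis 1: exact match, then partial
]
best_hypothesis = select_initial_pose(rendered_masks, observed_masks)
```

The same overlap score could also serve as the objective that a silhouette-refinement step improves.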

2 · Filter

AutoDex considers only the candidates generated for the object’s current stable pose and filters them before execution. In the video, green grasps are reachable candidates, while yellow grasps are rejected because the arm has no valid inverse-kinematics solution. The remaining candidates form the feasible set for the trial.
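The filtering step amounts to partitioning candidates by whether an inverse-kinematics solver returns a solution. The sketch below shows that partition with a toy planar two-link arm standing in for the 6-DoF xArm. `filter_by_ik`, `planar_two_link_ik`, the link lengths, and the target positions are all illustrative assumptions.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Target = Tuple[float, float]
Joints = Tuple[float, float]


def planar_two_link_ik(target: Target, l1: float = 0.3, l2: float = 0.25) -> Optional[Joints]:
    """Analytic IK for a planar two-link arm; returns None if the target is unreachable."""
    x, y = target
    cos_q2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(cos_q2) > 1.0:
        return None  # target lies outside the arm's reachable annulus
    q2 = math.acos(cos_q2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return (q1, q2)


def filter_by_ik(
    candidates: Dict[str, Target],
    solve_ik: Callable[[Target], Optional[Joints]],
) -> Tuple[Dict[str, Joints], List[str]]:
    """Split candidates into reachable ones (with joint solutions) and rejected ones."""
    feasible: Dict[str, Joints] = {}
    rejected: List[str] = []
    for name, target in candidates.items():
        joints = solve_ik(target)
        if joints is None:
            rejected.append(name)
        else:
            feasible[name] = joints
    return feasible, rejected


feasible, rejected = filter_by_ik(
    {"near": (0.4, 0.1), "far": (0.9, 0.0)}, planar_two_link_ik
)
```

Here `near` would be drawn in green and `far` in yellow under the page's color convention. A real system would swap in the arm's own IK solver for `planar_two_link_ik`.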

3 · Select

AutoDex selects the feasible grasp that covers the most remaining scene conditions. If the grasp succeeds on the real robot, those covered conditions are marked complete, making each physical trial count for more than one scene.
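This selection rule resembles a greedy coverage step: pick the grasp valid in the largest number of still-uncovered scene conditions, and retire those conditions only when the grasp succeeds. The sketch below is an illustration under that reading. `select_grasp`, `update_remaining`, and the grasp and condition names are hypothetical.

```python
from typing import Dict, Optional, Set


def select_grasp(feasible: Dict[str, Set[str]], remaining: Set[str]) -> Optional[str]:
    """Pick the feasible grasp that covers the most remaining scene conditions."""
    best = max(feasible, key=lambda g: len(feasible[g] & remaining), default=None)
    if best is None or not (feasible[best] & remaining):
        return None  # no feasible grasp covers anything still needed
    return best


def update_remaining(covered: Set[str], remaining: Set[str], success: bool) -> Set[str]:
    """Mark the grasp's scene conditions complete only if the trial succeeded."""
    return remaining - covered if success else set(remaining)


# Each grasp maps to the scene conditions it was generated to satisfy.
feasible_grasps = {
    "g1": {"wall"},
    "g2": {"wall", "shelf"},
    "g3": {"box"},
}
remaining_conditions = {"wall", "shelf", "box"}

first = select_grasp(feasible_grasps, remaining_conditions)
after_success = update_remaining(feasible_grasps[first], remaining_conditions, success=True)
second = select_grasp(feasible_grasps, after_success)
```

In this toy run, one successful trial of `g2` retires both `wall` and `shelf`, leaving only `box` to cover. A failed grasp would leave the remaining set unchanged, and in practice it would also be dropped from the feasible set so it is not retried.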

4 · Execute

The robot executes the selected grasp by closing the hand, lifting the object, and holding it. Safety monitoring stops the trial if unexpected contact is detected. AutoDex records the synchronized videos and robot states, tracks the object motion, and labels the trial as a success or failure.
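The page does not state the exact success criterion, so the following is only one possible reading of lift-and-hold labeling: a trial succeeds if the tracked object stays above a minimum height over the table for the whole final hold window. The function name, the thresholds, and the use of object height alone are all assumptions.

```python
from typing import Sequence


def label_lift_and_hold(
    object_heights_m: Sequence[float],
    table_height_m: float,
    min_lift_m: float = 0.05,
    hold_frames: int = 30,
) -> bool:
    """Return True if the object stays lifted during the last `hold_frames` samples."""
    if len(object_heights_m) < hold_frames:
        return False  # not enough tracking data to confirm a hold
    hold_window = object_heights_m[-hold_frames:]
    return all(h - table_height_m >= min_lift_m for h in hold_window)


# Toy trajectories (heights in meters, table at 0.0).
held = [0.0] * 5 + [0.10] * 10
dropped = [0.0] * 5 + [0.10] * 5 + [0.0] * 5

held_label = label_lift_and_hold(held, table_height_m=0.0, hold_frames=10)
dropped_label = label_lift_and_hold(dropped, table_height_m=0.0, hold_frames=10)
```

A criterion based on tracked object motion, rather than on hand state alone, also catches cases where the fingers close but the object slips out during the lift.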

5 · Reset ↻

When the current stable pose runs out of feasible candidates, AutoDex moves the object to another stable pose with remaining candidates. In the video, reset is planned by merging the pickup and placement constraints into one collision scene. A collision-free reset grasp can pick up the object, reorient it, and place it at the target pose, keeping the collection loop running without manual intervention.

AutoDex in Continuous Operation

AutoDex runs uninterrupted grasp-collection loops on the real robot; the videos are shown at 15× speed.

3,593 Physically Labeled Grasp Trials

Through autonomous real-robot execution, AutoDex collects dexterous-grasp trials across 100 objects, two hands, and multiple scene conditions. Each trial stores the executed grasp configuration, synchronized robot trajectory, 20-view RGB video, tracked object motion, and lift-and-hold success-or-failure label.
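A trial record with these contents could be represented roughly as below. The released data format is not described on this page, so the class name, field names, and the `retrieve_successes` helper are hypothetical; the sketch only mirrors the listed contents and shows how successful trials might be looked up for retrieval-based execution.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class GraspTrial:
    object_id: str
    hand: str                            # e.g. "allegro" or "inspire"
    scene_conditions: List[str]          # scene conditions this grasp covers
    grasp_configuration: List[float]     # executed hand joint configuration
    robot_trajectory: List[List[float]]  # time-synchronized arm joint states
    video_paths: List[str]               # one path per camera view (20 in the workcell)
    object_poses: List[List[float]]      # tracked object motion
    success: bool                        # lift-and-hold label

    def to_json(self) -> str:
        """Serialize the record to a JSON string."""
        return json.dumps(asdict(self))


def retrieve_successes(
    trials: List[GraspTrial], object_id: str, condition: str
) -> List[GraspTrial]:
    """Return successful trials for an object under a given scene condition."""
    return [
        t for t in trials
        if t.success and t.object_id == object_id and condition in t.scene_conditions
    ]


def make_trial(object_id: str, conditions: List[str], success: bool) -> GraspTrial:
    """Build a placeholder record for the usage example below."""
    return GraspTrial(
        object_id=object_id,
        hand="allegro",
        scene_conditions=conditions,
        grasp_configuration=[0.0] * 16,
        robot_trajectory=[[0.0] * 6],
        video_paths=[f"cam{i:02d}.mp4" for i in range(20)],
        object_poses=[[0.0] * 7],
        success=success,
    )


database = [
    make_trial("mug", ["shelf", "wall"], True),
    make_trial("mug", ["shelf"], False),
    make_trial("bottle", ["shelf"], True),
]
shelf_grasps_for_mug = retrieve_successes(database, "mug", "shelf")
```

Keeping failed trials in the same records, rather than discarding them, preserves the negative labels that distinguish this data from unvalidated candidates.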

BibTeX

@misc{choi2026autodex,
  title  = {AutoDex: An Automated Real-World System for Dexterous Grasping Data Collection},
  author = {Choi, Mingi and Kim, Gunhee and Kim, Jisoo and Kim, Taeksoo and
            Ha, Taeyun and Lim, Jongbin and Joo, Hanbyul},
  year   = {2026},
  eprint = {2606.23689},
  archivePrefix = {arXiv},
  primaryClass  = {cs.RO}
}