Part II: Hands — Robot and Human

Chapter 6: Human Hand Data Collection — Teaching by Demonstration

Written: 2026-04-01 Last updated: 2026-06-09

Overview

The most intuitive way to teach robots to manipulate is to show them human demonstrations. Yet transferring the 27-DoF motion and tactile information of the human hand to a robot presents fundamental challenges. This chapter covers human hand modeling (MANO [#17]), motion tracking gloves, tactile gloves, exoskeletons, and teleoperation systems, surveying the data pipeline from human to robot.

After reading this chapter, you will be able to:

  • Describe the MANO hand model's structure and applications.
  • Compare the characteristics of major motion tracking gloves.
  • Understand tactile glove designs (STAG, OSMO [#18]) and their cross-embodiment potential.
  • Evaluate the strengths and limitations of exoskeleton and teleoperation approaches.

6.1 The Human Hand Model: MANO (778 Vertices, 16 Joints)

MANO [1] is a statistical human hand model learned from 1,000 3D scans (SIGGRAPH Asia 2017):

  • 778 vertices, 16 joints
  • PCA shape space: Low-dimensional representation of inter-individual hand size/shape variation
  • Pose blend shapes: Skin deformation modeling as a function of joint angles
  • Compatible with the SMPL full-body model (SMPL+H)

MANO underpins much of human hand research, including hand pose estimation, human-robot retargeting, and tactile transfer (for example, UniTacHand's MANO UV map) (→ Chapter 11.4).
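The shape and pose blend-shape terms listed above can be sketched as a linear model over the 778-vertex template. This is a minimal NumPy illustration, not the official MANO code: the arrays are random placeholders for the learned template and blend-shape bases, the 10 shape coefficients and the 15 non-root joints driving the pose term are assumptions about the released model, and linear blend skinning (the step that actually articulates the mesh) is omitted.

```python
import numpy as np

N_VERTS, N_JOINTS = 778, 16   # counts quoted in this section
N_SHAPE = 10                  # assumption: number of PCA shape coefficients


def rodrigues(axis_angle):
    """Convert one axis-angle vector (3,) into a 3x3 rotation matrix."""
    theta = np.linalg.norm(axis_angle)
    if theta < 1e-8:
        return np.eye(3)
    k = axis_angle / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def blend_vertices(template, shape_dirs, pose_dirs, betas, joint_axis_angles):
    """Apply shape and pose blend shapes to the rest-pose template (no skinning)."""
    # Shape term: (778, 3, 10) @ (10,) -> per-vertex offsets (778, 3).
    shaped = template + shape_dirs @ betas
    # Pose term: driven by (R - I) of every non-root joint, flattened to 15 * 9.
    rots = np.stack([rodrigues(a) for a in joint_axis_angles[1:]])
    pose_feature = (rots - np.eye(3)).reshape(-1)
    return shaped + pose_dirs @ pose_feature


# Random placeholders standing in for the learned model parameters.
rng = np.random.default_rng(0)
template = rng.normal(size=(N_VERTS, 3))
shape_dirs = rng.normal(size=(N_VERTS, 3, N_SHAPE))
pose_dirs = rng.normal(size=(N_VERTS, 3, (N_JOINTS - 1) * 9))

rest = blend_vertices(template, shape_dirs, pose_dirs,
                      np.zeros(N_SHAPE), np.zeros((N_JOINTS, 3)))
```

With zero shape coefficients and zero joint rotations both correction terms vanish, so the output is the template itself; this is why the pose term is written in terms of R - I rather than R.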


6.2 Motion Tracking Gloves: From Stretchable Sensors to Commercial Products

Seminar 2 (Taejoon) systematically reviewed the current state of motion tracking gloves.

Stretchable Liquid-Metal Sensor Glove (2024)

Published in Nature Communications [2024]:

  • 9 eGaIn (eutectic gallium-indium) liquid-metal sensors
  • 9 DoF tracking, adapts to all hand sizes (one-size-fits-all)
  • Joint angle error: 4.16 degrees, fingertip position error: 4.02 mm
  • Bayesian refinement + Kalman filter
Key Paper: 2024 study. "Stretchable Liquid-Metal Sensor Glove." Nature Communications. A 9-sensor eGaIn glove adapting to all hand sizes. The 4.02 mm fingertip error is sufficient for most manipulation demonstrations.
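To make the Kalman-filter step concrete, the sketch below smooths a noisy joint-angle stream with a scalar random-walk Kalman filter. It is an illustration under stated assumptions, not the paper's pipeline: the Bayesian refinement stage is not reproduced, and the process variance, measurement variance, and simulated 4-degree noise are made-up values.

```python
import numpy as np


def kalman_smooth(measurements, process_var=0.01, meas_var=16.0):
    """Scalar random-walk Kalman filter; returns the filtered estimates.

    process_var and meas_var are illustrative (degrees squared), not from the paper.
    """
    x = float(measurements[0])   # state estimate: joint angle in degrees
    p = meas_var                 # estimate variance
    estimates = []
    for z in measurements:
        p = p + process_var              # predict: the angle may drift slightly
        gain = p / (p + meas_var)        # update: weight of the new measurement
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return np.array(estimates)


# Simulated sensor stream: constant 30-degree flexion plus Gaussian noise.
rng = np.random.default_rng(0)
true_angle = 30.0
noisy = true_angle + rng.normal(0.0, 4.0, size=400)
filtered = kalman_smooth(noisy)
```

A small process variance trusts the motion model and smooths heavily; raising it makes the filter follow fast finger motion at the cost of passing more sensor noise through.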

Commercial Glove Comparison

Seminar 2 compared three commercial gloves:

Glove        | Sensor Type  | Sensors | DoF | Notes
Rokoko       | IMU          | 7       | 6   | Lightest solution
Manus        | Stretch/flex | -       | 16  | NVIDIA Isaac Teleop official glove (GTC 2026)
StretchSense | EMF          | -       | 25  | Highest DoF

NVIDIA's designation of Manus as the official data glove for Isaac Teleop at GTC 2026 signals industrial standardization.

Korean Research: ML-Based Wearable Sensors

Seoul National University research [2024, PMC] implemented real-time hand motion recognition with ML-based wearable sensors, bridging the gap between lab research and practical applications.


6.3 Tactile Gloves: STAG, OSMO, and the Open-Source Approach

Beyond motion tracking, collecting tactile information during human grasping is the next step.

STAG (2019)

Sundaram et al. [3] (Nature, 2019):

  • 548 piezoresistive sensors: High-resolution pressure distribution across the human hand
  • Grasp-finger correlation analysis
  • Pioneering tactile demonstration dataset for robot learning

Ruppel et al. [4] proposed a 169-sensor reduced version, exploring the trade-off between sensor count and information loss.

OSMO Glove (2025)

OSMO [arXiv:2512.08920] takes the innovative approach of using the same glove on both human and robot hands:

  • 12 three-axis magnetic sensors: Simultaneous normal and shear force measurement
  • MuMetal shielding against external magnetic interference
  • Core concept — Embodiment Bridge: Using identical sensing gloves on human and robot simplifies the cross-embodiment problem
  • Open-source: Reproducible design
Key Paper: OSMO. (2025). "OSMO: Open-Source Multi-axis Tactile Glove." arXiv:2512.08920. A 12-sensor three-axis magnetic tactile glove usable on both human and robot hands. Presents a new paradigm for cross-embodiment tactile transfer.

Key insight from Seminar 2: Using identical sensing gloves on human and robot simplifies the cross-embodiment problem. When tactile data is collected from human demonstrations and the same glove is mounted on a robot hand for policy learning, both datasets come from the same sensor, so the tactile domain gap is largely removed (→ Chapter 11.4).
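The value of three-axis sensing is that a single reading separates into a normal component and a shear component. The sketch below shows that decomposition as a vector projection. It assumes the raw sensor signal has already been calibrated into a force vector in the sensor frame and that the local surface normal is known; neither step is taken from the OSMO release, and the function name is hypothetical.

```python
import numpy as np


def split_normal_shear(force_xyz, surface_normal):
    """Split a 3-axis force into (normal magnitude, shear vector).

    The normal magnitude is the projection onto the unit surface normal;
    the shear vector is whatever remains in the tangent plane.
    """
    n = np.asarray(surface_normal, dtype=float)
    n = n / np.linalg.norm(n)
    f = np.asarray(force_xyz, dtype=float)
    normal_mag = float(f @ n)
    shear_vec = f - normal_mag * n
    return normal_mag, shear_vec


# Example reading in newtons, with the sensor's z axis as the surface normal.
normal_mag, shear_vec = split_normal_shear([1.0, 2.0, 3.0], [0.0, 0.0, 2.0])
```

Here the normal component is 3 N and the shear vector is (1, 2, 0) N; a pressure-only sensor would report the first number and lose the second entirely.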

Figure 6.1: OSMO tactile glove concept — full-hand tactile coverage via 3-axis magnetic sensors; the same glove is worn by human and robot alike, enabling the "glove as interface" cross-embodiment strategy. Source: Yin et al., 2025, arXiv:2512.08920, Fig. 1.
Figure 6.2: OSMO is compatible with diverse in-the-wild hand trackers: Aria Gen 2, Quest 3, Apple Vision Pro, RGB videos, and the Manus glove. Source: Yin et al., 2025, arXiv:2512.08920, Fig. 2.

TacCap (2025)

TacCap [arXiv, Mar 2025] implements FBG (Fiber Bragg Grating) optical tactile sensors in a fingertip thimble form factor:

  • FBG fiber-optic tactile sensors: High sensitivity and fast response
  • Mountable on both human and robot fingers: Cross-embodiment approach similar to OSMO
  • EMI immunity: Optical sensing is unaffected by electromagnetic interference, making the sensor usable in MRI environments or near strong electromagnetic fields
  • Thimble form factor for easy attachment to existing gloves or robot fingertips
TacCap occupies a complementary position to OSMO's magnetic sensors. In environments where magnetic sensors are vulnerable to external fields, FBG optical sensing provides a robust alternative.

VTDexManip (2025)

VTDexManip [ICLR 2025] is the first large-scale visual-tactile human demonstration dataset:

  • 565,000 frames: Simultaneous visual and tactile data collection
  • 10 tasks, 182 objects: Covering diverse manipulation scenarios
  • First visual-tactile human demonstration dataset: While prior datasets provided either visual or tactile data, this integrates both modalities
  • Serves as a benchmark for cross-embodiment policy learning
VTDexManip provides a concrete answer to "how to utilize demonstration data collected with tactile gloves." The 565K frames represent sufficient scale for large-scale imitation learning.

FSR Optimization

Tang et al. [19] and Chen et al. [20] addressed the optimal placement of FSR sensors, exploring how to extract maximum information from a limited number of sensors. Their work speaks to the trade-off between high spatial resolution and an optimized sensor count and position.
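One generic way to frame the sensor-count question is greedy forward selection: repeatedly add the sensor whose inclusion best lets the chosen subset reconstruct the full array by least squares. The sketch below is that generic technique on synthetic data, not the method of Tang et al. or Chen et al.; the function names and the four-channel toy array are hypothetical.

```python
import numpy as np


def reconstruction_error(X, cols):
    """Squared error when all channels are linearly predicted from X[:, cols]."""
    S = X[:, cols]
    coef, *_ = np.linalg.lstsq(S, X, rcond=None)
    return float(np.sum((X - S @ coef) ** 2))


def greedy_select(X, k):
    """Pick k sensor indices by greedy forward selection on reconstruction error."""
    selected = []
    for _ in range(k):
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        best = min(remaining,
                   key=lambda j: reconstruction_error(X, selected + [j]))
        selected.append(best)
    return selected


# Synthetic array: four channels that carry only two independent signals.
rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 200))
X = np.column_stack([a, b, a + b, 2.0 * a])
chosen = greedy_select(X, 2)
```

Because the toy array has rank two, two well-chosen sensors reconstruct all four channels almost exactly; on real grasp data the error curve over k shows where adding sensors stops paying off.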


6.4 Exoskeleton Approaches: DexUMI [#8], ExoStart [#9], DEXOP [#10]

Exoskeletons mechanically connect to the human hand, directly capturing motion.

DexUMI (2025)

Xu et al. [6], "Human Hand as Universal Interface":

  • Wearable exoskeleton resolves the kinematic gap
  • SAM2 inpainting resolves the visual gap — erasing the human hand from camera images and replacing it with the robot hand
  • 86% success on Inspire and XHand
  • 3.2x faster data collection than teleoperation
Figure 6.3: DexUMI transfers dexterous human manipulation skills to various robot hands. Demonstrations span long-horizon, contact-rich, multi-finger, and precise skill categories — 86% average success. Source: Xu et al., 2025, arXiv:2505.21864, Fig. 1.
Figure 6.4: DexUMI exoskeleton design — Inspire Hand (left) and XHand (right) share the same joint-to-fingertip mapping. Wrist motion and joint actions are recorded alongside visual and tactile observations. Source: Xu et al., 2025, arXiv:2505.21864, Fig. 2.

ExoStart (2025)

Si et al. [7] learn dexterous manipulation policies from just 10 exoskeleton demonstrations:

  1. ~10 exoskeleton demos
  2. MuJoCo dynamics filtering
  3. Auto-curriculum RL
  4. ACT vision student
  5. Zero-shot real → >50% success on 6 of 7 tasks

This pipeline exemplifies Real-Sim-Real transfer (→ Chapter 10.4).

Figure 6.5: ExoStart framework overview — (a) human demonstration with a sensorized exoskeleton, (b) MuJoCo dynamics filtering, (c) auto-curriculum RL with sim-to-real distillation. Source: Si et al., 2025, arXiv:2506.11775, Fig. 1.

DEXOP (2025)

Fang et al. [8] (DEXOP) use a four-bar linkage to mechanically couple human and robot fingers directly:

  • 8x faster data collection than teleoperation
  • Direct contact feedback
  • 51.3% success vs. 42.5% for teleoperation
  • Variants include DEXOP-12 (4 fingers, 12 DOF), DEXOP-9 (3 fingers, no ring), DEXOP-6 (dual 3-finger bimanual)
Figure 6.6: DEXOP hardware overview — whole-hand tactile sensors (left), whole-hand dexterous perioperation (middle), contact-rich long-horizon bimanual tasks (right). Source: Fang et al., 2025, arXiv:2509.04441, Fig. 1.

AirExo / AirExo-2 (SJTU, 2024-2025)

AirExo [ICRA 2024] and AirExo-2 [CoRL 2025] are low-cost passive exoskeletons developed at Shanghai Jiao Tong University:

  • Approximately $300 fabrication cost: Constructed from 3D-printed parts and low-cost sensors
  • Passive actuation: Records human motion without motors
  • In-the-wild human demonstration collection: Enables data capture outside laboratory settings in everyday environments
  • Key finding from AirExo-2: 3 min teleop + in-the-wild data >= 20 min teleop only — empirically demonstrating that expensive teleoperation data can be supplemented or replaced by natural environment demonstrations
  • Full upper-body exoskeleton covering both arm and hand
The AirExo series exemplifies the democratization of data collection. The finding that in-the-wild data from a $300 exoskeleton can complement or replace costly teleoperation data presents a new solution to the scalability problem of robot learning data.

ACE (UCSD, CoRL 2024)

ACE [CoRL 2024] is a universal teleoperation interface developed at the University of California San Diego:

  • Hand-facing camera + exoskeleton: Tracks finger poses via hand-facing camera while capturing arm motion through the exoskeleton
  • Single system supports diverse robot platforms: Enables teleoperation of humanoids, robot arms, grippers, and quadrupeds
  • Cross-embodiment switching occurs at the software level, requiring no hardware changes
  • Intuitive operation: Natural human motions map directly to robot actions
ACE's key contribution is enabling control of any robot through a single interface. This maximizes the reusability of collected human demonstration data.

NuExo (Nubot Lab, ICRA 2025)

NuExo [ICRA 2025] is an active exoskeleton system developed at Nubot Lab:

  • 5.2 kg backpack-style active exoskeleton: Motor-driven with haptic feedback
  • 100% upper-limb ROM (Range of Motion): Captures the full range of human upper-limb motion without restriction
  • Successful 2.5 mm screw tightening: Performs extremely precise manipulation tasks remotely
  • Backpack form factor ensures mobility across diverse environments
NuExo takes the opposite approach from passive exoskeletons (AirExo). Active actuation and haptic feedback push the upper bound of precision, specializing in fine manipulation demonstration collection.

HumanoidExo (NUDT, 2025)

HumanoidExo [arXiv, Oct 2025] is a full-body exoskeleton developed at the National University of Defense Technology (NUDT):

  • Lightweight exoskeleton + LiDAR: Exoskeleton captures upper body/arm motion; LiDAR captures lower body/locomotion trajectories
  • Full-body trajectory collection: Simultaneously records upper-limb manipulation and lower-limb locomotion
  • Locomotion learning from exoskeleton data alone: Directly learns locomotion policies from human demonstrations without separate gait simulation
  • Optimized for full-body teleoperation of humanoid robots
HumanoidExo extends the application scope of exoskeletons from hands/arms to the entire body. As humanoid robots approach commercialization, the importance of full-body demonstration collection infrastructure is growing.
Key Perspective: DexUMI, ExoStart, and DEXOP each bridge the human-robot gap differently, but all share the goal of overcoming teleoperation's throughput bottleneck (~10 demos/hr). AirExo addresses cost barriers, ACE tackles platform compatibility, NuExo pushes precision limits, and HumanoidExo enables full-body applications.
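The throughput figures quoted above translate directly into collection time. The arithmetic sketch below assumes the ~10 demos/hr teleoperation baseline applies to every system and treats each reported speedup as a simple multiplier on it. That is a simplification: each speedup was measured on its own paper's tasks, and the 1,000-demo target is an arbitrary example.

```python
BASELINE_DEMOS_PER_HOUR = 10.0   # teleoperation rate quoted in this chapter

# Speedups relative to teleoperation as reported in this section.
SPEEDUPS = {"teleoperation": 1.0, "DexUMI": 3.2, "DEXOP": 8.0}


def hours_to_collect(n_demos, speedup, baseline=BASELINE_DEMOS_PER_HOUR):
    """Operator hours needed to record n_demos at baseline * speedup demos/hr."""
    return n_demos / (baseline * speedup)


# Hypothetical target of 1,000 demonstrations.
hours = {name: hours_to_collect(1000, s) for name, s in SPEEDUPS.items()}
```

Under these assumptions the same 1,000 demonstrations take 100 hours by teleoperation, 31.25 hours at DexUMI's rate, and 12.5 hours at DEXOP's rate.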

6.5 Large-Scale Data: From Internet Videos to Egocentric Capture

Shaw et al. [2024, CMU] proposed extracting human hand motions from internet videos and retargeting them to robot hands. The potential lies in scalability — millions of hand manipulation videos exist online, and converting them to robot learning data could fundamentally solve the teleoperation bottleneck.

ImMimic [14] augments data by interpolating between large-scale human trajectories and a few teleoperation trajectories. As discussed in Seminar 1, this represents the direction of synergistically using human data instead of expensive teleop data (→ Chapter 11.2).
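The interpolation idea can be illustrated with a convex blend of two trajectories. This is a minimal sketch, not the ImMimic implementation: it assumes the human and robot trajectories are already time-aligned, have the same length, and live in a shared action space, and it does not reproduce how the paper pairs or aligns trajectories.

```python
import numpy as np


def interpolate_trajectories(human_traj, robot_traj, alpha):
    """Blend two aligned trajectories: alpha=0 gives human, alpha=1 gives robot."""
    human = np.asarray(human_traj, dtype=float)
    robot = np.asarray(robot_traj, dtype=float)
    if human.shape != robot.shape:
        raise ValueError("trajectories must be aligned to the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * human + alpha * robot


# Toy 3-step, 2-D trajectories (hypothetical values).
human = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
robot = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
halfway = interpolate_trajectories(human, robot, 0.5)
```

Sweeping alpha produces a family of intermediate trajectories that fills the space between abundant human data and scarce teleoperation data.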

DexH2R[15] implements task-oriented human-to-robot dexterous transfer, mapping the intent of human demonstrations to robot actions.

EgoScale and the 2026 Shift Toward Large-Scale Hand Data

NVIDIA GEAR's EgoScale pushes this trend further in 2026. EgoScale pretrains a VLA model on more than 20,000 hours of action-labeled egocentric human video and reports a log-linear relationship between human-data scale and downstream dexterous robot performance [31]. The key point is not merely collecting more robot data, but treating first-person human hand video as a reusable motor prior.
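A log-linear relationship means performance grows by a roughly constant amount for every tenfold increase in data. The sketch below fits such a curve with NumPy. The data points are synthetic and generated from a made-up line purely to show the fitting step; they are not EgoScale results.

```python
import numpy as np


def fit_log_linear(hours, scores):
    """Fit score = intercept + slope * log10(hours); returns (intercept, slope)."""
    slope, intercept = np.polyfit(np.log10(hours), scores, deg=1)
    return float(intercept), float(slope)


# Synthetic illustration only: scores generated from a made-up line.
hours = np.array([100.0, 1000.0, 10000.0])
scores = 0.2 + 0.1 * np.log10(hours)
intercept, slope = fit_log_linear(hours, scores)
```

The fitted slope is the gain per decade of data, which is the practical reading of the trend: doubling a dataset buys far less than scaling it by an order of magnitude.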

This changes the data strategy discussed in this chapter. Gloves and exoskeletons still provide precise hand and force data, but not every task can be teleoperated at robot scale. A practical hand-data stack is likely to have three layers:

  • large-scale egocentric video for hand motion, object, and task diversity;
  • a smaller amount of aligned human-robot data for embodiment-specific mid-training;
  • tactile/force-rich specialist data for slip, insertion, wiping, cap tightening, and other contact-quality skills.

EgoScale therefore does not replace tactile data. It suggests that the practical route is large-scale hand-motion priors plus smaller tactile/force-rich datasets.

EgoDex (Apple, 2025)

EgoDex [29] is a large-scale egocentric hand manipulation dataset leveraging Apple Vision Pro and ARKit:

  • 829 hours, 90M (90 million) frames: The largest hand manipulation dataset to date
  • 194 tasks: Spanning from everyday object manipulation to tool use
  • 30 Hz per-finger tracking: Real-time 3D trajectory recording for each finger via ARKit hand tracking
  • Collected with consumer hardware (Apple Vision Pro), ensuring scalability
EgoDex opens a new middle ground between internet video and teleoperation approaches. It is as large-scale as video data while providing accurate 3D finger trajectories like teleoperation. The 829-hour scale exceeds existing robot demonstration datasets by orders of magnitude, approaching the data volume required for foundation model training.
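The hour and frame counts above can be cross-checked with simple arithmetic, assuming every recorded hour is captured at the quoted 30 Hz rate.

```python
HOURS = 829            # dataset duration quoted above
FRAME_RATE_HZ = 30     # tracking rate quoted above; assumed to hold for all footage

frames = HOURS * 3600 * FRAME_RATE_HZ
frames_in_millions = frames / 1e6
```

The product is 89,532,000 frames, which matches the stated 90M figure after rounding.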

6.6 Teleoperation: AnyTeleop, DexPilot, Bunny-VisionPro

Teleoperation remains the most traditional collection method and yields the highest data quality.

AnyTeleop (2023)

Qin et al. [2023, RSS] built a general-purpose vision-based teleoperation system:

  • Dex-Retarget: Maps human keypoints to robot joint positions
  • Compatible with diverse robot hands
  • As discussed in Seminar 1, naive retargeting has limitations — kinematic differences between human and robot can violate physical feasibility
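Keypoint retargeting and its feasibility limit can be illustrated on a toy planar finger. The sketch below solves for joint angles that bring a two-link robot fingertip to a human fingertip target using damped least squares with joint-limit clipping. It is not the Dex-Retarget implementation: the link lengths, joint limits, solver settings, and function names are all hypothetical.

```python
import numpy as np

LINKS = np.array([0.04, 0.03])                 # hypothetical link lengths (m)
Q_MIN = np.array([0.0, 0.0])                   # hypothetical joint limits (rad)
Q_MAX = np.array([np.pi / 2, np.pi / 2])


def fingertip(q):
    """Forward kinematics of the planar two-link finger."""
    a1, a12 = q[0], q[0] + q[1]
    return np.array([LINKS[0] * np.cos(a1) + LINKS[1] * np.cos(a12),
                     LINKS[0] * np.sin(a1) + LINKS[1] * np.sin(a12)])


def jacobian(q):
    """Analytic Jacobian of fingertip position with respect to joint angles."""
    a1, a12 = q[0], q[0] + q[1]
    return np.array([
        [-LINKS[0] * np.sin(a1) - LINKS[1] * np.sin(a12), -LINKS[1] * np.sin(a12)],
        [LINKS[0] * np.cos(a1) + LINKS[1] * np.cos(a12), LINKS[1] * np.cos(a12)],
    ])


def retarget(target_xy, iters=200, damping=1e-6, step=0.5):
    """Return (joint angles, residual distance) for one fingertip target."""
    q = 0.5 * (Q_MIN + Q_MAX)                  # start mid-range
    for _ in range(iters):
        err = target_xy - fingertip(q)
        J = jacobian(q)
        dq = np.linalg.solve(J.T @ J + damping * np.eye(2), J.T @ err)
        q = np.clip(q + step * dq, Q_MIN, Q_MAX)   # enforce joint limits
    return q, float(np.linalg.norm(target_xy - fingertip(q)))


# A target the robot finger can reach, and one beyond its 0.07 m total length.
reachable = fingertip(np.array([0.5, 0.7]))
q_ok, res_ok = retarget(reachable)
q_far, res_far = retarget(np.array([0.10, 0.0]))
```

The second call shows the limitation in miniature: when the human fingertip lies outside the robot finger's workspace, no joint configuration matches it and the solver returns a pose with a residual of at least 3 cm.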

DexPilot (2020)

Handa & Van Wyk [2020, NVIDIA] achieved 23-DoA (degrees of actuation) teleoperation from bare-hand depth images. The system requires only RGB-D sensing of the bare hand, with no gloves or markers, which maximizes accessibility.

Bunny-VisionPro (2024)

Ding et al. [2024, UCSD] implemented bimanual teleoperation via Apple Vision Pro with haptic feedback, achieving research-grade teleoperation using consumer hardware.

DexCap (2024)

Wang et al. [2024, Stanford] created a portable mocap system enabling 3x faster data collection than teleoperation, with policy learning from 30 minutes of data.

DOGlove (2025)

Zhang et al. [2025, RSS, arXiv:2502.07730] designed DOGlove, a low-cost open-source haptic feedback glove:

  • 21-DoF motion capture + 5-DoF haptic force feedback: Faithfully replicates human hand kinematics
  • Retargets to Shadow Hand, LEAP Hand, and other multi-finger robots
  • Haptic feedback conveys contact cues to the operator during teleoperation
Figure 6.7: DOGlove — 21-DoF motion capture paired with 5-DoF haptic force feedback in a low-cost open-source form factor. Teleoperation benchmarks show a clear advantage of haptic-enabled tasks over vision-only baselines. Source: Zhang et al., 2025, RSS, arXiv:2502.07730, Fig. 1.

Feel Robot Feels (2026)

Feel Robot Feels [17] is a tactile feedback array glove that closes the haptic loop, letting operators directly feel what the robot touches.

UMI-FT [30] occupies a unique position in this landscape: a handheld demonstration device that preserves human dexterity with natural haptic feedback (no teleoperation latency), collects 6-axis force/torque data via CoinFT sensors, and scales to in-the-wild environments. Hundreds of people can collect demonstrations daily without requiring robots or trained operators. The embodiment gap is small because the device mimics the robot's gripper form factor [30].

Figure 6.8: Data collection strategy comparison — teleoperation vs. video learning vs. UMI-style handheld. Source: Choi, SNU Data Science Seminar 2026.
System          | Input Device        | Haptic Feedback | Throughput         | Cost
AnyTeleop       | RGB camera          | None            | Baseline           | Low
DexPilot        | RGB-D camera        | None            | Baseline           | Low
Bunny-VisionPro | Vision Pro          | Yes             | Baseline           | Medium
DexCap          | Motion capture      | None            | 3x                 | Medium
DexUMI          | Exoskeleton         | Direct contact  | 3.2x               | Medium
DEXOP           | 4-bar linkage       | Direct contact  | 8x                 | Low
AirExo          | Passive exoskeleton | None            | High (in-the-wild) | Very low (~$300)
ACE             | Camera+exoskeleton  | None            | Baseline           | Medium
NuExo           | Active exoskeleton  | Yes (haptic)    | Baseline           | High
EgoDex          | Vision Pro + ARKit  | None            | Very high (829h)   | Medium
Internet video  | None (observation)  | None            | Unlimited          | Very low

Summary and Outlook

Human hand data collection sits on a trade-off between throughput and data quality. Teleoperation provides high quality but scales poorly at ~10 demos/hr; internet video offers unlimited scale but lacks action labels and tactile information. The OSMO glove's Embodiment Bridge and DEXOP's mechanical coupling propose new solutions to this trade-off.

AirExo's $300 passive exoskeleton and EgoDex's 829-hour Vision Pro dataset are simultaneously advancing democratization and scaling of data collection. Furthermore, VTDexManip's 565K-frame visual-tactile dataset and TacCap's EMI-immune FBG sensors are accelerating the practical deployment of tactile-inclusive demonstration collection.

NVIDIA Isaac Teleop + MANUS standardization, internet video mining at scale, and extending synthetic data generation (780K trajectories in 11 hours) to the tactile domain are the key directions for resolving the data bottleneck.

The next chapter examines how robots learn to manipulate from such collected data (→ Chapter 8: Contact Dynamics).


6.8 Collection Strategy: Teleop-Free Data, Co-Training, and Tactile-Rich Specialists

The related S2 survey [32] (Parts I and II) sharpens this chapter's data-collection argument. Human hand data is not one category. Egocentric video, stretchable gloves, tactile gloves, passive exoskeletons, and handheld grippers each provide different cost/modality tradeoffs. "Large-scale data" is therefore not enough; the role of each data type must be defined.

The useful S2 frame has three layers. First, teleop-free Data B collects task diversity and human strategies quickly. Second, a smaller amount of Data A provides the action manifold that the target robot can actually execute. Third, tactile-rich specialist data raises the ceiling on tasks such as wiping, insertion, cap tightening, and sequential multi-object grasping, where contact quality determines success.

This also fits EgoScale. More than 20,000 hours of egocentric video can build a strong hand-motion prior, but without tactile and force signals it cannot explain the last 20-30% of contact-rich failures. Conversely, collecting everything with tactile gloves limits scale. A practical strategy layers large vision-only priors, medium-scale human tactile data, and a small amount of executable robot data.
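In training terms, layering usually means sampling each batch from the three sources at fixed ratios so the small datasets are not drowned out by the large one. The sketch below allocates batch slots by weight using largest-remainder rounding. The dataset names and the 6:3:1 ratio are hypothetical placeholders, not values from EgoScale or the S2 survey.

```python
def batch_allocation(weights, batch_size):
    """Split batch_size across datasets in proportion to integer weights."""
    total = sum(weights.values())
    exact = {name: batch_size * w / total for name, w in weights.items()}
    counts = {name: int(v) for name, v in exact.items()}
    # Hand out the slots lost to rounding down, largest remainder first.
    leftover = batch_size - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical mixture: vision-only prior, human tactile data, robot data.
MIX = {"egocentric_video": 6, "human_tactile": 3, "robot_executable": 1}
counts = batch_allocation(MIX, 200)
```

With a batch of 200 this yields 120, 60, and 20 samples respectively, so the smallest dataset is revisited far more often per example than the video prior.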

For the Cosmax-style problem, this distinction is essential. Sequential multi-object grasping cannot infer safe contact transitions from video alone, while pure teleoperation cannot collect enough diversity. The system should learn candidate strategies from natural human data and verify force closure and slip margin on the robot hand through tactile feedback.
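Slip margin can be made concrete with the Coulomb friction model: a contact holds while the shear force stays below the friction coefficient times the normal force. The sketch below computes that margin from one tactile reading. The friction coefficient is an assumed, object-dependent value, and the function name is hypothetical.

```python
import math


def slip_margin(normal_force, shear_force_xy, friction_coeff):
    """Coulomb slip margin in newtons: positive means the contact should hold."""
    shear_mag = math.hypot(shear_force_xy[0], shear_force_xy[1])
    return friction_coeff * normal_force - shear_mag


# Hypothetical readings with an assumed friction coefficient of 0.5.
holding = slip_margin(4.0, (1.0, 1.0), 0.5)
slipping = slip_margin(1.0, (1.0, 0.0), 0.5)
```

The first contact has about 0.59 N of margin; the second is 0.5 N past the friction limit. Video alone provides neither input to this calculation, which is why the check has to run on the robot hand's tactile signal.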

Operational Reading Note

The practical value of this chapter is not only the concept of human-hand data collection; it is the set of engineering decisions that the concept changes. A deployable robot-hand project should start by asking what state becomes observable after this chapter is applied. The answer should be concrete: contact existence, contact patch, normal force, shear direction, slip margin, object pose, task phase, operator override, or product-damage risk. If a variable cannot be logged or consumed by a controller, it remains an explanatory idea rather than a system capability.

The second decision is the unit of evidence. Research demos often report one success metric, but tactile manipulation improves through failure records. A useful attempt record contains the object or SKU, the selected grasp candidate, the robot hand and sensor configuration, calibration version, task phase, tactile summary, policy action, safety intervention, and final outcome. This record is what connects the sensor chapters to the data chapter, the control chapters to the learning chapters, and the manufacturing chapters to QA.
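The attempt record described above maps naturally onto a small typed structure that serializes to one log line per attempt. The sketch below is one possible layout: the field names follow the list in the paragraph, but the types, the example values, and the JSON-lines format are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AttemptRecord:
    """One grasp or manipulation attempt, logged whether it succeeds or fails."""
    sku: str                    # object or SKU identifier
    grasp_candidate: str        # which candidate grasp was selected
    hand_config: str            # robot hand model and variant
    sensor_config: str          # tactile sensor layout in use
    calibration_version: str    # calibration the readings depend on
    task_phase: str             # phase at which the outcome was decided
    tactile_summary: dict       # compact contact statistics
    policy_action: list         # commanded action at the decision point
    safety_intervention: bool   # whether a safety layer overrode the policy
    outcome: str                # final result label


# Hypothetical example values.
record = AttemptRecord(
    sku="bottle-30ml",
    grasp_candidate="pinch-02",
    hand_config="hand-A",
    sensor_config="fingertip-3axis",
    calibration_version="2026-05-01",
    task_phase="lift",
    tactile_summary={"peak_normal_n": 3.2, "min_slip_margin_n": -0.1},
    policy_action=[0.1, 0.0, -0.05],
    safety_intervention=False,
    outcome="slip_during_lift",
)
line = json.dumps(asdict(record), sort_keys=True)
```

Logging the calibration version and task phase alongside the outcome is what later lets a failure be attributed to perception, contact acquisition, or drift rather than to "the policy" in general.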

The third decision is where the chapter sits in the control stack. Some ideas belong in fast reflex loops, some in contact MPC, some in policy inputs, and some only in offline diagnosis. Mixing these time scales creates brittle systems: a VLA cannot react to millisecond slip, and a low-level force controller cannot infer the next process step. The right architecture separates fast contact stabilization, mid-level grasp or rearrangement control, and slow task reasoning.
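The separation of time scales can be sketched as a multi-rate schedule in which each layer runs at its own frequency off a common tick. The rates below (1 kHz reflex, 50 Hz grasp control, 1 Hz task reasoning) are illustrative assumptions, not measurements from any system in this chapter.

```python
TICK_HZ = 1000   # base loop rate (assumed)

# Assumed rates for the three layers named in the text.
RATES_HZ = {"reflex": 1000, "grasp_control": 50, "task_reasoning": 1}


def run_schedule(n_ticks):
    """Count how often each layer would run over n_ticks of the base loop."""
    calls = {name: 0 for name in RATES_HZ}
    for tick in range(n_ticks):
        for name, rate in RATES_HZ.items():
            period = TICK_HZ // rate       # base ticks between invocations
            if tick % period == 0:
                calls[name] += 1
    return calls


calls = run_schedule(1000)   # one simulated second
```

In one simulated second the reflex layer runs 1,000 times, grasp control 50 times, and task reasoning once, which is why slip reaction cannot be delegated to the slowest layer.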

Finally, the chapter should be evaluated by the failure modes it removes. A method that improves benchmark success but leaves the team unable to distinguish perception failure, contact-acquisition failure, force-closure failure, execution-time slip, or maintenance drift is not yet production-ready. A method with slightly lower headline performance but better logs, safer force limits, and clearer recovery hooks may be the stronger basis for manufacturing Physical AI.

References

  1. Romero, J., Tzionas, D., & Black, M. J. (2017). Embodied hands: Modeling and capturing hands and bodies together. SIGGRAPH Asia 2017.
  2. 2024 study. Stretchable liquid-metal sensor glove. Nature Communications. https://doi.org/10.1038/s41467-024-50101-w
  3. Sundaram, S., Kellnhofer, P., Li, Y., Zhu, J.-Y., Torralba, A., & Matusik, W. (2019). Learning the signatures of the human grasp using a scalable tactile glove. Nature, 569, 698-702.
  4. Ruppel, P., et al. (2024). Reduced tactile sensor array for grasp analysis. Sensors.
  5. Yin, J., Qi, H., Wi, Y., Kundu, S., Lambeta, M., Yang, W., Wang, C., Wu, T., Malik, J., & Hellebrekers, T. (2025). OSMO: Open-source tactile glove for human-to-robot skill transfer. arXiv preprint. arXiv:2512.08920. #18
  6. Xu, M., Zhang, H., Hou, Y., Xu, Z., Fan, L., Veloso, M., & Song, S. (2025). DexUMI: Using human hand as the universal manipulation interface for dexterous manipulation. arXiv preprint. #8
  7. Si, Z., et al. (2025). ExoStart: From 10 exoskeleton demos to dexterous robot manipulation. #9
  8. Fang, H.-S., Romero, B., Xie, Y., et al. (2025). DEXOP: A device for robotic transfer of dexterous human manipulation. arXiv preprint. arXiv:2509.04441. #10
  9. Qin, Y., et al. (2023). AnyTeleop: A general vision-based dexterous robot hand-arm teleoperation system. RSS 2023.
  10. Handa, A., & Van Wyk, K. (2020). DexPilot: Vision-based teleoperation for dexterous manipulation. ICRA 2020.
  11. Ding, Z., et al. (2024). Bunny-VisionPro: Real-time bimanual dexterous teleoperation for imitation learning. arXiv preprint. arXiv:2407.03162.
  12. Wang, C., et al. (2024). DexCap: Scalable and portable mocap data collection system. RSS 2024.
  13. Shaw, K., Bahl, S., & Pathak, D. (2024). Learning dexterity from human hand motion in internet videos. arXiv preprint. arXiv:2212.04498.
  14. Liu, Y., et al. (2025). ImMimic: Large-scale human trajectory + few-shot teleoperation interpolation.
  15. Li, Y., et al. (2024). DexH2R: Task-oriented dexterous manipulation from human to robots. arXiv preprint.
  16. Zhang et al. (2025). DOGlove: Low-cost open-source haptic feedback glove. RSS 2025. arXiv:2502.07730.
  17. Various. (2026). Feel Robot Feels: Tactile feedback array glove.
  18. Murphy, L., et al. (2025). Capacitive tactile sensing for teaching by demonstration. arXiv preprint.
  19. Tang, M., et al. (2025). FSR sensor optimization for grasp classification. IEEE Journal of Biomedical and Health Informatics.
  20. Chen, H., et al. (2025). Capacitive sensor for lift-risk identification. Applied Ergonomics.
  21. 2024 study. ML-based wearable sensors for real-time hand motion recognition. PMC. (Seoul National University)
  22. TacCap. (2025). TacCap: FBG optical tactile sensor thimble for human and robot fingertips. arXiv preprint, Mar 2025.
  23. VTDexManip. (2025). VTDexManip: A large-scale visual-tactile dataset for dexterous manipulation from human demonstrations. ICLR 2025.
  24. Fang, J., et al. (2024). AirExo: Low-cost exoskeletons for learning whole-arm manipulation in the wild. ICRA 2024.
  25. Fang, J., et al. (2025). AirExo-2: Scaling up generalizable manipulation skills via purely kinesthetic demonstrations in the wild. CoRL 2025.
  26. Zhao, Q., et al. (2024). ACE: A cross-platform visual-exoskeleton system for low-cost dexterous teleoperation. CoRL 2024.
  27. NuExo. (2025). NuExo: A 5.2 kg active upper-limb exoskeleton for dexterous teleoperation with 100% ROM. ICRA 2025.
  28. HumanoidExo. (2025). HumanoidExo: Lightweight exoskeleton with LiDAR for full-body humanoid teleoperation and locomotion learning. arXiv preprint, Oct 2025.
  29. EgoDex. (2025). EgoDex: Learning dexterous manipulation from large-scale egocentric hand data via Apple Vision Pro. Apple, 2025.
  30. Choi, H., Hou, Y., Pan, C., Hong, S., Patel, A., Xu, X., Cutkosky, M. R., & Song, S. (2026). In-the-Wild Compliant Manipulation with UMI-FT. arXiv preprint. arXiv:2601.09988. #36
  31. Bansal, A., et al. (2026). EgoScale: Scaling human video to unlock dexterous robot intelligence. NVIDIA GEAR. https://research.nvidia.com/labs/gear/egoscale/
  32. Um, T. (2026). S2 From Human Hands to Robot Hands: large-scale tactile hand data survey.