Chapter 7: Haptic Feedback for Teleoperation — Returning Sensation to the Operator
Overview
Chapter 6 solved how to capture the human hand's output — motion-tracking gloves, tactile gloves, exoskeletons, and teleoperation input devices moved human motion and contact into the robot. But it left one problem unsolved: at the moment of capture, the operator cannot feel what their own hands are doing. When a teleoperator drives a dexterous hand through vision alone, the operator's sensorimotor loop is broken. The robot records contact forces the human never feels, so demonstrations are slow, error-prone, or fail outright at the "last millimeter" of contact.
The spine of this chapter is one sentence: the missing ingredient is the feedback loop back to the operator — kinesthetic (proprioceptive) force and tactile (cutaneous) actuation. And crucially, this is no longer an assertion but an empirical fact backed by 2024–2026 data. Where Chapter 6 captured the human hand's output, Chapter 7 returns sensation to the human's input, closing the loop the capture devices broke.
After reading this chapter, you will be able to:
- Explain with numbers why vision-only contact-rich data collection fails (the last-millimeter problem).
- Distinguish kinesthetic feedback from tactile actuation, and the actuator families behind each.
- Understand the trade-offs wearable interfaces face (weight, bandwidth, power, durability).
- Evaluate the effect of haptic-in-the-loop on data collection and policy quality, and the physical-haptics-versus-AR-surrogate fork.
The chapter's skeleton draws on talks from a haptics-and-tactile seminar held at Korea University in June 2026. The presentations of four researchers — Sang-Youn Kim (KOREATECH), Jung-Hwan Youn, Youngsu Cha (Korea University), and Chaeyong Park (Korea University) — map precisely onto this chapter's four axes: actuator taxonomy, the last-millimeter problem, soft-wearable limits, and multimodal HCI haptics. We weave that thread directly into the narrative.
7.1 Why Haptic Feedback? The Teleoperation Data-Collection Bottleneck
We do not open with a device taxonomy. We open with the failure.
The decisive datapoint is DexViTac [1]. Across four contact-rich tasks (pipetting, whiteboard erasing, marker insertion, fruit collection), the vision-only baseline averaged 17.5% success while the full visuo-tactile system reached 85.8% — a ~68-point collapse. The point is not merely that vision-only data is lower quality; for contact-rich dexterity, the data often does not get collected at all.
KineDex shows the other face of the same failure [2]. In teleoperation, where the operator feels no contact, the collection success rate itself is below 50%, whereas in kinesthetic teaching — where a human directly moves the hand and feels real contact forces — it reaches ~100%. Operators frequently cannot even complete a demonstration without feeling contact.
There is a telling contrast here. DexViTac is a capture-only rig that provides no haptic feedback to the operator (that capture machinery is Chapter 6 territory). Even with dense fingertip tactile capture, its vision-only ablation collapsed. The problem, in other words, is not the absence of sensor-side capture but the absence of the operator-side feedback loop. This is exactly where this chapter diverges from Chapter 6.
An Old, Already-Validated Finding
A reader may be tempted to dismiss this as well known. The better move is to own the precedent. Robot-assisted surgery has documented the same failure for over 15 years. Okamura [3] showed that in vision-only surgical teleoperation, operators apply excessive, poorly regulated forces (tissue damage, suture breakage), and that adding force feedback or sensory substitution measurably reduces applied force and error.
So the chapter's precise position is this: the dexterous-data-collection community is re-discovering, at scale and with learned policies, what bilateral-teleoperation control theory and surgical robotics established decades ago. That reframes the chapter as a synthesis bridging two literatures that rarely cite each other.
The phrase "last-millimeter problem" is a framing offered at the June 2026 Korea University seminar — it is best taken as an operator's intuition, not a citable metric. At the final instant of precise contact alignment and fine force regulation, vision is occluded, depth is ambiguous, and the operator needs resistance at the fingertip.
7.2 Kinesthetic (Force/Proprioceptive) Feedback
Haptics splits into two channels. Kinesthetic is the force, torque, and posture sensed at joints and muscles; tactile/cutaneous is the texture, slip, and temperature captured by skin mechanoreceptors. Hayward et al. [4] is the tutorial that codifies this dichotomy and the device taxonomy that follows from it. This chapter's spine rests on that distinction.
The kinesthetic channel has a mature 30-year control theory. The PHANToM of Massie & Salisbury [5] was the first widely adopted grounded kinesthetic interface: it renders a controlled force vector at the fingertip with three actuators to create the illusion of touching a solid virtual object. This "joystick-type" grounded device became the standard research tool and the archetype that the seminar contrasts with glove-type haptics.
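The idea of rendering a force so that a virtual object feels solid can be sketched with the classic penalty-based virtual wall. This is a minimal illustration, not the PHANToM's actual controller: the function name `virtual_wall_force`, the stiffness, the wall position, and the saturation limit are all assumptions chosen for the example.

```python
def virtual_wall_force(fingertip_z, wall_z=0.0, stiffness=800.0, max_force=8.0):
    """Return the upward force (N) rendered for a horizontal virtual wall.

    fingertip_z: fingertip height in metres; the wall occupies z < wall_z.
    stiffness:   hypothetical wall stiffness in N/m.
    max_force:   hypothetical actuator saturation limit in N.
    """
    penetration = wall_z - fingertip_z       # positive only inside the wall
    if penetration <= 0.0:
        return 0.0                           # free space: render no force
    return min(stiffness * penetration, max_force)


for z in (0.005, 0.0, -0.002, -0.02):
    print(f"z = {z:+.3f} m -> force = {virtual_wall_force(z):.2f} N")
```

The force grows with penetration depth until the actuator saturates. A stiffer wall feels more solid, but in a sampled control loop very high stiffness tends to cause instability, which is one concrete face of the stability-versus-fidelity tension in force feedback.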
Two axes of control theory matter. Lawrence [6] formalized transparency (the match between the impedance the operator feels and the actual remote-environment impedance) and derived the four-channel architecture, which transmits both position and force in each direction and achieves ideal transparency in the delay-free limit. The same work established the stability-versus-transparency trade-off every force-feedback loop must navigate. Niemeyer & Slotine [7] introduced wave variables, a passivity-based encoding that renders the communication channel passive under any constant delay and so guarantees stability of the force-reflecting loop. That result made networked force feedback viable.
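The passivity argument behind wave variables can be checked numerically. The sketch below uses the standard wave-variable transformation, u = (b·v + F)/√(2b) and v_back = (b·v − F)/√(2b), where b is a positive wave impedance. The function names, the test signals, and the delay length are assumptions made for illustration; this is not code from [7].

```python
import math
from collections import deque


def encode(velocity, force, b):
    """Map a (velocity, force) pair to wave variables (u, v).

    b is the wave impedance, a positive tuning constant.
    """
    scale = math.sqrt(2.0 * b)
    return (b * velocity + force) / scale, (b * velocity - force) / scale


def decode(u, v, b):
    """Invert encode(): recover (velocity, force) from (u, v)."""
    scale = math.sqrt(2.0 * b)
    return (u + v) / scale, (u - v) * scale / 2.0


def channel_stored_energy(u_master, v_slave, delay_steps, dt):
    """Energy held by a channel that delays each wave by delay_steps samples."""
    u_line = deque([0.0] * delay_steps)   # forward wave in transit
    v_line = deque([0.0] * delay_steps)   # returning wave in transit
    energy, history = 0.0, []
    for u_m, v_s in zip(u_master, v_slave):
        u_line.append(u_m)
        u_s = u_line.popleft()            # forward wave arriving at the remote side
        v_line.append(v_s)
        v_m = v_line.popleft()            # returning wave arriving at the operator
        power_in = 0.5 * (u_m ** 2 - v_m ** 2) - 0.5 * (u_s ** 2 - v_s ** 2)
        energy += power_in * dt
        history.append(energy)
    return history


u, v = encode(velocity=0.3, force=2.0, b=5.0)
print(0.5 * u * u - 0.5 * v * v)          # equals velocity * force up to rounding
waves_out = [math.sin(0.1 * k) for k in range(200)]
waves_back = [math.cos(0.07 * k) for k in range(200)]
print(min(channel_stored_energy(waves_out, waves_back, delay_steps=25, dt=0.001)))
```

Because the power through a port is ½u² − ½v², a pure delay can only hold the energy of the waves currently in transit. The stored energy therefore never goes negative, whatever signals are sent and however long the constant delay is, which is the sense in which the channel is passive.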
Tellingly, the classical tension — you cannot have perfect transparency and guaranteed passivity simultaneously under delay — re-emerges today for networked teleoperation with learned policies. And the fact that wearable cutaneous systems sidestep this stability problem by giving up the grounded force loop entirely is what sets up the design fork in 7.4.
From Grounded to Wearable Gloves
CyberGrasp [8] is the archetypal force-feedback exoskeleton glove: a tendon-routed exoskeleton worn over a data glove, with five actuators (one per finger) applying resistive forces from a desktop module so the operator's fingers cannot penetrate a virtual or remote object. Developed under a US Navy STTR contract for telerobotics, this heavy grounded device tethers the operator and renders no cutaneous cues (texture, temperature, slip). Modern lightweight wearables are a reaction to exactly those limits.
One modern answer is CDF-Glove [9]: a lightweight, low-cost (~US$230) cable-driven force-feedback glove that provides real-time state for 20 finger DoF (16 directly sensed, 4 passively coupled), validated across robot hands of differing kinematics. It is positioned explicitly as a feedback-loop device for improving imitation-learning demonstration quality. (Note: this paper's authors/affiliation are not yet confirmed from the primary source, so we describe it conservatively.)
Prometheus [10] is another route: an open-source mocap-based teleoperation system with integrated force feedback, so the operator drives the robot via motion capture while receiving force feedback for higher-quality demonstrations. As a "universal" open-source counterpart to proprietary force-feedback rigs, it lowers the barrier to haptic-in-the-loop data collection.
KineDex [2] shows the most extreme form: the human IS the loop. Arguing that retargeted teleoperation without direct haptic feedback fails to capture the contact forces hard dexterous tasks require, it uses hand-over-hand kinesthetic teaching, in which the operator directly moves the dexterous hand and feels real contact. Demonstrations are collected this way 2–3× faster than with teleoperation, at ~100% collection success.
How Precise Must Force Resolution Be?
Kinesthetic force-feedback hardware need not be ultra-precise, because human force discrimination is itself not very fine. Jones [11] measured the differential threshold (JND, Weber fraction) for muscular force at roughly 7%, and Pang et al. [12] reported about 5–10% for active finger force. These numbers underwrite the design guideline offered at the June 2026 Korea University seminar ("~10% resolution suffices, ~25% below ~0.5 N"): the precision demand on force-feedback actuators is modest, which lowers the engineering bar.
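The seminar guideline can be turned into a quick check of whether a force error would even be noticed by the operator. The sketch below is only an illustration of the Weber-fraction arithmetic: the two fractions and the 0.5 N breakpoint are the rough design figures quoted above, and the function names are hypothetical.

```python
def just_noticeable_difference(reference_force_n):
    """Approximate smallest force change (N) an operator can detect.

    Uses a Weber fraction of about 10% in general and about 25% below
    roughly 0.5 N. Both are rough design figures, not measured constants.
    """
    weber_fraction = 0.25 if reference_force_n < 0.5 else 0.10
    return weber_fraction * reference_force_n


def is_perceptible(reference_force_n, rendered_force_n):
    """True if the rendered force differs noticeably from the reference."""
    delta = abs(rendered_force_n - reference_force_n)
    return delta > just_noticeable_difference(reference_force_n)


print(is_perceptible(2.0, 2.1))    # 5% error on a 2 N grip force
print(is_perceptible(2.0, 2.3))    # 15% error on a 2 N grip force
print(is_perceptible(0.3, 0.35))   # about 17% error on a light 0.3 N touch
```

Under these assumptions an actuator whose rendering error stays inside the JND band is indistinguishable, to the operator, from a perfect one, which is why modest force resolution can be enough.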
The force range warrants care. The functional force envelope stated at the same seminar — precision grip ~0.2–2 N, power grip ~2–20 N — is a seminar/design figure, not a value from a single paper. Jones and Pang et al. address discrimination thresholds, not this grip-range envelope. The envelope should therefore be cited only with its origin made explicit.
7.3 Tactile (Cutaneous) Actuation
The cutaneous channel is best read as a capability ladder of skin-stimulation modes: vibration → friction/electrovibration → soft-DEA normal/shear → temperature. Expressiveness rises along the ladder, and so does the cost in bandwidth, power, and wearability.
The first rung is vibrotactile. Choi & Kuchenbecker [13] is the canonical reference establishing that human vibration perception peaks near the Pacinian band (~200–300 Hz) and cataloging the main commercial actuator families — ERM (eccentric rotating mass), LRA (linear resonant actuator), piezo, voice-coil — with their bandwidth/latency/power trade-offs. The ERM/LRA/piezo taxonomy Sang-Youn Kim presented at the June 2026 Korea University seminar sits exactly on this foundation, and his group has extended it into transparent, flexible vibrotactile actuators [14].
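One common way to use a vibrotactile actuator for contact feedback is to fix the carrier frequency inside the sensitive band and let contact force set the amplitude. The sketch below assumes a 250 Hz carrier, a 2 N full-scale force, and an 8 kHz drive sample rate. These values and the function name `vibration_burst` are illustrative choices, not parameters of any device cited here.

```python
import math


def vibration_burst(contact_force_n, duration_s=0.05, carrier_hz=250.0,
                    sample_rate_hz=8000, full_scale_force_n=2.0):
    """Return drive samples in [-1, 1] for one vibrotactile burst.

    The amplitude is the contact force normalised by a hypothetical
    full-scale force and clipped to the actuator's drive range.
    """
    amplitude = max(0.0, min(contact_force_n / full_scale_force_n, 1.0))
    n_samples = int(duration_s * sample_rate_hz)
    return [amplitude * math.sin(2.0 * math.pi * carrier_hz * i / sample_rate_hz)
            for i in range(n_samples)]


burst = vibration_burst(contact_force_n=1.0)
print(len(burst), max(burst))
```

Note that this mapping carries a single number per contact event (the amplitude), which is exactly the limitation of amplitude-only vibration discussed later in the chapter.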
The second rung is electrovibration. TeslaTouch, by Bau et al. [15], renders texture with no moving parts: an oscillating voltage on an insulated conductive surface modulates the electrostatic attraction with a sliding fingertip, programming perceived friction. It is fast, low-power, and scalable to arbitrary shapes, but the effect is felt only while the fingertip is moving; a static finger feels nothing.
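A simplified parallel-plate model shows why the rendered friction scales with the square of the drive voltage and why a resting finger feels nothing. Everything numeric below (contact area, insulator gap, permittivity, friction coefficient) is a hypothetical placeholder, and real skin-electrode physics is considerably more complicated than this sketch.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def electrostatic_force(voltage_v, area_m2=1.0e-4, gap_m=1.0e-5, rel_permittivity=3.0):
    """Parallel-plate attraction (N) between fingertip and electrode."""
    return EPSILON_0 * rel_permittivity * area_m2 * voltage_v ** 2 / (2.0 * gap_m ** 2)


def added_friction(voltage_v, sliding_speed_m_s, friction_coeff=0.8):
    """Extra tangential force (N) the finger feels because of the voltage."""
    if sliding_speed_m_s == 0.0:
        return 0.0   # no sliding, so no kinetic friction to modulate
    return friction_coeff * electrostatic_force(voltage_v)


for volts in (0.0, 50.0, 100.0):
    print(volts, added_friction(volts, sliding_speed_m_s=0.05))
```

In this model the attraction itself still exists for a static finger, but it only becomes a perceptible tangential force once the finger slides, which matches the limitation described above.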
The third rung is soft DEA (dielectric elastomer actuators). Pelrine et al. [16] demonstrated DEAs — a soft elastomer film with compliant electrodes that contracts in thickness and expands in area under voltage (Maxwell stress) — achieving very large area strains with prestrain. They are silent and high-strain but have low force density and need kV-class drive, which makes them hard to wear. It is precisely this limit that led Jung-Hwan Youn's seminar work to couple DEAs with springs and tile them into arrays.
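The need for kV-class drive follows directly from the Maxwell-stress relation for an ideal DEA, p = ε0·εr·(V/t)², where t is the film thickness. The sketch below evaluates it for a hypothetical 50 µm film with relative permittivity 3. These material values are assumptions for illustration, not specifications of any device in this chapter.

```python
import math

EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def maxwell_pressure(voltage_v, thickness_m, rel_permittivity):
    """Effective compressive pressure (Pa) on an ideal dielectric elastomer film."""
    field = voltage_v / thickness_m               # electric field in V/m
    return EPSILON_0 * rel_permittivity * field ** 2


def voltage_for_pressure(pressure_pa, thickness_m, rel_permittivity):
    """Drive voltage (V) needed to reach a target pressure on the same film."""
    return thickness_m * math.sqrt(pressure_pa / (EPSILON_0 * rel_permittivity))


pressure = maxwell_pressure(voltage_v=3000.0, thickness_m=50e-6, rel_permittivity=3.0)
print(f"{pressure / 1000.0:.1f} kPa at 3 kV")
print(f"{voltage_for_pressure(pressure / 4.0, 50e-6, 3.0):.0f} V for a quarter of that pressure")
```

Because pressure depends on the square of the field, halving the voltage costs three quarters of the pressure. Under these assumptions the only ways to reduce drive voltage are thinner films or higher-permittivity materials, both of which bring their own reliability trade-offs.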
That actuation frontier is the FCDEA skin patch [17]: a flat-cone DEA array, thin and soft, that delivers spatiotemporally adjustable static-to-dynamic force over large skin areas from voltage signals, integrated with a photomicrosensor array into a wireless self-sensing patch — skin-attached, soft, voltage-driven stimulation in place of a bulky pneumatic glove (e.g., HaptX G1). (Exact force/frequency/array specs were not fully extracted from the primary source, so we cite the method and concept.)
The fourth rung is temperature. Jones & Ho [18] established that Peltier-based thermal displays reproduce skin-object heat transfer to aid material recognition. But the thermal channel is seconds-slow and strongly area- and baseline-temperature-dependent, making it a supplementary channel unsuited to fast contact-event feedback.
An origami-based approach points another way [19]: a ~2 g ultralight wearable interface that delivers normal force, shear force, and vibration to the fingertip via a novel pneumatic actuator with an installable origami pump, running self-contained without an external air supply. It realizes the origami-based normal/shear concept that Youngsu Cha presented at the June 2026 Korea University seminar, alongside his pragmatic verdict: "vibration is what actually works in teleoperation, so commercial haptics are vibration-based." (Its exact force values were also not extracted from the primary source, so we cite at the concept level.)
Cha's pragmatic conclusion is the honest counterweight to soft-actuator hype: vibration works best, which is why commercial haptics are vibration-based, while soft normal/shear actuators are expressive but hit the durability wall.
7.4 Wearable Tactile Interfaces for Dexterous Teleoperation
Here the kinesthetic-versus-tactile tension becomes a design fork. Pacchierotti et al. [20] provided the canonical taxonomy of wearable (ungrounded) fingertip and hand haptic interfaces and made the core trade-off explicit: cutaneous-only wearables are portable and intrinsically passive (stable) but cannot render large net forces. Grounded kinesthetic gloves (7.2) give net force but tether and destabilize; wearable cutaneous interfaces are stable and mobile but cannot push back.
The frontier (FCDEA, origami) tries to widen the cutaneous channel to soften this trade-off. The bolder bet is sensory substitution. Bach-y-Rita & Kercel [21] established that information from artificial receptors, delivered through an intact channel such as the skin, can be perceived as the substituted modality after training. This justifies re-routing remote contact information to the operator's skin and is the neuroscientific basis for the human→human→robot tactile transfer that Jung-Hwan Youn proposed at the June 2026 Korea University seminar. It remains, however, a named-but-unsolved goal — no frontier paper closes it end-to-end.
DOGlove, DexCap, Bunny-VisionPro, and AnyTeleop, introduced as input devices in Chapter 6, appear here only in their role as feedback loops; we do not re-explain their input/capture mechanisms (see Chapter 6). Likewise, exoskeleton capture devices such as DEXOP and DexUMI belong to Chapter 6. By contrast, CyberGrasp is a force-feedback device, not a capture device, so this chapter owns and explains it.
A fundamentally different answer to the wearable wall is AR. TactAR lets the operator "see" contact in 3D space without wearing any actuator — the surrogate-feedback route taken up in the next section.
7.5 Haptic-in-the-Loop Data Collection and Imitation Learning
This is the payoff section. Closing the loop to the operator helps both the human (collection success and speed) and the policy (downstream success). Ordered by effect size, the evidence reveals a gradient, not a single anecdote.
The most cautious anchor is Cuan et al. [22]: a clean controlled study with 8 real conference-room doors and 6 IL models (3 from haptic-aided data, 3 vision-only), in which haptic feedback raised data throughput by +6% and policy success by +11% on average. The effect is modest, but as early controlled evidence it anchors the claim.
The large effect appears in dexterous tasks. CDF-Glove [9] showed 4× task success over no-feedback teleoperation, and policies trained on its glove demos beat kinesthetic teaching by +55% average success with −47.2% completion time.
KineDex [2] isolates what the loop adds. On top of 74.4% overall success across nine tasks, force control contributed +57.7% over position-only and tactile sensing +26.7% on contact-intensive tasks. RDP/TactAR mirrors this on the policy side [23]: once force/tactile closes the loop, bimanual lifting goes 0.00 → 0.70, perturbation-after-contact recovery 0.19 → 0.88, and peeling 0.44 → 0.95.
The KAIST finger-linkage collector is another elegant embodiment [24]. The jaws of a two-jaw gripper are driven directly by the operator's fingers through a compact mechanical linkage, so that the mechanical linkage itself is the haptic loop — force transparency is mechanical, not actuated. It integrates a GoPro fisheye camera, 3D-ViTac 16×16 force arrays on the jaws, and a phase-annotation button to support coarse-to-fine imitation learning. Because this system has no with/without-haptic ablation and demonstrates only a single toy task, we cite it as a system-design contribution and attribute no success delta to it.
Three architectures now stand out:
- Kinesthetic teaching — the human is the loop (KineDex). The most direct, but it does not scale to remote or dangerous settings, since the operator must physically hold the hand.
- Teleop + physical feedback — scales, but needs hardware (gloves, exoskeletons): CDF-Glove, Prometheus, DOGlove.
- AR surrogate feedback — cheapest, but not somatosensory: TactAR.
TactAR [23] renders the tactile deformation/force field as a 3D overlay "attached" to the robot end-effector in AR, so the operator perceives contact in 3D space without a wearable actuator. The Reactive Diffusion Policy (RDP) running on top uses a slow latent-diffusion policy for high-level action chunks and a fast asymmetric tokenizer for high-frequency closed-loop tactile/force control (fast inference <1 ms). The policy details of RDP belong to the learning thread; here we treat only its feedback-loop / AR aspect.
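The slow-fast split can be pictured as a generic two-rate control loop: a slow planner emits a chunk of targets, and a fast loop corrects each target against the latest force reading. The sketch below is a toy of that structure only. It is not RDP's model or code, and the spring contact model, the planner, the gains, and all function names are hypothetical.

```python
def simulated_contact_force(position_m, surface_m=0.015, stiffness_n_per_m=1000.0):
    """Toy force sensor: a linear spring that engages past surface_m."""
    return max(0.0, stiffness_n_per_m * (position_m - surface_m))


def slow_planner(position_m, chunk_len=10, step_m=0.001):
    """Hypothetical slow loop: a chunk of targets that keeps closing the gripper."""
    return [position_m + step_m * (i + 1) for i in range(chunk_len)]


def fast_step(position_m, target_m, force_n, force_limit_n=2.0, gain_m_per_n=0.0005):
    """Hypothetical fast loop: advance toward the target, slowed by measured force."""
    force_limited = position_m + gain_m_per_n * (force_limit_n - force_n)
    return min(target_m, force_limited)


def run_episode(steps=30, chunk_len=10):
    position, chunk, forces = 0.0, [], []
    for t in range(steps):
        if t % chunk_len == 0:                       # slow loop, once per chunk
            chunk = slow_planner(position, chunk_len)
        force = simulated_contact_force(position)    # fast loop, every tick
        position = fast_step(position, chunk[t % chunk_len], force)
        forces.append(force)
    return position, forces


final_position, forces = run_episode()
print(f"final position {final_position:.4f} m, peak force {max(forces):.2f} N")
```

In this toy, the planner alone would keep closing the gripper long after contact, because it replans only once every ten ticks. The per-tick force correction is what holds the contact force at the limit, which is the intuition for why closed-loop force/tactile control has to run at the fast rate.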
Finally, ManipForce [25] motivates the representation side rather than the actuator. It learns contact-rich policies with a frequency-aware representation of contact forces, reflecting on the policy side that contact and impact carry a frequency-domain signature single-amplitude vibration cannot reproduce — we cite it only as motivation that feedback richer than amplitude alone is needed.
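The claim that an impact has a spectral signature a fixed-amplitude vibration lacks can be illustrated with a small discrete Fourier transform. The sketch compares a steady 250 Hz sine with a decaying 250 Hz transient standing in for an impact. The sample rate, decay constant, and the "peak energy fraction" measure are assumptions chosen for the example and are not taken from ManipForce.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power for bins 0 .. N/2-1 (adequate for short signals)."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))) ** 2
            for k in range(n // 2)]


def peak_energy_fraction(samples):
    """Share of spectral energy that sits in the single strongest bin."""
    spectrum = power_spectrum(samples)
    return max(spectrum) / sum(spectrum)


SAMPLE_RATE = 2560.0   # Hz, chosen so bins are 10 Hz apart with N = 256
N = 256
DECAY_S = 0.005        # hypothetical 5 ms decay of the impact transient

steady = [math.sin(2.0 * math.pi * 250.0 * i / SAMPLE_RATE) for i in range(N)]
impact = [math.exp(-(i / SAMPLE_RATE) / DECAY_S) * s for i, s in enumerate(steady)]

print("steady vibration:", peak_energy_fraction(steady))
print("impact transient:", peak_energy_fraction(impact))
```

The steady vibration puts essentially all of its energy in one bin, while the transient spreads energy across many neighbouring frequencies. A feedback channel that can vary only amplitude at one frequency cannot reproduce that spread.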
To be honest, effect sizes span +6% to 4×, and the large numbers come from different tasks, hands, and baselines. This is the signature of a young literature with no shared benchmark, and it leads directly into 7.6.
7.6 Limits and Open Problems — Physical Haptics versus Surrogate AR
The chapter's intellectual payoff is the crux of 7.6: physical haptic return versus surrogate AR feedback. AR (TactAR) is scalable and dodges every wearable constraint, but it is visual-spatial, not somatosensory — the operator sees contact but never feels it. Physical haptics is somatosensory but blocked by the wearable wall. There is currently no evidence on which produces better policies at equal cost. This is the chapter's headline open question.
The remaining problems are as follows.
- The wearable wall. Weight, actuator size, bandwidth, power, and the breadth of skin stimulation all constrain the design. HaRing [27] saves weight by confining the hardware to a single ring, and the ~2 g origami interface [19] pushes lightness to the extreme; both pay for it in spatial resolution. FCDEA [17] needs kV-class drive, a safety/wearability trade-off. No device yet offers full-hand, high-resolution, low-power, durable feedback.
- Frequency-domain force is under-served. Contact and impact carry a spectral signature that single-amplitude vibration cannot reproduce — an insight Chaeyong Park presented at the June 2026 Korea University seminar, mirrored on the representation side by ManipForce [25]. Park's multimodal HCI haptics work (the modular controller HaptiCraft [26], frequency-domain impact modeling, vibration-augmented virtual buttons) suggests which feedback channels matter when a teleoperator drives a robot. Richer-than-vibration actuation is an open target.
- Soft-haptic durability. The blocker Youngsu Cha repeatedly flagged at the June 2026 Korea University seminar — the engineering problem to solve before fielding soft wearable feedback in real teleoperation pipelines.
- Capture-side blindness creates the need. Gloves occlude the operator's own tactile sense while capturing (DexViTac's "glove blindness"). The feedback loop is needed because the capture device removed the natural one. Designing capture and feedback as one system is open.
- Sim/human tactile data and human→robot somatosensory transfer. Bach-y-Rita & Kercel [21] license re-routing contact to the skin, and Jung-Hwan Youn named human→human→robot tactile transfer, but no frontier paper closes it end-to-end. Simulating human-grade tactile data for the feedback side is also unsolved.
- No shared benchmark. Effect sizes from +6% to 4× are non-comparable. And the classical stability-versus-transparency trade-off established by Lawrence [6] and Niemeyer & Slotine [7] re-emerges for networked, learned-policy teleoperation with no unified modern treatment.
The chapter's honest weakness is best not hidden: the empirical core rests on a small number of 2024–2026 papers, with no shared benchmark and effect sizes spanning incompatible tasks, hands, and baselines. But that itself signals a field just taking shape, and the problems above sketch the map for the next five years.
The next step is how a robot, using data collected through this closed loop, actually reads contact as a control input and learns to manipulate (→ Chapter 8: Contact Dynamics). The kinematic side of human→robot transfer via retargeting belongs to the later embodiment chapter (→ Chapter 15); this chapter owns the sensory return path within it.
References
[1] Chen, X., Pan, Y., Li, M., & Ding, X. (2026). DexViTac: Collecting Human Visuo-Tactile-Kinematic Demonstrations for Contact-Rich Dexterous Manipulation. arXiv preprint. arXiv:2603.17851.
[2] Zhang, D., Yuan, C., Wen, C., Zhang, H., Zhao, J., & Gao, Y. (2025). KineDex: Learning Tactile-Informed Visuomotor Policies via Kinesthetic Teaching for Dexterous Manipulation. arXiv preprint. arXiv:2505.01974.
[3] Okamura, A. M. (2009). Haptic Feedback in Robot-Assisted Minimally Invasive Surgery. Current Opinion in Urology, 19(1):102-107.
[4] Hayward, V., Astley, O. R., Cruz-Hernandez, M., Grant, D., & Robles-De-La-Torre, G. (2004). Haptic Interfaces and Devices. Sensor Review, 24(1):16-29.
[5] Massie, T. H., & Salisbury, J. K. (1994). The PHANToM Haptic Interface: A Device for Probing Virtual Objects. ASME Dynamic Systems and Control Division, DSC 55-1:295-302.
[6] Lawrence, D. A. (1993). Stability and Transparency in Bilateral Teleoperation. IEEE Transactions on Robotics and Automation, 9(5):624-637.
[7] Niemeyer, G., & Slotine, J.-J. E. (1991). Stable Adaptive Teleoperation. IEEE Journal of Oceanic Engineering, 16(1):152-162.
[8] CyberGlove Systems LLC. (2007). CyberGrasp: A Force-Reflecting Exoskeleton Glove for the Hand. CyberGlove Systems product (origin: Virtual Technologies Inc., US Navy STTR).
[9] CDF-Glove. (2026). CDF-Glove: A Cable-Driven Force Feedback Glove for Dexterous Teleoperation. arXiv preprint. arXiv:2603.05804.
[10] Satsevich, S., Bazhenov, A., Egorov, S., Erkhov, A., Gromakov, M., Fedoseev, A., & Tsetserukou, D. (2025). Prometheus: Universal, Open-Source Mocap-Based Teleoperation System with Force Feedback for Dataset Collection in Robot Learning. arXiv preprint. arXiv:2510.01023.
[11] Jones, L. A. (1989). Matching Forces: Constant Errors and Differential Thresholds. Perception, 18(5):681-687.
[12] Pang, X.-D., Tan, H. Z., & Durlach, N. I. (1991). Manual Discrimination of Force Using Active Finger Motion. Perception & Psychophysics, 49(6):531-540.
[13] Choi, S., & Kuchenbecker, K. J. (2013). Vibrotactile Display: Perception, Technology, and Applications. Proceedings of the IEEE, 101(9):2093-2104.
[14] Mahanthappa, M., Ko, H.-U., & Kim, S.-Y. (2024). Transparent and Flexible Actuator Based on a Hybrid Dielectric Layer of Wavy Polymer and Dielectric Fluid Mixture. peer-reviewed (PMC10819412).
[15] Bau, O., Poupyrev, I., Israr, A., & Harrison, C. (2010). TeslaTouch: Electrovibration for Touch Surfaces. ACM UIST 2010, pp. 283-292.
[16] Pelrine, R., Kornbluh, R., Pei, Q., & Joseph, J. (2000). High-Speed Electrically Actuated Elastomers with Strain Greater Than 100%. Science, 287(5454):836-839.
[17] Youn, J.-H., Jang, S.-Y., Hwang, I., Pei, Q., Yun, S., & Kyung, K.-U. (2025). Skin-Attached Haptic Patch for Versatile and Augmented Tactile Interaction. Science Advances.
[18] Jones, L. A., & Ho, H.-N. (2008). Warm or Cool, Large or Small? The Challenge of Thermal Displays. IEEE Transactions on Haptics, 1(1):53-70.
[19] Min, S., et al., & Cha, Y. (2025). Ultralight Soft Wearable Haptic Interface with Shear-Normal-Vibration Feedback. Advanced Intelligent Systems. doi:10.1002/aisy.202500374.
[20] Pacchierotti, C., Sinclair, S., Solazzi, M., Frisoli, A., Hayward, V., & Prattichizzo, D. (2017). Wearable Haptic Systems for the Fingertip and the Hand: Taxonomy, Review, and Perspectives. IEEE Transactions on Haptics, 10(4):580-600.
[21] Bach-y-Rita, P., & Kercel, S. W. (2003). Sensory Substitution and the Human-Machine Interface. Trends in Cognitive Sciences, 7(12):541-546.
[22] Cuan, C., Okamura, A. M., & Khansari, M. (2024). Leveraging Haptic Feedback to Improve Data Quality and Quantity for Deep Imitation Learning Models. IEEE Transactions on Haptics. doi:10.1109/TOH.2024.3384482.
[23] Xue, H., Ren, J., Chen, W., Zhang, G., Fang, Y., Gu, G., Xu, H., & Lu, C. (2025). Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation (TactAR). RSS 2025. arXiv:2503.02881.
[24] Kim, Y., Oh, N., Park, J., Thamronglak, T., & Park, D. (2026). A Visuo-Tactile Data Collection System with Haptic Feedback for Coarse-to-Fine Imitation Learning. arXiv preprint. arXiv:2605.08757.
[25] Lee, G., Lee, Y., Kim, K., Lee, S., Noh, S., Back, S., & Lee, K. (2025). ManipForce: Force-Guided Policy Learning with Frequency-Aware Representation for Contact-Rich Manipulation. arXiv preprint. arXiv:2509.19047.
[26] Park, C., et al. (2026a). HaptiCraft: A Modular Multimodal Haptic Controller for Immersive Virtual Reality Interactions. IEEE TVCG 2026.
[27] Park, C., et al. (2026b). HaRing: A Haptic Ring Interface for One-Handed Interaction with High-Dimensional Spatial Information. ACM CHI 2026.