CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation

Abstract

Humanoid robots have achieved impressive locomotion performance, yet contact-rich and long-horizon manipulation remains a major bottleneck. Manipulation is inherently contact-rich and demands compliant whole-body control for stable interaction, while its diversity and long-horizon nature favor modular, planner-compatible interfaces over joint-space tracking. We propose CEER, a compliant end-effector–root (EE-root) control abstraction for modular humanoid loco-manipulation within a hierarchical planning framework. CEER enables compliance-aware whole-body control in an interpretable task space defined by root motion commands and end-effector pose targets, and supports plug-and-play integration with heterogeneous high-level planners. A teacher–student framework is adopted to distill a general motion-tracking controller into a low-level policy that consumes only EE-root commands. We further construct a hierarchical system that integrates heterogeneous planners and task modules through the EE-root interface, enabling diverse manipulation tasks without retraining the underlying whole-body policy. Experiments in simulation and on hardware demonstrate 3.3 cm end-effector tracking accuracy with substantially reduced jerk compared to baselines, stable contact-rich manipulation under teleoperation, and up to 70% success in simulated single-object loco-manipulation tasks within a room-scale environment. These results indicate that compliant EE-root control provides a practical abstraction for humanoid loco-manipulation, enabling modular and scalable integration of diverse skills.

Video

System Framework

Overview of the proposed three-layer hierarchical system. At the high level, a language instruction is interpreted by an LLM-based skill manager, which selects and composes mid-level skills based on environmental information. The mid-level consists of plug-and-play locomotion and manipulation modules that generate unified end-effector and root commands. At the low level, a unified CEER policy converts these commands into joint-space actions. This modular design decouples task reasoning, skill execution, and low-level control, enabling scalable and extensible system integration.
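
The layering described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the released implementation: the `EERootCommand` fields and their layout, the two skills, the keyword lookup that stands in for the LLM-based skill manager, and the stub that stands in for the learned CEER policy. End-effector targets are reduced to positions for brevity, whereas the paper's interface uses pose targets.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class EERootCommand:
    """Hypothetical unified command: root motion plus end-effector targets."""
    root_velocity: Vec3      # assumed layout: (vx, vy, yaw_rate)
    left_ee_position: Vec3   # assumed to be expressed in the root frame
    right_ee_position: Vec3


# A mid-level skill maps an observation dict to one EE-root command.
Skill = Callable[[Dict[str, float]], EERootCommand]

# Placeholder rest positions for the two hands (illustrative values).
REST_LEFT: Vec3 = (0.3, 0.2, 0.0)
REST_RIGHT: Vec3 = (0.3, -0.2, 0.0)


def walk_forward(obs: Dict[str, float]) -> EERootCommand:
    """Locomotion skill: move the root forward, keep the hands at rest."""
    return EERootCommand((0.5, 0.0, 0.0), REST_LEFT, REST_RIGHT)


def grasp(obs: Dict[str, float]) -> EERootCommand:
    """Manipulation skill: stop the root, bring both hands toward the midline."""
    return EERootCommand((0.0, 0.0, 0.0), (0.3, 0.1, 0.0), (0.3, -0.1, 0.0))


SKILLS: Dict[str, Skill] = {"walk_forward": walk_forward, "grasp": grasp}


def plan_skills(instruction: str) -> List[str]:
    """Stand-in for the LLM-based skill manager: a fixed keyword lookup."""
    if "move" in instruction.lower():
        return ["walk_forward", "grasp"]
    return []


def low_level_policy(cmd: EERootCommand, num_joints: int = 4) -> List[float]:
    """Stub for the low-level policy: returns zero joint-space actions."""
    return [0.0] * num_joints


def run(instruction: str, obs: Dict[str, float]) -> List[List[float]]:
    """High level picks skills, mid level emits commands, low level acts."""
    actions = []
    for name in plan_skills(instruction):
        cmd = SKILLS[name](obs)
        actions.append(low_level_policy(cmd))
    return actions
```

In this sketch, adding a new skill only requires registering another function that returns an `EERootCommand`; the low-level stub is untouched, which mirrors the decoupling the overview describes.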

System Evaluation on Long-Horizon Tasks in a Room Scene

We evaluate the proposed three-layer hierarchical system in simulation on long-horizon household tasks. Under the unified end-effector and root control interface, tasks are decomposed into locomotion, manipulation invocation, and task completion. To focus on system-level capability and connectivity, we use a minimal skill set and a single grasp primitive across all tasks.

The simulated room contains a blue bed, a yellow table, and a green sofa. The videos below compare two control modes on the same four tasks: LLM-driven execution and human keyboard teleoperation.

For the human baseline, five participants perform the same tasks through keyboard teleoperation with the same control degrees of freedom as the robot. After a short practice period, each participant is given two attempts per task type, for a total of 40 trials. We record task success rate and completion time.
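
The trial count follows directly from the protocol: 5 participants, 4 task types, and 2 attempts each. The short Python sketch below reproduces that arithmetic and shows one plausible way to aggregate the two recorded metrics. The function names are hypothetical, and averaging completion time over successful trials only is an assumption, not something the protocol above specifies.

```python
from typing import List, Optional

PARTICIPANTS = 5        # from the protocol above
TASK_TYPES = 4          # the four tasks shown in the videos
ATTEMPTS_PER_TASK = 2   # attempts per participant per task type


def total_trials(participants: int = PARTICIPANTS,
                 task_types: int = TASK_TYPES,
                 attempts: int = ATTEMPTS_PER_TASK) -> int:
    """Number of human-baseline trials implied by the protocol."""
    return participants * task_types * attempts


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of trials that succeeded."""
    if not outcomes:
        raise ValueError("no trials recorded")
    return sum(outcomes) / len(outcomes)


def mean_completion_time(times: List[float],
                         outcomes: List[bool]) -> Optional[float]:
    """Mean time over successful trials only (an assumed convention)."""
    kept = [t for t, ok in zip(times, outcomes) if ok]
    return sum(kept) / len(kept) if kept else None
```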

In keyboard teleoperation, W, S, A, and D control forward, backward, left, and right base velocity; Q and E control yaw rotation; I, K, J, and L control end-effector motion in the horizontal plane; and U and O move the end-effectors up and down along the vertical axis. The two end-effectors are controlled symmetrically with respect to the robot's sagittal plane.
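
One way to turn this key layout into EE-root commands is sketched below in Python. The key assignments follow the description above, but several details are assumptions made for illustration: the axis convention (x forward, y left, z up), which of Q and E yaws left, which of J and L increases the lateral coordinate, and the speed and step sizes. The sagittal-plane symmetry is modeled by mirroring the lateral (y) component for the right end-effector.

```python
from typing import Dict, Iterable, Tuple

Vec3 = Tuple[float, float, float]

# Key -> base direction as (vx, vy, yaw_rate).
# Assumed convention: x forward, y left, positive yaw counter-clockwise.
BASE_KEYS: Dict[str, Vec3] = {
    "w": (1.0, 0.0, 0.0), "s": (-1.0, 0.0, 0.0),
    "a": (0.0, 1.0, 0.0), "d": (0.0, -1.0, 0.0),
    "q": (0.0, 0.0, 1.0), "e": (0.0, 0.0, -1.0),
}

# Key -> end-effector direction as (dx, dy, dz) for the LEFT hand.
EE_KEYS: Dict[str, Vec3] = {
    "i": (1.0, 0.0, 0.0), "k": (-1.0, 0.0, 0.0),
    "j": (0.0, 1.0, 0.0), "l": (0.0, -1.0, 0.0),
    "u": (0.0, 0.0, 1.0), "o": (0.0, 0.0, -1.0),
}


def keys_to_command(pressed: Iterable[str],
                    base_speed: float = 0.5,
                    yaw_speed: float = 0.5,
                    ee_step: float = 0.01) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (base_velocity, left_ee_delta, right_ee_delta) for pressed keys."""
    base = [0.0, 0.0, 0.0]
    ee = [0.0, 0.0, 0.0]
    base_scale = (base_speed, base_speed, yaw_speed)
    for key in pressed:
        k = key.lower()
        if k in BASE_KEYS:
            for i, v in enumerate(BASE_KEYS[k]):
                base[i] += v * base_scale[i]
        elif k in EE_KEYS:
            for i, v in enumerate(EE_KEYS[k]):
                ee[i] += v * ee_step
    left = (ee[0], ee[1], ee[2])
    # Mirror across the sagittal (x-z) plane: flip the lateral component.
    right = (ee[0], -ee[1], ee[2])
    return (base[0], base[1], base[2]), left, right
```

Under these assumptions, pressing J moves the two hands apart and L brings them together, while I, K, U, and O move both hands identically. Opposing keys pressed together cancel out.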

LLM-Controlled Examples

Move the blue box to the middle of the yellow and red boxes

Move the blue box to the bed

Move the blue box to the sofa

Move the blue box to the bed, then move the red box to the bed

Keyboard-Controlled Examples

Move the blue box to the middle of the yellow and red boxes

Move the blue box to the bed

Move the blue box to the sofa

Move the blue box to the bed, then move the red box to the bed

Citation

@misc{luo2026ceercompliantendeffectorroot,
      title={CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation}, 
      author={Xinyuan Luo and Xingrui Chen and Xunjian Yin and Hongxuan Wu and Boxi Xia and Zhuoqun Chen and Jinzhou Li and Boyuan Chen and Xianyi Cheng},
      year={2026},
      eprint={2605.19981},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2605.19981}, 
}