Ultra Inertial Poser

Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging

Department of Computer Science, ETH Zürich
Ultra Inertial Poser teaser image

Ultra Inertial Poser estimates 3D full body poses and global translation from the inertial measurements on a sparse set of wearable sensors, augmented and stabilized by the estimated inter-sensor distances based on UWB ranging. Our lightweight standalone sensors stream raw IMU signals, from which we estimate each tracker’s 3D state and fuse it with acceleration and inter-sensor distances in a graph-based machine learning model for pose estimation.


While camera-based capture systems remain the gold standard for recording human motion, learning-based tracking systems based on sparse wearable sensors are gaining popularity. Most commonly, they use inertial sensors, whose propensity for drift and jitter have so far limited tracking accuracy. In this paper, we propose Ultra Inertial Poser, a novel 3D full body pose estimation method that constrains drift and jitter in inertial tracking via inter-sensor distances. We estimate these distances across sparse sensor setups using a lightweight embedded tracker that augments inexpensive off-the-shelf 6D inertial measurement units with ultra-wideband radio-based ranging—dynamically and without the need for stationary reference anchors. Our method then fuses these inter-sensor distances with the 3D states estimated from each sensor. Our graph-based machine learning model processes the 3D states and distances to estimate a person’s 3D full body pose and translation. To train our model, we synthesize inertial measurements and distance estimates from the motion capture database AMASS. For evaluation, we contribute a novel motion dataset of 10 participants who performed 25 motion types, captured by 6 wearable IMU+UWB trackers and an optical motion capture system, totaling 200 minutes of synchronized sensor data (UIP-DB). Our extensive experiments show state-of-the-art performance for our method over PIP and TIP, reducing position error from 13.62 to 10.65 cm (22% better) and lowering jitter from 1.56 to 0.055 km/s^3 (a reduction of 97%).



Rayan Armani, Changlin Qian, Jiaxi Jiang, and Christian Holz. Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging. In Proceedings of ACM SIGGRAPH 2024.

More images

Ultra Inertial Poser method overview

Figure 2: Overview of our method. The person is wearing 6 sensor nodes, each featuring an IMU and a UWB radio. Our method processes the raw data from each sensor to estimate sequences of global orientations, global accelerations, and inter-sensor distances. These serve as input into our learning-based pose estimation to predict leaf-joint angles as well as the global root orientation and translation as SMPL parameters.

Ultra inertial Poser Qualitative results

Figure 3: Qualitative comparison of pose estimates from the recordings in our dataset

Related projects

Articulated Full-Body Pose Tracking from Sparse Motion Sensing.
User Motion Accentuation in Virtual Reality.
Hand Out-Of-View Tracking for Mixed Reality using Inertial Sensing.
3D Hand Pose Prediction and Depth Estimation from Capacitive Images.
Uncertainty-aware Heart Rate Estimation from PPG signals via Belief Propagation.
High-Precision Translation and Rotation Sensing on Passive Surfaces.