egoEMOTION

Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-World Tasks

NeurIPS 2025
Matthias Jammot*, Bjoern Braun*, Paul Streli, Rafael Wampfler, and Christian Holz
Department of Computer Science, ETH Zürich
*Equal contribution

[Teaser image: overview of the egoEMOTION dataset]

egoEMOTION is a multimodal emotion and personality recognition dataset that captures participants' facial, eye-tracking, egocentric-vision, and physiological signals during emotion-inducing video stimuli and naturalistic real-world activities. Participants reported their emotions via emoti-SAM and a weighted Mikels' Wheel, and their personality using the Big Five model.
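
To make the annotation structure concrete, here is a minimal sketch of how one labeled segment could be represented in Python. Every field name and scale below is an illustrative assumption, not the dataset's actual file schema.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical record for one labeled segment. All field names and
# scales are illustrative assumptions, not the dataset's real schema.
@dataclass
class EgoEmotionSegment:
    participant_id: str
    condition: str                    # e.g., "video_stimulus" or "real_world_task" (assumed labels)
    eye_video_path: str               # synchronized eye-tracking video
    ppg: List[float]                  # head-mounted photoplethysmography samples
    imu: List[float]                  # inertial motion data
    valence: int                      # emoti-SAM self-report (scale assumed)
    arousal: int
    dominance: int
    mikels_weights: Dict[str, float]  # weighted Mikels' Wheel: emotion -> weight
    big_five: Dict[str, float]        # per-participant Big Five trait scores
```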

Abstract

Understanding affect is central to anticipating human behavior, yet current egocentric vision benchmarks largely ignore the emotional states that shape a person's decisions and actions. Existing tasks in egocentric perception focus on physical activities, hand-object interactions, and attention modeling, assuming neutral affect and uniform personality. This limits the ability of vision systems to capture key internal drivers of behavior. In this paper, we present egoEMOTION, the first dataset that couples egocentric visual and physiological signals with dense self-reports of emotion and personality across controlled and real-world scenarios. Our dataset includes over 50 hours of recordings from 43 participants, captured using Meta's Project Aria glasses. Each session provides synchronized eye-tracking video, head-mounted photoplethysmography, inertial motion data, and physiological baselines for reference. Participants completed emotion-elicitation tasks and naturalistic activities while self-reporting their affective state using the Circumplex Model and Mikels' Wheel, as well as their personality via the Big Five model. We define three benchmark tasks: (1) continuous affect classification (valence, arousal, dominance); (2) discrete emotion classification; and (3) trait-level personality inference. We show that a classical learning-based method, used as a simple baseline for real-world affect prediction, produces better estimates from signals captured by egocentric vision systems than from physiological signals. Our dataset establishes emotion and personality as core dimensions in egocentric perception and opens new directions in affect-driven modeling of behavior, intent, and interaction.
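
As a rough illustration of benchmark task (1), the sketch below trains a classical learning-based baseline on stand-in features. The random-forest model, the feature set, the binned labels, and the evaluation protocol are all assumptions for illustration and do not reproduce the paper's actual baseline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical hand-crafted features per segment (assumed, for illustration):
# [mean pupil diameter, pupil diameter std, blink rate, gaze dispersion,
#  head-motion energy from the IMU]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))     # stand-in for features extracted from real segments
y = rng.integers(0, 3, size=200)  # valence binned into low/mid/high (assumed labeling)

# A classical learning-based method as a simple baseline for task (1).
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

With random stand-in data the accuracy hovers around chance; the point is only the shape of the pipeline: per-segment features in, a discretized affect dimension out.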

Reference

Matthias Jammot*, Bjoern Braun*, Paul Streli, Rafael Wampfler, and Christian Holz. egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-World Tasks. In Conference on Neural Information Processing Systems (NeurIPS 2025), Datasets and Benchmarks Track.