We propose Hi4D, a method and dataset for the automatic analysis of physically close human-human interaction under prolonged contact. Robustly disentangling several in-contact subjects is a challenging task due to occlusions and complex shapes. Hence, existing multi-view systems typically fuse 3D surfaces of close subjects into a single, connected mesh. To address this issue, we leverage i) individually fitted neural implicit avatars and ii) an alternating optimization scheme that refines pose and surface through periods of close proximity, and thus segment the fused raw 4D scans into individual instances. From these instances we compile a dataset, Hi4D, of 4D textured scans of 20 subject pairs, 100 sequences, and a total of more than 11K frames. Hi4D contains rich interaction-centric annotations in 2D and 3D alongside accurately registered parametric body models. We define varied human pose and shape estimation tasks on this dataset and provide results from state-of-the-art methods on these benchmarks.
Vision-based disentanglement of in-contact subjects is a challenging task due to strong occlusions and a priori unknown geometries. Hence, multi-view systems typically fuse 3D surfaces of close subjects into a single, connected mesh. In this paper, we aim to segment 4D scans of closely interacting people to obtain instance-level annotations. Our method makes use of two main components: i) individually fitted neural implicit avatars, and ii) an alternating optimization scheme that refines pose and surface through periods of close proximity.
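As a rough illustration of the instance segmentation idea only (not the authors' implementation), the sketch below labels points of a fused scan by the instance whose implicit surface lies closest. It assumes each avatar can be queried as a signed distance function; the analytic spheres, the helper names `sphere_sdf` and `segment_points`, and the sample points are all hypothetical stand-ins for learned avatars and real scan vertices.

```python
import math


def sphere_sdf(center, radius):
    """Return a signed distance function for a sphere (negative inside).

    Assumption: a sphere stands in for a fitted implicit avatar.
    """
    def sdf(point):
        return math.dist(point, center) - radius
    return sdf


def segment_points(points, sdfs):
    """Label each point with the index of the instance whose surface is closest."""
    labels = []
    for point in points:
        # Absolute signed distance measures how far the point is from each surface.
        distances = [abs(sdf(point)) for sdf in sdfs]
        labels.append(distances.index(min(distances)))
    return labels


# Two hypothetical, overlapping "avatars" (their surfaces intersect, as in contact).
avatars = [
    sphere_sdf((0.0, 0.0, 0.0), 1.0),
    sphere_sdf((1.8, 0.0, 0.0), 1.0),
]

# Hypothetical vertices sampled from a single fused scan of both subjects.
fused_scan = [
    (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (2.8, 0.0, 0.0),
    (1.8, -1.0, 0.0),
]

labels = segment_points(fused_scan, avatars)
print(labels)  # [0, 0, 1, 1]
```

A nearest-surface rule like this is only meaningful when the per-subject surfaces are already well aligned with the scan, which is the role the pose and surface refinement plays in the description above.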
Hi4D is the first dataset containing rich interaction-centric annotations and high-quality 4D textured geometry of closely interacting humans.
Please fill out the Hi4D Application Form to access Hi4D. We will send you an email with more information after your application is approved.
We thank Stefan Walter and Dean Bakker for the infrastructure support. We thank Deniz Yildiz and Laura Wuelfroth for the data collection. We also thank all the participants who contributed to Hi4D.
@inproceedings{yin2023hi4d,
author = {Yin, Yifei and Guo, Chen and Kaufmann, Manuel and Zarate, Juan and Song, Jie and Hilliges, Otmar},
title = {Hi4D: 4D Instance Segmentation of Close Human Interaction},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2023}
}