Although occlusion is ubiquitous in nature, it remains a fundamental challenge for pose estimation, and existing heatmap-based approaches degrade severely under occlusion. Their intrinsic problem is that they localize joints directly from visual evidence, which invisible joints lack. In contrast to localization, our framework estimates invisible joints from an inference perspective by proposing an Image-Guided Progressive GCN module, which provides a comprehensive understanding of both image context and pose structure. Moreover, existing benchmarks contain only limited occlusion cases for evaluation. We therefore study this problem thoroughly and propose the novel OPEC-Net framework together with a new Occluded Pose (OCPose) dataset of 9k annotated images. Extensive quantitative and qualitative evaluations on benchmarks demonstrate that OPEC-Net achieves significant improvements over recent leading works. Notably, OCPose is the most complex occlusion dataset in terms of average IoU between adjacent instances. Source code and OCPose will be publicly available.
This figure depicts the two-stage estimation of a single pose. The GCN-based pose correction stage contains two modules: Cascaded Feature Adaptation and the Image-Guided Progressive GCN. First, a base module generates heatmaps. An integral regression step then transforms the heatmap representation into coordinates, which serve as the initial pose for the GCN. The initial pose and the three feature maps from the base module are processed by the Image-Guided Progressive GCN: the multi-scale feature maps are updated through the Cascaded Feature Adaptation module and fed into each ResGCN-Attention block. J1, J2, and J3 are node features sampled from the image features at the corresponding joint locations (x, y). The errors of the initial pose, Pose1, Pose2, and the final pose all contribute to the objective function, and OPEC-Net is trained end-to-end to estimate the human pose.
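Two of the steps in the caption can be illustrated compactly: the integral regression that turns a heatmap into an (x, y) coordinate (a softmax-weighted expectation over pixel locations, often called soft-argmax), and the multi-stage objective that sums the error of every intermediate pose. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of those two ideas, with all function names chosen for this example.

```python
import numpy as np

def soft_argmax_2d(heatmaps):
    """Integral regression: convert each joint heatmap into an (x, y)
    coordinate as the softmax-weighted expectation over pixel positions.
    heatmaps: array of shape (J, H, W); returns array of shape (J, 2)."""
    j, h, w = heatmaps.shape
    flat = heatmaps.reshape(j, -1)
    flat = flat - flat.max(axis=1, keepdims=True)      # numerical stability
    exp = np.exp(flat)
    probs = (exp / exp.sum(axis=1, keepdims=True)).reshape(j, h, w)
    xs = np.arange(w, dtype=np.float64)
    ys = np.arange(h, dtype=np.float64)
    x = (probs.sum(axis=1) * xs).sum(axis=1)   # marginalize rows, expect x
    y = (probs.sum(axis=2) * ys).sum(axis=1)   # marginalize cols, expect y
    return np.stack([x, y], axis=1)

def multi_stage_loss(poses, gt):
    """Sum of per-stage L2 errors, as the caption describes: the errors of
    the initial pose, Pose1, Pose2, and the final pose all enter the
    objective. poses: list of (J, 2) arrays; gt: (J, 2) ground truth."""
    return sum(np.mean((p - gt) ** 2) for p in poses)
```

Because the expectation is differentiable (unlike a hard argmax over the heatmap), gradients can flow from the GCN correction stage back into the heatmap branch, which is what allows the whole network to be trained end-to-end.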
Our approach outperforms existing state-of-the-art methods in both multi-person and couple scenarios.
We created a new dataset, OCPose, to evaluate our method in highly crowded scenes. More details and the download link can be found here.
@inproceedings{qiu2020peeking,
title={Peeking into occluded joints: A novel framework for crowd pose estimation},
author={Qiu, Lingteng and Zhang, Xuanye and Li, Yanran and Li, Guanbin and Wu, Xiaojun and Xiong, Zixiang and Han, Xiaoguang and Cui, Shuguang},
booktitle={European Conference on Computer Vision},
pages={488--504},
year={2020},
organization={Springer}
}