Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network

11/29/2017
by   Qi Ye, et al.

Hand pose estimation aims to predict the pose parameters of a 3D hand model, such as the locations of the hand joints. The problem is challenging because of large variations in viewpoint and articulation and because of severe self-occlusion. Many researchers have approached it from two directions: input feature learning and output prediction modelling. Though effective, most existing discriminative methods give only a deterministic estimate of the target pose. Because their mapping is intrinsically single-valued, they also fail to handle self-occlusion adequately, since occluded joints exhibit multiple modes. In this paper, we tackle the self-occlusion issue and provide a complete description of the observed poses given an input depth image through a hierarchical mixture density network (HMDN) framework. In particular, HMDN leverages a state-of-the-art CNN module for feature learning and models the output density in a two-level hierarchy to reconcile single-valued and multi-valued mappings. The whole framework is naturally end-to-end trainable with a mixture of two differentiable density functions. HMDN produces interpretable and diverse candidate samples, and significantly outperforms state-of-the-art algorithms on benchmarks that exhibit occlusions.
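The abstract does not spell out the two density functions, so the following is only a minimal sketch of what a two-level density for a single joint could look like. It assumes a Bernoulli visibility variable at the top level, a single isotropic Gaussian for a visible joint, and a Gaussian mixture for an occluded joint. These modelling choices, and every name in the code (`gaussian_logpdf`, `hierarchical_nll`, `w`, `pi`, and so on), are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def gaussian_logpdf(y, mu, sigma):
    """Log-density of an isotropic Gaussian, reduced over the last axis."""
    d = y.shape[-1]
    sq_dist = np.sum((y - mu) ** 2, axis=-1)
    return -0.5 * sq_dist / sigma**2 - d * np.log(sigma) - 0.5 * d * np.log(2 * np.pi)


def hierarchical_nll(y, v, w, mu_s, sigma_s, pi, mu_m, sigma_m):
    """Negative log-likelihood of one joint under an assumed two-level density.

    y        : (3,) ground-truth joint location
    v        : 1 if the joint is visible, 0 if occluded (assumed label)
    w        : predicted probability that the joint is visible
    mu_s     : (3,) mean of the single Gaussian (visible case)
    sigma_s  : scalar standard deviation of the single Gaussian
    pi       : (J,) mixture weights (occluded case), summing to 1
    mu_m     : (J, 3) mixture component means
    sigma_m  : (J,) mixture component standard deviations
    """
    eps = 1e-12  # guards against log(0)

    # Lower level, visible branch: single-valued mapping.
    log_single = gaussian_logpdf(y, mu_s, sigma_s)

    # Lower level, occluded branch: multi-valued mapping via a mixture,
    # evaluated with a numerically stable log-sum-exp.
    comp = np.log(pi + eps) + gaussian_logpdf(y[None, :], mu_m, sigma_m)
    m = np.max(comp)
    log_mix = m + np.log(np.sum(np.exp(comp - m)))

    # Top level: Bernoulli likelihood of the visibility label.
    log_vis = v * np.log(w + eps) + (1 - v) * np.log(1 - w + eps)

    return -(log_vis + v * log_single + (1 - v) * log_mix)
```

Because every term is differentiable in the predicted parameters, a loss of this shape could in principle be attached to a CNN's output layer and trained end to end, which is consistent with the abstract's description.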
