2D/3D Face Vision & Graphics | Multimedia AI Lab

Understanding and generating human faces is a fundamental research topic in computer vision. The PI has extensive research and development experience in human face analysis, including face detection, face recognition, image synthesis, 3D face reconstruction, and novel-view rendering of faces. Many of these algorithms have shipped in Microsoft Azure AI services and other products (Huang et al., 2021; Huang et al., 2023; Kim et al., 2021).

Constructing a 3D Face Morphable Model (3DMM) from raw 3D scan data.

In 3D face reconstruction, a common approach is to use a 3D face morphable model, which provides a strong prior on human facial shape and texture.
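As a rough illustration of how such a prior can be built, the sketch below fits a linear shape model with PCA and then synthesizes a face shape from a few coefficients. It is a minimal example under stated assumptions, not the lab's actual pipeline: the scans are assumed to be already registered to a common mesh topology (the same vertex order in every scan), texture is omitted, and the function names `build_morphable_model`, `synthesize_shape`, and `project_shape` are hypothetical.

```python
import numpy as np


def build_morphable_model(scans, num_components):
    """Fit a PCA shape model to registered scans.

    scans: array of shape (num_scans, num_vertices, 3). Assumes dense
    correspondence, i.e. vertex i means the same facial point in every scan.
    Returns the mean shape, an orthonormal basis, and per-component std devs.
    """
    num_scans = scans.shape[0]
    flat = scans.reshape(num_scans, -1)            # (N, 3V)
    mean_shape = flat.mean(axis=0)                 # (3V,)
    centered = flat - mean_shape
    # Rows of vt are the principal directions of shape variation.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:num_components]                    # (K, 3V)
    stddev = singular_values[:num_components] / np.sqrt(max(num_scans - 1, 1))
    return mean_shape, basis, stddev


def synthesize_shape(mean_shape, basis, stddev, coeffs):
    """Generate a shape from coefficients given in units of std dev."""
    flat = mean_shape + (np.asarray(coeffs) * stddev) @ basis
    return flat.reshape(-1, 3)


def project_shape(mean_shape, basis, stddev, shape):
    """Recover coefficients (in units of std dev) for a registered shape."""
    return ((shape.reshape(-1) - mean_shape) @ basis.T) / stddev


if __name__ == "__main__":
    # Synthetic stand-in for registered scans: 20 "faces" of 50 vertices each.
    rng = np.random.default_rng(0)
    scans = rng.normal(size=(20, 50, 3))
    mean_shape, basis, stddev = build_morphable_model(scans, num_components=5)
    new_face = synthesize_shape(mean_shape, basis, stddev, [1.0, -0.5, 0.0, 0.0, 0.0])
    print(new_face.shape)
```

Expressing coefficients in units of standard deviation keeps them comparable across components, which is one common way to regularize a reconstruction toward plausible faces.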

Face Texture Completion.

Face Alignment.

References

2023

AAAI

FreeEnricher: Enriching Face Landmarks without Additional Cost

Yangyu Huang, Xi Chen, Jongyoo Kim, and 4 more authors

In AAAI Conference on Artificial Intelligence, 2023

Recent years have witnessed significant growth in face alignment. Although dense facial landmarks are in high demand in various scenarios, e.g., cosmetic medicine and facial beautification, most works consider only sparse face alignment. To address this problem, we present a framework that can enrich landmark density using existing sparse landmark datasets, e.g., 300W with 68 points and WFLW with 98 points. First, we observe that the local patches along each semantic contour are highly similar in appearance. Then, we propose a weakly supervised idea of learning the refinement ability on original sparse landmarks and adapting this ability to enriched dense landmarks. Meanwhile, several operators are devised and organized together to implement the idea. Finally, the trained model is applied as a plug-and-play module to existing face alignment networks. To evaluate our method, we manually label the dense landmarks on the 300W test set. Our method yields state-of-the-art accuracy not only on the newly constructed dense 300W test set but also on the original sparse 300W and WFLW test sets, without additional cost.

2021

ICCV

ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment

Yangyu Huang, Hao Yang, Chong Li, and 2 more authors

In IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2021

Recent progress in CNNs has dramatically improved face alignment performance. However, few works have paid attention to the error-bias with respect to the error distribution of facial landmarks. In this paper, we investigate the error-bias issue in face alignment, where the distributions of landmark errors tend to spread along the tangent line to landmark curves. This error-bias is not trivial since it is closely connected to the ambiguous landmark labeling task. Inspired by this observation, we seek a way to leverage the error-bias property for better convergence of the CNN model. To this end, we propose anisotropic direction loss (ADL) and anisotropic attention module (AAM) for coordinate and heatmap regression, respectively. ADL imposes a strong binding force in the normal direction for each landmark point on facial boundaries. AAM, on the other hand, is an attention module that produces an anisotropic attention mask focusing on the region of a point and its local edge connected by adjacent points. It has a stronger response in the tangent direction than in the normal direction, which means relaxed constraints along the tangent. These two methods work in a complementary manner to learn both facial structures and texture details. Finally, we integrate them into an optimized end-to-end training pipeline named ADNet. Our ADNet achieves state-of-the-art results on the 300W, WFLW and COFW datasets, which demonstrates its effectiveness and robustness.

ICCV

Learning High-Fidelity Face Texture Completion without Complete Face Texture

Jongyoo Kim, Jiaolong Yang, and Xin Tong

In IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2021

For face texture completion, previous methods typically use complete textures captured by multiview imaging systems or 3D scanners for supervised learning. This paper deals with a new, challenging problem: learning to complete invisible texture in a single face image without using any complete texture. We simply leverage a large corpus of face images of different subjects (e.g., FFHQ) to train a texture completion model in an unsupervised manner. To achieve this, we propose DSD-GAN, a novel deep neural network based method that applies two discriminators in UV map space and image space. These two discriminators work in a complementary manner to learn both facial structures and texture details. We show that their combination is essential to obtain high-fidelity results. Although the network never sees any complete facial appearance, it is able to generate compelling full textures from single images.