
A Technique of Synthesizing an Avatar’s Upper Facial Expression Based on the User’s Lower Face.
Lead Researcher | VR Design & Development, User Testing, Interdisciplinary Collaboration
Teammates: Xinge Liu (Tsinghua University), Ziyu Han (Carnegie Mellon University)
Supervisors: Prof. Xin Yi (Tsinghua University), Prof. Xin Tong (Duke Kunshan University)
Time: 06/2022-Present
——Research Description——
In social VR, an avatar’s upper facial expression is often missing due to occlusion by the head-mounted display (HMD). The mainstream solution embeds an eye-tracking camera in the HMD to capture the user’s upper face movements; combined with a facial tracking camera attached to the HMD that drives the avatar’s lower facial expression, this setup can animate the avatar’s whole face. However, the connection between the upper face and the lower face has not been studied yet. Our research proposes a technique that synthesizes an avatar’s upper facial expression from the user’s lower face, using only a facial tracking camera.
——My Role——
- Conducted literature reviews on methods of synthesizing avatars’ facial expressions.
- Designed and developed the technique of synthesizing an avatar’s upper facial expression based on the user’s lower face with the Unity game engine.
- Collaborated with a computational design teammate to design different blendshapes (the units of an avatar’s facial expression).
- Developed a virtual meeting platform with the Mirror framework on Unity.
- Collaborated with teammates to design a user test evaluating the performance of the technique.
- Recruited 18 participants and carried out the user test.
- Wrote a manuscript as an author to share the research outcomes with the community.
- Identified limitations of the existing technique and proposed concrete, feasible iteration plans to improve it.
Demo Video
[Accepted as a poster by the IEEE VR 2023 Conference]
——VR Device Setup——
A facial tracking camera is attached to the VR head-mounted display (HMD). The camera tracks the user's lower face movements and retargets them to the avatar's lower face through 38 lower-face blendshapes. A blendshape is a unit of an avatar's facial expression; combining different blendshapes with different weights produces different facial expressions. With this setup alone, the avatar's upper face (14 upper-face blendshapes) remains static.
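A minimal sketch of how blendshape weights combine into a face pose, assuming each blendshape stores per-vertex offsets from a neutral mesh (the array shapes and names are illustrative, not taken from the project):

```python
import numpy as np

# Illustrative shapes only: a neutral face mesh with V vertices and B lower-face
# blendshapes, each blendshape stored as per-vertex offsets from the neutral mesh.
V, B = 1000, 38
neutral = np.zeros((V, 3))                 # neutral mesh vertex positions (x, y, z)
deltas = np.random.randn(B, V, 3) * 0.01   # per-blendshape vertex offsets

def apply_blendshapes(weights):
    """Pose the face: neutral mesh plus the weighted sum of blendshape offsets."""
    weights = np.clip(weights, 0.0, 1.0)   # blendshape weights live in [0, 1]
    return neutral + np.einsum("b,bvc->vc", weights, deltas)

# Example: activate two lower-face blendshapes at half strength.
w = np.zeros(B)
w[[3, 17]] = 0.5
posed_vertices = apply_blendshapes(w)
```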

——Technique Implementation——
1. Avatar Facial Expression Blendshape Dataset Construction
- Categorized six facial expressions: Happy, Sad, Angry, Disgust, Fear, and Surprise.
- Collected 13,500 rows of blendshape data from 15 participants (a loading sketch follows this list).
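A minimal data-loading sketch under the assumption that each recorded row pairs the 38 lower-face blendshape weights (inputs) with the 14 upper-face weights (targets); the file and column names are hypothetical:

```python
import pandas as pd

# Hypothetical column layout: each row is one recorded sample pairing the 38
# lower-face blendshape weights (inputs) with the 14 upper-face weights (targets).
LOWER_COLS = [f"lower_{i}" for i in range(38)]
UPPER_COLS = [f"upper_{i}" for i in range(14)]

df = pd.read_csv("blendshape_dataset.csv")    # 13,500 rows from 15 participants
X = df[LOWER_COLS].to_numpy(dtype="float32")  # model inputs: lower-face weights
y = df[UPPER_COLS].to_numpy(dtype="float32")  # regression targets: upper-face weights
```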

2. Neural Network for Upper Face Blendshapes Prediction
- A multi-output neural network with five dense layers and numerical (regression) outputs, developed with TensorFlow (a sketch follows below).

Neural Network Structure
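A minimal TensorFlow/Keras sketch of such a network, taking the 38 lower-face blendshape weights as input and regressing the 14 upper-face weights; layer widths, activations, and training settings are illustrative assumptions, not the project's exact configuration:

```python
import tensorflow as tf

NUM_LOWER = 38   # lower-face blendshape weights (network input)
NUM_UPPER = 14   # upper-face blendshape weights (network output)

# Five dense layers ending in a multi-output regression head.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_LOWER,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_UPPER, activation="sigmoid"),  # keep weights in [0, 1]
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# model.fit(X, y, epochs=50, batch_size=64, validation_split=0.1)
```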
3. Technique Pipeline
- Deployed the trained neural network model into Unity.
- Used the SteamVR package to create the VR environment and the Barracuda package to run the model in Unity and exchange data between the avatar and the neural network (an export sketch follows this list).
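Barracuda imports neural networks in ONNX format, so one plausible bridge between the TensorFlow model and Unity is an ONNX export; a minimal sketch using the tf2onnx package (file and tensor names are hypothetical):

```python
import tensorflow as tf
import tf2onnx

# Load the trained Keras model and convert it to ONNX so Unity's Barracuda
# package can import it. File and tensor names are hypothetical.
model = tf.keras.models.load_model("upper_face_model.h5")
spec = (tf.TensorSpec((None, 38), tf.float32, name="lower_face_weights"),)
tf2onnx.convert.from_keras(model, input_signature=spec,
                           output_path="upper_face_model.onnx")
```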

