Universal Facial Encoding of Codec Avatars from VR Headsets
Akshay Venkatesh, Shih-En Wei, Yaser Sheikh, Gabriel Schwartz, Chen Cao, Tomas Simon, Shaojie Bai, Chenghui Li, Jason Saragih, Te-Li Wang, Ryan Wrensch
Abstract
Faithful real-time facial animation is essential for avatar-mediated
telepresence in Virtual Reality (VR). To emulate authentic communication,
avatar animation needs to be efficient and accurate: able to capture both
extreme and subtle expressions within a few milliseconds to sustain the rhythm
of natural conversations. The oblique and incomplete views of the face,
variability in the donning of headsets, and illumination variation due to the
environment are some of the unique challenges in generalization to unseen
faces. In this paper, we present a method that can animate a photorealistic
avatar in real time from head-mounted cameras (HMCs) on a consumer VR headset.
We present a self-supervised learning approach, based on a cross-view
reconstruction objective, that enables generalization to unseen users. We
present a lightweight expression calibration mechanism that increases accuracy
with minimal additional cost to run-time efficiency. We present an improved
parameterization for precise ground-truth generation that provides robustness
to environmental variation. The resulting system produces accurate facial
animation for unseen users wearing VR headsets in real time. We compare our
approach to prior face-encoding methods, demonstrating significant improvements
in both quantitative metrics and qualitative results.
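
To make the cross-view reconstruction objective above concrete, here is a minimal, hypothetical PyTorch sketch; it is not the paper's implementation. The ExpressionEncoder and ToyAvatarDecoder modules, the image resolution, and the 3-parameter camera representation are assumptions made purely for illustration. The idea it captures: an expression code estimated from one head-mounted-camera (HMC) view is rendered through a frozen, differentiable avatar into a second, simultaneous HMC view, and the photometric error against that second view supervises the encoder without per-frame expression labels.

import torch
import torch.nn as nn

class ExpressionEncoder(nn.Module):
    # Hypothetical encoder: one grayscale HMC crop -> latent expression code.
    def __init__(self, code_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, code_dim),
        )

    def forward(self, img):
        return self.net(img)

class ToyAvatarDecoder(nn.Module):
    # Stand-in for a pretrained, differentiable avatar renderer that maps
    # (expression code, camera parameters) -> an image in that camera's view.
    def __init__(self, code_dim: int = 256, out_hw=(64, 64)):
        super().__init__()
        self.out_hw = out_hw
        self.fc = nn.Linear(code_dim + 3, out_hw[0] * out_hw[1])

    def forward(self, code, cam):
        x = torch.sigmoid(self.fc(torch.cat([code, cam], dim=-1)))
        return x.view(-1, 1, *self.out_hw)

def cross_view_loss(encoder, decoder, view_a, view_b, cam_b):
    # Encode view A only, render the avatar into camera B's viewpoint, and
    # penalize the photometric gap to the real image captured from view B.
    # Matching a *different* simultaneous view discourages the encoder from
    # memorizing view-specific appearance.
    code = encoder(view_a)
    render_b = decoder(code, cam_b)
    return (render_b - view_b).abs().mean()

# Toy usage with random stand-in data (batch of 4 frame pairs).
encoder, decoder = ExpressionEncoder(), ToyAvatarDecoder()
view_a = torch.rand(4, 1, 64, 64)   # e.g., a lower-face HMC image
view_b = torch.rand(4, 1, 64, 64)   # a second HMC view of the same instant
cam_b = torch.rand(4, 3)            # placeholder camera parameters for view B
loss = cross_view_loss(encoder, decoder, view_a, view_b, cam_b)
loss.backward()

In the real system the decoder would be a pretrained photorealistic codec avatar and the encoder would consume several infrared HMC crops per frame; the toy modules above only stand in for those components so that the objective itself is runnable.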
arXiv:2407.13038 (2024-07-17). https://arxiv.org/abs/2407.13038