Periodic-MAE: Periodic Video Masked Autoencoder for rPPG Estimation
In this paper, we propose a method that learns a general representation of periodic signals from unlabeled facial videos by capturing subtle changes in skin tone over time. The proposed framework employs a video masked autoencoder to learn a high-dimensional spatio-temporal representation of the facial region through self-supervised learning. Capturing quasi-periodic signals in video is crucial for remote photoplethysmography (rPPG) estimation. To account for signal periodicity, we apply frame masking through temporal video sampling, which exposes the model to resampled quasi-periodic signals during the pre-training stage. Moreover, the framework incorporates physiological bandlimit constraints, leveraging the property that physiological signals are sparse within their frequency band to provide pulse cues to the model. The pre-trained encoder is then transferred to the rPPG task, where it is used to extract physiological signals from facial videos. We evaluate the proposed method through extensive experiments on the PURE, UBFC-rPPG, MMPD, and V4V datasets. Our results demonstrate significant performance improvements, particularly in challenging cross-dataset evaluations. Our code is available at https://github.com/ziiho08/Periodic-MAE.
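The bandlimit idea can be illustrated with a short sketch. Below is a minimal PyTorch example of one plausible band-limit penalty: it measures how much spectral energy of a predicted pulse signal falls outside the physiological heart-rate band (roughly 40-180 bpm). The function name, band edges, and the ratio form of the penalty are illustrative assumptions for this sketch, not the paper's exact loss.

import torch

def bandlimit_loss(pred: torch.Tensor, fps: float,
                   low_hz: float = 0.66, high_hz: float = 3.0) -> torch.Tensor:
    # Hypothetical band-limit penalty; band edges ~40-180 bpm are an assumption.
    # Power spectrum of the predicted pulse signal (last dim = time).
    power = torch.fft.rfft(pred, dim=-1).abs() ** 2
    # Frequency of each spectral bin in Hz, given the video frame rate.
    freqs = torch.fft.rfftfreq(pred.shape[-1], d=1.0 / fps)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    # Penalize the fraction of spectral energy outside the heart-rate band,
    # encouraging a sparse, band-limited pulse estimate.
    out_of_band = power[..., ~in_band].sum(dim=-1)
    total = power.sum(dim=-1) + 1e-8
    return (out_of_band / total).mean()

Minimizing such a ratio drives the estimated signal's energy into the plausible pulse band, which is one way the frequency-sparsity property of physiological signals can be turned into a training cue.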
Jiho Choi, Sang Jun Lee
Subjects: Medical Research Methods; Electronic Technology Applications
Jiho Choi, Sang Jun Lee. Periodic-MAE: Periodic Video Masked Autoencoder for rPPG Estimation [EB/OL]. (2025-06-27) [2025-07-25]. https://arxiv.org/abs/2506.21855.