
Research on Audio-Visual Quality Assessment Dataset and Method for User-Generated Omnidirectional Video
Source: arXiv
Abstract

In response to the rising prominence of the Metaverse, omnidirectional videos (ODVs) have garnered notable interest, gradually shifting from professional-generated content (PGC) to user-generated content (UGC). However, research on audio-visual quality assessment (AVQA) for ODVs remains limited. To address this, we construct a dataset of UGC omnidirectional audio and video (A/V) content. The videos were captured by five individuals using two different types of omnidirectional cameras, yielding 300 videos covering 10 different scene types. A subjective AVQA experiment was conducted on the dataset to obtain the Mean Opinion Scores (MOSs) of the A/V sequences. To further facilitate the development of the UGC-ODV AVQA field, we build an effective AVQA baseline model on the proposed dataset, which consists of a video feature extraction module, an audio feature extraction module, and an audio-visual fusion module. The experimental results demonstrate that our model achieves the best performance on the proposed dataset.
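The abstract describes a three-module pipeline (video features, audio features, audio-visual fusion) trained against MOS labels from a subjective study. The paper does not publish its code, so the following is only a minimal structural sketch under stated assumptions: the feature extractors here are trivial stand-ins, and the weighted late-fusion rule and all function names are hypothetical, not the authors' method.

```python
from statistics import mean

def extract_video_features(frames):
    """Stand-in video feature extractor: mean pixel value per frame.
    (A real model would use a learned spatiotemporal network.)"""
    return [mean(frame) for frame in frames]

def extract_audio_features(samples, win=4):
    """Stand-in audio feature extractor: mean absolute amplitude
    per fixed-length window of the waveform."""
    return [mean(abs(s) for s in samples[i:i + win])
            for i in range(0, len(samples), win)]

def fuse_and_predict(video_feats, audio_feats, w_video=0.6, w_audio=0.4):
    """Hypothetical late fusion: pool each modality, then take a
    weighted sum as the predicted quality score."""
    return w_video * mean(video_feats) + w_audio * mean(audio_feats)

def mos(opinion_scores):
    """Mean Opinion Score: the average of subjects' ratings
    for one A/V sequence, as collected in the subjective study."""
    return mean(opinion_scores)
```

For example, `mos([4, 5, 3, 4])` returns 4.0, and a predicted score from `fuse_and_predict` could be compared against such MOS labels when training or evaluating a real AVQA model.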

Fei Zhao, Da Pan, Zelu Qi, Ping Shi

Subjects: Communication and Computing Technology; Computer Technology

Fei Zhao, Da Pan, Zelu Qi, Ping Shi. Research on Audio-Visual Quality Assessment Dataset and Method for User-Generated Omnidirectional Video [EB/OL]. (2025-06-11) [2025-06-29]. https://arxiv.org/abs/2506.10331.
