Breaking Annotation Barriers: Generalized Video Quality Assessment via Ranking-based Self-Supervision

Source: arXiv
Abstract

Video quality assessment (VQA) is essential for quantifying perceptual quality in various video processing workflows, spanning from camera capture systems to over-the-top streaming platforms. While recent supervised VQA models have made substantial progress, their reliance on manually annotated datasets, which are labor-intensive, costly, and difficult to scale, has hindered further improvement of their generalization to unseen video content and distortions. To bridge this gap, we introduce a self-supervised learning framework for VQA that learns quality assessment capabilities from large-scale, unlabeled web videos. Our approach leverages a learning-to-rank paradigm to train a large multimodal model (LMM) on video pairs automatically labeled in two ways: quality pseudo-labeling by existing VQA models and relative quality ranking based on synthetic distortion simulations. Furthermore, we introduce a novel iterative self-improvement training strategy, in which the trained model acts as an improved annotator to iteratively refine the annotation quality of the training data. By training on a dataset $10\times$ larger than the existing VQA benchmarks, our model: (1) achieves zero-shot performance on in-domain VQA benchmarks that matches or surpasses supervised models; (2) demonstrates superior out-of-distribution (OOD) generalization across diverse video content and distortions; and (3) sets a new state of the art when fine-tuned on human-labeled datasets. Extensive experimental results validate the effectiveness of our self-supervised approach in training generalized VQA models. The datasets and code will be publicly released to facilitate future research.
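
The abstract does not specify the ranking objective, so the following is only a minimal sketch of the general idea of learning to rank from pseudo-labeled pairs. It assumes a generic pairwise logistic loss, which may differ from the loss the authors use. The function names, the `margin` threshold, and the example scores are all hypothetical and introduced here for illustration.

```python
import math
from itertools import combinations


def make_ranked_pairs(pseudo_scores, margin=0.5):
    """Build (better, worse) index pairs from pseudo-labels.

    Pairs whose pseudo-score gap is within `margin` are skipped, on the
    assumption that small gaps are too noisy to give a reliable ranking.
    """
    pairs = []
    for i, j in combinations(range(len(pseudo_scores)), 2):
        gap = pseudo_scores[i] - pseudo_scores[j]
        if gap > margin:
            pairs.append((i, j))
        elif gap < -margin:
            pairs.append((j, i))
    return pairs


def pairwise_logistic_loss(pred_scores, pairs):
    """Mean of log(1 + exp(-(s_better - s_worse))) over all pairs."""
    if not pairs:
        return 0.0
    total = 0.0
    for better, worse in pairs:
        diff = pred_scores[better] - pred_scores[worse]
        # Numerically stable softplus(-diff).
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pairs)


# Hypothetical pseudo-labels from an existing VQA model for three videos.
pseudo = [4.0, 2.0, 3.8]
pairs = make_ranked_pairs(pseudo)

# Predictions that agree with the pseudo-ranking give a lower loss
# than predictions that reverse it.
agree = pairwise_logistic_loss([1.0, -1.0, 0.5], pairs)
reverse = pairwise_logistic_loss([-1.0, 1.0, -0.5], pairs)
print(pairs, agree < reverse)
```

In an iterative scheme of the kind the abstract describes, the scores predicted by the trained model could replace `pseudo` in the next round to regenerate the pairs. That loop is omitted here because the abstract gives no details about it.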

Linhan Cao, Wei Sun, Kaiwei Zhang, Yicong Peng, Guangtao Zhai, Xiongkuo Min

Subject: Computing technology, computer technology

Linhan Cao, Wei Sun, Kaiwei Zhang, Yicong Peng, Guangtao Zhai, Xiongkuo Min. Breaking Annotation Barriers: Generalized Video Quality Assessment via Ranking-based Self-Supervision [EB/OL]. (2025-05-06) [2025-07-02]. https://arxiv.org/abs/2505.03631.
