
Spatial-Temporal Network for 360° Video Saliency Prediction

Abstract

Automatically capturing salient regions in video is essential for guiding users when watching panoramic video. This paper proposes a panoramic video saliency detection network based on spatio-temporal feature fusion. The network consists of a preprocessing module, a feature encoding module, a saliency prediction module, and an output integration module. The preprocessing module resolves the boundary stretching and image distortion that panoramic video suffers on the equirectangular plane by means of cube-padding mapping. The feature encoding module extracts multi-scale contextual spatio-temporal features and fuses them with adaptive weights learned through an attention mechanism. The saliency prediction module models inter-frame temporal information with Bi-ConvLSTM and predicts the salient region of the current frame by combining the results of the previous and next frames. The model achieves better accuracy than existing models on two public datasets, which verifies its effectiveness.
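The attention-based adaptive weighting of spatial and temporal features described in the abstract can be illustrated with a short sketch. This is only a minimal PyTorch example of per-pixel adaptive fusion between two feature branches, not the authors' implementation; the module name SpatioTemporalFusion, the two-channel softmax weighting, and all layer sizes are assumptions.

```python
# Minimal sketch (assumption, not the paper's code): fuse spatial and temporal
# feature maps with per-pixel attention weights learned from both branches.
import torch
import torch.nn as nn


class SpatioTemporalFusion(nn.Module):
    """Fuse two (B, C, H, W) feature maps with adaptive per-pixel weights."""

    def __init__(self, in_channels: int):
        super().__init__()
        # Attention branch: concatenated features -> 2-channel weight map
        self.attn = nn.Sequential(
            nn.Conv2d(2 * in_channels, in_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 2, kernel_size=1),
            nn.Softmax(dim=1),  # weights of the two branches sum to 1 per pixel
        )

    def forward(self, spatial_feat: torch.Tensor, temporal_feat: torch.Tensor) -> torch.Tensor:
        # spatial_feat, temporal_feat: (B, C, H, W) encoder feature maps
        weights = self.attn(torch.cat([spatial_feat, temporal_feat], dim=1))
        w_s, w_t = weights[:, 0:1], weights[:, 1:2]  # (B, 1, H, W) each
        return w_s * spatial_feat + w_t * temporal_feat


# Usage: fuse two hypothetical 64-channel feature maps of size 32x64.
fusion = SpatioTemporalFusion(in_channels=64)
fused = fusion(torch.randn(2, 64, 32, 64), torch.randn(2, 64, 32, 64))
```

The softmax over the two weight channels keeps the spatial and temporal contributions normalized at every pixel, which is one common way to realize the adaptive-weight fusion the abstract refers to.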

李明伟 (Li Mingwei), 王晶 (Wang Jing)

Computing Technology; Computer Technology

deep learning; panoramic video; saliency detection; feature fusion

李明伟, 王晶. 基于时空特征融合的全景视频显著性检测网络 [EB/OL]. (2021-03-16) [2025-08-16]. http://www.paper.edu.cn/releasepaper/content/202103-158.
