National Preprint Platform (国家预印本平台)

A Point Cloud Semantic Segmentation Network Based on a Dilated Spatial Pyramid

DSPNet: A Lightweight Dilated Spatial Pyramid Network for Semantic Segmentation of Point Clouds

Point clouds are sparse, irregular, and unstructured. These properties make uniform downsampling inefficient and contextual information difficult to capture. Costly downsampling methods and complex network-layer designs prevent existing deep networks from being applied directly to large point clouds. In this paper, we explore a new deep network structure for semantic segmentation of point clouds. The proposed scalable module borrows the idea of dilated convolution from 2D image processing to enlarge the receptive field without a downsampling operation, which is computationally expensive for large-scale point clouds. To further optimize network performance, we combine the dilated neighborhood with a spatial pyramid structure to replace the complex network layers used in existing methods. Dilated convolution reduces the information loss incurred by downsampling and simplifies the complex network layers designed to compensate for the loss from random downsampling. Combining the dilated neighborhood with the spatial pyramid structure in place of those layers significantly reduces the number of network parameters and improves accuracy. The proposed model, DSPNet, which uses RandLA-Net as its backbone, achieves higher accuracy and efficiency. We test accuracy on the SemanticKITTI and Semantic3D datasets and compare our model with other state-of-the-art methods. The experimental results show that, compared with the baseline model, our method reduces the number of parameters by 59% at similar accuracy, runs faster, and handles more input points at inference.
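The abstract does not say how the dilated neighborhood is built. One common construction in the point cloud literature is a dilated k-nearest-neighbor search: take the k × d nearest points and keep every d-th one, so the neighborhood covers a wider area with the same number of points. The sketch below illustrates that idea only and is not the authors' implementation; the function names `dilated_knn` and `pyramid_neighbourhoods` and the dilation rates (1, 2, 4) are assumptions made for illustration.

```python
import math


def dilated_knn(points, query, k, dilation):
    """Return indices of k neighbours of points[query], keeping every
    `dilation`-th point among the k * dilation nearest ones."""
    q = points[query]
    # Sort all other points by Euclidean distance to the query point.
    order = sorted(
        (i for i in range(len(points)) if i != query),
        key=lambda i: math.dist(points[i], q),
    )
    # Take the k * dilation nearest candidates, then subsample with a stride.
    return order[: k * dilation : dilation]


def pyramid_neighbourhoods(points, query, k, dilations=(1, 2, 4)):
    """Collect one dilated neighbourhood per rate (hypothetical pyramid levels)."""
    return {d: dilated_knn(points, query, k, d) for d in dilations}


# Toy example: ten points on a line, query at index 0.
points = [(float(i), 0.0, 0.0) for i in range(10)]
levels = pyramid_neighbourhoods(points, query=0, k=2)
print(levels)
```

With k fixed, a larger dilation rate reaches points farther from the query without adding neighbors, which is how the receptive field can grow without a downsampling step. A real network would aggregate features over each level rather than return indices.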

Yang Zhen (杨震), Lin Songnan (林松楠)

Computing technology, computer technology

Artificial Intelligence; Deep Learning; Point Cloud; Semantic Segmentation

Yang Zhen, Lin Songnan. A Point Cloud Semantic Segmentation Network Based on a Dilated Spatial Pyramid [EB/OL]. (2022-03-17) [2025-08-04]. http://www.paper.edu.cn/releasepaper/content/202203-225.
