Improved Parallel BP Algorithm Based on Spark
The BP (Back Propagation) neural network is a multilayer feedforward network trained by error back propagation, and it is one of the most popular neural network models. The traditional BP algorithm converges slowly, takes a long time to train, often gets trapped in local minima, and is prone to overfitting. In this paper, we improve the traditional BP algorithm in several respects: dynamically adjusting the learning rate, adding a momentum factor, using mini-batch gradient descent, switching to the cross-entropy cost function, and using early stopping to prevent overfitting. We then implement the improved algorithm on the Spark platform. Spark is an in-memory parallel computing framework that handles iterative computation well, which makes it well suited to parallelizing the BP algorithm. In experiments comparing against the multilayer perceptron classifier in MLlib, the BP algorithm implemented in this paper converges at least 30% faster while maintaining accuracy.
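The five modifications listed in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' Spark implementation: the abstract does not give the update rules, so the learning-rate schedule (grow by 5% when the training loss falls, shrink by 30% otherwise, capped at `lr_max`), the classical momentum form, the `patience` threshold, and all function and parameter names below are assumptions chosen for illustration. On Spark, the per-batch gradient sum computed in `gradient` could plausibly be aggregated across partitions instead of in a local loop.

```python
import math
import random

def sigmoid(z):
    # Logistic function written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def forward(theta, x, n_hidden):
    # theta is flat: per hidden unit (weights, bias), then output weights, output bias.
    n_in = len(x)
    h = []
    for j in range(n_hidden):
        base = j * (n_in + 1)
        h.append(sigmoid(theta[base + n_in] + sum(theta[base + i] * x[i] for i in range(n_in))))
    out = n_hidden * (n_in + 1)
    y = sigmoid(theta[out + n_hidden] + sum(theta[out + j] * h[j] for j in range(n_hidden)))
    return h, y

def loss(theta, data, n_hidden):
    # Mean binary cross-entropy; y is clamped so log() never sees 0.
    total = 0.0
    for x, t in data:
        y = min(max(forward(theta, x, n_hidden)[1], 1e-12), 1.0 - 1e-12)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total / len(data)

def gradient(theta, batch, n_hidden):
    # Back propagation of the mean cross-entropy over one mini-batch.
    grad = [0.0] * len(theta)
    for x, t in batch:
        n_in = len(x)
        h, y = forward(theta, x, n_hidden)
        out = n_hidden * (n_in + 1)
        delta_out = y - t  # sigmoid output with cross-entropy gives error term y - t
        grad[out + n_hidden] += delta_out
        for j in range(n_hidden):
            grad[out + j] += delta_out * h[j]
            delta_h = delta_out * theta[out + j] * h[j] * (1.0 - h[j])
            base = j * (n_in + 1)
            grad[base + n_in] += delta_h
            for i in range(n_in):
                grad[base + i] += delta_h * x[i]
    return [g / len(batch) for g in grad]

def train(train_set, val_set, n_hidden=4, lr=0.5, lr_max=2.0, momentum=0.9,
          batch_size=8, max_epochs=200, patience=20, seed=0):
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    theta = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden * (n_in + 1) + n_hidden + 1)]
    velocity = [0.0] * len(theta)
    best_theta, best_val, stale = list(theta), float("inf"), 0
    prev_train, history = float("inf"), []
    for _ in range(max_epochs):
        data = list(train_set)
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):  # mini-batch gradient descent
            g = gradient(theta, data[start:start + batch_size], n_hidden)
            velocity = [momentum * v - lr * gi for v, gi in zip(velocity, g)]  # momentum
            theta = [p + v for p, v in zip(theta, velocity)]
        train_loss = loss(theta, train_set, n_hidden)
        # Assumed adaptive rule: speed up while improving, back off otherwise.
        lr = min(lr * 1.05, lr_max) if train_loss < prev_train else lr * 0.7
        prev_train = train_loss
        val_loss = loss(theta, val_set, n_hidden)
        history.append(val_loss)
        if val_loss < best_val - 1e-6:
            best_theta, best_val, stale = list(theta), val_loss, 0
        else:
            stale += 1
            if stale >= patience:  # early stopping on stalled validation loss
                break
    return best_theta, best_val, history

if __name__ == "__main__":
    rng = random.Random(42)
    points = [[rng.random(), rng.random()] for _ in range(200)]
    data = [(p, 1 if p[0] + p[1] > 1.0 else 0) for p in points]
    _, best_val, history = train(data[:150], data[150:])
    print("epochs run:", len(history), "best validation loss:", best_val)
```

`train` returns the parameters with the lowest validation loss rather than the final ones, which is what makes the early-stopping check act as a guard against overfitting and not merely as a time limit.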
Wu Bin (吴斌), Fang Wei (方维), Liu Yong (刘永)
Computing technology; computer technology
neural network; improved back propagation algorithm; Spark framework; iterative computation; parallel implementation
Wu Bin, Fang Wei, Liu Yong. Improved Parallel BP Algorithm Based on Spark [EB/OL]. (2016-12-19) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/201612-356.