Gradient Normalization & Depth Based Decay For Deep Learning
Robert Kwiatkowski, Oscar Chang
Abstract
In this paper, we introduce a novel method of gradient normalization and decay
with respect to depth. Our method normalizes all gradients in a deep neural
network and then decays them according to their depth in the network. The
proposed normalization and decay techniques can be used in conjunction with
most current state-of-the-art optimizers and are a simple addition to any
network. Despite its simplicity, this method improved convergence time on
state-of-the-art networks such as DenseNet and ResNet on image classification
tasks, as well as on an LSTM for natural language processing tasks.
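The abstract describes two steps: normalize each layer's gradient, then scale it by a factor that shrinks with the layer's depth. A minimal sketch of that idea is below; the choice of L2 normalization, exponential decay, and the convention that depth 0 is the layer nearest the output are assumptions, since the abstract does not specify them.

```python
import math

def normalize_and_decay(grads, decay=0.9):
    """Apply per-layer gradient normalization followed by depth-based decay.

    grads: list of per-layer gradients (each a flat list of floats),
           ordered from output layer (depth 0) inward -- an assumed convention.
    decay: multiplicative decay factor applied per unit of depth (assumed
           exponential form; the paper may use a different schedule).
    """
    out = []
    for depth, g in enumerate(grads):
        # Normalize the layer's gradient to unit L2 norm (guard against zeros).
        norm = math.sqrt(sum(x * x for x in g))
        g_hat = [x / norm for x in g] if norm > 0 else list(g)
        # Decay the normalized gradient according to its depth.
        out.append([x * decay ** depth for x in g_hat])
    return out
```

The scaled gradients would then be handed to an ordinary optimizer step, which is how the technique can sit in front of most existing optimizers.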
Robert Kwiatkowski, Oscar Chang. Gradient Normalization & Depth Based Decay For Deep Learning. arXiv:1712.03607, 2017-12-10. https://arxiv.org/abs/1712.03607