Training Deep Neural Networks

(Reposted from http://handong1587.github.io/deep_learning/2015/10/09/training-dnn.html)

Published: 09 Oct 2015  Category: deep_learning

Tutorials

Popular Training Approaches of DNNs - A Quick Overview

https://medium.com/@asjad/popular-training-approaches-of-dnns-a-quick-overview-26ee37ad7e96#.pqyo039bb

Activation functions

Rectified Linear Units Improve Restricted Boltzmann Machines (ReLU)

Rectifier Nonlinearities Improve Neural Network Acoustic Models (leaky-ReLU, aka LReLU)

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (PReLU)

Empirical Evaluation of Rectified Activations in Convolutional Network (ReLU/LReLU/PReLU/RReLU)

Deep Learning with S-shaped Rectified Linear Activation Units (SReLU)

Parametric Activation Pools greatly increase performance and consistency in ConvNets

Noisy Activation Functions
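
Most of the papers above vary how negative inputs are handled. A rough NumPy sketch of the three most common variants (the slope values are illustrative defaults, not taken from any one paper):

```python
import numpy as np

def relu(x):
    # Zero out negative inputs (Nair & Hinton).
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small fixed slope for x < 0 (Maas et al.); alpha=0.01 is a common choice.
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # Same form as leaky ReLU, but alpha is a learned parameter (He et al.).
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x), leaky_relu(x), prelu(x, alpha=0.25))
```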

Weights Initialization

An Explanation of Xavier Initialization

Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?

All you need is a good init

Data-dependent Initializations of Convolutional Neural Networks

What are good initial weights in a neural network?

RandomOut: Using a convolutional gradient norm to win The Filter Lottery
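
As background for the entries above: Xavier/Glorot initialization scales the weight variance by the layer's fan-in and fan-out, and the He variant introduced in the PReLU paper doubles it to compensate for ReLU zeroing half the inputs. A minimal sketch, assuming Gaussian-distributed weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Glorot & Bengio: variance 2 / (fan_in + fan_out) keeps activation
    # variance roughly constant across layers for tanh-like units.
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # He et al.: variance 2 / fan_in compensates for ReLU units.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

W = he_init(784, 256)
print(W.std())  # should be close to sqrt(2/784)
```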

Batch Normalization

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (ImageNet top-5 error: 4.82%)

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks
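
The core transform from the batch normalization paper standardizes each feature over the mini-batch, then applies a learned scale and shift. A training-mode forward-pass sketch (the running statistics used at inference are omitted for brevity):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: (batch, features). Normalize each feature over the batch.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learned scale (gamma) and shift (beta) restore representational power.
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 3.0 + 1.0
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))  # ~0 and ~1 per feature
```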

Loss Function

The Loss Surfaces of Multilayer Networks
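
For concreteness, the kind of loss whose surface such papers analyze is typically softmax cross-entropy; a numerically stable NumPy version:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # logits: (batch, classes); labels: (batch,) integer class ids.
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

logits = np.array([[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]])
print(softmax_cross_entropy(logits, np.array([0, 1])))
```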

Optimization Methods

On Optimization Methods for Deep Learning

On the importance of initialization and momentum in deep learning

Invariant backpropagation: how to train a transformation-invariant neural network

A practical theory for designing very deep convolutional neural network
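
The momentum paper above studies the classical update v <- mu*v - lr*grad, w <- w + v. A toy sketch on a quadratic (learning rate and momentum values are illustrative):

```python
import numpy as np

def sgd_momentum(w, grad_fn, lr=0.1, mu=0.9, steps=100):
    # Classical momentum as discussed by Sutskever et al.:
    # the velocity v accumulates a decaying sum of past gradients.
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad_fn(w)
        w = w + v
    return w

# Toy quadratic: f(w) = 0.5 * ||w||^2, so the gradient is just w.
w = sgd_momentum(np.array([5.0, -3.0]), grad_fn=lambda w: w)
print(w)  # converges toward the minimum at the origin
```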

Stochastic Optimization Techniques

Alec Radford’s animations for optimization algorithms

http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html

Faster Asynchronous SGD (FASGD)

An overview of gradient descent optimization algorithms (★★★★★)

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters

Writing fast asynchronous SGD/AdaGrad with RcppParallel
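
Among the per-parameter methods surveyed in the overview above, AdaGrad divides each coordinate's step by the root of its accumulated squared gradients, so frequently large coordinates get smaller effective learning rates. A sketch on a badly scaled quadratic:

```python
import numpy as np

def adagrad(w, grad_fn, lr=0.5, eps=1e-8, steps=200):
    # Accumulate squared gradients per coordinate and shrink the
    # effective step size of coordinates with large gradient history.
    g2 = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        g2 += g * g
        w = w - lr * g / (np.sqrt(g2) + eps)
    return w

# Badly scaled quadratic: f(w) = 0.5 * (100*w0^2 + w1^2)
w = adagrad(np.array([1.0, 1.0]), grad_fn=lambda w: np.array([100.0, 1.0]) * w)
print(w)  # both coordinates shrink at similar rates despite the scaling
```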

Regularization

DisturbLabel: Regularizing CNN on the Loss Layer [University of California & MSR] (2016)
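
The idea in DisturbLabel, as I read the paper, is to regularize at the loss layer by randomly corrupting a small fraction of ground-truth labels at each iteration. A hypothetical sketch (the paper's exact noise distribution differs slightly; alpha is illustrative):

```python
import numpy as np

def disturb_labels(labels, num_classes, alpha=0.1, rng=None):
    # With probability alpha, replace each ground-truth label with a
    # class drawn uniformly at random (approximation of DisturbLabel).
    rng = rng or np.random.default_rng()
    mask = rng.random(len(labels)) < alpha
    noisy = labels.copy()
    noisy[mask] = rng.integers(0, num_classes, size=mask.sum())
    return noisy

labels = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(disturb_labels(labels, num_classes=10, rng=np.random.default_rng(0)))
```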

Dropout

Improving neural networks by preventing co-adaptation of feature detectors (Dropout)

Regularization of Neural Networks using DropConnect

Regularizing neural networks with dropout and with DropConnect

Fast dropout training

Dropout as data augmentation

A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

Improved Dropout for Shallow and Deep Learning
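
The inverted-dropout formulation commonly used in practice (a slight variant of the original paper, which instead rescales weights at test time) zeroes units during training and scales the survivors by 1/(1-p), so no rescaling is needed at inference:

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=None):
    # Inverted dropout: drop each unit with probability p during training
    # and scale survivors so the expected activation is unchanged.
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

x = np.ones((2, 8))
print(dropout(x, p=0.5, rng=np.random.default_rng(0)))
```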

Gradient Descent

Fitting a model via closed-form equations vs. Gradient Descent vs. Stochastic Gradient Descent vs. Mini-Batch Learning. What is the difference? (Normal Equations vs. GD vs. SGD vs. MB-GD)

http://sebastianraschka.com/faq/docs/closed-form-vs-gd.html

An Introduction to Gradient Descent in Python

Train faster, generalize better: Stability of stochastic gradient descent

A Variational Analysis of Stochastic Gradient Algorithms

The vanishing gradient problem: Oh no - an obstacle to deep learning!

Gradient Descent For Machine Learning

http://machinelearningmastery.com/gradient-descent-for-machine-learning/

Revisiting Distributed Synchronous SGD
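
To make the closed-form vs. gradient-descent comparison in the Raschka FAQ concrete, here are both approaches on a small linear regression (toy data; the learning rate is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

# Closed form (normal equations): solve X^T X w = X^T y directly.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the mean squared error.
w = np.zeros(3)
for _ in range(500):
    grad = 2.0 / len(y) * X.T @ (X @ w - y)
    w -= 0.1 * grad

print(w_closed, w)  # both should be close to [2, -1, 0.5]
```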

Accelerate Training

Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices

Image Data Augmentation

DataAugmentation ver1.0: an image data augmentation tool for training image recognition algorithms

Caffe-Data-Augmentation: a Caffe branch with data augmentation support, using a configurable stochastic combination of 7 data augmentation techniques
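
A minimal sketch of the two most common augmentations such tools apply, random horizontal flip and random crop, in plain NumPy (parameter values illustrative):

```python
import numpy as np

def augment(img, crop=24, rng=None):
    # img: (H, W, C) array; crop: side length of the square output.
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:
        img = img[:, ::-1]                      # random horizontal flip
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)         # random crop position
    left = rng.integers(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]

img = np.zeros((32, 32, 3))
print(augment(img, rng=np.random.default_rng(0)).shape)  # (24, 24, 3)
```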

Papers

Scalable and Sustainable Deep Learning via Randomized Hashing

Tools

pastalog: Simple, realtime visualization of neural network training performance

torch-pastalog: A Torch interface for pastalog - simple, realtime visualization of neural network training performance
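
For completeness, a usage sketch of pastalog's Python client, based on my reading of the project README; the server address, model name, and `post` signature are assumptions to verify against the repo:

```python
from pastalog import Log

# Assumed API from the pastalog README: connect to a running pastalog
# server (default port 8120) and post named scalar series during training.
log_a = Log('http://localhost:8120', 'modelA')
for step in range(100):
    loss = 1.0 / (step + 1)  # stand-in for a real training loss
    log_a.post('trainLoss', value=loss, step=step)
```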
