DEEPLY SUPERVISED NETS

Fig. 1: Illustration of the Deeply-Supervised Nets architecture and its cost functions.

Abstract

We propose deeply-supervised nets (DSN), a method that simultaneously minimizes classification error while improving the directness and transparency of the hidden layer learning process. We focus our attention on three specific aspects in traditional convolutional-neural-network-type (CNN-type) architectures: (1) transparency in the effect that intermediate layers have on the overall classification; (2) discriminativeness and robustness of learned features, especially in early network layers; (3) training effectiveness in the face of “exploding” and “vanishing” gradients. To combat these issues, we introduce “companion” objective functions at each individual hidden layer, in addition to the overall objective function at the output layer (a strategy distinct from layer-wise pre-training). We also analyze our algorithm using techniques extended from stochastic gradient methods. The advantages provided by our method are evident in our experimental results on benchmark datasets, showing state-of-the-art performance on MNIST, CIFAR-10, CIFAR-100, and SVHN.
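The key ingredient above is the set of "companion" objectives: each hidden layer gets its own classifier and loss, and the training objective is the output loss plus a weighted sum of these per-layer losses. The sketch below illustrates this idea on a toy fully-connected network in NumPy; the layer sizes, the `alpha` weight, and the helper names are illustrative assumptions, not the paper's implementation (the paper additionally decays the companion-loss weight during training and uses CNN layers, both omitted here).

```python
import numpy as np

def softmax_xent(logits, y):
    """Mean cross-entropy of softmax(logits) against integer labels y."""
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True)) # log-softmax
    return -logp[np.arange(len(y)), y].mean()

def dsn_objective(x, y, weights, companion_heads, alpha=0.1):
    """DSN-style total loss: output loss + alpha * sum of companion losses.

    weights         -- list of layer weight matrices; the last is the output head
    companion_heads -- one linear classifier per hidden layer (illustrative)
    alpha           -- companion-loss weight (hyperparameter; assumed fixed here)
    """
    h = x
    companion_losses = []
    for W, C in zip(weights[:-1], companion_heads):
        h = np.maximum(h @ W, 0.0)                      # hidden layer with ReLU
        companion_losses.append(softmax_xent(h @ C, y)) # per-layer supervision
    out_loss = softmax_xent(h @ weights[-1], y)         # overall output objective
    return out_loss + alpha * sum(companion_losses)
```

Because every hidden layer is directly penalized for being non-discriminative, gradients reach early layers through short paths as well as through the full backpropagated chain, which is the mechanism the abstract credits for mitigating vanishing gradients.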

Experiments

Our results on several benchmark datasets are listed below:

Fig. 2: Test accuracy on different benchmark datasets.

Fig. 3: Visualization of the convolutional feature maps learned by DSN.

Code, preprocessed data, and configuration files are available on GitHub.



Publication

[1] Deeply-Supervised Nets [pdf] [arXiv version]
Chen-Yu Lee*, Saining Xie*, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
(* indicates equal contribution). In Proceedings of AISTATS 2015.
An early version of this work was presented at the Deep Learning and Representation Learning Workshop, NIPS 2014.


Disclosure

Deeply-supervised neural networks, Zhuowen Tu, Chen-Yu Lee, Saining Xie, 
UCSD Docket No. SD2014-313, May 22, 2014
