deconvolution layer parameter setting

reference:

1. The paper describes initializing the deconv layer with bilinear filter coefficients and training them. But in the provided train/val.prototxt we can see lr_mult=0, which means the deconv layer is not trained. Any idea why, and how does it affect the accuracy?

In further experiments on PASCAL VOC we found that learning the interpolation parameters made little difference, and fixing these weights gives a slight speed-up since the interpolation filter gradient can be skipped.

Keep in mind that there is only one channel per class in this particular architecture, so there is not that much to be learned except perhaps the spatial extent of the kernel. The results for other data (with more scale variation) or other architectures (with more deconvolution channels and layers) could differ.
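For context, the bilinear initialization mentioned above can be reproduced with a few lines of numpy. This is a minimal sketch; the function name and the printed example are illustrative, not helpers from the FCN repository itself:

```python
import numpy as np

def bilinear_kernel(size):
    """Build a (size x size) bilinear interpolation filter, the usual
    bilinear-filler initialization for a deconvolution (upsampling) layer."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

# For upsampling by an integer factor f, a common choice is
# kernel_size = 2 * f - f % 2 with stride = f (e.g. f=2 -> 4x4 kernel, stride 2).
print(bilinear_kernel(4))
```

Since the weights are frozen (lr_mult=0), this kernel is exactly what the layer applies for the whole of training.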

2. Previous FCN model files used group=21 in the deconv layer, but now it has been removed. Any idea how this affects the accuracy?

These are equivalent as long as these parameters are not learned. In the group case, the number of groups is equal to the number of channels, so each class is interpolated separately. In the no-group case, only the "diagonal" of the weight matrix is initialized to the bilinear filter kernels, so each class is likewise interpolated separately, with all cross-channel weights set to zero.
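The equivalence can be made concrete by building the two weight tensors side by side. This is a sketch only; the (C, 1, k, k) and (C, C, k, k) layouts are illustrative and may not match Caffe's exact blob ordering:

```python
import numpy as np

C, k = 21, 4                      # 21 classes, 4x4 kernel for 2x upsampling
filt = bilinear_kernel(k)         # helper from the sketch above

# group = C: the weights hold one single-channel bilinear filter per class.
w_grouped = np.tile(filt, (C, 1, 1, 1))

# no group: a full (C, C, k, k) weight tensor where only the "diagonal" carries
# the bilinear filter and every cross-channel weight is zero.
w_full = np.zeros((C, C, k, k))
w_full[np.arange(C), np.arange(C)] = filt

# With zero cross-channel weights, class c's output depends only on class c's
# input, so both layers compute the same interpolation as long as the weights
# stay fixed (lr_mult = 0).
assert np.allclose(w_full[np.arange(C), np.arange(C)], w_grouped[:, 0])
```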

Happy brewing,

Evan Shelhamer

that is:

conv: N classes (one channel per class)

deconv: N classes, N groups
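To make that summary concrete, here is a minimal pycaffe-style net-surgery sketch that sets such a deconv layer to bilinear interpolation under either setting. The prototxt file name and the layer name 'upscore' are assumptions for illustration, not the exact names used by the FCN scripts:

```python
import caffe  # assumes pycaffe is available

# Assumed setup: a deconvolution layer named 'upscore' with N-class output and
# lr_mult = 0 in the prototxt, so these weights are never updated.
net = caffe.Net('train_val.prototxt', caffe.TRAIN)
weights = net.params['upscore'][0].data      # weight blob as a numpy array
m, n, h, w = weights.shape
assert h == w, 'bilinear initialization expects a square kernel'
filt = bilinear_kernel(h)                    # helper from the first sketch above

if n == 1:
    # group = N: one single-channel bilinear filter per class.
    weights[:, 0] = filt
else:
    # no group: fill only the per-class "diagonal"; cross-class weights stay zero.
    weights[...] = 0
    weights[range(min(m, n)), range(min(m, n))] = filt
```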
