An Analysis of the im2col Implementation in Caffe

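Here I have pulled the im2col routine out of Caffe and wrapped it in a standalone C++ program that prints both the input and the result, which makes it easier to follow. The code is as follows: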

#include <iostream>

using namespace std;

// Returns true when 0 <= a < b, i.e. when index a lies inside an axis of length b.
bool is_a_ge_zero_and_a_lt_b(int a, int b)
{
    return a >= 0 && a < b;
}

// Unrolls a (channels x height x width) image into a matrix with
// channels * kernel_h * kernel_w rows and output_h * output_w columns.
// Positions that fall inside the padding are written as 0.
void im2col_cpu(const float* data_im, const int channels,
    const int height, const int width, const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w,
    const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w,
    float* data_col) {
  const int output_h = (height + 2 * pad_h -
    (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;
  const int output_w = (width + 2 * pad_w -
    (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;
  const int channel_size = height * width;
  // One pass per channel; data_im is advanced to the start of the next channel.
  for (int channel = channels; channel--; data_im += channel_size) {
    // Each (kernel_row, kernel_col) pair produces one row of the output matrix.
    for (int kernel_row = 0; kernel_row < kernel_h; kernel_row++) {
      for (int kernel_col = 0; kernel_col < kernel_w; kernel_col++) {
        int input_row = -pad_h + kernel_row * dilation_h;
        for (int output_rows = output_h; output_rows; output_rows--) {
          if (!is_a_ge_zero_and_a_lt_b(input_row, height)) {
            // The whole input row lies in the padding: emit output_w zeros.
            for (int output_cols = output_w; output_cols; output_cols--) {
              *(data_col++) = 0;
            }
          } else {
            int input_col = -pad_w + kernel_col * dilation_w;
            for (int output_col = output_w; output_col; output_col--) {
              if (is_a_ge_zero_and_a_lt_b(input_col, width)) {
                *(data_col++) = data_im[input_row * width + input_col];
              } else {
                *(data_col++) = 0;
              }
              input_col += stride_w;
            }
          }
          input_row += stride_h;
        }
      }
    }
  }
}

int main()
{
    int channels = 3;
    int height = 5;
    int width = 5;
    int kernel_h = 3;
    int kernel_w = 3;
    int pad_h = 1;
    int pad_w = 1;
    int stride_h = 1;
    int stride_w = 1;
    int dilation_h = 1;
    int dilation_w = 1;
    // Same output-size formula as inside im2col_cpu.
    const int output_h = (height + 2 * pad_h -
        (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;
    const int output_w = (width + 2 * pad_w -
        (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;
    float* data_im = new float[channels * height * width];
    float* data_col = new float[channels * output_h * output_w * kernel_h * kernel_w];

    // Initialize the input image with its own linear index and print it.
    for (int m = 0; m < channels; ++m)
    {
        for (int i = 0; i < height; ++i)
        {
            for (int j = 0; j < width; ++j)
            {
                data_im[m * width * height + i * width + j] = m * width * height + i * width + j;
                cout << data_im[m * width * height + i * width + j] << ' ';
            }
            cout << endl;
        }
    }

    im2col_cpu(data_im, channels,
        height, width, kernel_h, kernel_w,
        pad_h, pad_w,
        stride_h, stride_w,
        dilation_h, dilation_w,
        data_col);

    // Print the dimensions, one per line.
    cout << channels << endl;
    cout << output_h << endl;
    cout << output_w << endl;
    cout << kernel_h << endl;
    cout << kernel_w << endl;

    // Print the unrolled matrix: one line per (channel, kernel_row, kernel_col).
    for (int i = 0; i < kernel_w * kernel_h * channels; ++i)
    {
        for (int j = 0; j < output_w * output_h; ++j)
        {
            cout << data_col[i * output_w * output_h + j] << ' ';
        }
        cout << endl;
    }

    delete[] data_im;
    delete[] data_col;
    return 0;
}

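Plenty of diagrams of multi-channel convolution have already been published, and most of the ones you can find by searching trace back to a single article. Here I attach my own working-through of the process, which matches the program's output exactly.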

Original article: https://www.cnblogs.com/jourluohua/p/9735897.html

Time: 2024-10-19 09:55:33
