Implementing a CNN (Convolutional Neural Network) with numpy

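To get a deeper understanding of how convolutional neural networks work under the hood, this post uses numpy to build a basic CNN consisting of convolutional, pooling, fully connected, and Softmax layers. relu is used as the activation function and multi-class cross-entropy as the loss, and the network is trained and tested on the mnist dataset.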

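For the detailed theory and implementation of convolutional networks, see the following articles:

刘建平Pinard: Forward and backward propagation algorithms for convolutional networks

Backpropagation in the convolutional layer

Implementing a CNN with Numpy, step by step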

1. Convolutional layer

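The forward output of the convolutional layer is obtained by convolving the kernels with the input feature map. Backpropagation has to compute the gradients of the kernel and the bias, as well as the error delta that is passed back to the previous layer. The kernel gradient is the convolution of the original feature map with delta; the bias gradient of each channel is simply the sum of delta over that channel; and the backpropagated error is the convolution of delta with the kernel rotated by 180 degrees. In the implementation, each window of the feature map and the kernel are first unrolled into vectors, so the convolution becomes a matrix multiplication that numpy can run in vectorized form, which speeds things up. The code is as follows: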

def img2col(x, ksize, stride):
    wx, hx, cx = x.shape                     # [width,height,channel]
    feature_w = (wx - ksize) // stride + 1   # size of the output feature map
    image_col = np.zeros((feature_w*feature_w, ksize*ksize*cx))
    num = 0
    for i in range(feature_w):
        for j in range(feature_w):
            image_col[num] =  x[i*stride:i*stride+ksize, j*stride:j*stride+ksize, :].reshape(-1)
            num += 1
    return image_col

class Conv(object):
    def __init__(self, kernel_shape, stride=1, pad=0):
        width, height, in_channel, out_channel = kernel_shape
        self.stride = stride
        self.pad = pad
        scale = np.sqrt(3*in_channel*width*height/out_channel)   #batch=3
        self.k = np.random.standard_normal(kernel_shape) / scale
        self.b = np.random.standard_normal(out_channel) / scale
        self.k_gradient = np.zeros(kernel_shape)
        self.b_gradient = np.zeros(out_channel)

    def forward(self, x):
        self.x = x
        if self.pad != 0:
            self.x = np.pad(self.x, ((0,0),(self.pad,self.pad),(self.pad,self.pad),(0,0)), 'constant')
        bx, wx, hx, cx = self.x.shape
        wk, hk, ck, nk = self.k.shape             # kernel width, height, channels, count
        feature_w = (wx - wk) // self.stride + 1  # size of the output feature map
        feature = np.zeros((bx, feature_w, feature_w, nk))

        self.image_col = []
        kernel = self.k.reshape(-1, nk)
        for i in range(bx):
            image_col = img2col(self.x[i], wk, self.stride)
            feature[i] = (np.dot(image_col, kernel)+self.b).reshape(feature_w,feature_w,nk)
            self.image_col.append(image_col)
        return feature

    def backward(self, delta, learning_rate):
        bx, wx, hx, cx = self.x.shape # batch,14,14,inchannel
        wk, hk, ck, nk = self.k.shape # 5,5,inChannel,outChannel
        bd, wd, hd, cd = delta.shape  # batch,10,10,outChannel

        # compute self.k_gradient and self.b_gradient (averaged over the batch);
        # reset them first so gradients from earlier batches do not leak in
        delta_col = delta.reshape(bd, -1, cd)
        self.k_gradient = np.zeros(self.k.shape)
        for i in range(bx):
            self.k_gradient += np.dot(self.image_col[i].T, delta_col[i]).reshape(self.k.shape)
        self.k_gradient /= bx
        self.b_gradient = np.sum(delta_col, axis=(0, 1)) / bx

        # compute delta_backward
        delta_backward = np.zeros(self.x.shape)
        k_180 = np.rot90(self.k, 2, (0,1))      # rotate the kernel by 180 degrees
        k_180 = k_180.swapaxes(2, 3)
        k_180_col = k_180.reshape(-1,ck)

        if hd-hk+1 != hx:
            pad = (hx-hd+hk-1) // 2
            pad_delta = np.pad(delta, ((0,0),(pad,pad),(pad,pad),(0,0)), 'constant')
        else:
            pad_delta = delta

        for i in range(bx):
            pad_delta_col = img2col(pad_delta[i], wk, self.stride)
            delta_backward[i] = np.dot(pad_delta_col, k_180_col).reshape(wx,hx,ck)

        # update parameters
        self.k -=  self.k_gradient * learning_rate
        self.b -=  self.b_gradient * learning_rate

        return delta_backward
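
Two of the claims above can be checked numerically on a tiny input. The sketch below is not part of the original code: it redefines `img2col` so that it runs on its own, introduces a hypothetical helper `conv` that assumes stride 1 and no padding, and uses an arbitrary toy loss `sum(conv(x, k) * g)`. It first compares the matrix-product convolution with an explicit sliding-window sum, then compares the "pad delta, convolve with the rotated kernel" rule with a finite-difference gradient.

```python
import numpy as np

def img2col(x, ksize, stride=1):
    """Unfold every ksize x ksize window of x (W, H, C) into one row."""
    wx, hx, cx = x.shape
    fw = (wx - ksize) // stride + 1
    cols = np.zeros((fw * fw, ksize * ksize * cx))
    for i in range(fw):
        for j in range(fw):
            patch = x[i*stride:i*stride+ksize, j*stride:j*stride+ksize, :]
            cols[i * fw + j] = patch.reshape(-1)
    return cols

def conv(x, k):
    """Hypothetical helper: stride-1, no-padding convolution as one matrix product."""
    wk, _, _, nk = k.shape
    fw = x.shape[0] - wk + 1
    return np.dot(img2col(x, wk), k.reshape(-1, nk)).reshape(fw, fw, nk)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 2))        # one 6x6 input with 2 channels
k = rng.standard_normal((3, 3, 2, 4))     # four 3x3 kernels
g = rng.standard_normal((4, 4, 4))        # upstream delta, same shape as conv(x, k)

# 1) Forward: compare with an explicit sliding-window sum.
naive = np.zeros((4, 4, 4))
for i in range(4):
    for j in range(4):
        for n in range(4):
            naive[i, j, n] = np.sum(x[i:i+3, j:j+3, :] * k[:, :, :, n])
forward_err = np.max(np.abs(conv(x, k) - naive))

# 2) Backward: pad delta by ksize-1 on each side, then convolve with the
#    180-degree-rotated kernel whose in/out channel axes are swapped.
k_rot = np.rot90(k, 2, (0, 1)).swapaxes(2, 3)             # shape (3, 3, 4, 2)
g_pad = np.pad(g, ((2, 2), (2, 2), (0, 0)), 'constant')    # shape (8, 8, 4)
dx = conv(g_pad, k_rot)                                    # shape (6, 6, 2)

# Finite-difference gradient of loss = sum(conv(x, k) * g) with respect to x.
eps = 1e-6
dx_num = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    xp, xm = x.copy(), x.copy()
    xp[idx] += eps
    xm[idx] -= eps
    dx_num[idx] = (np.sum(conv(xp, k) * g) - np.sum(conv(xm, k) * g)) / (2 * eps)
backward_err = np.max(np.abs(dx - dx_num))

print(forward_err, backward_err)   # both should be close to zero
```

The check only covers stride 1 without padding, which is the configuration used for both convolutional layers later in this post.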

The relu implementation is given here as well:

class Relu(object):
    def forward(self, x):
        self.x = x
        return np.maximum(x, 0)

    def backward(self, delta):
        delta[self.x<0] = 0
        return delta

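2. Pooling layer

The pooling layer implements max pooling with ksize=2 and stride=2. The forward pass outputs the maximum of each window and records where that maximum is located; the backward pass first upsamples delta by repeating each value over its 2x2 window, then sets every position that was not a maximum to 0.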

class Pool(object):
    def forward(self, x):
        b, w, h, c = x.shape
        feature_w = w // 2
        feature = np.zeros((b, feature_w, feature_w, c))
        self.feature_mask = np.zeros((b, w, h, c))   # records where each window's maximum is, for backpropagation
        for bi in range(b):
            for ci in range(c):
                for i in range(feature_w):
                    for j in range(feature_w):
                        feature[bi, i, j, ci] = np.max(x[bi,i*2:i*2+2,j*2:j*2+2,ci])
                        index = np.argmax(x[bi,i*2:i*2+2,j*2:j*2+2,ci])
                        self.feature_mask[bi, i*2+index//2, j*2+index%2, ci] = 1
        return feature

    def backward(self, delta):
        return np.repeat(np.repeat(delta, 2, axis=1), 2, axis=2) * self.feature_mask
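
As an aside, the four nested loops can be replaced by a reshape. The sketch below is not the author's code: `maxpool2x2` is a hypothetical helper that assumes the width and height are even, and unlike the loop version it marks every tied maximum in a window rather than only the first one.

```python
import numpy as np

def maxpool2x2(x):
    """2x2, stride-2 max pooling on x of shape (B, W, H, C); W and H assumed even."""
    b, w, h, c = x.shape
    # Split each spatial axis into (blocks, 2) so every 2x2 window gets its own axes.
    windows = x.reshape(b, w // 2, 2, h // 2, 2, c)
    out = windows.max(axis=(2, 4))
    # Upsample the maxima and compare: 1 where the input equals its window maximum.
    upsampled = np.repeat(np.repeat(out, 2, axis=1), 2, axis=2)
    mask = (x == upsampled).astype(x.dtype)
    return out, mask

x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
out, mask = maxpool2x2(x)

# Backward pass as described above: repeat delta over each window, keep the maxima.
delta = np.ones_like(out)
grad = np.repeat(np.repeat(delta, 2, axis=1), 2, axis=2) * mask
print(out[0, :, :, 0])
print(grad[0, :, :, 0])
```

With distinct input values, as in this example, the mask is the same one the loop version would record.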

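3. Fully connected layer

The fully connected layer was implemented in an earlier post; here it is wrapped in its own class so that it is easier to reuse: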

class Linear(object):
    def __init__(self, inChannel, outChannel):
        scale = np.sqrt(inChannel/2)
        self.W = np.random.standard_normal((inChannel, outChannel)) / scale
        self.b = np.random.standard_normal(outChannel) / scale
        self.W_gradient = np.zeros((inChannel, outChannel))
        self.b_gradient = np.zeros(outChannel)

    def forward(self, x):
        self.x = x
        x_forward = np.dot(self.x, self.W) + self.b
        return x_forward

    def backward(self, delta, learning_rate):
        ## compute gradients
        batch_size = self.x.shape[0]
        self.W_gradient = np.dot(self.x.T, delta) / batch_size  # bxin bxout
        self.b_gradient = np.sum(delta, axis=0) / batch_size
        delta_backward = np.dot(delta, self.W.T)                # bxout inxout
        ## update parameters
        self.W -= self.W_gradient * learning_rate
        self.b -= self.b_gradient * learning_rate 

        return delta_backward
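
The weight gradient used in `backward` (`x.T` times `delta`, divided by the batch size) can be sanity-checked against finite differences. This is only a sketch under stated assumptions: the squared-error loss `loss_fn`, the random data, and the variable names are made up for the check and do not appear in the original code.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))       # batch of 4 samples, 5 input features
W = rng.standard_normal((5, 3))
b = rng.standard_normal(3)
target = rng.standard_normal((4, 3))

def loss_fn(W_):
    """Hypothetical toy loss: batch mean of 0.5 * squared error."""
    out = np.dot(x, W_) + b
    return 0.5 * np.sum((out - target) ** 2) / x.shape[0]

# Analytic gradient, same formula as Linear.backward.
delta = np.dot(x, W) + b - target     # per-sample dLoss/dOut, before averaging
analytic = np.dot(x.T, delta) / x.shape[0]

# Central finite differences, one weight at a time.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        numeric[i, j] = (loss_fn(Wp) - loss_fn(Wm)) / (2 * eps)

max_err = np.max(np.abs(analytic - numeric))
print(max_err)   # should be close to zero
```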

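4. Softmax layer

In a classification model, the scores that the fully connected layer produces for each class are usually passed through a softmax layer to obtain the final predictions. Its forward pass is: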
\[
a_j=\frac{e^{z_j}}{\sum\limits_k{e^{z_k}}}
\]
With one-hot encoded labels, the backpropagated error can be written in vector form as:
\[
\delta=a-y
\]
For a single sample, the multi-class cross-entropy loss in vector form is:
\[
loss=-\sum y\log a
\]
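The step from this loss to \(\delta=a-y\) can be sketched as follows, assuming that \(\delta\) denotes the derivative of the loss with respect to the softmax input \(z\) and that \(y\) is one-hot, so \(\sum_k y_k=1\). The softmax derivative is
\[
\frac{\partial a_k}{\partial z_j}=a_k\left(\mathbb{1}[k=j]-a_j\right)
\]
and substituting it into the chain rule gives
\[
\delta_j=\frac{\partial loss}{\partial z_j}=-\sum_k\frac{y_k}{a_k}\frac{\partial a_k}{\partial z_j}=-\sum_k y_k\left(\mathbb{1}[k=j]-a_j\right)=-y_j+a_j\sum_k y_k=a_j-y_j
\]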
Finally, the implementation:

class Softmax(object):
    def cal_loss(self, predict, label):
        batchsize, classes = predict.shape
        self.predict(predict)
        loss = 0
        delta = np.zeros(predict.shape)
        for i in range(batchsize):
            delta[i] = self.softmax[i] - label[i]
            loss -= np.sum(np.log(self.softmax[i]) * label[i])
        loss /= batchsize
        return loss, delta

    def predict(self, predict):
        batchsize, classes = predict.shape
        self.softmax = np.zeros(predict.shape)
        for i in range(batchsize):
            predict_tmp = predict[i] - np.max(predict[i])
            predict_tmp = np.exp(predict_tmp)
            self.softmax[i] = predict_tmp / np.sum(predict_tmp)
        return self.softmax
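
The per-sample loops can also be written with array operations over the whole batch. The sketch below is an alternative formulation, not the author's code; `softmax_rows` and `cross_entropy` are hypothetical names. The second input row uses large scores to show why the row maximum is subtracted before calling `np.exp`.

```python
import numpy as np

def softmax_rows(z):
    """Row-wise softmax; subtracting each row's max avoids overflow in exp."""
    shifted = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

def cross_entropy(z, y):
    """Batch-mean cross-entropy and delta = a - y, with y one-hot encoded."""
    a = softmax_rows(z)
    loss = -np.sum(y * np.log(a)) / z.shape[0]
    return loss, a - y

z = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])   # exp(1000.0) alone would overflow
y = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
loss, delta = cross_entropy(z, y)
print(loss)
print(delta)
```

Each row of `delta` sums to zero, because both the softmax output and the one-hot label sum to one.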

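5. Training and testing

Training and testing use the mnist dataset bundled with torchvision. After training, the weights are saved to a local file through numpy's interface and loaded back from that file for testing. After only two epochs of training, accuracy on the test set reached 98.05%, a clear improvement over a fully connected network. The training and testing code is as follows: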

def train():
    # MNIST handwritten digits dataset
    dataset_path = "D://datasets//mnist"
    train_data = torchvision.datasets.MNIST(root=dataset_path, train=True, download=True)
    train_data.data = train_data.data.numpy()  # [60000,28,28]
    train_data.targets = train_data.targets.numpy()  # [60000]
    train_data.data = train_data.data.reshape(60000, 28, 28, 1) / 255.   # preprocess the inputs
    train_data.targets = onehot(train_data.targets, 60000) # one-hot encode the labels (60000, 10)

    conv1 = Conv(kernel_shape=(5,5,1,6))   # 24x24x6
    relu1 = Relu()
    pool1 = Pool()                         # 12x12x6
    conv2 = Conv(kernel_shape=(5,5,6,16))  # 8x8x16
    relu2 = Relu()
    pool2 = Pool()                         # 4x4x16
    nn = Linear(256, 10)
    softmax = Softmax()

    lr = 0.01
    batch = 3
    for epoch in range(10):
        for i in range(0, 60000, batch):
            X = train_data.data[i:i+batch]
            Y = train_data.targets[i:i+batch]

            predict = conv1.forward(X)
            predict = relu1.forward(predict)
            predict = pool1.forward(predict)
            predict = conv2.forward(predict)
            predict = relu2.forward(predict)
            predict = pool2.forward(predict)
            predict = predict.reshape(batch, -1)
            predict = nn.forward(predict)

            loss, delta = softmax.cal_loss(predict, Y)

            delta = nn.backward(delta, lr)
            delta = delta.reshape(batch,4,4,16)
            delta = pool2.backward(delta)
            delta = relu2.backward(delta)
            delta = conv2.backward(delta, lr)
            delta = pool1.backward(delta)
            delta = relu1.backward(delta)
            conv1.backward(delta, lr)

            print("Epoch-{}-{:05d}".format(str(epoch), i), ":", "loss:{:.4f}".format(loss))

        lr *= 0.95**(epoch+1)
        np.savez("data2.npz", k1=conv1.k, b1=conv1.b, k2=conv2.k, b2=conv2.b, w3=nn.W, b3=nn.b)

def eval():
    r = np.load("data2.npz")

    # MNIST handwritten digits dataset
    dataset_path = "D://datasets//mnist"
    test_data = torchvision.datasets.MNIST(root=dataset_path, train=False)
    test_data.data = test_data.data.numpy()        # [10000,28,28]
    test_data.targets = test_data.targets.numpy()  # [10000]

    test_data.data = test_data.data.reshape(10000, 28, 28, 1) / 255.

    conv1 = Conv(kernel_shape=(5, 5, 1, 6))  # 24x24x6
    relu1 = Relu()
    pool1 = Pool()  # 12x12x6
    conv2 = Conv(kernel_shape=(5, 5, 6, 16))  # 8x8x16
    relu2 = Relu()
    pool2 = Pool()  # 4x4x16
    nn = Linear(256, 10)
    softmax = Softmax()

    conv1.k = r["k1"]
    conv1.b = r["b1"]
    conv2.k = r["k2"]
    conv2.b = r["b2"]
    nn.W = r["w3"]
    nn.b = r["b3"]

    num = 0
    for i in range(10000):
        X = test_data.data[i]
        X = X[np.newaxis, :]
        Y = test_data.targets[i]

        predict = conv1.forward(X)
        predict = relu1.forward(predict)
        predict = pool1.forward(predict)
        predict = conv2.forward(predict)
        predict = relu2.forward(predict)
        predict = pool2.forward(predict)
        predict = predict.reshape(1, -1)
        predict = nn.forward(predict)

        predict = softmax.predict(predict)

        if np.argmax(predict) == Y:
            num += 1

    print("TEST-ACC: ", num/10000*100, "%")
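
The shape comments next to each layer (24x24x6, 12x12x6, 8x8x16, 4x4x16) and the input size 256 of the `Linear` layer follow from the usual output-size formula. A small sketch, with `conv_out` and `pool_out` as hypothetical helper names:

```python
def conv_out(size, ksize, stride=1, pad=0):
    """Width of a conv output: (size + 2*pad - ksize) // stride + 1."""
    return (size + 2 * pad - ksize) // stride + 1

def pool_out(size):
    """Width after 2x2, stride-2 max pooling."""
    return size // 2

sizes = [28]                              # mnist images are 28x28
sizes.append(conv_out(sizes[-1], 5))      # conv1, 5x5 kernel
sizes.append(pool_out(sizes[-1]))         # pool1
sizes.append(conv_out(sizes[-1], 5))      # conv2, 5x5 kernel
sizes.append(pool_out(sizes[-1]))         # pool2
flat = sizes[-1] * sizes[-1] * 16         # conv2 has 16 output channels
print(sizes, flat)                        # [28, 24, 12, 8, 4] 256
```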

6. Complete code

import numpy as np
import torchvision
import time, functools
import logging
np.set_printoptions(threshold=np.inf)  

def onehot(targets, num):
    result = np.zeros((num, 10))
    for i in range(num):
        result[i][targets[i]] = 1
    return result

def img2col(x, ksize, stride):
    wx, hx, cx = x.shape                     # [width,height,channel]
    feature_w = (wx - ksize) // stride + 1   # size of the output feature map
    image_col = np.zeros((feature_w*feature_w, ksize*ksize*cx))
    num = 0
    for i in range(feature_w):
        for j in range(feature_w):
            image_col[num] =  x[i*stride:i*stride+ksize, j*stride:j*stride+ksize, :].reshape(-1)
            num += 1
    return image_col

## nn
class Linear(object):
    def __init__(self, inChannel, outChannel):
        scale = np.sqrt(inChannel/2)
        self.W = np.random.standard_normal((inChannel, outChannel)) / scale
        self.b = np.random.standard_normal(outChannel) / scale
        self.W_gradient = np.zeros((inChannel, outChannel))
        self.b_gradient = np.zeros(outChannel)

    def forward(self, x):
        self.x = x
        x_forward = np.dot(self.x, self.W) + self.b
        return x_forward

    def backward(self, delta, learning_rate):
        ## compute gradients
        batch_size = self.x.shape[0]
        self.W_gradient = np.dot(self.x.T, delta) / batch_size  # bxin bxout
        self.b_gradient = np.sum(delta, axis=0) / batch_size
        delta_backward = np.dot(delta, self.W.T)                # bxout inxout
        ## update parameters
        self.W -= self.W_gradient * learning_rate
        self.b -= self.b_gradient * learning_rate 

        return delta_backward

## conv
class Conv(object):
    def __init__(self, kernel_shape, stride=1, pad=0):
        width, height, in_channel, out_channel = kernel_shape
        self.stride = stride
        self.pad = pad
        scale = np.sqrt(3*in_channel*width*height/out_channel)
        self.k = np.random.standard_normal(kernel_shape) / scale
        self.b = np.random.standard_normal(out_channel) / scale
        self.k_gradient = np.zeros(kernel_shape)
        self.b_gradient = np.zeros(out_channel)

    def forward(self, x):
        self.x = x
        if self.pad != 0:
            self.x = np.pad(self.x, ((0,0),(self.pad,self.pad),(self.pad,self.pad),(0,0)), 'constant')
        bx, wx, hx, cx = self.x.shape
        wk, hk, ck, nk = self.k.shape             # kernel width, height, channels, count
        feature_w = (wx - wk) // self.stride + 1  # size of the output feature map
        feature = np.zeros((bx, feature_w, feature_w, nk))

        self.image_col = []
        kernel = self.k.reshape(-1, nk)
        for i in range(bx):
            image_col = img2col(self.x[i], wk, self.stride)
            feature[i] = (np.dot(image_col, kernel)+self.b).reshape(feature_w,feature_w,nk)
            self.image_col.append(image_col)
        return feature

    def backward(self, delta, learning_rate):
        bx, wx, hx, cx = self.x.shape # batch,14,14,inchannel
        wk, hk, ck, nk = self.k.shape # 5,5,inChannel,outChannel
        bd, wd, hd, cd = delta.shape  # batch,10,10,outChannel

        # compute self.k_gradient and self.b_gradient (averaged over the batch);
        # reset them first so gradients from earlier batches do not leak in
        delta_col = delta.reshape(bd, -1, cd)
        self.k_gradient = np.zeros(self.k.shape)
        for i in range(bx):
            self.k_gradient += np.dot(self.image_col[i].T, delta_col[i]).reshape(self.k.shape)
        self.k_gradient /= bx
        self.b_gradient = np.sum(delta_col, axis=(0, 1)) / bx

        # compute delta_backward
        delta_backward = np.zeros(self.x.shape)
        k_180 = np.rot90(self.k, 2, (0,1))      # rotate the kernel by 180 degrees
        k_180 = k_180.swapaxes(2, 3)
        k_180_col = k_180.reshape(-1,ck)

        if hd-hk+1 != hx:
            pad = (hx-hd+hk-1) // 2
            pad_delta = np.pad(delta, ((0,0),(pad,pad),(pad,pad),(0,0)), 'constant')
        else:
            pad_delta = delta

        for i in range(bx):
            pad_delta_col = img2col(pad_delta[i], wk, self.stride)
            delta_backward[i] = np.dot(pad_delta_col, k_180_col).reshape(wx,hx,ck)

        # update parameters
        self.k -=  self.k_gradient * learning_rate
        self.b -=  self.b_gradient * learning_rate

        return delta_backward

## pool
class Pool(object):
    def forward(self, x):
        b, w, h, c = x.shape
        feature_w = w // 2
        feature = np.zeros((b, feature_w, feature_w, c))
        self.feature_mask = np.zeros((b, w, h, c))   # records where each window's maximum is, for backpropagation
        for bi in range(b):
            for ci in range(c):
                for i in range(feature_w):
                    for j in range(feature_w):
                        feature[bi, i, j, ci] = np.max(x[bi,i*2:i*2+2,j*2:j*2+2,ci])
                        index = np.argmax(x[bi,i*2:i*2+2,j*2:j*2+2,ci])
                        self.feature_mask[bi, i*2+index//2, j*2+index%2, ci] = 1
        return feature

    def backward(self, delta):
        return np.repeat(np.repeat(delta, 2, axis=1), 2, axis=2) * self.feature_mask

## Relu
class Relu(object):
    def forward(self, x):
        self.x = x
        return np.maximum(x, 0)

    def backward(self, delta):
        delta[self.x<0] = 0
        return delta

## Softmax
class Softmax(object):
    def cal_loss(self, predict, label):
        batchsize, classes = predict.shape
        self.predict(predict)
        loss = 0
        delta = np.zeros(predict.shape)
        for i in range(batchsize):
            delta[i] = self.softmax[i] - label[i]
            loss -= np.sum(np.log(self.softmax[i]) * label[i])
        loss /= batchsize
        return loss, delta

    def predict(self, predict):
        batchsize, classes = predict.shape
        self.softmax = np.zeros(predict.shape)
        for i in range(batchsize):
            predict_tmp = predict[i] - np.max(predict[i])
            predict_tmp = np.exp(predict_tmp)
            self.softmax[i] = predict_tmp / np.sum(predict_tmp)
        return self.softmax

def train():
    # MNIST handwritten digits dataset
    dataset_path = "D://datasets//mnist"
    train_data = torchvision.datasets.MNIST(root=dataset_path, train=True, download=True)
    train_data.data = train_data.data.numpy()  # [60000,28,28]
    train_data.targets = train_data.targets.numpy()  # [60000]
    train_data.data = train_data.data.reshape(60000, 28, 28, 1) / 255.   # preprocess the inputs
    train_data.targets = onehot(train_data.targets, 60000) # one-hot encode the labels (60000, 10)

    conv1 = Conv(kernel_shape=(5,5,1,6))   # 24x24x6
    relu1 = Relu()
    pool1 = Pool()                         # 12x12x6
    conv2 = Conv(kernel_shape=(5,5,6,16))  # 8x8x16
    relu2 = Relu()
    pool2 = Pool()                         # 4x4x16
    nn = Linear(256, 10)
    softmax = Softmax()

    lr = 0.01
    batch = 3
    for epoch in range(10):
        for i in range(0, 60000, batch):
            X = train_data.data[i:i+batch]
            Y = train_data.targets[i:i+batch]

            predict = conv1.forward(X)
            predict = relu1.forward(predict)
            predict = pool1.forward(predict)
            predict = conv2.forward(predict)
            predict = relu2.forward(predict)
            predict = pool2.forward(predict)
            predict = predict.reshape(batch, -1)
            predict = nn.forward(predict)

            loss, delta = softmax.cal_loss(predict, Y)

            delta = nn.backward(delta, lr)
            delta = delta.reshape(batch,4,4,16)
            delta = pool2.backward(delta)
            delta = relu2.backward(delta)
            delta = conv2.backward(delta, lr)
            delta = pool1.backward(delta)
            delta = relu1.backward(delta)
            conv1.backward(delta, lr)

            print("Epoch-{}-{:05d}".format(str(epoch), i), ":", "loss:{:.4f}".format(loss))

        lr *= 0.95**(epoch+1)
        np.savez("data2.npz", k1=conv1.k, b1=conv1.b, k2=conv2.k, b2=conv2.b, w3=nn.W, b3=nn.b)

def eval():
    r = np.load("data2.npz")

    # MNIST handwritten digits dataset
    dataset_path = "D://datasets//mnist"
    test_data = torchvision.datasets.MNIST(root=dataset_path, train=False)
    test_data.data = test_data.data.numpy()        # [10000,28,28]
    test_data.targets = test_data.targets.numpy()  # [10000]

    test_data.data = test_data.data.reshape(10000, 28, 28, 1) / 255.

    conv1 = Conv(kernel_shape=(5, 5, 1, 6))  # 24x24x6
    relu1 = Relu()
    pool1 = Pool()  # 12x12x6
    conv2 = Conv(kernel_shape=(5, 5, 6, 16))  # 8x8x16
    relu2 = Relu()
    pool2 = Pool()  # 4x4x16
    nn = Linear(256, 10)
    softmax = Softmax()

    conv1.k = r["k1"]
    conv1.b = r["b1"]
    conv2.k = r["k2"]
    conv2.b = r["b2"]
    nn.W = r["w3"]
    nn.b = r["b3"]

    num = 0
    for i in range(10000):
        X = test_data.data[i]
        X = X[np.newaxis, :]
        Y = test_data.targets[i]

        predict = conv1.forward(X)
        predict = relu1.forward(predict)
        predict = pool1.forward(predict)
        predict = conv2.forward(predict)
        predict = relu2.forward(predict)
        predict = pool2.forward(predict)
        predict = predict.reshape(1, -1)
        predict = nn.forward(predict)

        predict = softmax.predict(predict)

        if np.argmax(predict) == Y:
            num += 1

    print("TEST-ACC: ", num/10000*100, "%")

if __name__ == '__main__':
    #train()
    eval()

Original article: https://www.cnblogs.com/qxcheng/p/11729773.html

Time: 2024-07-31 22:52:28
