MNIST Handwritten Digit Recognition (Linear Regression and CNN Methods, Implemented by Hand and with Frameworks) (To Be Continued)

0-Background

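As the Hello World project of Deep Learning, this is one that has to be worked through at least once.

Code: Github. The blog post and the code will keep being updated as the exercise progresses.

This is my first blog post, so the wording may be unclear in many places. Suggestions are very welcome. Thanks~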

0.1 Background knowledge:

Linear regression
CNN

LeNet-5

AlexNet

ResNet

VGG
Various regularization methods

0.2 Catalog

1-Prepare
2-MNIST
3-LinearRegression

1-Prepare

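Numpy: an open-source numerical computing library
matplotlib: a 2D plotting library for Python
TensorFlow: an open-source machine learning system
Keras: a high-level neural network API that runs on top of the TensorFlow, Theano, and CNTK backends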

2-MNIST

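MNIST is derived from the larger NIST handwriting databases, of which it is a subset; the training digits were written by roughly 250 different people. It contains 60,000 training samples and 10,000 test samples.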

Loading MNIST

import numpy as np
import os
import struct
import matplotlib.pyplot as plt

class load:
    def __init__(self,
                 path='mnist'):
        self.path = path

    def load_mnist(self):
        """Read train and test dataset and labels from path"""

        train_image_path = 'train-images.idx3-ubyte'
        train_label_path = 'train-labels.idx1-ubyte'

        test_image_path = 't10k-images.idx3-ubyte'
        test_label_path = 't10k-labels.idx1-ubyte'

        # Label files start with an 8-byte big-endian header: magic number, item count
        with open(os.path.join(self.path, train_label_path), 'rb') as labelpath:
            magic, n = struct.unpack('>II', labelpath.read(8))
            labels = np.fromfile(labelpath, dtype=np.uint8)
            train_labels = labels.reshape(len(labels), 1)

        # Image files start with a 16-byte big-endian header: magic number, count, rows, cols
        with open(os.path.join(self.path, train_image_path), 'rb') as imgpath:
            magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
            images = np.fromfile(imgpath,
                                 dtype=np.uint8).reshape(len(train_labels), 784)
            train_images = images

        with open(os.path.join(self.path, test_label_path), 'rb') as labelpath:
            magic, n = struct.unpack('>II', labelpath.read(8))
            labels = np.fromfile(labelpath,
                                 dtype=np.uint8)
            test_labels = labels.reshape(len(labels), 1)

        with open(os.path.join(self.path, test_image_path), 'rb') as imgpath:
            magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
            images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(test_labels), 784)
            test_images = images

        return train_images, train_labels, test_images, test_labels

if __name__ == '__main__':
    train_images, train_labels, test_images, test_labels = load().load_mnist()
    print('train_images shape:%s' % str(train_images.shape))
    print('train_labels shape:%s' % str(train_labels.shape))
    print('test_images shape:%s' % str(test_images.shape))
    print('test_labels shape:%s' % str(test_labels.shape))

    np.random.seed(1024)

    trainImage = np.random.randint(60000, size=4)
    testImage = np.random.randint(10000, size=2)

    img1 = train_images[trainImage[0]].reshape(28, 28)
    label1 = train_labels[trainImage[0]]
    img2 = train_images[trainImage[1]].reshape(28, 28)
    label2 = train_labels[trainImage[1]]
    img3 = train_images[trainImage[2]].reshape(28, 28)
    label3 = train_labels[trainImage[2]]
    img4 = train_images[trainImage[3]].reshape(28, 28)
    label4 = train_labels[trainImage[3]]

    img5 = test_images[testImage[0]].reshape(28, 28)
    label5 = test_labels[testImage[0]]
    img6 = test_images[testImage[1]].reshape(28, 28)
    label6 = test_labels[testImage[1]]

    plt.figure(num='mnist', figsize=(2, 3))

    plt.subplot(2, 3, 1)
    plt.title(label1)
    plt.imshow(img1)

    plt.subplot(2, 3, 2)
    plt.title(label2)
    plt.imshow(img2)

    plt.subplot(2, 3, 3)
    plt.title(label3)
    plt.imshow(img3)

    plt.subplot(2, 3, 4)
    plt.title(label4)
    plt.imshow(img4)

    plt.subplot(2, 3, 5)
    plt.title(label5)
    plt.imshow(img5)

    plt.subplot(2, 3, 6)
    plt.title(label6)
    plt.imshow(img6)
    plt.show()

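Running the script prints the shapes of the four arrays and displays a 2x3 grid of sample digits (four from the training set, two from the test set).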
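The header parsing used by the loader above can be tried in isolation. The sketch below builds a tiny fake IDX image file in memory (the 2x3 image size and the pixel values are made up for illustration) and reads it back with the same `>IIII` format string. It uses `np.frombuffer` instead of `np.fromfile` only because the data lives in a bytes object rather than on disk.

```python
import struct

import numpy as np

# Fake IDX3 image file held in memory: 2 images of 2x3 pixels.
# 2051 (0x00000803) is the magic number of IDX3 unsigned-byte image files.
header = struct.pack('>IIII', 2051, 2, 2, 3)
pixels = bytes(range(12))
raw = header + pixels

# '>' selects big-endian byte order; each 'I' is one unsigned 32-bit integer.
magic, num, rows, cols = struct.unpack('>IIII', raw[:16])

# Skip the 16-byte header and flatten every image into one row.
images = np.frombuffer(raw, dtype=np.uint8, offset=16).reshape(num, rows * cols)

print(magic, num, rows, cols)
print(images.shape)
```

For the real MNIST files the same header reports 28 rows and 28 columns, which is where the 784 in the loader's reshape comes from.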

3-LinearRegression

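Here the MNIST dataset is trained and recognized using the linear regression approach.

A 2-layer network is used. The hidden layer has four neurons, and Tanh and ReLU are each tried as its activation function.

Because MNIST is a multi-class classification problem, the output layer uses Softmax as its activation function, with cross entropy as the loss function.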
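As a minimal sketch of the Softmax output activation mentioned above: the function name `softmax` and the column-per-sample layout `(n_classes, m)` are assumptions chosen for this example, not code from the post. Subtracting the column maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import numpy as np


def softmax(Z):
    """Column-wise softmax for scores Z of shape (n_classes, m)."""
    # Shifting by the column maximum leaves the output unchanged
    # but keeps np.exp from overflowing on large scores.
    shifted = Z - np.max(Z, axis=0, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=0, keepdims=True)


# Two samples (columns), three classes (rows).
Z = np.array([[1.0, 1000.0],
              [2.0, 1000.0],
              [3.0, 1000.0]])
A = softmax(Z)

print(A.sum(axis=0))     # each column sums to 1
print(A.argmax(axis=0))  # most likely class per sample
```

Each column of the output can be read as a probability distribution over the classes, which is what the cross-entropy loss expects.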

3.1 使用Numpy实现

3.1.1 Get the layer sizes from the training data and labels

Code:

def layer_size(X, Y):
    """
    Get number of input and output size, and set hidden layer size
    :param X: input dataset, shape (m, 784)
    :param Y: input labels, shape (m, 1)
    :return:
    n_x -- the size of the input layer
    n_h -- the size of the hidden layer
    n_y -- the size of the output layer
    """

    n_x = X.T.shape[0]
    n_h = 4
    n_y = Y.T.shape[0]

    return n_x, n_h, n_y
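A quick usage check of `layer_size`, under one assumption worth stating loudly: with labels of shape (m, 1), as in the docstring, the function returns an output size of 1. A 10-way Softmax output needs one-hot labels of shape (m, 10). The `one_hot` helper below is hypothetical and not part of the original post.

```python
import numpy as np


def layer_size(X, Y):
    # Same logic as the function above, repeated so this sketch runs on its own.
    n_x = X.T.shape[0]
    n_h = 4
    n_y = Y.T.shape[0]
    return n_x, n_h, n_y


def one_hot(labels, num_classes=10):
    """Hypothetical helper: (m, 1) integer labels -> (m, num_classes) one-hot rows."""
    return np.eye(num_classes)[labels.reshape(-1)]


X = np.zeros((5, 784), dtype=np.uint8)                        # 5 fake images
labels = np.array([[3], [0], [9], [1], [3]], dtype=np.uint8)  # 5 fake labels

print(layer_size(X, labels))           # labels as loaded, shape (m, 1)
print(layer_size(X, one_hot(labels)))  # one-hot labels, shape (m, 10)
```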

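3.1.2 Initialize parameters

Initialize W1, b1, W2, and b2.

W is initialized to small random non-zero values.

b is initialized to 0 in every layer.

Code: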

def initialize_parameters(n_x, n_h, n_y):
    """
    Initialize parameters
    :param n_x: the size of the input layer
    :param n_h: the size of the hidden layer
    :param n_y: the size of the output layer
    :return: dictionary of parameters
    """

    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2
                  }

    return parameters
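Why W must be non-zero can be seen in a small experiment. The sketch below uses made-up random input (5 fake samples, one per column) and compares an all-zero W1 with the scaled random initialization used above: with zeros, every hidden unit produces exactly the same output, so the units cannot learn different features.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((784, 5))  # 5 fake samples, one per column

W_zero = np.zeros((4, 784))
W_rand = rng.standard_normal((4, 784)) * 0.01
b = np.zeros((4, 1))

A_zero = np.tanh(W_zero @ X + b)
A_rand = np.tanh(W_rand @ X + b)

# Compare every hidden unit (row) against the first one.
same_zero = np.allclose(A_zero, A_zero[0])
same_rand = np.allclose(A_rand, A_rand[0])

print(same_zero)  # zero init: all hidden units identical
print(same_rand)  # random init: hidden units differ
```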

3.1.3 Forward Propagation

ReLU is implemented as \((|Z|+Z)/2\): for positive Z this gives Z, and for negative Z the two terms cancel to 0.

def ReLu(Z):
    return (abs(Z) + Z) / 2

def forward_propagation(X, parameters, activation="tanh"):
    """
    Compute the forward propagation
    :param X: input data (m, n_x)
    :param parameters: parameters from initialize_parameters
    :param activation: activation function name, either "tanh" or "relu"
    :return:
        A2: sigmoid output
        cache: cached intermediate results of the forward pass
    """

    X = X.T

    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    Z1 = np.dot(W1, X) + b1
    if activation == "tanh":
        A1 = np.tanh(Z1)
    elif activation == "relu":
        A1 = ReLu(Z1)
    else:
        raise Exception('Activation function is not found!')
    Z2 = np.dot(W2, A1) + b2
    A2 = 1 / (1 + np.exp(-Z2))

    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}

    return A2, cache
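The `ReLu` formula used above can be checked against the more common `np.maximum(Z, 0)` form. The input values below are arbitrary and only serve as an illustration.

```python
import numpy as np


def ReLu(Z):
    # Same definition as above: (|Z| + Z) / 2
    return (abs(Z) + Z) / 2


Z = np.array([[-2.0, -0.5, 0.0],
              [0.5, 3.0, -7.0]])

print(ReLu(Z))
print(np.array_equal(ReLu(Z), np.maximum(Z, 0)))
```

Both forms give the same values; `np.maximum(Z, 0)` is simply the more widely seen spelling.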

3.1.4 Compute Cost
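The post stops at this heading (it is marked as to be continued). Purely as a hedged sketch of what a cross-entropy cost could look like, and not the author's implementation: the function name `compute_cost`, the `(n_y, m)` layout of both arguments, and the small `eps` guard are all assumptions made for this example.

```python
import numpy as np


def compute_cost(A2, Y):
    """
    Mean cross-entropy cost (illustrative sketch).
    A2: predicted class probabilities, shape (n_y, m)
    Y:  one-hot labels, shape (n_y, m)
    """
    m = Y.shape[1]
    eps = 1e-12  # assumed guard against log(0)
    return float(-np.sum(Y * np.log(A2 + eps)) / m)


# Two samples (columns), three classes (rows).
A2 = np.array([[0.7, 0.1],
               [0.2, 0.8],
               [0.1, 0.1]])
Y = np.array([[1, 0],
              [0, 1],
              [0, 0]])

cost = compute_cost(A2, Y)
print(cost)
```

Only the probability assigned to the true class of each sample contributes to the sum, so the cost shrinks toward 0 as those probabilities approach 1.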

Original post: https://www.cnblogs.com/qihuang/p/9228958.html

Time: 2024-10-17 06:00:26
