MXNet Tutorial

The official tutorial is decent; here I walk through it with my own example.

1: Import the modules:

import mxnet as mx
import numpy as np
import cv2
import matplotlib.pyplot as plt
import logging

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

2: Create the network:

# Variables are placeholders for input arrays. We give each variable a unique name.
data = mx.symbol.Variable('data')

# The input is fed to a fully connected layer that computes Y=WX+b.
# This is the main computation module in the network.
# Each layer also needs a unique name. We'll talk more about naming in the next section.
fc1  = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
# Activation layers apply a non-linear function on the previous layer's output.
# Here we use Rectified Linear Unit (ReLU) that computes Y = max(X, 0).
act1 = mx.symbol.Activation(data = fc1, name='relu1', act_type="relu")

fc2  = mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = 64)
act2 = mx.symbol.Activation(data = fc2, name='relu2', act_type="relu")

fc3  = mx.symbol.FullyConnected(data = act2, name='fc3', num_hidden=10)
# Finally we have a loss layer that compares the network's output with label and generates gradient signals.
mlp  = mx.symbol.SoftmaxOutput(data = fc3, name = 'softmax')
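To make the shapes in this network concrete, here is a minimal NumPy sketch of the same 784 -> 128 -> 64 -> 10 forward pass. It is not MXNet code: the helper names (relu, softmax, mlp_forward) and the random weights are assumptions made for illustration, and the parameter keys only mirror the fc1_weight / fc1_bias naming that MXNet derives from the layer names.

```python
import numpy as np

def relu(x):
    # Y = max(X, 0), as in the Activation layers above
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row maximum for numerical stability
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mlp_forward(x, params):
    # Each fully connected layer computes X.dot(W.T) + b,
    # with W stored as (num_hidden, input_dim)
    h1 = relu(x.dot(params["fc1_weight"].T) + params["fc1_bias"])
    h2 = relu(h1.dot(params["fc2_weight"].T) + params["fc2_bias"])
    logits = h2.dot(params["fc3_weight"].T) + params["fc3_bias"]
    return softmax(logits)

rng = np.random.RandomState(0)
params = {}
for i, (n_out, n_in) in enumerate([(128, 784), (64, 128), (10, 64)], start=1):
    params["fc%d_weight" % i] = rng.uniform(-0.01, 0.01, size=(n_out, n_in))
    params["fc%d_bias" % i] = np.zeros(n_out)

probs = mlp_forward(rng.rand(5, 784), params)
print(probs.shape)
```

Each row of probs is a probability distribution over the 10 digit classes, which is what the softmax output layer produces at prediction time.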

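3: Visualize the network:

mx.viz.plot_network(mlp)

However, this does not display in Spyder, so I use the following instead, which creates a jpg image of the graph in the working directory:

mx.viz.plot_network(mlp).view()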

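4: Load the data:

The official MXNet tutorial tests only on MNIST, so I do the same:

The data is also hard to download, so create a data folder under the example directory, create an mldata folder inside it, and put the original_mnist.mat file downloaded from GitHub there.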

from sklearn.datasets import fetch_mldata
import os,sys
curr_path = sys.path[0]
sys.path = [os.path.join("/home/hu/mxnet-master/example/autoencoder")] + sys.path
import data
X,Y=data.get_mnist()

for i in range(10):
    plt.subplot(1,10,i+1)
    plt.imshow(X[i].reshape((28,28)), cmap='Greys_r')
    plt.axis('off')
plt.show()

X = X.astype(np.float32)/255
X_train = X[:60000]
X_test = X[60000:]
Y_train = Y[:60000]
Y_test = Y[60000:]

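5: Set up the data iterators:

You can write your own data iterator for MXNet (examples are easy to find online). MXNet essentially works by iterating over the data one batch at a time.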

batch_size = 100
train_iter = mx.io.NDArrayIter(X_train, Y_train, batch_size=batch_size)
test_iter = mx.io.NDArrayIter(X_test, Y_test, batch_size=batch_size)
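As a rough picture of what iterating "one batch at a time" means, the sketch below slices arrays into consecutive batches in plain NumPy. iterate_batches is a hypothetical helper, not part of MXNet, and the demo arrays are made up. Unlike this sketch, NDArrayIter also decides how to treat a final partial batch through its last_batch_handle option.

```python
import numpy as np

def iterate_batches(X, Y, batch_size):
    """Yield (data, label) slices in order. Hypothetical helper, not MXNet API."""
    n = X.shape[0]
    for start in range(0, n, batch_size):
        yield X[start:start + batch_size], Y[start:start + batch_size]

# Made-up data: 250 samples with 4 features each
X_demo = np.arange(250 * 4, dtype=np.float32).reshape(250, 4)
Y_demo = np.arange(250) % 10

batches = list(iterate_batches(X_demo, Y_demo, batch_size=100))
print([b[0].shape[0] for b in batches])
```

Here the last batch is simply shorter than the others, since 250 is not a multiple of 100.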

6: Training:

From what I have read online, training this way is discouraged, because it leaves you fewer things to debug and tune.

model = mx.model.FeedForward(
    ctx = mx.gpu(0),      # Run on GPU 0
    symbol = mlp,         # Use the network we just defined
    num_epoch = 10,       # Train for 10 epochs
    learning_rate = 0.1,  # Learning rate
    momentum = 0.9,       # Momentum for SGD with momentum
    wd = 0.00001)         # Weight decay for regularization
model.fit(
    X=train_iter,  # Training data set
    eval_data=test_iter,  # Testing data set. MXNet computes scores on test set every epoch
    batch_end_callback = mx.callback.Speedometer(batch_size, 200))  # Logging module to print out progress

Second approach:

First bind the module (which allocates device memory for the data), initialize the parameters, and then train (this seems to give higher accuracy?)

# construct a simple MLP
data = mx.symbol.Variable('data')
fc1  = mx.symbol.FullyConnected(data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(fc1, name='relu1', act_type="relu")
fc2  = mx.symbol.FullyConnected(act1, name = 'fc2', num_hidden = 64)
act2 = mx.symbol.Activation(fc2, name='relu2', act_type="relu")
fc3  = mx.symbol.FullyConnected(act2, name='fc3', num_hidden=10)
out  = mx.symbol.SoftmaxOutput(fc3, name = 'softmax')
# construct the module
mod = mx.mod.Module(out)
mod.bind(data_shapes=train_iter.provide_data,label_shapes=train_iter.provide_label)
mod.init_params()
mod.fit(train_iter, eval_data=test_iter,optimizer_params={'learning_rate':0.01, 'momentum': 0.9},num_epoch=10)

7: Predict with the trained model:

plt.imshow((X_test[0].reshape((28,28))*255).astype(np.uint8), cmap='Greys_r')
plt.show()
print 'Result:', model.predict(X_test[0:1])[0].argmax()

8: Evaluate the model with the built-in scoring function:

print 'Accuracy:', model.score(test_iter)*100, '%'
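For intuition, the accuracy reported here can be sketched in NumPy as the fraction of samples whose highest-probability class matches the label. The accuracy function and the tiny arrays below are assumptions for illustration, not how model.score is implemented internally.

```python
import numpy as np

def accuracy(probs, labels):
    # The predicted class is the index of the largest probability in each row
    return float((probs.argmax(axis=1) == labels).mean())

# Made-up predictions for 4 samples and 3 classes
probs_demo = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.3, 0.3, 0.4],
                       [0.6, 0.3, 0.1]])
labels_demo = np.array([0, 1, 2, 1])
print(accuracy(probs_demo, labels_demo))
```

The last sample is predicted as class 0 but labeled 1, so three of the four predictions are correct.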

9: Wrap it as a function that a web page can call:

# run hand drawing test
from IPython.display import HTML

def classify(img):
    img = img[len('data:image/png;base64,'):].decode('base64')
    img = cv2.imdecode(np.fromstring(img, np.uint8), -1)
    img = cv2.resize(img[:,:,3], (28,28))
    img = img.astype(np.float32).reshape((1, 784))/255.0
    return model.predict(img)[0].argmax()

html = """<style type="text/css">canvas { border: 1px solid black; }</style><div id="board"><canvas id="myCanvas" width="100px" height="100px">Sorry, your browser doesn't support canvas technology.</canvas><p><button id="classify" onclick="classify()">Classify</button><button id="clear" onclick="myClear()">Clear</button>Result: <input type="text" id="result_output" size="5" value=""></p></div>"""
script = """<script type="text/JavaScript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js?ver=1.4.2"></script><script type="text/javascript">function init() {var myCanvas = document.getElementById("myCanvas");var curColor = $('#selectColor option:selected').val();if(myCanvas){var isDown = false;var ctx = myCanvas.getContext("2d");var canvasX, canvasY;ctx.lineWidth = 5;$(myCanvas).mousedown(function(e){isDown = true;ctx.beginPath();var parentOffset = $(this).parent().offset(); canvasX = e.pageX - parentOffset.left;canvasY = e.pageY - parentOffset.top;ctx.moveTo(canvasX, canvasY);}).mousemove(function(e){if(isDown != false) {var parentOffset = $(this).parent().offset(); canvasX = e.pageX - parentOffset.left;canvasY = e.pageY - parentOffset.top;ctx.lineTo(canvasX, canvasY);ctx.strokeStyle = curColor;ctx.stroke();}}).mouseup(function(e){isDown = false;ctx.closePath();});}$('#selectColor').change(function () {curColor = $('#selectColor option:selected').val();});}init();function handle_output(out) {document.getElementById("result_output").value = out.content.data["text/plain"];}function classify() {var kernel = IPython.notebook.kernel;var myCanvas = document.getElementById("myCanvas");data = myCanvas.toDataURL('image/png');document.getElementById("result_output").value = "";kernel.execute("classify('" + data +"')",  { 'iopub' : {'output' : handle_output}}, {silent:false});}function myClear() {var myCanvas = document.getElementById("myCanvas");myCanvas.getContext("2d").clearRect(0, 0, myCanvas.width, myCanvas.height);}</script>"""
HTML(html+script)
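The classify function above strips the data-URL prefix and decodes the rest with the Python 2 only str.decode('base64'). On Python 3 the same step could be sketched with the standard base64 module as below. data_url_to_bytes is a hypothetical helper name, and the demo payload is just the 8-byte PNG signature rather than a real canvas image. In the same spirit, newer NumPy versions prefer np.frombuffer over np.fromstring for turning the bytes into an array.

```python
import base64

PREFIX = "data:image/png;base64,"

def data_url_to_bytes(data_url):
    """Return the raw bytes carried by a PNG data URL (hypothetical helper)."""
    if not data_url.startswith(PREFIX):
        raise ValueError("expected a PNG data URL")
    return base64.b64decode(data_url[len(PREFIX):])

# Demo payload: only the PNG file signature, not a full image
png_signature = b"\x89PNG\r\n\x1a\n"
demo_url = PREFIX + base64.b64encode(png_signature).decode("ascii")
print(data_url_to_bytes(demo_url) == png_signature)
```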

10: Print weight statistics:

def norm_stat(d):
    """The statistics you want to see.
    We compute the L2 norm here but you can change it to anything you like."""
    return mx.nd.norm(d)/np.sqrt(d.size)
mon = mx.mon.Monitor(
    100,                 # Print every 100 batches
    norm_stat,           # The statistics function defined above
    pattern='.*weight',  # A regular expression. Only arrays with name matching this pattern will be included.
    sort=True)           # Sort output by name
model = mx.model.FeedForward(ctx = mx.gpu(0), symbol = mlp, num_epoch = 1,
                             learning_rate = 0.1, momentum = 0.9, wd = 0.00001)
model.fit(X=train_iter, eval_data=test_iter, monitor=mon,  # Set the monitor here
          batch_end_callback = mx.callback.Speedometer(100, 100))
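The statistic in norm_stat divides the L2 norm by the square root of the element count, which is the root mean square of the entries. A NumPy equivalent, with norm_stat_np as an assumed name and a made-up array, makes that easy to check by hand:

```python
import numpy as np

def norm_stat_np(d):
    # L2 norm over all entries divided by sqrt(number of entries)
    return np.linalg.norm(d) / np.sqrt(d.size)

w = np.array([[3.0, -4.0], [0.0, 0.0]])
# norm = sqrt(9 + 16) = 5, size = 4, so the statistic is 5 / 2
print(norm_stat_np(w))
```

Because it is normalized by size, this value stays comparable across layers with different numbers of weights.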

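11: As mentioned earlier, you can write the loop over the data iterator yourself (for example, for clustering)

Honestly, I wondered how a hand-written loop uses the GPU, and doubted that the author's own example did. Note, however, that simple_bind below is called with ctx=mx.gpu(0), which places the executor and its arrays on the GPU.

epoch is the number of passes over the data; iter counts the batches within each pass.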

# ==================Binding=====================
# The symbol we created is only a graph description.
# To run it, we first need to allocate memory and create an executor by 'binding' it.
# In order to bind a symbol, we need at least two pieces of information: context and input shapes.
# Context specifies which device the executor runs on, e.g. cpu, GPU0, GPU1, etc.
# Input shapes define the executor's input array dimensions.
# MXNet then runs automatic shape inference to determine the dimensions of intermediate and output arrays.

# Data iterators define the shapes of their outputs with the provide_data and provide_label properties.
input_shapes = dict(train_iter.provide_data+train_iter.provide_label)
print 'input_shapes', input_shapes
# We use simple_bind to let MXNet allocate memory for us.
# You can also allocate memory yourself and use bind to pass it to MXNet.
exe = mlp.simple_bind(ctx=mx.gpu(0), **input_shapes)

# ===============Initialization=================
# First we get handle to input arrays
arg_arrays = dict(zip(mlp.list_arguments(), exe.arg_arrays))
data = arg_arrays[train_iter.provide_data[0][0]]
label = arg_arrays[train_iter.provide_label[0][0]]

# We initialize the weights with uniform distribution on (-0.01, 0.01).
init = mx.init.Uniform(scale=0.01)
for name, arr in arg_arrays.items():
    if name not in input_shapes:
        init(name, arr)

# We also need to create an optimizer for updating weights
opt = mx.optimizer.SGD(
    learning_rate=0.1,
    momentum=0.9,
    wd=0.00001,
    rescale_grad=1.0/train_iter.batch_size)
updater = mx.optimizer.get_updater(opt)

# Finally we need a metric to print out training progress
metric = mx.metric.Accuracy()

# Training loop begins
for epoch in range(10):
    train_iter.reset()
    metric.reset()
    t = 0
    for batch in train_iter:
        # Copy data to executor input. Note the [:].
        data[:] = batch.data[0]
        label[:] = batch.label[0]

        # Forward
        exe.forward(is_train=True)

        # You perform operations on exe.outputs here if you need to.
        # For example, you can stack a CRF on top of a neural network.

        # Backward
        exe.backward()

        # Update
        for i, pair in enumerate(zip(exe.arg_arrays, exe.grad_arrays)):
            weight, grad = pair
            updater(i, grad, weight)
        metric.update(batch.label, exe.outputs)
        t += 1
        if t % 100 == 0:
            print 'epoch:', epoch, 'iter:', t, 'metric:', metric.get()
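The updater call in the loop above hides the arithmetic of each weight update. The NumPy sketch below shows one common formulation of SGD with momentum, weight decay and gradient rescaling, using the same hyperparameter names as the optimizer above. sgd_momentum_step is a hypothetical helper written for illustration, and it is an assumption that this matches MXNet's SGD term by term.

```python
import numpy as np

def sgd_momentum_step(weight, grad, state, lr=0.1, momentum=0.9,
                      wd=0.00001, rescale_grad=1.0):
    """One in-place update step (illustrative sketch, not MXNet's code)."""
    g = rescale_grad * grad + wd * weight   # rescaled gradient plus weight decay
    state[:] = momentum * state - lr * g    # velocity accumulates past gradients
    weight[:] += state                      # move the weights along the velocity
    return weight, state

# Toy problem: minimize f(w) = sum(w**2), whose gradient is 2*w
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    sgd_momentum_step(w, 2.0 * w, v, wd=0.0)
print(np.abs(w).max())
```

On this toy quadratic the weights oscillate toward zero, which is the typical behavior of momentum: faster progress than plain SGD, at the cost of some overshoot.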

12: A new layer

The inputs and the number of outputs must both be declared carefully.

# Define custom softmax operator
class NumpySoftmax(mx.operator.NumpyOp):
    def __init__(self):
        # Call the parent class constructor.
        # Because NumpySoftmax is a loss layer, it doesn't need gradient input from layers above.
        super(NumpySoftmax, self).__init__(need_top_grad=False)

    def list_arguments(self):
        # Define the input to NumpySoftmax.
        return ['data', 'label']

    def list_outputs(self):
        # Define the output.
        return ['output']

    def infer_shape(self, in_shape):
        # Calculate the dimensions of the output (and missing inputs) from (some) input shapes.
        data_shape = in_shape[0]  # shape of first argument 'data'
        label_shape = (in_shape[0][0],)  # 'label' should be one dimensional and has batch_size instances.
        output_shape = in_shape[0] # 'output' dimension is the same as the input.
        return [data_shape, label_shape], [output_shape]

    def forward(self, in_data, out_data):
        x = in_data[0]  # 'data'
        y = out_data[0]  # 'output'

        # Compute softmax
        y[:] = np.exp(x - x.max(axis=1).reshape((x.shape[0], 1)))
        y /= y.sum(axis=1).reshape((x.shape[0], 1))

    def backward(self, out_grad, in_data, out_data, in_grad):
        l = in_data[1]  # 'label'
        l = l.reshape((l.size,)).astype(int)  # cast to int (np.int was removed in newer NumPy)
        y = out_data[0]  # 'output'
        dx = in_grad[0]  # gradient for 'data'

        # Compute gradient
        dx[:] = y
        dx[np.arange(l.shape[0]), l] -= 1.0

numpy_softmax = NumpySoftmax()

data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(data = fc1, name='relu1', act_type="relu")
fc2 = mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = 64)
act2 = mx.symbol.Activation(data = fc2, name='relu2', act_type="relu")
fc3 = mx.symbol.FullyConnected(data = act2, name='fc3', num_hidden=10)
# Use the new operator we just defined instead of the standard softmax operator.
mlp = numpy_softmax(data=fc3, name = 'softmax')

model = mx.model.FeedForward(ctx = mx.gpu(0), symbol = mlp, num_epoch = 2,
                             learning_rate = 0.1, momentum = 0.9, wd = 0.00001)
model.fit(X=train_iter, eval_data=test_iter,
          batch_end_callback = mx.callback.Speedometer(100, 100))
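The backward method of NumpySoftmax sets the gradient to the softmax output minus a one-hot encoding of the label. That is the gradient of the cross-entropy loss with respect to the softmax input, and it can be checked numerically. The sketch below is standalone NumPy with assumed helper names and random made-up inputs. It does not exercise the MXNet operator itself.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(x, labels):
    # Summed negative log-likelihood of the true classes
    p = softmax(x)
    return -np.log(p[np.arange(len(labels)), labels]).sum()

def analytic_grad(x, labels):
    # Same rule as backward() above: dx = y, then subtract 1 at the label index
    dx = softmax(x)
    dx[np.arange(len(labels)), labels] -= 1.0
    return dx

def numeric_grad(x, labels, eps=1e-6):
    # Central finite differences, one input entry at a time
    g = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        xp = x.copy()
        xp[idx] += eps
        xm = x.copy()
        xm[idx] -= eps
        g[idx] = (cross_entropy(xp, labels) - cross_entropy(xm, labels)) / (2 * eps)
    return g

rng = np.random.RandomState(0)
x = rng.randn(4, 10)
labels = np.array([3, 0, 9, 5])
max_err = np.abs(analytic_grad(x, labels) - numeric_grad(x, labels)).max()
print(max_err)
```

A check like this is a cheap way to validate a hand-written backward pass before plugging a custom operator into training.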

13: A new layer combined with a custom training loop:

I created this under the example/mytest folder.

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Thu Mar 30 15:35:02 2017

@author: root
"""
from __future__ import print_function
import sys
import os
# code to automatically download dataset
curr_path = os.path.dirname(os.path.abspath(os.path.expanduser(__file__)))
sys.path = [os.path.join(curr_path, "../autoencoder")] + sys.path
import mxnet as mx
import numpy as np
import data
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
import model
from autoencoder import AutoEncoderModel
from solver import Solver, Monitor
import logging
import time
global YT
import scipy.io as sio
import matplotlib.pyplot as plt
# ==================start setting My-layer=====================
class NumpySoftmax(mx.operator.NumpyOp):
    def __init__(self):
        # Call the parent class constructor.
        # Because NumpySoftmax is a loss layer, it doesn't need gradient input from layers above.
        super(NumpySoftmax, self).__init__(need_top_grad=False)

    def list_arguments(self):
        # Define the input to NumpySoftmax.
        return ['data', 'label']

    def list_outputs(self):
        # Define the output.
        return ['output']

    def infer_shape(self, in_shape):
        # Calculate the dimensions of the output (and missing inputs) from (some) input shapes.
        data_shape = in_shape[0]  # shape of first argument 'data'
        label_shape = (in_shape[0][0],)  # 'label' should be one dimensional and has batch_size instances.
        output_shape = in_shape[0] # 'output' dimension is the same as the input.
        return [data_shape, label_shape], [output_shape]

    def forward(self, in_data, out_data):
        alpha=1.0
        z = in_data[0]  # 'data': the points to assign to clusters
        q= out_data[0]  # 'output'
        # Cluster centers are re-estimated with k-means on every batch
        kmeans = KMeans(n_clusters=10, random_state=170).fit(z)
        mu=kmeans.cluster_centers_
        # Compute the Student's t soft assignment of each point to each center
        mask = 1.0/(1.0+cdist(z, mu)**2/alpha)
        q[:] = mask**((alpha+1.0)/2.0)
        q[:] = (q.T/q.sum(axis=1)).T

    def backward(self, out_grad, in_data, out_data, in_grad):
        alpha=1.0
        x = in_data[0]  # 'data' (the label in in_data[1] is not used here)
        y = out_data[0]  # 'output'
        dx = in_grad[0]  # gradient for 'data'
        kmeans = KMeans(n_clusters=10, random_state=170).fit(x)
        mu=kmeans.cluster_centers_
        mask = 1.0/(1.0+cdist(x, mu)**2/alpha)
        p = mask**((alpha+1.0)/2.0)
        mask*= (alpha+1.0)/alpha*(p-y)
        dx[:] = (x.T*mask.sum(axis=1)).T - mask.dot(mu)
#======================end setting==========================
# ==================start of the process of data=====================
X, Y = data.get_mnist()
X_train = X[:60000]
X_test = X[60000:]
Y_train = Y[:60000]
Y_test = Y[60000:]
numpy_softmax = NumpySoftmax()
batch_size = 100
# the official code to create the iterators
train_iter = mx.io.NDArrayIter(X_train, Y_train, batch_size=batch_size)
test_iter = mx.io.NDArrayIter(X_test, Y_test, batch_size=batch_size)
input_shapes = dict(train_iter.provide_data+train_iter.provide_label)
# ==================end of the process=====================
# ==================start of setting the net=====================
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(data = fc1, name='relu1', act_type="relu")
fc2 = mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = 64)
act2 = mx.symbol.Activation(data = fc2, name='relu2', act_type="relu")
fc3 = mx.symbol.FullyConnected(data = act2, name='fc3', num_hidden=10)
mlp = numpy_softmax(data=fc3, name = 'softmax')
mx.viz.plot_network(mlp).view()
# ==================end of setting the net=====================
exe = mlp.simple_bind(ctx=mx.gpu(0), **input_shapes)
# ===============Initialization=================
# First we get handle to input arrays
arg_arrays = dict(zip(mlp.list_arguments(), exe.arg_arrays))
data = arg_arrays[train_iter.provide_data[0][0]]
label = arg_arrays[train_iter.provide_label[0][0]]

# We initialize the weights with uniform distribution on (-0.01, 0.01).
init = mx.init.Uniform(scale=0.01)
for name, arr in arg_arrays.items():
    if name not in input_shapes:
        init(name, arr)

# We also need to create an optimizer for updating weights
opt = mx.optimizer.SGD(
    learning_rate=0.1,
    momentum=0.9,
    wd=0.00001,
    rescale_grad=1.0/train_iter.batch_size)
updater = mx.optimizer.get_updater(opt)

# Finally we need a metric to print out training progress
metric = mx.metric.Accuracy()

# Training loop begins
for epoch in range(10):
    train_iter.reset()
    metric.reset()
    t = 0
    for batch in train_iter:
        # Copy data to executor input. Note the [:].
        data[:] = batch.data[0]
        label[:] = batch.label[0]

        # Forward
        exe.forward(is_train=True)

        # You perform operations on exe.outputs here if you need to.
        # For example, you can stack a CRF on top of a neural network.

        # Backward
        exe.backward()

        # Update
        for i, pair in enumerate(zip(exe.arg_arrays, exe.grad_arrays)):
            weight, grad = pair
            updater(i, grad, weight)
        metric.update(batch.label, exe.outputs)
        t += 1
        if t % 100 == 0:
            print('epoch:', epoch, 'iter:', t, 'metric:', metric.get())
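The forward pass of the custom layer in this script turns distances to cluster centers into soft assignments with a Student's t kernel and then normalizes each row. The NumPy sketch below isolates just that computation. soft_assign is an assumed name, the centers are random made-up points rather than k-means output, and the pairwise distances are computed with broadcasting instead of scipy's cdist.

```python
import numpy as np

def soft_assign(z, mu, alpha=1.0):
    """Row-normalized Student's t similarity between points z and centers mu."""
    # Squared Euclidean distance between every point and every center
    sq_dist = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

rng = np.random.RandomState(0)
mu = rng.randn(10, 10)                      # 10 made-up centers in 10 dimensions
z = np.vstack([mu[2] + 0.01 * rng.randn(10),  # a point very close to center 2
               rng.randn(10)])                # an unrelated random point
q = soft_assign(z, mu)
print(q.sum(axis=1), q[0].argmax())
```

A point lying almost on top of a center gets most of its probability mass assigned to that center, while a point far from all centers gets a flatter row.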
Time: 2024-12-17 05:13:53
