机器学习-反向传播算法（BP）代码实现（matlab）

%% Machine Learning Online Class - Exercise 4 Neural Network Learning

% Instructions
% ------------
%
% This file contains code that helps you get started on the
% linear exercise. You will need to complete the following functions
% in this exericse:
%
% sigmoidGradient.m
% randInitializeWeights.m
% nnCostFunction.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%

%% Initialization
clear ; close all; clc

%% Setup the parameters you will use for this exercise
input_layer_size = 400; % 20x20 Input Images of Digits
hidden_layer_size = 25; % 25 hidden units
num_labels = 10; % 10 labels, from 1 to 10
% (note that we have mapped "0" to label 10)

%% =========== Part 1: Loading and Visualizing Data =============
% We start the exercise by first loading and visualizing the dataset.
% You will be working with a dataset that contains handwritten digits.
%

% Load Training Data
fprintf(‘Loading and Visualizing Data ...\n‘)

load(‘ex4data1.mat‘);
m = size(X, 1);

% Randomly select 100 data points to display
sel = randperm(size(X, 1));
sel = sel(1:100);

sel(1:100); 1 2 3 4 5 ...100

解释

a = X(sel, :);

X(sel, :);
1.......
2.......
3.......
4.......
5.......
.
.
.
100......

解释

displayData(X(sel, :));

fprintf(‘Program paused. Press enter to continue.\n‘);
pause;

%% ================ Part 2: Loading Parameters ================
% In this part of the exercise, we load some pre-initialized
% neural network parameters.

fprintf(‘\nLoading Saved Neural Network Parameters ...\n‘)

% Load the weights into variables Theta1 and Theta2
load(‘ex4weights.mat‘);

% Unroll parameters
nn_params = [Theta1(:) ; Theta2(:)];

https://www.cnblogs.com/liu-wang/p/9466123.html

解释

%% ================ Part 3: Compute Cost (Feedforward) ================
% To the neural network, you should first start by implementing the
% feedforward part of the neural network that returns the cost only. You
% should complete the code in nnCostFunction.m to return cost. After
% implementing the feedforward to compute the cost, you can verify that
% your implementation is correct by verifying that you get the same cost
% as us for the fixed debugging parameters.
%
% We suggest implementing the feedforward cost *without* regularization
% first so that it will be easier for you to debug. Later, in part 4, you
% will get to implement the regularized cost.
%
fprintf(‘\nFeedforward Using Neural Network ...\n‘)

% Weight regularization parameter (we set this to 0 here).
lambda = 0;

J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
num_labels, X, y, lambda);

fprintf([‘Cost at parameters (loaded from ex4weights): %f ‘...
‘\n(this value should be about 0.287629)\n‘], J);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% =============== Part 4: Implement Regularization ===============
% Once your cost function implementation is correct, you should now
% continue to implement the regularization with the cost.
%

fprintf(‘\nChecking Cost Function (w/ Regularization) ... \n‘)

% Weight regularization parameter (we set this to 1 here).
lambda = 1;

J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
num_labels, X, y, lambda);

fprintf([‘Cost at parameters (loaded from ex4weights): %f ‘...
‘\n(this value should be about 0.383770)\n‘], J);

fprintf(‘Program paused. Press enter to continue.\n‘);
pause;

%% ================ Part 5: Sigmoid Gradient ================
% Before you start implementing the neural network, you will first
% implement the gradient for the sigmoid function. You should complete the
% code in the sigmoidGradient.m file.
%

fprintf(‘\nEvaluating sigmoid gradient...\n‘)

g = sigmoidGradient([1 -0.5 0 0.5 1]);
fprintf(‘Sigmoid gradient evaluated at [1 -0.5 0 0.5 1]:\n ‘);
fprintf(‘%f ‘, g);
fprintf(‘\n\n‘);

fprintf(‘Program paused. Press enter to continue.\n‘);
pause;

%% ================ Part 6: Initializing Pameters ================
% In this part of the exercise, you will be starting to implment a two
% layer neural network that classifies digits. You will start by
% implementing a function to initialize the weights of the neural network
% (randInitializeWeights.m)

fprintf(‘\nInitializing Neural Network Parameters ...\n‘)

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);

% Unroll parameters
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];

%% =============== Part 7: Implement Backpropagation ===============
% Once your cost matches up with ours, you should proceed to implement the
% backpropagation algorithm for the neural network. You should add to the
% code you‘ve written in nnCostFunction.m to return the partial
% derivatives of the parameters.
%
fprintf(‘\nChecking Backpropagation... \n‘);

% Check gradients by running checkNNGradients
checkNNGradients;

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% =============== Part 8: Implement Regularization ===============
% Once your backpropagation implementation is correct, you should now
% continue to implement the regularization with the cost and gradient.
%

fprintf(‘\nChecking Backpropagation (w/ Regularization) ... \n‘)

% Check gradients by running checkNNGradients
lambda = 3;
checkNNGradients(lambda);

% Also output the costFunction debugging values
debug_J = nnCostFunction(nn_params, input_layer_size, ...
hidden_layer_size, num_labels, X, y, lambda);

fprintf([‘\n\nCost at (fixed) debugging parameters (w/ lambda = 10): %f ‘ ...
‘\n(this value should be about 0.576051)\n\n‘], debug_J);

fprintf(‘Program paused. Press enter to continue.\n‘);
pause;

%% =================== Part 8: Training NN ===================
% You have now implemented all the code necessary to train a neural
% network. To train your neural network, we will now use "fmincg", which
% is a function which works similarly to "fminunc". Recall that these
% advanced optimizers are able to train our cost functions efficiently as
% long as we provide them with the gradient computations.
%
fprintf(‘\nTraining Neural Network... \n‘)

% After you have completed the assignment, change the MaxIter to a larger
% value to see how more training helps.
options = optimset(‘MaxIter‘, 50);

% You should also try different values of lambda
lambda = 1;

% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, X, y, lambda);

% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));

fprintf(‘Program paused. Press enter to continue.\n‘);
pause;

%% ================= Part 9: Visualize Weights =================
% You can now "visualize" what the neural network is learning by
% displaying the hidden units to see what features they are capturing in
% the data.

fprintf(‘\nVisualizing Neural Network... \n‘)

displayData(Theta1(:, 2:end));

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ================= Part 10: Implement Predict =================
% After training the neural network, we would like to use it to predict
% the labels. You will now implement the "predict" function to use the
% neural network to predict the labels of the training set. This lets
% you compute the training set accuracy.

pred = predict(Theta1, Theta2, X);

fprintf(‘\nTraining Set Accuracy: %f\n‘, mean(double(pred == y)) * 100);

原文地址：https://www.cnblogs.com/liu-wang/p/9466132.html

时间： 2024-11-04 08:46:46

机器学习-反向传播算法（BP）代码实现（matlab）的相关文章

《神经网络和深度学习》系列文章十六：反向传播算法代码

出处: Michael Nielsen的<Neural Network and Deep Learning>,点击末尾“阅读原文”即可查看英文原文. 本节译者:哈工大SCIR硕士生李盛秋声明:如需转载请联系[email protected],未经授权不得转载. 使用神经网络识别手写数字反向传播算法是如何工作的热身:一个基于矩阵的快速计算神经网络输出的方法关于损失函数的两个假设 Hadamard积反向传播背后的四个基本等式四个基本等式的证明(选读) 反向传播算法反向传播算法代码

【神经网络和深度学习】笔记 - 第二章反向传播算法

上一章中我们遗留了一个问题,就是在神经网络的学习过程中,在更新参数的时候,如何去计算损失函数关于参数的梯度.这一章,我们将会学到一种快速的计算梯度的算法:反向传播算法. 这一章相较于后面的章节涉及到的数学知识比较多,如果阅读上有点吃力的话也可以完全跳过这一章,把反向传播当成一个计算梯度的黑盒即可,但是学习这些数学知识可以帮助我们更深入的理解神经网络. 反向传播算法的核心目的是对于神经网络中的任何weight或bias计算损失函数$C$关于它们的偏导数$\frac{\partial C}{\par

机器学习之反向传播算法

Thoughts of Algorithms 博客园首页联系订阅管理随笔 - 54 文章 - 1 评论 - 141 机器学习公开课笔记(5):神经网络(Neural Network)--学习这一章可能是Andrew Ng讲得最不清楚的一章,为什么这么说呢?这一章主要讲后向传播(Backpropagration, BP)算法,Ng花了一大半的时间在讲如何计算误差项δδ,如何计算ΔΔ的矩阵,以及如何用Matlab去实现后向传播,然而最关键的问题--为什么要这么计算?前面计算的这些量到

记一下机器学习笔记多层感知机的反向传播算法

<神经网络与机器学习>第4章前半段笔记以及其他地方看到的东西的混杂-第2.3章的内容比较古老预算先跳过. 不得不说幸亏反向传播的部分是<神机>里边人话比较多的部分,看的时候没有消化不良. 多层感知机书里前三章的模型的局限都很明显,对于非线性可分问题苦手,甚至简单的异或都弄不了.于是多层感知机(也就是传说中的神经网络)就被发明了出来对付这个问题. 多层感知机就是由一系列的感知机,或者说神经元组成,每个神经元都接受若干的输入(树突)并产生一个输出(轴突). 这些神经元被分成若干层,每

神经网络训练中的Tricks之高效BP（反向传播算法）

神经网络训练中的Tricks之高效BP(反向传播算法) 神经网络训练中的Tricks之高效BP(反向传播算法) [email protected] http://blog.csdn.net/zouxy09 Tricks!这是一个让人听了充满神秘和好奇的词.对于我们这些所谓的尝试应用机器学习技术解决某些问题的人,更是如此.曾记得,我们绞尽脑汁,搓手顿足,大喊“为什么我跑的模型不work?”,“为什么我实现的效果那么差?”,“为什么我复现的结果没有他论文里面说的那么好?”.有人会和你说“你不懂调参!

稀疏自动编码之反向传播算法（BP）

假设给定m个训练样本的训练集,用梯度下降法训练一个神经网络,对于单个训练样本(x,y),定义该样本的损失函数: 那么整个训练集的损失函数定义如下: 第一项是所有样本的方差的均值.第二项是一个归一化项(也叫权重衰减项),该项是为了减少权连接权重的更新速度,防止过拟合. 我们的目标是最小化关于 W 和 b 的函数J(W,b). 为了训练神经网络,把每个参数和初始化为很小的接近于0的随机值(例如随机值由正态分布Normal(0,ε2)采样得到,把 ε 设为0.01), 然后运用批量梯度下降算法进行优

DL4NLP——神经网络（一）前馈神经网络的BP反向传播算法步骤整理

这里把按[1]推导的BP算法(Backpropagation)步骤整理一下,备忘使用.[1] 中直接使用矩阵微分的记号进行推导,整个过程十分简洁.而且这种矩阵形式有一个非常大的优势就是对照其进行编程实现时非常方便. 但其实用标量计算推导也有一定的好处,比如可以清楚地知道某个权重是被谁所影响的. 记号约定: $L$:神经网络的层数.输入层不算. $n^l$:第 $l$ 层神经元的个数.偏置神经元不算在内. $W^{l}\in\mathbb R^{n^l\times n^{l-1}}$:第 $l-1

深度学习基础--神经网络--BP反向传播算法

BP算法: 1.是一种有监督学习算法,常被用来训练多层感知机. 2.要求每个人工神经元(即节点)所使用的激励函数必须可微. (激励函数:单个神经元的输入与输出之间的函数关系叫做激励函数.) (假如不使用激励函数,神经网络中的每层都只是做简单的线性变换,多层输入叠加后也还是线性变换.因为线性模型的表达能力不够,激励函数可以引入非线性因素) 下面两幅图分别为:无激励函数的神经网络和激励函数的神经网络如图所示,加入非线性激活函数后的差异:上图为用线性组合逼近平滑曲线来分割平面,下图为使用平滑的曲线

机器学习之五：神经网络、反向传播算法

一.逻辑回归的局限在逻辑回归一节中,使用逻辑回归的多分类,实现了识别20*20的图片上的数字. 但所使用的是一个一阶的模型,并没有使用多项式,为什么? 可以设想一下,在原有400个特征的数据样本中,增加二次.三次.四次多项式,会是什么情形? 很显然,训练样本的特征数量将会拔高多个数量级,而且,更重要的,要在一个式子中拟合这么多的特征,其难度是非常大的,可能无法收敛到一个比较理想的状态. 也就是说,逻辑回归没法提供很复杂的模型. 因为其本质上是一个线性的分类器,擅长解决的是线性可分的问题. 那么