UFLDL Tutorial Code Analysis

I had not worked with much code before. A while ago a friend mentioned caffe; I wanted to see how to use it, but I was too inexperienced to figure it out... Since I had never really touched this area, I decided to start from the basics. The code in this post comes from the UFLDL tutorial.

1. Function Analysis

MATLAB code, matching the UFLDL Tutorial. The code calls minFunc to solve the optimization problems, so you may want to read Part 2 first and then come back here.



minFunc: unconstrained optimizer using a line search strategy. (Note: although not stated here, these methods only handle unconstrained problems, and global optimality is only guaranteed when the objective is convex.)

The function finds the minimum using descent methods; Section 3 of "Convex Optimization and Machine Learning" covers the background, though Boyd's *Convex Optimization* is clearly the better choice if you have the time.

Inputs: funObj, x0, options, varargin

funObj provides the cost function and its gradient.

x0 is the initial point for the iteration.

options passes the solver parameters.

varargin holds any extra arguments that funObj needs.

Outputs: x, f, exitflag, output

x is the minimizer found by the iteration.

f is the value of the cost function at the minimizer.

exitflag is the status on exit.

output carries information about the run.

Input parameters (options)

DerivativeCheck: if enabled, numerically verifies the user-supplied gradient at the starting point.

verbose & verboseI & debug & doPlot are controlled by DISPLAY and determine how much information is printed while the solver runs.

method is set by METHOD and selects the descent method used. The values of LS_init, LS_type, LS_interp, LS_multi, Fref, Damped, HessianIter, and c2 depend on the chosen method, and some of them can also be specified directly.

Other parameters, including maxFunEvals, maxIter, optTol, progTol, ..., can be left at their defaults or set explicitly.
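
As an illustration, here is a minimal sketch of configuring and calling minFunc; the option names follow minFunc's documentation, and the quadratic objective quadObj is made up for this example:

% A hedged sketch: configure minFunc and minimize a toy quadratic.
options = [];
options.Method  = 'sd';          % steepest descent ('lbfgs' is minFunc's default)
options.MaxIter = 100;           % iteration cap
options.Display = 'iter';        % print progress each iteration
options.DerivativeCheck = 'on';  % numerically check the gradient first

% funObj must return the cost f and gradient g at theta.
quadObj = @(theta) deal(0.5*(theta'*theta), theta);  % f = ||theta||^2/2, g = theta
x0 = randn(5,1);
[x, f, exitflag, output] = minFunc(quadObj, x0, options);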

Solving convex optimization: descent methods

See Section 3 of "Convex Optimization and Machine Learning"; again, Boyd's *Convex Optimization* is clearly the better choice if you have the time.

Processing flow

(Using steepest descent as the example: it is the simplest, and frankly the main reason is that I do not understand the others yet...)

Variable meanings

x: the current point

d: the descent direction

t: the step size

1. Preprocessing.

2. Since SD < NEWTON in minFunc's method codes, the Hessian never needs to be computed. funObj is called to obtain f (the cost) and g (the gradient at x).

3. Loop until the iteration limit is reached, or until the step size or the change in cost falls below its tolerance.

For steepest descent the descent direction is the negative gradient, so d = -g.

A line search is then used to choose the step size; backtracking line search is the default.

Initialize t = 1 and call ArmijoBacktrack to compute the step.

4. Check and validate the results. (A minimal sketch of the whole loop follows.)
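
For concreteness, a minimal steepest-descent loop in the spirit of these steps (a sketch, not minFunc's actual code; backtrack is the hypothetical line-search helper sketched in the next section):

function x = steepest_descent(funObj, x, maxIter, optTol)
  for iter = 1:maxIter
    [f, g] = funObj(x);                   % cost and gradient at the current point
    if norm(g) < optTol                   % stop when the gradient is nearly zero
      break;
    end
    d = -g;                               % steepest-descent direction
    t = backtrack(funObj, x, f, g, d);    % step size from the line search
    x = x + t*d;                          % take the step
  end
end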



ArmijoBacktrack

The standard backtracking line search; see the comments in the source for the meaning of each parameter.

In this code c1 plays the role of alpha (default 1e-4), and with simple backtracking each loop halves the step size: t = t/2.
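
A minimal backtracking line search satisfying the Armijo condition, under the assumptions above (alpha = c1 = 1e-4 and halving steps; minFunc's ArmijoBacktrack also supports interpolation-based updates):

function t = backtrack(funObj, x, f, g, d)
  c1 = 1e-4;                                % sufficient-decrease constant (alpha)
  t  = 1;                                   % initial step size
  while funObj(x + t*d) > f + c1*t*(g'*d)   % Armijo condition not yet satisfied
    t = t/2;                                % halve the step and try again
  end
end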

2. Code

Note: a MATLAB script (not a function) is best started with the following lines, which clear the command window, clear the workspace variables, and close all figures.

clc;
clear all;
close all;

Linear Regression

    ex1a_linreg.m line 47

theta = minFunc(@linear_regression, theta, options, train.X, train.y);

Here options is a struct that sets minFunc's parameters, train.X and train.y form the training set, and linear_regression is the function we write; the theta passed in is the initial value, and the theta returned is the learned weight vector of the linear regression. That is, minFunc solves

    minimize_theta  J(theta) = (1/2) * sum_j (theta' * x(j) - y(j))^2

linear_regression (comments omitted)

function [f,g] = linear_regression(theta, X, y)

  yEst = theta'*X;             % predictions for every example
  f = (y-yEst)*(y-yEst)'/2;    % squared-error cost (the 1/2 keeps f consistent with g)
  g = X*(yEst-y)';             % gradient of f with respect to theta

end

minFunc needs the cost function value and the gradient during its computation; the code above provides exactly that.
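
To see that f and g are consistent, a quick finite-difference check on one coordinate (a sketch; minFunc can also do this for you via options.DerivativeCheck = 'on'):

theta0 = randn(size(train.X,1), 1);
[f, g] = linear_regression(theta0, train.X, train.y);
eps0 = 1e-6;                                  % finite-difference step
i = 1;                                        % coordinate to check
e = zeros(size(theta0)); e(i) = eps0;
gNum = (linear_regression(theta0+e, train.X, train.y) ...
      - linear_regression(theta0-e, train.X, train.y)) / (2*eps0);
fprintf('analytic %g vs numerical %g\n', g(i), gNum);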

Logistic Regression

function [f,g] = logistic_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A column vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example.  y(j) is the j'th example's label.
  %

  m = size(X,2);

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

%%% YOUR CODE HERE %%%
  h = 1./(1+exp(-theta'*X));          % sigmoid hypothesis for all examples
  f = -y*log(h') + (y-1)*log(1-h');   % negative log-likelihood (cross-entropy)
  g = X*(h-y)';                       % gradient of f with respect to theta
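
A hedged usage sketch: once minFunc returns theta, classify by thresholding the sigmoid at 0.5 (the train variable follows the exercise scripts):

h = 1./(1 + exp(-theta'*train.X));            % predicted probabilities
accuracy = mean(double(h > 0.5) == train.y);  % fraction of correct 0/1 labels
fprintf('training accuracy: %.2f%%\n', 100*accuracy);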

Softmax Regression (for reference)

function [f,g] = softmax_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %       In minFunc, theta is reshaped to a long vector.  So we need to
  %       resize it to an n-by-(num_classes-1) matrix.
  %       Recall that we assume theta(:,num_classes) = 0.
  %
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example.  y(j) is the j'th example's label.
  %
  m = size(X,2);
  n = size(X,1);

  % theta is a vector;  need to reshape to n x (num_classes-1).
  theta = reshape(theta, n, []);
  num_classes = size(theta,2)+1;

  % initialize objective value and gradient.

  %
  % TODO:  Compute the softmax objective function and gradient using vectorized code.
  %        Store the objective function value in 'f', and the gradient in 'g'.
  %        Before returning g, make sure you form it back into a vector with g=g(:);
  %
%%% YOUR CODE HERE %%%
  f = 0;
  g = zeros(size(theta));
  a = theta'*X;                        % scores for the first num_classes-1 classes
  a = [a; zeros(1,size(a,2))];         % append the fixed zero row for the last class
  a = exp(a);
  aSum = sum(a);                       % per-example softmax normalizer
  compareMatrix = 1:num_classes;       % (was hard-coded 1:10; generalized here)
  compareMatrix = repmat(compareMatrix',1,m);
  h = log(a./repmat(aSum,num_classes,1));      % log-probabilities, num_classes x m
  judMatrix = abs(compareMatrix - repmat(y,num_classes,1));
  A = judMatrix;                       % indicator matrix: A(c,j) = 1 iff y(j) == c
  A(judMatrix>0) = 0;
  A(judMatrix==0) = 1;
  B = A*h';
  f = -sum(diag(B));                   % negative log-likelihood of the labels
  g = -X*(A - a./repmat(aSum,num_classes,1))'; % gradient over all classes
  g = g(:,1:num_classes-1);            % drop the fixed last class (was hard-coded 1:9)
  g = g(:); % make gradient a vector for minFunc
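
A hedged usage sketch: predicting labels with the learned parameters (theta here is the flat vector minFunc returns, and X, y follow the exercise script's variables):

theta = reshape(theta, size(X,1), []);        % back to n x (num_classes-1)
scores = [theta'*X; zeros(1, size(X,2))];     % restore the implicit zero row
[~, pred] = max(scores);                      % most probable class per example
accuracy = mean(pred == y);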

PCA Whitening (for this part, just follow the tutorial)

%%================================================================
%% Step 0a: Load data
%  Here we provide the code to load the image data into x.
%  x will be a 784 * 60000 matrix, where the kth column x(:, k) corresponds to
%  the raw pixel data of the kth 28x28 MNIST image.
%  You do not need to change the code below.
clear all;
close all;
clc;
x = loadMNISTImages('train-images-idx3-ubyte');
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

%%% YOUR CODE HERE %%%
xMeanRow = mean(x);                      % row vector of per-example (per-column) means
x = x - repmat(xMeanRow, size(x,1), 1);  % subtract each example's mean intensity
%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.

%%% YOUR CODE HERE %%%
xCorr = x*x'/size(x,2);   % sample covariance matrix (the data is zero-mean)
[U S V] = svd(xCorr);     % U holds the eigenvectors, diag(S) the eigenvalues (descending)
xRot = U'*x;              % data expressed in the eigenbasis
%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

%%% YOUR CODE HERE %%%
covar = xRot*xRot'/size(xRot,2);   % covariance in the rotated basis: should equal S
% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

%%% YOUR CODE HERE %%%
totalVar = sum(diag(covar));   % total variance ('var' would shadow a built-in)
varMin = 0.99*totalVar;        % we want to retain at least 99% of it
varSum = 0;
k = 0;
A = diag(covar);               % per-component variances, in decreasing order
for i=1:length(A)
    varSum = varSum+A(i);
    if(varSum>=varMin && k==0)
        k = i;
    end
end
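% (Equivalently, assuming the same 99% threshold, a vectorized one-liner:
%    k = find(cumsum(diag(covar)) >= 0.99*sum(diag(covar)), 1);
% )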
%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 784, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

%%% YOUR CODE HERE %%%
xRot = U'*x;                                      % rotate into the eigenbasis
xTilde = U(:,1:k)'*x;                             % keep only the top-k components
xHat = U*[xTilde; zeros(size(x,1)-k, size(x,2))]; % rotate back to the original basis
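% (Equivalently, xHat = U(:,1:k)*xTilde, since the zero-padded rows
%  contribute nothing to the product.)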
% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.

figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite. 

epsilon = 1e-1;
%%% YOUR CODE HERE %%%
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * xRot;  % rescale each component
covar = xPCAWhite*xPCAWhite'/size(x,2);  % divide by m so the diagonal is close to 1
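% (With regularisation the diagonal entries become lambda_i/(lambda_i+epsilon),
%  which explains the "close to 1, gradually smaller" pattern described in Step 4b.)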
%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

%%% YOUR CODE HERE %%%

% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

%%% YOUR CODE HERE %%%
xZCAWhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));
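
A quick sanity check one can append (a sketch: with epsilon close to 0, the ZCA-whitened covariance should be near the identity; with epsilon = 1e-1 the diagonal falls below 1, as in Step 4b):

covarZCA = xZCAWhite*xZCAWhite'/size(x,2);          % covariance after ZCA whitening
fprintf('max |covar - I| = %g\n', ...
        max(max(abs(covarZCA - eye(size(covarZCA,1))))));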

That's all for now; as I read and implement the rest of the code, I will write it up here as well...
