Deep Learning 5: PCA and Whitening_Exercise (Stanford UFLDL Deep Learning Tutorial)

Preface

This post works through Exercise: PCA and Whitening.

For the theory, see the UFLDL Tutorial.

Experiment: randomly sample 10,000 12x12 image patches from 10 512x512 natural images, run PCA on the patches so as to retain 99% of the variance, and finally apply PCA whitening and ZCA whitening to the patches and compare the two.
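For intuition, here is a minimal sketch of what this sampling step looks like; in the exercise itself, sampleIMAGESRAW does this for you, and the IMAGES array below (assumed to be a 512x512x10 array holding the ten raw images) is hypothetical:

% Hypothetical sketch of the patch sampling performed by sampleIMAGESRAW.
% Assumes IMAGES is a 512x512x10 array holding the 10 natural images.
patchSize = 12; numPatches = 10000;
x = zeros(patchSize * patchSize, numPatches);
for i = 1:numPatches
    img = randi(size(IMAGES, 3));          % pick one of the 10 images at random
    r = randi(512 - patchSize + 1);        % random top-left corner
    c = randi(512 - patchSize + 1);
    patch = IMAGES(r:r+patchSize-1, c:c+patchSize-1, img);
    x(:, i) = patch(:);                    % flatten to a 144-dimensional column
end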

Steps and Results

1. Load the image data to obtain 10,000 patches as the raw data x, a 144x10000 matrix, and display 200 randomly selected patches. The result is shown below:

2. Zero-mean each patch (subtract each patch's mean pixel intensity).
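In symbols, writing x^{(j)} for the j-th patch (a 144-dimensional column of x):

\mu^{(j)} = \frac{1}{144} \sum_{i=1}^{144} x_i^{(j)}, \qquad x^{(j)} \leftarrow x^{(j)} - \mu^{(j)}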

3. First step of PCA: compute the covariance matrix sigma of the zero-meaned data x, obtain U (the eigenvectors of sigma, i.e., the basis) from svd on sigma, and project (rotate) x onto this basis to get the new data xRot.
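Concretely, with m = 10000 patches stored as the columns of x:

\Sigma = \frac{1}{m} x x^T, \qquad \Sigma = U S U^T \;\text{(from svd)}, \qquad x_{\text{rot}} = U^T x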

4. Check that the first step of PCA is correct: simply visualize the covariance matrix of xRot. If the implementation is correct, the image shows a straight diagonal line against a blue background. The result is shown below:

5. Compute the number of principal components k required to retain 99% of the variance.
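Writing \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{144} for the diagonal entries of S, k is the smallest index at which the cumulative variance ratio reaches 99%:

k = \min\left\{ k \;:\; \frac{\sum_{j=1}^{k} \lambda_j}{\sum_{j=1}^{144} \lambda_j} \ge 0.99 \right\}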

6. Second step of PCA: project x onto the first k eigenvectors to get xTilde = U(:,1:k)' * x, zero-pad xTilde back to 144 rows, and multiply by the basis U to obtain the dimension-reduced data xHat expressed in the original basis. xHat is shown below:
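In matrix form, the reduction and reconstruction read:

\tilde{x} = U_{:,1:k}^{T}\, x \in \mathbb{R}^{k \times 10000}, \qquad \hat{x} = U \begin{bmatrix} \tilde{x} \\ 0 \end{bmatrix}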

For comparison, the zero-meaned data before dimension reduction is shown below:

7. Apply PCA whitening to the zero-meaned data x to obtain the whitened data xPCAWhite, shown below:
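Each rotated component is rescaled by the inverse square root of its regularized eigenvalue, with \lambda_i = S_{ii} and \epsilon = 0.1 in this exercise:

x_{\text{PCAwhite},i} = \frac{x_{\text{rot},i}}{\sqrt{\lambda_i + \epsilon}}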

8. Check whether the PCA whitening is regularized by visualizing the covariance matrix of xPCAWhite. Without regularization (epsilon set to 0 or nearly 0), the covariance matrix of xPCAWhite is the identity matrix, so the plot shows a red line running along the diagonal of a blue background. With regularization, the i-th diagonal entry is lambda_i / (lambda_i + epsilon), which starts close to 1 and gradually becomes smaller, so the plot shows a diagonal line fading from red to blue against a blue background. The result is shown below:

9. Apply ZCA whitening on top of PCA whitening: xZCAWhite = U * xPCAWhite. Since PCA whitening has already been checked and ZCA whitening builds directly on it, no further check is needed. The ZCA whitening result is shown below:

Compared with the PCA whitening result, ZCA whitening stays much closer to the original data, because it rotates the whitened components back into the original pixel basis.

The corresponding zero-meaned original data is shown below:

Code

pca_gen.m

close all;
% clear all;
%%================================================================
%% Step 0a: Load data
% Here we provide the code to load natural image data into x.
% x will be a 144 * 10000 matrix, where the kth column x(:, k) corresponds to
% the raw image data from the kth 12x12 image patch sampled.
% You do not need to change the code below.

x = sampleIMAGESRAW();
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
% You can make use of the mean and repmat/bsxfun functions.

% -------------------- YOUR CODE HERE --------------------
avg = mean(x, 1);                 % mean of each column of x (the per-patch mean)
x = x - repmat(avg, size(x, 1), 1);
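% Equivalent one-liner using bsxfun, which avoids materialising the repmat:
% x = bsxfun(@minus, x, avg);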
%%================================================================
%% Step 1a: Implement PCA to obtain xRot
% Implement PCA to obtain xRot, the matrix in which the data is expressed
% with respect to the eigenbasis of sigma, which is the matrix U.

% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
sigma = x * x' / size(x, 2);   % covariance matrix of the zero-meaned data
[U, S, V] = svd(sigma);        % columns of U are the eigenvectors of sigma
xRot = U' * x;                 % rotate the data into the eigenbasis

%%================================================================
%% Step 1b: Check your implementation of PCA
% The covariance matrix for the data expressed with respect to the basis U
% should be a diagonal matrix with non-zero entries only along the main
% diagonal. We will verify this here.
% Write code to compute the covariance matrix, covar.
% When visualised as an image, you should see a straight line across the
% diagonal (non-zero entries) against a blue background (zero entries).

% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(x, 1)); % You need to compute this
covar = xRot * xRot' / size(xRot, 2);
% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
% Write code to determine k, the number of components to retain in order
% to retain at least 99% of the variance.

% -------------------- YOUR CODE HERE --------------------
k = 0; % Set k accordingly
sum_k = 0;
total = trace(S);                  % total variance (sum of all eigenvalues)
for k = 1:size(S, 1)
    sum_k = sum_k + S(k, k);
    if (sum_k / total >= 0.99)     % use 0.9 instead to retain 90% of the variance
        break;
    end
end
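% An equivalent vectorized alternative (a sketch that yields the same k):
% lambda = diag(S);
% k = find(cumsum(lambda) / sum(lambda) >= 0.99, 1);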

%%================================================================
%% Step 3: Implement PCA with dimension reduction
% Now that you have found k, you can reduce the dimension of the data by
% discarding the remaining dimensions. In this way, you can represent the
% data in k dimensions instead of the original 144, which will save you
% computational time when running learning algorithms on the reduced
% representation.
%
% Following the dimension reduction, invert the PCA transformation to produce
% the matrix xHat, the dimension-reduced data with respect to the original basis.
% Visualise the data and compare it to the raw data. You will observe that
% there is little loss due to throwing away the principal components that
% correspond to dimensions with low variation.

% -------------------- YOUR CODE HERE --------------------
xHat = zeros(size(x)); % You need to compute this
xTilde = U(:, 1:k)' * x;    % project onto the top k eigenvectors (k x 10000)
xHat(1:k, :) = xTilde;      % zero-pad back to 144 dimensions
xHat = U * xHat;            % rotate back into the original basis

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.

figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
% Implement PCA with whitening and regularisation to produce the matrix
% xPCAWhite.

epsilon = 0.1;
xPCAWhite = zeros(size(x));

% -------------------- YOUR CODE HERE --------------------
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;   % rescale each rotated component by 1/sqrt(lambda_i + epsilon)

figure('name','PCA whitened images');
display_network(xPCAWhite(:,randsel));

%%================================================================
%% Step 4b: Check your implementation of PCA whitening
% Check your implementation of PCA whitening with and without regularisation.
% PCA whitening without regularisation results in a covariance matrix
% that is equal to the identity matrix. PCA whitening with regularisation
% results in a covariance matrix with diagonal entries starting close to
% 1 and gradually becoming smaller. We will verify these properties here.
% Write code to compute the covariance matrix, covar.
%
% Without regularisation (set epsilon to 0 or close to 0),
% when visualised as an image, you should see a red line across the
% diagonal (entries equal to one) against a blue background (zero entries).
% With regularisation, you should see a red line that slowly turns
% blue across the diagonal, corresponding to the one entries slowly
% becoming smaller.

% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(xPCAWhite, 1));
covar = xPCAWhite * xPCAWhite' / size(xPCAWhite, 2);
% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
% Now implement ZCA whitening to produce the matrix xZCAWhite.
% Visualise the data and compare it to the raw data. You should observe
% that whitening results in, among other things, enhanced edges.

xZCAWhite = zeros(size(x));

% -------------------- YOUR CODE HERE --------------------
xZCAWhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;   % equivalently, U * xPCAWhite
% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));


References:

http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial

Deep Learning 3: Preprocessing with Principal Component Analysis and Whitening, a Summary (Stanford UFLDL Deep Learning Tutorial)

Deep learning: 12 (PCA and whitening exercise on natural images)

