UFLDL Tutorial Notes and Exercise Answers V (Linear Decoders with Autoencoders & Working with Large Images)

Linear Decoders with Autoencoders

The motivation for the linear decoder comes from the last layer of the sparse autoencoder: if the output layer uses a sigmoid activation, then, because the autoencoder is trained to make its output equal to its input and the sigmoid's range is [0, 1], the inputs are implicitly constrained to lie in [0, 1] as well. To remove this implicit constraint on the input features, we let the output layer use a linear activation function instead, i.e. a = z.
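
The only change relative to the ordinary sparse autoencoder is the output-layer activation and, consequently, the output-layer error term in backpropagation: with f(z) = z we have f'(z) = 1, so the f'(z3) factor disappears. A minimal sketch of the difference, with purely illustrative values (the variable names here are not the ones used in the exercise code below):

z3 = [0.3; -1.2];                             % hypothetical output-layer pre-activations
y  = [0.5;  0.1];                             % hypothetical targets (equal to the inputs for an autoencoder)

a3_sigmoid = 1 ./ (1 + exp(-z3));             % sigmoid decoder (earlier exercise)
delta3_sigmoid = -(y - a3_sigmoid) .* a3_sigmoid .* (1 - a3_sigmoid);

a3_linear = z3;                               % linear decoder: a = z
delta3_linear = -(y - a3_linear);             % no f'(z) factor is needed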

Exercise answers:

sparseAutoencoderLinearCost.m

function [cost,grad,features] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
                                                            lambda, sparsityParam, beta, data)
% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Copy sparseAutoencoderCost in sparseAutoencoderCost.m from your
%   earlier exercise onto this file, renaming the function to
%   sparseAutoencoderLinearCost, and changing the autoencoder to use a
%   linear decoder.
% -------------------- YOUR CODE HERE --------------------      

% visibleSize: the number of input units (probably 64)
% hiddenSize: the number of hidden units (probably 25)
% lambda: weight decay parameter
% sparsityParam: The desired average activation for the hidden units (denoted in the lecture
%                           notes by the greek alphabet rho, which looks like a lower-case "p").
% beta: weight of sparsity penalty term
% data: Our 64x10000 matrix containing the training data.  So, data(:,i) is the i-th training example. 

% The input theta is a vector (because minFunc expects the parameters to be a vector).
% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes. 

W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);    % W1 is 25x64
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);  % W2 is 64x25
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);     % b1 is 25x1
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);              % b2 is 64x1

% Cost and gradient variables (your code needs to compute these values).
% Here, we initialize them to zeros.
cost = 0;
W1grad = zeros(size(W1));      % W1grad is 25x64
W2grad = zeros(size(W2));      % W2grad is 64x25
b1grad = zeros(size(b1));      % 25x1 (hidden)
b2grad = zeros(size(b2));      % 64x1 (visible)

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost/optimization objective J_sparse(W,b) for the Sparse Autoencoder,
%                and the corresponding gradients W1grad, W2grad, b1grad, b2grad.
%
% W1grad, W2grad, b1grad and b2grad should be computed using backpropagation.
% Note that W1grad has the same dimensions as W1, b1grad has the same dimensions
% as b1, etc.  Your code should set W1grad to be the partial derivative of J_sparse(W,b) with
% respect to W1.  I.e., W1grad(i,j) should be the partial derivative of J_sparse(W,b)
% with respect to the input parameter W1(i,j).  Thus, W1grad should be equal to the term
% [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the last block of pseudo-code in Section 2.2
% of the lecture notes (and similarly for W2grad, b1grad, b2grad).
%
% Stated differently, if we were using batch gradient descent to optimize the parameters,
% the gradient descent update to W1 would be W1 := W1 - alpha * W1grad, and similarly for W2, b1, b2.
% 

% 1. Forward propagation
data_size=size(data);           % [64, 10000]
active_value2=repmat(b1,1,data_size(2));    % replicate b1 across the 10000 columns, 25x10000
active_value3=repmat(b2,1,data_size(2));    % replicate b2 across the 10000 columns, 64x10000
active_value2=sigmoid(W1*data+active_value2);   % hidden activations, 25x10000 (one column per sample)
active_value3=W2*active_value2+active_value3;   % output activations, 64x10000 (one column per sample); linear decoder, so no sigmoid here
% 2. Compute the cost and the error terms
ave_square=sum(sum((active_value3-data).^2)./2)/data_size(2);   % first cost term: average squared reconstruction error
weight_decay=lambda/2*(sum(sum(W1.^2))+sum(sum(W2.^2)));        % second cost term: weight decay (sum of squared weights)

p_real=sum(active_value2,2)./data_size(2);       % estimated average activation rho_hat of the hidden units, 25x1
p_para=repmat(sparsityParam,hiddenSize,1);       % target sparsity rho
sparsity=beta.*sum(p_para.*log(p_para./p_real)+(1-p_para).*log((1-p_para)./(1-p_real)));   % KL-divergence sparsity penalty
cost=ave_square+weight_decay+sparsity;           % final cost function

delta3=(active_value3-data);      % output-layer error, 64x10000 (one column per sample); no f'(z) factor because the decoder is linear
average_sparsity=repmat(sum(active_value2,2)./data_size(2),1,data_size(2));  % rho_hat replicated across the samples
default_sparsity=repmat(sparsityParam,hiddenSize,data_size(2));              % target sparsity rho replicated across the samples
sparsity_penalty=beta.*(-(default_sparsity./average_sparsity)+((1-default_sparsity)./(1-average_sparsity)));  % sparsity term added to the hidden error
delta2=(W2'*delta3+sparsity_penalty).*((active_value2).*(1-active_value2));  % hidden-layer error, 25x10000 (one column per sample)
% 3. Backpropagation: accumulate the gradients
W2grad=delta3*active_value2'./data_size(2)+lambda.*W2;      % gradient, 64x25
W1grad=delta2*data'./data_size(2)+lambda.*W1;               % gradient, 25x64
b2grad=sum(delta3,2)./data_size(2);                         % 64x1 (visible)
b1grad=sum(delta2,2)./data_size(2);                         % 25x1 (hidden)

%-------------------------------------------------------------------
% After computing the cost and gradient, we will convert the gradients back
% to a vector format (suitable for minFunc).  Specifically, we will unroll
% your gradient matrices into a vector.

grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];                              

end

%-------------------------------------------------------------------
% Here's an implementation of the sigmoid function, which you may find useful
% in your computation of the costs and the gradients.  This inputs a (row or
% column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)). 

function sigm = sigmoid(x)

    sigm = 1 ./ (1 + exp(-x));
end
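
Before training on the full set of color patches, it is worth verifying the analytic gradient numerically on a small debug problem, as in the earlier autoencoder exercise. A minimal sketch, assuming computeNumericalGradient.m and initializeParameters.m from the previous exercises are on the path (the sizes and hyperparameters below are debug values, not the training values):

debugHiddenSize = 5;
debugVisibleSize = 8;
patches = rand(8, 10);                               % random data is enough for a gradient check
theta = initializeParameters(debugHiddenSize, debugVisibleSize);

[cost, grad] = sparseAutoencoderLinearCost(theta, debugVisibleSize, debugHiddenSize, ...
    0.01, 0.035, 5.0, patches);                      % lambda, sparsityParam, beta

numGrad = computeNumericalGradient( ...
    @(x) sparseAutoencoderLinearCost(x, debugVisibleSize, debugHiddenSize, ...
                                     0.01, 0.035, 5.0, patches), theta);

disp(norm(numGrad - grad) / norm(numGrad + grad));   % should be on the order of 1e-9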

Working with Large Images

Working with large images relies mainly on convolution and pooling. Convolution is motivated by an inherent property of natural images: the statistics of one part of the image are the same as those of any other part. This means that features learned on one part of the image can also be applied to any other part, so we can use the same learned features at every location in the image.
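
Concretely, a feature learned on patchDim x patchDim patches is applied by convolving it over every position of the large image, which is what conv2 with the 'valid' option does in cnnConvolve.m below. A tiny sketch with illustrative sizes:

im      = rand(64, 64);                 % one channel of a large image
feature = rand(8, 8);                   % one learned patch-sized feature

% 'valid' keeps only the positions where the feature fits entirely inside
% the image, giving a (64-8+1) x (64-8+1) = 57x57 response map
response = conv2(im, feature, 'valid');
size(response)                          % -> 57 57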

Pooling is motivated by the fact that convolution produces a very large number of features, which easily leads to overfitting. Because images have this "stationarity" property, a feature that is useful in one region is very likely to be useful in another region as well, so we can aggregate statistics of the features at different locations (mean pooling or max pooling).
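
For example, mean pooling simply averages the convolved response over each disjoint poolDim x poolDim region (replacing the mean with a max gives max pooling). A tiny sketch with illustrative sizes:

response = reshape(1:16, 4, 4);         % a 4x4 convolved response map
poolDim  = 2;
pooled   = zeros(2, 2);
for r = 1:2
    for c = 1:2
        region = response((r-1)*poolDim+1 : r*poolDim, (c-1)*poolDim+1 : c*poolDim);
        pooled(r, c) = mean(region(:)); % use max(region(:)) for max pooling
    end
end
disp(pooled);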

Exercise answers

cnnConvolve.m

function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)  % patchDim = 8, numFeatures = hiddenSize
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

numImages = size(images, 4);
imageDim = size(images, 1);        %% = 64
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with fewer images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

WT = W*ZCAWhite;              % fold the ZCA whitening into the weights (see the derivation in the exercise)
b_mean = b - WT * meanPatch;  % fold the mean subtraction into the bias
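% Why this works (sketch): every training patch x was preprocessed as
%   x_white = ZCAWhite * (x - meanPatch)
% so the autoencoder's hidden activation was
%   sigmoid(W * x_white + b) = sigmoid((W*ZCAWhite)*x + (b - W*ZCAWhite*meanPatch))
% Convolving the raw image with WT and adding b_mean therefore reproduces
% the same activations without explicitly whitening every patch.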

% --------------------------------------------------------
patchSize = patchDim * patchDim;

for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:imageChannels

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      feature = zeros(8,8); % You should replace this
      offset = (channel -1) * patchSize;
      feature = reshape(WT(featureNum, offset+1 : offset+patchSize), patchDim, patchDim);

      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      convolvedoneChannel = conv2(im, feature, 'valid');       % 'valid' convolution of this channel
      convolvedImage = convolvedImage + convolvedoneChannel;   % sum the contributions of the three channels

      % ------------------------

    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = sigmoid(convolvedImage + b_mean(featureNum));    % hidden activation: sigmoid of the bias-corrected convolution

    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
    sigm = 1./(1+exp(-x));
end

cnnPool.m

function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(featureNum, imageNum, poolRow, poolCol)
%     

numImages = size(convolvedFeatures, 2);
numFeatures = size(convolvedFeatures, 1);
convolvedDim = size(convolvedFeatures, 3);

resultDim  = floor(convolvedDim / poolDim);
pooledFeatures = zeros(numFeatures, numImages, resultDim, resultDim);

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the
%   numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim)
%   matrix pooledFeatures, such that
%   pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region
%   (see http://ufldl/wiki/index.php/Pooling )
%
%   Use mean pooling here.
% -------------------- YOUR CODE HERE --------------------

for imageNum = 1:numImages
    for featureNum = 1:numFeatures
        for poolRow = 1:resultDim
            offsetRow = 1+(poolRow-1)*poolDim;
            for poolCol = 1:resultDim
                offsetCol = 1 + (poolCol-1)*poolDim;
                patch = convolvedFeatures(featureNum, imageNum, offsetRow:offsetRow+poolDim-1, offsetCol:offsetCol+poolDim-1);
                pooledFeatures(featureNum, imageNum, poolRow, poolCol) = mean(patch(:));
            end
        end
    end
end

end

cnnExercise.m

%% CS294A/CS294W Convolutional Neural Networks Exercise

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  convolutional neural networks exercise. In this exercise, you will only
%  need to modify cnnConvolve.m and cnnPool.m. You will not need to modify
%  this file.

%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

imageDim = 64;         % image dimension
imageChannels = 3;     % number of channels (rgb, so 3)

patchDim = 8;          % patch dimension
numPatches = 50000;    % number of patches

visibleSize = patchDim * patchDim * imageChannels;  % number of input units
outputSize = visibleSize;   % number of output units
hiddenSize = 400;           % number of hidden units 

epsilon = 0.1;	       % epsilon for ZCA whitening

poolDim = 19;          % dimension of pooling region

%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn
%  features from color patches. If you have completed the linear decoder
%  exercise, use the features that you have obtained from that exercise,
%  loading them into optTheta. Recall that we have to keep around the
%  parameters used in whitening (i.e., the ZCA whitening matrix and the
%  meanPatch)

% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with
% the optimal parameters:

optTheta =  zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);
ZCAWhite =  zeros(visibleSize, visibleSize);
meanPatch = zeros(visibleSize, 1);

load STL10Features.mat;

% --------------------------------------------------------------------

% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

displayColorNetwork( (W*ZCAWhite)');

%%======================================================================
%% STEP 2: Implement and test convolution and pooling
%  In this step, you will implement convolution and pooling, and test them
%  on a small part of the data set to ensure that you have implemented
%  these two functions correctly. In the next step, you will actually
%  convolve and pool the features with the STL10 images.

%% STEP 2a: Implement convolution
%  Implement convolution in the function cnnConvolve in cnnConvolve.m

% Note that we have to preprocess the images in the exact same way
% we preprocessed the patches before we can obtain the feature activations.

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels

%% Use only the first 8 images for testing
convImages = trainImages(:, :, :, 1:8); 

% NOTE: Implement cnnConvolve in cnnConvolve.m first!
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);

%% STEP 2b: Checking your convolution
%  To ensure that you have convolved the features correctly, we have
%  provided some code to compare the results of your convolution with
%  activations from the sparse autoencoder

% For 1000 random points
for i = 1:1000
    featureNum = randi([1, hiddenSize]);
    imageNum = randi([1, 8]);
    imageRow = randi([1, imageDim - patchDim + 1]);
    imageCol = randi([1, imageDim - patchDim + 1]);    

    patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
    patch = patch(:);
    patch = patch - meanPatch;
    patch = ZCAWhite * patch;

    features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch); 

    if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
        fprintf('Convolved feature does not match activation from autoencoder\n');
        fprintf('Feature Number    : %d\n', featureNum);
        fprintf('Image Number      : %d\n', imageNum);
        fprintf('Image Row         : %d\n', imageRow);
        fprintf('Image Column      : %d\n', imageCol);
        fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
        fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));
        error('Convolved feature does not match activation from autoencoder');
    end
end

disp('Congratulations! Your convolution code passed the test.');

%% STEP 2c: Implement pooling
%  Implement pooling in the function cnnPool in cnnPool.m

% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);

%% STEP 2d: Checking your pooling
%  To ensure that you have implemented pooling, we will use your pooling
%  function to pool over a test matrix and check the results.

testMatrix = reshape(1:64, 8, 8);
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
                  mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8))); ];

testMatrix = reshape(testMatrix, 1, 1, 8, 8);

pooledFeatures = squeeze(cnnPool(4, testMatrix));

if ~isequal(pooledFeatures, expectedMatrix)
    disp('Pooling incorrect');
    disp('Expected');
    disp(expectedMatrix);
    disp('Got');
    disp(pooledFeatures);
else
    disp('Congratulations! Your pooling code passed the test.');
end

%%======================================================================
%% STEP 3: Convolve and pool with the dataset
%  In this step, you will convolve each of the features you learned with
%  the full large images to obtain the convolved features. You will then
%  pool the convolved features to obtain the pooled features for
%  classification.
%
%  Because the convolved features matrix is very large, we will do the
%  convolution and pooling 50 features at a time to avoid running out of
%  memory. Reduce this number if necessary

stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages,  testImages,  testLabels

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );

tic();

for convPart = 1:(hiddenSize / stepSize)

    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;

    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);    

    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();

    clear convolvedFeaturesThis pooledFeaturesThis;

end

% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();

%%======================================================================
%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.