UFLDL has released a new tutorial, and it feels better than the old one: it starts from the basics, is systematic and clear, and comes with programming exercises.
In a deep learning discussion group I heard some veterans say there's no need to dig deep into other machine learning algorithms first; you can jump straight into DL.
So I've recently started working through it, and the tutorial plus MATLAB programming is a great combination.
The new tutorial is at: http://ufldl.stanford.edu/tutorial/
This section: http://ufldl.stanford.edu/tutorial/supervised/DebuggingGradientChecking/
A gradient can be understood as the derivative, or slope, from high-school calculus.
Gradient checking applies the idea of a limit: take an extremely short step and compute the slope between the two resulting points.
Of course an infinitesimally short step is an idealization; in practice something around 10^-4 is used.
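Concretely, the check compares the analytic gradient against a two-sided (central) finite difference. For each randomly chosen coordinate j, with e_j the j-th unit vector and \epsilon the step size:

\hat{g}_j = \frac{J(\theta + \epsilon\, e_j) - J(\theta - \epsilon\, e_j)}{2\epsilon}

(The grad_check.m shown below uses delta = 1e-3 as its \epsilon.)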
The programming assignment for this section is to check the gradients of the earlier linear regression and logistic regression implementations.
I simply used the provided ex1/grad_check.m to do the checking.
%
% This exercise uses data from the UCI repository:
%   Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
%   http://archive.ics.uci.edu/ml
%   Irvine, CA: University of California, School of Information and Computer Science.
%
% Data created by:
%   Harrison, D. and Rubinfeld, D.L.
%   ''Hedonic prices and the demand for clean air''
%   J. Environ. Economics & Management, vol.5, 81-102, 1978.
%
addpath "../common"
addpath "../common/minFunc_2012/minFunc"
addpath "../common/minFunc_2012/minFunc/compiled"

% Load housing data from file.
data = load('housing.data');
data = data'; % put examples in columns -- careful, the data is transposed here; this tripped me up for a while

% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];

% Shuffle examples.
data = data(:, randperm(size(data,2)));

% Split into train and test sets.
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);
test.X = data(1:end-1,401:end);
test.y = data(end,401:end);

m = size(train.X,2);
n = size(train.X,1);

% Initialize the coefficient vector theta to random values.
theta = rand(n,1);

% Run the minFunc optimizer with linear_regression.m as the objective.
%
% TODO: Implement the linear regression objective and gradient computations
% in linear_regression.m
%
tic;
%options = struct('MaxIter', 200);
%theta = minFunc(@linear_regression, theta, options, train.X, train.y);
%fprintf('Optimization took %f seconds.\n', toc);
grad_check(@linear_regression, theta, 200, train.X, train.y);
The key part is just the last line, which calls grad_check; everything before it loads the data.
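For grad_check to work, linear_regression must return both the objective value and the gradient. The tutorial leaves that as the TODO above; here is a minimal sketch of what such a linear_regression.m can look like (squared-error objective with a vectorized gradient — my own fill-in, not the official solution):

function [f,g] = linear_regression(theta, X, y)
  % theta: n x 1 parameters; X: n x m examples in columns; y: 1 x m targets.
  r = theta' * X - y;       % 1 x m residuals
  f = 0.5 * sum(r .^ 2);    % squared-error objective
  g = X * r';               % n x 1 gradient
end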
In the output, the third column is the gradient error, the fourth is the gradient computed by linear_regression, and the fifth is the estimated gradient.
The gradient errors look fairly large, but they don't change the order of magnitude of the gradients.
One more thing: I'm using Octave, which apparently has no randsample function. Some people say only older versions of MATLAB have it, and I'm not even on MATLAB.
Fortunately, randperm can be used as a replacement here, as shown in the code below.
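Note that the two-argument form randperm(n,k) itself only exists in newer releases (I believe MATLAB R2011b and later, and likewise only newer Octave versions). If it's unavailable, a sketch of an equivalent that works everywhere:

idx = randperm(numel(T));  % random permutation of all indices
j = idx(1);                % take the first, i.e. one uniformly random index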
The modified grad_check.m:
function average_error = grad_check(fun, theta0, num_checks, varargin)
  delta = 1e-3;
  sum_error = 0;

  % Header order fixed to match the values printed below.
  fprintf(' Iter       i             err');
  fprintf('               g           g_est               f\n');

  for i = 1:num_checks
    T = theta0;
    %j = randsample(numel(T),1);  % randsample is unavailable in Octave
    j = randperm(numel(T), 1);    % pick one random coordinate to check

    % Perturb coordinate j by +/- delta.
    T0 = T; T0(j) = T0(j) - delta;
    T1 = T; T1(j) = T1(j) + delta;

    [f,g] = fun(T, varargin{:});  % analytic objective and gradient
    f0 = fun(T0, varargin{:});
    f1 = fun(T1, varargin{:});

    g_est = (f1 - f0) / (2*delta);  % central finite-difference estimate
    error = abs(g(j) - g_est);

    fprintf('% 5d  % 6d % 15g % 15f % 15f % 15f\n', ...
            i, j, error, g(j), g_est, f);

    sum_error = sum_error + error;
  end

  % Fixed: the original assigned to 'average', so average_error was never set.
  average_error = sum_error / num_checks;
end
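The assignment also asks to check the logistic regression gradient the same way. A minimal sketch of a logistic_regression.m that grad_check can consume (again my own fill-in, assuming binary labels y in {0,1} and the same examples-in-columns layout):

function [f,g] = logistic_regression(theta, X, y)
  % theta: n x 1 parameters; X: n x m examples in columns; y: 1 x m binary labels.
  h = 1 ./ (1 + exp(-theta' * X));               % sigmoid predictions, 1 x m
  f = -sum(y .* log(h) + (1 - y) .* log(1 - h)); % negative log-likelihood
  g = X * (h - y)';                              % n x 1 gradient
end

With that in place, grad_check(@logistic_regression, theta, 200, train.X, train.y) runs the same check.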
Author: linger
Original post: http://blog.csdn.net/lingerlanlan/article/details/38390955