1. Generalized Linear Models
A generalized linear model must satisfy three assumptions:
The first assumption is that, given x and the parameters theta, the distribution of y belongs to some exponential family with natural parameter eta.
The second assumption is that, given x, the goal is to output the conditional expectation of T(y); this T(y) usually equals y, although there are cases where it does not.
The third assumption specifies the variable eta introduced in assumption one: eta is taken to depend linearly on x. The three assumptions are written out in formulas below.
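In the standard formulation (the notation here is the conventional one, assumed rather than quoted), the three assumptions read:

$$
\begin{aligned}
&\text{(1)}\quad y \mid x;\theta \;\sim\; \mathrm{ExponentialFamily}(\eta) \\
&\text{(2)}\quad h_\theta(x) \;=\; \mathbb{E}\,[\,T(y)\mid x\,],\qquad \text{usually } T(y)=y \\
&\text{(3)}\quad \eta \;=\; \theta^{T}x
\end{aligned}
$$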
2. The Exponential Family
The exponential family was mentioned above, so here is its definition: the distributions that can be written in the following form make up the exponential family, where a, b, and T are fixed functions.
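In the usual notation, with eta the natural parameter, T(y) the sufficient statistic, a(eta) the log-partition function and b(y) the base measure (this is the standard form matching the a, b, T above):

$$
p(y;\eta) \;=\; b(y)\,\exp\!\big(\eta^{T}T(y) - a(\eta)\big)
$$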
3. Deriving the Logistic Function
Logistic regression assumes that P(y|x) follows a Bernoulli distribution, i.e. P(y=1|x;phi) = phi.
Our goal is, given x, to build a model for the parameter phi and thereby obtain phi as a function of x. How to choose this model is the question; we now rewrite the Bernoulli posterior probability in exponential-family form, which yields the model of phi in terms of x.
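The rewriting is the standard one:

$$
p(y;\phi) \;=\; \phi^{y}(1-\phi)^{1-y}
\;=\; \exp\!\Big(y\log\tfrac{\phi}{1-\phi} + \log(1-\phi)\Big),
$$

so the natural parameter is $\eta = \log\frac{\phi}{1-\phi}$, and inverting gives

$$
\phi \;=\; \frac{1}{1+e^{-\eta}} .
$$

Combined with the third assumption, $\eta = \theta^{T}x$, this gives $\phi = \dfrac{1}{1+e^{-\theta^{T}x}}$.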
We now have the model of phi in terms of x, with parameter theta. At the same time we set up the hypothesis for our classifier, shown below.
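The hypothesis is the logistic (sigmoid) function of $\theta^{T}x$:

$$
h_\theta(x) \;=\; P(y=1\mid x;\theta) \;=\; \frac{1}{1+e^{-\theta^{T}x}} .
$$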
This means that once we have the parameter theta, we can compute the probability that y = 1 for any given x; the probability that y = 0 then follows immediately, and the problem is solved. The next section explains how to solve for the parameter theta.
4. Objective Function and Gradient
Now that we know the form of the logistic regression model, we estimate theta by maximum likelihood in order to obtain the optimal parameters.
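For m training examples $(x^{(i)}, y^{(i)})$, the log-likelihood and its partial derivatives are:

$$
\ell(\theta) = \sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big],
\qquad
\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m}\big(y^{(i)} - h_\theta(x^{(i)})\big)\,x_j^{(i)} .
$$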
With the derivative of the objective in hand, steepest ascent on the log-likelihood (equivalently, steepest descent on its negative) gives the parameter update rule below.
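With learning rate $\alpha$:

$$
\theta_j := \theta_j + \alpha\sum_{i=1}^{m}\big(y^{(i)} - h_\theta(x^{(i)})\big)\,x_j^{(i)} .
$$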
Newton's method can also be used to solve for the optimum; it requires the Hessian matrix, given below.
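For the log-likelihood above the Hessian is

$$
H \;=\; \frac{\partial^{2}\ell(\theta)}{\partial\theta\,\partial\theta^{T}}
\;=\; -\sum_{i=1}^{m} h_\theta(x^{(i)})\big(1-h_\theta(x^{(i)})\big)\,x^{(i)}{x^{(i)}}^{T}
\;=\; -X S X^{T},
$$

where X is the matrix whose columns are the $x^{(i)}$ (the same layout as inputData in the code below) and $S = \mathrm{diag}\big(h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))\big)$.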
The Newton update for the parameters is:
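$$
\theta := \theta - H^{-1}\nabla_\theta\,\ell(\theta) .
$$

Since H is negative (semi)definite here, this step moves uphill on the log-likelihood.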
Of course, other optimisation algorithms such as BFGS and L-BFGS can be used as well.
5. MATLAB Experiment
The experiment uses the MNIST database, restricted to the handwritten digits 0 and 1, and solves for theta with the gradient-based update described above.
%%======================================================================
%% STEP 0: Initialise constants and parameters
%
%  Here we define and initialise some constants which allow the code
%  to be used more generally on any arbitrary input.
%  We also initialise some parameters used for tuning the model.

inputSize  = 28 * 28 + 1;  % Size of input vector (28x28 MNIST image plus one bias entry)
numClasses = 2;            % Number of classes (only the digits 0 and 1 are used here)

% lambda = 1e-4;           % Weight decay parameter (not used in this experiment)

%%======================================================================
%% STEP 1: Load data
%
%  In this section, we load the input and output data.
%  For logistic regression on MNIST pixels, the input data is the images
%  and the output data is the labels.
%
%  Change the filenames if you've saved the files under different names.
%  On some platforms, the files might be saved as
%  train-images.idx3-ubyte / train-labels.idx1-ubyte

images = loadMNISTImages('mnist/train-images-idx3-ubyte');
labels = loadMNISTLabels('mnist/train-labels-idx1-ubyte');

% Keep only the examples labelled 0 or 1
index  = (labels == 0 | labels == 1);
images = images(:, index);
labels = labels(index);

% Append a row of ones so that the last entry of theta acts as a bias term
inputData = [images; ones(1, size(images, 2))];

%%======================================================================
%% STEP 2: Cost and gradient
%
%  The cost and gradient are computed inside logisticTrain below; a
%  separate logisticCost function could be used here for gradient checking.
%
% [cost, grad] = logisticCost(theta, inputSize, inputData, labels);

%%======================================================================
%% STEP 4: Learning parameters
%
%  Train the logistic regression classifier with logisticTrain, which
%  initialises theta randomly and runs gradient ascent on the
%  log-likelihood.

options.maxIter = 100;
options.alpha   = 0.1;
options.method  = 'Grad';

theta = logisticTrain(inputData, labels, options);

% Although we only use 100 iterations here to train a classifier for the
% MNIST data set, in practice, training for more iterations is usually
% beneficial.

%%======================================================================
%% STEP 5: Testing
%
%  We now test the model against the test images. logisticPredict returns
%  the predicted labels given the learned theta and the input data.

images = loadMNISTImages('mnist/t10k-images-idx3-ubyte');
labels = loadMNISTLabels('mnist/t10k-labels-idx1-ubyte');

index  = (labels == 0 | labels == 1);
images = images(:, index);
labels = labels(index);
inputData = [images; ones(1, size(images, 2))];

pred = logisticPredict(theta, inputData);

acc = mean(labels(:) == pred(:));
fprintf('Accuracy: %0.3f%%\n', acc * 100);

% Accuracy is the proportion of correctly classified images.
% After 100 iterations, the results for our implementation were:
%
% Accuracy: 92.200%
%
% If your accuracy is noticeably lower, check your code for errors and make
% sure you are training on all of the 0 and 1 examples from the 60000-image
% training set (unless you modified the loading code, this should be the case).
function [modelTheta] = logisticTrain(inputData, labels, options)
% Train a binary logistic regression model by maximising the log-likelihood.
%   inputData - N x M matrix, one example per column (bias row already appended)
%   labels    - M x 1 vector of 0/1 labels
%   options   - struct with fields maxIter, alpha and method ('Grad' or 'Newton')

if ~exist('options', 'var')
    options = struct;
end
if ~isfield(options, 'maxIter')
    options.maxIter = 400;
end
if ~isfield(options, 'method')
    options.method = 'Newton';
end
if ~isfield(options, 'alpha')
    options.alpha = 0.01;
end

theta   = 0.005 * randn(size(inputData, 1), 1);  % random initialisation
maxIter = options.maxIter;
alpha   = options.alpha;
method  = options.method;
iter    = 1;

fprintf('iter\tStep Length\n');

while iter <= maxIter
    h = sigmoid(theta' * inputData);    % 1 x M vector of predicted probabilities
    % cost = sum(labels'.*log(h) + (1-labels').*log(1-h), 2) / size(inputData, 2);
    grad = inputData * (labels' - h)';  % gradient of the log-likelihood

    if strcmp(method, 'Grad')
        steps = alpha .* grad;          % gradient ascent step
    else
        % Newton step: H is minus the Hessian of the log-likelihood
        % (bsxfun avoids forming the M x M diagonal matrix explicitly)
        H = bsxfun(@times, inputData, h .* (1 - h)) * inputData';
        steps = H \ grad;
    end

    theta = theta + steps;

    stepLength = sum(steps.^2) / size(steps, 1);
    fprintf('%d\t%f\n', iter, stepLength);
    if stepLength < 1e-9
        break;
    end
    iter = iter + 1;
end

modelTheta = theta;

    function z = sigmoid(x)
        z = 1 ./ (1 + exp(-x));
    end
end
function [pred] = logisticPredict(theta, data)
% theta - parameter vector learned by logisticTrain
% data  - the N x M input matrix, where each column data(:, i) corresponds
%         to a single test example (with the bias row already appended)
%
% pred is a 1 x M vector where pred(i) is the predicted label (0 or 1)
% for example i.

% sigmoid(theta'*data) > 0.5 is equivalent to theta'*data > 0, so the
% sigmoid does not need to be evaluated explicitly.
pred = (theta' * data) > 0;

end
to be continued.....