03 Logistic Regression

Binary Classification

Define

Sigmoid Function Logistic Function
\[ h_\theta(x) = g(\theta^Tx) \]
\[ z = \theta^Tx \]
\[ 0 <= g(z) = \frac{1}{1 + e^{-z}} <= 1 \]
\( h_\theta(x) \) the probability that the output is 1.
\( h_\theta(x) = P(y = 1 | x; \theta) \)
\( P(y = 0 | x; \theta) + P(y = 1 | x; \theta) = 1 \)
设置 0.5为判定边界，则 \(h_\theta(x)=0.5 <==> \theta^Tx = 0\)

Cost Function

\[J(\theta) = \dfrac{1}{m} \sum_{i=1}^m \mathrm{Cost}(h_\theta(x^{(i)}),y^{(i)})\]
\[ \mathrm{Cost}(h_\theta(x),y) = -\log(h_\theta(x)) ,\text{if y = 1} \]
\[ \mathrm{Cost}(h_\theta(x),y) = -\log(1-h_\theta(x)), \text{if y = 0} \]
\[ Cost(h_\theta(x), y) = -ylog(h_\theta(x)) - (1 - y)log(1 - h_\theta(x)) \]

Algorithm

\(\begin{align*} & Repeat \; \lbrace \newline & \; \theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} \newline & \rbrace \end{align*}\)

虽然跟梯度下降相同但两者\(h_\theta(x)\)的定义并不相同
逻辑回归也可以通过特征缩放来加快收敛速度

可用于计算 \(\theta\) 的算法
- Gradient descent
- Conjugate gradient
- BFGS 共轭梯度法（变尺度法）
- L-BFGS 限制变尺度法
- 后三种算法的特性
Advantages:
a.no need to manually pick \(\alpha\)
b.often faster than gradient descent
Disadvantages:
More complex
Octave 的优化算法使用

%exitFlag: 1 收敛
%R(optTheta) >= 2
options = optimset(‘GradObj’, ‘on’, ‘MaxIter’, ‘100’);
initialTheta = zeros(2, 1);
[optTheta, functionVal, exitFlag] ...
    = fminumc(@costFunction, initialTheta, options);

%costFunction:
function [jVal, gradient] = costFunction(theta)
    jVal = ... %cost function
    gradient = zeros(n, 1); %gradient

    gradient(1) = ...
    ...
    gradient(n) = ...

Multi-class classification

one-vs-all `one-vs-rest`

Train a logistic regression classifier \(h_\theta^{(i)}(x)\) for each class \(i\) to predict the probability that \(y = i\).
On a new input \(x\), to make a prediction, pick the class \(i\) that maximizes \(\max \limits_ih_\theta^{(i)}(x)\).

原文地址：https://www.cnblogs.com/QQ-1615160629/p/03-Logistic-Regression.html

时间： 2024-12-21 19:58:32

03 Logistic Regression的相关文章

机器学习笔记04：逻辑回归(Logistic regression)、分类(Classification)

之前我们已经大概学习了用线性回归(Linear Regression)来解决一些预测问题,详见: 1.<机器学习笔记01:线性回归(Linear Regression)和梯度下降(Gradient Decent)> 2.<机器学习笔记02:多元线性回归.梯度下降和Normal equation> 3.<机器学习笔记03:Normal equation及其与梯度下降的比较> 说明:本文章所有图片均属于Stanford机器学课程,转载请注明出处面对一些类似回归问题,我们可

Deep learning：四(logistic regression练习)

前言: 本节来练习下logistic regression相关内容,参考的资料为网页:http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html.这里给出的训练样本的特征为80个学生的两门功课的分数,样本值为对应的同学是否允许被上大学,如果是允许的话则用'1'表示,否则不允许就用'0'表示,这是一个典型的二分类问题.在此问题中,给出的80个

Logistic Regression and Gradient Descent

Logistic Regression and Gradient Descent Logistic regression is an excellent tool to know for classification problems. Classification problems are problems where you are trying to classify observations into groups. To make our examples more concrete,

Logistic Regression & Classification (1)

一.为什么不使用Linear Regression 一个简单的例子:如果训练集出现跨度很大的情况,容易造成误分类.如图所示,图中洋红色的直线为我们的假设函数 .我们假定,当该直线纵轴取值大于等于0.5时,判定Malignant为真,即y=1,恶性肿瘤:而当纵轴取值小于0.5时,判定为良性肿瘤,即y=0. 就洋红色直线而言,是在没有最右面的"×"的训练集,通过线性回归而产生的.因而这看上去做了很好的分类处理,但是,当训练集中加入了右侧的"×"之后,导致整个线性回归的结

对Logistic Regression 的初步认识

线性回归回归就是对已知公式的未知参数进行估计.比如已知公式是y=a∗x+b,未知参数是a和b,利用多真实的(x,y)训练数据对a和b的取值去自动估计.估计的方法是在给定训练样本点和已知的公式后,对于一个或多个未知参数,机器会自动枚举参数的所有可能取值,直到找到那个最符合样本点分布的参数(或参数组合).也就是给定训练样本,拟合参数的过程,对y= a*x + b来说这就是有一个特征x两个参数a b,多个样本的话比如y=a*x1+b*x2+...,用向量表示就是y = ,就是n个特征,n个参数的拟

Coursera台大机器学习课程笔记9 -- Logistic Regression

这一节课主要讲如何用logistic regression做分类. 在误差衡量问题上,选取了最大似然函数误差函数,这一段推导是难点. 接下来是如何最小化Ein,采用的是梯度下降法,这个比较容易. 参考:http://beader.me/mlnotebook/section3/logistic-regression.html http://www.cnblogs.com/ymingjingr/p/4330304.html

logistic regression编程练习

本练习以<机器学习实战>为基础, 重现书中代码, 以达到熟悉算法应用为目的 1.梯度上升算法新建一个logRegres.py文件, 在文件中添加如下代码: from numpy import * #加载模块 numpy def loadDataSet(): dataMat = []; labelMat = [] #加路径的话要写作:open('D:\\testSet.txt','r') 缺省为只读 fr = open('testSet.txt') #readlines()函数一次读取整个文件

最详细的基于R语言的Logistic Regression（Logistic回归）源码，包括拟合优度，Recall，Precision的计算

这篇日志也确实是有感而发,我对R不熟悉,但实验需要,所以简单学了一下.发现无论是网上无数的教程,还是书本上的示例,在讲Logistic Regression的时候就是给一个简单的函数及输出结果说明.从来都没有讲清楚几件事情: 1. 怎样用训练数据训练模型,然后在测试数据上进行验证(测试数据和训练数据可能有重合)? 2. 怎样计算预测的效果,也就是计算Recall,Precision,F-measure等值? 3. 怎样计算Nagelkerke拟合优度等评价指标? 发现这些书本和一些写博客的朋友,

深度学习 Deep LearningUFLDL 最新Tutorial 学习笔记 2：Logistic Regression

1 Logistic Regression 简述 Linear Regression 研究连续量的变化情况,而Logistic Regression则研究离散量的情况.简单地说就是对于推断一个训练样本是属于1还是0.那么非常easy地我们会想到概率,对,就是我们计算样本属于1的概率及属于0的概率,这样就能够依据概率来预计样本的情况,通过概率也将离散问题变成了连续问题. Specifically, we will try to learn a function of the form: P(y=1