Machine Learning Techniques -5-Kernel Logistic Regression

5-Kernel Logistic Regression

Last class, we learned about the soft-margin SVM and its applications. Now a new idea comes to us: could we apply the kernel trick to our old friend, logistic regression?

First, let's review the four SVM formulations for margin handling we have seen so far (the primal and dual forms of both the hard-margin and the soft-margin SVM):

As we can see, the difference between the hard-margin and soft-margin forms shows up in the constant C, which plays a role quite similar to a regularization parameter.

Since we defined a new slack variable ξn to record each point's margin violation, we can express that violation directly with a max function and write down the unconstrained form of the soft-margin SVM:
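In the standard notation of the course, the slack variable and the resulting unconstrained objective are:

$$\xi_n = \max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$$

$$\min_{b,\mathbf{w}}\ \ \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_{n=1}^{N}\max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$$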

We can easily see that this objective has a form similar to L2 regularization: a wᵀw term plus a sum of errors.

However, this form is no longer a QP, and the max function makes the objective non-differentiable at some points.

Thus we get the idea of viewing SVM as a regularization model:

For the SVM in regularization form, a larger C means a smaller influence from the regularization term wᵀw.
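Written out (same notation as above), the regularization view reads:

$$\min_{b,\mathbf{w}}\ \ \frac{\lambda}{N}\mathbf{w}^T\mathbf{w} + \frac{1}{N}\sum_{n=1}^{N}\max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big), \qquad \text{with } \lambda = \frac{1}{2C},$$

so a large C is equivalent to a small λ, i.e., weak regularization.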

Next, we compare the error measures. The SVM error differs from the 0/1 error: it is a convex upper bound that is zero beyond the point ys = 1 and grows linearly below it, with a kink at that point, which is why we call it the hinge error measure.
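For a score s = wᵀz + b and label y, the error measures being compared are:

$$\mathrm{err}_{0/1}(s,y)=\left[\!\left[\,ys\le 0\,\right]\!\right],\qquad \mathrm{err}_{\mathrm{hinge}}(s,y)=\max(1-ys,\ 0),\qquad \mathrm{err}_{\mathrm{sce}}(s,y)=\log_2\big(1+\exp(-ys)\big).$$

Both the hinge error and the scaled cross-entropy error of logistic regression are convex upper bounds of the 0/1 error, and they are close to each other when |ys| is large.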

Now, for binary classification, can logistic regression and SVM be combined?

We already know the advantage of SVM: the kernel trick simplifies the computation in the transformed space. Logistic regression, on the other hand, has benefits of its own, such as giving probabilistic (soft) outputs.

Here we apply Platt's scaling (https://en.wikipedia.org/wiki/Platt_scaling), which turns out to be a nice way to get a soft binary classifier out of the SVM. We first solve the SVM to obtain w_svm and b_svm, and then use another tool (a one-dimensional logistic regression on the SVM scores) to find the best scaling A and shift B.
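A minimal sketch of this two-step procedure, using scikit-learn on synthetic data (the dataset, kernel, and hyperparameters here are illustrative assumptions, and Platt's original recipe fits A and B on held-out data):

```python
# Platt's scaling sketch: run an SVM first, then fit A and B on its scores.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)  # toy data

# Step 1: solve the soft-margin (kernel) SVM; this fixes w_svm and b_svm implicitly.
svm = SVC(kernel="rbf", C=1.0).fit(X, y)
scores = svm.decision_function(X).reshape(-1, 1)   # w_svm^T z + b_svm for each point

# Step 2: a one-dimensional logistic regression on the scores finds the scaling A
#         and shift B, i.e. g(x) = theta(A * score(x) + B).
platt = LogisticRegression().fit(scores, y)
A, B = platt.coef_[0, 0], platt.intercept_[0]
print("A =", A, "B =", B)   # A > 0 and B near 0 suggest the SVM solution was already good

# Soft (probabilistic) predictions:
proba = platt.predict_proba(svm.decision_function(X).reshape(-1, 1))[:, 1]
```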

In conclusion, the chain of what we need looks like this:

we want to use a KERNEL  ->  we need wᵀz (so that it can be packaged into the kernel)  ->  we need w to be a linear combination of the z_n

The optimal w can indeed be represented by the z_n,

since the parallel component w_∥ (the part of w lying in the span of the z_n) can be proved to be the only part that survives in the optimal w:
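A sketch of that argument, in the usual notation: write the optimal solution as $\mathbf{w}_* = \mathbf{w}_{\parallel} + \mathbf{w}_{\perp}$, where $\mathbf{w}_{\parallel} \in \mathrm{span}(\mathbf{z}_1,\dots,\mathbf{z}_N)$ and $\mathbf{w}_{\perp}$ is orthogonal to that span. The error term only sees $\mathbf{w}_*^T\mathbf{z}_n = \mathbf{w}_{\parallel}^T\mathbf{z}_n$, while the regularizer satisfies $\mathbf{w}_*^T\mathbf{w}_* = \mathbf{w}_{\parallel}^T\mathbf{w}_{\parallel} + \mathbf{w}_{\perp}^T\mathbf{w}_{\perp}$. If $\mathbf{w}_{\perp} \neq \mathbf{0}$, then $\mathbf{w}_{\parallel}$ alone would achieve a strictly smaller objective, contradicting optimality; hence $\mathbf{w}_{\perp} = \mathbf{0}$ and

$$\mathbf{w}_* = \sum_{n=1}^{N}\beta_n\mathbf{z}_n.$$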

In fact, it can be proved that any L2-regularized linear model can be kernelized.

So here we get a new representation, called kernel logistic regression (KLR):
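Substituting $\mathbf{w} = \sum_n \beta_n \mathbf{z}_n$ into L2-regularized logistic regression gives the KLR objective in terms of $\beta$ (standard form):

$$\min_{\boldsymbol{\beta}}\ \ \frac{\lambda}{N}\sum_{n=1}^{N}\sum_{m=1}^{N}\beta_n\beta_m K(\mathbf{x}_n,\mathbf{x}_m) + \frac{1}{N}\sum_{n=1}^{N}\log\Big(1+\exp\Big(-y_n\sum_{m=1}^{N}\beta_m K(\mathbf{x}_m,\mathbf{x}_n)\Big)\Big)$$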

There are a few things we should pay attention to:

1. The dimension of this optimization problem is N, the number of samples: there is one variable βn per training point.

2. Each βn can be seen as describing, through the kernel, the relationship between xn and every other point in the X space.

3. Unlike the sparse α of SVM, the βn are usually all non-zero (the solution is dense), which means a larger computational cost than directly finding a good w; see the training sketch below.
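Below is a minimal sketch of KLR trained with full-batch gradient descent on β, assuming an RBF kernel; the kernel, learning rate, and number of steps are illustrative choices, not the course's implementation:

```python
# Kernel logistic regression (KLR): optimize beta directly, one coefficient per training point.
# Assumptions: RBF kernel, labels in {-1, +1}, plain full-batch gradient descent.
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x1_i - x2_j||^2)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def train_klr(X, y, lam=0.1, gamma=1.0, eta=0.1, steps=2000):
    y = np.asarray(y, dtype=float)          # labels must be +1 / -1
    N = len(y)
    K = rbf_kernel(X, X, gamma)
    beta = np.zeros(N)
    for _ in range(steps):
        s = K @ beta                        # s_n = sum_m beta_m K(x_m, x_n)
        p = 1.0 / (1.0 + np.exp(y * s))     # sigmoid(-y_n * s_n)
        # gradient of (lam/N) beta^T K beta + (1/N) sum_n log(1 + exp(-y_n s_n))
        grad = (2.0 * lam / N) * (K @ beta) - (1.0 / N) * (K @ (p * y))
        beta -= eta * grad
    return beta

def predict_klr(beta, X_train, X_new, gamma=1.0):
    # sign of sum_m beta_m K(x_m, x_new); note that typically every beta_m contributes
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ beta)
```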
