Classification 1

Classification examples:

1, multiclass classifier

input : web page

output : what the web page is about, e.g. education, finance, or technology

2, spam filtering

input : text of email, IP address, sender...

output : spam or not spam

3, image classification

input : image pixels

output : predicted object

4, personalized medical diagnosis

input : temperature, x-ray result, lab test result, DNA, lifestyle

output : healthy, flu, cold, pneumonia...

5, reading your mind

input : fMRI image

output : what you are reading, what you are looking at

Overview of content

1, Models

linear classifiers

logistic regression : probability

decision trees : capture non-linear structure in the data, example : predicting whether a loan is risky or safe

ensembles

2, Algorithms

Gradient descent, stochastic gradient descent, recursive greedy search, boosting with AdaBoost

3, Core ML

Alleviating overfitting, handling missing data, precision-recall, online learning

Linear Classifiers

Logistic regression : the most commonly used linear classifier, and one of the most useful.

1, Linear classifier : a motivating example

Classifying the sentiment of reviews: build a sentence sentiment classifier that, given a sentence, predicts whether its sentiment is positive or negative.

2, Intuition behind linear classifier:

Split the data into a training set and a validation set, feed the training set to a learning algorithm that learns a weight for each word, and finally evaluate the learned classifier's accuracy on the validation set.
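The workflow above can be sketched in plain Python. This is a toy illustration with made-up sentences, not the course's dataset or exact algorithm: it learns one weight per word with simple perceptron-style updates, then checks accuracy on a held-out validation set.

```python
# Toy sketch: learn per-word weights on a training set, evaluate on validation.

def featurize(sentence):
    """Bag-of-words counts for a sentence."""
    counts = {}
    for word in sentence.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def score(weights, feats):
    """Linear score: sum of weight * count over the words in the sentence."""
    return sum(weights.get(w, 0.0) * c for w, c in feats.items())

def train(data, epochs=10, lr=0.1):
    """Perceptron-style training: nudge word weights on each mistake."""
    weights = {}
    for _ in range(epochs):
        for sentence, label in data:                 # label is +1 or -1
            feats = featurize(sentence)
            if label * score(weights, feats) <= 0:   # misclassified
                for w, c in feats.items():
                    weights[w] = weights.get(w, 0.0) + lr * label * c
    return weights

def accuracy(weights, data):
    correct = sum(1 for s, y in data
                  if (1 if score(weights, featurize(s)) > 0 else -1) == y)
    return correct / len(data)

train_set = [("great sushi awesome service", 1),
             ("terrible food awful service", -1),
             ("great food", 1),
             ("awful sushi", -1)]
valid_set = [("awesome food", 1), ("terrible service", -1)]

w = train(train_set)
print(accuracy(w, valid_set))   # 1.0 on this tiny validation set
```

Words like "great" end up with positive weights and words like "awful" with negative ones, which is exactly the intuition behind the linear sentiment classifier.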

3, Decision boundaries

A boundary between positive predictions and negative predictions.

When 2 coefficients are non-zero, the boundary is a line; when 3 coefficients are non-zero, it's a plane; when many coefficients are non-zero, it's a hyperplane.
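The two-coefficient case can be made concrete with a small sketch (the weights here are hypothetical, not learned from any data): the set of points where the score w0 + w1*x1 + w2*x2 equals zero is a line, and predictions flip sign on either side of it.

```python
# Hypothetical weights: intercept w0 plus two non-zero feature coefficients.
w0, w1, w2 = 1.0, 1.0, -1.5

def predict(x1, x2):
    """Sign of the linear score decides the predicted class."""
    s = w0 + w1 * x1 + w2 * x2
    return +1 if s > 0 else -1

# The decision boundary is the line 1.0 + 1.0*x1 - 1.5*x2 = 0.
print(predict(3.0, 1.0))   # score = 2.5, on the positive side -> +1
print(predict(1.0, 3.0))   # score = -2.5, on the negative side -> -1
```

With a third non-zero coefficient the same zero-score set becomes a plane in 3D, and in general a hyperplane.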

Class probabilities

1, Predict class probabilities

In logistic regression, we don't just predict plus one or minus one; we predict a probability. How likely is this review to be positive? How likely is it to be negative? Probabilities give us an indication of how sure we are about the predictions we make.

2, Using probabilities in classification
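One simple way to use a predicted probability in classification, sketched below with made-up probability values: threshold P(y = +1 | x) at 0.5 to get a hard label, while the distance from 0.5 tells us how confident the prediction is.

```python
def classify(p_positive):
    """Turn P(y = +1 | x) into a hard +1 / -1 label by thresholding at 0.5."""
    return +1 if p_positive > 0.5 else -1

print(classify(0.98))   # very confident positive -> +1
print(classify(0.55))   # +1, but the classifier is barely sure
print(classify(0.10))   # likely negative -> -1
```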

Logistic regression

1, Predicting class probabilities with (generalized) linear models

2, The sigmoid (or logistic) link function
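The sigmoid link function can be written in a few lines. It squashes the linear score w·x, which ranges over (-inf, +inf), into a probability in (0, 1), so that P(y = +1 | x, w) = sigmoid(w·x):

```python
import math

def sigmoid(score):
    """Logistic link: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

print(sigmoid(0.0))    # score 0 -> 0.5, maximally unsure
print(sigmoid(4.0))    # large positive score -> close to 1
print(sigmoid(-4.0))   # large negative score -> close to 0
```

Note the symmetry sigmoid(-s) = 1 - sigmoid(s), which is why P(y = -1 | x, w) = 1 - P(y = +1 | x, w).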

3,
