http://scikit-learn.org/stable/modules/multiclass.html
In real projects we rarely use the simple models such as LR, kNN or NB; they are classics, but in engineering practice they are genuinely not that useful. Today we look at the multiclass and multilabel algorithms, which are used comparatively often in engineering.
Warning: all of scikit-learn's classifiers can do multiclass classification out-of-the-box (they can be used directly), so there is usually no need for the sklearn.multiclass module introduced in this section; this post just goes over the key points.
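A minimal sketch of that point, assuming scikit-learn's bundled iris dataset: a plain classifier such as SVC fits a 3-class target directly, without any helper from sklearn.multiclass.

```python
# Assumes scikit-learn with its bundled iris dataset (3 classes).
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target        # y takes the three values 0, 1, 2
clf = SVC().fit(X, y)                # multiclass handled directly, no meta-estimator needed
print(clf.predict(X[:5]))
```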
Below is a summary of the classifiers supported by scikit-learn grouped by strategy; you don't need the meta-estimators in this class if you're using one of these, unless you want custom multiclass behavior:
- Inherently multiclass: Naive Bayes, sklearn.lda.LDA, Decision Trees, Random Forests, Nearest Neighbors, setting "multi_class=multinomial" in sklearn.linear_model.LogisticRegression.
- One-Vs-One: sklearn.svm.SVC.
- One-Vs-All: all linear models except sklearn.svm.SVC.

Some estimators also support multioutput-multiclass classification tasks: Decision Trees, Random Forests, Nearest Neighbors.
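If you do want explicit control over the strategy, the sklearn.multiclass meta-estimators wrap any binary classifier; a minimal sketch, assuming the iris dataset and LinearSVC as the base estimator:

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.svm import LinearSVC

iris = load_iris()
X, y = iris.data, iris.target

ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)   # one binary classifier per class
ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)    # one classifier per pair of classes
print(len(ovr.estimators_), len(ovo.estimators_))  # 3 and 3*(3-1)/2 = 3 for iris
```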
Three kinds of problems:
Multiclass classification means a classification task with more than two classes, but each sample belongs to exactly one of those classes (i.e. a single multi-way classification).
Multilabel classification assigns to each sample a set of target labels; a sample may belong to several classes at once (i.e. several binary classifications, one per label).
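A minimal multilabel sketch on hypothetical toy data (the "news"/"sports" labels are made up for illustration): the label sets are first encoded as a binary indicator matrix, then OneVsRestClassifier fits one binary classifier per label.

```python
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
y = [["news"], ["news", "sports"], ["sports"], ["news", "sports"]]  # a label set per sample

Y = MultiLabelBinarizer().fit_transform(y)        # 4 x 2 binary indicator matrix
clf = OneVsRestClassifier(LinearSVC()).fit(X, Y)  # one binary classifier per label
print(clf.predict([[0.9, 0.9]]))                  # a 0/1 row: one flag per label
```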
Multioutput-multiclass classification and multi-task classification mean that a single estimator has to handle several joint classification tasks (i.e. several multi-way classifications: the set of labels can be different for each output variable. For instance, a sample could be assigned "pear" for an output variable that takes its possible values in a finite set of species such as "pear", "apple", "orange", and "green" for a second output variable that takes its possible values in a finite set of colors such as "green", "red", "orange", "yellow"...).
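A minimal multioutput-multiclass sketch mirroring the pear/green example, with made-up fruit features: a single forest predicts the species and the color jointly, each output drawn from its own label set.

```python
# Hypothetical fruit data: column 0 of y is the species, column 1 is the color.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[150, 7.1], [170, 7.5], [140, 6.8], [130, 6.5]])   # made-up weight / diameter
y = np.array([["pear",   "green"],
              ["apple",  "red"],
              ["orange", "orange"],
              ["apple",  "green"]])
clf = RandomForestClassifier(n_estimators=10).fit(X, y)          # one estimator, two outputs
print(clf.predict([[160, 7.2]]))     # one species label and one color label for the sample
```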