scikit-learn: 4.7. Pairwise metrics, Affinities and Kernels

Reference: http://scikit-learn.org/stable/modules/metrics.html

The sklearn.metrics.pairwise submodule implements utilities to evaluate pairwise distances or affinity of sets of samples.

Distance metrics are functions d(a, b) such that d(a, b) < d(a, c) if
objects a and b are considered “more similar” than objects a and c.

Kernels are measures of similarity, i.e. s(a, b) > s(a, c) if
objects a and b are considered “more similar” than objects a and c.
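The two orderings can be illustrated with a small sketch. The points a, b, c and the value gamma=0.1 are made up for illustration; euclidean_distances and rbf_kernel are the functions from sklearn.metrics.pairwise.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances, rbf_kernel

# Illustrative points (assumed for this sketch): b is close to a, c is far from a.
a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 0.0]])
c = np.array([[3.0, 4.0]])

# A distance metric is smaller for the more similar pair.
d_ab = euclidean_distances(a, b)[0, 0]
d_ac = euclidean_distances(a, c)[0, 0]

# A kernel (similarity) is larger for the more similar pair.
s_ab = rbf_kernel(a, b, gamma=0.1)[0, 0]
s_ac = rbf_kernel(a, c, gamma=0.1)[0, 0]

print(d_ab < d_ac, s_ab > s_ac)
```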

1、Cosine similarity

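cosine_similarity computes the L2-normalized dot product of vectors. If x and y are row vectors, their cosine similarity k is defined as:

$$k(x, y) = \frac{x y^\top}{\|x\| \|y\|}$$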

This kernel is a popular choice
for computing the similarity of documents represented as tf-idf vectors.
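As a rough check, cosine_similarity can be compared with a direct NumPy computation of the dot product divided by the product of the L2 norms. The arrays X and Y below are arbitrary illustrative data, not taken from the original text.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative row vectors (assumed); Y[0] is a scaled copy of X[0].
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])
Y = np.array([[2.0, 0.0, 4.0]])

# Dot products divided by the product of the L2 norms.
norms_X = np.linalg.norm(X, axis=1, keepdims=True)   # shape (2, 1)
norms_Y = np.linalg.norm(Y, axis=1, keepdims=True)   # shape (1, 1)
manual = (X @ Y.T) / (norms_X * norms_Y.T)

library = cosine_similarity(X, Y)
print(np.allclose(manual, library))
```

Because Y[0] points in the same direction as X[0], their similarity is 1 regardless of the difference in length.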

2、Linear kernel

If x and y are column vectors, their linear kernel is:

$$k(x, y) = x^\top y$$
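A minimal sketch of the linear kernel on a small, made-up data matrix: scikit-learn stores samples as rows, so the full kernel matrix is simply X @ Y.T.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel

# Illustrative samples (assumed), one sample per row.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Y = np.array([[0.5, -1.0],
              [2.0, 0.0],
              [1.0, 1.0]])

# Entry (i, j) is the dot product of X[i] and Y[j].
manual = X @ Y.T
library = linear_kernel(X, Y)
print(np.allclose(manual, library))
```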

3、Polynomial kernel

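Conceptually, the polynomial kernel considers not only the similarity between vectors under the same dimension, but also across dimensions. When used in machine learning algorithms, this makes it possible to account for feature interaction.

The polynomial kernel is defined as:

$$k(x, y) = (\gamma x^\top y + c_0)^d$$

where x and y are the input vectors, d is the kernel degree, and c_0 is a constant offset.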
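The following sketch compares polynomial_kernel with the expression (gamma * x·y + coef0) ** degree computed by hand. The data and the parameter values (degree=2, gamma=0.5, coef0=1.0) are chosen only for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Illustrative samples and parameters (assumed for this sketch).
X = np.array([[1.0, 2.0],
              [0.0, 1.0]])
Y = np.array([[2.0, 1.0]])
degree, gamma, coef0 = 2, 0.5, 1.0

# Scaled dot product plus offset, raised to the kernel degree.
manual = (gamma * (X @ Y.T) + coef0) ** degree
library = polynomial_kernel(X, Y, degree=degree, gamma=gamma, coef0=coef0)
print(np.allclose(manual, library))
```

With degree=2 the expanded kernel contains products of pairs of features, which is where the feature interaction comes from.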

4、Sigmoid kernel

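The sigmoid kernel is defined as:

$$k(x, y) = \tanh(\gamma x^\top y + c_0)$$

where x and y are the input vectors, γ is known as the slope and c_0 as the intercept.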
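A short sketch checking sigmoid_kernel against tanh(gamma * x·y + coef0) computed with NumPy. The inputs and the values gamma=0.1, coef0=0.5 are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import sigmoid_kernel

# Illustrative samples and parameters (assumed for this sketch).
X = np.array([[1.0, -1.0],
              [2.0, 0.5]])
gamma, coef0 = 0.1, 0.5

# Hyperbolic tangent of the scaled dot product plus intercept.
manual = np.tanh(gamma * (X @ X.T) + coef0)
library = sigmoid_kernel(X, gamma=gamma, coef0=coef0)
print(np.allclose(manual, library))
```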

5、RBF kernel

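The RBF (radial basis function) kernel is defined as:

$$k(x, y) = \exp(-\gamma \|x - y\|^2)$$

where x and y are the input vectors. If $\gamma = \sigma^{-2}$, the kernel is known as the Gaussian kernel of variance $\sigma^2$.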
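As a sketch, rbf_kernel can be reproduced by exponentiating the negative, gamma-scaled squared Euclidean distances. The data and gamma=0.5 below are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Illustrative samples and parameter (assumed for this sketch).
X = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [2.0, 0.0]])
gamma = 0.5

# Squared Euclidean distance between every pair of rows, via broadcasting.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
manual = np.exp(-gamma * sq_dists)

library = rbf_kernel(X, gamma=gamma)
print(np.allclose(manual, library))
```

Each sample has similarity 1 with itself, and the similarity decays towards 0 as the distance grows.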

6、Chi-squared kernel

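The chi-squared kernel is defined as:

$$k(x, y) = \exp\left(-\gamma \sum_i \frac{(x[i] - y[i])^2}{x[i] + y[i]}\right)$$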
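A hand-written version of the chi-squared kernel, exp(-gamma * sum_i (x[i] - y[i])**2 / (x[i] + y[i])), can be compared with chi2_kernel. This is only a sketch: terms where x[i] + y[i] is zero are treated as contributing zero, which is an assumption made here to avoid division by zero.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel

# Non-negative features, one sample per row.
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.2, 0.8],
              [0.7, 0.3]])
gamma = 0.5

# Pairwise differences and sums over features, via broadcasting.
diff = X[:, None, :] - X[None, :, :]
summ = X[:, None, :] + X[None, :, :]

# (x[i] - y[i])^2 / (x[i] + y[i]); terms with a zero denominator are set to 0.
terms = np.divide(diff ** 2, summ, out=np.zeros_like(summ), where=summ > 0)
manual = np.exp(-gamma * terms.sum(axis=2))

library = chi2_kernel(X, gamma=gamma)
print(np.allclose(manual, library))
```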

The chi-squared kernel is a very popular choice for training non-linear SVMs in computer
vision applications. It can be computed using chi2_kernel and
then passed to an sklearn.svm.SVC with kernel="precomputed":

>>> from sklearn.svm import SVC
>>> from sklearn.metrics.pairwise import chi2_kernel
>>> X = [[0, 1], [1, 0], [.2, .8], [.7, .3]]
>>> y = [0, 1, 0, 1]
>>> K = chi2_kernel(X, gamma=.5)
>>> K
array([[ 1.        ,  0.36...,  0.89...,  0.58...],
       [ 0.36...,  1.        ,  0.51...,  0.83...],
       [ 0.89...,  0.51...,  1.        ,  0.77... ],
       [ 0.58...,  0.83...,  0.77... ,  1.        ]])

>>> svm = SVC(kernel='precomputed').fit(K, y)
>>> svm.predict(K)
array([0, 1, 0, 1])

It can also be directly used as the kernel argument:

>>> svm = SVC(kernel=chi2_kernel).fit(X, y)
>>> svm.predict(X)
array([0, 1, 0, 1])
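With kernel="precomputed", predicting on samples that were not part of the training set requires the kernel between the new samples and the training samples, with shape (n_new, n_train). The sketch below assumes two hypothetical new samples, X_new, which do not appear in the original example.

```python
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

X_train = [[0, 1], [1, 0], [.2, .8], [.7, .3]]
y_train = [0, 1, 0, 1]
X_new = [[.1, .9], [.9, .1]]  # hypothetical unseen samples

# Train on the (n_train, n_train) kernel matrix.
K_train = chi2_kernel(X_train, gamma=.5)
svm = SVC(kernel="precomputed").fit(K_train, y_train)

# Predict from the (n_new, n_train) kernel matrix, using the same gamma.
K_new = chi2_kernel(X_new, X_train, gamma=.5)
pred = svm.predict(K_new)
print(K_new.shape, pred)
```

Passing chi2_kernel directly as the kernel argument avoids this bookkeeping, at the cost of using the function's default gamma unless it is wrapped.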

Time: 2024-11-05 12:20:31
