What is an eigenvector of a covariance matrix?

One of the most intuitive explanations of eigenvectors of a covariance matrix is that they are the directions in which the data varies the most.

(More precisely, the first eigenvector is the direction in which the data varies the most, the second eigenvector is the direction of greatest variance among those that are orthogonal (perpendicular) to the first eigenvector, the third eigenvector is the direction of greatest variance among those orthogonal to the first two, and so on.)

Here is an example in 2 dimensions [1]:

Each data sample is a 2-dimensional point with coordinates x, y. The eigenvectors of the covariance matrix of these data samples are the vectors u and v; u, the longer arrow, is the first eigenvector and v, the shorter arrow, is the second. (The eigenvalues give the lengths of the arrows.) As you can see, the first eigenvector points (from the mean of the data) in the direction in which the data varies the most in Euclidean space, and the second eigenvector is orthogonal (perpendicular) to the first.
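
To make this concrete, here is a minimal numpy sketch (with synthetic data of my own, not the data behind the figure) that computes the eigenvectors and eigenvalues of a 2-D sample covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data, elongated along one direction
data = rng.multivariate_normal(mean=[0, 0],
                               cov=[[3.0, 1.5],
                                    [1.5, 1.0]],
                               size=500)

C = np.cov(data.T)                    # 2x2 sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # eigh: eigenvalues in ascending order

u = eigvecs[:, -1]  # first eigenvector (largest eigenvalue), like u above
v = eigvecs[:, -2]  # second eigenvector, like v above
print(eigvals[::-1])  # the variances along u and v
print(u @ v)          # ~0: the two eigenvectors are orthogonal
```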

It's a little trickier to visualize in 3 dimensions, but here's an attempt [2]:

In this case, imagine that all of the data points lie within the ellipsoid. v1, the direction in which the data varies the most, is the first eigenvector (lambda1 is the corresponding eigenvalue). v2 is the direction in which the data varies the most among those directions that are orthogonal to v1. And v3 is the direction of greatest variance among those directions that are orthogonal to v1 and v2 (though there is only one such orthogonal direction).

[1] Image taken from Duncan Gillies's lecture on Principal Component Analysis
[2] Image taken from Fiber Crossing in Human Brain Depicted with Diffusion Tensor MR Imaging


Anonymous


Given a set of random variables $\{x_1, \ldots, x_n\}$, the covariance matrix $A$ is defined so that $A_{i,j} = \mathrm{Cov}(x_i, x_j)$. We can represent a linear combination $\sum b_i x_i$ as a vector $x = (b_1, \ldots, b_n)$.

It turns out that the covariance of two such vectors $x$ and $y$ can be written as $\mathrm{Cov}(x, y) = x^T A y$. In particular, $\mathrm{Var}(x) = x^T A x$. This means that covariance is a bilinear form.
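
This is straightforward to verify numerically; the following sketch (with made-up coefficient vectors b and c) checks the bilinear form against a direct covariance computation:

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.standard_normal((1000, 3))   # columns are x1, x2, x3
A = np.cov(samples.T)                      # A[i, j] = Cov(x_i, x_j)

b = np.array([1.0, -2.0, 0.5])   # x = 1*x1 - 2*x2 + 0.5*x3
c = np.array([0.0,  1.0, 3.0])   # y =        1*x2 + 3*x3

# Covariance computed directly from the combined samples...
cov_direct = np.cov(samples @ b, samples @ c)[0, 1]
# ...agrees with the bilinear form b^T A c
print(cov_direct, b @ A @ c)
```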

Now, since $A$ is a real symmetric matrix, there is an orthonormal basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$: each vector's norm is 1 and the vectors are mutually orthogonal. Because they are eigenvectors of $A$, they are also orthogonal with respect to $A$, that is, $v_1^T A v_2 = 0$, or $\mathrm{Cov}(v_1, v_2) = 0$.

Next, suppose $v$ is a unit eigenvector of $A$ with eigenvalue $\lambda$. Then $\mathrm{Var}(v) = v^T A v = \lambda \|v\|^2 = \lambda$.

There are a couple of interesting conclusions we can draw from this. First, since the eigenvectors form a basis $\{v_1, \ldots, v_n\}$, every linear combination of the original random variables can be represented as a linear combination of the uncorrelated random variables defined by the $v_i$. Second, every unit vector's variance is a weighted average of the eigenvalues. This means that the leading eigenvector is the direction of greatest variance, the next eigenvector has the greatest variance in the orthogonal subspace, and so on.
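
Both conclusions are easy to confirm numerically; a short self-contained sketch (synthetic data, variable names mine):

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.standard_normal((1000, 3))
A = np.cov(samples.T)

eigvals, eigvecs = np.linalg.eigh(A)   # ascending eigenvalues
v1, v2 = eigvecs[:, 0], eigvecs[:, 1]

# Variance of the combination given by a unit eigenvector equals its eigenvalue
print(np.var(samples @ v1, ddof=1), eigvals[0])    # equal
# Combinations given by different eigenvectors are uncorrelated
print(np.cov(samples @ v1, samples @ v2)[0, 1])    # ~0 up to rounding
```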

So, to sum up: the eigenvectors correspond to uncorrelated linear combinations of the original set of random variables.

The primary application of this is Principal Components Analysis. If you have n features, you can find the eigenvectors of the covariance matrix of the features. This allows you to represent the data with uncorrelated features. Moreover, the eigenvalues tell you the amount of variance in each of these new features, allowing you to choose the subset that retains the most information about your data.
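
As an illustrative sketch of that pipeline (not a reference implementation; the helper pca_reduce is hypothetical):

```python
import numpy as np

def pca_reduce(X, k):
    """Project data X (rows = samples, columns = features) onto the
    top-k eigenvectors of the feature covariance matrix."""
    Xc = X - X.mean(axis=0)               # center each feature
    C = np.cov(Xc.T)                      # covariance of the features
    eigvals, eigvecs = np.linalg.eigh(C)  # ascending order
    order = np.argsort(eigvals)[::-1]     # largest eigenvalues first
    top = eigvecs[:, order[:k]]
    kept = eigvals[order[:k]].sum() / eigvals.sum()
    return Xc @ top, kept                 # reduced data, variance retained

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 10))
Z, kept = pca_reduce(X, k=3)
print(Z.shape, kept)   # (200, 3) and the fraction of variance kept
```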


Vincent Spruyt


The largest eigenvector of a covariance matrix (the one with the largest eigenvalue) points in the direction of the largest variance. All other eigenvectors are orthogonal to the largest one.

Now, if this direction of the largest variance is axis-aligned (the covariances are zero), then the eigenvalues simply correspond to the variances of the data along the axes.

It becomes a little more complicated if the covariance matrix is not diagonal, so that the covariances are nonzero. In this case, the principal components (the directions of largest variance) do not coincide with the axes, and the data is rotated. The eigenvalues still correspond to the spread of the data in the directions of largest variance, whereas the variance components (the diagonal entries) of the covariance matrix still define the spread of the data along the axes.
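
Both cases fit in a few lines (a sketch with made-up variances and a 30-degree rotation):

```python
import numpy as np

# Axis-aligned case: diagonal covariance, eigenvalues are the variances
C_diag = np.diag([4.0, 1.0])
print(np.linalg.eigvalsh(C_diag))   # [1. 4.]

# Rotated case: nonzero covariances; the eigenvalues still give the
# spread along the principal directions, while the diagonal entries
# give the spread along the x and y axes
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C_rot = R @ C_diag @ R.T
print(np.linalg.eigvalsh(C_rot))    # still [1. 4.]
print(np.diag(C_rot))               # axis variances, no longer 4 and 1
```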

An in-depth discussion of how the covariance matrix can be interpreted from a geometric point of view (and the source of the figures referenced above) can be found at A geometric interpretation of the covariance matrix.


Shreyas Ghuge


Finding the directions of maximum and minimum variance is the same as finding the orthogonal least-squares best-fit line and plane of the data. The sums of squares for that line and plane can be written in terms of the covariance matrix, and working out the connection between them yields the eigenvectors of the covariance matrix.
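
One way to see this connection numerically: the orthogonal least-squares best-fit line through the mean of 2-D data is the line along the leading eigenvector. A brute-force sketch (synthetic data, assumed setup):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=400)
Xc = X - X.mean(axis=0)

def orth_sse(d):
    """Sum of squared orthogonal distances from the centered points
    to the line through the mean with unit direction d."""
    return (Xc**2).sum() - ((Xc @ d)**2).sum()

# Brute-force search over line directions...
angles = np.linspace(0.0, np.pi, 10_000)
dirs = np.column_stack([np.cos(angles), np.sin(angles)])
best = dirs[np.argmin([orth_sse(d) for d in dirs])]

# ...recovers the leading eigenvector of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc.T))
print(best, eigvecs[:, -1])   # same direction, up to sign
```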


Julius Bier Kirkegaard, physics, computers, 'n' stuff


Finding the eigenvectors of a covariance matrix is exactly the technique of Principal Component Analysis (PCA).

The eigenvectors define new variables (linear combinations of the original ones) that are mutually uncorrelated.
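
Assuming scikit-learn is available, this equivalence can be checked directly; the PCA components should match the covariance eigenvectors up to sign:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.standard_normal((300, 5))

# Eigenvectors of the covariance matrix, largest eigenvalue first
eigvals, eigvecs = np.linalg.eigh(np.cov(X.T))
eigvecs = eigvecs[:, ::-1]

# PCA finds the same directions (each one up to a sign flip)
components = PCA(n_components=5).fit(X).components_.T
print(np.allclose(np.abs(components), np.abs(eigvecs)))   # True
```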


刚刚完成了Machine Learning第八周的课程,这一周主要介绍了K-means和降维,现将笔记整理在下面. Unsupervised Learning Clustering Unsupervised Learning: Introduction 今天我们开始介绍无监督式学习(unsupervised learning).下面两张图分别代表典型的有监督式学习和无监督式学习.一个典型的有监督式学习是从一个有标记的训练数据集出发(图中的两类数据点分别用圈圈和叉叉表示,圈圈一类,叉叉一类),目标