Distinctive Image Features from Scale-Invariant Keypoints (personal translation + notes) - Introduction

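Distinctive Image Features from Scale-Invariant Keypoints is the classic paper on the SIFT algorithm in image recognition, and it was the first paper my advisor assigned me. I searched online for a long time and could not find a translation, so I decided to do the work myself and, with luck, help whoever comes after me: each passage of the original below is followed by my own rendering, with notes along the way.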

--------------------------------------------------------------------------------------------------------

Distinctive Image Features from Scale-Invariant Keypoints

(Title, rendered literally: distinctive image features extracted from scale-invariant keypoints)

Abstract

This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
The paper presents a way to extract distinctive, invariant features from an image, which can then be used to reliably match different views of the same object or scene. The features do not change under image scaling or rotation, and matching stays robust across a wide range of affine distortion, changes in 3D viewpoint, added noise, and changes in illumination. The features are highly distinctive: a single feature can be matched correctly, with high probability, against a large database of features drawn from many images. The paper also describes how to use the features for object recognition. Individual features are first matched to a database of features from known objects with a fast nearest-neighbor algorithm; a Hough transform then identifies clusters of matches that belong to a single object; finally, each cluster is verified by a least-squares solution for consistent pose parameters. This approach robustly identifies objects amid clutter and occlusion while running in near real time.

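[Note] SIFT matches images of an object reliably under different viewpoints, different illumination, and noise; here "matching" means finding the features of one image among the features of a large collection of images. The paper also gives a recognition pipeline: match features with a fast nearest-neighbor search, group the matches into clusters with a Hough transform, and verify each cluster with a least-squares fit of the object pose.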

1. Introduction

Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, stereo correspondence, and motion tracking. This paper describes image features that have many properties that make them suitable for matching differing images of an object or scene. The features are invariant to image scaling and rotation, and partially invariant to change in illumination and 3D camera viewpoint. They are well localized in both the spatial and frequency domains, reducing the probability of disruption by occlusion, clutter, or noise. Large numbers of features can be extracted from typical images with efficient algorithms. In addition, the features are highly distinctive, which allows a single feature to be correctly matched with high probability against a large database of features, providing a basis for object and scene recognition.
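Image matching underlies many problems in computer vision, including object recognition, scene recognition, recovering 3D structure from multiple images, stereo correspondence, and motion tracking. The paper describes image features whose properties make them well suited to matching different images of an object or scene. The features are invariant to image scaling and rotation, and partially invariant to changes in illumination and in 3D camera viewpoint. They are well localized in both the spatial and the frequency domain, which lowers the chance that occlusion, clutter, or noise will disrupt them. Efficient algorithms can extract large numbers of features from typical images. On top of that, the features are highly distinctive, so a single feature can be matched correctly, with high probability, against a large database of features. This provides a basis for object and scene recognition.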

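The cost of extracting these features is minimized by taking a cascade filtering approach, in which the more expensive operations are applied only at locations that pass an initial test. Following are the major stages of computation used to generate the set of image features:
A cascade filtering approach keeps the cost of feature extraction low: the more expensive operations are applied only at locations that have already passed an initial test. ("Cascade" here means a sequence of filtering stages, not a convolution filter.) The major stages of computation that generate the set of image features are: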

1. Scale-space extrema detection: The first stage of computation searches over all scales and image locations. It is implemented efficiently by using a difference-of-Gaussian function to identify potential interest points that are invariant to scale and orientation.
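1. Scale-space extrema detection: the first stage searches over all scales and image locations. A difference-of-Gaussian function makes the search efficient by identifying potential interest points that are invariant to scale and orientation.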

2. Keypoint localization: At each candidate location, a detailed model is fit to determine location and scale. Keypoints are selected based on measures of their stability.
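2. Keypoint localization: at each candidate location, a detailed model is fitted to determine the location and scale. Keypoints are then selected according to measures of their stability.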

3. Orientation assignment: One or more orientations are assigned to each keypoint location based on local image gradient directions. All future operations are performed on image data that has been transformed relative to the assigned orientation, scale, and location for each feature, thereby providing invariance to these transformations.
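3. Orientation assignment: each keypoint location is assigned one or more orientations based on the local image gradient directions. All later operations work on image data that has been transformed relative to the orientation, scale, and location assigned to each feature, which is what provides invariance to these transformations.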

4. Keypoint descriptor: The local image gradients are measured at the selected scale in the region around each keypoint. These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination.
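4. Keypoint descriptor: the local image gradients are measured, at the selected scale, in the region around each keypoint. They are then transformed into a representation that tolerates significant local shape distortion and change in illumination.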
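Stage 1 above can be sketched in a few lines. This is only a minimal pure-Python illustration under stated assumptions, not the paper's implementation: the function names (gaussian_kernel, blur, dog_extrema) and the choice of sigma values are mine, it handles a single octave, and it reports raw extrema without the localization and stability checks of stage 2.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur of a 2D list, replicating edge pixels at the border."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * image[y][clamp(x + k, w)]
                 for k in range(-radius, radius + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, h)][x]
                 for k in range(-radius, radius + 1))
             for x in range(w)] for y in range(h)]

def dog_extrema(image, sigmas):
    """Return (layer, y, x) for difference-of-Gaussian samples that are strictly
    larger or strictly smaller than all 26 neighbours in space and scale."""
    blurred = [blur(image, s) for s in sigmas]
    h, w = len(image), len(image[0])
    # Each DoG layer is the difference of two adjacent blur levels.
    dog = [[[b2[y][x] - b1[y][x] for x in range(w)] for y in range(h)]
           for b1, b2 in zip(blurred, blurred[1:])]
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                centre = dog[s][y][x]
                neighbours = [dog[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if centre > max(neighbours) or centre < min(neighbours):
                    found.append((s, y, x))
    return found
```

Calling dog_extrema(image, [1.0, 1.6, 2.56, 4.096]) on a small grayscale image stored as a list of rows returns candidate points only; in flat regions, floating-point noise can also produce spurious extrema, which is one reason the later stability tests matter.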

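This approach has been named the Scale Invariant Feature Transform (SIFT), as it transforms image data into scale-invariant coordinates relative to local features.
The method is called the Scale Invariant Feature Transform (SIFT) because it transforms image data into scale-invariant coordinates defined relative to local features.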

An important aspect of this approach is that it generates large numbers of features that densely cover the image over the full range of scales and locations. A typical image of size 500x500 pixels will give rise to about 2000 stable features (although this number depends on both image content and choices for various parameters). The quantity of features is particularly important for object recognition, where the ability to detect small objects in cluttered backgrounds requires that at least 3 features be correctly matched from each object for reliable identification.
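An important aspect of the method is that it produces a large number of features that densely cover the image across the full range of scales and locations. A typical 500x500-pixel image yields about 2000 stable features (the number depends on the image content and on the parameter choices). The quantity matters especially for object recognition: to detect small objects against a cluttered background, at least 3 features from each object must be matched correctly for the identification to be reliable.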

For image matching and recognition, SIFT features are first extracted from a set of reference images and stored in a database. A new image is matched by individually comparing each feature from the new image to this previous database and finding candidate matching features based on Euclidean distance of their feature vectors. This paper will discuss fast nearest-neighbor algorithms that can perform this computation rapidly against large databases.
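For image matching and recognition, SIFT features are first extracted from a set of reference images and stored in a database. A new image is matched by comparing each of its features individually against that database and picking candidate matches by the Euclidean distance between feature vectors. The paper goes on to discuss fast nearest-neighbor algorithms that perform this computation quickly against large databases.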
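The matching step just described can be illustrated with a brute-force baseline. This sketch is an assumption-laden simplification: descriptors are plain lists of floats, the function names are hypothetical, and the search is exhaustive rather than one of the fast nearest-neighbor algorithms the paper discusses.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(query, database):
    """Return (index, distance) of the database descriptor closest to query.

    Exhaustive search: cost grows linearly with the database size.
    """
    best_index, best_distance = -1, float("inf")
    for i, candidate in enumerate(database):
        d = euclidean(query, candidate)
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance

def match_features(new_descriptors, database):
    """Match every descriptor of a new image against the reference database."""
    return [nearest_neighbour(q, database) for q in new_descriptors]
```

Each result is only a candidate match; a feature from the background still gets a nearest neighbor, which is why the candidates need the filtering described next.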

The keypoint descriptors are highly distinctive, which allows a single feature to find its correct match with good probability in a large database of features. However, in a cluttered image, many features from the background will not have any correct match in the database, giving rise to many false matches in addition to the correct ones. The correct matches can be filtered from the full set of matches by identifying subsets of keypoints that agree on the object and its location, scale, and orientation in the new image. The probability that several features will agree on these parameters by chance is much lower than the probability that any individual feature match will be in error. The determination of these consistent clusters can be performed rapidly by using an efficient hash table implementation of the generalized Hough transform.
The keypoint descriptors are highly distinctive, so a single feature has a good chance of finding its correct match in a large feature database. In a cluttered image, however, many background features have no correct match in the database at all, so many false matches appear alongside the correct ones. The correct matches can be separated from the full set by finding subsets of keypoints that agree on the object and on its location, scale, and orientation in the new image. Several features agreeing on these parameters by chance is far less likely than any single feature match being wrong. These consistent clusters can be found quickly with an efficient hash-table implementation of the generalized Hough transform.
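The hash-table voting idea can be sketched as follows. Everything specific here is an assumption made for illustration: the pose tuple layout (x, y, scale, orientation in degrees), the bin sizes, the function names, and the single-bin voting. It only shows how agreeing matches end up under the same dictionary key.

```python
import math
from collections import defaultdict

def pose_bin(pose, loc_bin=32.0, ori_bin=30.0):
    """Quantize a predicted pose (x, y, scale, orientation_deg) into a hashable key.

    Bin sizes are illustrative assumptions, not values taken from this section.
    """
    x, y, scale, orientation = pose
    return (
        math.floor(x / loc_bin),
        math.floor(y / loc_bin),
        math.floor(math.log2(scale)),  # one bin per factor of 2 in scale
        math.floor((orientation % 360.0) / ori_bin),
    )

def hough_clusters(matches, min_votes=3):
    """Group matches that vote for the same object and the same pose bin.

    matches: iterable of (object_id, pose), one entry per feature match.
    Returns {key: [match indices]} for bins with at least min_votes votes.
    """
    table = defaultdict(list)  # the hash table: pose bin -> indices of voting matches
    for i, (object_id, pose) in enumerate(matches):
        table[(object_id,) + pose_bin(pose)].append(i)
    return {key: members for key, members in table.items()
            if len(members) >= min_votes}
```

The default min_votes=3 mirrors the requirement of at least 3 agreeing features; a pose that falls near a bin boundary can be split across bins in this simplified version.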

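Each cluster of 3 or more features that agree on an object and its pose is then subject to further detailed verification. First, a least-squared estimate is made for an affine approximation to the object pose. Any other image features consistent with this pose are identified, and outliers are discarded. Finally, a detailed computation is made of the probability that a particular set of features indicates the presence of an object, given the accuracy of fit and number of probable false matches. Object matches that pass all these tests can be identified as correct with high confidence.
Each cluster of 3 or more features that agree on an object and its pose then goes through more detailed verification. First, a least-squares estimate gives an affine approximation to the object pose. Any other image features consistent with this pose are identified, and outliers are discarded. Finally, a detailed computation gives the probability that the particular set of features indicates the presence of the object, given the accuracy of the fit and the number of probable false matches. Object matches that pass all of these tests can be accepted as correct with high confidence.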
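The first verification step, a least-squares affine fit, can be sketched like this. It is a minimal illustration under my own assumptions: the affine map is written as u = m1*x + m2*y + tx, v = m3*x + m4*y + ty, the function names are hypothetical, and the normal equations are solved with a small hand-written Gaussian elimination instead of a numerical library.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (for example, collinear points)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_affine(model_points, image_points):
    """Least-squares affine fit from model (x, y) to image (u, v) points.

    Returns ([m1, m2, tx], [m3, m4, ty]). Needs at least 3 matches.
    """
    if len(model_points) < 3:
        raise ValueError("need at least 3 matches")
    rows = [(x, y, 1.0) for x, y in model_points]
    # Normal equations: (A^T A) p = A^T u, with the same A for both output rows.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atu = [sum(r[i] * u for r, (u, _) in zip(rows, image_points)) for i in range(3)]
    atv = [sum(r[i] * v for r, (_, v) in zip(rows, image_points)) for i in range(3)]
    return solve3(ata, atu), solve3(ata, atv)

def residual(params, model_point, image_point):
    """Distance between the predicted and the observed image position."""
    (m1, m2, tx), (m3, m4, ty) = params
    x, y = model_point
    u, v = image_point
    return ((m1 * x + m2 * y + tx - u) ** 2 + (m3 * x + m4 * y + ty - v) ** 2) ** 0.5
```

A match whose residual exceeds some chosen threshold would be treated as an outlier and dropped before refitting; the threshold itself is a design choice this section does not specify.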

Time: 2024-10-12 12:10:47
