Image Retrieval Using Customized Bag of Features

This example shows how to create a Content Based Image Retrieval (CBIR) system using a customized bag-of-features workflow.

Introduction

Content Based Image Retrieval (CBIR) systems are used to find images that are visually similar to a query image. Applications of CBIR can be found in many areas, such as web-based product search, surveillance, and visual place identification. A common technique used to implement a CBIR system is bag of visual words, also known as bag of features [1,2]. Bag of features is a technique adapted to image retrieval from the world of document retrieval. Instead of actual words, bag of features uses image features as the visual words that describe an image.

Image features are an important part of CBIR systems. These image features are used to gauge similarity between images and can include global image features such as color, texture, and shape. Image features can also be local image features such as speeded up robust features (SURF), histograms of oriented gradients (HOG), or local binary patterns (LBP). The benefit of the bag-of-features approach is that the type of features used to create the visual word vocabulary can be customized to fit the application.
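
For example, local SURF features can be detected and extracted with the toolbox as follows (a brief sketch that uses a built-in test image rather than an image from the flower dataset):

% Sketch: detect and extract SURF descriptors from a grayscale image.
I = rgb2gray(imread('peppers.png'));   % any RGB test image
points = detectSURFFeatures(I);        % detect blob-like interest points
[surfFeatures, validPoints] = extractFeatures(I, points);   % SURF descriptors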

The speed and efficiency of image search is also important in CBIR systems. For example, it may be acceptable to perform a brute force search in a small collection of fewer than 100 images, where features from the query image are compared to features from each image in the collection. For larger collections, a brute force search is not feasible and more efficient search techniques must be used. The bag-of-features approach provides a concise encoding scheme to represent a large collection of images using a sparse set of visual word histograms. This enables compact storage and efficient search through an inverted index data structure.
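
As a conceptual illustration only (this is not the toolbox implementation), an inverted index maps each visual word to the images that contain it, so a query only needs to consider images that share at least one word with it. A minimal sketch using hypothetical word IDs:

% Conceptual sketch of an inverted index built from hypothetical word IDs.
wordsPerImage = {[1 3 7], [2 3 9], [1 9 12]};   % visual words found in each image
invertedIndex = containers.Map('KeyType', 'double', 'ValueType', 'any');
for imgID = 1:numel(wordsPerImage)
    for word = wordsPerImage{imgID}
        if isKey(invertedIndex, word)
            invertedIndex(word) = [invertedIndex(word) imgID];
        else
            invertedIndex(word) = imgID;
        end
    end
end
invertedIndex(3)   % returns [1 2], the images that contain visual word 3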

The Computer Vision System Toolbox™ provides a customizable bag-of-features framework to implement an image retrieval system. The following steps outline the procedure:

  1. Select the Image Features for Retrieval
  2. Create a Bag Of Features
  3. Index the Images
  4. Search for Similar Images

In this example, you will go through these steps to create an image retrieval system for searching the 17 Category Flower Dataset [3]. This dataset contains 1360 images of 17 different types of flowers.

Download this dataset for use in the rest of this example.

% Location of the compressed data set
url = 'http://www.robots.ox.ac.uk/~vgg/data/flowers/17/17flowers.tgz';

% Store the output in a temporary folder
outputFolder = fullfile(tempdir, '17Flowers'); % define output folder

Note that downloading the dataset from the web can take a very long time depending on your Internet connection. The commands below will block MATLAB for that period of time. Alternatively, you can use your web browser to first download the set to your local disk. If you choose that route, re-point the 'url' variable above to the file that you downloaded.
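
For example, if you saved the archive locally (a hypothetical path; adjust it to wherever you downloaded the file):

% url = fullfile(tempdir, '17flowers.tgz');  % point to a local copy of the archive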

if ~exist(outputFolder, 'dir') % download only once
    disp('Downloading 17-Category Flower Dataset (58 MB)...');
    untar(url, outputFolder);
end

flowerImageSet = imageDatastore(fullfile(outputFolder,'jpg'),'LabelSource','foldernames');

% Total number of images in the data set
numel(flowerImageSet.Files)
ans =

        1360

Step 1 - Select the Image Features for Retrieval

The type of feature used for retrieval depends on the type of images within the collection. For example, if searching an image collection made up of scenes (beaches, cities, highways), it is preferable to use a global image feature, such as a color histogram that captures the color content of the entire scene. However, if the goal is to find specific objects within the image collections, then local image features extracted around object keypoints are a better choice.
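
A minimal sketch of such a global color feature, assuming an RGB image, is to concatenate per-channel intensity histograms into a single descriptor:

% Sketch of a global color histogram feature (32 bins per channel).
I = imread('peppers.png');   % any RGB image
nbins = 32;
globalColorHist = [imhist(I(:,:,1), nbins); ...
                   imhist(I(:,:,2), nbins); ...
                   imhist(I(:,:,3), nbins)];
globalColorHist = globalColorHist / sum(globalColorHist);   % normalize to unit sum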

Let's start by viewing a few of the images to get an idea of how to approach the problem.

% Display a few of the flower images
helperDisplayImageMontage(flowerImageSet.Files(1:50:1000));

In this example, the goal is to search for similar flowers in the dataset using the color information in the query image. Each image in the dataset contains a single kind of flower. Therefore, a simple image feature based on the spatial layout of color is a good place to start.

The following function describes the algorithm used to extract color features from a given image. This function is used as a custom feature extractor within bagOfFeatures to extract color features.

type exampleBagOfFeaturesColorExtractor.m
function [features, metrics] = exampleBagOfFeaturesColorExtractor(I)
% Example color layout feature extractor. Designed for use with bagOfFeatures.
%
% Local color layout features are extracted from the truecolor image, I, and
% returned in features. The strength of each feature is returned in
% metrics.

[~,~,P] = size(I);

isColorImage = P == 3; 

if isColorImage

    % Convert RGB images to the L*a*b* colorspace. The L*a*b* colorspace
    % enables you to easily quantify the visual differences between colors.
    % Visually similar colors in the L*a*b* colorspace will have small
    % differences in their L*a*b* values.
    Ilab = rgb2lab(I);                                                                             

    % Compute the "average" L*a*b* color within 16-by-16 pixel blocks. The
    % average value is used as the color portion of the image feature. An
    % efficient method to approximate this averaging procedure over
    % 16-by-16 pixel blocks is to reduce the size of the image by a factor
    % of 16 using IMRESIZE.
    Ilab = imresize(Ilab, 1/16);

    % Note, the average pixel value in a block can also be computed using
    % standard block processing or integral images.

    % Reshape L*a*b* image into "number of features"-by-3 matrix.
    [Mr,Nr,~] = size(Ilab);
    colorFeatures = reshape(Ilab, Mr*Nr, []); 

    % L2 normalize color features
    rowNorm = sqrt(sum(colorFeatures.^2,2));
    colorFeatures = bsxfun(@rdivide, colorFeatures, rowNorm + eps);

    % Augment the color feature by appending the [x y] location within the
    % image from which the color feature was extracted. This technique is
    % known as spatial augmentation. Spatial augmentation incorporates the
    % spatial layout of the features within an image as part of the
    % extracted feature vectors. Therefore, for two images to have similar
    % color features, the color and spatial distribution of color must be
    % similar.

    % Normalize pixel coordinates to handle different image sizes.
    xnorm = linspace(-0.5, 0.5, Nr);
    ynorm = linspace(-0.5, 0.5, Mr);
    [x, y] = meshgrid(xnorm, ynorm);

    % Concatenate the spatial locations and color features.
    features = [colorFeatures y(:) x(:)];

    % Use color variance as feature metric.
    metrics  = var(colorFeatures(:,1:3),0,2);
else

    % Return empty features for non-color images. These features are
    % ignored by bagOfFeatures.
    features = zeros(0,5);
    metrics  = zeros(0,1);
end
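
To see what the extractor produces, you can run it on a single image (a quick sketch; each row of features is one 5-dimensional color-plus-location descriptor):

% Quick check of the custom extractor on one image from the dataset.
img = readimage(flowerImageSet, 1);
[colorFeats, featMetrics] = exampleBagOfFeaturesColorExtractor(img);
size(colorFeats)   % number of features by 5 (L*, a*, b*, y, x)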

Step 2 - Create a Bag Of Features

With the feature type defined, the next step is to learn the visual vocabulary within the bagOfFeatures using a set of training images. The code shown below picks a random subset of images from the dataset for training and then trains bagOfFeatures using the 'CustomExtractor' option.

% Pick a random subset of the flower images
% trainingSet = splitEachLabel(flowerImageSet, 0.4, 'randomized');
%
% Create a custom bag of features using the 'CustomExtractor' option
% colorBag = bagOfFeatures(trainingSet, ...
%   'CustomExtractor', @exampleBagOfFeaturesColorExtractor, ...
%   'VocabularySize', 10000);

The previous code is commented out because the training process takes several minutes. The rest of the example uses a pre-trained bagOfFeatures to save time. If you wish to recreate colorBag locally, consider enabling parallel computing to reduce processing time.
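
One way to do that, assuming the Parallel Computing Toolbox is installed, is to pass the 'UseParallel' option when training (a sketch, also commented out):

% Sketch: train with parallel processing enabled (assumes Parallel Computing Toolbox).
% colorBag = bagOfFeatures(trainingSet, ...
%   'CustomExtractor', @exampleBagOfFeaturesColorExtractor, ...
%   'VocabularySize', 10000, 'UseParallel', true);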

% Load pre-trained bagOfFeatures
load('savedColorBagOfFeatures.mat', 'colorBag');

Step 3 - Index the Images

Now that the bagOfFeatures is created, the entire flower image set can be indexed for search. The indexing procedure extracts features from each image using the custom extractor function from step 1. The extracted features are encoded into a visual word histogram and added into the image index.

% Create a search index
% flowerImageIndex = indexImages(flowerImageSet, colorBag, 'SaveFeatureLocations', false);
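
To see what the indexing step computes for a single image, you can call the encode method of bagOfFeatures directly (a sketch, using the colorBag loaded in Step 2):

% Sketch: encode one image into its visual word histogram.
img = readimage(flowerImageSet, 1);
wordHistogram = encode(colorBag, img);   % 1-by-VocabularySize histogram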

Because the indexing step processes thousands of images, the rest of this example uses a saved index to save time. You may recreate the index locally by running the code shown above. Consider enabling parallel computing to reduce processing time.

% Load the pre-saved index
load('savedColorBagOfFeatures.mat', 'flowerImageIndex');

Step 4 - Search for Similar Images

The final step is to use the retrieveImages function to search for similar images.

% Define a query image
queryImage = readimage(flowerImageSet, 502);

figure
imshow(queryImage)

% Search for the top 20 images with similar color content
[imageIDs, scores] = retrieveImages(queryImage, flowerImageIndex);

retrieveImages returns the image IDs and the scores of each result. The scores are sorted from best to worst.
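
Here the 20 best matches are returned; you can request a different number of results with the 'NumResults' option (shown commented out as a sketch):

% [imageIDs, scores] = retrieveImages(queryImage, flowerImageIndex, 'NumResults', 5);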

scores
scores =

    0.9463
    0.2838
    0.2814
    0.2694
    0.2647
    0.2635
    0.2443
    0.2432
    0.2370
    0.2212
    0.2198
    0.2176
    0.2175
    0.2164
    0.2131
    0.2048
    0.2038
    0.2032
    0.2031
    0.2022

The imageIDs correspond to the images within the image set that are similar to the query image.

% Display results using montage. Resize images to thumbnails first.
helperDisplayImageMontage(flowerImageSet.Files(imageIDs))

flowerImageIndex contains several index statistics that are relevant to search. One of these is the 'WordFrequency' property, which contains the percentage of images in which each visual word occurs. It shows which words are common and which are rare across the entire dataset. It is often helpful to suppress the most common words, because they do not help reduce the search set when looking for the most relevant images. Conversely, it is also helpful to suppress very rare words, because they may come from outliers in the image set. You can control how much the upper and lower ends of the visual word distribution affect the search results by tuning the 'WordFrequencyRange' property. A good way to set this value is to plot the sorted WordFrequency values.

figure
plot(sort(flowerImageIndex.WordFrequency))

The plot shows that the upper end of the distribution is around 45%, meaning that even the most common visual words occur in only about 45% of the images. If the upper part of the distribution were at 80% or 90%, it would be best to exclude those words via the 'WordFrequencyRange' property and run the search again. For example, let's lower the upper range to 20% and check the effect on the search results.

% Lower WordFrequencyRange
flowerImageIndex.WordFrequencyRange = [0.01 0.2];

% Re-run retrieval
[imageIDs, scores] = retrieveImages(queryImage, flowerImageIndex);

% Show results
helperDisplayImageMontage(flowerImageSet.Files(imageIDs))

In this case, because the most common visual words occur in only about 45% of the images, lowering the upper range to 20% removes many of the words that relevant matches share with the query. The poorer search results confirm this.
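
To undo this change before further searches, reset 'WordFrequencyRange' to a wider range (the value shown is illustrative, not necessarily the index's original setting):

% Restore a wider word frequency range before searching again.
flowerImageIndex.WordFrequencyRange = [0.01 0.9];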

Conclusion

This example showed you how to customize the bagOfFeatures and how to use indexImages and retrieveImages to create an image retrieval system based on color features. The techniques shown here may be extended to other feature types by further customizing the features used within bagOfFeatures.

References

[1] Sivic, J. and Zisserman, A.: Video Google: A Text Retrieval Approach to Object Matching in Videos. In: ICCV (2003) 1470-1477

[2] Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A.: Object Retrieval with Large Vocabularies and Fast Spatial Matching. In: CVPR (2007)

[3] Nilsback, M-E. and Zisserman, A.: A Visual Vocabulary for Flower Classification. In: CVPR (2006)
