GMM + Kalman Filter + Blob Analysis for Object Tracking

Reposted from http://www.cnblogs.com/YangQiaoblog/p/5462453.html


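I recently studied multi-object tracking and read the MathWorks documentation on Motion-Based Multiple Object Tracking.

Official link: http://cn.mathworks.com/help/vision/examples/motion-based-multiple-object-tracking.html?s_tid=gn_loc_drop

The program comes from MATLAB's Computer Vision System Toolbox. The method detects and tracks multiple objects against a static background.

The program has two parts: 1. detect the moving objects in each frame;

2. associate, in real time, the detected regions that belong to the same object.

The detection part uses background subtraction based on a Gaussian mixture model (GMM).

Reference: http://blog.pluskid.org/?p=39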

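A single Gaussian model classifies patterns using the density of a multivariate Gaussian distribution:

N(x; μ, Σ) = (2π)^(-d/2) |Σ|^(-1/2) exp( -(1/2) (x-μ)^T Σ^(-1) (x-μ) )

Here μ is replaced by the mean of the training samples, Σ by the sample covariance, and x is a d-dimensional sample vector. Evaluating this density for a sample indicates how likely the sample is to belong to class C (the positive or the negative class).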
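As a minimal sketch of the single Gaussian model, the snippet below estimates μ and Σ from a handful of made-up 2-D samples and evaluates the density at two query points. The data and the names `gaussPdf`, `pNear`, and `pFar` are illustrative assumptions, not part of the MathWorks demo; only base MATLAB functions are used.

```matlab
% Toy 2-D training samples, one per row (made-up numbers).
X = [1.0 2.0; 1.2 1.8; 0.8 2.2; 1.1 2.1; 0.9 1.9];

mu    = mean(X, 1);   % sample mean, 1-by-d
Sigma = cov(X);       % sample covariance, d-by-d
d     = size(X, 2);

% Multivariate Gaussian density at a query point x (row vector).
gaussPdf = @(x) exp(-0.5 * (x - mu) / Sigma * (x - mu)') / ...
    sqrt((2*pi)^d * det(Sigma));

pNear = gaussPdf([1.0 2.0]);   % query at the sample mean
pFar  = gaussPdf([3.0 0.0]);   % query far from the training data
fprintf('density near mean: %g, far away: %g\n', pNear, pFar);
```

A sample close to the training data gets a much higher density than one far away, which is the basis for deciding which class it belongs to.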

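A Gaussian mixture model (GMM) assumes the data are generated by several Gaussian distributions. Each GMM is a linear superposition of K Gaussians:

p(x) = Σ_k p(k) p(x|k),  k = 1..K

That is, the component densities are combined with weights p(k). The larger a component's weight, the more likely it is, a priori, that a sample was generated by that component.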
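The weighted sum can be sketched for a 1-D mixture as follows. The parameters are made up for illustration; the posterior p(k|x) is added to show that which component "owns" a sample depends on both the weight and the component density, not on the weight alone.

```matlab
% Hypothetical 1-D mixture with K = 3 components (made-up parameters).
w     = [0.5 0.3 0.2];     % mixing weights p(k), sum to 1
mu    = [0.0 4.0 9.0];     % component means
sigma = [1.0 0.5 2.0];     % component standard deviations

x = 3.8;                   % sample to evaluate

% p(x|k) for every component, then the weighted sum p(x).
pxk = exp(-0.5 * ((x - mu) ./ sigma).^2) ./ (sqrt(2*pi) * sigma);
px  = sum(w .* pxk);

% Posterior p(k|x): how much each component explains this sample.
post = (w .* pxk) / px;
[~, kBest] = max(post);
fprintf('p(x) = %.4f, most likely component: %d\n', px, kBest);
```

Here the second component wins even though the first has the largest weight, because the sample lies close to the second component's mean.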

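In practice the data are known and the GMM parameters must be estimated from them; here that means training a suitable GMM on the video frames.

For foreground detection, a number of frames of the static background (about 50; the demo below trains on the first 40) are used to estimate the GMM parameters, i.e. to build the background model. The official example sets the classification threshold to 0.7; empirically it can be tuned between 0.7 and 0.75. This step separates foreground from background and outputs a binary foreground mask.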
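To give a feel for what the threshold does, here is a sketch of the background-selection rule from the Stauffer-Grimson mixture model. It is an assumption that the toolbox detector applies its threshold in exactly this way; the weights and standard deviations are made up. Components are ranked by weight over standard deviation, and the smallest leading group whose cumulative weight exceeds the threshold is treated as background.

```matlab
% Hypothetical per-pixel mixture after training (made-up numbers).
w     = [0.15 0.60 0.25];   % component weights, sum to 1
sigma = [12.0  3.0  5.0];   % component standard deviations
T     = 0.7;                % background ratio threshold

% Rank components: high weight and low spread first.
[~, order] = sort(w ./ sigma, 'descend');

% Keep the smallest prefix whose cumulative weight exceeds T.
cumW = cumsum(w(order));
B    = find(cumW > T, 1, 'first');
backgroundComponents = order(1:B);
disp(backgroundComponents);
```

Under this rule a higher threshold admits more components into the background model, so multimodal backgrounds are tolerated better, at the risk of absorbing slow-moving foreground.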

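Morphological operations are then applied to the mask, and a function (blob analysis in the demo) returns the centroids and bboxes of the moving regions, which completes the foreground detection.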
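The two outputs can be illustrated on a toy mask using only base MATLAB. This sketch assumes the mask contains a single blob; real blob analysis first labels the connected components and then computes these quantities per component. The bbox layout [x y width height] with (x, y) at the upper-left corner is the convention assumed here.

```matlab
% Toy 6-by-8 binary mask containing one blob (made-up data).
mask = false(6, 8);
mask(2:4, 3:7) = true;

[rows, cols] = find(mask);

% Centroid as [x y] and bounding box as [x y width height],
% where (x, y) is the upper-left corner in pixel coordinates.
centroid = [mean(cols), mean(rows)];
bbox = [min(cols), min(rows), ...
        max(cols) - min(cols) + 1, max(rows) - min(rows) + 1];

fprintf('centroid = [%g %g], bbox = [%d %d %d %d]\n', centroid, bbox);
```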

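The tracking part uses the Kalman filter, a linear estimation algorithm that can relate the bboxes of an object from frame to frame.

Tracking distinguishes five situations: 1. a new object appears; 2. an object is matched; 3. an object is occluded; 4. objects split; 5. an object disappears.

I won't reproduce the theory of the Kalman filter here; there is plenty of material online.

State equation: X(k+1) = A(k+1,k) X(k) + w(k), where X(k) = [x(k), y(k), w(k), h(k), v(k)]; x, y, w, h are the horizontal and vertical coordinates, width, and height of the bbox. (Inside the state vector, w and v are state components, not the noise terms of the two equations; the source does not define the component v, which presumably stands for velocity.)

Observation equation: Z(k) = H(k) X(k) + v(k), where the noise terms w(k) and v(k) are uncorrelated Gaussian white noise.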
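For a concrete instance of the two equations, the sketch below uses a constant-velocity model on the centroid only, with state [x; y; vx; vy]. That state layout is an assumption chosen for illustration and is simpler than the five-component vector above; the noise terms are left out so the effect of A and H is easy to see.

```matlab
dt = 1;                      % one frame per step (assumed)

% State X = [x; y; vx; vy]: centroid position and velocity.
A = [1 0 dt 0;
     0 1 0 dt;
     0 0 1  0;
     0 0 0  1];

% Only the centroid position is observed.
H = [1 0 0 0;
     0 1 0 0];

X = [100; 50; 2; -1];        % made-up initial state
for k = 1:3
    X = A * X;               % noise-free state equation
end
Z = H * X;                   % noise-free observation

fprintf('state after 3 frames: [%g %g %g %g], observed: [%g %g]\n', X, Z);
```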

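With the observation and state equations defined, the Kalman filter can track the moving objects as follows:

1) Compute the features of each moving object (its centroid and bounding rectangle).

2) Initialize the Kalman filter with these features (it can be initialized to 0 at the start).

3) Use the Kalman filter to predict the corresponding object region in the next frame; when the next frame arrives, search for a matching object within the predicted region.

4) If the match succeeds, update the Kalman filter.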
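The steps above can be sketched as a predict/correct loop in base MATLAB. The constant-velocity state [x; y; vx; vy], the covariances Q, R, P and the detections in Z are all assumptions made for illustration; a NaN row stands for a frame in which the object was not matched, so only the prediction is kept.

```matlab
dt = 1;
A = [1 0 dt 0; 0 1 0 dt; 0 0 1 0; 0 0 0 1];   % constant-velocity model
H = [1 0 0 0; 0 1 0 0];                       % observe the centroid only
Q = 0.01 * eye(4);    % process noise covariance (assumed)
R = 4 * eye(2);       % measurement noise covariance (assumed)

% Made-up centroid detections, one row per frame; NaN = no match.
Z = [10 20; 12 21; 14 22; NaN NaN; 18 24];

% Steps 1-2: initialize the filter from the first detection.
x = [Z(1, :)'; 0; 0];
P = 100 * eye(4);

for k = 2:size(Z, 1)
    % Step 3: predict the state in the new frame.
    x = A * x;
    P = A * P * A' + Q;

    % Step 4: correct only if a detection was matched.
    if ~any(isnan(Z(k, :)))
        K = P * H' / (H * P * H' + R);        % Kalman gain
        x = x + K * (Z(k, :)' - H * x);
        P = (eye(4) - K * H) * P;
    end
    fprintf('frame %d: position [%.2f %.2f]\n', k, x(1), x(2));
end
```

During the missed frame the estimate simply coasts along the learned velocity, which is what lets a track survive a short occlusion.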

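Matching uses the Hungarian algorithm, which is introduced well here: http://blog.csdn.net/pi9nc/article/details/11848327

Here the Hungarian algorithm assigns the moving objects detected in the new frame to the corresponding tracks. The assignment minimizes the sum of the Euclidean distances between the centroids predicted by the Kalman filter and the detected centroids (the |distance| method used in the demo also takes the confidence of the prediction into account).

It can be split into two steps:

1. Compute the cost matrix of size [M N], where M is the number of tracks and N is the number of detected moving objects.

2. Solve the assignment problem defined by the cost matrix.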
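The two steps can be sketched on a tiny example. Note the swap: instead of the Hungarian algorithm, step 2 below brute-forces every permutation, which finds the same minimum-cost one-to-one assignment but is only practical for a handful of objects. It also assumes M = N and ignores unassigned tracks and detections. All coordinates are made up.

```matlab
% Made-up predicted track centroids (M = 3) and detections (N = 3), [x y].
predicted  = [10 10; 50 40; 80 15];
detections = [52 41; 11  9; 78 18];

M = size(predicted, 1);
N = size(detections, 1);

% Step 1: M-by-N cost matrix of Euclidean distances.
cost = zeros(M, N);
for i = 1:M
    diffs = detections - repmat(predicted(i, :), N, 1);
    cost(i, :) = sqrt(sum(diffs.^2, 2))';
end

% Step 2: choose the one-to-one assignment with the smallest total cost.
% Brute force over all permutations (stand-in for the Hungarian algorithm).
P = perms(1:N);
best = inf;
for p = 1:size(P, 1)
    total = sum(cost(sub2ind([M N], 1:M, P(p, :))));
    if total < best
        best = total;
        assignment = P(p, :);   % assignment(i) = detection index for track i
    end
end
disp(assignment);
```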

That is the main idea. The MATLAB demo follows; try running it.

function multiObjectTracking()

% create system objects used for reading video, detecting moving objects,
% and displaying the results
obj = setupSystemObjects(); % initialize the reader, players, detector, and blob analyser
tracks = initializeTracks(); % create an empty array of tracks

nextId = 1; % ID of the next track

% detect moving objects, and track them across video frames
while ~isDone(obj.reader)
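    frame = readFrame();  % read one frame
    [centroids, bboxes, mask] = detectObjects(frame); % foreground detection
    predictNewLocationsOfTracks();  % Kalman prediction of each track's location
    [assignments, unassignedTracks, unassignedDetections] = ...
        detectionToTrackAssignment(); % match detections to tracks (Hungarian algorithm)

    updateAssignedTracks(); % update tracks that were matched to a detection
    updateUnassignedTracks(); % update tracks that were not matched
    deleteLostTracks(); % delete lost tracks
    createNewTracks(); % create new tracks from unmatched detections

    displayTrackingResults(); % display the results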
end

%% Create System Objects
% Create System objects used for reading the video frames, detecting
% foreground objects, and displaying results.

    function obj = setupSystemObjects()
        % Initialize Video I/O
        % Create objects for reading a video from a file, drawing the tracked
        % objects in each frame, and playing the video.

        % create a video file reader
        obj.reader = vision.VideoFileReader('atrium.avi');         % read the input video

        % create two video players, one to display the video,
        % and one to display the foreground mask
        obj.videoPlayer = vision.VideoPlayer('Position', [20, 400, 700, 400]);   % create the two player windows
        obj.maskPlayer = vision.VideoPlayer('Position', [740, 400, 700, 400]);

        % Create system objects for foreground detection and blob analysis

        % The foreground detector is used to segment moving objects from
        % the background. It outputs a binary mask, where the pixel value
        % of 1 corresponds to the foreground and the value of 0 corresponds
        % to the background. 

        obj.detector = vision.ForegroundDetector('NumGaussians', 3, ...   % GMM foreground detection: 3 Gaussians, first 40 frames train the background, threshold 0.7
            'NumTrainingFrames', 40, 'MinimumBackgroundRatio', 0.7);

        % Connected groups of foreground pixels are likely to correspond to moving
        % objects.  The blob analysis system object is used to find such groups
        % (called 'blobs' or 'connected components'), and compute their
        % characteristics, such as area, centroid, and the bounding box.

        obj.blobAnalyser = vision.BlobAnalysis('BoundingBoxOutputPort', true, ...  % output areas, centroids, and bounding boxes
            'AreaOutputPort', true, 'CentroidOutputPort', true, ...
            'MinimumBlobArea', 400);
    end

%% Initialize Tracks
% The |initializeTracks| function creates an array of tracks, where each
% track is a structure representing a moving object in the video. The
% purpose of the structure is to maintain the state of a tracked object.
% The state consists of information used for detection to track assignment,
% track termination, and display.
%
% The structure contains the following fields:
%
% * |id| :                  the integer ID of the track
% * |bbox| :                the current bounding box of the object; used
%                           for display
% * |kalmanFilter| :        a Kalman filter object used for motion-based
%                           tracking
% * |age| :                 the number of frames since the track was first
%                           detected
% * |totalVisibleCount| :   the total number of frames in which the track
%                           was detected (visible)
% * |consecutiveInvisibleCount| : the number of consecutive frames for
%                                  which the track was not detected (invisible).
%
% Noisy detections tend to result in short-lived tracks. For this reason,
% the example only displays an object after it was tracked for some number
% of frames. This happens when |totalVisibleCount| exceeds a specified
% threshold.
%
% When no detections are associated with a track for several consecutive
% frames, the example assumes that the object has left the field of view
% and deletes the track. This happens when |consecutiveInvisibleCount|
% exceeds a specified threshold. A track may also get deleted as noise if
% it was tracked for a short time, and marked invisible for most of the
% frames.

    function tracks = initializeTracks()
        % create an empty array of tracks
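        tracks = struct(...
            'id', {}, ...  % track ID
            'bbox', {}, ... % bounding box
            'kalmanFilter', {}, ... % Kalman filter of this track
            'age', {}, ... % frames since the track was first detected
            'totalVisibleCount', {}, ... % total frames in which the track was detected
            'consecutiveInvisibleCount', {}); % consecutive frames without a detection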
    end

%% Read a Video Frame
% Read the next video frame from the video file.
    function frame = readFrame()
        frame = obj.reader.step(); % read the next frame from the video reader
    end

%% Detect Objects
% The |detectObjects| function returns the centroids and the bounding boxes
% of the detected objects. It also returns the binary mask, which has the
% same size as the input frame. Pixels with a value of 1 correspond to the
% foreground, and pixels with a value of 0 correspond to the background.
%
% The function performs motion segmentation using the foreground detector.
% It then performs morphological operations on the resulting binary mask to
% remove noisy pixels and to fill the holes in the remaining blobs.  

    function [centroids, bboxes, mask] = detectObjects(frame)

        % detect foreground
        mask = obj.detector.step(frame);

        % apply morphological operations to remove noise and fill in holes
        mask = imopen(mask, strel('rectangle', [3,3])); % opening: remove small noise
        mask = imclose(mask, strel('rectangle', [15, 15])); % closing: bridge small gaps
        mask = imfill(mask, 'holes'); % fill holes

        % perform blob analysis to find connected components
        [~, centroids, bboxes] = obj.blobAnalyser.step(mask);
    end

%% Predict New Locations of Existing Tracks
% Use the Kalman filter to predict the centroid of each track in the
% current frame, and update its bounding box accordingly.

    function predictNewLocationsOfTracks()
        for i = 1:length(tracks)
            bbox = tracks(i).bbox;

            % predict the current location of the track
            predictedCentroid = predict(tracks(i).kalmanFilter); % predict the current location from the track's history

            % shift the bounding box so that its center is at
            % the predicted location
            predictedCentroid = int32(predictedCentroid) - bbox(3:4) / 2;
            tracks(i).bbox = [predictedCentroid, bbox(3:4)]; % bounding box at the predicted location
        end
    end

%% Assign Detections to Tracks
% Assigning object detections in the current frame to existing tracks is
% done by minimizing cost. The cost is defined as the negative
% log-likelihood of a detection corresponding to a track.
%
% The algorithm involves two steps:
%
% Step 1: Compute the cost of assigning every detection to each track using
% the |distance| method of the |vision.KalmanFilter| System object. The
% cost takes into account the Euclidean distance between the predicted
% centroid of the track and the centroid of the detection. It also includes
% the confidence of the prediction, which is maintained by the Kalman
% filter. The results are stored in an MxN matrix, where M is the number of
% tracks, and N is the number of detections.
%
% Step 2: Solve the assignment problem represented by the cost matrix using
% the |assignDetectionsToTracks| function. The function takes the cost
% matrix and the cost of not assigning any detections to a track.
%
% The value for the cost of not assigning a detection to a track depends on
% the range of values returned by the |distance| method of the
% |vision.KalmanFilter|. This value must be tuned experimentally. Setting
% it too low increases the likelihood of creating a new track, and may
% result in track fragmentation. Setting it too high may result in a single
% track corresponding to a series of separate moving objects.
%
% The |assignDetectionsToTracks| function uses the Munkres' version of the
% Hungarian algorithm to compute an assignment which minimizes the total
% cost. It returns an M x 2 matrix containing the corresponding indices of
% assigned tracks and detections in its two columns. It also returns the
% indices of tracks and detections that remained unassigned. 

    function [assignments, unassignedTracks, unassignedDetections] = ...
            detectionToTrackAssignment()

        nTracks = length(tracks);
        nDetections = size(centroids, 1);

        % compute the cost of assigning each detection to each track
        cost = zeros(nTracks, nDetections);
        for i = 1:nTracks
            cost(i, :) = distance(tracks(i).kalmanFilter, centroids); % fill row i of the cost matrix
        end

        % solve the assignment problem
        costOfNonAssignment = 20;
        [assignments, unassignedTracks, unassignedDetections] = ...
            assignDetectionsToTracks(cost, costOfNonAssignment); % Hungarian (Munkres) assignment
    end

%% Update Assigned Tracks
% The |updateAssignedTracks| function updates each assigned track with the
% corresponding detection. It calls the |correct| method of
% |vision.KalmanFilter| to correct the location estimate. Next, it stores
% the new bounding box, and increases the age of the track and the total
% visible count by 1. Finally, the function sets the invisible count to 0. 

    function updateAssignedTracks()
        numAssignedTracks = size(assignments, 1);
        for i = 1:numAssignedTracks
            trackIdx = assignments(i, 1);
            detectionIdx = assignments(i, 2);
            centroid = centroids(detectionIdx, :);
            bbox = bboxes(detectionIdx, :);

            % correct the estimate of the object's location
            % using the new detection
            correct(tracks(trackIdx).kalmanFilter, centroid);

            % replace predicted bounding box with detected
            % bounding box
            tracks(trackIdx).bbox = bbox;

            % update track's age
            tracks(trackIdx).age = tracks(trackIdx).age + 1;

            % update visibility
            tracks(trackIdx).totalVisibleCount = ...
                tracks(trackIdx).totalVisibleCount + 1;
            tracks(trackIdx).consecutiveInvisibleCount = 0;
        end
    end

%% Update Unassigned Tracks
% Mark each unassigned track as invisible, and increase its age by 1.

    function updateUnassignedTracks()
        for i = 1:length(unassignedTracks)
            ind = unassignedTracks(i);
            tracks(ind).age = tracks(ind).age + 1;
            tracks(ind).consecutiveInvisibleCount = ...
                tracks(ind).consecutiveInvisibleCount + 1;
        end
    end

%% Delete Lost Tracks
% The |deleteLostTracks| function deletes tracks that have been invisible
% for too many consecutive frames. It also deletes recently created tracks
% that have been invisible for too many frames overall. 

    function deleteLostTracks()
        if isempty(tracks)
            return;
        end

        invisibleForTooLong = 10;
        ageThreshold = 8;

        % compute the fraction of the track's age for which it was visible
        ages = [tracks(:).age];
        totalVisibleCounts = [tracks(:).totalVisibleCount];
        visibility = totalVisibleCounts ./ ages;

        % find the indices of 'lost' tracks
        lostInds = (ages < ageThreshold & visibility < 0.6) | ...
            [tracks(:).consecutiveInvisibleCount] >= invisibleForTooLong;

        % delete lost tracks
        tracks = tracks(~lostInds);
    end

%% Create New Tracks
% Create new tracks from unassigned detections. Assume that any unassigned
% detection is a start of a new track. In practice, you can use other cues
% to eliminate noisy detections, such as size, location, or appearance.

    function createNewTracks()
        centroids = centroids(unassignedDetections, :);
        bboxes = bboxes(unassignedDetections, :);

        for i = 1:size(centroids, 1)

            centroid = centroids(i,:);
            bbox = bboxes(i, :);

            % create a Kalman filter object
            kalmanFilter = configureKalmanFilter('ConstantVelocity', ...
                centroid, [200, 50], [100, 25], 100);

            % create a new track
            newTrack = struct(...
                'id', nextId, ...
                'bbox', bbox, ...
                'kalmanFilter', kalmanFilter, ...
                'age', 1, ...
                'totalVisibleCount', 1, ...
                'consecutiveInvisibleCount', 0);

            % add it to the array of tracks
            tracks(end + 1) = newTrack;

            % increment the next id
            nextId = nextId + 1;
        end
    end

%% Display Tracking Results
% The |displayTrackingResults| function draws a bounding box and label ID
% for each track on the video frame and the foreground mask. It then
% displays the frame and the mask in their respective video players. 

    function displayTrackingResults()
        % convert the frame and the mask to uint8 RGB
        frame = im2uint8(frame);
        mask = uint8(repmat(mask, [1, 1, 3])) .* 255;

        minVisibleCount = 8;
        if ~isempty(tracks)

            % noisy detections tend to result in short-lived tracks
            % only display tracks that have been visible for more than
            % a minimum number of frames.
            reliableTrackInds = ...
                [tracks(:).totalVisibleCount] > minVisibleCount;
            reliableTracks = tracks(reliableTrackInds);

            % display the objects. If an object has not been detected
            % in this frame, display its predicted bounding box.
            if ~isempty(reliableTracks)
                % get bounding boxes
                bboxes = cat(1, reliableTracks.bbox);

                % get ids
                ids = int32([reliableTracks(:).id]);

                % create labels for objects indicating the ones for
                % which we display the predicted rather than the actual
                % location
                labels = cellstr(int2str(ids'));
                predictedTrackInds = ...
                    [reliableTracks(:).consecutiveInvisibleCount] > 0;
                isPredicted = cell(size(labels));
                isPredicted(predictedTrackInds) = {' predicted'};
                labels = strcat(labels, isPredicted);

                % draw on the frame
                frame = insertObjectAnnotation(frame, 'rectangle', ...
                    bboxes, labels);

                % draw on the mask
                mask = insertObjectAnnotation(mask, 'rectangle', ...
                    bboxes, labels);
                    bboxes, labels);
            end
        end

        % display the mask and the frame
        obj.maskPlayer.step(mask);
        obj.videoPlayer.step(frame);
    end

%% Summary
% This example created a motion-based system for detecting and
% tracking multiple moving objects. Try using a different video to see if
% you are able to detect and track objects. Try modifying the parameters
% for the detection, assignment, and deletion steps.
%
% The tracking in this example was solely based on motion with the
% assumption that all objects move in a straight line with constant speed.
% When the motion of an object significantly deviates from this model, the
% example may produce tracking errors. Notice the mistake in tracking the
% person labeled #12, when he is occluded by the tree.
%
% The likelihood of tracking errors can be reduced by using a more complex
% motion model, such as constant acceleration, or by using multiple Kalman
% filters for every object. Also, you can incorporate other cues for
% associating detections over time, such as size, shape, and color. 

displayEndOfDemoMessage(mfilename)
end
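
To see the deletion rule of |deleteLostTracks| in isolation, the sketch below applies the same thresholds and the same expression to three made-up tracks. Only the bookkeeping fields are kept; the track values are illustrative assumptions.

```matlab
% Made-up track bookkeeping; field names mirror the demo's track structure.
tracks = struct( ...
    'id', {1, 2, 3}, ...
    'age', {30, 5, 20}, ...
    'totalVisibleCount', {28, 2, 9}, ...
    'consecutiveInvisibleCount', {0, 1, 11});

invisibleForTooLong = 10;
ageThreshold = 8;

% Fraction of each track's age for which it was visible.
ages = [tracks(:).age];
visibility = [tracks(:).totalVisibleCount] ./ ages;

% A track is lost if it is young and mostly invisible,
% or if it has been invisible for too many consecutive frames.
lostInds = (ages < ageThreshold & visibility < 0.6) | ...
    [tracks(:).consecutiveInvisibleCount] >= invisibleForTooLong;

keptIds = [tracks(~lostInds).id];
disp(keptIds);
```

Track 2 is dropped as noise (young and visible in only 40% of its frames), track 3 is dropped because it went unseen for 11 consecutive frames, and track 1 survives.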
