paper reading in 1/1/2016~1/3/2016

CVPR15: Person Count Localization in Videos from Noisy Foreground and Detections

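The paper's main contribution is defining the person count localization problem and the setting around it. That said, although the usual evaluation for the earlier person counting problem only looks at the final counts, earlier papers did not all output only global counts. This paper probably just puts more weight on localization and really does use that information to solve the problem, but that requires drawing a distinction, i.e., defining the problem properly first.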

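The person count localization problem posed in the paper is: given a video sequence, output the detections in every frame together with their counts, as shown in the figure below: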

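After reading it I quickly realized that the problem is really solved with an MOT (multi-object tracking) approach. [The paper says this is a "middle-ground between frame-level person counting and person detection", but to me it feels more like a coarse MOT; at least the Error driven Graph Revision part follows the same idea.] I wonder how running MOT directly on this data would do~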

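Actually, when I was thinking about methods at the start of my survey I did consider this, but for some reason I was very averse to solving person counting with tracking by detection; maybe I felt that if you are going to do tracking, you should just focus on tracking~ My earlier code tests have been mercilessly beating down my confidence in the state of the art, but I am still hopeful~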

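Back to the point. The method first runs a person detector and foreground segmentation over the video; the two results complement each other. Then, "using both the person detector's results and the relatively larger connected components from the foreground segmentation", it builds the following flow graph: u is a detection and e is an edge linking detections. To make it look more like a graph, each u is split into two parts so that there is also an edge e between them. The graph satisfies the usual construction constraints: every edge carries a flow x, and at each node inflow equals outflow.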
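A minimal Python sketch of the node-splitting idea described above, not the paper's actual construction. The `SOURCE`/`SINK` nodes, the function names, and the toy detections `a`, `b`, `c` are all assumptions made for illustration; the notes only state that each u is split in two, that edges carry a flow x, and that inflow equals outflow.

```python
from collections import defaultdict

SOURCE, SINK = "S", "T"  # assumed entry/exit nodes, not taken from the paper


def build_flow_graph(detections, links):
    """Return the directed edges of a node-split flow graph.

    detections: iterable of detection ids (hypothetical names).
    links: iterable of (u, v) pairs linking detection u to detection v.
    """
    edges = []
    for u in detections:
        u_in, u_out = (u, "in"), (u, "out")
        edges.append((SOURCE, u_in))   # a person may appear at u
        edges.append((u_in, u_out))    # internal edge: its flow is the count at u
        edges.append((u_out, SINK))    # a person may disappear after u
    for u, v in links:
        edges.append(((u, "out"), (v, "in")))  # transition between detections
    return edges


def conserves_flow(edges, flow):
    """Check that inflow equals outflow at every node except SOURCE and SINK."""
    balance = defaultdict(int)
    for tail, head in edges:
        x = flow.get((tail, head), 0)  # edges missing from `flow` carry 0
        balance[tail] -= x
        balance[head] += x
    return all(b == 0 for node, b in balance.items()
               if node not in (SOURCE, SINK))


# Toy example: one person moves from detection a to detection b; c is unused.
edges = build_flow_graph(["a", "b", "c"], [("a", "b")])
flow = {
    (SOURCE, ("a", "in")): 1,
    (("a", "in"), ("a", "out")): 1,
    (("a", "out"), ("b", "in")): 1,
    (("b", "in"), ("b", "out")): 1,
    (("b", "out"), SINK): 1,
}
print(conserves_flow(edges, flow))
```

Splitting u into an "in" and an "out" half turns the count at a detection into the flow on an ordinary edge, so node counts and transitions can be handled by the same variables.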

The NP-complete problem is then turned into solving an ILP (integer linear program):
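For orientation only, a generic integer flow program over such a graph can be written as below. This is a textbook form, not the formula from the paper: the per-edge cost c_e and the source/sink nodes s, t are assumptions, and only the flow variables x and the conservation constraint come from the description above.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{e \in E} c_e \, x_e \\
\text{s.t.}\quad & \sum_{e \in \mathrm{in}(v)} x_e = \sum_{e \in \mathrm{out}(v)} x_e
  \quad \forall v \in V \setminus \{s, t\}, \\
& x_e \in \mathbb{Z}_{\ge 0} \quad \forall e \in E .
\end{aligned}
```

Under this reading, the flow on the internal edge of a split node u would be the person count assigned to that detection.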

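I'll skip the intermediate steps for now. There are a few places in the paper I have not understood yet, e.g., why is the part in red summed?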

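Finally, following the MOT idea, the method adds edges, adds nodes, or uses a tracker to fill in missing detections. Which of the three operations to apply is decided by a trained random forest classifier (details skipped~). The original flow graph changes as a result, and the two steps are repeated until the stopping condition of the iteration is met~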
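The alternation between solving and revising can be sketched as a small loop. Everything here is a hypothetical placeholder: `solve`, `find_errors`, `choose_op` (standing in for the random forest classifier) and the `ops` table are invented for illustration, and "stop when no errors remain or after `max_iters` rounds" is an assumed stopping condition, since the notes do not state the real one.

```python
def revise_until_stable(graph, solve, find_errors, choose_op, ops, max_iters=10):
    """Alternate between solving for flows and revising the graph."""
    solution = solve(graph)
    for _ in range(max_iters):
        errors = find_errors(graph, solution)
        if not errors:                      # assumed stopping condition
            break
        for err in errors:
            op_name = choose_op(err)        # stands in for the classifier
            graph = ops[op_name](graph, err)
        solution = solve(graph)             # re-solve on the revised graph
    return graph, solution


# Toy stand-ins so the loop can run end to end.
target_nodes = {"a", "b", "c"}
target_edges = {("a", "b")}


def solve(graph):
    return len(graph["edges"])              # placeholder for the ILP solver


def find_errors(graph, solution):
    missing_nodes = sorted(target_nodes - graph["nodes"])
    missing_edges = sorted(target_edges - graph["edges"])
    return missing_nodes + missing_edges


def choose_op(err):
    return "add_edge" if isinstance(err, tuple) else "add_node"


ops = {
    "add_node": lambda g, n: {"nodes": g["nodes"] | {n}, "edges": g["edges"]},
    "add_edge": lambda g, e: {"nodes": g["nodes"], "edges": g["edges"] | {e}},
}

graph, solution = revise_until_stable(
    {"nodes": {"a", "b"}, "edges": set()}, solve, find_errors, choose_op, ops
)
print(sorted(graph["nodes"]), sorted(graph["edges"]), solution)
```

The third operation (running a tracker to recover a missing detection) would be one more entry in `ops`; it is left out of the toy example to keep it short.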

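For evaluation, because the method adds location information, the traditional metrics no longer apply, so the authors propose their own evaluation method; nothing much to say there~ But this brings back the question from earlier, i.e., the distinction mentioned above: what if the results were compared against MOT results?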

Time: 2024-12-20 23:23:03
