On Selecting Negative Samples for AdaBoost Face Detection

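I do not know how many negative samples other face detection methods need, but AdaBoost needs a very large number. In the later training stages, when the maximum false alarm rate is very low, say 2*10^-5, and 4,000 negative samples take part in training, about 4,000 / (2*10^-5) = 2*10^8 candidate negatives are required. The solution is bootstrapping. For bootstrapping to work, the most important point is that the negative source images themselves are large enough, for example 1000*1000 pixels. Different application scenarios also call for different negatives. For face detection inside a car, for example, the negatives must reflect the non-face content of a car interior; lakes, blue sky and the like add no discriminative value. The Q&A below makes the same points.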
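The arithmetic above can be checked with a short sketch. The function names are illustrative and not part of any library, and the single-scale, stride-1 window count is an assumption made only to give a sense of scale.

```python
def windows_to_scan(negatives_per_stage: int, false_alarm_rate: float) -> float:
    """Windows that must be scanned so that `negatives_per_stage` of them
    are still (wrongly) accepted by the cascade trained so far."""
    return negatives_per_stage / false_alarm_rate


def windows_per_image(width: int, height: int, window: int = 20, stride: int = 1) -> int:
    """Sliding-window positions of one size in a face-free image (assumed single scale)."""
    cols = (width - window) // stride + 1
    rows = (height - window) // stride + 1
    return cols * rows


needed = windows_to_scan(4000, 2e-5)        # about 2 * 10^8
per_image = windows_per_image(1000, 1000)   # 981 * 981 positions
print(f"windows to scan: {needed:.3g}")
print(f"windows per 1000x1000 image: {per_image}")
print(f"images needed at one scale: {needed / per_image:.0f}")
```

Under these assumptions a few hundred large face-free images already supply enough windows, which is why the size of each negative image matters more than the number of files.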

Viola-Jones' AdaBoost method is very popular for face detection. We need lots of positive and negative samples to train a face detector.

The rule for collecting positive samples is simple: images that contain faces. But the rule for collecting negative samples is not very clear: images that do not contain faces.

But there are so many scenes that do not contain faces (sky, rivers, houses, animals, etc.). Which ones should I collect? How can I know that I have collected enough negative samples?

One suggested idea for negative samples: take the positive samples, crop out the face region, and use the remaining part as negative samples. Does this work?

Answer:

You have asked many questions inside your thread.

  1. Amount of samples. As a rule of thumb: when you train a detector you need roughly a few thousand positive and negative examples per stage. A typical detector has 10-20 stages. Each stage reduces the number of negatives by a factor of 2, and only negatives that pass all earlier stages are useful for training the next one. So you will need roughly 3,000 - 10,000 positive examples and ~5,000,000 to 100,000,000 negative examples.
  2. Which negatives to take. A rule of thumb: you need to find a face in a given environment, so take that environment as negative examples. For instance, if you are trying to detect the faces of students sitting in a classroom, then take images from the classroom (walls, windows, human bodies, clothes, etc.) as negative examples. Taking images of the moon or the sky will probably not help you. If you don't know your environment, then just take as many different natural images as possible (under different lighting conditions).
  3. Should you take facial parts (like an eye or a nose) as negatives? You can, but taking only those negatives is definitely not enough. The real strength of the detector comes from negative images that represent the typical background of the faces.
  4. How to collect/generate negative samples. You don't actually need many negative images. You can take 1,000 images and generate 10,000,000 negative samples from them. Here is how you do it. Suppose you take a photo of a car at 1 megapixel resolution (1000x1000 pixels). Suppose then that you want to train a face detector to work at a resolution of 20x20 pixels (as OpenCV did). You take your big 1000x1000 image and cut it into pieces of 20x20. You get 2,500 pieces (50x50). This is how a single big image yields 2,500 negative examples. Now you can take the same big image and cut it into pieces of 10x10 pixels. You will have an additional 10,000 negative examples. Each of these is 10x10 pixels, and you can enlarge it by a factor of 2 so that all samples have the same size. You can repeat this process as often as you want, cutting the input image into pieces of different sizes. Mathematically speaking, if your image is of size NxN, you can generate O(N^4) negative examples from it by taking every possible rectangle inside it.
  5. In step 4, I described how to cut a single big image into a large number of negative examples. I must warn you that negative examples should not be highly correlated with each other, so I don't recommend taking only one image and generating 1 million negative examples from it. As a rule of thumb, create a library of 1,000 images (or download random images from Google). Verify that none of the images contains faces. Crop about 10,000 negative examples from each image and you have a decent set of 10,000,000 negative examples. Train your detector. In the next step you can cut each image into ~50,000 partially overlapping pieces and thus enlarge your set of negatives to 50 million. You will start getting very good results with it.
  6. Final enhancement step of the detector. When you already have a rather good detector, run it on many images. It will produce false detections (it detects a face where there is none). Gather all those false detections and add them to your negative set. Now retrain the detector. The more such iterations you do, the better your detector becomes.
  7. Real numbers. The best face detectors today (like Facebook's) use hundreds of millions of positive examples and billions of negatives. As positive examples they take not only frontal faces but faces in many orientations, with different facial expressions (smiling, shouting, angry, ...), different age groups, different genders, different ethnicities (Caucasian, Black, Thai, Chinese, ...), with or without glasses, hats, sunglasses, make-up, etc. You will not be able to compete with the best, so don't get angry if your detector misses some faces.
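The cropping scheme in items 4 and 5 can be sketched as follows. This is a minimal illustration that only computes crop boxes; `tile_boxes` and `count_rectangles` are hypothetical helper names, and the actual cropping and resizing (for example with an image library) is left out.

```python
from typing import Iterator, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def tile_boxes(width: int, height: int, size: int,
               stride: Optional[int] = None) -> Iterator[Box]:
    """Yield square crop boxes of side `size` inside a width x height image.

    With stride=None the tiles do not overlap (stride equals size);
    a smaller stride gives partially overlapping tiles.
    """
    step = size if stride is None else stride
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            yield (left, top, size, size)


def count_rectangles(n: int) -> int:
    """Number of axis-aligned pixel rectangles inside an n x n image, which grows as O(n^4)."""
    return (n * (n + 1) // 2) ** 2


tiles_20 = list(tile_boxes(1000, 1000, 20))            # non-overlapping 20x20 tiles
tiles_10 = list(tile_boxes(1000, 1000, 10))            # 10x10 tiles, to be upscaled by 2
overlapping = list(tile_boxes(1000, 1000, 20, stride=4))
print(len(tiles_20), len(tiles_10), len(overlapping))
print(count_rectangles(1000))
```

A stride of 4 is an arbitrary choice here; it yields tens of thousands of overlapping 20x20 crops per image, the same order of magnitude as the ~50,000 pieces mentioned in item 5.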

    Good luck
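The retraining loop in item 6 (often called bootstrapping or hard-negative mining) can be sketched in a few lines. Everything here is an assumption for illustration: `bootstrap_negatives` and `toy_train` are hypothetical names, windows are reduced to single numbers, and the "detector" is a simple threshold standing in for a real cascade trainer.

```python
from typing import Callable, List, Sequence, Tuple

Detector = Callable[[float], bool]
Trainer = Callable[[Sequence[float], Sequence[float]], Detector]


def bootstrap_negatives(train: Trainer,
                        positives: Sequence[float],
                        negatives: Sequence[float],
                        background_windows: Sequence[float],
                        rounds: int = 3,
                        max_new: int = 5000) -> Tuple[Detector, List[float]]:
    """Retrain repeatedly, each time adding false detections on face-free windows."""
    pool = list(negatives)
    detector = train(positives, pool)
    for _ in range(rounds):
        seen = set(pool)
        # Every detection on a face-free window is a false alarm by construction.
        false_alarms = [w for w in background_windows
                        if detector(w) and w not in seen]
        if not false_alarms:
            break  # the detector no longer fires on the background set
        pool.extend(false_alarms[:max_new])
        detector = train(positives, pool)
    return detector, pool


def toy_train(positives: Sequence[float], negatives: Sequence[float]) -> Detector:
    """Stand-in trainer (assumption): threshold halfway between the two classes."""
    threshold = (min(positives) + max(negatives)) / 2
    return lambda window: window > threshold


final_detector, final_negatives = bootstrap_negatives(
    toy_train,
    positives=[0.8, 0.9],
    negatives=[0.1, 0.2],
    background_windows=[0.1, 0.3, 0.6, 0.7],
)
print(sorted(final_negatives))
```

In the toy run the first detector fires on the two brightest background windows; once they join the negative set, the retrained threshold rejects them while still accepting both positives.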

Time: 2024-08-24 13:17:00
