Labeled Faces in the Wild face recognition dataset

http://blog.csdn.net/garfielder007/article/details/51480525

New (draft) survey paper:

Labeled Faces in the Wild: A Survey
Erik Learned-Miller, Gary Huang, Aruni RoyChowdhury, Haoxiang Li, Gang Hua

The camera-ready has not yet been submitted. If you see typos or errors, please let us know and we will try to correct them.

New results page:

We have recently updated and changed the format and content of our results page. Please refer to the new technical report for details of the changes.

Welcome to Labeled Faces in the Wild, a database of face photographs designed for studying the problem of unconstrained face recognition. The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set. The only constraint on these faces is that they were detected by the Viola-Jones face detector. More details can be found in the technical report below.

There are now four different sets of LFW images including the original and three different types of "aligned" images. The aligned images include "funneled images" (ICCV 2007), LFW-a, which uses an unpublished method of alignment, and "deep funneled" images (NIPS 2012). Among these, LFW-a and the deep funneled images produce superior results for most face verification algorithms over the original images and over the funneled images (ICCV 2007).

Related:

[new] Collected resources related to LFW.

LFW Deep Funneled Images.

LFW attributes file (see Attribute and Simile Classifiers for Face Verification, Kumar et al.).

Face Detection Data set and Benchmark (FDDB), our new database for face detection research.

Faces in Real-Life Images workshop at the European Conference on Computer Vision 2008, run by Erik Learned-Miller, Andras Ferencz, and Frederic Jurie.

last updated: 2015/09/22 15:09 EDT
change log

Mailing list:
If you wish to receive announcements regarding any changes made to the LFW database, please send email to [email protected] (subject and body are ignored).

Explore the database:

Download the database:

Training, Validation, and Testing:
View 1:
For development purposes, we recommend using the below training/testing split, which was generated randomly and independently of the splits for 10-fold cross validation, to avoid unfairly overfitting to the sets above during development. For instance, these sets may be viewed as a model selection set and a validation set. See the tech report below for more details.

Explore the sets: [training][test]
Download the sets: pairsDevTrain.txt, pairsDevTest.txt, peopleDevTrain.txt, peopleDevTest.txt

View 2:
As a benchmark for comparison, we suggest reporting performance as 10-fold cross validation using splits we have randomly generated.
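A minimal sketch of the 10-fold procedure follows: each fold is held out once while the remaining folds are used for training, and the per-fold accuracies are then summarized. The names `cross_validate`, `summarize_folds`, and `train_and_score` are hypothetical, and reporting the mean with the standard error of the mean is an assumption here; the tech report is the authoritative description of what to report.

```python
from math import sqrt
from statistics import mean, stdev


def cross_validate(folds, train_and_score):
    """Hold out each fold once; train on the rest and score on the held-out fold.

    `folds` is a list of lists of pairs. `train_and_score(train, test)` is a
    hypothetical callable supplied by the user that returns an accuracy in [0, 1].
    """
    scores = []
    for i, test_fold in enumerate(folds):
        # Concatenate every fold except the held-out one.
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        scores.append(train_and_score(train, test_fold))
    return scores


def summarize_folds(fold_accuracies):
    """Return (mean accuracy, standard error of the mean) across folds."""
    if len(fold_accuracies) < 2:
        raise ValueError("need at least two folds")
    mu = mean(fold_accuracies)
    se = stdev(fold_accuracies) / sqrt(len(fold_accuracies))
    return mu, se
```

With the View 2 splits, `folds` would hold ten lists, so each model is trained on nine folds and scored on the tenth.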

Explore the sets: [1][2][3][4][5][6][7][8][9][10]
Download the sets: pairs.txt, people.txt

For information on the file formats, please refer to the README above.
For details on how the sets were created, please refer to the tech report below.
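As a rough illustration of reading the pair files, the sketch below turns one line into two image identifiers plus a matched/mismatched flag. It assumes whitespace-separated lines in which a matched pair has three fields (name, index, index) and a mismatched pair has four (name, index, name, index), with any other line treated as a header. That layout is an assumption, so check it against the README. The identifier style (`Name_0001`) follows the image names used in the Errata; `image_id` and `parse_pair_line` are hypothetical helper names.

```python
def image_id(name, index):
    """Build an identifier such as 'Janica_Kostelic_0001' from a name and an index."""
    return f"{name}_{index:04d}"


def parse_pair_line(line):
    """Parse one line of a pairs file (assumed layout, see the README).

    Returns (first_id, second_id, is_matched), or None for lines that are not
    pairs, such as a header line or a blank line.
    """
    fields = line.split()
    if len(fields) == 3:
        # Assumed matched pair: one name, two image indices.
        name, i, j = fields
        return image_id(name, int(i)), image_id(name, int(j)), True
    if len(fields) == 4:
        # Assumed mismatched pair: two names, one image index each.
        name_a, i, name_b, j = fields
        return image_id(name_a, int(i)), image_id(name_b, int(j)), False
    return None
```

Returning `None` for unrecognized lines lets the same function be mapped over a whole file without special handling for the header.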

Results:
Accuracy and ROC curves for various methods are available on the results page.

Information:

  • 13233 images
  • 5749 people
  • 1680 people with two or more images
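The counts above can be checked against a local copy of the data. The sketch below assumes you already have a list of image identifiers in the `Name_0001` style used elsewhere on this page; how that list is obtained (for example, from file names on disk) is left open, and the helper names are hypothetical.

```python
from collections import Counter


def split_image_id(image_id):
    """Split 'Jim_OBrien_0001' into ('Jim_OBrien', 1)."""
    # The index is everything after the last underscore.
    name, _, index = image_id.rpartition("_")
    return name, int(index)


def dataset_counts(image_ids):
    """Return (number of images, number of people, people with two or more images)."""
    per_person = Counter(split_image_id(i)[0] for i in image_ids)
    n_images = sum(per_person.values())
    n_people = len(per_person)
    n_multi = sum(1 for n in per_person.values() if n >= 2)
    return n_images, n_people, n_multi
```

Run over the full, uncorrected database, the three values should correspond to the three figures listed above.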
Errata:
The following is a list of known errors in LFW. Due to the small number of such errors, the database will be left as is (without corrections) to avoid confusion.

It is important that users of the database provide their algorithms with the database as is, i.e. without correcting the errors below, since previous results published for the database did not have the advantage of correcting for these errors.

Currently, there are five incorrectly labeled matched pairs in View 2. While we do not believe this should have a significant effect on accuracy, we do encourage researchers to be aware of these errors when producing any visualizations (e.g. matched pairs most confidently predicted as mismatched, as the matched pair may actually be mismatched).

The current known errors in View 2 are:
Fold 1: Janica_Kostelic_0001, Janica_Kostelic_0002
Fold 1: Nora_Bendijo_0001, Nora_Bendijo_0002
Fold 5: Jim_OBrien_0001, Jim_OBrien_0002
Fold 5: Jim_OBrien_0001, Jim_OBrien_0003
Fold 5: Elisabeth_Schumacher_0001, Elisabeth_Schumacher_0002
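For visualizations, the five pairs listed above can be flagged with a small lookup. This is only a sketch with hypothetical names (`KNOWN_BAD_MATCHED_PAIRS`, `known_bad_fold`); it is meant for annotating figures, not for altering the labels used in evaluation, since the database should be used as is.

```python
# Matched pairs in View 2 that are known to be mislabeled, keyed by the
# unordered pair of image identifiers; the value is the fold they appear in.
KNOWN_BAD_MATCHED_PAIRS = {
    frozenset({"Janica_Kostelic_0001", "Janica_Kostelic_0002"}): 1,
    frozenset({"Nora_Bendijo_0001", "Nora_Bendijo_0002"}): 1,
    frozenset({"Jim_OBrien_0001", "Jim_OBrien_0002"}): 5,
    frozenset({"Jim_OBrien_0001", "Jim_OBrien_0003"}): 5,
    frozenset({"Elisabeth_Schumacher_0001", "Elisabeth_Schumacher_0002"}): 5,
}


def known_bad_fold(first_id, second_id):
    """Return the View 2 fold of a known mislabeled matched pair, or None."""
    return KNOWN_BAD_MATCHED_PAIRS.get(frozenset({first_id, second_id}))
```

Using a `frozenset` as the key makes the lookup independent of the order in which the two images are given.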

More detail about all the errors is given below.

Note: unless stated otherwise below, an error in a matched pair means that the label ("matched") is wrong. An error in a mismatched pair generally leaves the label ("mismatched") correct, even though one of the images carries the wrong identity.

  • Recep_Tayyip_Erdogan_0004 is incorrect (it is an image of Abdullah Gul):

    This image appears only in one matched pair in the training set of View 1:


    Recep Tayyip Erdogan, 2
    Recep Tayyip Erdogan, 4
  • Anja_Paerson_0001 is incorrect (it is an image of Janica Kostelic):

    This image does not appear in a matched or mismatched pair, in either view.

  • Janica_Kostelic_0001 is incorrect (it is an image of Anja Paerson):

    This image appears in one matched pair in the test set of View 1, and the same matched pair and one mismatched pair (with Don_Carcieri_0001) in fold 1 of View 2:


    Janica Kostelic, 1
    Janica Kostelic, 2
  • Bart_Hendricks_0001 is incorrect (it is a duplicate image of Ricky_Ray_0001):

    This image appears in two mismatched pairs in the training set of View 1, and one mismatched pair in fold 2 of View 2. (None of the mismatched pairs are with Ricky_Ray.)

  • Carlos_Beltran_0001 is incorrect (it is a duplicate image of Raul_Ibanez_0001): 

    This image appears in one mismatched pair in the test set of View 1, and one mismatched pair in fold 5 of View 2. (None of the mismatched pairs are with Raul_Ibanez.)

  • Emmy_Rossum_0001 is incorrect (it is a duplicate image of Eva_Amurri_0001): 

    This image appears in one mismatched pair in the test set of View 1 (the mismatched pair is not with Eva_Amurri).

  • Michael_Schumacher_0008 is incorrect (it is an image of Rubens Barrichello): 

    This image does not appear in a matched or mismatched pair, in either view.

  • Mahmoud_Abbas_0012 is incorrect (it is an image of Hamad Bin Isa al-Khalifa): 

    This image does not appear in a matched or mismatched pair, in either view.

  • Jim_OBrien contains two distinct persons. Specifically, Jim_OBrien_0001 is a different person from Jim_OBrien_0002, Jim_OBrien_0003.

    This leads to an error in two matched pairs (0001 with 0002; 0001 with 0003), present in both the training set of View 1 and fold 5 of View 2:


    Jim OBrien, 1
    Jim OBrien, 2

    Jim OBrien, 1
    Jim OBrien, 3
  • John_Gruden_0001 is an incorrect spelling of Jon_Gruden

    This image appears in a mismatched pair in fold 3 of View 2 (not with Jon_Gruden).

  • Elisabeth_Schumacher contains two distinct persons, where the correct spelling of Elisabeth_Schumacher_0001 is actually Elizabeth Schumacher. This leads to an incorrect matched pair in both the test set of View 1 and fold 5 of View 2.

    Elisabeth Schumacher, 1
    Elisabeth Schumacher, 2
  • Andrew_Gilligan_0001 is actually an image of Andrew_Caldecott

    This image appears in a mismatched pair in the training set of View 1 and in fold 1 of View 2 (neither with Andrew_Caldecott).

  • Nora_Bendijo_0002 is actually an image of Flor_Montulo, and Flor_Montulo_0002 is actually an image of Nora_Bendijo

    Nora Bendijo, 1: correct label
    Nora Bendijo, 2: is actually Flor Montulo
    Flor Montulo, 1: correct label
    Flor Montulo, 2: is actually Nora Bendijo

    Nora_Bendijo_0002 appears in an incorrect matched pair in fold 1 of View 2.
    Flor_Montulo_0002 appears in an incorrect matched pair in the test set of View 1, and in a mismatched pair in fold 1 of View 2, but not with Nora_Bendijo.

  • Wang_Yingfan and Yingfan_Wang are the same person. The two names are never together in a mismatched pair.
  • Wang_Nan and Nan_Wang are the same person. The two names are never together in a mismatched pair.
  • Talisa_Bratt and Talisa_Soto are the same person. These images never appear in a mismatched pair together.

  • Shinya_Taniguchi_0001 is actually an image of Takahiro_Mori

    This image never appears in a mismatched pair with an image of Takahiro_Mori.

Reference:
Please cite as:

Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller.
Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments.
University of Massachusetts, Amherst, Technical Report 07-49, October, 2007.
[pdf]

BibTeX entry:

@TechReport{LFWTech,
  author =       {Gary B. Huang and Manu Ramesh and Tamara Berg and
                  Erik Learned-Miller},
  title =        {Labeled Faces in the Wild: A Database for Studying
                  Face Recognition in Unconstrained Environments},
  institution =  {University of Massachusetts, Amherst},
  year =         2007,
  number =       {07-49},
  month =        {October}}

Gary B. Huang and Erik Learned-Miller.
Labeled Faces in the Wild: Updates and New Reporting Procedures.
University of Massachusetts, Amherst, Technical Report UM-CS-2014-003, May, 2014.
[pdf]

@TechReport{LFWTechUpdate,
  author =       {Gary B. Huang and Erik Learned-Miller},
  title =        {Labeled Faces in the Wild: Updates and New Reporting
                  Procedures},
  institution =  {University of Massachusetts, Amherst},
  year =         2014,
  number =       {UM-CS-2014-003},
  month =        {May}}

LFW funneled images
If you use the LFW images aligned by funneling, please cite:
Gary B. Huang, Vidit Jain, and Erik Learned-Miller.
Unsupervised Joint Alignment of Complex Images.
International Conference on Computer Vision (ICCV), 2007.

@InProceedings{Huang2007a,
  author =    {Gary B. Huang and Vidit Jain and Erik Learned-Miller},
  title =     {Unsupervised Joint Alignment of Complex Images},
  booktitle = {ICCV},
  year =      {2007}
}

LFW deep funneled images
If you use the LFW images aligned by deep funneling, please cite:
Gary B. Huang, Marwan Mattar, Honglak Lee, and Erik Learned-Miller.
Learning to Align from Scratch.
Advances in Neural Information Processing Systems (NIPS), 2012.

@InProceedings{Huang2012a,
  author =    {Gary B. Huang and Marwan Mattar and Honglak Lee and
               Erik Learned-Miller},
  title =     {Learning to Align from Scratch},
  booktitle = {NIPS},
  year =      {2012}
}
Resources:
Collected resources related to LFW:
Note: We have not verified the accuracy or reliability of the code and data at the following links; we merely provide them as a convenience. Please use your own judgment about the accuracy of the resources below.

  • LFWgender

    "Getting the known gender based on name of each image in the Labeled Faces in the Wild dataset. This is a python script that calls the genderize.io API with the first name of the person in the image."

  • CASIA WebFace Database

    "While there are many open source implementations of CNN, none of large scale face dataset is publicly available. The current situation in the field of face recognition is that data is more important than algorithm. To solve this problem, we propose a semi-automatical way to collect face images from Internet and build a large scale dataset containing 10,575 subjects and 494,414 images, called CASIA-WebFace. To the best of our knowledge, the size of this dataset rank second in the literature, only smaller than the private dataset of Facebook (SCF). We encourage those data-consuming methods training on this dataset and reporting performance on LFW. "

  • LFW3D - collection of frontalized LFW images and Matlab code for frontalization

    "Frontalization is the process of synthesizing frontal facing views of faces appearing in single unconstrained photos. Recent reports have suggested that this process may substantially boost the performance of face recognition systems... we explore the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces. We show that this leads to a straightforward, efficient and easy to implement method for frontalization. More importantly, it produces aesthetic new frontal views and is surprisingly effective when used for face recognition and gender estimation."

Contact:
Questions and comments can be sent to:
Gary Huang - [email protected]
Support:
The building of the LFW database was supported by NSF CAREER Award number 0546666.
Change History:
2015/09/22
Added Ding et al.*, Ding and Tao*, and Xu et al.* to results page.

2015/07/31
Updated Errata. Added Baidu, AuthenMetric commercial system results to results page.

2015/06/03
Updated Errata.

2015/05/19
Updated Errata.

2015/04/13
Added Sun et al.*, Li and Hua*, updated betaface.com commercial system*, on results page.
2015/01/19
Updated Face++ commercial system result*, added betaface.com commercial system result* on results page.
2014/12/09
Added Ouamane et al.* and Sun et al.* to results page.
2014/12/08
Added new LFW-related resources section.
Added Hassner et al.* and TCIT* to results page.
2014/12/03
Added Hu et al.* to results page.
2014/12/01
Added Arashloo and Kittler* to results page.
2014/09/19
Added Li et al.* to results page.
2014/06/20
Added Sun et al.* to results page.
2014/06/16
Added Lu and Tang* to results page.
2014/06/12
Added Sun et al.* to results page.
2014/05/28
Added AUC for two unsupervised methods - LHS and MRF-MLBP - to results page.
2014/05/23
Added Zhu et al.* to results page.
2014/05/22
Updated Kumar et al.* with results from journal paper, and added Berg and Belhumeur*, on results page.
2014/05/21
Added Sun et al.* and Hu et al.* to results page.
2014/05/19
Added Taigman et al.* to results page.
2014/05/09
Re-organized results page to more accurately and fairly compare algorithms under a variety of protocols. See the new technical report for details of the changes.
2014/03/17
Added Face++ commercial system result* to results page.
2014/02/10
Added Aurora Computer Services commercial system result* to results page.
2014/01/09
Added VisionLabs commercial system result* to results page.
2013/12/08
Added John_Gruden_0001 labeling error to Errata.
2013/10/18
Added Cao et al.*, Cao et al.*, and Barkan et al.* to results page.
2013/10/08
Added Lei et al.* to results page.
2013/08/23
Added new deep funneled images and corresponding deep funneled superpixels for download. (CJG)
2013/08/12
Temporarily disabled deep funneled data set download. (CJG)
2013/08/09
Removed superpixel images from database gallery. Added deep funnel aligned images to database gallery. (CJG)
2013/08/08
Added deep funneled aligned image set to main page. (CJG)
2013/07/28
Added Simonyan et al.* to results page.
2013/07/13
Added Yi et al.* to results page.
2013/07/13
Added Arashloo and Kittler* to results page.
2013/06/12
Added Sharma et al.* to results page.
2013/05/12
Added Zhen et al.* to results page.
2013/05/02
Added Chen et al.* and Chen et al.* to results page.
2013/04/17
Added Anja_Paerson_0001 labeling error to Errata.
2013/03/21
Added Li et al.* to results page.
2012/08/17
Added Hussain et al.* to results page.
2012/07/18
Added Berg and Belhumeur* to results page.
2012/03/03
Added Huang et al.* to results page.
2011/12/14
Added Ying and Li* to results page.
2011/09/07
Added link to download computed attribute values for all LFW images produced by Kumar et al., on the results page.
2011/08/28
Added Seo and Milanfar* to results page.
2011/08/08
Added images of incorrectly labeled faces, in Errata.
2011/08/08
Added Taigman and Wolf* to results page.
2011/08/01
Added Jim_OBrien_0001 labeling error to Errata.
2011/07/18
Updated the results page, adding notes on the use of external training data, arranging the image-restricted method results to roughly reflect the amount of external training data used, and added specific notes on the type of external training data used for each algorithm.
2011/07/12
Added Mahmoud_Abbas_0012 labeling error to Errata.
2011/06/28
Added Yin et al.* to results page.
2011/04/28
Added superpixel segmentation files to downloads section.
2011/04/04
Added Li et al.* to results page.
2011/01/29
Added Pinto and Cox* to results page.
2010/11/17
Added link to related database: Face Detection Data set and Benchmark (FDDB).
2010/10/26
Added Nguyen and Bai* to results page.
2010/09/07
Added Michael_Schumacher_0008 labeling error to Errata.
2010/04/17
Added Cao et al.* to results page.
2010/02/08
Added Ruiz-del-Solar et al.* and unsupervised (no training data) results to results page.
2009/10/26
Added Kumar et al.* to results page.
2009/09/24
Added link to LFW-a, LFW images aligned with commercial face alignment software, from Taigman, Wolf, and Hassner, under downloads.
2009/09/02
Added Wolf et al.* to results page.
2009/08/03
Added Taigman et al.* to results page.
2009/07/02
Added Guillaumin et al.* to results page.
2009/06/24
Added Carlos_Beltran_0001 and Emmy_Rossum_0001 labeling errors to Errata.
2009/04/02
Added Pinto et al.* to results page.
2009/02/04
Added Bart_Hendricks_0001 labeling error to Errata.
2008/07/01
Updated LFW technical report with proper reference for VidTIMIT:
C. Sanderson.
Biometric Person Recognition: Face, Speech and Fusion.
VDM-Verlag, 2008.
ISBN 978-3-639-02769-3
2008/06/12
Added Errata section and listed two known labeling errors.
2008/02/04
Added funneled images and super-pixels images to person pages. 
Made all funneled images available as single downloadable file.
2008/01/25
Added results page with numbers for method of Nowak and Jurie, CVPR 2007.
2007/11/21
Added revised version of technical report.
2007/11/19
Added technical report to page.
2007/11/15
Added mailing list and change history to page.

from: http://vis-www.cs.umass.edu/lfw/
