PASCAL VOC 2012

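When reading recent papers on detection and recognition, I keep running into the VOC 2007 and VOC 2012 datasets. To understand the dataset in detail, I read the relevant documentation and summarized the key points below: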

The PASCAL Visual Object Classes Challenge (2012)

The goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes. There are twenty object classes.

There are five main tasks. We only focus on three of them: classification, detection, and segmentation.

Classification: For each of the classes, predict the presence/absence of at least one object of that class in a test image.

Detection: For each of the classes, predict the bounding boxes of each object of that class in a test image (if any).

Segmentation: For each pixel in a test image, predict the class of the object containing that pixel or "background" if the pixel does not belong to one of the twenty specified classes.

The rest of this post walks through image recognition (the classification task) in detail as an example.

Classification/Detection Image Sets

For the classification and detection tasks, there are four sets of images provided:

train: Training data

val: Validation data

trainval: The union of train and val

test: Test data
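The relationship between the sets can be sketched in a few lines of Python. This is only an illustration: it assumes each set is distributed as a plain-text listing with one image identifier per line, and the identifiers and the helper name `read_image_set` below are made up for the example.

```python
def read_image_set(text):
    """Parse an image-set listing, assumed to hold one identifier per line.

    Only the first whitespace-separated token of each line is kept, so
    listings that carry an extra column after the identifier also parse.
    """
    return [line.split()[0] for line in text.splitlines() if line.strip()]


# Hypothetical listings standing in for the contents of the train and val sets.
train_ids = read_image_set("2008_000008\n2008_000015\n")
val_ids = read_image_set("2008_000002\n2008_000003\n")

# trainval is the union of train and val.
trainval_ids = sorted(set(train_ids) | set(val_ids))
print(trainval_ids)
```

Keeping train and val as separate lists makes it easy to tune on val and then retrain on the union before producing test results.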

Classification Task

For each of the twenty object classes, predict the presence/absence of at least one object of that class in a test image. The output from your system should be a real-valued confidence of the object's presence so that a precision/recall curve can be drawn. Participants may choose to tackle all, or any subset of object classes, for example “cars only” or “motorbikes and cars”.

Two competitions are defined according to the choice of training data: (i) taken from the VOC trainval data provided, or (ii) from any source excluding the VOC test data provided.

A separate text file of results should be generated for each competition (1 or 2) and each class e.g. ‘car’. Each line should contain a single identifier and the confidence output by the classifier, separated by a space, for example:

comp1_cls_test_car.txt:

    ...
    2009_000001 0.056313
    2009_000002 0.127031
    2009_000009 0.287153
    ...
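A minimal Python sketch of producing such a file is shown here. The helper names `results_filename` and `format_results` are hypothetical, the file name pattern is inferred from the single example above, and the six-decimal formatting is simply chosen to match the sample lines.

```python
def results_filename(competition, image_set, cls):
    """Build a results file name following the pattern of the example above."""
    return f"comp{competition}_cls_{image_set}_{cls}.txt"


def format_results(scores):
    """Turn (identifier, confidence) pairs into '<identifier> <confidence>' lines."""
    return [f"{image_id} {confidence:.6f}" for image_id, confidence in scores]


# Illustrative classifier outputs for the 'car' class.
scores = [
    ("2009_000001", 0.056313),
    ("2009_000002", 0.127031),
    ("2009_000009", 0.287153),
]

name = results_filename(1, "test", "car")
lines = format_results(scores)
print(name)
print("\n".join(lines))
```

To write the file to disk, the lines can be joined with newlines and written with an ordinary `open(name, "w")` call.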

The classification task will be judged by the precision/recall curve. The principal quantitative measure used will be the average precision (AP).
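To make the measure concrete, here is a rough Python sketch of one common way to compute AP: rank the images by confidence, trace the precision/recall curve, and take the area under its monotonically decreasing precision envelope. The function name `average_precision` and the 0/1 label encoding are assumptions for illustration, and the sketch is not guaranteed to match the organizers' evaluation code in every detail.

```python
def average_precision(confidences, labels):
    """Area under the precision/recall curve for one class.

    confidences: real-valued scores, one per image.
    labels: 1 if the class is present in the image, 0 otherwise (assumed encoding).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0

    # Rank images from most to least confident.
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])

    true_positives = 0
    recalls, precisions = [], []
    for rank, i in enumerate(order, start=1):
        true_positives += labels[i]
        recalls.append(true_positives / n_pos)
        precisions.append(true_positives / rank)

    # Make precision monotonically decreasing (right-to-left running maximum).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum precision over each increase in recall.
    ap, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(round(ap, 4))
```

A perfect ranking, with every positive image scored above every negative one, gives an AP of 1.0 under this definition.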

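The challenge organizers provide functions for evaluating performance, so we only need to write the results as text files in the required format; the evaluation itself can be run by calling the organizers' API directly.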

Time: 2024-11-06 16:49:07
