PyTorch COCO2017 Object Detection (1): DataLoader

PyTorch COCO object detection: DataLoader implementation

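Implementing an object detection algorithm in PyTorch starts with reading the data, which means implementing the Dataset and DataLoader classes.

This post uses pycocotools to read the COCO2017 object detection data and displays it with cv2.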

Analysis

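The data that is read in, whether it is displayed with cv2 or fed to the network, should have three parts:

Images, N x 3 x Height x Width
BBs (bounding boxes), N x M x 4
Classes, N x M x 1

The BBs and classes can therefore be combined into a single N x M x 5 array. PyTorch's default data layout is batchsize x nChanns x H x W.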
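As a small illustration of combining the BBs and classes, the sketch below stacks the boxes and labels of one image into a single M x 5 array. The values are made up for the example.

```python
import numpy as np

# Hypothetical annotations for one image with M = 2 objects.
boxes = np.array([[10.0, 20.0, 110.0, 220.0],
                  [35.0, 40.0, 95.0, 160.0]])   # M x 4, (x1, y1, x2, y2)
labels = np.array([[0.0], [17.0]])              # M x 1, class index

# Concatenate along the last axis: one row per object, box first, class last.
anns = np.concatenate([boxes, labels], axis=1)  # M x 5

print(anns.shape)
```

Keeping both in one array means a single index selects an object's box and class together.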

In object detection, images are usually rescaled so that their size meets certain requirements; see the earlier blog post for details.
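The exact size requirement is described in the earlier post and is not repeated here. As a rough sketch, one common convention scales the shorter side to a target length while capping the longer side. The function name `compute_scale` and the limits 608 and 1024 are assumptions for illustration, not values taken from this post.

```python
def compute_scale(height, width, min_side=608, max_side=1024):
    """Scale factor that brings the shorter side to min_side, unless that
    would push the longer side past max_side (both limits are assumed)."""
    smallest, largest = min(height, width), max(height, width)
    scale = min_side / smallest
    if largest * scale > max_side:
        scale = max_side / largest
    return scale

scale = compute_scale(480, 640)
new_size = (round(480 * scale), round(640 * scale))  # (height, width)
print(new_size)
```

Whatever rule is used, the box coordinates have to be multiplied by the same scale factor as the image.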

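This means implementing a Resizer() class for that transform. In addition, images are usually normalized and augmented with transforms such as horizontal flipping. So the Dataset needs three transforms: Resizer(), Normilizer() and Augmenter().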
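To make the normalization and flipping steps concrete, here is a minimal sketch operating on a sample dict with an H x W x 3 float image and an M x 5 annotation array. The function names `normalize` and `hflip` are hypothetical stand-ins, and the mean and std are the widely used ImageNet statistics, which is an assumption rather than something this post specifies.

```python
import numpy as np

# Assumed statistics: the widely used ImageNet per-channel mean and std.
MEAN = np.array([[[0.485, 0.456, 0.406]]], dtype=np.float32)
STD = np.array([[[0.229, 0.224, 0.225]]], dtype=np.float32)

def normalize(sample):
    """Standardize an H x W x 3 float image; annotations are unchanged."""
    return {'img': (sample['img'] - MEAN) / STD, 'ann': sample['ann']}

def hflip(sample):
    """Flip the image left-right and mirror the box x-coordinates."""
    img, ann = sample['img'], sample['ann'].copy()
    width = img.shape[1]
    x1, x2 = ann[:, 0].copy(), ann[:, 2].copy()
    ann[:, 0] = width - x2   # new left edge comes from the old right edge
    ann[:, 2] = width - x1   # new right edge comes from the old left edge
    return {'img': img[:, ::-1, :].copy(), 'ann': ann}

sample = {'img': np.zeros((4, 10, 3), dtype=np.float32),
          'ann': np.array([[1.0, 1.0, 4.0, 3.0, 0.0]])}
flipped = hflip(sample)
normed = normalize(flipped)
print(flipped['ann'])
```

An augmenter would typically apply the flip only with some probability; the important detail is that the boxes are mirrored together with the image.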

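Images read in Python are usually numpy arrays laid out as H x W x nChanns. The usual approach is to convert the data with the Dataset's transform and output torch tensors in nChanns x H x W layout.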
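The layout change itself is a single axis permutation. A minimal sketch with a random array standing in for a real image:

```python
import numpy as np

hwc = np.random.rand(480, 640, 3).astype(np.float32)  # H x W x nChanns

# Move the channel axis to the front; copy so the memory is contiguous.
chw = np.ascontiguousarray(hwc.transpose(2, 0, 1))     # nChanns x H x W

print(chw.shape)
```

Passing `chw` to `torch.from_numpy` then gives a tensor in the layout PyTorch expects.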

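Because the images in the COCO dataset differ in size, an N x 3 x Height x Width array cannot be obtained directly, so the DataLoader's collate_fn has to be overridden to bring the images in a minibatch to the same size. To sample images according to their scaling ratio, the DataLoader's batch_sampler has to be overridden as well. batch_sampler is incompatible with the DataLoader's batch_size, shuffle, sampler and drop_last arguments: once batch_sampler is passed to the DataLoader, those arguments can no longer be set.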
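One way to bring a minibatch to a common size is zero-padding, sketched below. It assumes each sample is a dict with an H x W x 3 image under 'img' and an M x 5 array under 'ann', and it fills missing annotation rows with -1. The padding strategy and the -1 marker are assumptions for illustration, not necessarily what the author uses.

```python
import numpy as np
import torch

def collate_fn(samples):
    """Pad H x W x 3 images and M x 5 annotations to common shapes."""
    imgs = [torch.as_tensor(s['img'], dtype=torch.float32) for s in samples]
    anns = [torch.as_tensor(s['ann'], dtype=torch.float32) for s in samples]

    max_h = max(img.shape[0] for img in imgs)
    max_w = max(img.shape[1] for img in imgs)
    max_m = max(max(ann.shape[0] for ann in anns), 1)

    # Zero-pad images at the bottom/right so box coordinates stay valid.
    batch_imgs = torch.zeros(len(imgs), max_h, max_w, 3)
    # Assumption: rows filled with -1 mark "no object".
    batch_anns = torch.full((len(anns), max_m, 5), -1.0)

    for i, (img, ann) in enumerate(zip(imgs, anns)):
        batch_imgs[i, :img.shape[0], :img.shape[1], :] = img
        if ann.shape[0] > 0:
            batch_anns[i, :ann.shape[0], :] = ann

    # N x H x W x 3 -> N x 3 x H x W
    return {'img': batch_imgs.permute(0, 3, 1, 2).contiguous(),
            'ann': batch_anns}

batch = collate_fn([
    {'img': np.ones((4, 6, 3), dtype=np.float32), 'ann': np.zeros((2, 5))},
    {'img': np.ones((5, 3, 3), dtype=np.float32), 'ann': np.zeros((0, 5))},
])
print(batch['img'].shape, batch['ann'].shape)
```

Such a function would be passed as `DataLoader(dataset, batch_size=..., collate_fn=collate_fn)`.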

Reading images, BBs and classes from the COCO data

coco.getImgIds() returns the list of image ids. Combining it with coco.loadImgs() and coco.getAnnIds() gives the details of each image and of its BBs and classes.

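Points to note:

Images read in Python are usually numpy uint8 arrays; they need to be converted to float and divided by 255 so that the maximum value is 1.0.
COCO has 80 classes, but the largest label value is 90, so the labels are not contiguous. New labels have to be assigned, running from 0 to 79 and starting at 0.
Some images in the COCO dataset have BB annotations whose width or height is less than 1; this is an annotation problem, and such boxes should be dropped.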
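These three points can be checked on small made-up inputs. The set of unused category ids below is written from memory of the 2017 instance annotations and is an assumption of this sketch; a dataset class would normally read the ids from the annotation file rather than hard-code them.

```python
import numpy as np

# 1) uint8 pixels -> float32 in [0, 1]
img = np.array([[0, 128, 255]], dtype=np.uint8).astype(np.float32) / 255.0

# 2) non-contiguous COCO ids (1..90 with gaps) -> contiguous 0..79
missing = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}   # assumed unused ids
coco_ids = [i for i in range(1, 91) if i not in missing]
coco_to_new = {cid: idx for idx, cid in enumerate(coco_ids)}

# 3) drop boxes thinner than 1 pixel, then (x, y, w, h) -> (x1, y1, x2, y2)
bboxes = np.array([[10.0, 20.0, 30.0, 40.0],
                   [50.0, 60.0, 0.5, 12.0]])   # second box has width < 1
keep = (bboxes[:, 2] >= 1) & (bboxes[:, 3] >= 1)
xyxy = bboxes[keep].copy()
xyxy[:, 2:] += xyxy[:, :2]

print(len(coco_ids), coco_to_new[13], xyxy)
```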

Below is a simple SimpleCoCoDataset class:

import os

import numpy as np
import skimage.io
from pycocotools.coco import COCO
from torch.utils.data import Dataset


class SimpleCoCoDataset(Dataset):
    def __init__(self, rootdir, set_name='val2017', transform=None):
        self.rootdir, self.set_name = rootdir, set_name
        self.transform = transform
        self.coco = COCO(os.path.join(self.rootdir, 'annotations', 'instances_'
                                      + self.set_name + '.json'))
        self.image_ids = self.coco.getImgIds()
        self.load_classes()

    def load_classes(self):
        categories = self.coco.loadCats(self.coco.getCatIds())
        categories.sort(key=lambda x: x['id'])

        # COCO category ids start at 1 and are not contiguous,
        # so build a new contiguous index from 0 to 79

        # classes:             {names:      new_index}
        # coco_labels:         {new_index:  coco_index}
        # coco_labels_inverse: {coco_index: new_index}
        self.classes, self.coco_labels, self.coco_labels_inverse = {}, {}, {}
        for c in categories:
            self.coco_labels[len(self.classes)] = c['id']
            self.coco_labels_inverse[c['id']]   = len(self.classes)
            self.classes[c['name']] = len(self.classes)

        # labels:              {new_index:  names}
        self.labels = {}
        for k, v in self.classes.items():
            self.labels[v] = k

    def __len__(self):
        return len(self.image_ids)            

    def __getitem__(self, index):
        img = self.load_image(index)
        ann = self.load_anns(index)
        sample = {'img': img, 'ann': ann}

        if self.transform:
            sample = self.transform(sample)
        return sample

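    def load_image(self, index):
        image_info = self.coco.loadImgs(self.image_ids[index])[0]
        imgpath = os.path.join(self.rootdir, 'images', self.set_name,
                               image_info['file_name'])

        img = skimage.io.imread(imgpath)
        # COCO contains a few grayscale images; repeat the single channel
        # so that every image is H x W x 3
        if img.ndim == 2:
            img = np.stack([img] * 3, axis=-1)
        return img.astype(np.float32) / 255.0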

    def load_anns(self, index):
        annotation_ids = self.coco.getAnnIds(imgIds=self.image_ids[index], iscrowd=False)
        # anns is num_anns x 5, (x1, y1, x2, y2, new_idx)
        anns = np.zeros((0, 5))

        # return the empty array for images without annotations
        if len(annotation_ids) == 0:
            return anns

        coco_anns = self.coco.loadAnns(annotation_ids)
        for a in coco_anns:
            # skip the annotations with width or height < 1
            if a['bbox'][2] < 1 or a['bbox'][3] < 1:
                continue

            ann = np.zeros((1, 5))
            ann[0, :4] = a['bbox']
            ann[0, 4]  = self.coco_labels_inverse[a['category_id']]
            anns = np.append(anns, ann, axis=0)

        # (x1, y1, width, height) --> (x1, y1, x2, y2)
        anns[:, 2] += anns[:, 0]
        anns[:, 3] += anns[:, 1]

        return anns

    def image_aspect_ratio(self, index):
        image = self.coco.loadImgs(self.image_ids[index])[0]
        return float(image['width']) / float(image['height'])
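The image_aspect_ratio() method is what a custom batch_sampler can use to group similarly shaped images. The sketch below is one possible reading of "sampling by scaling ratio": it sorts indices by aspect ratio and cuts them into batches. The class name `AspectRatioBatchSampler` and the ratio values are hypothetical.

```python
import random

class AspectRatioBatchSampler:
    """Yield lists of dataset indices whose images have similar aspect ratios."""

    def __init__(self, aspect_ratios, batch_size, shuffle=True):
        # Sort indices by aspect ratio, then cut the order into batches.
        order = sorted(range(len(aspect_ratios)), key=lambda i: aspect_ratios[i])
        self.batches = [order[i:i + batch_size]
                        for i in range(0, len(order), batch_size)]
        self.shuffle = shuffle

    def __len__(self):
        return len(self.batches)

    def __iter__(self):
        batches = list(self.batches)
        if self.shuffle:
            random.shuffle(batches)   # shuffle batch order, not batch contents
        return iter(batches)

ratios = [1.5, 0.75, 1.33, 0.66, 1.0]   # made-up width / height values
sampler = AspectRatioBatchSampler(ratios, batch_size=2, shuffle=False)
batches = list(sampler)
print(batches)
```

With the dataset above, the ratios could be collected as `[dataset.image_aspect_ratio(i) for i in range(len(dataset))]`, and the sampler passed as `DataLoader(dataset, batch_sampler=sampler, ...)` together with a custom collate_fn, leaving batch_size, shuffle, sampler and drop_last unset.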

Original article: https://www.cnblogs.com/zi-wang/p/9972102.html

Time: 2024-11-05 20:33:55
