Adapted version of https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py

#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training.

Associate the ImageNet 2012 Challenge validation data set with labels.

The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure.

 data_dir/ILSVRC2012_val_00000001.JPEG
 data_dir/ILSVRC2012_val_00000002.JPEG
 ...
 data_dir/ILSVRC2012_val_00050000.JPEG

This script moves the files into a directory structure like such:
 data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
 data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
 ...
where 'n01440764' is the unique synset label associated with
these images.

This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file.

In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us.

Sample usage:
  ./preprocess_imagenet_validation_data.py ILSVRC2012_img_val imagenet_2012_validation_synset_labels.txt
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys

from six.moves import xrange  # pylint: disable=redefined-builtin

if __name__ == '__main__':
  if len(sys.argv) < 3:  # sys.argv holds the script name followed by the arguments passed to it.
    print('Invalid usage\n'
          'usage: preprocess_imagenet_validation_data.py '
          '<validation data dir> <validation labels file>')
    sys.exit(-1)  # sys.exit(-1) raises SystemExit and ends the script with a non-zero exit status.
  data_dir = sys.argv[1]
  validation_labels_file = sys.argv[2]

  # Read in the 50000 synsets associated with the validation data set.
  # imagenet_2012_validation_synset_labels.txt holds 50000 labels (with repeats), one per line, matching the 50000 images one-to-one.
  labels = [l.strip() for l in open(validation_labels_file).readlines()]  # strip() removes leading and trailing whitespace, including the newline.
  unique_labels = set(labels)  # set() drops the duplicates, leaving each synset label exactly once.

  # Make all sub-directories in the validation data dir.
  for label in unique_labels:
    labeled_data_dir = os.path.join(data_dir, label)
    if not os.path.exists(labeled_data_dir):
      os.makedirs(labeled_data_dir)

  # Move all of the images to the appropriate sub-directory.
  for i in xrange(len(labels)):  # xrange() is used like range(), but it yields values lazily instead of building a list.
    basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
    original_filename = os.path.join(data_dir, basename)
    if not os.path.exists(original_filename):
      #print('Failed to find: ' % original_filename)
      continue
      #sys.exit(-1)
    new_filename = os.path.join(data_dir, labels[i], basename)
    os.rename(original_filename, new_filename)
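
The loop above relies on line i of the labels file belonging to validation image number i + 1. As a rough sketch of that mapping, the following dry run only computes the source and destination paths and never touches the disk. The helper name plan_moves is hypothetical, and the two labels are just example values.

```python
import os


def plan_moves(data_dir, labels):
    """Return (source, destination) path pairs without moving anything.

    Hypothetical helper: it mirrors the naming scheme of the script above,
    where the label on line i belongs to validation image number i + 1.
    """
    moves = []
    for i, label in enumerate(labels):
        # Same format string as the script: '%.5d' zero-pads to five digits.
        basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
        source = os.path.join(data_dir, basename)
        destination = os.path.join(data_dir, label, basename)
        moves.append((source, destination))
    return moves


if __name__ == '__main__':
    # Example labels only; a real run would read all 50000 from the labels file.
    for source, destination in plan_moves('data_dir', ['n01440764', 'n02109961']):
        print(source, '->', destination)
```

Printing the plan first makes it easy to check the file names before any image is actually renamed.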

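Enabling (uncommenting) the code on line 82 immediately raises:

TypeError: not all arguments converted during string formatting

The format string 'Failed to find: ' contains no %s placeholder, so the % operator has nowhere to put original_filename.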
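
A minimal sketch that reproduces this error outside the script and shows one possible fix, adding a %s placeholder. The file name used here is only an example value.

```python
original_filename = 'ILSVRC2012_val_00000001.JPEG'  # example value

# Reproduce the failure: the format string has no placeholder for the argument.
try:
    message = 'Failed to find: ' % original_filename
except TypeError as err:
    message = str(err)
print(message)  # not all arguments converted during string formatting

# One possible fix: give the argument a %s placeholder to land in.
fixed = 'Failed to find: %s' % original_filename
print(fixed)  # Failed to find: ILSVRC2012_val_00000001.JPEG
```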

The following error also came up along the way:

Organizing the validation data into sub-directories.
Traceback (most recent call last):
  File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
    os.rename(original_filename, new_filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'

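My guess was that too many files cannot be renamed in one go, although WinError 32 itself means that another process had the file open at that moment. Either way, I re-ran

./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified

and preprocess_imagenet_validation_data.py picked up where it left off and kept renaming files, because it skips any image that is no longer at its original path.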
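
Instead of re-running the whole script by hand, the rename could retry for a short while when the file is held by another process. This is only a sketch under assumptions: the helper name move_with_retry is hypothetical, the retry count and delay are arbitrary, and PermissionError requires Python 3 (as in the traceback above).

```python
import os
import time


def move_with_retry(src, dst, attempts=5, delay=0.5):
    """Move src to dst, retrying briefly if another process holds the file.

    Hypothetical helper. Returns True if the file was moved and False if src
    does not exist (for example, because an earlier run already moved it).
    """
    if not os.path.exists(src):
        return False
    for attempt in range(attempts):
        try:
            os.rename(src, dst)
            return True
        except PermissionError:
            # Give up and re-raise after the last attempt.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
    return False
```

In the script, the call os.rename(original_filename, new_filename) could then be swapped for move_with_retry(original_filename, new_filename). If the file stays locked for longer than the retries allow, the error still surfaces.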

Original article: https://www.cnblogs.com/Time-LCJ/p/9135805.html

Time: 2024-10-08 06:43:54
