Error when reading a TFRecord file

Reading a TFRecord file that contains multiple examples fails with the following error:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 14410143 values, but the requested shape has 230400
	 [[Node: Reshape = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DecodeRaw, Reshape/shape)]]

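The error message points at the reshape op: the tensor being reshaped has 14410143 values, but the requested shape only holds 230400.

To reproduce it, first generate a TFRecord file that stores two images: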

# coding: utf8

import tensorflow as tf
import numpy as np
from PIL import Image

INPUT_DATA = ['/home/error/tt/cat.jpg', '/home/error/tt/5605502523_05acb00ae7_n.jpg']  # input files
OUTPUT_DATA = '/home/error/tt.tfrecords'  # output file


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def create_image_lists(sess):
    writer1 = tf.python_io.TFRecordWriter(OUTPUT_DATA)

    # Process the image data: one tf.train.Example per image
    for f in INPUT_DATA:
        print(f)
        image = Image.open(f)
        image_value = np.asarray(image, np.uint8)

        height, width, channels = image_value.shape
        print(height, width, channels, height * width * channels)
        label = 0

        example = tf.train.Example(features=tf.train.Features(feature={
            'name': _bytes_feature(f.encode('utf8')),
            # tobytes() replaces the deprecated ndarray.tostring()
            'image': _bytes_feature(image_value.tobytes()),
            'label': _int64_feature(label),
            'height': _int64_feature(height),
            'width': _int64_feature(width),
            'channels': _int64_feature(channels)
        }))
        serialized_example = example.SerializeToString()
        writer1.write(serialized_example)

    writer1.close()


with tf.Session() as sess:
    create_image_lists(sess)


# Output:
# /home/error/tt/cat.jpg
# 1797 2673 3 14410143
# /home/error/tt/5605502523_05acb00ae7_n.jpg
# 240 320 3 230400
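
As a quick sanity check, the two numbers in the error message can be matched against the sizes printed above. The sketch below is plain Python with the two shapes copied from that output; `num_values` is a helper name introduced here only for illustration. It suggests that the tensor being reshaped holds cat.jpg, while the requested shape belongs to the second image.

```python
# Shapes (height, width, channels) copied from the writer script's output.
shapes = {
    "cat.jpg": (1797, 2673, 3),
    "5605502523_05acb00ae7_n.jpg": (240, 320, 3),
}


def num_values(shape):
    """Number of uint8 values in a raw image of the given shape."""
    height, width, channels = shape
    return height * width * channels


for name, shape in shapes.items():
    # Compare these counts with the two numbers in the InvalidArgumentError.
    print(name, num_values(shape))
```

So the data and the shape in the failing reshape come from two different examples in the file.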

Incorrect way to read the file:

# coding: utf8
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

OUTPUT_DATA = '/home/error/tt.tfrecords'  # TFRecord file written above
train_queue = tf.train.string_input_producer([OUTPUT_DATA])


def read_file(file_queue, sess):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(file_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'image': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'channels': tf.FixedLenFeature([], tf.int64),
        })

    image, label = features['image'], features['label']
    height, width = features['height'], features['width']
    channels = features['channels']
    decoded_image = tf.decode_raw(image, tf.uint8)

    print(sess.run(decoded_image).shape)  # (14410143,)
    # Why does merely printing affect the reshape? Because of sess.run
    # (the same problem shows up when visualizing the image):
    # every sess.run() call takes a new example from the queue,
    # so values fetched in separate calls belong to different examples.

    height_val, width_val, channels_val = sess.run([height, width, channels])
    print(height_val, width_val, channels_val, height_val * width_val * channels_val)  # 240 320 3 230400
    reshaped_decoded_image = tf.reshape(decoded_image, [height_val, width_val, channels_val])
    # print(reshaped_decoded_image.shape)  # (240, 320, 3)

    # Building the reshape op does not fail; the error is raised only when the op is run
    reshaped_decoded_image_val = sess.run(reshaped_decoded_image)
    # plt.imshow(sess.run(reshaped_decoded_image))
    # plt.show()


with tf.Session() as sess:
    tf.local_variables_initializer().run()

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for _ in range(2):
        read_file(train_queue, sess)

    coord.request_stop()
    coord.join(threads)

# Reading from a TFRecord file that stores multiple examples fails this way.
# Error log:
# InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 14410143 values, but the requested shape has 230400
#      [[Node: Reshape = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DecodeRaw, Reshape/shape)]]
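
The dequeue-per-run behaviour described in the comments above can be illustrated without TensorFlow. The following is only a toy simulation under stated assumptions: `FakeRecordQueue`, `RECORDS`, and `element_count` are hypothetical names invented for this sketch, not TensorFlow APIs, and the queue simply cycles through the two records of this post. Each `run()` call advances the queue once, which mirrors how each `sess.run()` in the script above pulls a new example.

```python
from collections import deque


class FakeRecordQueue:
    """Toy stand-in for a queue-backed reader (NOT a TensorFlow API).

    Every run() call consumes exactly one record, no matter how many
    fields are requested in that call.
    """

    def __init__(self, records):
        self._records = deque(records)

    def run(self, fields):
        record = self._records[0]
        self._records.rotate(-1)  # advance; cycle so the toy queue never empties
        return [record[name] for name in fields]


# Value counts and shapes of the two images stored in the TFRecord file.
RECORDS = [
    {"num_values": 14410143, "shape": (1797, 2673, 3)},
    {"num_values": 230400, "shape": (240, 320, 3)},
]


def element_count(shape):
    height, width, channels = shape
    return height * width * channels


# Wrong: three separate run() calls touch three dequeued records.
queue = FakeRecordQueue(RECORDS)
(printed_count,) = queue.run(["num_values"])   # 1st record (cat.jpg)
(requested_shape,) = queue.run(["shape"])      # 2nd record (small image)
(actual_count,) = queue.run(["num_values"])    # 1st record again
print(actual_count, element_count(requested_shape))  # 14410143 230400

# Right: one run() call per iteration fetches all fields of the same record.
queue = FakeRecordQueue(RECORDS)
consistent = []
for _ in range(4):
    count, shape = queue.run(["num_values", "shape"])
    consistent.append(count == element_count(shape))
print(consistent)
```

In the mismatched case the pair of numbers is the same pair that appears in the InvalidArgumentError, which is consistent with the explanation that data and shape were fetched from different examples.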

Correct way to read the file:

# coding: utf8
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

OUTPUT_DATA = '/home/error/tt.tfrecords'  # TFRecord file written above
train_queue = tf.train.string_input_producer([OUTPUT_DATA])


def read_file(file_queue, sess):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(file_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'image': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'channels': tf.FixedLenFeature([], tf.int64),
        })

    image, label = features['image'], features['label']
    height, width = features['height'], features['width']
    channels = features['channels']
    decoded_image = tf.decode_raw(image, tf.uint8)

    # Only build the graph here and return the tensors. Running them in
    # separate sess.run() calls would take a different example from the
    # queue each time, which is what caused the reshape error.
    return decoded_image, label, height, width, channels


with tf.Session() as sess:
    tf.local_variables_initializer().run()

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    decoded_image, label, height, width, channels = read_file(train_queue, sess)

    for _ in range(6):
        # Fetch every tensor in a single sess.run per iteration, so that
        # all values are guaranteed to come from the same example
        decoded_image_val, label_val, height_val, width_val, channels_val = sess.run(
            [decoded_image, label, height, width, channels])
        print(height_val)
        reshaped_decoded_image = np.reshape(decoded_image_val, [height_val, width_val, channels_val])
        plt.imshow(reshaped_decoded_image)
        plt.show()

    coord.request_stop()
    coord.join(threads)


Original article: https://www.cnblogs.com/yangxiaoling/p/9629831.html

Time: 2024-10-08 18:29:35
