Errors when training on multiple GPUs on a single machine

Problem 1:

When training a model on multiple GPUs in Keras, I got the error AttributeError: '_TfDeviceCaptureOp' object has no attribute '_set_device_from_string'. In other words, the '_TfDeviceCaptureOp' object has no attribute named '_set_device_from_string'.

Solution: I suspected a problem with my TensorFlow version, so I downgraded TensorFlow from 1.14.0 to 1.12.0, which resolved this error. Problem 2 then appeared.
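Before and after the downgrade it helps to confirm which TensorFlow version is actually installed. The following is a minimal sketch, not part of the original post: the helper names parse_version, matches_known_good and installed_tensorflow_version are hypothetical, and 1.12.0 is used as the reference only because it is the version reported as working above.

```python
# Version reported as working in this post; treat it as an assumption
# for any other environment.
KNOWN_GOOD = (1, 12, 0)


def parse_version(version):
    """Turn a string such as '1.12.0' into a tuple of ints.

    Non-digit suffixes (for example 'rc1') are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def matches_known_good(version):
    """Return True if the version string equals the known-good version."""
    return parse_version(version) == KNOWN_GOOD


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__
```

Calling matches_known_good(installed_tensorflow_version()) when the latter is not None tells you whether the environment matches the one that worked here. If it does not, pinning the package with pip (for example tensorflow==1.12.0, or the GPU build if that is what you use) is the usual way to downgrade.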

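Problem 2:

Can't concatenate scalars (use tf.stack instead) for 'yolo_loss_1/concat' (op: 'ConcatV2') with input shapes: [], [], []. The message says that the three inputs to the concat op are scalars (shape []), which cannot be concatenated.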

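Solution: The error baffled me at first. Reading the code, I found that the call model = multi_gpu_model(model, gpus=2) was at fault: the model object passed in was the wrong one. It should be the bare network architecture and contain nothing else, but I had placed the call after the model file was loaded. I changed the call to model_body = multi_gpu_model(model_body, gpus=2), where model_body is a Keras Model object.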
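The fix can be sketched as a small helper that replicates only the bare network. This is an illustration under assumptions, not the original training script: wrap_body_for_multi_gpu and its wrap_fn parameter are hypothetical names, and the import assumes a Keras 2.x release that still ships keras.utils.multi_gpu_model.

```python
def wrap_body_for_multi_gpu(model_body, gpus=2, wrap_fn=None):
    """Replicate the bare network (model_body) across several GPUs.

    model_body is expected to be the plain Keras Model for the network,
    without any loss layer attached to it.
    wrap_fn is a hypothetical hook so the helper can be exercised without
    Keras installed; by default keras.utils.multi_gpu_model is used.
    """
    # multi_gpu_model needs at least two GPUs, so with fewer than two
    # the model is returned unchanged.
    if gpus < 2:
        return model_body
    if wrap_fn is None:
        # Assumption: an older Keras 2.x release that provides this function.
        from keras.utils import multi_gpu_model as wrap_fn
    return wrap_fn(model_body, gpus=gpus)
```

In a training script this would be called as model_body = wrap_body_for_multi_gpu(model_body, gpus=2) right after the network is built. Wrapping before the loss is attached is my reading of the post, so adapt it to your own script.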

Original article: https://www.cnblogs.com/dan-baishucaizi/p/12326026.html

Time: 2024-11-09 03:03:17
