Configuring a tensorflow environment remotely

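1. To compare against someone else's method, the environment I need is: Python 3.6.4, Keras 2.1.6, Tensorflow 1.7.0

On my own machine (anaconda3), I first tried the existing environment with tensorflow-gpu 1.13.1. The first part of the code ran, but the part that saves the model did not work and raised an error.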

 callbacks=[ModelCheckpoint(filepath=filepath_INCV, monitor='val_acc', verbose=1, save_best_only=INCV_save_best),

So I had to set up a virtual environment with exactly the same versions.
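As a quick way to confirm that an environment matches the versions listed in step 1, a small check along these lines can be run inside it. This is only a sketch: the names `REQUIRED`, `installed_version` and `check_versions` are made up for illustration, and it assumes each package exposes a `__version__` attribute, as tensorflow and keras do.

```python
import importlib
import platform

# Versions named in step 1; the dict and helper names are illustrative only.
REQUIRED = {"python": "3.6.4", "keras": "2.1.6", "tensorflow": "1.7.0"}


def installed_version(name):
    """Return the installed version string for `name`, or None if it is missing."""
    if name == "python":
        return platform.python_version()
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_versions(required, lookup=installed_version):
    """Map each package name to a tuple (wanted, found, matches)."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report
```

Calling `check_versions(REQUIRED)` inside the environment returns one entry per package, so a mismatch such as tensorflow 1.13.1 instead of 1.7.0 shows up immediately.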

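2. I created a new virtual environment, tf1.7.0. The first attempt did not install successfully, apparently because the Python version I picked did not match exactly. After deleting the environment and starting over, it worked. It seems that installing tensorflow online with conda install tensorflow-gpu=1.7.0 also automatically installs the required cuda 9.0 toolkit and cudnn packages.

After that, running import tensorflow as tf complained about missing packages such as numpy and pandas. Installing them one by one with conda install is enough; conda picks compatible versions automatically.

When importing tensorflow, it also reports: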
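The sequence described in step 2 can be written down as follows. This is a sketch that only builds and prints the conda command lines rather than running them. The function name `conda_commands` is illustrative, the environment name and package versions come from the notes above, and `-n` and `-y` are the usual conda options for naming the environment and skipping the confirmation prompt.

```python
import shlex


def conda_commands(env_name="tf1.7.0", python_version="3.6.4"):
    """Build the conda command lines for the setup described in step 2."""
    return [
        # Create the environment with the exact Python version first.
        ["conda", "create", "-y", "-n", env_name, f"python={python_version}"],
        # conda resolves the matching cudatoolkit and cudnn packages itself.
        ["conda", "install", "-y", "-n", env_name, "tensorflow-gpu=1.7.0"],
        # Packages that were reported missing when importing tensorflow.
        ["conda", "install", "-y", "-n", env_name, "numpy", "pandas"],
    ]


for command in conda_commands():
    print(" ".join(shlex.quote(part) for part in command))
```

Each list could also be passed to `subprocess.run(command, check=True)` to execute it, but printing first makes it easy to review the exact versions before anything is installed.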

/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:458: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:459: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:460: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:461: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:462: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/guixj/anaconda3/envs/tf1.7.0/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:465: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])

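This is not a big problem, though. These are only FutureWarning messages raised by numpy when tensorflow's dtypes.py runs, so they can be fixed or simply left alone.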
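If the warnings are just noise, one option is to hide them during the import. The helper below is a minimal sketch using only the standard library; the name `import_quietly` is made up for illustration. It hides the messages but does not change the underlying numpy and tensorflow version mismatch.

```python
import importlib
import warnings


def import_quietly(module_name):
    """Import a module while ignoring FutureWarning messages emitted during the import."""
    with warnings.catch_warnings():
        # The filter only applies inside this block; previous filters are restored afterwards.
        warnings.simplefilter("ignore", category=FutureWarning)
        return importlib.import_module(module_name)
```

In the environment above this would be used as `tf = import_quietly("tensorflow")` in place of the plain import.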

2019-12-24 10:12:41.915407: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2019-12-24 10:12:41.915435: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2019-12-24 10:12:41.915442: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2019-12-24 10:12:41.915447: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2019-12-24 10:12:41.915469: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

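This is not a big problem either. The messages only say that this tensorflow binary was not compiled to use CPU instructions that the machine supports, so it can be addressed or left alone.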
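If these lines should not appear at all, tensorflow's C++ log level can be raised through the `TF_CPP_MIN_LOG_LEVEL` environment variable. A minimal sketch, assuming the variable is set before tensorflow is imported:

```python
import os

# Set this before importing tensorflow.
# "0" shows everything, "1" hides INFO, "2" also hides WARNING, "3" also hides ERROR.
# The cpu_feature_guard lines above are logged at level W, so "2" is needed to hide them.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# import tensorflow as tf  # import only after the variable is set
```

This only silences the messages. Actually using SSE4/AVX/FMA would require a tensorflow build compiled with those instructions.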

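After installing keras I found a small bug: keras and tensorflow were not compatible, and the following error was raised. Following https://ask.csdn.net/questions/687001, I changed tf.nn.softmax(*, axis=axis) to use dim=axis instead, and that fixed it.

softmax() got an unexpected keyword argument 'axis'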
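The idea behind that fix can be sketched without editing installed files: call the function with `axis=` and fall back to `dim=` when the keyword is rejected. The wrapper name `softmax_compat` is hypothetical, and this is not the patch from the linked answer, only an illustration of the same keyword swap.

```python
def softmax_compat(softmax_fn, x, axis=-1):
    """Call softmax_fn with `axis=`, falling back to the older `dim=` keyword."""
    try:
        return softmax_fn(x, axis=axis)
    except TypeError:
        # Older signatures reject `axis`. Note that this also catches any other
        # TypeError raised by the call, so treat it as a diagnostic aid only.
        return softmax_fn(x, dim=axis)
```

With tensorflow installed it would be called as `softmax_compat(tf.nn.softmax, logits, axis=-1)`, where `logits` stands for whatever tensor is being normalized.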

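3. Configuring the above environment on a remote Ubuntu 18.04 server

This is actually a bit troublesome, because tensorflow-gpu 1.7.0 requires cuda 9.0, and cuda 9.0 originally only supports installation on ubuntu 16.04 and ubuntu 17.10. https://blog.csdn.net/hellocsz/article/details/88372819 points out that python 3.6.4 corresponds to anaconda3-5.1.0, so I downloaded that anaconda version from https://mirrors.tuna.tsinghua.edu.cn.

https://blog.csdn.net/qq_27825451/article/details/89082978 lists the following (the post also explains in detail the relationship between cuda, cudnn and the graphics driver):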

Version                 Python          Compiler    Build tool     cuDNN    CUDA
tensorflow_gpu-1.7.0    2.7, 3.3-3.6    GCC 4.8     Bazel 0.9.0    7        9

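From the nvidia website I downloaded cuda 9.2 and one patch file for it (there is no build for Ubuntu 18.04, so I downloaded the Ubuntu 17.10 one, although according to the blog posts below the ubuntu 16.04 one might be the better choice), together with the matching cudnn file.

The gcc (g++) compiler and the kernel on ubuntu 18.04 are both too new, which can cause problems. Following post A ( https://www.jianshu.com/p/00c37b09f0f3 ) and the post it links to, https://www.jianshu.com/p/f66eed3a3a25 , I switched the gcc (g++) version.

I then continued with post A's instructions to install cuda 9.2. On the first attempt I chose to install the graphics driver as well, but got the following error: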
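To confirm which gcc is actually active after switching, the compiler can be asked for its version. The sketch below relies on the standard `gcc -dumpversion` option; the helper names `parse_major` and `active_gcc_major` are illustrative, and the target version is whatever the linked posts recommend, which is not restated here.

```python
import subprocess


def parse_major(version_text):
    """Return the major version number from text such as '4.8.5' or '7'."""
    return int(version_text.strip().split(".")[0])


def active_gcc_major(compiler="gcc"):
    """Ask the compiler found on PATH for its version; None if it cannot be run."""
    try:
        output = subprocess.check_output([compiler, "-dumpversion"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_major(output.decode())
```

Passing `compiler="g++"` checks the C++ compiler the same way, which is useful because gcc and g++ are switched separately.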

Installing the NVIDIA display driver...
The driver installation is unable to locate the kernel source. Please make sure that the kernel source packages are installed and set up correctly.
If you know that the kernel source packages are installed and set up correctly, you may pass the location of the kernel source with the '--kernel-source-path' flag.

===========
= Summary =
===========

Driver:   Installation Failed
Toolkit:  Installation skipped
Samples:  Not Selected

The other blog posts I looked up all say the cause is a kernel version that is too new:

https://askubuntu.com/questions/829890/nvidia-driver-install-fails-unable-to-locate-the-kernel-source

https://blog.51cto.com/xiaoxiaozhou/2344649?source=dra

2cto.com/net/201904/804672.html

https://blog.csdn.net/net_wolf/article/details/100178800

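They say the kernel needs to be downgraded, but that seemed like too much trouble, and post A does not mention it. So, hoping to get lucky, I tried again, this time without choosing to install the graphics driver.

Oddly enough, this time the toolkit installed successfully. Could it be that a too-new kernel only affects the graphics driver and not the cuda toolkit? I am not entirely sure yet.

Next I installed cudnn v7.4. Oddly, the downloaded file is named cudnn-9.2, presumably to match the cuda version number. Setting it up as described in https://blog.csdn.net/fengliang4616/article/details/90142747 is enough.

Then I went on to install anaconda. Take care when choosing the anaconda version, so that the Python it installs by default is the required Python version; everything else can be left at the defaults.

Surprise! The remote server can actually reach the internet, so I ran conda install tensorflow-gpu=1.7.0, and it reinstalled Python, cudnn and the cuda toolkit: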
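The installer error above is about locating the kernel source, which the driver needs in order to build its kernel module; the toolkit consists of user-space libraries and tools. One hedged way to check whether headers for the running kernel are installed is sketched below. It assumes the usual Linux layout in which /lib/modules/<release>/build points at the headers, and the helper names are illustrative. It does not by itself tell whether the kernel is too new for a given driver.

```python
import os
import platform


def kernel_build_dir(release):
    """Path where the headers for a given kernel release are normally linked."""
    return os.path.join("/lib/modules", release, "build")


def kernel_headers_present(release=None):
    """True if the build directory for the running (or given) kernel exists."""
    release = release or platform.release()
    return os.path.isdir(kernel_build_dir(release))
```

`platform.release()` returns the same string as `uname -r`, so `kernel_headers_present()` with no argument checks the kernel that is currently running.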
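Since the file name shows the cuda version rather than the cudnn version, the installed cudnn version can be read from the version macros in cudnn.h. The sketch below assumes the header was copied to /usr/local/cuda/include/cudnn.h, which depends on how the linked instructions were followed, and that it is a cudnn 7 header defining `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`. The function names are illustrative.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if not found."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(header_path="/usr/local/cuda/include/cudnn.h"):
    """Read the header at the assumed install path; None if it cannot be read."""
    try:
        with open(header_path) as header:
            return parse_cudnn_version(header.read())
    except OSError:
        return None
```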

Downloading and Extracting Packages
xz 5.2.4: ############################################################## | 100%
pip 19.3.1: ############################################################ | 100%
python 3.6.6: ########################################################## | 100%
absl-py 0.8.1: ######################################################### | 100%
libedit 3.1.20181209: ################################################## | 100%
tensorflow-gpu 1.7.0: ################################################## | 100%
cupti 9.0.176: ######################################################### | 100%
openssl 1.0.2t: ######################################################## | 100%
libprotobuf 3.6.0: ##################################################### | 100%
bleach 1.5.0: ########################################################## | 100%
cudnn 7.6.4: ########################################################### | 100%
certifi 2019.11.28: #################################################### | 100%
sqlite 3.30.1: ######################################################### | 100%
readline 7.0: ########################################################## | 100%
libgcc-ng 9.1.0: ####################################################### | 100%
cudatoolkit 9.0: ####################################################### | 100%
ca-certificates 2019.11.27: ######################################################################################################################################################################## | 100%
werkzeug 0.16.0: ################################################################################################################################################################################### | 100%
grpcio 1.12.1: ##################################################################################################################################################################################### | 100%
protobuf 3.6.0: #################################################################################################################################################################################### | 100%
zlib 1.2.11: ####################################################################################################################################################################################### | 100%
_libgcc_mutex 0.1: ################################################################################################################################################################################# | 100%
numpy 1.14.2: ###################################################################################################################################################################################### | 100%
astor 0.8.0: ####################################################################################################################################################################################### | 100%
gast 0.3.2: ######################################################################################################################################################################################## | 100%
html5lib 0.9999999: ################################################################################################################################################################################ | 100%
markdown 3.1.1: #################################################################################################################################################################################### | 100%
setuptools 42.0.2: ################################################################################################################################################################################# | 100%
blas 1.0: ########################################################################################################################################################################################## | 100%
wheel 0.33.6: ###################################################################################################################################################################################### | 100%
tensorflow-gpu-base 1.7.0: ######################################################################################################################################################################### | 100%
tk 8.6.8: ########################################################################################################################################################################################## | 100%
six 1.13.0: ######################################################################################################################################################################################## | 100%
termcolor 1.1.0: ################################################################################################################################################################################### | 100%
tensorboard 1.7.0: ################################################################################################################################################################################# | 100%
ncurses 6.1: ####################################################################################################################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Next, I installed keras-2.1.6 online:

~/anaconda3# conda install keras=2.1.6
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.4.10
  latest version: 4.8.0

Please update conda by running

    $ conda update -n base conda

## Package Plan ##

  environment location: /root/anaconda3

  added / updated specs:
    - keras=2.1.6

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    tensorflow-base-1.7.0      |   py36hdbcaa40_2        38.7 MB
    keras-2.1.6                |           py36_0         500 KB
    tensorflow-1.7.0           |                0           3 KB
    ------------------------------------------------------------
                                           Total:        39.2 MB

The following NEW packages will be INSTALLED:

    keras:           2.1.6-py36_0
    tensorflow:      1.7.0-0
    tensorflow-base: 1.7.0-py36hdbcaa40_2

Proceed ([y]/n)? y

Downloading and Extracting Packages
tensorflow-base 1.7.0: ############################################################################################################################################################################# | 100%
keras 2.1.6: ####################################################################################################################################################################################### | 100%
tensorflow 1.7.0: ################################################################################################################################################################################## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
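The package plan above shows that installing keras also brought in the tensorflow and tensorflow-base 1.7.0 packages next to the tensorflow-gpu package installed earlier. It may therefore be worth checking which build gets imported and whether it still sees the GPU. This check is not part of the original notes; it is a minimal sketch using `tf.test.is_gpu_available()` from the TF 1.x API, and the function name `tensorflow_gpu_status` is illustrative.

```python
def tensorflow_gpu_status():
    """Return (version, gpu_available), or None if tensorflow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # is_gpu_available() returns False for a CPU-only build or when no GPU is usable.
    return tf.__version__, bool(tf.test.is_gpu_available())
```

A result such as a 1.7.0 version string paired with False would suggest that the CPU build, or a driver problem, is in the way.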

# To be continued

Original post: https://www.cnblogs.com/Gelthin2017/p/12094276.html

Time: 2024-11-08 21:30:26
