Official documentation
Make sure the component versions correspond one-to-one:
https://tensorflow.google.cn/install/source
See also
Installing TensorFlow-GPU on Ubuntu 16.04 with an NVIDIA 1080 Ti
Installation environment
- OS: Ubuntu 18.04.2 desktop
- GPU: NVIDIA GeForce RTX 2080
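- GPU driver: NVIDIA-Linux-x86_64-410.72.run (the nvidia-smi output below reports 410.48, the driver version bundled with the CUDA 10.0 installer)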
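- CUDA: cuda_10.0.130_410.48_linux
- cuDNN:
  - libcudnn7_7.5.0.56-1+cuda10.0_amd64
  - libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64
  - libcudnn7-doc_7.5.0.56-1+cuda10.0_amd64
- tensorflow-gpu: 1.13.1
When choosing versions, do not install the very latest release. Step back one or two stable versions, and check that the driver, CUDA, cuDNN, and TensorFlow versions are compatible with each other.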
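The version-matching rule can be checked mechanically before anything is installed. The following is a minimal sketch, not an official tool: the `REQUIRED` table holds only the pairing used in this article (tensorflow-gpu 1.13.1 with CUDA 10.0), and the helper names `cuda_version_from_installer` and `is_compatible` are hypothetical. For any other release, take the values from the official compatibility table.

```python
import re

# Pairing taken from this article only; extend it from the official table.
REQUIRED = {"1.13.1": "10.0"}  # tensorflow-gpu version -> CUDA major.minor


def cuda_version_from_installer(filename):
    """Extract 'major.minor' from a runfile name such as cuda_10.0.130_410.48_linux."""
    match = re.match(r"cuda_(\d+)\.(\d+)\.", filename)
    if match is None:
        raise ValueError("unrecognised installer name: " + filename)
    return "{}.{}".format(match.group(1), match.group(2))


def is_compatible(tf_version, installer_filename):
    """Return True when the installer's CUDA version is the one TensorFlow expects."""
    required = REQUIRED.get(tf_version)
    if required is None:
        raise KeyError("no entry for tensorflow-gpu " + tf_version)
    return cuda_version_from_installer(installer_filename) == required


if __name__ == "__main__":
    print(is_compatible("1.13.1", "cuda_10.0.130_410.48_linux"))
```

A missing table entry raises an error instead of returning False, so an unknown TensorFlow version is never mistaken for an incompatible one.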
Check the NVIDIA GPU driver
netc@gpu-2:~$ nvidia-smi
Mon Mar 25 23:16:33 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2080 Off | 00000000:03:00.0 Off | N/A |
| 24% 40C P0 1W / 225W | 0MiB / 7949MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
netc@gpu-2:~$
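When the driver version is needed inside a script, the full table is awkward to parse. One possible approach, sketched here under the assumption that `nvidia-smi` is on the PATH, is to use its CSV query mode. The function names `parse_gpu_query` and `query_gpus` are made up for this example.

```python
import subprocess


def parse_gpu_query(csv_text):
    """Parse 'name, driver_version' CSV rows into a list of (name, driver) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, driver = [field.strip() for field in line.split(",", 1)]
        gpus.append((name, driver))
    return gpus


def query_gpus():
    """Run nvidia-smi and return parsed rows, or None if the tool is unavailable."""
    cmd = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]
    try:
        out = subprocess.check_output(cmd, universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is missing or exited with an error (for example, no driver loaded)
        return None
    return parse_gpu_query(out)


if __name__ == "__main__":
    print(query_gpus())
```

On the machine in this article, the parsed result should correspond to the GeForce RTX 2080 and driver 410.48 shown in the table above.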
Install CUDA
netc@gpu-2:/data/tools/GeForce-RTX-2080$ sudo sh cuda_10.0.130_410.48_linux.run
-----------------
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: y
Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: n
Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: n
Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-10.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/netc ]:
Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Installing the CUDA Samples in /home/netc ...
Copying samples to /home/netc/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Installed
Toolkit: Installed in /usr/local/cuda-10.0
Samples: Installed in /home/netc
Please make sure that
- PATH includes /usr/local/cuda-10.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
Logfile is /tmp/cuda_install_13131.log
netc@gpu-2:/data/tools/GeForce-RTX-2080$
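The installer summary asks that PATH and LD_LIBRARY_PATH include the CUDA directories. A small sketch for verifying this from the current environment follows. It assumes the default toolkit location printed above, and `missing_cuda_paths` is a hypothetical helper. It does not detect the alternatives the installer mentions (an entry in /etc/ld.so.conf) or paths that go through the /usr/local/cuda symbolic link.

```python
import os

CUDA_HOME = "/usr/local/cuda-10.0"  # toolkit location reported by the installer


def missing_cuda_paths(environ):
    """Return the names of the variables that lack the CUDA 10.0 directories."""
    wanted = {
        "PATH": CUDA_HOME + "/bin",
        "LD_LIBRARY_PATH": CUDA_HOME + "/lib64",
    }
    missing = []
    for var, directory in wanted.items():
        entries = environ.get(var, "").split(os.pathsep)
        if directory not in entries:
            missing.append(var)
    return missing


if __name__ == "__main__":
    for var in missing_cuda_paths(os.environ):
        print("{} does not include the CUDA 10.0 directory".format(var))
```

If a variable is reported, add the corresponding directory to it in the shell profile and open a new shell before running `nvcc`.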
Check the CUDA version
netc@gpu-2:/data/tools/GeForce-RTX-2080$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
netc@gpu-2:/data/tools/GeForce-RTX-2080$
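The release number on the last line of the `nvcc -V` output is the value to compare against TensorFlow's requirement. As a rough sketch (the function name `parse_nvcc_release` is an assumption of this example), it can be extracted with a regular expression:

```python
import re


def parse_nvcc_release(text):
    """Return (release, full_version) from `nvcc -V` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+), V(\d+\.\d+\.\d+)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Last line of the output shown above.
SAMPLE = "Cuda compilation tools, release 10.0, V10.0.130"

if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE))
```

For the sample line, the first element of the result is the `10.0` that tensorflow-gpu 1.13.1 expects.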
Upgrade pip3
netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --upgrade pip
The directory '/home/netc/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/netc/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pip
Downloading http://mirrors.aliyun.com/pypi/packages/d8/f3/413bab4ff08e1fc4828dfc59996d721917df8e8583ea85385d51125dceff/pip-19.0.3-py2.py3-none-any.whl (1.4MB)
100% |████████████████████████████████| 1.4MB 4.0MB/s
Installing collected packages: pip
Found existing installation: pip 9.0.1
Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Successfully installed pip-19.0.3
Install tensorflow-gpu
netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --index-url https://mirrors.aliyun.com/pypi/simple tensorflow-gpu
The directory '/home/netc/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/netc/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Looking in indexes: https://mirrors.aliyun.com/pypi/simple
Collecting tensorflow-gpu
Downloading https://mirrors.aliyun.com/pypi/packages/7b/b1/0ad4ae02e17ddd62109cd54c291e311c4b5fd09b4d0678d3d6ce4159b0f0/tensorflow_gpu-1.13.1-cp36-cp36m-manylinux1_x86_64.whl (345.2MB)
100% |████████████████████████████████| 345.2MB 4.4MB/s
Collecting absl-py>=0.1.6 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/da/3f/9b0355080b81b15ba6a9ffcf1f5ea39e307a2778b2f2dc8694724e8abd5b/absl-py-0.7.1.tar.gz (99kB)
100% |████████████████████████████████| 102kB 4.7MB/s
Collecting astor>=0.6.0 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl
Collecting numpy>=1.13.3 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/35/d5/4f8410ac303e690144f0a0603c4b8fd3b986feb2749c435f7cdbb288f17e/numpy-1.16.2-cp36-cp36m-manylinux1_x86_64.whl (17.3MB)
100% |████████████████████████████████| 17.3MB 4.3MB/s
Collecting keras-applications>=1.0.6 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/90/85/64c82949765cfb246bbdaf5aca2d55f400f792655927a017710a78445def/Keras_Applications-1.0.7-py2.py3-none-any.whl (51kB)
100% |████████████████████████████████| 61kB 7.2MB/s
Collecting gast>=0.2.0 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting tensorboard<1.14.0,>=1.13.0 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/0f/39/bdd75b08a6fba41f098b6cb091b9e8c7a80e1b4d679a581a0ccd17b10373/tensorboard-1.13.1-py3-none-any.whl (3.2MB)
100% |████████████████████████████████| 3.2MB 4.2MB/s
Collecting termcolor>=1.1.0 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Requirement already satisfied: wheel>=0.26 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (0.30.0)
Collecting grpcio>=1.8.6 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/f4/dc/5503d89e530988eb7a1aed337dcb456ef8150f7c06132233bd9e41ec0215/grpcio-1.19.0-cp36-cp36m-manylinux1_x86_64.whl (10.8MB)
100% |████████████████████████████████| 10.8MB 4.1MB/s
Collecting tensorflow-estimator<1.14.0rc0,>=1.13.0 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/bb/48/13f49fc3fa0fdf916aa1419013bb8f2ad09674c275b4046d5ee669a46873/tensorflow_estimator-1.13.0-py2.py3-none-any.whl (367kB)
100% |████████████████████████████████| 368kB 9.9MB/s
Collecting protobuf>=3.6.1 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/c5/60/ca38e967360212ddbb004141a70f5f6d47296e1fba37964d8ac6cb631921/protobuf-3.7.0-cp36-cp36m-manylinux1_x86_64.whl (1.2MB)
100% |████████████████████████████████| 1.2MB 3.9MB/s
Requirement already satisfied: six>=1.10.0 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (1.11.0)
Collecting keras-preprocessing>=1.0.5 (from tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/c0/bf/0315ef6a9fd3fc2346e85b0ff1f5f83ca17073f2c31ac719ab2e4da0d4a3/Keras_Preprocessing-1.0.9-py2.py3-none-any.whl (59kB)
100% |████████████████████████████████| 61kB 4.8MB/s
Collecting h5py (from keras-applications>=1.0.6->tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/30/99/d7d4fbf2d02bb30fb76179911a250074b55b852d34e98dd452a9f394ac06/h5py-2.9.0-cp36-cp36m-manylinux1_x86_64.whl (2.8MB)
100% |████████████████████████████████| 2.8MB 4.1MB/s
Collecting markdown>=2.6.8 (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl (89kB)
100% |████████████████████████████████| 92kB 4.4MB/s
Collecting werkzeug>=0.11.15 (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/24/4d/2fc4e872fbaaf44cc3fd5a9cd42fda7e57c031f08e28c9f35689e8b43198/Werkzeug-0.15.1-py2.py3-none-any.whl (328kB)
100% |████████████████████████████████| 337kB 4.4MB/s
Collecting mock>=2.0.0 (from tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/e6/35/f187bdf23be87092bd0f1200d43d23076cee4d0dec109f195173fd3ebc79/mock-2.0.0-py2.py3-none-any.whl (56kB)
100% |████████████████████████████████| 61kB 4.9MB/s
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from protobuf>=3.6.1->tensorflow-gpu) (39.0.1)
Collecting pbr>=0.11 (from mock>=2.0.0->tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu)
Downloading https://mirrors.aliyun.com/pypi/packages/14/09/12fe9a14237a6b7e0ba3a8d6fcf254bf4b10ec56a0185f73d651145e9222/pbr-5.1.3-py2.py3-none-any.whl (107kB)
100% |████████████████████████████████| 112kB 4.4MB/s
Installing collected packages: absl-py, astor, numpy, h5py, keras-applications, gast, protobuf, markdown, werkzeug, grpcio, tensorboard, termcolor, pbr, mock, tensorflow-estimator, keras-preprocessing, tensorflow-gpu
Running setup.py install for absl-py ... done
Running setup.py install for gast ... done
Found existing installation: protobuf 3.0.0
Uninstalling protobuf-3.0.0:
Successfully uninstalled protobuf-3.0.0
Running setup.py install for termcolor ... done
Successfully installed absl-py-0.7.1 astor-0.7.1 gast-0.2.2 grpcio-1.19.0 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.0.1 mock-2.0.0 numpy-1.16.2 pbr-5.1.3 protobuf-3.7.0 tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1 termcolor-1.1.0 werkzeug-0.15.1
netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2019-03-25 23:32:23.967770: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-25 23:32:23.968691: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2ce8960 executing computations on platform CUDA. Devices:
2019-03-25 23:32:23.968749: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-03-25 23:32:23.992261: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200065000 Hz
2019-03-25 23:32:23.994027: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x33acc10 executing computations on platform Host. Devices:
2019-03-25 23:32:23.994073: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2019-03-25 23:32:23.994507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:03:00.0
totalMemory: 7.76GiB freeMemory: 7.62GiB
2019-03-25 23:32:23.994558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-25 23:32:23.995840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-25 23:32:23.995878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-03-25 23:32:23.995900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-03-25 23:32:23.996310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7413 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:03:00.0, compute capability: 7.5)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
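The session log shows that TensorFlow created a GPU device, but the same check can be done without reading the log. This sketch assumes the TensorFlow 1.x API used in this article, where `tf.test.is_gpu_available()` exists (later releases deprecate it). The wrapper name `gpu_visible_to_tensorflow` is hypothetical.

```python
def gpu_visible_to_tensorflow():
    """Return True/False for GPU visibility, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        # Covers both "not installed" and a missing shared library at import time.
        return None
    return bool(tf.test.is_gpu_available())


if __name__ == "__main__":
    print(gpu_visible_to_tensorflow())
```

A result of `None` means the import itself failed, which is the situation covered in the error summary that follows.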
Error summary:
Running import tensorflow fails with:
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Cause:
The TensorFlow version does not match the installed CUDA version. This TensorFlow build requires CUDA 10.0.
Version compatibility table: https://tensorflow.google.cn/install/source
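To see whether the dynamic loader can find the library named in the error before importing TensorFlow, one option is to try opening it with `ctypes`. This is only a sketch: `can_load` is a hypothetical helper, and of the names listed, only `libcublas.so.10.0` comes from the error above. `libcudart.so.10.0` and `libcudnn.so.7` are assumed to be the other libraries this setup needs.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Only libcublas.so.10.0 appears in the error; the other two are assumptions.
    for name in ("libcublas.so.10.0", "libcudart.so.10.0", "libcudnn.so.7"):
        print(name, "found" if can_load(name) else "MISSING")
```

A library reported as missing means either that a different CUDA version is installed or that its lib64 directory is not on the loader's search path.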
Check the CUDA version
cat /usr/local/cuda/version.txt
Check the cuDNN version
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
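The grep above prints the three version macros on separate lines. A small sketch that combines them into one version string is shown below. It assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` as plain integers, as cuDNN 7 does, and `parse_cudnn_version` is a name invented for this example.

```python
import re


def parse_cudnn_version(header_text):
    """Return 'major.minor.patch' from cudnn.h contents, or None if a macro is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    # Header path taken from the command above; adjust it if cuDNN lives elsewhere.
    with open("/usr/local/cuda/include/cudnn.h") as header:
        print(parse_cudnn_version(header.read()))
```

For the packages listed in the installation environment, the expected major and minor numbers are 7 and 5.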
Original article: https://blog.51cto.com/moerjinrong/2368993
Time: 2024-10-13 01:30:44