Environment:
1. CentOS 7.3
2. NVIDIA Corporation GP106 [GeForce GTX 1060 6GB]
Installing nvidia-docker
a. Install Docker first (see "Installing Docker on CentOS 7").
b. Install nvidia-docker:
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
sudo rpm -i /tmp/nvidia-docker*.rpm && rm /tmp/nvidia-docker*.rpm
sudo systemctl start nvidia-docker

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
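Before running the install commands, it can help to confirm that the tools they rely on are present. The following is a minimal sketch, not part of nvidia-docker: missing_commands is a hypothetical helper name, and treating a host-side nvidia-smi as a sign that the NVIDIA driver is installed is an assumption.

```shell
#!/bin/sh
# Sketch: print each of the given commands that cannot be found on PATH.
# missing_commands is a hypothetical helper name, not an nvidia-docker tool.
missing_commands() {
    for cmd in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For example, `missing_commands docker nvidia-smi wget rpm` prints nothing when all four are available, and otherwise prints one missing name per line.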
If the test fails with the following error:
[root@localhost ~]# nvidia-docker run --rm nvidia/cuda nvidia-smi
/usr/bin/docker-current: Error response from daemon: create nvidia_driver_384.69: create nvidia_driver_384.69: Error looking up volume plugin nvidia-docker: plugin not found.
See '/usr/bin/docker-current run --help'.
then check whether the nvidia-docker service is running, and start it if it is not:
[root@localhost ~]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
[root@localhost ~]# systemctl start nvidia-docker
[root@localhost ~]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-08-30 03:18:32 CST; 5s ago
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
  Process: 11135 ExecStartPost=/bin/sh -c /bin/echo unix://$SOCK_DIR/nvidia-docker.sock > $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 11131 ExecStartPost=/bin/sh -c /bin/mkdir -p $( dirname $SPEC_FILE ) (code=exited, status=0/SUCCESS)
 Main PID: 11130 (nvidia-docker-p)
   CGroup: /system.slice/nvidia-docker.service
           └─11130 /usr/bin/nvidia-docker-plugin -s /var/lib/nvidia-docker

Aug 30 03:18:32 localhost.localdomain systemd[1]: Starting NVIDIA Docker plugin...
Aug 30 03:18:32 localhost.localdomain systemd[1]: Started NVIDIA Docker plugin.
Aug 30 03:18:32 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:32 Loading NV...mory
Aug 30 03:18:32 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:32 Loading NV...rary
Aug 30 03:18:33 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:33 Discoverin...ices
Aug 30 03:18:33 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:33 Provisioni...umes
Aug 30 03:18:33 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:33 Serving pl...cker
Aug 30 03:18:33 localhost.localdomain nvidia-docker-plugin[11130]: /usr/bin/nvidia-docker-plugin | 2017/08/30 03:18:33 Serving re...3476
Hint: Some lines were ellipsized, use -l to show in full.
[root@localhost ~]# nvidia-docker run --rm nvidia/cuda nvidia-smi
Tue Aug 29 19:18:46 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.69                 Driver Version: 384.69                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
| 43%   39C    P0    22W / 120W |     10MiB /  6072MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
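The status output above lists the unit as "disabled", so the plugin will presumably be stopped again after the next reboot. The check-and-start step could be scripted roughly as follows. This is a sketch under assumptions: ensure_plugin_running and the SYSTEMCTL override variable are hypothetical names introduced here, and the script is meant to be run as root.

```shell
#!/bin/sh
# Sketch: start the nvidia-docker plugin service if it is not active,
# then enable it so it also starts at boot.
# SYSTEMCTL is a hypothetical override hook (e.g. for trying the logic with a stub).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

ensure_plugin_running() {
    service="${1:-nvidia-docker}"
    # "is-active --quiet" prints nothing and returns 0 only when the unit is running.
    if "$SYSTEMCTL" is-active --quiet "$service"; then
        echo "$service is already running"
        return 0
    fi
    echo "$service is not running; starting it"
    "$SYSTEMCTL" start "$service" || return 1
    # Enable the unit so the service comes back after a reboot.
    "$SYSTEMCTL" enable "$service"
}
```

Calling `ensure_plugin_running` with no argument targets the nvidia-docker unit shown in the output above.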
If you see the following instead:
[root@localhost ~]# nvidia-docker run --rm nvidia/cuda nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.
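Try adding the --privileged=true flag. If the command then succeeds, SELinux is blocking access and its configuration needs to change. Edit /etc/selinux/config:
SELINUX=disabled
SELINUXTYPE=targeted
Alternatively, try switching SELinux to permissive mode:
setenforce 0
A change to /etc/selinux/config takes effect once the server is rebooted; setenforce 0 applies immediately but does not survive a reboot.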
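To confirm what the persistent SELinux setting is before rebooting, the config file can be read back. A minimal sketch follows; both function names are hypothetical, and the default path is the file edited above.

```shell
#!/bin/sh
# Sketch: print the mode set by the SELINUX= line of an SELinux config file.
# selinux_config_mode is a hypothetical helper name.
selinux_config_mode() {
    config="${1:-/etc/selinux/config}"
    # Match only uncommented SELINUX= lines (not SELINUXTYPE=) and keep the last one.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([a-z]*\).*$/\1/p' "$config" | tail -n 1
}

# Succeeds when the configured mode is still "enforcing",
# i.e. the restriction would be back in force after a reboot.
selinux_blocks_after_reboot() {
    [ "$(selinux_config_mode "$1")" = "enforcing" ]
}
```

The mode currently in effect, as opposed to the configured one, can be checked with getenforce.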
References:
https://github.com/NVIDIA/nvidia-docker/issues/407
https://github.com/NVIDIA/nvidia-docker
Time: 2024-10-11 22:42:53