First, install some dependencies:
$ sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libgl1-mesa-dev libglu1-mesa libglu1-mesa-dev libxi-dev
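This guide covers offline installation, for which there are two options: the Debian package (.deb) and the runfile (.run).
Deb installation method
This example uses CUDA 7.5 on Ubuntu 14.04. Before installing, it is best to verify the MD5 checksum of the CUDA installer: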
$ md5sum cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
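If the checksum matches the one given on the official site, you can continue with the installation; otherwise, download the file again or try another version.
Next, install CUDA: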
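The comparison can also be scripted rather than done by eye. The following is a minimal sketch, assuming a POSIX shell with md5sum and awk available; the function name verify_md5 and the EXPECTED_MD5 placeholder are made up for illustration and are not part of the CUDA tooling.

```shell
# verify_md5 FILE EXPECTED   (hypothetical helper, not part of CUDA)
# Returns 0 when the MD5 of FILE equals EXPECTED, ignoring letter case.
verify_md5() {
    # Read the file on stdin so md5sum prints "<hash>  -".
    actual=$(md5sum < "$1" | awk '{print $1}')
    # Normalize the expected value to lowercase, as md5sum prints lowercase hex.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # An empty hash means the file could not be read; treat that as a mismatch.
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example (EXPECTED_MD5 stands for the value shown on the download page):
# verify_md5 cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb "$EXPECTED_MD5" \
#     && echo "checksum OK" || echo "checksum mismatch"
```

Returning an exit status instead of printing makes the check easy to chain in front of the dpkg step.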
$ sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda
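Run installation method
This example uses CUDA 7.0 (if your GPU supports it, prefer 7.5; some frameworks, such as TensorFlow, only support 7.5).
Press Ctrl+Alt+F1 to switch to a text console, then create a blacklist file:
$ sudo vi /etc/modprobe.d/blacklist-nouveau.conf
Enter the following in it: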
blacklist nouveau
options nouveau modeset=0
Save and quit (:wq).
Then run:
$ sudo update-initramfs -u
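Run lsmod | grep nouveau to check whether the nouveau kernel module is still loaded:
$ lsmod | grep nouveau
If there is no output, nouveau has been disabled. If there is output, reboot and check again: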
$ sudo reboot
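The same check can be written as a small function for use in a script. This is a sketch under the assumption that /proc/modules (the file lsmod reads) is available; the name nouveau_loaded is hypothetical.

```shell
# nouveau_loaded [MODULES_FILE]   (hypothetical helper)
# Returns 0 if a module named "nouveau" is listed in MODULES_FILE.
# MODULES_FILE defaults to /proc/modules, where each line starts with
# the module name followed by a space.
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}"
}

# Example:
# if nouveau_loaded; then
#     echo "nouveau is still loaded; reboot and check again"
# else
#     echo "nouveau is disabled"
# fi
```

Anchoring the pattern at the start of the line avoids matching other modules that merely list nouveau as a dependency.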
Stop lightdm:
$ sudo service lightdm stop
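Next, cd to the directory where you put the CUDA .run file:
$ cd <directory containing the CUDA .run file>
Verify the MD5 checksum first:
$ md5sum cuda_7.0.28_linux.run
The MD5 noted for this version is 312aede1c3d1d3425c8aa67bbb7a55e (this string is one character short of a full 32-digit MD5, so compare against the checksum published on the official site).
Then install CUDA 7.0:
$ sudo sh cuda_7.0.28_linux.run --no-opengl-libs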
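During installation you are first shown a long license text (EULA). Keep pressing the space bar until it reaches 100%, then answer the prompts with accept, yes, yes, or Enter to proceed. Keep the default install directories where possible.
After installation finishes, reboot, then use ls to check whether about four device files whose names start with nvidia have been created:
$ ls /dev/nvidia*
If they exist, the installation succeeded. If not, it probably failed and you need to uninstall and reinstall. The uninstall commands are as follows (the paths below are for 7.5; substitute the version you actually installed):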
$ sudo /usr/local/cuda-7.5/bin/uninstall_cuda_7.5.pl
$ sudo /usr/bin/nvidia-uninstall
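Finally, purge the driver packages (the pattern is quoted so the shell does not expand it against local file names):
$ sudo apt-get --purge remove 'nvidia*'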
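At this point both installation methods are complete. The remaining steps are to set the environment variables, verify the installation, and compile and test the samples.
First, set the environment variables.
Open /etc/profile: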
$ sudo gedit /etc/profile
Add the following two lines at the end and save (adjust cuda-7.5 to match the version you installed):
export PATH=/usr/local/cuda-7.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH
Then make the changes take effect:
$ source /etc/profile
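Sourcing /etc/profile more than once prepends the same directories again each time, and when LD_LIBRARY_PATH starts out empty the export above leaves a trailing colon, which the dynamic loader treats as the current directory. A minimal sketch of a guard against both, assuming a POSIX shell; prepend_once is a hypothetical helper name, not something CUDA provides.

```shell
# prepend_once DIR LIST   (hypothetical helper)
# Prints LIST with DIR prepended, unless DIR is already one of its
# colon-separated entries. An empty LIST yields just DIR, with no
# trailing colon.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Example, using the CUDA 7.5 paths from this guide:
# export PATH=$(prepend_once /usr/local/cuda-7.5/bin "$PATH")
# export LD_LIBRARY_PATH=$(prepend_once /usr/local/cuda-7.5/lib64 "$LD_LIBRARY_PATH")
```

The two plain export lines above work fine for a one-time setup; this variant only matters if the profile is re-sourced often.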
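Next, check the driver version (the main point is to confirm that the driver is installed correctly) and the nvcc compiler version: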
$ cat /proc/driver/nvidia/version
$ nvcc -V
To compile the samples, go to /usr/local/cuda/samples and run the following commands:
$ cd /usr/local/cuda/samples
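$ sudo make all -j32
The 32 after -j is the number of parallel make jobs, usually set to the number of hardware threads; my machine has 32. Running jobs in parallel speeds up make.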
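Instead of hard-coding the job count, it can be taken from the machine. This sketch assumes GNU coreutils' nproc is installed (it is on a stock Ubuntu) and falls back to a single job otherwise; make_jobs is a hypothetical name.

```shell
# make_jobs   (hypothetical helper)
# Prints the number of processing units reported by nproc,
# or 1 if nproc is missing or fails.
make_jobs() {
    n=$(nproc 2>/dev/null) || n=1
    printf '%s\n' "${n:-1}"
}

# Example:
# cd /usr/local/cuda/samples
# sudo make all -j"$(make_jobs)"
```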
The whole process takes a few minutes. Once everything has compiled, cd into samples/bin/x86_64/linux/release and run deviceQuery:
$ ./deviceQuery
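If GPU information is printed, the driver and GPU are set up correctly. If you have multiple GPUs, all of them are listed here. If it fails, it is best to uninstall CUDA and reinstall. (The output below was copied from the web, so the GPU does not match.)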
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 670"
CUDA Driver Version / Runtime Version 6.5 / 6.5
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 4095 MBytes (4294246400 bytes)
( 7) Multiprocessors, (192) CUDA Cores/MP: 1344 CUDA Cores
GPU Clock rate: 1098 MHz (1.10 GHz)
Memory Clock rate: 3105 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GeForce GTX 670
Result = PASS
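The last line of the output is the one to look for. A minimal sketch of checking it from a script, assuming the output ends with a "Result = PASS" line as in the listing above; devicequery_passed is a hypothetical name.

```shell
# devicequery_passed   (hypothetical helper)
# Reads deviceQuery output on stdin and returns 0 if it contains
# a line beginning with "Result = PASS".
devicequery_passed() {
    grep -q '^Result = PASS'
}

# Example, run from samples/bin/x86_64/linux/release:
# ./deviceQuery | devicequery_passed \
#     && echo "CUDA device detected" \
#     || echo "deviceQuery did not pass; consider reinstalling"
```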