Setup Linux Env for Deep Learning

Install Ubuntu 18.04 LTS

The old server had been dead for a while. I finally bought a new PSU (I am not sure the PSU was actually the problem), reinstalled everything, and organized the cables, so now I have a new desktop. Later I reinstalled the OS too; it previously ran Windows Server 2012, because I could not get any Linux installed at the time due to the NVIDIA driver problem.

Surprisingly, installing Ubuntu 18.04 LTS was a pretty smooth process this time (with an AMD GPU at first), and so was switching to an NVIDIA GPU afterwards (an old 4GB GTX 970).

If you have trouble installing Ubuntu with a GTX GPU, try the ‘blacklist nouveau driver’ solution described here.
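For reference, that fix usually comes down to a small modprobe config file. A minimal sketch, assuming the standard /etc/modprobe.d directory and the commonly used file name blacklist-nouveau.conf (both are assumptions, not taken from the linked post). The helper write_nouveau_blacklist is hypothetical and only stages the file locally so it can be reviewed first:

```shell
# Hypothetical helper: write the two blacklist lines to the path given as $1.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Stage the file in the current directory and show it.
write_nouveau_blacklist ./blacklist-nouveau.conf
cat ./blacklist-nouveau.conf

# Once reviewed, install it and rebuild the initramfs (needs root and a reboot):
#   sudo cp ./blacklist-nouveau.conf /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u
#   sudo reboot
```

The root-only steps are left as comments on purpose, so nothing system-wide changes until you run them yourself.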

Install NVIDIA Drivers

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440
sudo reboot
# alternatively, install the driver from Software & Updates -> Additional Drivers
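To confirm the driver actually loaded after the reboot, a quick look at nvidia-smi can help. A small sketch; driver_status is a hypothetical wrapper name, and it only reports what nvidia-smi itself says:

```shell
# Hypothetical wrapper: print GPU name and driver version, or a hint if that fails.
driver_status() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU, e.g. name and driver version separated by a comma.
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            || echo "nvidia-smi is installed but could not talk to the driver"
    else
        echo "nvidia-smi not found; the driver is probably not installed"
    fi
}

driver_status
```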

Install CUDA

If you want to remove a previously installed CUDA, try sudo apt-get --purge remove cuda.

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

Set CUDA ENV (not sure whether needed)

export PATH=$PATH:/usr/local/cuda-10.2/bin
export CUDADIR=/usr/local/cuda-10.2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64
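If you do set these variables in a shell startup file, re-sourcing the same lines keeps appending duplicates. A minimal sketch of an idempotent variant, assuming the default /usr/local/cuda-10.2 install prefix; append_once is a hypothetical helper, not something CUDA provides:

```shell
# Hypothetical helper: append $2 to the colon-separated list $1 unless already present.
append_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;           # already there, return unchanged
        *)        printf '%s\n' "${1:+$1:}$2" ;;  # add, with a colon only if $1 is non-empty
    esac
}

CUDA_PREFIX=/usr/local/cuda-10.2   # assumed default install location
PATH=$(append_once "$PATH" "$CUDA_PREFIX/bin")
LD_LIBRARY_PATH=$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_PREFIX/lib64")
export PATH LD_LIBRARY_PATH
```

Running the block twice leaves PATH and LD_LIBRARY_PATH the same as running it once.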

Test Cuda

mkdir cuda-testing
cd cuda-testing/
cp -a /usr/local/cuda-10.2/samples samples-10.2
cd samples-10.2
make -j 4 # (add -k to skip errors)
cd ~/cuda-testing/samples-10.2/bin/x86_64/linux/release
./nbody
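Before building the samples it can be worth checking which toolkit release nvcc reports. A rough sketch; cuda_release is a hypothetical helper that pulls the "release X.Y" part out of the nvcc --version text:

```shell
# Hypothetical helper: read `nvcc --version` text on stdin, print the release number.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | cuda_release
else
    echo "nvcc is not on PATH; see the CUDA ENV section"
fi
```

With the install above this should report 10.2; anything else suggests an older toolkit is earlier on PATH.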

Error

CUDA 10.2 on Ubuntu 18.04 might give you the following error message:

'make: Target 'all' not remade because of errors.'
cudaNvSci.h:14:10: fatal error: nvscibuf.h: No such file or directory
#include <nvscibuf.h>
^~~~
compilation terminated.
Makefile:394: recipe for target 'cudaNvSci.o' failed

According to https://github.com/NVIDIA/cuda-samples/issues/22#issuecomment-562105202, this is a temporary issue with a newly added feature. Running make with -k lets the rest of the samples build anyway.

Install CuDNN

Download the CuDNN library for the corresponding CUDA version from here

sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.2_amd64.deb

Test CuDNN Installation

cp -r /usr/src/cudnn_samples_v7/ ~/cuda-testing/cudnn_samples_v7/
cd ~/cuda-testing/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN
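To double check which CuDNN version ended up installed, the version macros in the header can be read directly. A minimal sketch, assuming the .deb packages put the header at /usr/include/cudnn.h (the path is an assumption and may differ on your system); cudnn_version is a hypothetical helper:

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from the cuDNN header given as $1.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

header=/usr/include/cudnn.h   # assumed location for the .deb install
if [ -f "$header" ]; then
    cudnn_version "$header"
else
    echo "no cudnn.h found at $header"
fi
```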

Install Pytorch through Anaconda

sh ./Anaconda3-2020.02-Linux-x86_64.sh
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

#test
python -c 'import torch;print(torch.cuda.is_available())'
True
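A slightly more talkative check can help when the one-liner prints False. This is only a sketch: torch_gpu_report is a hypothetical wrapper, and the PYTHON variable (defaulting to python) is an assumption about how the conda interpreter is called on your machine:

```shell
PYTHON=${PYTHON:-python}   # assumed interpreter name; override with PYTHON=python3 if needed

# Hypothetical wrapper: report the torch version, its CUDA build, and the first GPU.
torch_gpu_report() {
    if ! command -v "$PYTHON" >/dev/null 2>&1; then
        echo "$PYTHON not found on PATH"
        return 0
    fi
    "$PYTHON" - <<'EOF'
try:
    import torch
except Exception as exc:
    print("torch could not be imported:", exc)
else:
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    if torch.cuda.is_available():
        print("GPU 0:", torch.cuda.get_device_name(0))
    else:
        print("CUDA is not available to torch")
EOF
}

torch_gpu_report
```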

Install Tensorflow and Keras

Following the instructions from here:

pip install tensorflow-gpu
pip install keras

# or, with conda instead of pip:
conda install tensorflow-gpu keras

# test
python -c 'from keras import backend as K; print(K.tensorflow_backend._get_available_gpus())'

# or, from inside a Python session:
from keras import backend as K
K.tensorflow_backend._get_available_gpus()

Install MXNet

pip install mxnet-cu102 d2lzh
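As a last sanity check, it is handy to see whether every framework installed above at least imports. A small sketch; check_imports is a hypothetical helper and PYTHON (defaulting to python) is an assumption about the interpreter name. It only tests the import, not GPU support:

```shell
PYTHON=${PYTHON:-python}   # assumed interpreter name

# Hypothetical helper: try to import each module named in the arguments.
check_imports() {
    for module in "$@"; do
        if "$PYTHON" -c "import $module" >/dev/null 2>&1; then
            echo "$module: ok"
        else
            echo "$module: import failed"
        fi
    done
}

check_imports torch tensorflow keras mxnet
```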

Done
