Install Ubuntu 18.04 LTS
The old server has been dead for a while. I finally bought a new PSU (I am not sure the PSU was actually the problem), reinstalled everything, and organized the cables, so now I have a new desktop. I also reinstalled the OS. The machine previously ran Windows Server 2012, because I could not get any Linux installed at the time due to an nVidia driver problem.
Surprisingly, installing Ubuntu 18.04 LTS went pretty smoothly this time (with an AMD GPU at first), and so did switching to an nVidia GPU afterwards (an old 4GB GTX 970).
If you have problems installing Ubuntu with a GTX GPU, try the ‘blacklist nouveau driver’ solution described here
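The linked fix is not reproduced here. As a rough sketch, the usual way to blacklist nouveau is a small modprobe config file followed by an initramfs rebuild. The helper name `write_nouveau_blacklist` and the scratch file are made up for illustration, and the commented commands assume the stock Ubuntu layout under /etc/modprobe.d/.

```shell
# Hypothetical helper (not from the original post): write the two lines
# that disable the open-source nouveau driver into the given file.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Dry run into a scratch file so the content can be inspected first.
scratch="$(mktemp)"
write_nouveau_blacklist "$scratch"
cat "$scratch"

# To apply it for real (assumed standard Ubuntu paths, needs root):
#   sudo cp "$scratch" /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u
#   sudo reboot
```

Writing to a scratch file first keeps the sketch harmless; nothing under /etc changes until the commented commands are run by hand.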
Install nVidia Drivers
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440
sudo reboot
# alternatively, install the driver from Software & Updates -> Additional Drivers
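After the reboot it is worth confirming that the driver actually loaded. A minimal sketch, assuming `nvidia-smi` was installed together with the driver package; the function name `check_nvidia_driver` is hypothetical.

```shell
# Hypothetical helper: report the GPU name and driver version, or say why it cannot.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 0
    fi
    # --query-gpu with --format=csv,noheader prints one CSV line per GPU.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader 2>/dev/null \
        || echo "nvidia-smi is installed but could not talk to the driver"
}

check_nvidia_driver
```

If the second message shows up, the kernel module is likely not loaded, which is the situation the nouveau blacklist is meant to address.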

Install CUDA
To remove a previously installed CUDA first, try sudo apt-get --purge remove cuda.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
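To check which toolkit ended up installed, `nvcc --version` can be queried by its full path, since it is not on PATH yet at this point. A small sketch; the helper `cuda_release` is hypothetical and /usr/local/cuda-10.2 is assumed to be the install prefix.

```shell
# Hypothetical helper: pull the "release X.Y" number out of `nvcc --version` output.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Assumed default install prefix for the packages installed above.
nvcc_bin=/usr/local/cuda-10.2/bin/nvcc
if [ -x "$nvcc_bin" ]; then
    "$nvcc_bin" --version | cuda_release
else
    echo "nvcc not found at $nvcc_bin"
fi
```

With the packages above, this should report 10.2; any other number suggests an older toolkit is still being picked up.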
Set CUDA ENV (not sure whether needed)
export PATH=$PATH:/usr/local/cuda-10.2/bin
export CUDADIR=/usr/local/cuda-10.2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64
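These exports only last for the current shell session. One way to keep them is to append them to a shell startup file, taking care not to add the same line twice. A sketch under that assumption; `append_once` is a hypothetical helper, and the demo writes to a scratch file rather than to ~/.bashrc.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not there yet.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Scratch file for the demo; point this at ~/.bashrc to apply it for real.
rc="$(mktemp)"
append_once 'export PATH=$PATH:/usr/local/cuda-10.2/bin' "$rc"
append_once 'export CUDADIR=/usr/local/cuda-10.2' "$rc"
append_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64' "$rc"
# Running a line a second time adds nothing.
append_once 'export CUDADIR=/usr/local/cuda-10.2' "$rc"
cat "$rc"
```

The single quotes matter: they keep `$PATH` and `$LD_LIBRARY_PATH` unexpanded in the file, so they are evaluated each time a new shell starts.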
Test Cuda
mkdir cuda-testing
cd cuda-testing/
cp -a /usr/local/cuda-10.2/samples samples-10.2
cd samples-10.2
make -j 4 # (add -k to skip errors)
cd ~/cuda-testing/samples-10.2/bin/x86_64/linux/release
./nbody

Error
Building the samples with CUDA 10.2 on Ubuntu 18.04 might fail with this error message:
'make: Target 'all' not remade because of errors.'
cudaNvSci.h:14:10: fatal error: nvscibuf.h: No such file or directory
#include <nvscibuf.h>
^~~~
compilation terminated.
Makefile:394: recipe for target 'cudaNvSci.o' failed
According to https://github.com/NVIDIA/cuda-samples/issues/22#issuecomment-562105202 this is a temporary issue with a newly added feature. Running make with -k skips the failing sample and builds the rest.
Install CuDNN
Download the CuDNN packages (runtime, dev, and doc .deb files) matching your CUDA version from here
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.2_amd64.deb
Test CuDNN Installation
cp -r /usr/src/cudnn_samples_v7/ ~/cuda-testing/cudnn_samples_v7/
cd ~/cuda-testing/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN
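Besides running the sample, the installed cuDNN version can be read from its header. A sketch, assuming the .deb packages place the header at /usr/include/cudnn.h (that path and the helper name `cudnn_version` are assumptions, not taken from the original post).

```shell
# Hypothetical helper: read MAJOR.MINOR.PATCHLEVEL from a cuDNN header file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

header=/usr/include/cudnn.h   # assumed location for the .deb install
if [ -r "$header" ]; then
    cudnn_version "$header"
else
    echo "no cuDNN header at $header"
fi
```

The printed version should line up with the package file names used above; a mismatch usually means an older header is still lying around.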
Install Pytorch through Anaconda
sh ./Anaconda3-2020.02-Linux-x86_64.sh
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
# test, should print True
python -c 'import torch;print(torch.cuda.is_available())'
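The one-line test can be stretched into a slightly more informative check that also reports the PyTorch version and the GPU name. This is only a sketch: the wrapper `check_torch` is hypothetical, and it assumes `python` resolves to the Anaconda interpreter that PyTorch was installed into.

```shell
# Hypothetical helper: print the PyTorch version and whether it can see a GPU.
check_torch() {
    py="${1:-python}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "$py not found"
        return 0
    fi
    "$py" - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    print("torch " + torch.__version__)
    print("CUDA available: " + str(torch.cuda.is_available()))
    if torch.cuda.is_available():
        # Name of the first visible GPU
        print("device: " + torch.cuda.get_device_name(0))
EOF
}

check_torch
```

Pass another interpreter as the first argument, for example `check_torch python3`, to check a different environment.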
Install Tensorflow and Keras
from here
pip install tensorflow-gpu
pip install keras
# or, with conda
conda install tensorflow-gpu keras
# test
python -c 'from keras import backend as K; print(K.tensorflow_backend._get_available_gpus())'
Install MXNet
pip install mxnet-cu102 d2lzh
Done
Reference
- Cuda Downloads: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
- Install Multiple Versions of CUDA and Test the CUDA Installation: https://www.pugetsystems.com/labs/hpc/How-To-Install-CUDA-10-together-with-9-2-on-Ubuntu-18-04-with-support-for-NVIDIA-20XX-Turing-GPUs-1236/
- CuDNN Installation Guide: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
- Anaconda: https://gist.github.com/kylemcdonald/3ae0b88a1bf91afc00ba441fe6823a17