I had wanted an Ubuntu laptop with a GPU configured for deep learning for some time. My previous Windows laptop, with 32 GB RAM and an Nvidia GeForce GTX 860M GPU, recently breathed its last after being functional for close to 4 years. After that I was in search of a more affordable GPU-enabled laptop and got one in a recent Black Friday deal for less than one fourth of the price of the previous one. This one, though, has 16 GB RAM and a GeForce GTX 1050 GPU. Although I initially installed Ubuntu 16.04 LTS on it as dual boot alongside Windows 10, I eventually upgraded that to Ubuntu 18.04.1 LTS.
After that I decided to install the Nvidia driver on it alongside CUDA and cuDNN, following this article published on Medium. Below are the steps I took, along with the challenges I faced.
Step 1 : Install Miniconda
Download Miniconda from here, then run
bash Miniconda3-latest-Linux-x86_64.sh
However, tensorflow-gpu still doesn't work with Python 3.7, so the following fix is required after installation.
conda install python=3.6
Step 2 : Install Java and gcc
sudo apt update
sudo apt install openjdk-8-jdk
sudo apt-get install gcc-4.8 g++-4.8
Step 3 : Install drivers
This is the step where most mistakes happen, and they can jeopardize the whole installation as well as break your Ubuntu system. For example, the Medium blog asks you to run sudo ubuntu-drivers autoinstall, which in turn installs the latest Nvidia driver (nvidia-415 in this case). However, tensorflow-gpu can handle only CUDA 9.2, whose installation tries to install nvidia-396. A mixed installation of the nvidia-396 driver alongside nvidia-415 caused a broken pipe error on my Ubuntu installation. I could recover from that by running sudo dpkg -i --force-overwrite-all path-to-the-nvidia-deb-file. A better way to avoid such a version conflict is to check which CUDA and cuDNN versions the current tensorflow-gpu supports, and which Nvidia driver version that CUDA requires. At present the stable version, tensorflow-gpu 1.12, requires CUDA 9.2 and cuDNN 7.2.1, which in turn require nvidia-396. The rest of this step follows that pairing.
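One way to keep that pairing explicit is a small lookup that is consulted before installing anything. The sketch below is illustrative only: the table holds just the single combination quoted in this article, and the names REQUIREMENTS and check_stack are made up for this example rather than taken from any TensorFlow or Nvidia tool.

```python
# Version pairing quoted in this article (tensorflow-gpu 1.12).
# REQUIREMENTS and check_stack are hypothetical names for illustration.
REQUIREMENTS = {
    "1.12": {"cuda": "9.2", "cudnn": "7.2.1", "driver": "nvidia-396"},
}


def check_stack(tf_version, cuda, cudnn, driver):
    """Return a list of mismatches between what is installed and what is required."""
    required = REQUIREMENTS.get(tf_version)
    if required is None:
        return ["no entry for tensorflow-gpu " + tf_version]
    installed = {"cuda": cuda, "cudnn": cudnn, "driver": driver}
    return [
        "{}: have {}, need {}".format(name, installed[name], wanted)
        for name, wanted in required.items()
        if installed[name] != wanted
    ]


if __name__ == "__main__":
    # The mixed setup described above: driver 415 with a CUDA 9.2 stack.
    for problem in check_stack("1.12", "9.2", "7.2.1", "nvidia-415"):
        print(problem)
```

An empty list means the three components match the table; anything else names the component that would need to change.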
Add graphics drivers to your source list
sudo add-apt-repository ppa:graphics-drivers/ppa
Check available nvidia drivers
sudo apt update
sudo apt upgrade
ubuntu-drivers devices
Result :
vendor : NVIDIA Corporation
model : GP107M [GeForce GTX 1050 Mobile]
driver : nvidia-driver-390 - third-party free
driver : nvidia-driver-396 - third-party free
driver : nvidia-driver-415 - third-party free recommended
driver : nvidia-396 - third-party non-free
driver : nvidia-driver-410 - third-party free
driver : xserver-xorg-video-nouveau - distro free builtin
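The listing can also be checked from a script before choosing a package. This is a minimal sketch, assuming the output format shown above; parse_drivers is a hypothetical helper, and the sample text is copied from the result in this article rather than read from a live system.

```python
SAMPLE = """\
vendor : NVIDIA Corporation
model : GP107M [GeForce GTX 1050 Mobile]
driver : nvidia-driver-390 - third-party free
driver : nvidia-driver-396 - third-party free
driver : nvidia-driver-415 - third-party free recommended
driver : nvidia-396 - third-party non-free
driver : nvidia-driver-410 - third-party free
driver : xserver-xorg-video-nouveau - distro free builtin
"""


def parse_drivers(text):
    """Return (all driver package names, the one flagged as recommended or None)."""
    names, recommended = [], None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() != "driver":
            continue
        # value looks like " nvidia-driver-415 - third-party free recommended"
        name, _, flags = value.strip().partition(" - ")
        names.append(name)
        if "recommended" in flags.split():
            recommended = name
    return names, recommended


if __name__ == "__main__":
    names, recommended = parse_drivers(SAMPLE)
    print("flagged as recommended:", recommended)
    print("nvidia-396 offered:", "nvidia-396" in names)
```

On the sample above, the recommended entry is the 415 driver, which is exactly the one to avoid for this setup, while nvidia-396 is still offered.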
Install the compatible driver (avoid running sudo ubuntu-drivers autoinstall) using the following command for CUDA 9.2 and tensorflow-gpu 1.12.
sudo apt install nvidia-396
Now you need to reboot your system, and after the reboot run
lsmod | grep nvidia
Step 4 : Install CUDA
Install CUDA®, a parallel computing platform and programming model developed by NVIDIA. CUDA is needed to run TensorFlow with GPU support. Download CUDA Toolkit 9.2 from here. Choose the settings that match your system.
After installation, verify it with the following command
ls /usr/local/cuda-9.2/
The listing should include directories such as bin, include and lib64.
After that, add CUDA to your path using the following code
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
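The ${PATH:+:${PATH}} part of those lines is a shell parameter expansion: it adds a colon and the old value only when the variable is already set and non-empty, so no trailing colon is left behind on a fresh variable. As a rough illustration of that rule, here is the same logic in Python; prepend_dir is a hypothetical name used only for this sketch.

```python
def prepend_dir(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: join with ':' only if current is non-empty."""
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    # Existing value: the new directory goes in front, separated by a colon.
    print(prepend_dir("/usr/local/cuda-9.2/bin", "/usr/bin:/bin"))
    # Empty or unset value: the new directory stands alone.
    print(prepend_dir("/usr/local/cuda-9.2/lib64", ""))
```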
Step 5 : Install cuDNN
cuDNN is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK. cuDNN 7.2.1 can be downloaded here. To download it you need to sign up for or log in to your Nvidia account. After downloading, use the following commands to install it.
Unpack the archive
tar -xzvf cudnn-9.2-linux-x64-v7.2.1.38.tgz
Move the unpacked contents to your CUDA directory
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
Give read access to all users
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
Now you need to add the following block of code to the end of ~/.bashrc by doing gedit ~/.bashrc
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Now you need to run the following lines of code to make these effective
source ~/.bashrc
sudo ldconfig
Check that the paths are correctly set:
echo $CUDA_HOME
The output should be like this
/usr/local/cuda
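A slightly broader check than echoing one variable is to look at all three settings at once. The following is a sketch under the assumption that the paths are exactly the ones used in this guide; missing_cuda_settings is a hypothetical helper, not part of any CUDA tooling.

```python
import os


def missing_cuda_settings(env):
    """Return descriptions of the settings from this guide that env lacks."""
    problems = []
    if env.get("CUDA_HOME") != "/usr/local/cuda":
        problems.append("CUDA_HOME is not /usr/local/cuda")
    if "/usr/local/cuda-9.2/bin" not in env.get("PATH", "").split(":"):
        problems.append("PATH lacks /usr/local/cuda-9.2/bin")
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    for wanted in ("/usr/local/cuda-9.2/lib64", "/usr/local/cuda/extras/CUPTI/lib64"):
        if wanted not in lib_dirs:
            problems.append("LD_LIBRARY_PATH lacks " + wanted)
    return problems


if __name__ == "__main__":
    # os.environ reflects the shell that launched Python, so run this
    # from a new terminal or after source ~/.bashrc.
    for problem in missing_cuda_settings(os.environ):
        print(problem)
```

No output means all of the expected entries were found in the current environment.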
Step 6 : Install tensorflow-gpu and Keras
Create a virtual environment. I named it tf36, for TensorFlow and Python 3.6.
conda create --name tf36 python=3.6
Although there is a lot of discussion on whether to use conda or pip for tensorflow-gpu, I did it the conda way.
source activate tf36
conda install -c anaconda tensorflow-gpu
Check TensorFlow installation with:
python
>>> import tensorflow as tf
>>> tf.Session(config=tf.ConfigProto(log_device_placement=True))
Install Keras with:
conda install -c conda-forge keras
Last but not least, as shown in the Medium blog, we can install further libraries such as matplotlib using conda. However, that may cause dependent libraries to be downgraded, and TensorFlow may in turn stop using the GPU. So every time you add a new library to the virtual environment, if you see that TensorFlow isn't using the GPU anymore, feel free to run the installation again using
conda install -c anaconda tensorflow-gpu
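To see whether TensorFlow still picks up the GPU after adding a library, the devices it reports can be listed from a short script. This is a sketch assuming TensorFlow 1.x as installed above; gpu_names is a hypothetical helper, and the import is guarded so the script still runs where TensorFlow is absent.

```python
def gpu_names(devices):
    """Keep the names from (name, device_type) pairs whose type is 'GPU'."""
    return [name for name, device_type in devices if device_type == "GPU"]


if __name__ == "__main__":
    try:
        # list_local_devices() returns objects with .name and .device_type
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        pairs = [(d.name, d.device_type) for d in device_lib.list_local_devices()]
        print("GPUs seen by TensorFlow:", gpu_names(pairs) or "none")
```

If the script reports none inside the tf36 environment, that is the cue to rerun the tensorflow-gpu installation command.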
Hope it goes well for you and feel free to leave feedback about your experience.