Note
This article is contributed by GPU cloud server users as part of a best-practice collection. It is provided for learning and reference only.
Scenarios
This article describes how to set up a deep learning environment on a Windows GPU instance created through the CVM console.
Instance Environment
Operating System: Windows Server 2019 Datacenter Edition 64-bit
CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz * 6 vCPUs
RAM: 56GB
GPU: Tesla P40 * 1
Drivers, Related Libraries, and Software Versions: CUDA 10.2, Python 3.7, PyTorch 1.8.1, TensorFlow GPU 2.2.0
Select the Driver, Related Libraries, and Software Versions
Before you install the drivers, you should have a general understanding of the version compatibility between CUDA, cuDNN, PyTorch, TensorFlow, and Python, so that you can select compatible versions based on your actual configuration and avoid version-mismatch issues later on.
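Once the environment is set up, a quick way to review what is actually installed is a small helper like the one below (the helper name and package list are illustrative, not part of any official tooling; it reports a version only for packages that are present):

```python
import importlib.util
import importlib

def installed_versions(packages=("torch", "tensorflow", "numpy")):
    """Return {package: version string, or None if the package is absent}."""
    report = {}
    for name in packages:
        if importlib.util.find_spec(name) is None:
            report[name] = None  # not installed in the current environment
        else:
            mod = importlib.import_module(name)
            report[name] = getattr(mod, "__version__", "unknown")
    return report

print(installed_versions())
```

Comparing this output against the compatibility tables referenced below is an easy first step when diagnosing a version mismatch.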
Select CUDA Driver Version
CUDA (Compute Unified Device Architecture) is a general-purpose parallel computing platform introduced by graphics card manufacturer NVIDIA, enabling GPUs to solve complex computational problems. It encompasses the CUDA instruction set architecture (ISA) and the parallel computing engines within the GPU.
1. Check the GPU Compute Capability
Before selecting the CUDA driver version, first confirm the compute capability of the GPU used in this article (Tesla P40). According to NVIDIA’s official website, the Tesla P40 has a compute capability of 6.1, as shown in the figure below:
2. Select the CUDA Version
As shown in the figure below, each CUDA version has specific requirements on GPU compute capability. For the Tesla P40, choose CUDA 8.0 or later. For more information about compute capability and CUDA version compatibility, see Application Compatibility on the NVIDIA Ampere GPU Architecture.
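The relationship can be sketched as a small lookup table (a simplified, non-exhaustive subset transcribed from NVIDIA's public documentation; verify against the official compatibility tables before relying on it):

```python
# Minimum CUDA toolkit release per GPU compute capability (illustrative subset).
MIN_CUDA_FOR_CAPABILITY = {
    (6, 1): "8.0",   # Pascal, e.g. Tesla P40
    (7, 0): "9.0",   # Volta, e.g. Tesla V100
    (7, 5): "10.0",  # Turing, e.g. Tesla T4
    (8, 0): "11.0",  # Ampere, e.g. A100
}

def min_cuda(capability):
    """Return the minimum CUDA version for a (major, minor) compute capability."""
    return MIN_CUDA_FOR_CAPABILITY.get(capability, "unknown")

print(min_cuda((6, 1)))  # Tesla P40 -> "8.0"
```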
Select GPU Driver Version
After determining the CUDA version, select the GPU driver version using the mapping between CUDA versions and driver versions shown in the figure below. For more information, see cuda-toolkit-driver-versions.
Select cuDNN Version
NVIDIA cuDNN is a GPU-accelerated library for deep neural networks. It is designed for high performance, ease of use, and low memory overhead. cuDNN can be integrated into higher-level machine learning frameworks such as Google’s TensorFlow and the widely used Caffe framework developed at UC Berkeley. With its plug-in design, developers can focus on designing and implementing neural network models rather than tuning performance, while still achieving high-performance parallel computing on GPUs.
cuDNN is a CUDA-based GPU acceleration library for deep learning. To run deep neural networks with CUDA, you must install cuDNN so that the GPU can accelerate neural network computation, which is significantly faster than running on the CPU. For the cuDNN–CUDA version mapping, see cuDNN Archive.
Select the PyTorch Version
Select the corresponding PyTorch version based on your CUDA version. For version compatibility details, see previous-versions.
Note
The latest versions of CUDA and PyTorch may not be the optimal choices and can lead to compatibility issues. It is recommended to consult the version compatibility information, select appropriate versions, and then install the corresponding drivers.
Select the TensorFlow Version
TensorFlow is slightly more complex than PyTorch, as it also depends on compatible versions of Python and the compiler. The version mapping among TensorFlow (CPU/GPU), Python, CUDA, and cuDNN is shown below:
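Since the mapping is presented as a figure, here is a small excerpt for quick reference (values transcribed from TensorFlow's published tested build configurations; treat them as a starting point and verify against the official table):

```python
# Excerpt of TensorFlow GPU tested build configurations (illustrative subset).
TF_GPU_TESTED_CONFIGS = {
    "2.2.0": {"python": "3.5-3.8", "cuda": "10.1", "cudnn": "7.6"},
    "2.1.0": {"python": "3.5-3.7", "cuda": "10.1", "cudnn": "7.6"},
}

for tf_version, reqs in TF_GPU_TESTED_CONFIGS.items():
    print(tf_version, reqs)
```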
Procedure
Instance Creation
Install the Driver, CUDA, and cuDNN
Install the GPU Driver
2. Visit the NVIDIA website in a browser and select the GPU driver version. The configuration selected in this article is shown in the figure below:
3. Click SEARCH to enter the download page, then click Download.
If you prefer downloading the installer locally and uploading it to the Cloud GPU Service instance via FTP, see How to Upload Local Files to CVM.
4. After the download is complete, double-click the installation package and follow the prompts on the page to complete the installation.
Install CUDA
1. Go to CUDA Toolkit Archive and select the required version. This article uses CUDA 10.2 as an example:
2. Go to the CUDA Toolkit 10.2 Download page and select the corresponding system configuration. The configuration selected in this article is shown in the figure below:
3. Click Download to start the download.
4. After the download is complete, double-click the installation package and follow the prompts on the page to proceed with the installation. Note the following steps:
In the pop-up CUDA Setup Package window, the Extraction path is a temporary storage location. No modification is required; leave it as default and click OK, as shown below:
In the License Agreement step, select Custom and click Next.
Select components based on your needs and click Next, as shown below:
Follow the remaining prompts and options as needed until the installation completes.
Configure Environment Variables
1. On the OS desktop, right-click the icon in the lower-left corner, and select Run from the pop-up menu.
2. In the Run window, enter sysdm.cpl and click OK.
3. In the opened System Properties window, select the Advanced tab and click Environment Variables.
4. Under System variables, select Path, and click Edit.
5. In the Edit environment variable window, click New and add the following entries:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64
C:\Program Files\NVIDIA Corporation\NVSMI
After editing, the result is shown below:
6. Click OK three times to save the settings.
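As a quick sanity check, the PATH edits from step 5 can be verified from Python (the helper below is illustrative, not part of any official tooling; it simply looks for the CUDA directory fragment in the PATH variable):

```python
import os

# Illustrative helper: check whether the CUDA v10.2 directories appear on PATH.
def cuda_dirs_on_path(path=None, version="v10.2"):
    path = os.environ.get("PATH", "") if path is None else path
    needle = os.path.join("NVIDIA GPU Computing Toolkit", "CUDA", version)
    return any(needle in entry for entry in path.split(os.pathsep))

print(cuda_dirs_on_path())
```

Open a new terminal before running this, since PATH changes only apply to newly started processes.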
Verify the GPU Driver and CUDA
1. On the OS desktop, right-click the icon in the lower-left corner, and select Run from the pop-up menu.
2. In the Run window, enter cmd and click OK.
3. In the cmd window:
Run the following command to check whether the GPU driver is installed:
nvidia-smi
If the output is similar to the screenshot below, the GPU driver has been installed successfully. When the GPU is in use, this command also shows GPU utilization.
Run the following command to verify whether CUDA is installed successfully:
nvcc -V
If the output is similar to the screenshot below, CUDA has been installed successfully.
Install cuDNN
1. Go to the cuDNN Download page and click Archived cuDNN Releases to view more versions.
2. Find and download the required cuDNN version.
3. Extract the cuDNN archive, and copy the bin, include, and lib folders into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2.
4. cuDNN installation is now complete.
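Step 3 above can be sketched in Python (the helper name and paths are illustrative; point the first argument at wherever you extracted the cuDNN archive):

```python
import shutil
from pathlib import Path

def install_cudnn(cudnn_root, cuda_root):
    """Merge the extracted cuDNN bin/include/lib folders into the CUDA install."""
    cudnn_root, cuda_root = Path(cudnn_root), Path(cuda_root)
    copied = []
    for sub in ("bin", "include", "lib"):
        src = cudnn_root / sub
        if src.is_dir():
            # dirs_exist_ok merges into the existing CUDA folders (Python 3.8+).
            shutil.copytree(src, cuda_root / sub, dirs_exist_ok=True)
            copied.append(sub)
    return copied
```

For example: `install_cudnn(r"C:\cudnn-extracted\cuda", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2")`. Run it from an elevated prompt, since writing under Program Files requires administrator rights.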
Install Deep Learning Libraries
Install Anaconda
It is recommended to install PyTorch and TensorFlow in a virtual environment created with Anaconda. Anaconda makes it easy to obtain and manage packages and to manage environments in a unified way; it includes more than 180 scientific packages and their dependencies (including conda and Python), is easy to install, supports both Python and R, and provides free community support.
1. Go to the Anaconda download page in a browser.
2. Download the required version on the page. This document uses Anaconda3-2019.03-Windows-x86_64 as an example, as shown below:
3. Double-click the installer and follow the on-screen instructions to proceed with the installation. Note the following steps:
In the Choose Install Location step, change the default installation path. The default location is the hidden ProgramData folder on drive C; for easier management, it is recommended to install Anaconda in another folder. The default path is shown below:
In the Advanced Installation Options, select all options to add Anaconda to environment variables and set Python 3.7 as the interpreter:
4. Click Install and wait for the installation to complete.
Configure Anaconda
1. On the OS desktop, click the icon in the lower-left corner, and select Anaconda Prompt.
2. In the opened Anaconda Prompt command-line window, run the following command to create a virtual environment.
conda create -n xxx_env python=3.7
Note
xxx_env is the environment name, and python=3.7 specifies the Python version. You can modify them based on your needs.
After successful creation, it appears as shown below:
Use the following commands to enter or exit the virtual environment. Once activated, you can install packages as required.
conda activate xxx_env
conda deactivate
Install PyTorch
Go to the PyTorch website and use the officially recommended installation command.
This article uses CUDA 10.2 and installs via pip. Run the following command inside the created xxx_env environment:
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
To speed up the installation, you can use the Tsinghua PyPI mirror and run the following command instead:
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.tuna.tsinghua.edu.cn/simple
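After installation, a quick check confirms whether PyTorch can see the GPU (run this inside the xxx_env environment; the helper below is an illustrative sketch that reports the state instead of raising when torch is missing):

```python
def check_torch_gpu():
    """Report whether PyTorch is installed and whether it can use CUDA."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        return f"CUDA OK: {torch.version.cuda}, device: {torch.cuda.get_device_name(0)}"
    return "torch installed, but CUDA is not available"

print(check_torch_gpu())
```

On the instance described in this article, a successful install should report CUDA 10.2 and the Tesla P40 device name.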
Install TensorFlow
Run the following command to install tensorflow-gpu 2.2.0.
pip install tensorflow-gpu==2.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
Run the following command to install Keras.
pip install keras -i https://pypi.tuna.tsinghua.edu.cn/simple
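Similarly, a short hedged check (an illustrative helper, run inside the xxx_env environment) confirms whether TensorFlow was installed and how many GPUs it can see:

```python
def check_tf_gpu():
    """Report the TensorFlow version and the number of visible GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow is not installed"
    gpus = tf.config.list_physical_devices("GPU")  # TF 2.1+ API
    return f"tensorflow {tf.__version__}, GPUs visible: {len(gpus)}"

print(check_tf_gpu())
```

On the instance described in this article, a successful install should report one visible GPU.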
At this point, the basic deep learning libraries have been installed. You can install additional packages as needed, and start learning by using Anaconda’s built-in tools such as Jupyter Notebook and Spyder, or by installing tools like PyCharm.