Note: This document is written by a Cloud GPU Service user and is for study and reference only.
You can use Docker to quickly run TensorFlow in a GPU instance. This way, you only need to install the NVIDIA® driver in the instance; you don't need to install the NVIDIA® CUDA® Toolkit.
This document describes how to use Docker to install TensorFlow and configure GPU/CPU support in a GPU instance.
Note: We recommend that you use a public image to create the GPU instance. If you select a public image, select Automatically install GPU driver on the backend to preinstall a driver of the corresponding version. This method supports only certain Linux public images.
Run the following commands to update the apt package index and install the packages required to use a repository over HTTPS:
sudo apt-get update
sudo apt-get install \
ca-certificates \
curl \
gnupg \
lsb-release
Run the following commands to add Docker's official GPG key and write the software source information:
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
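If the commands above succeed, /etc/apt/sources.list.d/docker.list should contain a single line similar to the following. The architecture and release codename below are examples only, assuming an amd64 instance running Ubuntu 20.04 (focal); the actual values come from `dpkg --print-architecture` and `lsb_release -cs`:

```
deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu focal stable
```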
Run the following commands to update and install Docker-CE:
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
Run the following command to set the package repository and GPG key as instructed in Setting up NVIDIA Container Toolkit:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
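The `distribution` variable in the command above is derived from /etc/os-release. The snippet below shows just that derivation on its own, assuming a standard Linux instance where the file exists:

```shell
# How $distribution is built: concatenate the ID and VERSION_ID
# fields from /etc/os-release (for example, "ubuntu20.04").
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
echo "$distribution"
```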
Run the following command to install the nvidia-docker2 package and its dependencies:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
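Installing nvidia-docker2 also registers the NVIDIA runtime with Docker through /etc/docker/daemon.json. On a typical installation the file looks roughly like this (exact contents may vary by package version):

```
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```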
Run the following command to restart the Docker daemon and complete the installation (the nvidia-docker2 package configures the NVIDIA container runtime during installation):
sudo systemctl restart docker
Then, you can run the following command to start a base CUDA container and test the setup:
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Output similar to the following will appear (the driver and CUDA versions depend on your instance):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06 Driver Version: 450.51.06 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 34C P8 9W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
The official TensorFlow Docker images are in the tensorflow/tensorflow repository on Docker Hub. Image tags follow the format below, as listed in Tags:
Tag | Description
---|---
`latest` | Latest (default) tag of the TensorFlow CPU binary image.
`nightly` | Nightly build of the TensorFlow image, which is unstable.
version | A specific version of the TensorFlow binary image, for example `2.1.0`.
`devel` | Nightly build of the TensorFlow master development environment, which contains the TensorFlow source code.
`custom-op` | Special experimental image for developing custom TensorFlow operations. For more information, see tensorflow/custom-op.
Tag Variant | Description
---|---
tag`-gpu` | The specified tag with GPU support.
tag`-jupyter` | The specified tag with Jupyter, which contains the TensorFlow tutorial notebooks.
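The variants are plain string suffixes, so combining them is simple concatenation. A quick sketch (the version number below is only an example):

```shell
# Build a combined tag: <base tag> + "-gpu" + "-jupyter".
base="2.1.0"                      # example version tag
tag="${base}-gpu-jupyter"
echo "tensorflow/tensorflow:${tag}"
```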
You can combine multiple variants. For example, the following commands download TensorFlow images to your machine:
docker pull tensorflow/tensorflow # latest stable release
docker pull tensorflow/tensorflow:devel-gpu # nightly dev release w/ GPU support
docker pull tensorflow/tensorflow:latest-gpu-jupyter # latest release w/ GPU support and Jupyter
Run the following command to start and configure the TensorFlow container. For more information, see Docker run reference.
docker run [-it] [--rm] [-p hostPort:containerPort] tensorflow/tensorflow[:tag] [command]
Use an image with the `latest` tag to verify the TensorFlow installation. Docker downloads the latest TensorFlow image the first time you run it.
docker run -it --rm tensorflow/tensorflow \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Below are examples of other TensorFlow Docker workflows:
Start a bash shell session in a container with TensorFlow configured:
docker run -it tensorflow/tensorflow bash
To run a TensorFlow program developed on the host inside the container, use the -v hostDir:containerDir -w workDir parameters to mount the host directory and change the container's working directory, as follows:
docker run -it --rm -v $PWD:/tmp -w /tmp tensorflow/tensorflow python ./script.py
Use a TensorFlow image with the `nightly` tag to start a Jupyter notebook server:
Note: Permission problems may occur when the host accesses files created in the container. Generally, we recommend that you modify files on the host system.
docker run -it -p 8888:8888 tensorflow/tensorflow:nightly-jupyter
Use a browser to visit http://127.0.0.1:8888/?token=... as instructed on the Jupyter website.
Run the following command to download and run the TensorFlow image supporting GPU:
docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Setting up the GPU-enabled image may take a while. To run GPU-based scripts repeatedly, you can use docker exec to reuse the container.
Run the following command to start a bash shell session in a container using the latest TensorFlow GPU image:
docker run --gpus all -it tensorflow/tensorflow:latest-gpu bash