May 24, 2024



Deploy your containerized AI applications with nvidia-docker


More and more products and services are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence) software bricks into a microservice architecture. The main advantage explored here is the use of the host system's GPU (Graphics Processing Unit) resources to accelerate multiple containerized AI applications.

To understand the usefulness of nvidia-docker, we will start by describing what kind of AI can benefit from GPU acceleration. Then we will present how to set up the nvidia-docker tool. Finally, we will describe what tools are available to use GPU acceleration in your applications and how to use them.

Why use GPUs in AI applications?

In the field of artificial intelligence, two main subfields are used: machine learning and deep learning. The latter is part of a broader family of machine learning methods based on artificial neural networks.

In the context of deep learning, where operations are essentially matrix multiplications, GPUs are more efficient than CPUs (Central Processing Units). This is why the use of GPUs has grown in recent years. Indeed, GPUs are considered the heart of deep learning because of their massively parallel architecture.

However, GPUs cannot execute just any program. Indeed, they use a specific language (CUDA for NVIDIA) to take advantage of their architecture. So, how do you use and communicate with GPUs from your applications?

The NVIDIA CUDA technology

NVIDIA CUDA (Compute Unified Device Architecture) is a parallel computing architecture combined with an API for programming GPUs. CUDA translates application code into an instruction set that GPUs can execute.

A CUDA SDK and libraries such as cuBLAS (Basic Linear Algebra Subroutines) and cuDNN (Deep Neural Network) have been developed to communicate easily and efficiently with a GPU. CUDA is available in C, C++ and Fortran. There are wrappers for other languages including Java, Python and R. For example, deep learning libraries like TensorFlow and Keras are based on these technologies.
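To give an idea of what CUDA code looks like, here is a minimal sketch: a SAXPY kernel written in CUDA C, compiled and run with nvcc. It assumes the CUDA toolkit is installed on a machine with an NVIDIA GPU; the file name is arbitrary.

```shell
# Write a tiny CUDA C program and build it with nvcc (CUDA toolkit required)
cat > saxpy.cu <<'EOF'
#include <stdio.h>

// GPU kernel: computes y = a*x + y, one thread per element
__global__ void saxpy(int n, float a, float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void) {
    int n = 1 << 20;
    float *x, *y;
    // Unified memory, accessible from both CPU and GPU
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    // Launch enough 256-thread blocks to cover all n elements
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();
    printf("y[0] = %f\n", y[0]);  // 2*1 + 2 = 4
    cudaFree(x); cudaFree(y);
    return 0;
}
EOF
nvcc saxpy.cu -o saxpy && ./saxpy
```

In practice, most AI applications never write kernels like this directly: they go through the wrappers and libraries mentioned above.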

Why use nvidia-docker?

Nvidia-docker addresses the needs of developers who want to add AI functionality to their applications, containerize them and deploy them on servers powered by NVIDIA GPUs.

The objective is to set up an architecture that allows the development and deployment of deep learning models in services available through an API. Thus, the utilization rate of GPU resources is optimized by making them available to multiple application instances.

In addition, we benefit from the advantages of containerized environments:

  • Isolation of instances of each AI model.
  • Colocation of several models with their specific dependencies.
  • Colocation of the same model under several versions.
  • Consistent deployment of models.
  • Model performance monitoring.

Natively, using a GPU in a container requires installing CUDA in the container and granting privileges to access the device. With this in mind, the nvidia-docker tool was developed, allowing NVIDIA GPU devices to be exposed in containers in an isolated and secure way.
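For context, here is a rough sketch of what that manual approach looks like without nvidia-docker: device files and the driver library are passed to the container by hand. Device names, library paths and the image name (`my-cuda-image`) are illustrative and vary per host.

```shell
# Manual GPU exposure without nvidia-docker (illustrative; paths vary per host)
docker run --rm \
  --device /dev/nvidia0 \
  --device /dev/nvidiactl \
  --device /dev/nvidia-uvm \
  -v /usr/lib/x86_64-linux-gnu/libcuda.so:/usr/lib/x86_64-linux-gnu/libcuda.so:ro \
  my-cuda-image nvidia-smi
```

This is brittle and driver-version dependent, which is exactly what nvidia-docker automates away.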

At the time of writing this article, the latest version of nvidia-docker is v2. This version differs greatly from v1 in the following ways:

  • Version 1: nvidia-docker is implemented as an overlay to Docker. That is, to create the container, you had to use nvidia-docker (e.g. nvidia-docker run ...), which performs the actions (among others, the creation of volumes) allowing the GPU devices to be seen in the container.
  • Version 2: The deployment is simplified by replacing Docker volumes with Docker runtimes. Indeed, to launch a container, it is now necessary to use the NVIDIA runtime via Docker (e.g. docker run --runtime nvidia ...).

Note that due to their different architectures, the two versions are not compatible. An application written for v1 must be rewritten for v2.
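The difference is visible directly on the command line. A sketch, assuming an image with nvidia-smi available (the image tag is an example):

```shell
# nvidia-docker v1: a wrapper around docker that mounts the driver volumes itself
nvidia-docker run --rm nvidia/cuda nvidia-smi

# nvidia-docker v2: plain docker, with the GPU exposed by the NVIDIA runtime
docker run --rm --runtime=nvidia nvidia/cuda nvidia-smi
```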

Setting up nvidia-docker

The required elements to use nvidia-docker are:

  • A container runtime.
  • An available GPU.
  • The NVIDIA Container Toolkit (main component of nvidia-docker).



A container runtime is required to run the NVIDIA Container Toolkit. Docker is the recommended runtime, but Podman and containerd are also supported.

The official documentation gives the installation procedure of Docker.
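For a quick setup on a Linux server, Docker's convenience script can be used; for production, prefer the distribution packages described in the official documentation.

```shell
# Quick Docker install on Linux via the official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
```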


Drivers are required to use a GPU device. In the case of NVIDIA GPUs, the drivers corresponding to a given OS can be obtained from the NVIDIA driver download page, by filling in the information on the GPU model.

The installation of the drivers is done via the executable. For Linux, use the following commands, replacing the name of the downloaded file:

chmod +x NVIDIA-Linux-x86_64-470.94.run
sudo ./NVIDIA-Linux-x86_64-470.94.run

Reboot the host machine at the end of the installation to take the installed drivers into account.
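After the reboot, the installation can be checked with nvidia-smi, which ships with the driver:

```shell
# Lists the detected GPUs, the driver version and the supported CUDA version
nvidia-smi
```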

Installing nvidia-docker

Nvidia-docker is available on the GitHub project page. To install it, follow the installation manual according to your server and architecture specifics.
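As an illustration, on an Ubuntu/Debian server the installation boils down to a package install followed by a Docker restart. The repository configuration from the manual is assumed to have been done beforehand, and package names and image tags may evolve.

```shell
# Install nvidia-docker v2 and restart Docker to register the NVIDIA runtime
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

# Verify that a container can see the GPU (image tag given as an example)
docker run --rm --runtime=nvidia nvidia/cuda:11.0-base nvidia-smi
```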

We now have an infrastructure that allows us to have isolated environments giving access to GPU resources. To use GPU acceleration in applications, several tools have been developed by NVIDIA (non-exhaustive list):

  • CUDA Toolkit: a set of tools for developing software/programs that can perform computations using both CPU, RAM, and GPU. It can be used on x86, Arm and POWER platforms.
  • NVIDIA cuDNN: a library of primitives to accelerate deep learning networks and optimize GPU performance for major frameworks such as TensorFlow and Keras.
  • NVIDIA cuBLAS: a library of GPU-accelerated linear algebra subroutines.

By using these tools in application code, AI and linear algebra tasks are accelerated. With the GPUs now visible, the application is able to send the data and operations to be processed on the GPU.

The CUDA Toolkit is the lowest-level option. It offers the most control (memory and instructions) to build custom applications. Libraries provide an abstraction of CUDA functionality. They let you focus on application development rather than the CUDA implementation.
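In practice, you rarely install CUDA and cuDNN by hand inside a container: NVIDIA and framework vendors publish GPU-ready images. For example, checking that TensorFlow sees the GPU from a container (the image tag is an example and may differ):

```shell
# Run a TensorFlow GPU image and list the GPUs visible to the framework
docker run --rm --runtime=nvidia tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```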

Once all these elements are in place, the architecture using the nvidia-docker service is ready to use.

Here is a diagram to summarize everything we have seen:



We have set up an architecture allowing the use of GPU resources from our applications in isolated environments. To summarize, the architecture is composed of the following bricks:

  • Operating system: Linux, Windows …
  • Docker: isolation of the environment using Linux containers
  • NVIDIA driver: installation of the driver for the hardware in question
  • NVIDIA container runtime: orchestration of the previous three
  • Applications on Docker container:
    • CUDA
    • cuDNN
    • cuBLAS
    • TensorFlow/Keras
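Putting the bricks together, packaging a model as a service can be as simple as building on a GPU-ready base image. A sketch, where the base image tag, file names and service name are illustrative:

```shell
# Build a GPU-enabled service image on top of a TensorFlow GPU base image
cat > Dockerfile <<'EOF'
FROM tensorflow/tensorflow:latest-gpu
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
EOF

docker build -t my-ai-service .
# Launch it with the NVIDIA runtime so the GPU is visible inside the container
docker run --rm --runtime=nvidia my-ai-service
```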

NVIDIA continues to develop tools and libraries around AI technologies, with the goal of establishing itself as a leader. Other technologies may complement nvidia-docker or may be more suitable than nvidia-docker depending on the use case.