How can I give virtual GPU resources to my end users seamlessly?
Presentation given at OpenInfra Day France 2024 about virtual GPU support in OpenStack Nova — covering VFIO-mdev, SR-IOV GPUs, vGPU live migration, and unified limits.
Slides
Summary
- VFIO-mdev: How the kernel interface exposes GPU profiles (mediated device types) to Nova
- SR-IOV GPUs (Caracal): Support for GPUs with virtual functions and the max_instances config option
- vGPU Live Migration: Requirements (libvirt 8.6+, QEMU 8.1+, kernel 5.18+) and configuration limits
- Unified Limits: New quota system using Keystone for VGPU resource quotas
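As a rough illustration of the configuration options mentioned above, here is a minimal nova.conf sketch; the mdev type name nvidia-660 is a placeholder, so check the Nova documentation for the types your GPU and driver actually expose:

```ini
[devices]
# Mediated device types (vGPU profiles) Nova is allowed to create
enabled_mdev_types = nvidia-660

[mdev_nvidia-660]
# For SR-IOV GPUs (Caracal and later): cap how many instances
# can consume this mdev type
max_instances = 4
```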
Demo
During the presentation, I demonstrated GPU-accelerated prime number computation using numba with CUDA inside an OpenStack VM with a virtual GPU.
The demo script (get_primes.py) uses numba’s @vectorize decorator to run a prime checker on either the GPU (target='cuda') or CPU (target='parallel'):
```python
import numba as nb
import numpy as np


@nb.vectorize(['int32(int32)'], target='cuda')
def check_prime_gpu(num):
    # Trial division compiled as a CUDA kernel:
    # returns num if prime, 0 otherwise
    for i in range(2, (num // 2) + 1):
        if (num % i) == 0:
            return 0
    return num


@nb.vectorize(['int32(int32)'], target='parallel')
def check_prime_no_gpu(num):
    # Same kernel compiled for multi-threaded CPU execution,
    # used as the baseline for comparison
    for i in range(2, (num // 2) + 1):
        if (num % i) == 0:
            return 0
    return num
```
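If you don't have numba or a CUDA-capable GPU at hand, the kernel's logic can be checked with a plain-Python equivalent. The check_prime function below is a hypothetical stand-in for the vectorized kernels, with an extra guard so that numbers below 2 are not reported as prime:

```python
import numpy as np


def check_prime(num: int) -> int:
    # Mirror of the vectorized kernel: return num if prime, else 0
    if num < 2:
        return 0
    for i in range(2, (num // 2) + 1):
        if num % i == 0:
            return 0
    return num


# Apply the checker over an array, as the demo does on the GPU,
# then keep only the non-zero results (the primes)
numbers = np.arange(2, 50, dtype=np.int32)
primes = [n for n in map(check_prime, numbers) if n != 0]
print(primes)
```

With the real kernels, the same call shape applies: check_prime_gpu(numbers) dispatches the whole array to the GPU, while check_prime_no_gpu(numbers) runs it across CPU threads.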
Terminal recordings
The following recordings show the live demo on a server with an NVIDIA Quadro RTX 6000 GPU.