How can I give virtual GPU resources to my end users seamlessly?
Presentation given at OpenInfra Day France 2024 about virtual GPU support in OpenStack Nova — covering VFIO-mdev, SR-IOV GPUs, vGPU live migration, and unified limits.
Slides
Summary
- VFIO-mdev: How the kernel interface exposes GPU profiles (mediated device types) to Nova
- SR-IOV GPUs (Caracal): Support for GPUs with virtual functions and the max_instances config option
- vGPU Live Migration: Requirements (libvirt 8.6+, QEMU 8.1+, kernel 5.18+) and configuration limits
- Unified Limits: New quota system using Keystone for VGPU resource quotas
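As a rough illustration of the configuration options mentioned above, here is a minimal nova.conf sketch; the mdev type name nvidia-660 is a placeholder, so check the Nova documentation for the types your GPU and driver actually expose:

```ini
[devices]
# Mediated device types (vGPU profiles) Nova is allowed to create
enabled_mdev_types = nvidia-660

[mdev_nvidia-660]
# For SR-IOV GPUs (Caracal and later): cap how many instances
# can consume this mdev type
max_instances = 4
```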
Demo
During the presentation, I demonstrated GPU-accelerated prime number computation using numba with CUDA inside an OpenStack VM with a virtual GPU.
The demo script (get_primes.py) uses numba’s @vectorize decorator to run a prime checker on either the GPU (target='cuda') or CPU (target='parallel'):
```python
import numba as nb
import numpy as np


@nb.vectorize(['int32(int32)'], target='cuda')
def check_prime_gpu(num):
    # Trial division compiled as a CUDA kernel:
    # returns num if prime, 0 otherwise
    for i in range(2, (num // 2) + 1):
        if (num % i) == 0:
            return 0
    return num


@nb.vectorize(['int32(int32)'], target='parallel')
def check_prime_no_gpu(num):
    # Same kernel compiled for multi-threaded CPU execution,
    # used as the baseline for comparison
    for i in range(2, (num // 2) + 1):
        if (num % i) == 0:
            return 0
    return num
```
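If you don't have numba or a CUDA-capable GPU at hand, the kernel's logic can be checked with a plain-Python equivalent. The check_prime function below is a hypothetical stand-in for the vectorized kernels, with an extra guard so that numbers below 2 are not reported as prime:

```python
import numpy as np


def check_prime(num: int) -> int:
    # Mirror of the vectorized kernel: return num if prime, else 0
    if num < 2:
        return 0
    for i in range(2, (num // 2) + 1):
        if num % i == 0:
            return 0
    return num


# Apply the checker over an array, as the demo does on the GPU,
# then keep only the non-zero results (the primes)
numbers = np.arange(2, 50, dtype=np.int32)
primes = [n for n in map(check_prime, numbers) if n != 0]
print(primes)
```

With the real kernels, the same call shape applies: check_prime_gpu(numbers) dispatches the whole array to the GPU, while check_prime_no_gpu(numbers) runs it across CPU threads.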
Terminal recordings
The following recordings show the live demo on a server with an NVIDIA Quadro RTX 6000 GPU.