Set up GPU Passthrough

GPU passthrough provides maximum performance by assigning an entire physical GPU exclusively to a single VM. This approach is ideal for high-performance computing workloads that require full GPU access and bare-metal performance.

Configure GPU Passthrough infrastructure

Set up your passthrough GPU infrastructure by configuring hosts and clusters with GPU capabilities.

Step 1: Configure the GPU passthrough host

To configure your host to support GPU workloads, perform the following steps.

Prerequisites: Verify SR-IOV support

Before configuring GPU passthrough, verify that your GPU hardware supports SR-IOV.

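The original command is not shown here; one common way to perform this check, assuming the standard `lspci` utility from pciutils is available on the host, is to inspect the device's capability list:

```bash
# Replace c1:00.0 with your GPU's PCI address (find it with: lspci | grep -i nvidia).
# -v prints the device's capability list; filter for the SR-IOV capability.
lspci -vs c1:00.0 2>/dev/null | grep -i "single root i/o virtualization" \
  || echo "SR-IOV capability not reported for this device"
```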

Replace c1:00.0 with your actual GPU PCI device ID. You should see output similar to the following:

```
Capabilities: [bcc v1] Single Root I/O Virtualization (SR-IOV)
```

If you see this output, your GPU supports SR-IOV and you can proceed with the GPU configuration.

  1. Navigate to Infrastructure > Cluster Blueprint > Host Configurations on the PCD console.
  2. Select Add Host Configuration to create a new configuration.
  3. Configure your new PCD host by entering the network and management settings. Each setting controls a specific functionality that determines how the host operates within your PCD environment.
| Field Name | Description |
| --- | --- |
| Name this configuration | Specify the host name to configure. |
| Network Interface | Enter the physical interface name. |
| Physical Network Label | Optional. Use a descriptive label to identify and organize physical network interfaces. Assigning meaningful names like "Production-Network" or "Management-VLAN" lets you filter your search for easier identification and troubleshooting. |
| Management | Enable management functions. |
| VM Console | Enable VM Console access to allow administrators to connect directly to virtual machines running on this host for troubleshooting and management. |
| Image Library I/O | Enable Image Library I/O to allow this host to read from and write to the centralized image repository for VM deployment and updates. |
| Virtual Network Tunnels | Enable Virtual Network Tunnels to allow secure network connectivity between this host and other hosts in the PCD environment. |
| Host Liveness Checks | Enable Host Liveness Checks to automatically monitor the host's health status and trigger alerts when the host is unresponsive. |
  4. Name this configuration.
  5. Configure the basic host settings, such as network interfaces.
  6. Under GPU Configuration, select Enable GPU.
  7. From the GPU Model dropdown, select your GPU model (for example, NVIDIA L4).
  8. If your host supports SR-IOV, select the Enable SR-IOV checkbox.
  9. Select Save.

Optionally, you can edit an existing host configuration by performing the following steps.

  1. Navigate to Infrastructure > Cluster Blueprint > Host Configurations on the PCD console.
  2. On the Host Configurations list, select the configuration you want to modify.

If your host configuration is currently in use, you must first deauthorize the associated host before editing.

  3. Select Enable GPU Passthrough, and then select your GPU model from the list of supported models.
  4. Select SR-IOV Device if the host supports SR-IOV.
  5. Select Save Blueprint.

Your host configuration now supports GPU workloads.

Step 2: Enable GPU in your cluster

Configure your cluster to support GPU enabled VMs by selecting the passthrough GPU mode.

  1. Navigate to Infrastructure > Clusters on the PCD console.
  2. Select Add Cluster and enter the cluster settings.

Optionally, you can also configure VMHA or DRR settings.

  3. Select Enable GPU, and then select the GPU mode: Passthrough, for full GPU assignment to a single VM.
  4. Select Save.

Your new cluster now supports GPU passthrough workloads.

Optionally, edit an existing cluster by performing the following steps.

  1. From the Clusters list, select the cluster you want to modify.

The cluster must not currently be in use. If it is in use, you must first deauthorize the associated hosts before editing.

Additionally, all hosts in this cluster must support the selected GPU mode (in this case, GPU Passthrough).

  2. Select Edit Cluster.
  3. Perform the Enable GPU and Save steps described in Step 2: Enable GPU in your cluster.

Your existing cluster now supports GPU passthrough workloads.

Step 3: Run the GPU configuration script

Before you begin using GPU features, run a configuration script on each GPU host. This script configures the underlying GPU drivers and enables passthrough mode.

Prerequisites: Onboard GPU hosts

Before running the GPU configuration script, you must first onboard your GPU hosts using pcdctl.

  1. Onboard your GPU host using pcdctl.
  2. Verify that host onboarding completed successfully.
  3. Ensure you have administrator access to the onboarded hosts.

Execute GPU configuration script

The GPU configuration script is located at /opt/pf9/gpu/pf9-gpu-configure.sh on your onboarded host.

  1. Access your onboarded GPU host with administrator privileges.
  2. Navigate to the GPU script directory:
```bash
cd /opt/pf9/gpu
```
  3. Run the GPU configuration script:

```bash
sudo ./pf9-gpu-configure.sh
```
  4. The script prompts you to choose configuration options.
  5. Enter option 1 (PCI Passthrough) based on your cluster configuration, and then either:
  • Enter specific PCI device IDs separated by spaces.
  • Press Enter without input to configure all listed NVIDIA GPUs.
  6. After the script completes, it prompts you to update GRUB and reboot. If you select N at the prompt, manually run the following commands to apply the changes (the exact commands depend on your distribution; shown here for Debian/Ubuntu-based hosts):

```bash
sudo update-grub
sudo reboot
```

You may need to wait for the host to come back online before proceeding.
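The configuration script is expected to set the required kernel parameters for you. If passthrough validation fails later, one generic (not PCD-specific) thing to check is whether the kernel booted with IOMMU support enabled:

```bash
# GPU passthrough requires IOMMU support enabled at boot
# (typically intel_iommu=on or amd_iommu=on on the kernel command line).
grep -oE "intel_iommu=[^ ]+|amd_iommu=[^ ]+|iommu=[^ ]+" /proc/cmdline \
  || echo "No IOMMU parameters found on the kernel command line"
```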

  7. To verify that your GPU passthrough configuration succeeded, run the script again:

```bash
sudo ./pf9-gpu-configure.sh
```
  8. Enter option 5 (Validate Passthrough) in the terminal and review the verification output to confirm that passthrough is configured.
  9. On the console, go to Infrastructure > GPU Hosts to verify that your GPU host appears in the list. You will see:
  • Compatibility mode (passthrough)
  • GPU model and device ID
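As an additional OS-level check (generic Linux, not PCD-specific), you can confirm that the GPU is now bound to the vfio-pci driver rather than a host NVIDIA driver:

```bash
# Replace c1:00.0 with your GPU's PCI address. For a successfully
# configured passthrough device, the "Kernel driver in use" line
# should report vfio-pci.
lspci -ks c1:00.0 2>/dev/null | grep -i "kernel driver in use" \
  || echo "No kernel driver reported for this device"
```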

The script has now configured the GPU drivers on your host.

Step 4: Authorize your GPU Hosts

Authorize your GPU configured host in your GPU enabled cluster.

  1. Navigate to Infrastructure > Cluster Hosts and then select an existing configured host to authorize.
  2. Select the Host configuration where you enabled GPU support.
  3. Select the cluster where you enabled GPU passthrough support.
  4. Select Authorize Hosts.

You may need to wait a few minutes for the authorization process to complete. On Infrastructure > Cluster Hosts, the authorized GPU host shows Status: ok and Scheduling: Enabled.

Your GPU host is now authorized and ready to run GPU passthrough virtual machines.

Learn how to Create GPU Enabled Flavors for passthrough virtual machines.

Monitor passthrough GPU resources

Monitor GPU usage and availability for passthrough configurations.

  1. Navigate to Infrastructure > GPU Hosts to view:
  • GPU models and total count per host
  • Passthrough GPUs available vs. assigned
  • GPU utilization per host

Passthrough GPU migration behavior

Understanding live migration behavior for passthrough GPU VMs:

  • You cannot migrate passthrough GPU VMs as each VM has exclusive access to specific physical GPU hardware.
  • The migration option in the console is currently not supported for passthrough GPU VMs.
  • To move a passthrough GPU VM, you must shut it down and redeploy on the target host.

Passthrough GPU best practices

  • Driver installation: Ensure NVIDIA drivers are installed in the VM for optimal performance.
  • Resource planning: Each passthrough GPU serves only one VM, so plan capacity accordingly.
  • High availability: Consider GPU redundancy for critical workloads since passthrough VMs cannot be migrated.
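To illustrate the driver-installation point above: once the NVIDIA drivers are installed inside a passthrough VM, you can confirm from within the guest that the GPU is visible (a generic check, assuming the standard nvidia-smi utility that ships with the driver):

```bash
# Inside the guest VM: report the GPU name, driver version, and total memory
# as seen by the NVIDIA driver (prints a notice if the driver is not installed).
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
else
  echo "nvidia-smi not found: install the NVIDIA driver in the guest first"
fi
```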