Set up vGPU

Virtual GPU (vGPU) allows multiple VMs to share physical GPU resources efficiently. This approach maximizes resource utilization and reduces costs while still providing GPU acceleration for lighter workloads.

Configure vGPU infrastructure

Set up your vGPU infrastructure following the specific sequence required for virtual GPU functionality.

Prerequisites

Before beginning the vGPU configuration, ensure that your hosts have the required GPU drivers and licensing:

  • vGPU functionality requires that the proper NVIDIA drivers and valid licenses are installed on the hosts before you begin any configuration steps.

  • Ensure that the GPU cards intended for vGPU configuration are unbound and not linked to any other process or device. For more details, see Troubleshooting GPU Support.

  • Required components:

    • NVIDIA GPU drivers installed and functioning on the host
    • NVIDIA vGPU licenses properly configured
    • NVIDIA license server created and accessible
    • Valid license allocation for your vGPU usage
  • Ensure that SR-IOV is enabled in the BIOS before proceeding with vGPU configuration.

  • Verify that the GPU drivers are installed by running nvidia-smi, as shown below.
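For example, a quick driver check (the output varies by driver version and GPU model):

```bash
# Confirm the NVIDIA driver is loaded and the GPUs are visible
nvidia-smi
```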

Step 1: Onboard vGPU host

Before configuring vGPU, you must first onboard your GPU host using pcdctl.

  1. Onboard your vGPU host using pcdctl.
  2. Verify that the host onboarding completed successfully.
  3. Ensure you have administrator access to the onboarded host.

Step 2: Run initial vGPU configuration

Execute the GPU configuration script to set up vGPU functionality on your host.

The GPU configuration script is located at /opt/pf9/gpu/pf9-gpu-configure.sh on your onboarded host.

  1. Access your onboarded vGPU host with administrator privileges.
  2. Navigate to the GPU script directory using the following command.
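For example, assuming the default script location noted above:

```bash
# Default directory for the pf9 GPU configuration script
cd /opt/pf9/gpu
```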
  3. Run the GPU configuration script and enter option 2 (vgpu pre configure).
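A minimal invocation, assuming root privileges and the default script name:

```bash
# Launch the interactive GPU configuration menu, then choose option 2
sudo bash pf9-gpu-configure.sh
```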

The script prompts you to reboot at the end of this configuration.

  4. The script prompts you to update GRUB and reboot. If you select N at the prompt, manually run the following commands:
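The exact commands depend on your distribution; on Ubuntu or Debian hosts, for example:

```bash
# Apply the GRUB changes made by the script (use grub2-mkconfig on RHEL-based hosts)
sudo update-grub
# Reboot so the changes take effect
sudo reboot
```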

You may need to wait for the host to come back online before proceeding.

  5. To verify that your vGPU pre-configuration succeeded, run the following command.
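For example:

```bash
# Re-run the configuration script to reach the validation option
sudo bash pf9-gpu-configure.sh
```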
  6. Enter option 6 (validate vgpu) on the terminal.

Step 3: Configure SR-IOV for vGPU

Configure SR-IOV settings required for vGPU functionality.

Currently, vGPU configuration in PCD is supported only through SR-IOV. Therefore, only GPU models with SR-IOV enabled are supported for vGPU use in PCD.

  1. Navigate to the GPU script directory using the following command.
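Assuming the same default location as before:

```bash
cd /opt/pf9/gpu
```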
  2. Run the GPU configuration script and enter option 3 (vGPU SR-IOV configure).
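For example:

```bash
# Launch the configuration menu, then choose option 3 (root privileges assumed)
sudo bash pf9-gpu-configure.sh
```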
  3. The script lists the NVIDIA GPUs detected on the host along with their PCI device IDs. You can either:
  • Enter specific PCI device IDs separated by spaces.
  • Press Enter without input to configure ALL listed NVIDIA GPUs.

If you encounter a Cannot obtain unbindLock error during this step, refer to Troubleshooting GPU Support for resolution steps.

  4. Verify the vGPU and SR-IOV configurations by performing the following steps.
  • Navigate to the GPU script directory:
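For example, assuming the default path:

```bash
cd /opt/pf9/gpu
```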
  • Run the GPU configuration script to verify vGPU setup.
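For example:

```bash
# Root privileges assumed for the interactive script
sudo bash pf9-gpu-configure.sh
```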
  • Enter option 6 (Validate vGPU) on your terminal and review the verification output to confirm that vGPU is configured correctly.
  5. On the console, go to Infrastructure > GPU Hosts and verify that your GPU host appears in the list. You will see:
    • Compatibility mode (vGPU)
    • GPU model and device ID

Step 4: Create host configuration and vGPU cluster

Create the necessary host configuration and cluster settings for vGPU operation.

  1. Navigate to Infrastructure > Cluster Blueprint > Host Configurations on the PCD console.
  2. Select Add Host Configuration to create a new configuration.
  3. Configure your new PCD host by entering the network and management settings. Each setting controls a specific functionality that determines how the host operates within your PCD environment.
  • Name this configuration: Specify the host name to configure.
  • Network Interface: Enter the Physical Interface Name.
  • Physical Network Label: This is an optional entry. Use a descriptive label to identify and organize physical network interfaces. By assigning meaningful names like "Production-Network" or "Management-VLAN", you can filter your search for easier identification and troubleshooting.
  • Management: Enable management functions.
  • VM Console: Enable VM Console access to allow administrators to connect directly to virtual machines running on this host for troubleshooting and management.
  • Image Library I/O: Enable Image Library I/O to allow this host to read from and write to the centralized image repository for VM deployment and updates.
  • Virtual Network Tunnels: Enable Virtual Network Tunnels to allow secure network connectivity between this host and other hosts in the PCD environment.
  • Host Liveness Checks: Enable Host Liveness Checks to automatically monitor the host's health status and trigger alerts when the host is unresponsive.
  4. Name this configuration.
  5. Configure the basic host settings with the network section configured in the blueprint.

Step 5: Create vGPU cluster

  1. Navigate to Infrastructure > Clusters on the PCD console.
  2. Select Add Cluster to create a new cluster configuration.

Optionally, you can also configure VMHA or DRR settings.

  3. Select Enable GPU and then select the GPU mode: vGPU for sharing GPUs across multiple VMs.
  4. Select Save.

Your host configuration and cluster now support vGPU workloads.

Step 6: Authorize vGPU hosts

Authorize your vGPU-configured host in your cluster.

  1. Navigate to Infrastructure > Cluster Hosts in the PCD console.
  2. Authorize the hosts by assigning:
  • Host configuration
  • Hypervisor role
  • vGPU cluster

You may need to wait a few minutes for the authorization process to complete.

Step 7: Configure vGPU host with vGPU profile

Configure your vGPU host with the appropriate vGPU profile.

  1. Navigate to Infrastructure > GPU Hosts in the PCD console.
  2. Select your vGPU host from the list.
  3. Configure the vGPU host with a vGPU profile.

Only a single profile can be selected per vGPU host.

Step 8: Complete vGPU host configuration

Run the configuration script to complete vGPU host setup.

  1. Access your GPU host with administrator privileges.
  2. Navigate to the GPU script directory:
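Assuming the default script location:

```bash
cd /opt/pf9/gpu
```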
  3. Run the GPU configuration script.
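For example:

```bash
# Launch the configuration menu, then choose option 4 (root privileges assumed)
sudo bash pf9-gpu-configure.sh
```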
  4. Enter option 4 (vGPU host configure) on the terminal to complete the host configuration.
  5. Verify that the vGPU host is properly configured and appears under Infrastructure > GPU Hosts. You will see:
  • Compatibility mode (vGPU)
  • GPU model and device ID
  • Available vGPU profiles

Your vGPU infrastructure is now ready for creating flavors and deploying VMs. For more details, see Create GPU Enabled Flavors.

Monitor vGPU resources

Monitor GPU usage and availability for vGPU configurations from the PCD console; a host-side check is shown after the list:

  1. Navigate to Infrastructure > GPU Hosts to view:
  • GPU models and total VRAM per GPU
  • Used and available VRAM per GPU
  • Active vGPU profiles and their utilization
  • vGPU slices available vs. assigned
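You can also check vGPU activity directly on the host. As a rough sketch, assuming the NVIDIA vGPU manager software is installed on the host:

```bash
# List active vGPU instances on this host (requires NVIDIA vGPU software)
nvidia-smi vgpu

# Show overall GPU and memory utilization as reported by the host driver
nvidia-smi
```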

vGPU migration behavior

Understanding live migration behavior for vGPU VMs:

  • vGPU VMs can be migrated if the destination host supports the same vGPU profile.
  • The system validates compatibility before allowing migration.
  • Migration fails if the destination host does not have the required vGPU profile available.

vGPU best practices

  • Profile selection: Choose vGPU profiles that match your workload requirements.
  • Resource monitoring: Monitor vGPU utilization to optimize resource allocation.
  • Driver compatibility: Ensure vGPU drivers are compatible with your guest operating system.