Windows VM Performance Degradation on PCD
Problem
Windows 24H2 and Windows Server 2025 VMs running on PCD show noticeably slower performance compared to similar VMs running on other KVM-based platforms such as Proxmox.
Environment
- Compute Service
- Private Cloud Director Virtualization - v2025.4 and Higher
- Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
Cause
The performance degradation was traced to inconsistent CPU model exposure and topology configuration across compute nodes.
When libvirt launches a Windows VM, it attempts to match the CPU model defined in its XML configuration. If the match is partial, the guest OS receives fewer CPU instruction flags, leading to reduced optimization and slower performance.
Additionally, default VM topology settings (e.g., sockets=4 cores=1 threads=1) can cause Windows to detect multiple single-core CPUs instead of a single multi-core processor. This results in:
- Higher inter-socket latency
- Reduced cache efficiency
- Suboptimal thread scheduling by the Windows kernel
Diagnostics
1. Check the CPU configuration of the affected VM from the compute node that hosts it:
$ sudo virsh dumpxml <VM_UUID> | grep -A10 -i cpu
Look for:
- check='partial' → Indicates incomplete CPU feature exposure
- fallback='allow' → Allows degraded CPU model matching
2. Compare CPU settings across compute nodes:
$ sudo grep -i cpu /opt/pf9/etc/nova/conf.d/*.conf
Confirm consistent entries such as:
cpu_mode = custom
cpu_models = EPYC-Rome
3. Verify host hardware consistency:
$ sudo lscpu
Ensure that all compute nodes have the same CPU model and features.
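The first diagnostic check can be scripted so that it gives a clear pass or fail result per VM. The following is a minimal sketch, not part of the product tooling: `check_cpu_match` is a hypothetical helper name, and it assumes the domain XML is piped in on standard input.

```shell
#!/bin/sh
# check_cpu_match (hypothetical helper): reads libvirt domain XML on stdin
# and warns about the two markers described in step 1.
# Returns 0 when neither marker is present, 1 otherwise.
check_cpu_match() {
    xml=$(cat)
    status=0
    if printf '%s\n' "$xml" | grep -Eq "check=['\"]partial['\"]"; then
        echo "WARN: check='partial' found (incomplete CPU feature exposure)"
        status=1
    fi
    if printf '%s\n' "$xml" | grep -Eq "fallback=['\"]allow['\"]"; then
        echo "WARN: fallback='allow' found (degraded CPU model matching permitted)"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "OK: no partial check or permissive fallback found"
    fi
    return "$status"
}

# Example (run on the compute node that hosts the VM):
#   sudo virsh dumpxml <VM_UUID> | check_cpu_match
```

Because the function only reads standard input, it can also be run against a saved copy of the domain XML when comparing several VMs or nodes.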
Resolution
1. Define explicit CPU topology in the flavor:
$ openstack flavor set <FLAVOR_NAME> \
    --property hw:cpu_sockets=1 \
    --property hw:cpu_cores=4 \
    --property hw:cpu_threads=1
This configuration ensures Windows detects a single multi-core processor, improving scheduling and cache utilization.
2. Use strict CPU model matching in libvirt:
In the domain XML, define:
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>EPYC-Rome</model>
</cpu>
This prevents libvirt from falling back to a partially matched CPU configuration.
3. Guest OS optimizations:
- Verify that the DiagTrack (Connected User Experiences and Telemetry) service is enabled. Disabling it may negatively impact performance.
- Review Virtualization-Based Security (VBS) settings. In some cases, disabling VBS may reduce overhead and improve responsiveness.
4. Optional graphics optimization (for GUI workloads): You may test the following flavor or image properties:
- hw_video_model=virtio or hw_video_model=qxl
- hw_graphics_type=vnc or hw_graphics_type=spice
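When several flavors need the topology change from step 1, the command can be wrapped in a small function. This is a sketch under stated assumptions: `set_single_socket_topology` and the `DRY_RUN` variable are hypothetical names, and it assumes that the core count should equal the flavor's vCPU count, as in the 4-core example above. By default it only prints the command so that it can be reviewed first.

```shell
#!/bin/sh
# set_single_socket_topology FLAVOR VCPUS (hypothetical helper)
# Exposes all vCPUs as cores of one socket, using the same flavor
# properties as step 1. With DRY_RUN=1 (the default) the command is
# printed instead of executed.
set_single_socket_topology() {
    flavor=$1
    vcpus=$2
    # Require a flavor name and a positive integer vCPU count.
    case $vcpus in
        ''|*[!0-9]*|0)
            echo "usage: set_single_socket_topology FLAVOR VCPUS" >&2
            return 2 ;;
    esac
    if [ -z "$flavor" ]; then
        echo "usage: set_single_socket_topology FLAVOR VCPUS" >&2
        return 2
    fi
    set -- openstack flavor set "$flavor" \
        --property hw:cpu_sockets=1 \
        --property "hw:cpu_cores=$vcpus" \
        --property hw:cpu_threads=1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example:
#   set_single_socket_topology <FLAVOR_NAME> 4             # print only
#   DRY_RUN=0; set_single_socket_topology <FLAVOR_NAME> 4  # apply
```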
Validation
- Recreate or resize a Windows VM using the updated flavor.
- Run the same workload or performance benchmark as before.
- Compare execution times against the earlier run; improved performance should be observed.
- Verify topology and CPU features in:
  - virsh dumpxml (on the host)
  - Windows Task Manager → Performance → CPU
Expected result: CPU-intensive tasks complete significantly faster and overall responsiveness improves.
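The host-side topology check can be reduced to a single line of output. The sketch below uses a hypothetical helper, `show_topology`, which reads the domain XML on standard input and assumes libvirt writes the `sockets`, `cores`, and `threads` attributes in that order on the `<topology>` element.

```shell
#!/bin/sh
# show_topology (hypothetical helper): reads libvirt domain XML on stdin
# and prints the socket/core/thread layout from the <topology> element.
# Prints nothing if no <topology> element is present.
show_topology() {
    sed -n "s/.*<topology[^>]*sockets=['\"]\([0-9]*\)['\"][^>]*cores=['\"]\([0-9]*\)['\"][^>]*threads=['\"]\([0-9]*\)['\"].*/sockets=\1 cores=\2 threads=\3/p"
}

# Example (run on the compute node that hosts the VM):
#   sudo virsh dumpxml <VM_UUID> | show_topology
```

For a VM built from the 4-core flavor in the Resolution section, the output should read `sockets=1 cores=4 threads=1`; a value such as `sockets=4 cores=1` indicates the instance is still using the old topology.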
Additional Information
- Using strict CPU model matching and explicit topology can yield a 10–15% performance improvement in CPU-bound Windows workloads.
- Linux guests are generally unaffected, as they handle partial CPU model exposure more gracefully.