Troubleshooting MetalLB Add-on

Problem

MetalLB provides external load balancing for applications in PCD-K clusters. Because it is an external add-on for PCD-K Kubernetes clusters, a malfunctioning MetalLB can significantly affect service availability. This article is a general guide to troubleshooting MetalLB add-on issues.

Environment

  • Private Cloud Director - v2025.4 and higher.
  • Kubernetes Cluster - 1.31.2 or higher.

Procedure

  1. MetalLB deploys all of its objects in the metallb-system namespace, including a controller pod and one speaker pod per worker node. Verify the pod status in the namespace using the command:
Verify MetalLB pods
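A minimal sketch of this check, assuming kubectl is already configured against the affected cluster. The helper name unhealthy_metallb_pods is hypothetical and not part of MetalLB or PCD; it wraps kubectl get pods and prints only the pods that are not Running or not fully ready.

```shell
# Assumption: kubectl points at the affected PCD-K cluster.
# Hypothetical helper: list MetalLB pods that are not Running or not fully ready.
unhealthy_metallb_pods() {
  kubectl get pods -n metallb-system --no-headers |
    awk '{ split($2, ready, "/") }
         $3 != "Running" || ready[1] != ready[2] { print $1, $2, $3 }'
}
```

Calling unhealthy_metallb_pods prints the name, ready count, and status of each problem pod. No output suggests that the controller and all speakers are healthy. Running kubectl get pods -n metallb-system -o wide directly shows the full list together with node placement.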
  2. If any pods are in a CrashLoopBackOff, OOMKilled, Pending, or Error state, review the Events section in the output of the command to find the reason:
Check events
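One possible way to collect the events, assuming kubectl access to the cluster. The helper name metallb_pod_events is hypothetical; the pod name is passed as its first argument.

```shell
# Assumption: kubectl points at the affected PCD-K cluster.
# Hypothetical helper: show the describe output and the recorded events for one pod.
metallb_pod_events() {
  pod="$1"
  # The Events section appears at the end of the describe output.
  kubectl describe pod "$pod" -n metallb-system
  # Events recorded for the same pod, sorted by last timestamp.
  kubectl get events -n metallb-system \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```

For example, metallb_pod_events followed by the name of a failing speaker pod shows scheduling failures, image pull errors, or OOM kills reported for that pod.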
  3. Validate that MetalLB shows the correct IP address pools using the command:
Get IPAddressPool details
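A hedged sketch of the pool check, assuming the pools are defined as IPAddressPool resources in the metallb-system namespace. Both helper names are hypothetical. The second helper is an addition to the step: it lists LoadBalancer services so that their external IPs can be compared against the pool ranges.

```shell
# Assumption: kubectl points at the affected PCD-K cluster.
# Hypothetical helper: print each IPAddressPool name with its address ranges.
metallb_pools() {
  kubectl get ipaddresspools.metallb.io -n metallb-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.addresses}{"\n"}{end}'
}

# Hypothetical helper: list LoadBalancer services as namespace, name, external IP.
# An external IP that stays pending usually means no pool could allocate an address.
metallb_lb_services() {
  kubectl get svc -A --no-headers | awk '$3 == "LoadBalancer" { print $1, $2, $5 }'
}
```

Compare the ranges printed by metallb_pools with the addresses the network team reserved for load balancers, then confirm that each external IP from metallb_lb_services falls inside one of those ranges.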
  4. Verify that the metallb-webhook-service service has the correct endpoint IP address, which should be the IP address of the metallb-controller pod.
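The comparison can be sketched as follows, assuming a single controller replica whose pod name contains the string controller. The helper name metallb_webhook_check is hypothetical. It returns success only when the endpoint IP of the webhook service equals the controller pod IP.

```shell
# Assumption: kubectl points at the affected PCD-K cluster and one controller pod exists.
# Hypothetical helper: compare the webhook service endpoint with the controller pod IP.
metallb_webhook_check() {
  ep_ips="$(kubectl get endpoints metallb-webhook-service -n metallb-system \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  ctrl_ips="$(kubectl get pods -n metallb-system --no-headers \
    -o custom-columns=NAME:.metadata.name,IP:.status.podIP |
    awk '$1 ~ /controller/ { print $2 }')"
  echo "webhook endpoints:  ${ep_ips:-none}"
  echo "controller pod IPs: ${ctrl_ips:-none}"
  # Succeed only when an endpoint exists and matches the controller pod IP.
  [ -n "$ep_ips" ] && [ "$ep_ips" = "$ctrl_ips" ]
}
```

An empty endpoint list or a mismatch typically means the controller pod is not ready, in which case creating or updating MetalLB resources may fail at the webhook.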
  5. Get more information on the failure from the pod logs using the command:
Command
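A minimal sketch for collecting the logs, assuming kubectl access to the cluster. The helper name metallb_pod_logs is hypothetical, and the 200-line tail is an arbitrary choice.

```shell
# Assumption: kubectl points at the affected PCD-K cluster.
# Hypothetical helper: print recent logs for one MetalLB pod, including the
# previous container instance when the pod has restarted.
metallb_pod_logs() {
  pod="$1"
  kubectl logs "$pod" -n metallb-system --all-containers --tail=200
  # --previous fails when no earlier container instance exists, so ignore that error.
  kubectl logs "$pod" -n metallb-system --all-containers --previous --tail=200 2>/dev/null || true
}
```

The logs of the previous instance are often the most useful ones for a pod in CrashLoopBackOff, because the current container may not have run long enough to log the failure.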
  6. Ensure that the MetalLB VIP/MAC address pair mapping is added on all worker nodes. This is required because port security does not allow MAC address spoofing.
  7. If these steps do not resolve the issue, contact the Platform9 Support Team for additional assistance.

Most common causes:

  • Incorrect IP range specified in the IPAddressPools.
  • L2Advertisement or BGPAdvertisement resources are not created for the IPAddressPools, or the advertisements are misconfigured.
  • MetalLB VIP/MAC address pair mapping is missing on some or all of the worker nodes.
  • Errors in Calico, since MetalLB relies on Calico's functionality. Check the Calico pod logs for errors.
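
Two of the causes above can be checked from the command line. The sketch below assumes kubectl access and uses hypothetical helper names. It also assumes that Calico runs as pods labelled k8s-app=calico-node in the kube-system namespace; operator-based Calico installs typically use the calico-system namespace instead, which can be passed as the first argument.

```shell
# Assumption: kubectl points at the affected PCD-K cluster.
# Hypothetical helper: list advertisements with the pools they reference.
# An advertisement without ipAddressPools applies to all pools.
metallb_advertisements() {
  for kind in l2advertisements.metallb.io bgpadvertisements.metallb.io; do
    kubectl get "$kind" -n metallb-system \
      -o custom-columns=KIND:.kind,NAME:.metadata.name,POOLS:.spec.ipAddressPools
  done
}

# Hypothetical helper: print error-like lines from recent calico-node logs.
# The namespace defaults to kube-system (an assumption, see above).
calico_node_errors() {
  ns="${1:-kube-system}"
  kubectl logs -n "$ns" -l k8s-app=calico-node --tail=200 --prefix |
    grep -iE 'error|fail' || true
}
```

If metallb_advertisements shows no advertisement that covers a given pool, addresses from that pool are assigned to services but never announced on the network.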