Troubleshooting MetalLB Add-on

Problem

MetalLB provides external load balancing for applications in PCD-K clusters. Because it is an external add-on for PCD-K Kubernetes clusters, a malfunctioning MetalLB can significantly impact service availability. This article is a general guide to troubleshooting MetalLB add-on issues.

Environment

Private Cloud Director - v2025.4 and higher.
Kubernetes Cluster - 1.31.2 or higher.

Procedure

MetalLB deploys all of its objects in the metallb-system namespace: a controller pod and one speaker pod per worker node. Verify the pod status in the namespace using the command:

Verify MetalLB pods

$ kubectl get pods -n metallb-system
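
On clusters with many worker nodes the pod listing can be long. As a rough sketch, the output can be piped through a small filter that keeps only pods whose status is not Running or Completed. The `unhealthy_pods` helper name is hypothetical, and the filter assumes the default `kubectl get pods` column layout, where STATUS is the third column.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints "<pod-name> <status>" for every pod that is not Running/Completed.
# Assumes the default column layout, where STATUS is the third column.
unhealthy_pods() {
  awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods -n metallb-system --no-headers | unhealthy_pods
```

A pod that is Running but not fully ready (for example, 0/1 in the READY column) still passes this filter, so the READY column is worth checking as well.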

If any pods are in a CrashLoopBackOff, OOMKilled, Pending, or Error state, review the Events section in the output of the following command to find out why:

Check events

$ kubectl describe pod <pod-name> -n metallb-system

Validate that MetalLB is configured with the correct IP address pools using the command:

Get IPAddressPool details

$ kubectl get IPAddressPool -n metallb-system
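
To compare the configured ranges with the address a service is expected to receive, something like the following may help. It is a minimal sketch under stated assumptions: `ip_to_int` and `ip_in_range` are hypothetical helpers, they handle only IPv4 pools written as a `start-end` range (not CIDR notation), they assume a bash-compatible shell, and the 10.0.0.x addresses are illustrative only.

```shell
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1                       # split the address on dots
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed when address $1 lies within the range $2..$3.
ip_in_range() {
  local ip lo hi
  ip=$(ip_to_int "$1"); lo=$(ip_to_int "$2"); hi=$(ip_to_int "$3")
  [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

# Usage (requires access to the cluster): print each pool name with its
# addresses, then check an address against one of the printed ranges.
#   kubectl get ipaddresspools -n metallb-system \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.addresses}{"\n"}{end}'
#   ip_in_range 10.0.0.25 10.0.0.20 10.0.0.30 && echo "address is inside the pool"
```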

Verify that the metallb-webhook-service service has the IP address of the metallb-controller pod as its endpoint.
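
One way to perform this comparison from the command line is sketched below. The `endpoint_matches` helper is hypothetical, and `<controller-pod-name>` stands for the controller pod name taken from the pod listing.

```shell
# Hypothetical helper: succeed when the controller pod IP ($1) appears in the
# space-separated list of endpoint IPs ($2).
endpoint_matches() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires access to the cluster):
#   controller_ip=$(kubectl get pod <controller-pod-name> -n metallb-system \
#     -o jsonpath='{.status.podIP}')
#   endpoint_ips=$(kubectl get endpoints metallb-webhook-service -n metallb-system \
#     -o jsonpath='{.subsets[*].addresses[*].ip}')
#   endpoint_matches "$controller_ip" "$endpoint_ips" && echo "webhook endpoint OK"
```

If the endpoint list is empty or points to a different address, the controller pod is likely not ready, so its status and events are the next place to look.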
Get more information on the failure from the pod logs using the command:

Check pod logs

$ kubectl logs <pod-name> -n metallb-system
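
For a pod in CrashLoopBackOff, the logs of the previous container instance are often more useful than those of the restarting one, and `kubectl logs` accepts a `--previous` flag for that. The sketch below also narrows long logs to lines that mention errors or failures. The `error_lines` helper is hypothetical and uses a plain text match rather than parsing the log format.

```shell
# Hypothetical helper: keep only log lines on stdin that mention an error or
# failure (case-insensitive plain text match).
error_lines() {
  grep -iE 'error|fail'
}

# Usage (requires access to the cluster):
#   kubectl logs <pod-name> -n metallb-system | error_lines
#   kubectl logs <pod-name> -n metallb-system --previous | error_lines
```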

Ensure that the MetalLB VIP/MAC address pair mapping is added on all the worker nodes, because Port Security does not allow MAC address spoofing.
If these steps do not resolve the issue, contact the Platform9 Support Team for additional assistance.

Most common causes:

An incorrect IP range is specified in the IPAddressPools.
L2Advertisement or BGPAdvertisement resources are not created for the IPAddressPools, or the advertisements are misconfigured.
The MetalLB VIP/MAC address pair mapping is missing on some or all of the worker nodes.
Calico is failing. MetalLB relies on Calico's functionality, so check the Calico pod logs for errors.
