Troubleshooting MetalLB Add-on

Troubleshoot MetalLB issues in PCD-K Kubernetes clusters. Learn to verify pod statuses, check IP address pools, validate endpoints, and identify common causes such as misconfigured IP ranges.

Problem

MetalLB provides external load balancing for applications in PCD-K clusters. Because it is an external add-on for PCD-K Kubernetes clusters, a MetalLB failure can significantly impact service availability. This guide describes how to troubleshoot MetalLB add-on issues.

Environment

  • Private Cloud Director - v2025.4 and higher.

  • Kubernetes Cluster - 1.31.2 or higher.

Procedure

  1. MetalLB deploys all of its objects in the metallb-system namespace: a controller pod and one speaker pod per worker node. Verify the pod status in the namespace using the command:

$ kubectl get pods -n metallb-system
  2. If any pods are in the CrashLoopBackOff, OOMKilled, Pending, or Error state, review the Events section in the output of the following command to find out why:

$ kubectl describe pod <pod-name> -n metallb-system
  3. Validate that MetalLB shows the correct IP address pools using the command:

$ kubectl get IPAddressPool -n metallb-system
  4. Verify that the metallb-webhook-service service has the correct endpoint IP address, which should be the IP address of the metallb-controller pod.

  5. Get more information on the failure from the pod logs using the command:

$ kubectl logs <pod-name> -n metallb-system
  6. Ensure that the MetalLB VIP/MAC address pair mapping is added on all the worker nodes, because MAC address spoofing is not allowed by Port Security.

  7. If these steps do not resolve the issue, reach out to the Platform9 Support Team for additional assistance.
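The pod status, webhook endpoint, and log checks from the procedure can be collected into a few shell helpers. The following is a minimal sketch, not part of the official procedure: the function names (`metallb_unhealthy_pods`, `metallb_webhook_matches`, `metallb_recent_errors`) are hypothetical, the `error|fail` log filter is only a rough assumption about what is worth reading first, and `kubectl` is assumed to already point at the affected cluster.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of MetalLB or PCD-K.
NS="metallb-system"

# Print pods that are not Running/Completed, or Running but not fully ready
# (READY column x/y with x != y). Output columns: NAME STATUS READY.
metallb_unhealthy_pods() {
  kubectl get pods -n "$NS" --no-headers |
    awk '{
      split($2, ready, "/")
      if (($3 != "Running" && $3 != "Completed") ||
          ($3 == "Running" && ready[1] != ready[2]))
        print $1, $3, $2
    }'
}

# Succeed if the metallb-webhook-service endpoints include the controller pod IP.
# $1 = controller pod name, as listed by `kubectl get pods -n metallb-system`.
metallb_webhook_matches() {
  pod_ip=$(kubectl get pod "$1" -n "$NS" -o jsonpath='{.status.podIP}')
  endpoint_ips=$(kubectl get endpoints metallb-webhook-service -n "$NS" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  echo "controller pod IP: ${pod_ip:-none}"
  echo "webhook endpoints: ${endpoint_ips:-none}"
  # Word splitting on $endpoint_ips is intentional: one IP per line for grep.
  [ -n "$pod_ip" ] && printf '%s\n' $endpoint_ips | grep -qxF "$pod_ip"
}

# Show log lines mentioning "error" or "fail" from the current container and,
# if the container has restarted, from the previous one.
# $1 = pod name.
metallb_recent_errors() {
  {
    kubectl logs "$1" -n "$NS" --tail=200
    # --previous returns an error when there was no restart; ignore it.
    kubectl logs "$1" -n "$NS" --tail=200 --previous 2>/dev/null
  } | grep -iE 'error|fail'
}
```

After sourcing the file, a typical session might run `metallb_unhealthy_pods`, then `metallb_webhook_matches <controller-pod-name> || echo "endpoint mismatch"`, then `metallb_recent_errors <pod-name>` for each pod reported as unhealthy. The `--previous` flag matters for pods in CrashLoopBackOff, where the current container may not have logged anything yet.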

Most common causes:

  • An incorrect IP range is specified in the IPAddressPool resources.

  • L2Advertisement or BGPAdvertisement resources are not created for the IPAddressPools, or the advertisements are misconfigured.

  • The MetalLB VIP/MAC address pair mapping is missing on some or all of the worker nodes.

  • Calico errors, since MetalLB relies on Calico's functionality. Check the Calico pod logs for errors.
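For the first two causes, it can help to compare the live objects against a known-good shape. The sketch below prints a minimal IPAddressPool and a matching L2Advertisement using the `metallb.io/v1beta1` API. The function name `metallb_example_manifest`, the object names, and the address range (taken from the 192.0.2.0/24 documentation block) are all hypothetical placeholders, not values from any real cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints an illustrative manifest to stdout.
# All names and addresses below are placeholders and must be replaced.
metallb_example_manifest() {
  cat <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: example-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.0.2.10-192.0.2.20
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - example-pool
EOF
}
```

The point to check is that the pool name listed under `spec.ipAddressPools` in the advertisement matches the `metadata.name` of an existing pool, and that `spec.addresses` covers the range intended for load balancer IPs. The live objects can be dumped for comparison with `kubectl get ipaddresspools.metallb.io,l2advertisements.metallb.io -n metallb-system -o yaml`.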
