Pods Deployed Using Multus Plugin Fail to Have Container Added to Network After Upgrading to Whereab

Problem

StatefulSet pods fail to come up with the error below.

Warning  FailedCreatePodSandBox  4m31s (x1188 over 7h15m)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2aaXXXXXXXXXXXXXXXXX42": plugin type="multus" name="multus-cni-network" failed (add): [signaling-cvi/sip-frontend-0:mirror-dhcp-vlan-372-10.165.186.128-25]: error adding container to network "mirror-dhcp-vlan-372-10.165.186.128-25": error at storage engine: k8s get error: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline

After the cluster is upgraded to K8s v1.24, new job pods for the ip-reconciler cronjob are unable to start.

$ kubectl get jobs -n kube-system | grep ip-reconciler
ip-reconciler-28333105                1/1           8s         16d
ip-reconciler-28333110                1/1           8s         16d
ip-reconciler-28333115                1/1           8s         16d
ip-reconciler-28357405                0/1           3m31s      3m31s

A manual run of the job fails because the /ip-reconciler binary is not present in the container image, e.g.

7s         Warning   Failed                 pod/ip-reconciler-manual-bkxwd                Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/ip-reconciler": stat /ip-reconciler: no such file or directory: unknown

Image: docker.io/platform9/whereabouts:v0.6-pmk-6

Command:
      /ip-reconciler
      -kubeconfig=/host/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig
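The image and command above can be read back from the cronjob, and a manual run can be triggered from it. The following is a minimal sketch, assuming the cronjob is named ip-reconciler and lives in kube-system (as the job listing suggests); the function names and the manual job name are illustrative only.

```shell
# Print the image and command configured on the ip-reconciler cronjob.
# Assumption: cronjob "ip-reconciler" in namespace "kube-system".
show_reconciler_spec() {
  kubectl get cronjob ip-reconciler -n kube-system \
    -o jsonpath='{.spec.jobTemplate.spec.template.spec.containers[0].image}{"\n"}{.spec.jobTemplate.spec.template.spec.containers[0].command}{"\n"}'
}

# Create a one-off job from the cronjob template to reproduce the failure.
# The job name "ip-reconciler-manual" is a hypothetical choice.
run_reconciler_manually() {
  kubectl create job ip-reconciler-manual \
    --from=cronjob/ip-reconciler -n kube-system
}
```

After calling run_reconciler_manually, the failure should show up in the events of the resulting pod, for example via kubectl describe pod in the kube-system namespace.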

Environment

  • Platform9 Managed Kubernetes - v5.6 and higher

  • Managed Platform9 Edge Cloud - v5.6 and higher

Answer

This is a known issue. Platform9 has filed Jira AIR-1268 to track it and provide a permanent solution.

Workaround

The workaround is to patch the networkplugins.plumber.k8s.pf9.io CRD on the K8s cluster associated with the respective management plane so that it uses the Whereabouts v0.4.10 image.
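A patch of this kind might look like the sketch below. It is not a confirmed procedure: the field path spec.plugins.whereabouts.whereaboutsImage is an assumption and should be checked against the CRD schema on the cluster (for example with kubectl get networkplugins.plumber.k8s.pf9.io -A -o yaml). The resource name, its namespace, and the exact v0.4.10 image reference are not given in this article, so they are passed in as arguments.

```shell
# Build a JSON merge patch that sets the Whereabouts image.
# ASSUMPTION: the field path spec.plugins.whereabouts.whereaboutsImage is
# illustrative; confirm it against the CRD schema before applying.
build_whereabouts_patch() {
  printf '{"spec":{"plugins":{"whereabouts":{"whereaboutsImage":"%s"}}}}\n' "$1"
}

# Apply the patch to a NetworkPlugins resource.
# Arguments: <namespace> <resource-name> <image-reference>
patch_whereabouts_image() {
  kubectl patch networkplugins.plumber.k8s.pf9.io "$2" -n "$1" \
    --type merge -p "$(build_whereabouts_patch "$3")"
}
```

Usage would be patch_whereabouts_image <namespace> <resource-name> <whereabouts-v0.4.10-image>, with all three values taken from the cluster and from Platform9 support. A merge patch is used because strategic merge patches are not supported for custom resources.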

Additional Information

To track progress on the permanent solution, open a support ticket that mentions Jira ID AIR-1268.
