Flannel CNI Fails to Set Up Pods When Etcd Is Unreachable

Problem

  • Pods fail to start, and the pod events show a warning similar to the following.

Warning FailedCreatePodSandBox 7s (x129 over 4m49s) kubelet, k8s-worker-ba79baeb-6d85-4820-bc80-7e5002ad3ac4000001 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "03903d5782a1c423d722519caeb5de9cc48b24a8529ca0ff1cafdddfcb22d997" network for pod "tiller-deploy-6f8d4f6c9c-mfblm": NetworkPlugin cni failed to set up pod "tiller-deploy-6f8d4f6c9c-mfblm_kube-system" network: open /run/flannel/subnet.env: no such file or directory
  • The Flannel container log reveals that the network configuration cannot be fetched as the etcd cluster endpoint is unavailable.

{"log":"I1009 20:05:26.472067      1 main.go:475] Determining IP address of default interface\n","stream":"stderr","time":"2019-10-09T20:05:26.4723886Z"}
{"log":"I1009 20:05:26.472438      1 main.go:488] Using interface with name eth0 and address 10.0.0.16\n","stream":"stderr","time":"2019-10-09T20:05:26.472585698Z"}
{"log":"I1009 20:05:26.472474      1 main.go:505] Defaulting external address to interface address (10.0.0.16)\n","stream":"stderr","time":"2019-10-09T20:05:26.472632697Z"}
{"log":"2019-10-09 20:05:26.473129 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated\n","stream":"stderr","time":"2019-10-09T20:05:26.473313689Z"}
{"log":"I1009 20:05:26.473184      1 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: None\n","stream":"stderr","time":"2019-10-09T20:05:26.473344789Z"}
{"log":"I1009 20:05:26.473190      1 main.go:238] Installing signal handlers\n","stream":"stderr","time":"2019-10-09T20:05:26.473348489Z"}
{"log":"E1009 20:05:27.473798      1 main.go:349] Couldn't fetch network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 13.64.37.173:4001: i/o timeout\n","stream":"stderr","time":"2019-10-09T20:05:27.474056314Z"}
{"log":"timed out\n","stream":"stdout","time":"2019-10-09T20:05:28.474383118Z"}
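To pull the failing endpoint out of a log like the one above, a small filter along the following lines may help. This is only a sketch: the function name etcd_endpoints_from_log is hypothetical, it assumes the error text matches the format shown above, and it reads the log from standard input rather than from any particular file path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each unique "host:port" that Flannel failed to
# dial, based on the "dial tcp <ip>:<port>" fragment in the error shown above.
# Reads log text on stdin; prints nothing if no matching error is found.
etcd_endpoints_from_log() {
  grep "Couldn't fetch network config" \
    | grep -oE 'dial tcp [0-9.]+:[0-9]+' \
    | awk '{print $3}' \
    | sort -u
}
```

For example, feeding it a saved copy of the Flannel container log (the file name here is a placeholder) would look like: etcd_endpoints_from_log < flannel-container.log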

Environment

  • Platform9 Managed Kubernetes - All Versions

  • Flannel

  • Etcd

Cause

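TCP port 4001 is not reachable on the API/etcd endpoint, so Flannel cannot retrieve its network configuration from etcd. Without that configuration, Flannel does not write /run/flannel/subnet.env, which is the file the CNI plugin fails to open in the warning above.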

pf9@k8s-master-ba79baeb-6d85-4820-bc80-7e5002ad3ac4000003:~$ telnet 13.64.37.173 4001
Trying 13.64.37.173...

Resolution

1. Ensure that TCP port 4001 is listening on the masters and is reachable from the workers.

2. Check for any security group or firewall rules that may be blocking the connection.
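The reachability check in step 1 can be scripted roughly as follows. This is a sketch under stated assumptions: check_tcp is a hypothetical helper name, the 3-second default wait is arbitrary, and it relies on bash's built-in /dev/tcp redirection and the coreutils timeout command, so it works on hosts where telnet is not installed.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP port accepts connections.
# Usage: check_tcp <host> <port> [wait_seconds]
# Returns 0 if the connection opens, 1 if it is refused or times out.
check_tcp() {
  local host="$1" port="$2" wait_secs="${3:-3}"
  # Open (and immediately drop) a connection in a child bash, bounded by timeout.
  if timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "reachable: ${host}:${port}"
  else
    echo "unreachable: ${host}:${port}"
    return 1
  fi
}
```

Run from a worker against the endpoint in the Flannel log, for example check_tcp 13.64.37.173 4001. On a master, a command such as ss -ltn | grep ':4001' can confirm that something is listening on the port. If the port is listening on the master but unreachable from the worker, a security group or firewall rule (step 2) is the likely culprit.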

Additional Information

BareOS – Networking Prerequisites - Network Port Configurations
