Kubectl Exec Command is Failing Intermittently With i/o Timeout Error
Problem
- The kubectl exec command fails intermittently with an i/o timeout error.
$ kubectl exec -it <POD_NAME> -n <NAMESPACE> -- bash
Error from server: error dialing backend: dial tcp [WORKER_NODE_IP]:10250: i/o timeout
- The API server logs on the master node show the corresponding error.
E0611 [TIME_STAMP] status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"error dialing backend: dial tcp [WORKER_NODE_IP]:10250: i/o timeout"}: error dialing backend: dial tcp [WORKER_NODE_IP]:10250: i/o timeout
Environment
- Platform9 Managed Kubernetes - v5.9 and Higher
- Platform9 Self Managed Cloud Platform - v-5.9.2-3199093 and Higher.
Cause
- Not all nodes in the Kubernetes cluster have the net.ipv4.ip_local_reserved_ports setting configured to reserve the NodePort range. This leads to ephemeral port conflicts and service binding failures.
The net.ipv4.ip_local_reserved_ports setting is a Linux sysctl parameter that reserves specific local port numbers so that the kernel does not use them for automatic port assignment on outbound connections.
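A reservation only matters where the kernel's ephemeral port range (net.ipv4.ip_local_port_range) overlaps the NodePort range. The following is a minimal sketch for inspecting one node, not part of the original procedure. It assumes the default NodePort range of 30000-32767 unless other bounds are passed in, and the helper names ranges_overlap and check_node are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: show a node's ephemeral port range and reserved ports, and report
# whether the ephemeral range overlaps the NodePort range.
# Assumption: NodePort range defaults to 30000-32767 (pass other bounds as arguments).

# ranges_overlap LOW1 HIGH1 LOW2 HIGH2
# Succeeds (exit 0) when the two inclusive ranges share at least one port.
ranges_overlap() {
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# check_node [NODEPORT_LOW NODEPORT_HIGH]
# Reads the current kernel values from /proc and prints a short report.
check_node() {
  local np_low=${1:-30000} np_high=${2:-32767}
  local eph_low eph_high reserved
  read -r eph_low eph_high < /proc/sys/net/ipv4/ip_local_port_range
  reserved=$(cat /proc/sys/net/ipv4/ip_local_reserved_ports)
  echo "ephemeral range: ${eph_low}-${eph_high}"
  echo "reserved ports:  ${reserved:-<none>}"
  if ranges_overlap "$eph_low" "$eph_high" "$np_low" "$np_high"; then
    echo "ephemeral range overlaps NodePort range ${np_low}-${np_high}"
  else
    echo "no overlap with NodePort range ${np_low}-${np_high}"
  fi
}
```

Calling check_node with no arguments on a node prints its current values. An overlap combined with an empty reserved list suggests that the node can hand out NodePort numbers as ephemeral ports.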
Resolution
- Ensure consistent reservation of the NodePort range across all cluster nodes (both workers and master nodes) using the net.ipv4.ip_local_reserved_ports sysctl parameter.
Implementation Steps:
- Determine the NodePort range used in your cluster. The default is 30000-32767, unless customized in the Kubernetes API server configuration (the --service-node-port-range flag).
- Apply the reservation on all nodes.
echo "net.ipv4.ip_local_reserved_ports = <NODEPORT_RANGE>" >> /etc/sysctl.conf
sysctl -p
- Confirm the change is active.
sysctl net.ipv4.ip_local_reserved_ports
- Use automation (e.g., Ansible, scripts, or DaemonSets) to enforce this setting across all current and future nodes.
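For automation, the append used in the steps above adds a duplicate line each time it runs. The following is a minimal sketch of an idempotent alternative that a script or configuration management task could call. The helper name set_reserved_ports and the drop-in file path mentioned below are illustrative assumptions, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Sketch: persist the reserved-port setting so that repeated runs leave
# exactly one definition in the target sysctl configuration file.

# set_reserved_ports FILE RANGE
# Drops any earlier net.ipv4.ip_local_reserved_ports line from FILE,
# keeps every other line, and appends a single line for RANGE.
set_reserved_ports() {
  local file=$1 range=$2 tmp
  tmp=$(mktemp) || return 1
  if [ -f "$file" ]; then
    # grep -v copies all lines that do not define this key.
    grep -Ev '^[[:space:]]*net\.ipv4\.ip_local_reserved_ports[[:space:]]*=' "$file" > "$tmp"
  fi
  printf 'net.ipv4.ip_local_reserved_ports = %s\n' "$range" >> "$tmp"
  # Write back through the original path so existing file permissions are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As root on each node, this could be run as set_reserved_ports /etc/sysctl.d/90-nodeport-reserved.conf 30000-32767 followed by sysctl --system to load the file. The range is written with an ASCII hyphen, which is the form the kernel accepts.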
Additional Information
- To learn more about using sysctls in a Kubernetes Cluster, refer to the Official Kubernetes Documentation.