PMK Troubleshooting Guide

Issues by Component

Cluster

  • Cluster: A PMK cluster must be created from onboarded (authorized) nodes. Refer to the table below for issues related to a cluster:
Cluster
Component/Topic | Symptoms/Error Messages | Link to KB Article
BareOS Cluster Creation | Cluster creation fails, UI may show the failing step. | Troubleshooting Cluster Issues
Cluster Creation using Public Cloud Provider (e.g. AWS) | Cluster creation fails, UI may show the failing step. | Troubleshooting Cluster Issues
Etcd Configuration | Heartbeat/Election Timeout Interval; Database Size Exceeded | Troubleshooting Cluster Issues
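
As a rough illustration of the Heartbeat/Election Timeout Interval entry, the sketch below sanity-checks a proposed pair of etcd timing values. It assumes the constraints that upstream etcd applies to `--heartbeat-interval` and `--election-timeout` (election timeout at least 5 times the heartbeat interval, and no more than 50000 ms). The function name `check_etcd_timeouts` is hypothetical, and the values PMK actually configures may differ, so the linked KB article remains the authoritative reference.

```python
from typing import List

# Upper bound on the election timeout in milliseconds
# (assumption: taken from upstream etcd, not from PMK).
MAX_ELECTION_MS = 50_000


def check_etcd_timeouts(heartbeat_ms: int, election_ms: int) -> List[str]:
    """Return a list of problems; an empty list means the pair looks sane."""
    problems = []
    if heartbeat_ms <= 0:
        problems.append("heartbeat interval must be positive")
    if election_ms < 5 * heartbeat_ms:
        problems.append(
            f"election timeout {election_ms}ms is less than 5x "
            f"heartbeat interval {heartbeat_ms}ms"
        )
    if election_ms > MAX_ELECTION_MS:
        problems.append(
            f"election timeout {election_ms}ms exceeds {MAX_ELECTION_MS}ms"
        )
    return problems


if __name__ == "__main__":
    # Example pairs: (heartbeat, election), both in milliseconds.
    for pair in [(100, 1000), (500, 1000)]:
        issues = check_etcd_timeouts(*pair)
        print(pair, "OK" if not issues else "; ".join(issues))
```

A check like this only catches inconsistent settings. Choosing good values still depends on the round-trip time between the master nodes.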

Nodes

  • Nodes: Linux servers are configured by PMK before they can be used to create a cluster. The configuration process includes installing PMK-specific packages and verifying other prerequisites.
Nodes
Component/Topic | Symptoms/Error Messages | Link to KB Article
VIP association on Master nodes | VIP Not Routable from Other Masters (Misconfigured) | Troubleshooting Node Issues
Node Preparation / Onboarding / Node Not Converged | Incompatible Package Version(s) | Troubleshooting Node Issues
Clock Skew | PF9 Host agent fails to generate certificates; error message in hostagent.log: “Unable to vouch URL …” | Troubleshooting Node Issues

Pods

  • Pods: While deploying workloads to Kubernetes (PMK), you may encounter issues starting pods for deployments. If the dashboard (UI) reports an unhealthy workload, refer to the table below:
Pods
Component/Topic | Symptoms/Error Messages | Link to KB Article
Pods / Deployments | Error: ImagePullBackOff | Troubleshooting Pod Issues
Node Preparation / Onboarding / Node Not Converged | Error: CrashLoopBackOff | Troubleshooting Pod Issues

Networking

  • Network: Issues that appear to come from Kubernetes workloads may be caused by problems in the underlying network. Refer to the table below for known networking issues:
Network
Component/Topic | Symptoms/Error Messages | Link to KB Article
DNS | Errors due to domain name/host name resolution failure | Troubleshooting Network Issues
Calico CNI | Pod Networking broken / Kernel IP Forwarding not enabled on host | Troubleshooting Network Issues
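
For the Kernel IP Forwarding symptom, a quick check on the affected host can rule the setting in or out before digging further. The sketch below reads the standard Linux procfs entry `/proc/sys/net/ipv4/ip_forward`, which contains `1` when IPv4 forwarding is enabled. The helper names `parse_ip_forward` and `ip_forwarding_enabled` are hypothetical and not part of PMK.

```python
from pathlib import Path

# Standard Linux location of the IPv4 forwarding switch.
IP_FORWARD_PATH = Path("/proc/sys/net/ipv4/ip_forward")


def parse_ip_forward(text: str) -> bool:
    """Interpret the file content: '1' means forwarding is enabled."""
    return text.strip() == "1"


def ip_forwarding_enabled(path: Path = IP_FORWARD_PATH) -> bool:
    """Read the procfs entry on the local host and report its state."""
    return parse_ip_forward(path.read_text())


if __name__ == "__main__":
    try:
        state = "enabled" if ip_forwarding_enabled() else "disabled"
    except OSError as exc:
        # Not a Linux host, or procfs is not readable.
        state = f"unknown ({exc})"
    print(f"IPv4 forwarding: {state}")
```

If the check reports that forwarding is disabled, follow the linked KB article for the supported way to enable it on PMK nodes.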

Applications

  • Applications: Applications can be deployed using the Apps Catalog tab of the Apps Dashboard. Applications can only be deployed to clusters that have been registered with a repository. The table below outlines some common issues when deploying or managing apps.
Applications
Component/Topic | Symptoms/Error Messages | Link to KB Article
MetalLB | MetalLB is configured but doesn’t work. | Troubleshooting Application Issues

CLI

  • CLI: Troubleshooting steps for known issues with command-line interface clients.
CLI
Component/Topic | Symptoms/Error Messages | Link to KB Article
Kubectl | API Server Unreachable; Invalid Token in Kubeconfig | Troubleshooting CLI Issues
Etcdctl | Incorrect Endpoint(s); Certificates Not Specified | Troubleshooting CLI Issues
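
Both Etcdctl symptoms above usually come down to an incomplete command line. The sketch below assembles an `etcdctl` invocation with the endpoints and TLS material stated explicitly, using the etcdctl v3 flags `--endpoints`, `--cacert`, `--cert`, and `--key`. The endpoint and certificate paths are placeholders, and `build_etcdctl_cmd` is a hypothetical helper. Substitute the real locations for your cluster; the source does not state where PMK keeps these files.

```python
import shlex
from typing import List, Sequence


def build_etcdctl_cmd(
    endpoints: Sequence[str],
    cacert: str,
    cert: str,
    key: str,
    args: Sequence[str],
) -> List[str]:
    """Build an etcdctl (v3 API) command with explicit endpoints and certs."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    return [
        "etcdctl",
        "--endpoints=" + ",".join(endpoints),
        "--cacert=" + cacert,
        "--cert=" + cert,
        "--key=" + key,
        *args,
    ]


if __name__ == "__main__":
    cmd = build_etcdctl_cmd(
        endpoints=["https://127.0.0.1:2379"],  # placeholder endpoint
        cacert="/path/to/ca.crt",              # placeholder paths
        cert="/path/to/client.crt",
        key="/path/to/client.key",
        args=["endpoint", "health"],
    )
    # Print a copy-pasteable shell command rather than executing it.
    print(shlex.join(cmd))
```

On older etcd releases, the v3 flags may also require setting `ETCDCTL_API=3` in the environment before running the printed command.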

AWS EC2 Clusters

  • Troubleshooting guidance for known issues on clusters using the AWS cloud provider:
AWS
Component/Topic | Symptoms/Error Messages | Link to KB Article
Instance registration on Elastic Load Balancer | Elastic Load Balancer (ELB) Shows No Active Instances | AWS Troubleshooting
Node Preparation / Onboarding / Node Not Converged | NodePort Service Isn't Externally Reachable | AWS Troubleshooting
  Last updated by Anmol Sachan