# calico-kube-controllers Pod Restarts Frequently Due to OOM (Memory Exhaustion)

## Problem

The calico-kube-controllers pod restarts frequently because its container is OOM-killed (memory exhaustion) and exits with code 137. The readiness probe also times out repeatedly:

{% tabs %}
{% tab title="Calico-kube-controller pod describe output during time of issue" %}

```shell
% kubectl -n kube-system describe pod calico-kube-controllers-6f4d4c87cf-pnxbx
Name:                 calico-kube-controllers-6f4d4c87cf-pnxbx
...
Status:               Running
...
Controlled By:  ReplicaSet/calico-kube-controllers-6f4d4c87cf
Containers:
  calico-kube-controllers:
    Image:          calico/kube-controllers:v3.23.5
    State:          Running
      Started:      Wed, 04 Oct 2023 17:14:57 +0530
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 04 Oct 2023 17:02:07 +0530
      Finished:     Wed, 04 Oct 2023 17:14:56 +0530
    Ready:          True
    Restart Count:  244
    Limits:
      cpu:     200m
      memory:  400Mi
    Requests:
      cpu:      1m
      memory:   25Mi
....
Events:
  Type     Reason     Age                      From     Message
  ----     ------     ----                     ----     -------
  Normal   Created    41m (x244 over 83d)      kubelet  Created container calico-kube-controllers
  Normal   Pulled     28m (x245 over 83d)      kubelet  Container image "calico/kube-controllers:v3.23.5" already present on machine
  Warning  Unhealthy  2m23s (x10692 over 83d)  kubelet  Readiness probe failed: command "/usr/bin/check-status -r" timed out
```

{% endtab %}
{% endtabs %}

## Environment

* Platform9 Managed Kubernetes - v5.6.8.
* Kubernetes version 1.23.8.

## Answer

This is a known issue, tracked in Jira as PMK-6180. The fix will be available in an upcoming patch release.

## Workaround

Increase the readiness probe timeout from 1 second to 10 seconds, and raise the memory limit on the calico-kube-controllers container from 400Mi to 2000Mi (approximately 2Gi).

Before modification:

{% tabs %}
{% tab title="Before modifying calico-kube-controller deployment" %}

```yaml
% kubectl get deployment calico-kube-controllers -n kube-system -o yaml
...
        livenessProbe:
          exec:
            command:
            - /usr/bin/check-status
            - -l
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        name: calico-kube-controllers
        readinessProbe:
          exec:
            command:
            - /usr/bin/check-status
            - -r
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
```

{% endtab %}
{% endtabs %}

Modify the calico-kube-controllers deployment using the command below:

{% tabs %}
{% tab title="Edit deployment calico-kube-controllers" %}

```shell
% kubectl edit deployment calico-kube-controllers -n kube-system
```

{% endtab %}
{% endtabs %}
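As a non-interactive alternative to `kubectl edit`, the same two changes can be sketched as a strategic merge patch. This is a minimal sketch, not an official Platform9 procedure: the function names `build_patch` and `apply_patch` and the `APPLY` variable are hypothetical, and the container name is taken from the deployment YAML shown above. A strategic merge patch merges containers by `name`, so only the listed fields should change.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit a strategic merge patch that sets only the
# readiness probe timeout and the memory limit of the named container.
build_patch() {
  cat <<'EOF'
{"spec":{"template":{"spec":{"containers":[{"name":"calico-kube-controllers","readinessProbe":{"timeoutSeconds":10},"resources":{"limits":{"memory":"2000Mi"}}}]}}}}
EOF
}

# Hypothetical helper: apply the patch, then wait for the rollout to complete.
apply_patch() {
  kubectl -n kube-system patch deployment calico-kube-controllers \
    --type strategic -p "$(build_patch)"
  kubectl -n kube-system rollout status deployment/calico-kube-controllers --timeout=120s
}

# Nothing touches the cluster unless APPLY=1 is set explicitly.
if [ "${APPLY:-0}" = "1" ]; then
  apply_patch
fi
```

Saved to a file (for example a hypothetical `patch-calico.sh`), the script could be run as `APPLY=1 bash patch-calico.sh`. Without `APPLY=1` it only defines the helpers, which makes it easy to inspect the patch with `build_patch` first.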

After modification:

{% tabs %}
{% tab title="After modifying calico-kube-controller deployment" %}

```yaml
        livenessProbe:
          exec:
            command:
            - /usr/bin/check-status
            - -l
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        name: calico-kube-controllers

        readinessProbe:
          exec:
            command:
            - /usr/bin/check-status
            - -r
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10

        resources:
          limits:
            cpu: 200m
            memory: 2000Mi
          requests:
            cpu: 1m
            memory: 25Mi
```

{% endtab %}
{% endtabs %}
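Once the deployment has been modified, the new settings and the pod's restart state can be checked from the command line. The following is a sketch under stated assumptions: the function names are hypothetical, and the label selector `k8s-app=calico-kube-controllers` is assumed from the standard Calico manifest, so confirm it with `kubectl -n kube-system get pods --show-labels` before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the readiness probe timeout and memory limit
# currently configured on the calico-kube-controllers container.
show_settings() {
  kubectl -n kube-system get deployment calico-kube-controllers -o \
    jsonpath='{.spec.template.spec.containers[?(@.name=="calico-kube-controllers")].readinessProbe.timeoutSeconds}{"\t"}{.spec.template.spec.containers[?(@.name=="calico-kube-controllers")].resources.limits.memory}{"\n"}'
}

# Hypothetical helper: print pod name, restart count, and last termination
# reason (empty if the container has not been terminated).
# ASSUMPTION: pods carry the label k8s-app=calico-kube-controllers.
show_restarts() {
  kubectl -n kube-system get pods -l k8s-app=calico-kube-controllers -o \
    jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}
```

Calling `show_settings` should report the values set in the workaround. Because changing the pod template rolls out a new pod, the restart count reported by `show_restarts` starts again from zero; a count that stays low and a last termination reason that is not `OOMKilled` suggest the workaround is holding.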

## Additional Information

This is a known bug, tracked under Jira ID PMK-6180.

