A refresher on requests and limits

When writing a pod manifest, it is recommended to declare requests and limits for both CPU and memory, and especially for memory: a container that exceeds its CPU limit is merely throttled, whereas one that exceeds its memory limit is OOMKilled and the pod is restarted.
See the article that explains this.
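For illustration, a container spec carrying both kinds of constraints might look like this (a minimal sketch; the values are placeholders, not a recommendation):

```yaml
resources:
  requests:          # what the scheduler reserves for the container
    cpu: "250m"
    memory: "128Mi"
  limits:            # hard ceilings enforced at runtime
    cpu: "500m"      # exceeded: the container is throttled
    memory: "256Mi"  # exceeded: the container is OOMKilled
```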

Hands-on scenario

Namespace manifest:

apiVersion: v1
kind: Namespace
metadata:
  name: limits-oom
  labels:
    name: limits-oom

Pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: oom-demo
  namespace: limits-oom
spec:
  containers:
  - name: oom-eater
    image: python:3.11-slim
    command: ["python3", "-c"]
    args:
    - |
      import time
      print("Starting...", flush=True)
      data = []
      i = 0
      while True:
          data.append("x" * 10 * 1024 * 1024)  # +10 MiB per second
          i += 1
          print(f"{i * 10} MiB allocated", flush=True)
          time.sleep(1)
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"

The pod allocates 10 MiB every second, reaches the 100Mi memory limit after about 10 seconds, and is then OOMKilled and restarted.
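A quick back-of-the-envelope check of that timing (ignoring the Python interpreter's own footprint, which makes the kill arrive slightly sooner):

```python
# The loop allocates one 10 MiB string per second; the container limit is 100 Mi.
alloc_per_second_mib = 10
limit_mib = 100

seconds_to_oom = limit_mib / alloc_per_second_mib
print(f"~{seconds_to_oom:.0f}s before the memory limit is hit")  # ~10s
```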

Apply the manifests:

kubectl apply -f ./

Watching the behavior in real time

Watch the pod's state changes live:

kubectl get pod oom-demo -w -n limits-oom
NAME       READY   STATUS      RESTARTS      AGE
oom-demo   0/1     OOMKilled   1 (20s ago)   40s
oom-demo   0/1     CrashLoopBackOff   1 (14s ago)   44s
oom-demo   1/1     Running            2 (15s ago)   45s
oom-demo   0/1     OOMKilled          2 (24s ago)   54s
oom-demo   0/1     CrashLoopBackOff   2 (13s ago)   66s
oom-demo   1/1     Running            3 (26s ago)   79s
oom-demo   0/1     OOMKilled          3 (35s ago)   88s
oom-demo   0/1     CrashLoopBackOff   3 (15s ago)   103s
oom-demo   1/1     Running            4 (44s ago)   2m12s
oom-demo   0/1     OOMKilled          4 (53s ago)   2m21s
oom-demo   0/1     CrashLoopBackOff   4 (14s ago)   2m34s
oom-demo   1/1     Running            5 (83s ago)   3m43s
oom-demo   0/1     OOMKilled          5 (92s ago)   3m52s
oom-demo   0/1     CrashLoopBackOff   5 (15s ago)   4m6s
oom-demo   1/1     Running            6 (2m50s ago)   6m41s
oom-demo   0/1     OOMKilled          6 (2m59s ago)   6m50s
oom-demo   0/1     CrashLoopBackOff   6 (16s ago)     7m5s
oom-demo   1/1     Running            7 (5m11s ago)   12m
oom-demo   0/1     OOMKilled          7 (5m20s ago)   12m
oom-demo   0/1     CrashLoopBackOff   7 (13s ago)     12m

or

kubectl describe pod -n limits-oom oom-demo
Name:             oom-demo
Namespace:        limits-oom
Priority:         0
Service Account:  default
Node:             minikube/192.168.49.2
Start Time:       Fri, 24 Apr 2026 22:47:50 +0200
Labels:           <none>
Annotations:      <none>
Status:           Running
IP:               10.244.0.3
IPs:
  IP:  10.244.0.3
Containers:
  oom-eater:
    Container ID:  docker://2757593635c2688e74ef0e2636f60553224a33b4b014f2a82e55df45bb7bfbec
    Image:         python:3.11-slim
    Image ID:      docker-pullable://python@sha256:6d85378d88a19cd4d76079817532d62232be95757cb45945a99fec8e8084b9c2
    Port:          <none>
    Host Port:     <none>
    Command:
      python3
      -c
    Args:
      import time
      print("Starting...", flush=True)
      data = []
      i = 0
      while True:
          data.append("x" * 10 * 1024 * 1024)  # +10 MiB per second
          i += 1
          print(f"{i * 10} MiB allocated", flush=True)
          time.sleep(1)
      
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Fri, 24 Apr 2026 23:10:20 +0200
      Finished:     Fri, 24 Apr 2026 23:10:29 +0200
    Ready:          False
    Restart Count:  9
    Limits:
      memory:  100Mi
    Requests:
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mszdv (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  kube-api-access-mszdv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  24m                   default-scheduler  Successfully assigned limits-oom/oom-demo to minikube
  Normal   Pulling    24m                   kubelet            Pulling image "python:3.11-slim"
  Normal   Pulled     24m                   kubelet            Successfully pulled image "python:3.11-slim" in 9.681s (9.681s including waiting). Image size: 124470789 bytes.
  Normal   Created    22m (x5 over 24m)     kubelet            Created container oom-eater
  Normal   Started    22m (x5 over 24m)     kubelet            Started container oom-eater
  Normal   Pulled     22m (x4 over 23m)     kubelet            Container image "python:3.11-slim" already present on machine
  Warning  BackOff    4m14s (x85 over 23m)  kubelet            Back-off restarting failed container oom-eater in pod oom-demo_limits-oom(4c0825da-96a8-4a7a-9b7d-7a3574e36451)
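In the Last State block above, Exit Code 137 is 128 plus the signal number: the container was killed with SIGKILL (9), which is how the kernel's OOM killer terminates a process. The exit code can be decoded in Python:

```python
import signal

exit_code = 137  # as reported by kubectl describe
sig = signal.Signals(exit_code - 128)  # 137 - 128 = 9
print(sig.name)  # SIGKILL
```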

We can see OOMKilled and CrashLoopBackOff alternating.
Kubernetes waits exponentially longer before each restart:

Restart 1 → 10s wait
Restart 2 → 20s wait
Restart 3 → 40s wait
Restart 4 → 80s wait
Restart 5 → 160s wait
...
Max      → 300s (5 min) wait; it then stays there, restarting every 300s

Note: Kubernetes resets the backoff once the pod has been running for 10 minutes without errors.
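The schedule above can be modeled as a simple doubling sequence capped at 300s (a simplified sketch of the kubelet's behavior, not its actual implementation):

```python
def backoff_schedule(restarts, base=10, cap=300):
    """Delays (in seconds) before each restart: doubled every time, capped at `cap`."""
    delays, delay = [], base
    for _ in range(restarts):
        delays.append(min(delay, cap))
        delay *= 2
    return delays

print(backoff_schedule(7))  # [10, 20, 40, 80, 160, 300, 300]
```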
Kubernetes Pod Lifecycle