Running out of resources in your Kubernetes cluster is a familiar foe for any K8s warrior. The dreaded “insufficient node resources” message leaves you facing a chaotic battleground of stalled pods, frustrated users, and a performance dip so steep it could rival a ski slope. But fear not, brave adventurer! This guide will equip you with the tools and strategies to navigate this perilous terrain and emerge victorious.
When a node can't satisfy a pod's request, the failure usually shows up in the pod's events (kubectl describe pod), for example:
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  19m   default-scheduler  Successfully assigned argo/parallel-jobxx-vsd25-123213123 to 10.84.103.96
  Warning  OutOfcpu   19m   kubelet            Node didn't have enough resource: cpu, requested: 310, used: 3655, capacity: 3910
Step 1: Scouting the Battlefield
- Symptoms: Pods stuck in “Pending” limbo, containers crashing from OOM (Out-of-Memory) kills, performance as sluggish as a snail in molasses, and overall cluster instability.
- Tools (see the example commands below):
  - kubectl get pods: Identify pods stuck in “Pending” purgatory.
  - kubectl describe nodes: Survey resource availability on each node, like a scout inspecting the land.
  - kubectl top nodes: Monitor resource utilization in real-time, keeping an eye on the enemy’s movements.
  - Cluster monitoring dashboards (Prometheus, Grafana) offer a bird’s-eye view of the battlefield.
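For example, a first pass at scouting might look like this (kubectl top needs the metrics-server add-on, and the Pending filter is optional):
# List pods stuck in Pending across all namespaces
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
# Inspect capacity, allocatable resources, and allocated requests/limits per node
kubectl describe nodes
# Live CPU/memory usage per node (requires metrics-server)
kubectl top nodes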
Step 2: Analyzing the Enemy’s Tactics
- CPU, memory, and storage utilization: Are these resources approaching critical levels on any nodes? Spikes or sustained high levels are red flags; the node output and usage commands below show where to look.
# kubectl describe nodes
[...]
Capacity:
  cpu:                4
  ephemeral-storage:  61255492Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2038904Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  56453061334
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             1936504Ki
  pods:               110
[...]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                850m (20%)  0 (0%)
  memory             340Mi (7%)  340Mi (17%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
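Allocatable is what the scheduler can hand out after system reservations, and Allocated resources shows how much of it is already claimed. To see which workloads are actually consuming resources (not just requesting them), metrics-server can also sort pods by usage, for example:
# Heaviest memory consumers across all namespaces (requires metrics-server)
kubectl top pods --all-namespaces --sort-by=memory
# The same, sorted by CPU
kubectl top pods --all-namespaces --sort-by=cpu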
- Pod resource requests and limits: Are pods requesting more than available? Are requests realistic or inflated? The manifest below shows a pod whose memory request no node can satisfy.
apiVersion: v1
kind: Pod
metadata:
  name: high-mem
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: lets-break-pod-with-high-mem
    resources:
      requests:
        memory: "1000Gi"
- Eviction thresholds: Are pods being evicted because nodes crossed their eviction thresholds? That’s a strong clue that resources are overcommitted (see the commands below).
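To spot evictions, check recent events and any leftover Failed pods:
# Recent eviction events
kubectl get events --all-namespaces --field-selector reason=Evicted
# Evicted pods remain in Failed phase until cleaned up
kubectl get pods --all-namespaces --field-selector=status.phase=Failed
The thresholds that trigger these evictions live in the kubelet configuration; the snippet below uses the common defaults purely as an illustration, not a recommendation:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"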
Step 3: Deploying Countermeasures
Over-requested resources:
- Adjust Pod resource requests and limits: Scale back demands to match actual workload needs, as shown in the manifest below. Think of it as rationing supplies for efficient use.
- Optimize resource usage within containers: Identify resource-intensive processes and trim the fat. Like a skilled warrior, make every resource count.
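As a sketch of what that rationing might look like, here is a hypothetical Deployment with right-sized requests and limits; the name, image, and numbers are placeholders, and real values should come from observed usage (kubectl top, Prometheus):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api              # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
      - name: web-api
        image: example/web-api:1.0   # placeholder image
        resources:
          requests:
            cpu: "250m"       # roughly the observed steady-state usage
            memory: "256Mi"
          limits:
            cpu: "500m"       # headroom for bursts
            memory: "512Mi"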
Overloaded nodes:
- Scale down deployments: Reduce the number of pods, lessening the pressure on nodes (as shown below). Think of it as retreating to regroup and strategize.
- Add more nodes to the cluster: Expand your territory and increase overall resource capacity. Think of it as building reinforcements.
- Investigate external resource hogs: Are processes outside of Kubernetes pods consuming resources? Identify and eliminate them like hidden snipers.
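Scaling down a hypothetical deployment is a one-liner; adding nodes depends on your environment (cloud node pools, the cluster autoscaler, or new machines in a self-managed cluster):
# Reduce replicas on a hypothetical deployment to ease the pressure on nodes
kubectl scale deployment web-api --replicas=2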
Continue Reading on https://foxutech.com/how-to-troubleshoot-kubernetes-insufficient-node-resources/
If you like our posts, don't forget to subscribe and share with your friends.
You can subscribe to us on https://www.youtube.com/@FoxuTech