How to Troubleshoot Kubernetes Insufficient Node Resources

FoxuTech
3 min read · Dec 23, 2023


Running out of resources in your Kubernetes cluster is a familiar foe for any K8s warrior. The dreaded “insufficient node resources” message leaves you facing a chaotic battleground of stalled pods, frustrated users, and a performance dip so steep it could rival a ski slope. But fear not, brave adventurer! This guide will equip you with the tools and strategies to navigate this perilous terrain and emerge victorious.

A typical symptom shows up in the pod's events (kubectl describe pod). For example:

Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  19m   default-scheduler  Successfully assigned argo/parallel-jobxx-vsd25-123213123 to 10.84.103.96
  Warning  OutOfcpu   19m   kubelet            Node didn't have enough resource: cpu, requested: 310, used: 3655, capacity: 3910

Step 1: Scouting the Battlefield

  • Symptoms: Pods stuck in “Pending” limbo, containers crashing from OOM (Out-of-Memory) kills, performance as sluggish as a snail in molasses, and overall cluster instability.
  • Tools (a combined sweep is sketched after this list):
    • kubectl get pods: Identify pods stuck in “Pending” purgatory.
    • kubectl describe nodes: Survey resource availability on each node, like a scout inspecting the land.
    • kubectl top nodes: Monitor resource utilization in real time, keeping an eye on the enemy’s movements.
    • Cluster monitoring dashboards (Prometheus, Grafana) offer a bird’s-eye view of the battlefield.
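
Put together, a first sweep might look like the following. This assumes cluster-wide read access and, for kubectl top, that the metrics-server add-on is installed; the --field-selector flag simply narrows the listing to pods stuck in Pending.

# kubectl get pods --all-namespaces --field-selector=status.phase=Pending
# kubectl describe nodes
# kubectl top nodes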

Step 2: Analyzing the Enemy’s Tactics

  • CPU, memory, and storage utilization: Are these resources approaching critical levels on any nodes? Spikes or sustained high levels are red flags.
# kubectl describe nodes
[...]
Capacity:
  cpu:                4
  ephemeral-storage:  61255492Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2038904Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  56453061334
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             1936504Ki
  pods:               110
[...]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                850m (20%)  0 (0%)
  memory             340Mi (7%)  340Mi (17%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
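
If you only need the allocation summary, one way to pull just that section for every node (assuming GNU grep on your workstation) is:

# kubectl describe nodes | grep -A 9 "Allocated resources:"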
  • Pod resource requests and limits: Are pods requesting more than what is available? Are requests realistic or inflated? The deliberately oversized request below, for example, can never be satisfied by the node shown above:
apiVersion: v1
kind: Pod
metadata:
  name: high-mem
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: lets-break-pod-with-high-mem
    resources:
      requests:
        memory: "1000Gi"
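
To spot inflated requests like this across the whole cluster, one option is kubectl's custom-columns output (the column names below are arbitrary):

# kubectl get pods --all-namespaces -o custom-columns='NAMESPACE:.metadata.namespace,POD:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'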
  • Eviction thresholds: Are pods being evicted because nodes are hitting memory or disk pressure thresholds? Frequent evictions are a strong clue that resources are overcommitted.
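
Evictions surface as cluster events; a quick way to list recent ones (events are typically retained for only about an hour) is:

# kubectl get events --all-namespaces --field-selector=reason=Evicted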

Step 3: Deploying Countermeasures

Over-requested resources:

  • Adjust Pod resource requests and limits: Scale back demands to match actual workload needs. Think of it as rationing supplies for efficient use.
  • Optimize resource usage within containers: Identify resource-intensive processes and trim the fat. Like a skilled warrior, make every resource count.
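
As a sketch, right-sizing the earlier high-mem pod might look like this; the numbers are placeholders and should come from observed usage (kubectl top pod or your monitoring dashboards):

apiVersion: v1
kind: Pod
metadata:
  name: high-mem
spec:
  containers:
  - name: lets-break-pod-with-high-mem
    image: busybox
    command:
    - sleep
    - "3600"
    resources:
      requests:
        cpu: "100m"      # placeholder: base on observed usage
        memory: "128Mi"  # placeholder: base on observed usage
      limits:
        cpu: "250m"
        memory: "256Mi"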

Overloaded nodes:

  • Scale down deployments: Reduce the number of pods, lessening the pressure on nodes. Think of it as retreating to regroup and strategize.
  • Add more nodes to the cluster: Expand your territory and increase overall resource capacity. Think of it as building reinforcements.
  • Investigate external resource hogs: Are processes outside of Kubernetes pods consuming resources? Identify and eliminate them like hidden snipers.
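
For example, scaling a deployment down is a one-liner (the deployment name and replica count here are placeholders), and a quick process check run on the node itself, e.g. over SSH, helps spot non-Kubernetes resource hogs:

# kubectl scale deployment my-app --replicas=2
# ps aux --sort=-%mem | head -n 10

Adding nodes is usually handled through your cloud provider's node pool settings or the Cluster Autoscaler rather than through kubectl itself.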

Continue Reading on https://foxutech.com/how-to-troubleshoot-kubernetes-insufficient-node-resources/

If you like our posts, don't forget to subscribe and share with your friends.

You can subscribe to us on https://www.youtube.com/@FoxuTech

Follow us on Twitter & Instagram
