Running out of resources in your Kubernetes cluster is a familiar foe for any K8s warrior. The dreaded “insufficient node resources” message leaves you facing a chaotic battleground of stalled pods, frustrated users, and a performance dip so steep it could rival a ski slope. But fear not, brave adventurer! This guide will equip you with the tools and strategies to navigate this perilous terrain and emerge victorious.
When a node can't satisfy a pod's request, the failure usually shows up in the pod's events (kubectl describe pod), for example:
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  19m   default-scheduler  Successfully assigned argo/parallel-jobxx-vsd25-123213123 to 10.84.103.96
  Warning  OutOfcpu   19m   kubelet            Node didn't have enough resource: cpu, requested: 310, used: 3655, capacity: 3910
Step 1: Scouting the Battlefield
- Symptoms: Pods stuck in “Pending” limbo, containers crashing from OOM (Out-of-Memory) kills, performance as sluggish as a snail in molasses, and overall cluster instability.
- Tools (see the example commands below):
  - kubectl get pods: Identify pods stuck in “Pending” purgatory.
  - kubectl describe nodes: Survey resource availability on each node, like a scout inspecting the land.
  - kubectl top nodes: Monitor resource utilization in real-time, keeping an eye on the enemy’s movements.
  - Cluster monitoring dashboards (Prometheus, Grafana) offer a bird’s-eye view of the battlefield.
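For example, a first pass at scouting might look like this (kubectl top needs the metrics-server add-on, and the Pending filter is optional):
# List pods stuck in Pending across all namespaces
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
# Inspect capacity, allocatable resources, and allocated requests/limits per node
kubectl describe nodes
# Live CPU/memory usage per node (requires metrics-server)
kubectl top nodes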
Step 2: Analyzing the Enemy’s Tactics
- CPU, memory, and storage utilization: Are these resources approaching critical levels on any nodes? Spikes or sustained high levels are red flags; the node output and usage commands below show where to look.
# kubectl describe nodes
[...]
Capacity:
  cpu:                4
  ephemeral-storage:  61255492Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2038904Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  56453061334
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             1936504Ki
  pods:               110
[...]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                850m (20%)  0 (0%)
  memory             340Mi (7%)  340Mi (17%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
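Allocatable is what the scheduler can hand out after system reservations, and Allocated resources shows how much of it is already claimed. To see which workloads are actually consuming resources (not just requesting them), metrics-server can also sort pods by usage, for example:
# Heaviest memory consumers across all namespaces (requires metrics-server)
kubectl top pods --all-namespaces --sort-by=memory
# The same, sorted by CPU
kubectl top pods --all-namespaces --sort-by=cpu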
- Pod resource requests and limits: Are pods requesting more than available? Are requests realistic or inflated? The manifest below shows a pod whose memory request no node can satisfy.
apiVersion: v1
kind: Pod
metadata:
  name: high-mem
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: lets-break-pod-with-high-mem
    resources:
      requests:
        memory: "1000Gi"
- Eviction thresholds: Are pods being evicted because nodes crossed their eviction thresholds? That’s a strong clue that resources are overcommitted (see the commands below).
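To spot evictions, check recent events and any leftover Failed pods:
# Recent eviction events
kubectl get events --all-namespaces --field-selector reason=Evicted
# Evicted pods remain in Failed phase until cleaned up
kubectl get pods --all-namespaces --field-selector=status.phase=Failed
The thresholds that trigger these evictions live in the kubelet configuration; the snippet below uses the common defaults purely as an illustration, not a recommendation:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"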
Step 3: Deploying Countermeasures
Over-requested resources:
- Adjust Pod resource requests and limits: Scale back demands to match actual workload needs, as shown in the manifest below. Think of it as rationing supplies for efficient use.
- Optimize resource usage within containers: Identify resource-intensive processes and trim the fat. Like a skilled warrior, make every resource count.
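As a sketch of what that rationing might look like, here is a hypothetical Deployment with right-sized requests and limits; the name, image, and numbers are placeholders, and real values should come from observed usage (kubectl top, Prometheus):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api              # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
      - name: web-api
        image: example/web-api:1.0   # placeholder image
        resources:
          requests:
            cpu: "250m"       # roughly the observed steady-state usage
            memory: "256Mi"
          limits:
            cpu: "500m"       # headroom for bursts
            memory: "512Mi"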
Overloaded nodes:
- Scale down deployments: Reduce the number of pods, lessening the pressure on nodes (as shown below). Think of it as retreating to regroup and strategize.
- Add more nodes to the cluster: Expand your territory and increase overall resource capacity. Think of it as building reinforcements.
- Investigate external resource hogs: Are processes outside of Kubernetes pods consuming resources? Identify and eliminate them like hidden snipers.
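Scaling down a hypothetical deployment is a one-liner; adding nodes depends on your environment (cloud node pools, the cluster autoscaler, or new machines in a self-managed cluster):
# Reduce replicas on a hypothetical deployment to ease the pressure on nodes
kubectl scale deployment web-api --replicas=2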
Continue Reading on https://foxutech.com/how-to-troubleshoot-kubernetes-insufficient-node-resources/
If you like our posts, don't forget to subscribe and share with your friends.
You can subscribe to us on https://www.youtube.com/@FoxuTech