In the last article we looked at the ImagePullBackOff error; now let’s take a look at another familiar error in Kubernetes. If you work with Kubernetes, this is one of those annoying errors you may run into multiple times. The error is Kubernetes
CrashLoopBackOff. It is one of the most common errors in Kubernetes and indicates that a container in a pod starts, crashes, and is restarted again and again in a loop, so the pod never reaches a stable running state.
In this post we will see why the
CrashLoopBackOff error occurs, how to identify the cause of the issue, and how you can solve it.
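To make the "BackOff" part of the name concrete: the kubelet does not restart a crashing container immediately every time. The Kubernetes documentation describes an exponential back-off delay (10s, 20s, 40s, and so on) that is capped at five minutes. The sketch below reproduces that schedule in shell. The function name backoff_delay is made up for illustration, and the numbers are the documented defaults, which may differ on clusters with non-default settings.

```shell
# Illustrative only: approximate delay (in seconds) the kubelet waits
# before the Nth restart, using the documented defaults:
# start at 10s, double each time, cap at 300s (5 minutes).
backoff_delay() {
  local restart="$1"
  local delay=10
  local i=1
  while [ "$i" -lt "$restart" ]; do
    delay=$((delay * 2))
    if [ "$delay" -ge 300 ]; then
      delay=300
      break
    fi
    i=$((i + 1))
  done
  echo "$delay"
}

# Print the delay for the first seven restarts.
for n in 1 2 3 4 5 6 7; do
  echo "restart $n: wait $(backoff_delay "$n")s"
done
```

This is why a pod in CrashLoopBackOff can appear to sit idle for minutes between attempts: the longer it has been failing, the longer the wait before the next restart.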
Why does CrashLoopBackOff occur?
The CrashLoopBackOff error can occur for various reasons, including:
- Insufficient resources: a lack of resources prevents the container from loading
- Locked file/database/port: a resource is already locked by another container
- Missing references or misconfiguration: references to scripts or binaries that are not present in the container, or a misconfiguration of the underlying system such as a read-only filesystem
- Config loading or setup errors: the server cannot load its configuration file, or initial setup such as an init container fails
- Connection issues: DNS or kube-dns is not able to connect to an external service
- Downstream service: one of the downstream services on which the application relies can’t be reached or the connection fails (database, backend, etc.)
- Liveness probes: the liveness probe is misconfigured, or the probe fails for some other reason
- Port already in use: two or more containers are trying to use the same port, which doesn’t work if they’re in the same Pod
How to Diagnose CrashLoopBackOff
The best way to identify the root cause is to go through the list of potential causes and check them one by one, starting with the easy ones. It also helps to have a good understanding of the environment: how the application is configured, which ports it uses, whether there are any mount points, which probes are configured, and so on.
Back Off Restarting Failed Container
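The first step in troubleshooting is to collect the details of the issue. Run
kubectl describe pod [name] and look at the events in the output. Let’s say you have configured a liveness probe and the pod is failing with events like
Liveness probe failed and
Back-off restarting failed container.
On its own, the
back-off restarting failed container message only tells you that Kubernetes is waiting before it restarts a container that keeps failing. When it appears together with failed liveness probes, one possible cause is a temporary resource overload, for example as a result of an activity spike, that makes the application too slow to answer the probe. In that case, a possible solution is to adjust
timeoutSeconds to give the application a longer window of time to respond.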
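As a rough illustration of where timeoutSeconds lives, the snippet below writes an example Pod manifest to a file. Everything specific in it is an assumption made for the example: the pod name, container name, image, the /healthz path, and port 8080 are placeholders, and the probe values are starting points rather than recommendations.

```shell
# Write an example Pod manifest; every name and value in it is a placeholder.
cat > liveness-example.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: liveness-example        # hypothetical pod name
spec:
  containers:
    - name: app                 # hypothetical container name
      image: example/app:1.0    # hypothetical image
      livenessProbe:
        httpGet:
          path: /healthz        # assumed health endpoint
          port: 8080            # assumed application port
        initialDelaySeconds: 10 # wait before the first probe
        periodSeconds: 10       # probe every 10 seconds
        timeoutSeconds: 5       # default is 1; raised to tolerate slow responses
        failureThreshold: 3     # restart after 3 consecutive failures
EOF

# To try it against a cluster (requires kubectl access):
#   kubectl apply -f liveness-example.yaml
#   kubectl describe pod liveness-example
```

If the probe still fails after raising timeoutSeconds, the application is probably not just slow, and the logs are the next place to look.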
Check the logs
If the previous step does not provide enough detail to identify the cause, the next step is to pull a more detailed explanation of what is happening from the failing pod itself.
For that, run
kubectl get pods to identify the pod that is exhibiting the
CrashLoopBackOff error. You can then run the following command to get the log of the pod:
kubectl logs PODNAME
Walk through the errors in the log to identify why the pod is repeatedly crashing. The log often carries more detail from the application running inside the pod, such as a configuration error or a readiness issue.
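With a crash-looping pod, the current container instance may have only just started and have little to say, so it is often more useful to read the log of the previous, crashed instance. A small helper for that might look like the sketch below. The function name crash_logs and the default of 50 lines are choices made for this example; the --previous, --tail, and --namespace flags are standard kubectl logs options.

```shell
# Show the logs of the previous (crashed) instance of a pod's container.
# Usage: crash_logs POD [NAMESPACE]
crash_logs() {
  local pod="$1"
  local namespace="${2:-default}"
  # --previous reads the log of the last terminated container instance;
  # --tail limits the output to the most recent lines.
  kubectl logs "$pod" --namespace "$namespace" --previous --tail=50
}
```

For example, crash_logs PODNAME would print the last 50 lines the container wrote before it crashed. For a pod with several containers, kubectl logs also accepts -c to pick one.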
Check Deployment Logs
Run the following command to retrieve the deployment logs:
kubectl logs -f deploy/[deployment-name] -n [namespace]
This may also provide clues about issues at the application level. For example, the log might show that ./data can’t be mounted, likely because it’s already in use and locked by a different container.
You may also be experiencing CrashLoopBackOff errors due to insufficient memory resources. In that case, you can increase the memory limit by changing “resources:limits” in the container’s resource manifest.
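One way to confirm a memory problem before changing limits is to look at why the container was last terminated; a reason of OOMKilled points to the memory limit. The helpers below are a sketch under that assumption. The function names and the default namespace are invented for the example, while the jsonpath field and kubectl set resources are standard kubectl features.

```shell
# Print why each container in a pod was last terminated (e.g. OOMKilled, Error).
# Usage: last_termination_reason POD [NAMESPACE]
last_termination_reason() {
  local pod="$1"
  local namespace="${2:-default}"
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Raise the memory limit on every container of a deployment.
# Usage: raise_memory_limit DEPLOYMENT LIMIT [NAMESPACE]
raise_memory_limit() {
  local deployment="$1"
  local limit="$2"
  local namespace="${3:-default}"
  kubectl set resources deployment "$deployment" --namespace "$namespace" \
    --limits="memory=$limit"
}
```

A call such as raise_memory_limit my-deployment 512Mi (both values are placeholders) triggers a new rollout with the higher limit. Editing the manifest and re-applying it achieves the same thing and keeps the change in version control.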
Issue with image
If there is still an issue, another possible reason is that the Docker image you are using is not working properly. Make sure it works fine when you run it on its own. If it does, but it still fails in Kubernetes, you may need to dig deeper to find out what is happening. Try the following steps.
Step 1: Identify entrypoint and cmd
You will need to identify the entrypoint and cmd to gain access to the container for debugging. Do the following:
docker pull [image-id] to pull the image.
docker inspect [image-id] and locate the entrypoint and cmd for the container image.
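The full docker inspect output is long, so it can help to print only the two fields of interest. The helper below is one possible way to do that; the function name image_entrypoint is made up for this example, and the --format template relies on the Config.Entrypoint and Config.Cmd fields of the image metadata.

```shell
# Print the ENTRYPOINT and CMD baked into an image as JSON arrays.
# Usage: image_entrypoint IMAGE
image_entrypoint() {
  local image="$1"
  # Pull quietly first so the inspect below has a local image to read.
  docker pull "$image" >/dev/null || return 1
  docker inspect \
    --format 'ENTRYPOINT={{json .Config.Entrypoint}} CMD={{json .Config.Cmd}}' \
    "$image"
}
```

Either field may come back as null when the image does not set it, which is normal for images that define only one of the two.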
Step 2: Change entrypoint
Because the container has crashed and cannot start, you’ll need to temporarily change the entrypoint in the container specification to
tail -f /dev/null.
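In a Kubernetes manifest, the image entrypoint is overridden with the command field of the container. The snippet below writes an example Pod manifest that does this; the pod name, container name, and image are placeholders, and in practice you would make the same change in your own Deployment or Pod specification.

```shell
# Write an example Pod manifest that keeps the container idle for debugging.
cat > debug-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: debug-example           # hypothetical pod name
spec:
  containers:
    - name: app                 # hypothetical container name
      image: example/app:1.0    # hypothetical image; use the failing one
      # "command" overrides the image ENTRYPOINT. When it is set without
      # "args", the image CMD is ignored too, so the container simply
      # idles instead of starting the application.
      command: ["tail", "-f", "/dev/null"]
EOF

# To try it against a cluster (requires kubectl access):
#   kubectl apply -f debug-pod.yaml
```

If the original specification has a liveness probe, it will likely need to be removed for the duration of the debugging session as well, since the idle container will not answer it. This also assumes the image contains the tail binary.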
Step 3: Check for the cause
With the entrypoint changed, you should be able to use kubectl from your command line to exec into the problem container. Once you are inside the container, go through the likely causes one by one, validate that everything looks right, and fix any issue you find.
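Getting a shell inside the container is a single kubectl exec call, sketched here as a helper. The function name debug_shell is invented for the example, and the sketch assumes the image ships /bin/sh, which very minimal images may not.

```shell
# Open an interactive shell inside a running pod's container.
# Usage: debug_shell POD [NAMESPACE]
debug_shell() {
  local pod="$1"
  local namespace="${2:-default}"
  # Assumes the image ships /bin/sh; minimal images may not.
  kubectl exec -it "$pod" --namespace "$namespace" -- /bin/sh
}
```

From that shell you can try to start the application by hand with the original entrypoint and cmd found in Step 1 and watch the error it prints.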
Step 4: Check for missing packages or dependencies
Once you are logged in, check whether any packages or dependencies are missing and preventing the application from starting. If there are, provide the missing files to the application and see if this resolves the error.
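A quick first pass for missing dependencies is to check whether the commands the application calls are available on the PATH inside the container. The function below is a minimal sketch of that check. The name check_commands is made up, and the list of commands to test depends entirely on your application.

```shell
# Report which of the given commands are not available on PATH.
# Usage: check_commands CMD...
check_commands() {
  local missing=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  # Return non-zero if at least one command was not found.
  return "$missing"
}
```

Pasted into the shell inside the container, a call like check_commands python3 curl (hypothetical dependencies) lists anything that is absent. It only covers executables; missing libraries or configuration files still need to be checked separately.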
Continue Reading on Kubernetes CrashLoopBackOff — How to Troubleshoot — FoxuTech