How to take Azure Kubernetes Backup using Velero

5 min read · Jun 6, 2022

Backup is another important topic. Whatever the system or environment, backups are mandatory, and Kubernetes, including any managed Kubernetes service, is no exception. Even if you use Infrastructure as Code and all deployments are automated, backing up your AKS clusters still adds value. You can check some best practices from Microsoft.

Why should you take a backup?

Most SREs, or anyone who manages Kubernetes, have probably experienced the risk and time cost of taking dumps when running a large database on a Kubernetes platform. Some important configuration may be even more crucial. To reduce this risk, AKS gives us a couple of options:

Take a backup of the Azure managed disk with Azure Backup
Create scheduled backups of PVCs with a backup tool like Velero

In this post, let's see how to use Velero to back up Azure Kubernetes Service.

Mean Time To Recover (MTTR)

You are always expected to define the MTTR for bringing back any given application or system. Nowadays most operations, such as cluster creation and application deployment, are automated through pipelines built with CI/CD tools like Jenkins, CircleCI, etc. But does that bring the cluster back quickly? There is a lot to check. In rare situations we cannot be sure the cluster will return to the same state as before. Recovery also takes longer when there are many services, since we usually have a limited number of parallel executions.

Let’s assume each execution takes ~3–4 min and you need to deploy 10+ microservices or applications; that adds up to roughly 30–45 min. This number would have been acceptable 4–5 years ago, but by today's standards it is high.
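The estimate above is simple arithmetic, and a rough sketch makes the effect of limited parallelism visible. The function name `estimate_deploy_minutes` and its parallel-jobs parameter are illustrative assumptions, not part of any CI/CD tool.

```shell
#!/usr/bin/env bash
# Rough redeploy-time estimate: services are deployed in batches of
# "parallel" jobs, and each batch takes "minutes_each" minutes.
# All names here are hypothetical, for illustration only.
estimate_deploy_minutes() {
  local services=$1 minutes_each=$2 parallel=${3:-1}
  # Ceiling division: number of batches needed to cover all services.
  local batches=$(( (services + parallel - 1) / parallel ))
  echo $(( batches * minutes_each ))
}

estimate_deploy_minutes 10 4      # 10 services, 4 min each, one at a time: prints 40
estimate_deploy_minutes 10 4 2    # same workload with two parallel jobs: prints 20
```

Even with some parallelism, the total grows with the number of services, which is why a restore-based approach can be attractive.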

During a disaster or failure, recreating new infrastructure and redeploying all components therefore takes time. Depending on the criticality of the incident and the importance of the app, it can feel like an eternity.

To address this kind of challenge, a tool like Velero backs up all Kubernetes resources so that a cluster can be quickly restored to a known state. In this scenario it can reduce the recovery time from ~45 minutes to approximately ~15 minutes.

Another advantage of a backup tool is that it can back up the data in persistent volumes. As mentioned before, any stateful application running on the cluster stores its data in a persistent volume.

Velero

Velero (formerly Heptio Ark) is an open source tool for safely backing up and restoring resources in a Kubernetes cluster, performing disaster recovery, and migrating resources and persistent volumes to another Kubernetes cluster.

Velero offers key data protection features, such as scheduled backups, retention schedules, and pre- or post-backup hooks for custom actions. Velero can help protect data stored in persistent volumes and makes your entire Kubernetes cluster more resilient.

Velero Use Cases

Here are some of the things Velero can do:

Back up your cluster and restore it in case of loss.
Recover from disaster.
Copy cluster resources to other clusters.
Replicate your production environment to create development and testing environments.
Take a snapshot of your application’s state before upgrading a cluster.
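A few of these use cases map directly onto Velero CLI subcommands. The sketch below only prints the commands unless `DRY_RUN=0` is set, so it can be read without a cluster. The `run` helper, the backup names `pre-upgrade` and `nightly`, the namespace `prod`, and the cron expression are placeholder assumptions.

```shell
#!/usr/bin/env bash
# Print each command by default; execute it only when DRY_RUN=0.
# Note: the printed form drops the shell quoting of arguments.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# On-demand backup of one namespace, e.g. before a cluster upgrade.
run velero backup create pre-upgrade --include-namespaces prod

# Scheduled backup every day at 01:00, kept for 7 days.
run velero schedule create nightly --schedule "0 1 * * *" --ttl 168h0m0s

# Restore the cluster resources captured in the on-demand backup.
run velero restore create --from-backup pre-upgrade
```

Running it with `DRY_RUN=0` requires the Velero CLI and a cluster with the Velero server installed.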

How it works

Each Velero operation (on-demand backup, scheduled backup, restoration) is a custom resource that is defined with a Kubernetes custom resource definition, or CRD, and stored in etcd. Velero includes controllers that process these custom resources to back up and restore resources. You can back up or restore all objects in your cluster, or you can filter objects by type, namespace, or label.
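Because a backup is itself a custom resource, it can be written as a manifest. The sketch below writes a minimal `Backup` object to a file. The backup name `nginx-backup`, the namespace `nginx-example`, the label `app: nginx`, and the 72-hour TTL are placeholder assumptions, and it assumes the Velero server runs in the `velero` namespace.

```shell
#!/usr/bin/env bash
# Write a minimal Velero Backup custom resource to a local file.
# Names, namespace, label, and TTL below are placeholders.
cat > backup-example.yaml <<'EOF'
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: nginx-backup
  namespace: velero          # namespace where the Velero server runs
spec:
  includedNamespaces:
    - nginx-example          # filter by namespace
  labelSelector:
    matchLabels:
      app: nginx             # filter by label
  ttl: 72h0m0s               # how long Velero keeps this backup
EOF

# Submitting the object is left commented out because it needs a cluster:
# kubectl apply -f backup-example.yaml
```

Once such an object is applied, the Velero controller picks it up and performs the backup.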

Data protection is a chief concern for application owners who want to make sure that they can restore a cluster to a known good state, recover from a crashed cluster, or migrate to a new environment. Velero provides those capabilities.

Velero Components and Architecture

Velero contains two main components:

A server that runs on your cluster
A command-line utility that runs locally

Velero supports plug-ins to enable it to work with different storage systems and Kubernetes platforms. You can run Velero in clusters on a cloud provider or on premises.

Installation:

CLI:

You can download the CLI from the official release page. Here we install version 1.8.1, but you can pick any version from the release page based on your requirements.

# cd /tmp
# wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz
# tar -xvf velero-v1.8.1-linux-amd64.tar.gz
# cd velero-v1.8.1-linux-amd64 && mv velero /usr/local/bin/
# velero help

The Server Side

For the server-side component, there are two main methods of installation:

The Velero CLI
A Helm Chart

Installing Velero

Of the two options, we are going to pick Helm and install it with some modifications. In a future post we can also see how to use it with GitOps.

Velero uses an Azure plugin to interact with Azure. To authenticate, use a service principal for now. Usually I prefer Azure Active Directory Pod Identity, a project that allows pods to authenticate against Azure using managed identities, but there is an open issue with managed identities. With pod identities you would not need an API secret for Velero to authenticate against Azure. Remember, though, that managed identities only work on Azure.

Prerequisites:

Before we start, the following should be in place:

A Kubernetes Service cluster up and running. If you don’t have one, follow these steps to create it in Azure with Terraform: https://foxutech.com/how-to-create-azure-kubernetes-service-using-terraform/ or use your preferred environment.
kubectl installed on the VM or machine from which you are going to manage AKS.
A kubeconfig file (default location is ~/.kube/config).
azure-cli
Helm
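The tool prerequisites can be checked quickly before going further. This is a minimal sketch; the helper name `missing_tools` is hypothetical, and the kubeconfig check assumes that `KUBECONFIG`, if set, names a single file.

```shell
#!/usr/bin/env bash
# Report which of the given command-line tools are missing from PATH.
# Prints one missing tool per line and returns non-zero if any is absent.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Tools this walkthrough relies on (az is the azure-cli binary).
if missing_tools kubectl az helm; then
  echo "all prerequisite tools found"
fi

# Use KUBECONFIG when set, otherwise fall back to the default path.
[ -f "${KUBECONFIG:-$HOME/.kube/config}" ] || echo "no kubeconfig found"
```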

Dynamic Resource Group

Azure created the “foxutech-velero” resource group to hold dynamic resources created for my Kubernetes cluster, for example agent pools and dynamic disks for persistent volumes.
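The name of this resource group can be looked up rather than guessed. A minimal sketch, assuming the cluster is called `myAKSCluster` (a placeholder) and lives in `myResourceGroup`; `nodeResourceGroup` is the property of the cluster that `az aks show` reports, and the wrapper function name is hypothetical.

```shell
#!/usr/bin/env bash
# Print the resource group that AKS uses for node pools and dynamic disks.
# Arguments: <cluster resource group> <cluster name>
node_resource_group() {
  az aks show \
    --resource-group "$1" \
    --name "$2" \
    --query nodeResourceGroup \
    -o tsv
}

# Example (requires a logged-in azure-cli; the cluster name is a placeholder):
# export MC_RESOURCE_GROUP=$(node_resource_group myResourceGroup myAKSCluster)
```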

Once that is done, the next step is to set up a storage account.

Set Up Storage Account

Create a storage account and a blob container inside it:

# az storage account create --name mystoragevelero --resource-group myResourceGroup --sku Standard_GRS --encryption-services blob --https-only true --kind BlobStorage --access-tier Cold
# az storage container create -n velero --public-access off --account-name mystoragevelero

Get your subscription and tenant ID:

# az account list --query '[?isDefault].id' -o tsv
# az account list --query '[?isDefault].tenantId' -o tsv

Create a service principal with contributor access:

# export SUBSCRIPTION_ID=XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXX
# export STORAGE_RESOURCE_GROUP=myResourceGroup
# export MC_RESOURCE_GROUP=foxutech-velero
# az ad sp create-for-rbac \
  --name "velero" \
  --role "Contributor" \
  --query 'password' \
  -o tsv \
  --scopes /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$STORAGE_RESOURCE_GROUP /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$MC_RESOURCE_GROUP

Save the password that you got while creating the service principal.

Get the app ID for the service principal:

# az ad sp list --display-name "velero" --query '[0].appId' -o tsv

Create a credentials file named velero-credentials for Velero. Make sure to update the subscription ID, tenant ID, client ID (the service principal's app ID), client secret (the service principal's password), and resource group name.

# cat velero-credentials
AZURE_SUBSCRIPTION_ID=XXXX-XXXX-XXX-XXX-XXXX-XXXXXXXX
AZURE_TENANT_ID=XXXX-XXXX-XXX-XXX-XXXX-XXXXXXXX
AZURE_CLIENT_ID=SERVICE_PRINCIPAL_APPID
AZURE_CLIENT_SECRET=SERVICE_PRINCIPAL_PASSWORD
AZURE_RESOURCE_GROUP=foxutech-velero
AZURE_CLOUD_NAME=AzurePublicCloud
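Typing secrets into a file by hand is error-prone, so the same file can also be generated from environment variables. This is a sketch, assuming the five values have already been exported under the names used in the file; the function name `write_velero_credentials` is hypothetical.

```shell
#!/usr/bin/env bash
# Generate the velero-credentials file from exported environment variables.
# Returns non-zero with a message if any required variable is unset or empty.
write_velero_credentials() {
  local out=${1:-velero-credentials} name value
  for name in AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_CLIENT_ID \
              AZURE_CLIENT_SECRET AZURE_RESOURCE_GROUP; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      return 1
    fi
  done
  printf '%s\n' \
    "AZURE_SUBSCRIPTION_ID=$AZURE_SUBSCRIPTION_ID" \
    "AZURE_TENANT_ID=$AZURE_TENANT_ID" \
    "AZURE_CLIENT_ID=$AZURE_CLIENT_ID" \
    "AZURE_CLIENT_SECRET=$AZURE_CLIENT_SECRET" \
    "AZURE_RESOURCE_GROUP=$AZURE_RESOURCE_GROUP" \
    "AZURE_CLOUD_NAME=${AZURE_CLOUD_NAME:-AzurePublicCloud}" \
    > "$out"
  # The file holds the client secret, so restrict it to the current user.
  chmod 600 "$out"
}

# Example, after exporting the variables:
# write_velero_credentials ./velero-credentials
```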

Continue Reading on https://foxutech.com/how-to-take-azure-kubernetes-backup-using-velero/

If you like you can buy me a Coffee https://www.buymeacoffee.com/foxutech

Written by FoxuTech
