Kubernetes High Availability Deployment with Pod Anti-Affinity

Apr 12, 2023

We know that one of the main roles of Kubernetes is to provide high availability for our microservice applications. There are many options and best practices we should follow to keep an application highly available. Today we will look at one more option for keeping an application highly available in a public cloud that has multiple zones.

With a Kubernetes Deployment, we can achieve high availability natively through Pod replicas. But if those replicas are all scheduled on the same node and that node has a problem, the system can experience downtime. Similarly, if all the replicas are scheduled in the same availability zone (AZ) and the zone fails, the system will experience downtime.

The Kubernetes scheduler fits pods onto nodes according to the resources available, and it prefers an evenly distributed general node load over spreading an application's replicas precisely across nodes. Therefore, by default, running multiple replicas does not guarantee that they land in multiple AZs. To avoid downtime caused by all pods being placed on a single node or in a single AZ, production services can use pod anti-affinity to distribute replicas across AZs.

In this article, let's look at how affinity works, which options it provides, and how to configure it in Deployments.

How Does Affinity Work?

Affinities are used to express Pod scheduling constraints that can match characteristics of candidate Nodes and the Pods that are already running on those Nodes. A Pod that has an “affinity” to a given Node is more likely to be scheduled to it; conversely, an “anti-affinity” makes it less likely to be scheduled there. The overall balance of these weights is used to determine the final placement of each Pod.

Affinity assessments can produce either hard or soft outcomes. A “hard” result means the Node must have the characteristics defined by the affinity expression. “Soft” affinities act as a preference, indicating to the scheduler that it should use a Node with the characteristics if one is available. A Node that does not meet the condition will still be selected if necessary.

Types of Affinity Condition

There are currently two different kinds of affinity that you can define:

Node Affinity – Used to constrain the Nodes that can receive a Pod by matching labels of those Nodes. There is no separate node anti-affinity field; Node Affinity is primarily used to attract Pods to Nodes, although operators such as NotIn and DoesNotExist can be used to keep Pods away from matching Nodes.
Inter-Pod Affinity – Used to constrain the Nodes that can receive a Pod by matching labels of the existing Pods already running on each of those Nodes. Inter-Pod Affinity can be either an attracting affinity or a repelling anti-affinity.

Setting Node Affinities

Node Affinity has two distinct sub-types:

requiredDuringSchedulingIgnoredDuringExecution – This is the “hard” affinity matcher that requires the Node meet the constraints you define.
preferredDuringSchedulingIgnoredDuringExecution – This is the “soft” matcher to express a preference that’s ignored when it can’t be fulfilled.

The IgnoredDuringExecution part of these verbose names makes it explicit that affinity is only considered while scheduling Pods. Once a Pod has made it onto a Node, affinity is not re-evaluated, so later changes to the Node's labels will not cause the Pod to be evicted.
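As a rough illustration of the two sub-types, the following sketch combines a hard and a soft Node Affinity rule in one Pod. The Pod name, the zone names, the disktype label and the image are hypothetical placeholders, not values from this article; only the field names come from the Kubernetes API.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-demo          # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only Nodes in one of these zones are eligible.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - zone-a                # hypothetical zone names
            - zone-b
      # Soft rule: among eligible Nodes, prefer those labelled disktype=ssd.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50                  # valid range is 1-100
        preference:
          matchExpressions:
          - key: disktype           # hypothetical Node label
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: nginx                    # placeholder image
```

If no eligible Node carries the disktype=ssd label, the Pod should still be scheduled to a Node in one of the listed zones; if no Node is in those zones at all, the Pod stays Pending.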

Example

In the simplest possible example, a Pod with a hard Node Affinity condition of label=value will only be scheduled to Nodes that carry a label=value label. A Pod with the same condition defined as an Inter-Pod Affinity will instead be scheduled to a Node (or, more generally, a topology domain) that already hosts a Pod with a label=value label.
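The Inter-Pod Affinity variant of this example might look like the sketch below. It assumes a hard rule and uses kubernetes.io/hostname as the topology key so that "same place" means "same Node"; the Pod name and image are hypothetical, and label and value stand in for a real label key and value.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-affinity-demo           # hypothetical name
spec:
  affinity:
    podAffinity:
      # Hard rule: place this Pod only where a Pod labelled label=value
      # is already running.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: label              # stand-in label key
            operator: In
            values:
            - value                 # stand-in label value
        # "Where" is defined per topology domain; hostname means per Node.
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: nginx                    # placeholder image
```

Swapping podAffinity for podAntiAffinity in the same structure turns the attraction into a repulsion, which is the form used in the rest of this article.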

Zones

This configuration makes a best effort to schedule replicas of a workload in different zones from each other.

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: <label-key>
              operator: In
              values:
              - <label-value>
          topologyKey: topology.kubernetes.io/zone

Here, the label key-value pair should be unique to the Deployment's Pods, so that the rule only matches replicas of the same workload.
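To show where this fragment sits in a full manifest, here is a minimal Deployment sketch. The snippet above is a Pod spec, so in a Deployment it goes under spec.template.spec. The name web, the app: web label, the replica count and the image are assumptions made for illustration only.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                         # hypothetical name
spec:
  replicas: 3                       # assumed replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                    # label the anti-affinity rule matches on
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: try to avoid zones that already run an app=web Pod.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web
              topologyKey: topology.kubernetes.io/zone
      containers:
      - name: web
        image: nginx                # placeholder image
```

Because the rule is a preference, replicas can still share a zone when the cluster has fewer usable zones than replicas.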

Nodes

This configuration makes a best effort to schedule replicas of a workload on different nodes from each other.

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: <label-key>
              operator: In
              values:
              - <label-value>
          topologyKey: kubernetes.io/hostname

You could also combine both configurations. This is useful in scenarios where you have more pod replicas and nodes than zones, so that replicas sharing a zone are still spread across different nodes.
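A combined configuration could be sketched as two weighted terms in the same list. The weights below are an assumption chosen to rank zone spreading above node spreading; they are not values prescribed by this article, and any values from 1 to 100 are accepted.

```yaml
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # Higher weight: prefer a zone with no matching replica.
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: <label-key>
              operator: In
              values:
              - <label-value>
          topologyKey: topology.kubernetes.io/zone
      # Lower weight: within a zone, prefer a node with no matching replica.
      - weight: 50                  # assumed weight, tune as needed
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: <label-key>
              operator: In
              values:
              - <label-value>
          topologyKey: kubernetes.io/hostname
```

The scheduler adds up the weights of the terms each candidate Node satisfies, so a Node in an empty zone scores higher than one that only avoids sharing a host.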

Continue reading in Kubernetes HighAvailability Deployment with Pod Anti-Affinity (foxutech.com)