Friday 30 June 2023

Zero Downtime Blue-Green Deployment for AKS with Terraform - Simulation Using a Resource Group with Pipeline Steps

In this post we use a resource group to demo a blue-green deployment scenario for AKS with Terraform IaC, to understand the steps required for a successful pipeline implementation. The same pattern can be applied to deploy other Azure resources as well. The two resource groups are used here only to make the demo run faster; an actual implementation can be two AKS clusters (blue and green) within a single resource group or in two resource groups. The full pipeline with Terraform IaC for AKS blue-green with application deployments will be discussed in future post(s). For now, let's look at the blue-green example, which uses a resource group to represent an AKS cluster and is used to validate the blue-green deployment algorithm for AKS.

The full algorithm test information and Terraform files are available at the links below.

Let's take the following scenario in this blog post.

  • The blue AKS cluster is already deployed and live with k8s version 1.25.6.
  • The next release needs to upgrade the k8s version to 1.26.3 and then deploy applications.

So, as of now, our deployment is in the following state: the blue cluster is live with k8s 1.25.6.

In the next release, we need to deploy a new green AKS cluster with k8s 1.26.3, then deploy applications to the green cluster while keeping the blue cluster with k8s 1.25.6 live, to achieve zero downtime. Once the new green cluster is fully ready with applications, we bring it live by routing all traffic to it. Then we destroy the blue cluster.

Let's go step by step and look at the end result of each step. First, our IaC code should deploy the new green AKS cluster without making any changes to the current live blue cluster, as shown below. The parameters used in the command below are described later in the post.

Command: terraform apply -var deployment_phase=deploy -var blue_deploy=true -var green_deploy=true -var green_golive=false -var current_kubernetes_version=1.25.6 -auto-approve

You can see the blue cluster is still live on k8s version 1.25.6 and the green cluster is deployed with k8s 1.26.3.

We now deploy our applications to the green cluster with k8s 1.26.3. Our blue cluster with k8s 1.25.6 still remains live.

The next step is to switch the green cluster to live, routing traffic to it.

Command: terraform apply -var deployment_phase=switch -var blue_deploy=true -var green_deploy=true -var green_golive=true -var current_kubernetes_version=1.25.6 -auto-approve

You can see the green cluster with k8s 1.26.3 is live now. The blue cluster is ready to be destroyed.

Next, we run the destroy phase, destroying the previously used blue cluster.

Command: terraform apply -var deployment_phase=destroy -var blue_deploy=false -var green_deploy=true -var green_golive=true -var current_kubernetes_version=1.26.3 -auto-approve

You can see the blue cluster is removed and the green cluster remains live.
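The three infra commands in the walkthrough differ only in their `-var` values, so a pipeline step could build them from a handful of inputs. The following is a minimal sketch under that assumption; the helper name `terraform_apply_args` is hypothetical and not part of the Terraform files referenced in this post. It only prints the commands and does not run Terraform.

```python
import shlex


def terraform_apply_args(phase, blue_deploy, green_deploy, green_golive,
                         current_kubernetes_version):
    """Build the argument list for one phase's `terraform apply` call.

    Hypothetical helper for illustration; the variable names match the
    -var flags used in the commands of this post.
    """
    def tf_bool(value):
        # Terraform expects lowercase true/false on the command line.
        return "true" if value else "false"

    return [
        "terraform", "apply",
        "-var", f"deployment_phase={phase}",
        "-var", f"blue_deploy={tf_bool(blue_deploy)}",
        "-var", f"green_deploy={tf_bool(green_deploy)}",
        "-var", f"green_golive={tf_bool(green_golive)}",
        "-var", f"current_kubernetes_version={current_kubernetes_version}",
        "-auto-approve",
    ]


# The three infra phases of the walkthrough, in order.
walkthrough = [
    terraform_apply_args("deploy", True, True, False, "1.25.6"),
    terraform_apply_args("switch", True, True, True, "1.25.6"),
    terraform_apply_args("destroy", False, True, True, "1.26.3"),
]

for args in walkthrough:
    print(shlex.join(args))
```

Keeping the arguments as a list rather than a single string means they could be passed to a process launcher without shell quoting issues.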

To achieve the scenario above, and all scenarios described in the ReadMe file, a pipeline tool such as Azure DevOps pipelines or GitHub Actions should be used with the deployment phases defined below.

  • deploy: Deploy the blue cluster (if the current live cluster is green, or on a fresh deployment) or the green cluster (if the current live cluster is blue).
  • appdeploy: Deploy apps to the newly deployed cluster, which is not yet live.
  • switch: Bring the newly deployed cluster live.
  • destroy: Destroy the previous cluster.
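The phase order and the rule for picking the cluster to deploy can be written down in a few lines. This is only an illustrative sketch: the names `PHASES` and `cluster_to_deploy` are assumptions, and in the actual setup the choice is made by the Terraform code from the pipeline variables described further down.

```python
from typing import Optional

# Phases in the order the pipeline runs them.
PHASES = ("deploy", "appdeploy", "switch", "destroy")


def cluster_to_deploy(current_live: Optional[str]) -> str:
    """Return which cluster the `deploy` phase should create.

    current_live is "blue", "green", or None for a fresh deployment.
    """
    if current_live not in (None, "blue", "green"):
        raise ValueError(f"unknown cluster: {current_live!r}")
    if current_live == "blue":
        return "green"
    # Fresh deployment, or green is currently live.
    return "blue"


for live in (None, "blue", "green"):
    print(f"live={live} -> deploy {cluster_to_deploy(live)}")
```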

The pipeline should define the variables below.

  • blue_deploy: Set the initial value to `true`. Do not change it manually afterwards.
  • green_deploy: Set the initial value to `false` for a fresh deployment, or `true` if a blue cluster already exists.
  • green_golive: Set the initial value to `true` for a fresh deployment, or `false` if a blue cluster already exists.
  • current_k8s: Leave empty for a fresh deployment, or set to the current blue cluster's k8s version if a blue cluster already exists.
  • app_deploy_culster: Leave empty for a fresh deployment, or set to the current blue cluster's name if a blue cluster already exists.
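The initial values above depend only on whether a blue cluster already exists. A minimal sketch of that rule follows, assuming the pipeline variables are held in a plain dictionary; the function name `initial_variables` and the cluster name `aks-blue` are made up for illustration.

```python
def initial_variables(blue_k8s_version="", blue_cluster_name=""):
    """Return the initial pipeline variables.

    Call with no arguments for a fresh deployment, or with the existing
    blue cluster's k8s version and name when a blue cluster already exists.
    """
    blue_exists = bool(blue_cluster_name)
    return {
        "blue_deploy": True,
        "green_deploy": blue_exists,       # false on a fresh deployment
        "green_golive": not blue_exists,   # true on a fresh deployment
        "current_k8s": blue_k8s_version,
        "app_deploy_culster": blue_cluster_name,
    }


fresh = initial_variables()
existing = initial_variables("1.25.6", "aks-blue")  # hypothetical cluster name
print(fresh)
print(existing)
```

With an existing blue cluster on 1.25.6, the result matches the values passed to the `deploy` command in the walkthrough: `blue_deploy=true`, `green_deploy=true`, `green_golive=false`.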

The pipeline should update all of the above variables as specified in the phases below. After the initial setup, no manual changes should be made to these pipeline variables.

Infra `deploy` phase

  • Here the blue or green AKS cluster gets deployed based on the current live cluster. The existing cluster is not changed. On a fresh deployment, a blue cluster is created and set as deployed but not live.
  • Update pipeline variable `green_golive` to NOT(`green_golive`).
  • Update `app_deploy_culster` from TF output variable `app_deploy_cluster`.
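The variable updates that follow the `deploy` phase can be sketched as a small function. This assumes the pipeline variables live in a dictionary and that the Terraform outputs have already been read into another dictionary (in a real pipeline the value might come from something like `terraform output -raw app_deploy_cluster`). The function name and the cluster names `aks-blue` and `aks-green` are hypothetical.

```python
def update_after_deploy(variables, tf_outputs):
    """Return the pipeline variables as they should look after `deploy`."""
    updated = dict(variables)
    # Flip green_golive so the later switch phase targets the new cluster.
    updated["green_golive"] = not variables["green_golive"]
    # Record where the appdeploy phase should deploy the applications.
    updated["app_deploy_culster"] = tf_outputs["app_deploy_cluster"]
    return updated


# State from the walkthrough: blue is live on 1.25.6, green was just deployed.
before = {
    "blue_deploy": True,
    "green_deploy": True,
    "green_golive": False,
    "current_k8s": "1.25.6",
    "app_deploy_culster": "aks-blue",  # hypothetical name
}
after = update_after_deploy(before, {"app_deploy_cluster": "aks-green"})
print(after)
```

After this update `green_golive` is `true`, which is the value the `switch` command in the walkthrough passes to Terraform.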

Application deployment phase `appdeploy`

  • Deploy the apps to `app_deploy_culster`.

Infra `switch` phase

  • Terraform will set the `app_deploy_cluster` as the live cluster by routing traffic to it.
  • Update pipeline variable `blue_deploy` to `false` if `green_golive` is `true`. 
  • Update pipeline variable `green_deploy` to `false` if `green_golive` is `false`. 
  • Update pipeline variable `current_k8s` to `K8Sversion` of the live cluster, if `current_k8s` is not equal to `K8Sversion`. 
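The three update rules of the `switch` phase can be expressed as follows. This is a sketch under the same assumption of dictionary-held pipeline variables; `update_after_switch` is a hypothetical name, and `live_k8s_version` stands in for the `K8Sversion` value of the live cluster.

```python
def update_after_switch(variables, live_k8s_version):
    """Return the pipeline variables as they should look after `switch`."""
    updated = dict(variables)
    if variables["green_golive"]:
        # Green is now live, so blue is marked for removal.
        updated["blue_deploy"] = False
    else:
        # Blue is now live, so green is marked for removal.
        updated["green_deploy"] = False
    if variables["current_k8s"] != live_k8s_version:
        updated["current_k8s"] = live_k8s_version
    return updated


# State from the walkthrough just after green went live with 1.26.3.
before = {
    "blue_deploy": True,
    "green_deploy": True,
    "green_golive": True,
    "current_k8s": "1.25.6",
}
after = update_after_switch(before, "1.26.3")
print(after)
```

The result (`blue_deploy=false`, `green_deploy=true`, `green_golive=true`, version 1.26.3) lines up with the values used in the `destroy` command of the walkthrough.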

Infra `destroy` phase

  • Terraform will destroy the non-live cluster.
  • Update pipeline variable `blue_deploy` to `true`, if the current value is `false`.
  • Update pipeline variable `green_deploy` to `true`, if the current value is `false`.
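After the `destroy` phase both deploy flags end up `true` again, which prepares the variables for the next release. A minimal sketch, again assuming dictionary-held variables and a hypothetical function name:

```python
def update_after_destroy(variables):
    """Return the pipeline variables as they should look after `destroy`."""
    updated = dict(variables)
    # Setting a flag to true "if it is false" is the same as setting it to
    # true unconditionally, so both flags are simply reset here.
    updated["blue_deploy"] = True
    updated["green_deploy"] = True
    return updated


# State from the walkthrough after the blue cluster was destroyed.
before = {
    "blue_deploy": False,
    "green_deploy": True,
    "green_golive": True,
    "current_k8s": "1.26.3",
}
after = update_after_destroy(before)
print(after)
```

With `green_golive` still `true`, the next `deploy` phase would treat green as the live cluster and deploy a new blue cluster.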

The ReadMe file describes all the steps of the algorithm to perform AKS blue-green deployment, including k8s version upgrades. The algorithm is fully tested using the Terraform files available on GitHub here.

In future post(s), let's explore a fully functional implementation of Azure DevOps pipelines for blue-green deployment of AKS, with applications deployed to AKS, using the algorithm described in this post.
