Friday, 9 May 2025

Setting Up RabbitMQ Cluster in AKS Using RabbitMQ Cluster Operator

We have discussed "Setting Up RabbitMQ Cluster Operator and Topology Operator via Azure Pipelines in AKS" in the previous post. The cluster operator deployed in AKS can be used to deploy a RabbitMQ cluster. In this post let's explore deploying a production-ready RabbitMQ cluster in AKS (without TLS/SSL, which we will explore in a future post), to be used by apps deployed in the same AKS cluster.

Once the RabbitMQ cluster is successfully deployed, the rabbitmq namespace should have the resources shown in the below image. We are deploying a three-node RabbitMQ cluster, with each pod scheduled on a node in a different Azure availability zone. Cluster access is set up with the service/rabbitmq-cluster ClusterIP, as we only need access for apps within the AKS cluster. We can discuss how to use it for local development in a future post.


The following content should be created as rabbitmq-cluster.yaml in the same pipelines\aks_manifests\rabbitmq path we used in "Setting Up RabbitMQ Cluster Operator and Topology Operator via Azure Pipelines in AKS".


apiVersion: v1
kind: Namespace
metadata:
  name: rabbitmq

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    app.kubernetes.io/instance: rabbitmq
    app.kubernetes.io/name: rabbitmq
    app.kubernetes.io/part-of: rabbitmq
  name: rabbitmq-storage
parameters:
  skuName: StandardSSD_LRS
provisioner: disk.csi.azure.com
reclaimPolicy: Delete # Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: rabbitmq-cluster-pdb
  namespace: rabbitmq
spec:
  minAvailable: 50%
  selector:
    matchLabels:
      app.kubernetes.io/name: rabbitmq-cluster

---
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq-cluster
  namespace: rabbitmq
  annotations:
    rabbitmq.com/topology-allowed-namespaces: "pocrmq" # use "*" to allow all, or "default,ns1,my-namespace" to allow multiple
spec:
  image: rabbitmq:4.1.0-management-alpine
  replicas: 3
  resources:
    requests:
      cpu: 500m # 4
      memory: 2Gi # 10Gi
    limits:
      memory: 2Gi # 10Gi
  rabbitmq: # use disk_free_limit.absolute = 10GB
    additionalConfig: |
      cluster_partition_handling = pause_minority
      disk_free_limit.absolute = 1GB
      collect_statistics_interval = 10000
      default_user = ${rabbitmq_user}$
      default_pass = ${rabbitmq_user_passowrd}$
      default_user_tags.administrator = true
  persistence:
    storageClassName: rabbitmq-storage
    storage: "16Gi" # "512Gi"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - rabbitmq-cluster
        topologyKey: kubernetes.io/hostname
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: "topology.kubernetes.io/zone"
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: rabbitmq-cluster


Here we set up a namespace rabbitmq. Then we define a storage class using StandardSSD_LRS, so a managed disk will be created and mounted for each RabbitMQ cluster pod. The pod disruption budget ensures only one pod can be unavailable at a given point in time: minAvailable of 50% in a 3-pod cluster rounds up to 2 pods that must stay available. We should set replicas to an odd number (1, 3, 5, 7), as recommended by the RabbitMQ documentation.

In the RabbitmqCluster specification above, the annotation below allows the pocrmq namespace to define queues, exchanges and other topology objects in the RabbitMQ cluster.

rabbitmq.com/topology-allowed-namespaces: "pocrmq" 
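For example, with the Topology Operator installed, an app team could declare a queue from the pocrmq namespace roughly as below. This is a sketch for illustration only; the queue name demo-queue is an assumption, not part of this setup.

```yaml
apiVersion: rabbitmq.com/v1beta1
kind: Queue
metadata:
  name: demo-queue      # hypothetical queue, for illustration only
  namespace: pocrmq     # allowed by the annotation above
spec:
  name: demo-queue
  type: quorum          # replicated across the 3 cluster nodes
  durable: true
  rabbitmqClusterReference:
    name: rabbitmq-cluster
    namespace: rabbitmq
```

Without the annotation, the Topology Operator would reject topology objects referencing the cluster from other namespaces.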

The image: rabbitmq:4.1.0-management-alpine pins the exact RabbitMQ version in use. In the additionalConfig section we can define overrides for the RabbitMQ configuration settings defined here. We specifically set the username and password for the RabbitMQ default user, which will later be used by apps. The ${rabbitmq_user}$ and ${rabbitmq_user_passowrd}$ placeholders are tokens that get replaced with pipeline variable values at deployment time.
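Since the cluster is only exposed via a ClusterIP service, apps in the same AKS cluster would typically reach it through the service's in-cluster DNS name. A minimal sketch of an app Deployment fragment is below; the app name and env variable names are assumptions for illustration, and in practice the credentials come from Azure App Configuration as described next.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app        # hypothetical app, for illustration only
  namespace: pocrmq
spec:
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo-app
        image: demo-app:latest    # placeholder image
        env:
        - name: RABBITMQ_HOST     # ClusterIP service DNS name
          value: rabbitmq-cluster.rabbitmq.svc.cluster.local
        - name: RABBITMQ_PORT
          value: "5672"           # AMQP port
```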

The username is defined in Azure App Configuration via Terraform, and the password is generated via Terraform as shown below.

resource "random_password" "rmq_password" {
  length           = 32
  special          = true
  override_special = "-_"
  min_numeric      = 1
  min_upper        = 1
  min_lower        = 1
  min_special      = 1
}

The password is then stored in Azure Key Vault, and apps are allowed to use it via an Azure App Configuration Key Vault reference. Note that not all Terraform code is added here, since it is not the topic of this post.

# Secrets
resource "azurerm_key_vault_secret" "secret" {
  for_each = {
    DemoSecret                  = "Notarealsecret"
    RabbitMQPassword            = random_password.rmq_password.result
  }
  name         = each.key
  value        = each.value
  key_vault_id = azurerm_key_vault.instancekeyvault.id

  depends_on = [
    azurerm_key_vault.instancekeyvault
  ]
}

Azure App Configuration references the secret in Key Vault to expose the RabbitMQ user password.

resource "azurerm_app_configuration_key" "config_vault" {
  for_each = {
    "DemoSecret"                           = azurerm_key_vault_secret.secret["DemoSecret"].versionless_id
    "RabbitMQPassword"                     = azurerm_key_vault_secret.secret["RabbitMQPassword"].versionless_id
  }
  configuration_store_id = azurerm_app_configuration.appconf.id
  key                    = each.key
  type                   = "vault" # keyvault reference
  label                  = azurerm_resource_group.instancerg.name
  vault_key_reference    = each.value
  depends_on = [
    azurerm_role_assignment.appconf_dataowner
  ]
}

The username is defined in a Terraform local variable and used in App Configuration as shown below.

resource "azurerm_app_configuration_key" "config_kv" {
  for_each = {
    "RabbitMQUser"    = local.rabbitmq_user
  }
  configuration_store_id = azurerm_app_configuration.appconf.id
  key                    = each.key
  type                   = "kv" # key value
  label                  = azurerm_resource_group.instancerg.name
  value                  = each.value
  depends_on = [
    azurerm_role_assignment.appconf_dataowner
  ]
}


Terraform outputs the user name and password as shown below.

output "rabbitmq_user" {
  value = local.rabbitmq_user
}

output "rabbitmq_user_passowrd" {
  value     = random_password.rmq_password.result
  sensitive = true
}

To set up the RabbitMQ cluster we can use the same script used in "Setting Up RabbitMQ Cluster Operator and Topology Operator via Azure Pipelines in AKS". We need to add the additional section below to the install_rabbitmq.ps1 script from that post.

$clusterManifest = -join($manifestPath,'rabbitmq-cluster.yaml');

Write-Host (-join('Deploying RabbitMQ cluster with: ',$clusterManifest, ' ...'));
kubectl apply -f $clusterManifest;
Invoke-AKS-App-Health-Check -aksNamespace 'rabbitmq' -apps @('rabbitmq-cluster') -appHealthCheckMaxAttempts 90 -appHealthCheckIntervalSeconds 10 -appReadyInitialWaitSeconds 30; # Max 15 minutes wait
Write-Host ('Successfully deployed RabbitMQ cluster.');
Write-Host ('=========================================================');

Then we have to add an additional step that reads the Terraform outputs and sets them as Azure DevOps variables, so the deployment can be done via the Azure DevOps pipeline. The TerraformCLI output task publishes each Terraform output as a pipeline variable named TF_OUT_&lt;OUTPUT_NAME&gt; (for example TF_OUT_RABBITMQ_USER). Note that only the most relevant pipeline steps are shown here.

  - task: TerraformCLI@0
    name: terraformOutput
    displayName: 'Run terraform output'
    inputs:
      command: output
      workingDirectory: "$(System.DefaultWorkingDirectory)/iac/Deployment/Terraform"
    
  - task: PowerShell@2
    name: setup_blue_green_sys_vars
    displayName: 'Set blue-green sys IaC variables and update ADO variable group'
    inputs:
      targetType: 'inline'
      script: |
        $env:AZURE_DEVOPS_EXT_PAT = '$(System.AccessToken)'
        
        $rabbitmq_user = '$(TF_OUT_RABBITMQ_USER)';
        $rabbitmq_user_passowrd = '$(TF_OUT_RABBITMQ_USER_PASSOWRD)';

        az extension add --name azure-devops --allow-preview false --yes

        Write-Host "##vso[task.setvariable variable=rabbitmq_user;]$rabbitmq_user"
        Write-Host "##vso[task.setvariable variable=rabbitmq_user_passowrd;]$rabbitmq_user_passowrd"

        
  
  - task: KubectlInstaller@0
    displayName: 'Install Kubectl latest'
  
  - task: Kubernetes@1
    displayName: 'Deploy k8s system prerequisites'
    inputs:
      connectionType: 'Azure Resource Manager'
      azureSubscriptionEndpoint: '${{ parameters.serviceconnection }}'
      azureResourceGroup: 'ch-mq-$(envname)-rg'
      kubernetesCluster: 'ch-mq-$(envname)-aks-$(sys_app_deploy_instance_suffix)'
      useClusterAdmin: true
      command: apply
      arguments: '-f k8s_system_prerequisites.yaml'
      workingDirectory: '$(System.DefaultWorkingDirectory)/pipelines/aks_manifests'
  
  - task: qetza.replacetokens.replacetokens-task.replacetokens@5
    displayName: 'Replace rabbitmq-cluster.yaml'
    inputs:
      rootDirectory: $(System.DefaultWorkingDirectory)/pipelines/aks_manifests/rabbitmq
      actionOnMissing: fail
      tokenPattern: custom
      tokenPrefix: '${'
      tokenSuffix: '}$'
      targetFiles: |
        rabbitmq-cluster.yaml

  - task: AzureCLI@2
    displayName: 'Deploy RabbitMQ'
    inputs:
      azureSubscription: '${{ parameters.serviceconnection }}'
      scriptType: ps
      scriptLocation: inlineScript
      inlineScript: |
        $rgName = 'ch-mq-$(envname)-rg';
        $aksName = 'ch-mq-$(envname)-aks-$(sys_app_deploy_instance_suffix)';
        
        Write-Host $aksName
        
        az aks get-credentials -n $aksName -g $rgName --admin --overwrite-existing
        
        $(System.DefaultWorkingDirectory)/pipelines/scripts/install_rabbitmq.ps1 `
          -manifestPath '$(System.DefaultWorkingDirectory)/pipelines/aks_manifests/rabbitmq/';
        
        kubectl config delete-context (-join($aksName,'-admin'))

Then the pipeline can deploy the RabbitMQ cluster. In the next post let's explore how to test that the RabbitMQ cluster is working as expected in AKS.

