Saturday 20 January 2024

Scale Pods in AKS with Kubernetes Event-driven Autoscaling (KEDA) ScaledJob Based on Azure Service Bus Queue as a Trigger

In previous posts we discussed "Setting Up Kubernetes Event-driven Autoscaling (KEDA) in AKS with Workload Identity" and how to "Set Up (KEDA) Authentication Trigger for Azure Storage Queue/Service Bus in AKS". With that in place, we can now set up a Kubernetes scaled job in AKS to run a pod when the Azure Service Bus queue receives a message. Using a scaled job, we start a job (pod) once a message arrives in the queue; the container app in the pod then receives the message, processes and completes it, and the job execution finishes with the pod marked complete. So there will be a separate pod and container (a Kubernetes job) processing each message received in the Azure Service Bus queue.

So the purpose is to create a Kubernetes job which can be scaled using KEDA based on the messages received in the queue, as shown below.


The example .NET 8 application code is available here in GitHub; it performs video transcoding using ffmpeg. The Docker image can be built using the Dockerfile in the project here, and the image can be pushed to an Azure container registry.
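As a rough sketch, building and pushing the image could look like the following (the registry name and image tag are taken from the manifest below; adjust the build context path to match the repo layout):

```shell
# Build the video processor image from the Dockerfile in the project
docker build -t chdemosharedacr.azurecr.io/media/chdotnetmediaservice:1.5 .

# Log in to the Azure container registry and push the image
az acr login --name chdemosharedacr
docker push chdemosharedacr.azurecr.io/media/chdotnetmediaservice:1.5
```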


We can use the YAML shown below (deploy with kubectl apply) to deploy the .NET app as a scaled job, with KEDA scaling the job when we send messages to the Azure Service Bus queue. You can find comments describing the purpose of each setting in the YAML, and some important settings are described later in this post for further clarification.

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: dotnet-video-processor-scaled-job
  namespace: media
  # annotations:
  #   autoscaling.keda.sh/paused: true          # Optional. Use to pause autoscaling of Jobs
spec:
  jobTargetRef:
    parallelism: 1                            # [max number of desired pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)
    completions: 1                            # [desired number of successfully finished pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)
    activeDeadlineSeconds: 15000 # Should be higher than maxProcessWaitTime (14400) in video processor  #  Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer
    backoffLimit: 6                           # Specifies the number of retries before marking this job failed. Defaults to 6
    template:
      # ==========================================================
      # describes the [job template](https://kubernetes.io/docs/concepts/workloads/controllers/job)
      metadata:
        labels:
          app: dotnet-video-processor
          service: dotnet-video-processor
          azure.workload.identity/use: "true"
      spec:
        serviceAccountName: ch-video-wi-sa
        nodeSelector:
          "kubernetes.io/os": linux
        priorityClassName: media-highest-priority-linux
        #------------------------------------------------------
        # setting pod DNS policies to enable faster DNS resolution
        # https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
        dnsConfig:
          options:
            # use FQDN everywhere 
            # any cluster local access from pods need full CNAME to resolve 
            # short names will not resolve to internal cluster domains
            - name: ndots
              value: "2"
            # dns resolver timeout and attempts
            - name: timeout
              value: "15"
            - name: attempts
              value: "3"
            # use TCP to resolve DNS instead of using UDP (UDP is lossy and pods need to wait for timeout for lost packets)
            - name: use-vc
            # open new socket for retrying
            - name: single-request-reopen
        #------------------------------------------------------
        volumes:
          # # `name` here must match the name
          # # specified in the volume mount
          # - name: demo-configmap-video-processor-volume
          #   configMap:
          #     # `name` here must match the name
          #     # specified in the ConfigMap's YAML
          #     name: demo-configmap
          - name: media-data-volume
            persistentVolumeClaim:
              claimName: media-azurefile-video-processor # PersistentVolumeClaim name in aks_manifests\prerequisites\k8s.yaml
        terminationGracePeriodSeconds: 90 # This must be set to a value that is greater than the preStop hook wait time.
        containers:
          - name: dotnet-video-processor
            lifecycle:
              preStop:
                exec:
                  command: ["sleep","60"]
            image: chdemosharedacr.azurecr.io/media/chdotnetmediaservice:1.5
            imagePullPolicy: Always
            volumeMounts:
              # - mountPath: /etc/config
              #   name: demo-configmap-video-processor-volume
              - mountPath: /media/data
                name: media-data-volume
            env:
              - name: MEDIA_PATH
                value: /media/data
            resources:
                  limits:
                    memory: 4Gi # the memory limit equals the request!
                    # no cpu limit! this is excluded on purpose
                  requests:
                    memory: 4Gi
                    cpu: "2"
        # ==========================================================
  pollingInterval: 5                         # Optional. Default: 30 seconds
  successfulJobsHistoryLimit: 20               # Optional. Default: 100. How many completed jobs should be kept.
  failedJobsHistoryLimit: 20                   # Optional. Default: 100. How many failed jobs should be kept.
  # envSourceContainerName: {container-name}    # Optional. Default: .spec.JobTargetRef.template.spec.containers[0]
  minReplicaCount: 0                          # Optional. Default: 0
  maxReplicaCount: 10                        # Optional. Default: 100
  # rolloutStrategy: gradual        # Deprecated: Use rollout.strategy instead (see below).
  rollout:
    strategy: gradual                         # We should not delete existing jobs. So gradual. # Optional. Default: default. Which Rollout Strategy KEDA will use.
    propagationPolicy: foreground             # Optional. Default: background. Kubernetes propagation policy for cleaning up existing jobs during rollout.
  scalingStrategy:
    strategy: "default"   # We use default as locked messages should not appear in queue # Optional. Default: default. Which Scaling Strategy to use. 
    # customScalingQueueLengthDeduction: 1      # Optional. A parameter to optimize custom ScalingStrategy.
    # customScalingRunningJobPercentage: "0.5"  # Optional. A parameter to optimize custom ScalingStrategy.
    # pendingPodConditions:                     # Optional. A parameter to calculate pending job count per the specified pod conditions
    #   - "Ready"
    #   - "PodScheduled"
    #   - "Pending"
    #   - "ContainerCreating"
    # multipleScalersCalculation : "max" # Optional. Default: max. Specifies how to calculate the target metrics when multiple scalers are defined.
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: dotnetvideoqueue
        namespace: ch-video-dev-euw-001-sbus-blue
        messageCount: "1"
      authenticationRef:
        name: video-processor-queue-auth
    #--------------
    # Below is how the storage queue is used as trigger
    #--------------
    # - type: azure-queue
    #   metadata:
    #     queueName: dotnetvideoqueue
    #     queueLength: '1'
    #     accountName: chvideodeveuw001queuest # storage-account-name
    #     # activationQueueLength: '50'
    #   authenticationRef:
    #     name: video-processor-queue-auth
    #--------------

Once we deploy the scaled job, we can see it is ready to create jobs when messages arrive in the queue.
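For example, assuming the manifest above is saved as `scaled-job.yaml`, deploying and checking readiness could look like this:

```shell
# Deploy the scaled job and verify KEDA has picked it up
kubectl apply -f scaled-job.yaml
kubectl get scaledjob dotnet-video-processor-scaled-job -n media
```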


Once we send messages to the queue (for example, the code here in GitHub can be used to send 5 messages to the Azure Service Bus queue; the message content is hardcoded for demo purposes), KEDA will trigger the jobs as shown below.
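The jobs being created (one per queued message, up to the max replica count) can be watched with kubectl, for example:

```shell
# Watch KEDA create jobs as messages arrive in the queue
kubectl get jobs -n media -w

# Inspect the job pods using the label from the job template
kubectl get pods -n media -l app=dotnet-video-processor
```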


Jobs should execute to completion as messages are received in the queue.


The complete Terraform infrastructure deployment, with AKS, workload identity and KEDA, along with the deployment manifests required to deploy the apps and the scaled job, is available in the GitHub repo here.

Now let's have a detailed look at the scaled job deployment YAML above.

The kind of deployment is a ScaledJob with the API version set to keda.sh/v1alpha1. Notice that we have commented out the annotation which can be used to pause autoscaling of the job with KEDA.

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: dotnet-video-processor-scaled-job
  namespace: media
  # annotations:
  #   autoscaling.keda.sh/paused: true
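If needed, autoscaling can be paused without editing and redeploying the manifest by applying the annotation directly, for example:

```shell
# Pause job autoscaling for the scaled job
kubectl annotate scaledjob dotnet-video-processor-scaled-job -n media autoscaling.keda.sh/paused=true

# Resume by removing the annotation (trailing '-' removes it)
kubectl annotate scaledjob dotnet-video-processor-scaled-job -n media autoscaling.keda.sh/paused-
```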

The next important settings are in the job target reference. Here we define the requested parallelism as 1; however, this will not prevent the job from scaling based on the max replica setting we define later. We set completions to 1, so a job is considered successful once its pod (the one started for the queued message) completes. The backoff limit allows the container to restart on startup failures, up to a maximum of 6 times here, before the job is marked as failed. The active deadline allows the pod to keep running for the given number of seconds from its start time, before Kubernetes decides the job has timed out and did not complete successfully. For long-running jobs this should be set to a larger value; here it is set to more than 4 hours (15000 seconds, higher than the maxProcessWaitTime of 14400 in the video processor) to accommodate processing of larger video files.

jobTargetRef:
    parallelism: 1      
    completions: 1  
    activeDeadlineSeconds: 15000
    backoffLimit: 6


The polling interval below defines how often, in seconds, the queue is checked for message availability. The job history limits allow keeping the history of successful and failed job pods for inspection, if required. Since we are using a single container, the default is fine for the source container. The min replica count is set to 0 here so that no pods run when there are no messages. The max replica count controls the maximum number of jobs running at a given time: if there are 100 messages in the queue, only 10 jobs will be allowed to run at once, and the remaining messages will start new pods as running jobs complete, for as long as messages are still in the queue.

pollingInterval: 5                        
successfulJobsHistoryLimit: 20            
failedJobsHistoryLimit: 20                  
# envSourceContainerName: {container-name}  
minReplicaCount: 0                          
maxReplicaCount: 10                      
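To make the interaction of these settings concrete, here is a rough sketch (not KEDA's actual code) of how the default strategy caps new jobs per polling cycle; in our trigger, `messageCount` per job is 1:

```shell
# Approximation of KEDA's default ScaledJob scaling decision:
# new jobs = min(ceil(queueLength / messagesPerJob), maxReplicaCount - runningJobs), floored at 0
jobs_to_create() {
  local queue_length=$1 running_jobs=$2 max_replicas=$3 per_job=$4
  local wanted=$(( (queue_length + per_job - 1) / per_job ))  # ceiling division
  local room=$(( max_replicas - running_jobs ))
  if [ "$room" -lt 0 ]; then room=0; fi
  if [ "$wanted" -lt "$room" ]; then echo "$wanted"; else echo "$room"; fi
}

jobs_to_create 100 0 10 1   # 100 queued messages, nothing running -> 10
jobs_to_create 3 8 10 1     # 3 queued, 8 already running -> 2
```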

The rollout strategy defines how currently running jobs behave when a new job deployment spec is applied. The default Kubernetes behaviour of terminating existing pods when new ones with a new spec appear should not happen here, so we set the strategy to gradual and KEDA will let the existing running jobs complete. We choose the foreground propagation policy, instead of the default background, to ensure cleanup of all dependent objects when jobs are deleted during rollout.

rollout:
  strategy: gradual
  propagationPolicy: foreground  

As the scaling strategy we are using KEDA's default, since locked (in-flight) messages do not appear in the queue count. If you need to change the behaviour, you can change the strategy to custom and experiment with the other commented settings.

         
scalingStrategy:
  strategy: "default"  
  # customScalingQueueLengthDeduction: 1    
  # customScalingRunningJobPercentage: "0.5"
  # pendingPodConditions:                    
  #   - "Ready"
  #   - "PodScheduled"
  #   - "Pending"
  #   - "ContainerCreating"
  # multipleScalersCalculation : "max" # Optional. Default: max.

For the trigger we use the Azure Service Bus queue. The trigger authentication we created with Azure workload identity is referenced in this trigger.

triggers:
  - type: azure-servicebus
    metadata:
      queueName: dotnetvideoqueue
      namespace: ch-video-dev-euw-001-sbus-blue
      messageCount: "1"
    authenticationRef:
      name: video-processor-queue-auth
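For reference, the trigger authentication created in the earlier post has roughly this shape when using workload identity (the exact spec may differ in your setup):

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: video-processor-queue-auth
  namespace: media
spec:
  podIdentity:
    provider: azure-workload
```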

For further information, look at the KEDA documentation for scaled jobs here.

