Azure Workload Identity allows your containers in AKS to use a managed identity to access Azure resources securely, without having to depend on connection strings, passwords, access keys or secrets. In other words, you can just use DefaultAzureCredential in your containers running in AKS, which will use the workload identity assigned to the container to get access to the required Azure resource. Role based access permissions will be in effect, and the user assigned managed identity (an AD app registration can be used as well, but a user assigned managed identity is recommended) used to set up the workload identity in AKS should be given the necessary roles in the target Azure resource. This is far better than having to store secrets or connection strings to be utilized by the dotnet applications. In this post let's understand how to set up workload identity for containers deployed to AKS, and explore how it simplifies the dotnet application code, allowing the application to access Azure resources securely with a managed identity.
The first step of setting up workload identity in AKS is to enable the OIDC (OpenID Connect) issuer and workload identity. In the Terraform resource azurerm_kubernetes_cluster, set the following:
oidc_issuer_enabled = true
workload_identity_enabled = true
An example AKS cluster Terraform definition is below.
resource "azurerm_kubernetes_cluster" "aks_cluster" {
  lifecycle {
    ignore_changes = [default_node_pool[0].node_count]
  }

  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}"
  kubernetes_version  = local.kubernetes_version
  sku_tier            = "Standard"
  location            = var.location
  resource_group_name = var.rg_name
  dns_prefix          = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}-dns"
  node_resource_group = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}-rg"

  image_cleaner_enabled        = false # As this is a preview feature keep it disabled for now. Once the feature is GA, it should be enabled.
  image_cleaner_interval_hours = 48

  network_profile {
    network_plugin    = "azure"
    load_balancer_sku = "standard"
  }

  storage_profile {
    file_driver_enabled = true
  }

  default_node_pool {
    name                 = "chlinux"
    orchestrator_version = local.kubernetes_version
    node_count           = 1
    enable_auto_scaling  = true
    min_count            = 1
    max_count            = 7
    vm_size              = "Standard_DS4_v2"
    os_sku               = "Ubuntu"
    vnet_subnet_id       = var.subnet_id
    max_pods             = 30
    type                 = "VirtualMachineScaleSets"
    scale_down_mode      = "Delete"
    zones                = ["1", "2", "3"]
  }

  oidc_issuer_enabled       = true # Creates the OpenID Connect issuer URL to be used in the federated identity credential
  workload_identity_enabled = true # Enable workload identity in AKS

  identity {
    type = "SystemAssigned"
  }

  ingress_application_gateway {
    gateway_id = azurerm_application_gateway.aks.id
  }

  key_vault_secrets_provider {
    secret_rotation_enabled = false
  }

  azure_active_directory_role_based_access_control {
    azure_rbac_enabled = false
    managed            = true
    tenant_id          = var.tenant_id
    # Add sub owners as cluster admins
    admin_group_object_ids = [var.sub_owners_objectid] # Azure AD group object ID
  }

  oms_agent {
    log_analytics_workspace_id = var.log_analytics_workspace_id
  }

  depends_on = [azurerm_application_gateway.aks]

  tags = merge(tomap({ Service = "aks_cluster" }), var.tags)
}
As the next step we have to set up a user assigned managed identity, which will be used as the workload identity in AKS.
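A minimal sketch of creating the user assigned managed identity in Terraform is below. The resource name and the naming pattern are assumptions for illustration; they are not from the original post.

```terraform
# User assigned managed identity to be used as the workload identity in AKS
# (resource name "aks" and the naming pattern are illustrative)
resource "azurerm_user_assigned_identity" "aks" {
  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-uai-${var.deployment_name}"
  location            = var.location
  resource_group_name = var.rg_name
}
```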
Then we need to set up a federated identity credential on the user assigned identity, for the AKS service account which will be used to assign the identity to each pod. We need to use the OIDC issuer URL of the AKS cluster in the federated identity credential setup. To understand the concepts in detail read the docs here.
# Federated identity credential for AKS user assigned identity - to be used with the workload identity service account
resource "azurerm_federated_identity_credential" "aks" {
  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-fic-${var.deployment_name}"
  resource_group_name = var.rg_name
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.aks_cluster.oidc_issuer_url
  parent_id           = var.user_assigned_identity
  subject             = "system:serviceaccount:widemo:wi-demo-sa" # system:serviceaccount:<application namespace>:<workload identity service account name>

  depends_on = [azurerm_kubernetes_cluster.aks_cluster]

  lifecycle {
    ignore_changes = []
  }
}
We can use a Terraform output to obtain the client ID of the user assigned managed identity we created. This is required later when creating the service account.
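Such an output might look like the sketch below, assuming the identity resource is named azurerm_user_assigned_identity.aks (the output name aks_uai_client_id matches the comment in the service account manifest later in this post):

```terraform
# Client ID of the user assigned identity, for the service account annotation
output "aks_uai_client_id" {
  value = azurerm_user_assigned_identity.aks.client_id
}
```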
Once the Terraform code is deployed and the AKS cluster is created, we can use kubectl to create the service account as shown below.
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: userassignedidentityclientid # ${USER_ASSIGNED_CLIENT_ID} - user assigned identity client ID (aks_uai_client_id output from Terraform)
    azure.workload.identity/tenant-id: tenantid # ${AZURE_TENANT_ID} - Azure tenant ID
    # azure.workload.identity/service-account-token-expiration: "3600" # Default is 3600. Supported range is 3600-86400. Configure to avoid downtime in token refresh. Setting in pod spec takes precedence.
  name: wi-demo-sa
  namespace: widemo
We can deploy our application pods with workload identity enabled, as shown below in the pod template.
template:
  metadata:
    labels:
      app: wi-api
      service: wi-api
      azure.workload.identity/use: "true" # Required to make the containers in the pod use the workload identity
    # annotations:
    #   azure.workload.identity/service-account-token-expiration: "3600" # Configure to avoid downtime in token refresh. Takes precedence over the service account setting. Default 3600, acceptable range: 3600-86400 seconds.
    #   azure.workload.identity/skip-containers: "container1;container2" # Containers to skip using workload identity. By default all containers in the pod will use workload identity when the pod is labeled with azure.workload.identity/use: "true"
    #   azure.workload.identity/inject-proxy-sidecar: "true" # Default true. The proxy sidecar is used to intercept token requests to IMDS (Azure Instance Metadata Service) and acquire an AAD token on behalf of the user with the federated identity credential.
    #   azure.workload.identity/proxy-sidecar-port: "8000" # Port of the proxy sidecar. Default 8000.
  spec:
    serviceAccountName: wi-demo-sa # Service account (see aks_manifests\prerequisites\k8s.yaml) will provide identity to the pod. https://azure.github.io/azure-workload-identity/docs/concepts.html
A full example deployment is below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wi-api
  namespace: widemo
  labels:
    app: wi-api
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 25%
  minReadySeconds: 0
  selector:
    matchLabels:
      service: wi-api
  template:
    metadata:
      labels:
        app: wi-api
        service: wi-api
        azure.workload.identity/use: "true" # Required to make the containers in the pod use the workload identity
      # annotations:
      #   azure.workload.identity/service-account-token-expiration: "3600" # Configure to avoid downtime in token refresh. Takes precedence over the service account setting. Default 3600, acceptable range: 3600-86400 seconds.
      #   azure.workload.identity/skip-containers: "container1;container2" # Containers to skip using workload identity. By default all containers in the pod will use workload identity when the pod is labeled with azure.workload.identity/use: "true"
      #   azure.workload.identity/inject-proxy-sidecar: "true" # Default true. The proxy sidecar is used to intercept token requests to IMDS (Azure Instance Metadata Service) and acquire an AAD token on behalf of the user with the federated identity credential.
      #   azure.workload.identity/proxy-sidecar-port: "8000" # Port of the proxy sidecar. Default 8000.
    spec:
      serviceAccountName: wi-demo-sa # Service account (see aks_manifests\prerequisites\k8s.yaml) will provide identity to the pod. https://azure.github.io/azure-workload-identity/docs/concepts.html
      nodeSelector:
        "kubernetes.io/os": linux
      priorityClassName: widemo-highest-priority-linux
      #------------------------------------------------------
      # Setting pod DNS policies to enable faster DNS resolution
      # https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
      dnsConfig:
        options:
          # Use FQDN everywhere; any cluster-local access from pods needs the full CNAME to resolve.
          # Short names will not resolve to internal cluster domains.
          - name: ndots
            value: "2"
          # DNS resolver timeout and attempts
          - name: timeout
            value: "15"
          - name: attempts
            value: "3"
          # Use TCP to resolve DNS instead of UDP (UDP is lossy and pods need to wait for a timeout for lost packets)
          - name: use-vc
          # Open a new socket for retrying
          - name: single-request-reopen
      #------------------------------------------------------
      volumes:
        # `name` here must match the name specified in the volume mount
        - name: widemo-configmap-wi-api-volume
          configMap:
            # `name` here must match the name specified in the ConfigMap's YAML. See aks_manifests\prerequisites\k8s.yaml
            name: widemo-configmap
      terminationGracePeriodSeconds: 90 # This must be set to a value that is greater than the preStop hook wait time.
      containers:
        - name: wi-api
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "60"]
          image: chdemosharedacr.azurecr.io/widemo/wi-api:1.1
          imagePullPolicy: Always
          # Probe to determine startup success
          startupProbe:
            httpGet:
              path: /api/health
              port: container-port
            initialDelaySeconds: 30 # Give 30 seconds for the container to start before checking health
            failureThreshold: 30 # Max 300 (30*10) seconds wait for start up to succeed
            periodSeconds: 10 # Interval of probe
            successThreshold: 1 # How many consecutive success probes to consider as success
            timeoutSeconds: 10 # Probe timeout
            terminationGracePeriodSeconds: 30 # Restarts container (default restart policy is Always)
          # A readiness probe failure will not restart the container but cuts off traffic to the container
          # with one failure as specified below, and keeps readiness probes running to see if the container works again.
          readinessProbe: # Probe to determine if the container is ready for traffic (used by AGIC)
            httpGet:
              path: /api/health
              port: container-port
            failureThreshold: 1 # One readiness failure should stop traffic to the container
            periodSeconds: 20 # Interval of probe
            # successThreshold not supported by AGIC
            timeoutSeconds: 10 # Probe timeout
          # Probe to determine whether the container is healthy; if not healthy the container will restart
          livenessProbe:
            httpGet:
              path: /api/health
              port: container-port
            failureThreshold: 3 # Tolerates three consecutive failures before a restart triggers
            periodSeconds: 40 # Interval of probe
            successThreshold: 1 # How many consecutive success probes to consider as success after a failure probe
            timeoutSeconds: 10 # Probe timeout
            terminationGracePeriodSeconds: 60 # Restarts container (default restart policy is Always)
          volumeMounts:
            - mountPath: /etc/config
              name: widemo-configmap-wi-api-volume
          ports:
            - name: container-port
              containerPort: 80
              protocol: TCP
          env:
            - name: ASPNETCORE_URLS
              value: http://+:80
            - name: ASPNETCORE_ENVIRONMENT
              value: Production
            - name: CH_WIDEMO_CONFIG
              value: /etc/config/config_dev-euw-001.json
          resources:
            limits:
              memory: 1Gi # The memory limit equals the request!
              # No CPU limit! This is excluded on purpose.
            requests:
              memory: 1Gi
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: wi-api-clusterip
  namespace: widemo
  labels:
    app: wi-api
    service: wi-api
spec:
  type: ClusterIP
  ports:
    - port: 8091
      targetPort: 80
      protocol: TCP
  selector:
    service: wi-api
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wi-api
  namespace: widemo
  annotations:
    # --------------
    # AGIC
    appgw.ingress.kubernetes.io/connection-draining: "true"
    appgw.ingress.kubernetes.io/connection-draining-timeout: "120"
    appgw.ingress.kubernetes.io/use-private-ip: "true"
    appgw.ingress.kubernetes.io/request-timeout: "30"
    # --------------
spec:
  ingressClassName: azure-application-gateway
  rules:
    - host: wi-api.aksblue.ch-wi-dev-euw-001.net
      http:
        paths:
          - path: /*
            pathType: Prefix
            backend:
              service:
                name: wi-api-clusterip
                port:
                  number: 8091
When our application pods are running, the containers are injected with the environment variables shown below. This allows our application to authenticate via the user assigned managed identity as explained here.
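These are the variables the Azure Workload Identity mutating admission webhook injects (names per the Azure Workload Identity documentation; the client ID and tenant ID values come from the service account annotations):

```
AZURE_CLIENT_ID=<client ID of the user assigned identity, from the service account annotation>
AZURE_TENANT_ID=<tenant ID, from the service account annotation>
AZURE_FEDERATED_TOKEN_FILE=/var/run/secrets/azure/tokens/azure-identity-token
AZURE_AUTHORITY_HOST=https://login.microsoftonline.com/
```

DefaultAzureCredential picks these up automatically and exchanges the projected service account token for an AAD token via the federated identity credential.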
So with this setup we can use code such as below to load app configuration, with key vault access, in our apps running in AKS using workload identity. The app configuration endpoint is just the endpoint without any secret or connection information; for example, https://ch-wi-dev-euw-001-appconfig-ac.azconfig.io is enough to enable access to app config, as we are using default credentials now.
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Configuration.AzureAppConfiguration;

namespace common.lib.Configs
{
    public class ConfigLoader
    {
        public static void LoadConfiguration(IConfigurationBuilder configBuilder)
        {
            configBuilder.AddJsonFile(Environment.GetEnvironmentVariable("CH_WIDEMO_CONFIG"));
            var config = configBuilder.Build();

            string? appConfigEndpoint = config.GetSection("AppConfigEndpoint").Value;
            string? appConfigLabel = config.GetSection("AppConfigLabel").Value;
            string? sharedAppConfigLabel = config.GetSection("SharedAppConfiglabel").Value;
            string? keyVaultName = config.GetSection("KeyVaultName").Value;
            string? aadTenantId = config.GetSection("AadTenantId").Value;

            // Load configuration from Azure App Configuration
            configBuilder.AddAzureAppConfiguration(options =>
            {
                DefaultAzureCredential azureCredentials = new();

                options.Connect(new Uri(appConfigEndpoint), azureCredentials);

                options
                    .Select(KeyFilter.Any, sharedAppConfigLabel)
                    .Select(KeyFilter.Any, appConfigLabel);

                SecretClient secretClient = new(
                    new Uri($"https://{keyVaultName}.vault.azure.net/"),
                    azureCredentials);

                options.ConfigureKeyVault(kv => kv.Register(secretClient));
            });

            configBuilder.Build();
        }
    }
}
This is possible because in Terraform we can grant the necessary permissions to the user assigned managed identity, as shown below.
resource "azurerm_key_vault" "instancekeyvault" {
  name                = "${var.PREFIX}-${var.PROJECT}-${var.ENVNAME}-kv"
  location            = azurerm_resource_group.instancerg.location
  resource_group_name = azurerm_resource_group.instancerg.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  enabled_for_deployment      = false
  enabled_for_disk_encryption = false
  purge_protection_enabled    = false # Allow purge for drop and create in demos. Otherwise this should be set to true.

  network_acls {
    bypass                     = "AzureServices"
    default_action             = "Deny"
    ip_rules                   = ["xxx.xxx.xxx.xxx/32", "${chomp(data.http.mytfip.response_body)}/32"]
    virtual_network_subnet_ids = [azurerm_subnet.aks.id]
  }

  # Sub owners
  access_policy {
    tenant_id = var.TENANTID
    object_id = data.azuread_group.sub_owners.object_id

    key_permissions         = ["Get", "Purge", "Recover"]
    secret_permissions      = ["Get", "List", "Set", "Delete", "Purge", "Recover"]
    certificate_permissions = ["Create", "Get", "Import", "List", "Update", "Delete", "Purge", "Recover"]
  }

  # Infra deployment service principal
  access_policy {
    tenant_id = data.azurerm_client_config.current.tenant_id
    object_id = data.azurerm_client_config.current.object_id

    key_permissions         = ["Get", "Purge", "Recover"]
    secret_permissions      = ["Get", "List", "Set", "Delete", "Purge", "Recover"]
    certificate_permissions = ["Create", "Get", "Import", "List", "Update", "Delete", "Purge", "Recover"]
  }

  # Containers in AKS via user assigned identity
  access_policy {
    tenant_id = var.TENANTID
    object_id = azurerm_user_assigned_identity.aks.principal_id # principal_id is the object ID of the user assigned identity

    secret_permissions = ["Get", "List"]
  }

  tags = merge(tomap({ Service = "key_vault" }), local.tags)
}
We can access storage blobs via default credentials as well, as shown below.
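A minimal sketch of reading a blob with DefaultAzureCredential is below. The storage account, container and blob names are placeholders, and the identity is assumed to hold a suitable role (for example, Storage Blob Data Reader) on the storage account.

```
using Azure.Identity;
using Azure.Storage.Blobs;

// Account endpoint only - no keys or connection strings; workload identity supplies the token.
BlobServiceClient blobServiceClient = new(
    new Uri("https://chwidemostorage.blob.core.windows.net"), // placeholder account
    new DefaultAzureCredential());

BlobClient blobClient = blobServiceClient
    .GetBlobContainerClient("demo-container") // placeholder container
    .GetBlobClient("demo.txt");               // placeholder blob

var content = await blobClient.DownloadContentAsync();
Console.WriteLine(content.Value.Content.ToString());
```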
Or storage queue as shown below.
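A sketch along the same lines for a storage queue is below. Again, the account and queue names are placeholders, and the identity is assumed to hold the Storage Queue Data Contributor role on the storage account.

```
using Azure.Identity;
using Azure.Storage.Queues;

// Queue endpoint only - DefaultAzureCredential uses the workload identity.
QueueClient queueClient = new(
    new Uri("https://chwidemostorage.queue.core.windows.net/demo-queue"), // placeholder account and queue
    new DefaultAzureCredential());

await queueClient.SendMessageAsync("hello from workload identity");
var message = await queueClient.ReceiveMessageAsync();
Console.WriteLine(message.Value?.MessageText);
```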
These are only a few examples. With workload identity enabled, your containers deployed to AKS can access any Azure resource with DefaultAzureCredential, securely using a managed identity. This is a far more secure approach than having to store connection strings, secrets etc. for your applications and having to pass that secret information around between your application components.