Azure Workload Identity allows your containers in AKS to use a managed identity to access Azure resources securely, without having to depend on connection strings, passwords, access keys or secrets. In other words, you can just use DefaultAzureCredential in your containers running in AKS, which will use the workload identity assigned to the container to get access to the required Azure resource. Role-based access permissions will be in effect, and the user assigned managed identity (an AD app registration can be used as well, but a user assigned managed identity is recommended) used to set up the workload identity in AKS should be given the necessary roles in the target Azure resource. This is far better than having to store secrets or connection strings to be utilized by the dotnet applications. In this post let's understand how to set up workload identity for containers deployed in AKS, and explore how it simplifies the dotnet application code, allowing the application to access Azure resources securely with a managed identity.
The first step in setting up workload identity in AKS is to enable the OIDC (OpenID Connect) issuer and workload identity. In the Terraform resource azurerm_kubernetes_cluster set:
oidc_issuer_enabled = true
workload_identity_enabled = true
An example AKS cluster Terraform definition is below.
resource "azurerm_kubernetes_cluster" "aks_cluster" {
  lifecycle {
    ignore_changes = [default_node_pool[0].node_count]
  }
  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}"
  kubernetes_version  = local.kubernetes_version
  sku_tier            = "Standard"
  location            = var.location
  resource_group_name = var.rg_name
  dns_prefix          = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}-dns"
  node_resource_group = "${var.prefix}-${var.project}-${var.environment_name}-aks-${var.deployment_name}-rg"

  image_cleaner_enabled        = false # As this is a preview feature keep it disabled for now. Once the feature is GA, it should be enabled.
  image_cleaner_interval_hours = 48

  network_profile {
    network_plugin    = "azure"
    load_balancer_sku = "standard"
  }

  storage_profile {
    file_driver_enabled = true
  }

  default_node_pool {
    name                 = "chlinux"
    orchestrator_version = local.kubernetes_version
    node_count           = 1
    enable_auto_scaling  = true
    min_count            = 1
    max_count            = 7
    vm_size              = "Standard_DS4_v2"
    os_sku               = "Ubuntu"
    vnet_subnet_id       = var.subnet_id
    max_pods             = 30
    type                 = "VirtualMachineScaleSets"
    scale_down_mode      = "Delete"
    zones                = ["1", "2", "3"]
  }

  oidc_issuer_enabled       = true # Enables the OpenID Connect issuer URL, to be used in the federated identity credential
  workload_identity_enabled = true # Enable workload identity in AKS

  identity {
    type = "SystemAssigned"
  }

  ingress_application_gateway {
    gateway_id = azurerm_application_gateway.aks.id
  }

  key_vault_secrets_provider {
    secret_rotation_enabled = false
  }

  azure_active_directory_role_based_access_control {
    azure_rbac_enabled = false
    managed            = true
    tenant_id          = var.tenant_id
    # add sub owners as cluster admin
    admin_group_object_ids = [var.sub_owners_objectid] # Azure AD group object ID
  }

  oms_agent {
    log_analytics_workspace_id = var.log_analytics_workspace_id
  }

  depends_on = [
    azurerm_application_gateway.aks
  ]

  tags = merge(tomap({
    Service = "aks_cluster"
  }), var.tags)
}
As the next step we have to set up a user assigned managed identity, which will be used as the workload identity in AKS.
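A minimal sketch of that identity resource, assuming the naming variables used elsewhere in this post (the resource name azurerm_user_assigned_identity.aks matches the reference used later in the key vault access policy; the name pattern itself is an assumption):

```hcl
# User assigned managed identity to be used as the workload identity in AKS
resource "azurerm_user_assigned_identity" "aks" {
  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-uai-${var.deployment_name}" # assumed naming pattern
  resource_group_name = var.rg_name
  location            = var.location
}
```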
Then we need to set up a federated identity credential for the user assigned identity and the AKS service account, which will be used to assign the identity to each pod. We need to use the OIDC issuer URL of the AKS cluster in the federated identity credential setup. To understand the concepts in detail, read the Azure workload identity docs.
# Federated identity credential for AKS user assigned identity - to be used with the workload identity service account
resource "azurerm_federated_identity_credential" "aks" {
  name                = "${var.prefix}-${var.project}-${var.environment_name}-aks-fic-${var.deployment_name}"
  resource_group_name = var.rg_name
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.aks_cluster.oidc_issuer_url
  parent_id           = var.user_assigned_identity
  subject             = "system:serviceaccount:widemo:wi-demo-sa" # system:serviceaccount:aksapplicationnamespace:workloadidentityserviceaccountname
  depends_on = [
    azurerm_kubernetes_cluster.aks_cluster
  ]
}
We can use a Terraform output to obtain the client id of the user assigned managed identity we created. This is required later, when creating the service account.
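For example, assuming the identity resource is named azurerm_user_assigned_identity.aks, an output such as the following (the output name aks_uai_client_id is taken from the service account comment later in this post) exposes the client id:

```hcl
output "aks_uai_client_id" {
  description = "Client id of the AKS workload identity, used in the service account annotation"
  value       = azurerm_user_assigned_identity.aks.client_id
}
```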
Once the Terraform code is deployed and the AKS cluster is created, we can use kubectl to create the service account as shown below.
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: userassignedidentityclientid # ${USER_ASSIGNED_CLIENT_ID} # user assigned identity client ID (aks_uai_client_id output from Terraform)
    azure.workload.identity/tenant-id: tenantid # ${AZURE_TENANT_ID} # Azure tenant id
    # azure.workload.identity/service-account-token-expiration: "3600" # Default is 3600. Supported range is 3600-86400. Configure to avoid downtime in token refresh. Setting in Pod spec takes precedence.
  name: wi-demo-sa
  namespace: widemo
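Assuming the manifest above is saved as wi-demo-sa.yaml (hypothetical file name), it can be applied and verified with:

```shell
# Create the namespace and service account, then verify the annotations
kubectl create namespace widemo
kubectl apply -f wi-demo-sa.yaml
kubectl get serviceaccount wi-demo-sa -n widemo -o yaml
```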
We can deploy our application pods enabling use of workload identity, as shown below in the pod template.
  template:
    metadata:
      labels:
        app: wi-api
        service: wi-api
        azure.workload.identity/use: "true" # Required to make the containers in the pod use the workload identity
      # annotations:
      #   azure.workload.identity/service-account-token-expiration: "3600" # Configure to avoid downtime in token refresh. Takes precedence over the service account setting. Default 3600, acceptable range: 3600 - 86400 seconds.
      #   azure.workload.identity/skip-containers: "container1;container2" # Containers to skip using workload identity. By default all containers in the pod will use workload identity when the pod is labeled with azure.workload.identity/use: true
      #   azure.workload.identity/inject-proxy-sidecar: "true" # Default true. The proxy sidecar is used to intercept token requests to IMDS (Azure Instance Metadata Service) and acquire an AAD token on behalf of the user with the federated identity credential.
      #   azure.workload.identity/proxy-sidecar-port: "8000" # Port of the proxy sidecar. Default 8000
    spec:
      serviceAccountName: wi-demo-sa # Service account (see aks_manifests\prerequisites\k8s.yaml) will provide identity to the pod https://azure.github.io/azure-workload-identity/docs/concepts.html
A full example deployment is below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wi-api
  namespace: widemo
  labels:
    app: wi-api
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 25%
  minReadySeconds: 0
  selector:
    matchLabels:
      service: wi-api
  template:
    metadata:
      labels:
        app: wi-api
        service: wi-api
        azure.workload.identity/use: "true" # Required to make the containers in the pod use the workload identity
      # annotations:
      #   azure.workload.identity/service-account-token-expiration: "3600" # Configure to avoid downtime in token refresh. Takes precedence over the service account setting. Default 3600, acceptable range: 3600 - 86400 seconds.
      #   azure.workload.identity/skip-containers: "container1;container2" # Containers to skip using workload identity. By default all containers in the pod will use workload identity when the pod is labeled with azure.workload.identity/use: true
      #   azure.workload.identity/inject-proxy-sidecar: "true" # Default true. The proxy sidecar is used to intercept token requests to IMDS (Azure Instance Metadata Service) and acquire an AAD token on behalf of the user with the federated identity credential.
      #   azure.workload.identity/proxy-sidecar-port: "8000" # Port of the proxy sidecar. Default 8000
    spec:
      serviceAccountName: wi-demo-sa # Service account (see aks_manifests\prerequisites\k8s.yaml) will provide identity to the pod https://azure.github.io/azure-workload-identity/docs/concepts.html
      nodeSelector:
        "kubernetes.io/os": linux
      priorityClassName: widemo-highest-priority-linux
      #------------------------------------------------------
      # setting pod DNS policies to enable faster DNS resolution
      # https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
      dnsConfig:
        options:
          # use FQDN everywhere
          # any cluster local access from pods needs the full CNAME to resolve
          # short names will not resolve to internal cluster domains
          - name: ndots
            value: "2"
          # dns resolver timeout and attempts
          - name: timeout
            value: "15"
          - name: attempts
            value: "3"
          # use TCP to resolve DNS instead of using UDP (UDP is lossy and pods need to wait for timeout for lost packets)
          - name: use-vc
          # open new socket for retrying
          - name: single-request-reopen
      #------------------------------------------------------
      volumes:
        # `name` here must match the name
        # specified in the volume mount
        - name: widemo-configmap-wi-api-volume
          configMap:
            # `name` here must match the name
            # specified in the ConfigMap's YAML. See aks_manifests\prerequisites\k8s.yaml
            name: widemo-configmap
      terminationGracePeriodSeconds: 90 # This must be set to a value that is greater than the preStop hook wait time.
      containers:
        - name: wi-api
          lifecycle:
            preStop:
              exec:
                command: ["sleep","60"]
          image: chdemosharedacr.azurecr.io/widemo/wi-api:1.1
          imagePullPolicy: Always
          # probe to determine the startup success
          startupProbe:
            httpGet:
              path: /api/health
              port: container-port
            initialDelaySeconds: 30 # give 30 seconds to get container started before checking health
            failureThreshold: 30 # max 300 (30*10) seconds wait for start up to succeed
            periodSeconds: 10 # interval of probe
            successThreshold: 1 # how many consecutive success probes to consider as success
            timeoutSeconds: 10 # probe timeout
            terminationGracePeriodSeconds: 30 # restarts container (default restart policy is always)
          # readiness probe fail will not restart container but cut off traffic to container with one failure
          # as specified below and keep readiness probes running to see if container works again
          readinessProbe: # probe to determine if the container is ready for traffic (used by AGIC)
            httpGet:
              path: /api/health
              port: container-port
            failureThreshold: 1 # one readiness fail should stop traffic to container
            periodSeconds: 20 # interval of probe
            # successThreshold not supported by AGIC
            timeoutSeconds: 10 # probe timeout
          # probe to determine the container is healthy and if not healthy container will restart
          livenessProbe:
            httpGet:
              path: /api/health
              port: container-port
            failureThreshold: 3 # tolerates three consecutive failures before restart trigger
            periodSeconds: 40 # interval of probe
            successThreshold: 1 # how many consecutive success probes to consider as success after a failure probe
            timeoutSeconds: 10 # probe timeout
            terminationGracePeriodSeconds: 60 # restarts container (default restart policy is always)
          volumeMounts:
            - mountPath: /etc/config
              name: widemo-configmap-wi-api-volume
          ports:
            - name: container-port
              containerPort: 80
              protocol: TCP
          env:
            - name: ASPNETCORE_URLS
              value: http://+:80
            - name: ASPNETCORE_ENVIRONMENT
              value: Production
            - name: CH_WIDEMO_CONFIG
              value: /etc/config/config_dev-euw-001.json
          resources:
            limits:
              memory: 1Gi # the memory limit equals the request!
              # no cpu limit! this is excluded on purpose
            requests:
              memory: 1Gi
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: wi-api-clusterip
  namespace: widemo
  labels:
    app: wi-api
    service: wi-api
spec:
  type: ClusterIP
  ports:
    - port: 8091
      targetPort: 80
      protocol: TCP
  selector:
    service: wi-api
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wi-api
  namespace: widemo
  annotations:
    # --------------
    # AGIC
    appgw.ingress.kubernetes.io/connection-draining: "true"
    appgw.ingress.kubernetes.io/connection-draining-timeout: "120"
    appgw.ingress.kubernetes.io/use-private-ip: "true"
    appgw.ingress.kubernetes.io/request-timeout: "30"
    # --------------
spec:
  ingressClassName: azure-application-gateway
  rules:
    - host: wi-api.aksblue.ch-wi-dev-euw-001.net
      http:
        paths:
          - path: /*
            pathType: Prefix
            backend:
              service:
                name: wi-api-clusterip
                port:
                  number: 8091
When our application pods are running, the containers are injected with environment variables such as AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_FEDERATED_TOKEN_FILE and AZURE_AUTHORITY_HOST. These allow our application to authenticate via the user assigned managed identity, as explained in the Azure workload identity documentation.
So with this setup we can use code such as below to load app configuration, with key vault access, for our apps running in AKS using workload identity. The app configuration endpoint is just the endpoint, without any secret or connection information; for example https://ch-wi-dev-euw-001-appconfig-ac.azconfig.io is enough to enable access to app config, as we are using default credentials now.
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Configuration.AzureAppConfiguration;

namespace common.lib.Configs
{
    public class ConfigLoader
    {
        public static void LoadConfiguration(IConfigurationBuilder configBuilder)
        {
            configBuilder.AddJsonFile(Environment.GetEnvironmentVariable("CH_WIDEMO_CONFIG"));
            var config = configBuilder.Build();

            string? appConfigEndpoint = config.GetSection("AppConfigEndpoint").Value;
            string? appConfigLabel = config.GetSection("AppConfigLabel").Value;
            string? sharedAppConfiglabel = config.GetSection("SharedAppConfiglabel").Value;
            string? keyVaultName = config.GetSection("KeyVaultName").Value;
            string? aadTenantId = config.GetSection("AadTenantId").Value;

            // Load configuration from Azure App Configuration
            configBuilder.AddAzureAppConfiguration(options =>
            {
                DefaultAzureCredential azureCredentials = new();
                options.Connect(new Uri(appConfigEndpoint), azureCredentials);
                options
                    .Select(KeyFilter.Any, sharedAppConfiglabel)
                    .Select(KeyFilter.Any, appConfigLabel);

                SecretClient secretClient = new(
                    new Uri($"https://{keyVaultName}.vault.azure.net/"),
                    azureCredentials);
                options.ConfigureKeyVault(kv => kv.Register(secretClient));
            });
            configBuilder.Build();
        }
    }
}
This is possible because in Terraform we can grant the necessary permissions to the user assigned managed identity, as shown below.
resource "azurerm_key_vault" "instancekeyvault" {
  name                = "${var.PREFIX}-${var.PROJECT}-${var.ENVNAME}-kv"
  location            = azurerm_resource_group.instancerg.location
  resource_group_name = azurerm_resource_group.instancerg.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  enabled_for_deployment      = false
  enabled_for_disk_encryption = false
  purge_protection_enabled    = false # Allow purge for drop and create in demos; otherwise this should be set to true

  network_acls {
    bypass         = "AzureServices"
    default_action = "Deny"
    ip_rules       = ["xxx.xxx.xxx.xxx/32", "${chomp(data.http.mytfip.response_body)}/32"]
    virtual_network_subnet_ids = [
      azurerm_subnet.aks.id
    ]
  }

  # Sub Owners
  access_policy {
    tenant_id               = var.TENANTID
    object_id               = data.azuread_group.sub_owners.object_id
    key_permissions         = ["Get", "Purge", "Recover"]
    secret_permissions      = ["Get", "List", "Set", "Delete", "Purge", "Recover"]
    certificate_permissions = ["Create", "Get", "Import", "List", "Update", "Delete", "Purge", "Recover"]
  }

  # Infra Deployment Service Principal
  access_policy {
    tenant_id               = data.azurerm_client_config.current.tenant_id
    object_id               = data.azurerm_client_config.current.object_id
    key_permissions         = ["Get", "Purge", "Recover"]
    secret_permissions      = ["Get", "List", "Set", "Delete", "Purge", "Recover"]
    certificate_permissions = ["Create", "Get", "Import", "List", "Update", "Delete", "Purge", "Recover"]
  }

  # Containers in AKS via user assigned identity
  access_policy {
    tenant_id          = var.TENANTID
    object_id          = azurerm_user_assigned_identity.aks.principal_id # principal_id is the object id of the user assigned identity
    secret_permissions = ["Get", "List"]
  }

  tags = merge(tomap({
    Service = "key_vault",
  }), local.tags)
}
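Azure App Configuration uses Azure RBAC rather than access policies, so the identity also needs a role assignment there. A sketch, assuming a hypothetical azurerm_app_configuration resource named appconfig:

```hcl
# Allow the workload identity to read configuration data from App Configuration
resource "azurerm_role_assignment" "aks_appconfig_data_reader" {
  scope                = azurerm_app_configuration.appconfig.id # hypothetical resource name
  role_definition_name = "App Configuration Data Reader"
  principal_id         = azurerm_user_assigned_identity.aks.principal_id
}
```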
We can access storage blobs via default credentials as well, as shown below.
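A minimal sketch, assuming a hypothetical storage account chwidemostorage with the identity granted the Storage Blob Data Reader role on it:

```csharp
using Azure.Identity;
using Azure.Storage.Blobs;

// DefaultAzureCredential picks up the workload identity injected into the pod
BlobServiceClient blobServiceClient = new(
    new Uri("https://chwidemostorage.blob.core.windows.net"), // hypothetical account
    new DefaultAzureCredential());

BlobClient blobClient = blobServiceClient
    .GetBlobContainerClient("demo-container") // hypothetical container
    .GetBlobClient("demo.txt");               // hypothetical blob

var content = await blobClient.DownloadContentAsync();
Console.WriteLine(content.Value.Content.ToString());
```

Note that no account key or SAS token appears anywhere; only the endpoint URL is needed.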
Or a storage queue, as shown below.
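Similarly for queues, a sketch assuming the identity has the Storage Queue Data Contributor role (hypothetical account and queue names):

```csharp
using Azure.Identity;
using Azure.Storage.Queues;

// No connection string or account key - just the queue endpoint and default credentials
QueueClient queueClient = new(
    new Uri("https://chwidemostorage.queue.core.windows.net/demo-queue"), // hypothetical
    new DefaultAzureCredential());

await queueClient.SendMessageAsync("Hello from workload identity");
```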
Above are only a few examples. With workload identity enabled, your containers deployed to AKS can access any Azure resource with DefaultAzureCredential, securely using a managed identity. This is a far more secure approach than having to store connection strings, secrets etc. for your application and having to pass that secret information around in your application components.