Creating scalable GitHub Actions runners in AKS:

  1. Install cert-manager:

helm repo add jetstack https://charts.jetstack.io

helm repo update

kubectl apply --validate=false -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.crds.yaml

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.0   # use the latest stable version

kubectl get pods --namespace cert-manager
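
Optionally, wait for the cert-manager deployments to become Available before installing ARC, so the controller install does not race the cert-manager webhook. This is a minimal sketch; the deployment names assume the default "cert-manager" Helm release name used above:

# deployment names assume the default "cert-manager" Helm release name
kubectl -n cert-manager wait --for=condition=Available \
  deployment/cert-manager deployment/cert-manager-webhook deployment/cert-manager-cainjector \
  --timeout=180s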

  2. Install Actions Runner Controller (ARC) with Helm:

helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller

helm install --namespace actions-runner-system --create-namespace \
  --set=authSecret.create=true \
  --set=authSecret.github_token="<GITHUB_PAT>" \
  --wait \
  actions-runner-controller actions-runner-controller/actions-runner-controller

kubectl get pods -n actions-runner-system
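
If you prefer authenticating ARC with a GitHub App instead of a personal access token (for example to avoid PAT rate limits), the chart can build the auth secret from App credentials. This is a hedged sketch: APP_ID, INSTALLATION_ID, and PRIVATE_KEY_FILE_PATH are placeholders you supply, and the authSecret value names assume the summerwind chart's documented defaults, so verify them against the chart's values.yaml for your version:

# APP_ID, INSTALLATION_ID and PRIVATE_KEY_FILE_PATH are placeholders;
# the authSecret.* value names assume the chart's documented defaults
helm install --namespace actions-runner-system --create-namespace \
  --set=authSecret.create=true \
  --set=authSecret.github_app_id="${APP_ID}" \
  --set=authSecret.github_app_installation_id="${INSTALLATION_ID}" \
  --set-file=authSecret.github_app_private_key="${PRIVATE_KEY_FILE_PATH}" \
  --wait \
  actions-runner-controller actions-runner-controller/actions-runner-controller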

  3. Create runner-deployment.yaml:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
  namespace: actions-runner-system
spec:
  template:
    spec:
      ephemeral: true
      repository: <owner>/<repository>
      group: ""
      labels:
        - self-hosted
      env: []
      nodeSelector:                  # optional, if you want to pin runners to a certain node pool
        agentpool: <nodepool-name>   # AKS exposes the node pool name via the "agentpool" node label
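
If a runner flavor needs guaranteed CPU or memory, the runner pod spec also accepts standard Kubernetes resource requests and limits. A minimal sketch of the relevant fragment; the values are illustrative placeholders, not sizing recommendations:

spec:
  template:
    spec:
      ephemeral: true
      repository: <owner>/<repository>
      # standard Kubernetes requests/limits for the runner container
      # (values below are illustrative placeholders, not recommendations)
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi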


apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-autoscaler
spec:
  # scaleDownDelaySecondsAfterScaleOut: 300
  scaleTargetRef:
    kind: RunnerDeployment
    name: example-runnerdeploy   # same name as the RunnerDeployment above
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: '0.75'
      scaleDownThreshold: '0.25'
      scaleUpFactor: '2'
      scaleDownFactor: '0.5'
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - <owner>/<repository>

  4. Apply the deployment:

kubectl apply -f runner-deployment.yaml -n actions-runner-system

  5. Check that the minimum number of replicas is coming up:

kubectl get runnerdeployments -n actions-runner-system

kubectl get runners -n actions-runner-system
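
You can also inspect the autoscaler itself to confirm the min/max and desired replica counts it is reporting. A small sketch; "hra" is the short name the summerwind CRD registers, so fall back to the full resource name if your version does not recognize it:

# "hra" is the CRD short name; use the full resource name if it is not recognized
kubectl get hra -n actions-runner-system
kubectl get horizontalrunnerautoscalers -n actions-runner-system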


APPLIES EXCLUSIVELY IF THERE IS A PREVIOUS INSTALLATION OF ARC IN THE AKS CLUSTER

If there has been a previous deployment, delete the leftover namespace, webhooks, and CRDs using the commands below. Start with the namespace:

kubectl delete namespace actions-runner-system

If the deletion does not proceed and the namespace is stuck in the "Terminating" state, remove the finalizers associated with the namespace:

kubectl get namespace actions-runner-system -o json | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/actions-runner-system/finalize" -f -

Check for validatingwebhookconfigurations, mutatingwebhookconfigurations, and CRDs associated with the actions-runner-controller or cert-manager, and delete them before proceeding with a new installation:

kubectl get validatingwebhookconfigurations | grep actions
kubectl get mutatingwebhookconfigurations | grep actions
kubectl get crd | grep actions

then delete them as shown below:

kubectl delete validatingwebhookconfigurations actions-runner-controller-validating-webhook-configuration
kubectl delete mutatingwebhookconfigurations actions-runner-controller-mutating-webhook-configuration
kubectl delete crd horizontalrunnerautoscalers.actions.summerwind.dev
kubectl delete crd runnerdeployments.actions.summerwind.dev
kubectl delete crd runnerreplicasets.actions.summerwind.dev
kubectl delete crd runners.actions.summerwind.dev
kubectl delete crd runnersets.actions.summerwind.dev
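
Equivalently, if you prefer not to list the CRDs by hand, you can delete everything in the summerwind.dev API group in one pass. A sketch; review the grep output before running this against a shared cluster:

# deletes every CRD in the actions.summerwind.dev group; review the match first
# (-r is GNU xargs: skip running kubectl delete if grep finds nothing)
kubectl get crd -o name | grep 'actions.summerwind.dev' | xargs -r kubectl delete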

Note: In my case, the CRD runners.actions.summerwind.dev wasn't getting deleted. I had to remove the finalizers for that particular CRD using this command:

kubectl patch crd/runners.actions.summerwind.dev -p '{"metadata":{"finalizers":[]}}' --type=merge

END

Creating multiple runners with different configurations based on their hardware requirements on the same AKS cluster:

  1. Create RunnerDeployments:

In your runner-deployment.yaml file (or in separate files), create multiple RunnerDeployment resources, each targeting a different node pool via the nodeSelector field:


apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: cpu-optimized-runners
  namespace: actions-runner-system
spec:
  template:
    spec:
      ephemeral: true
      repository: <owner>/<repository>
      labels:
        - cpu-optimized
      nodeSelector:
        node-pool: cpu-optimized


apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: gpu-enabled-runners
  namespace: actions-runner-system
spec:
  template:
    spec:
      ephemeral: true
      repository: <owner>/<repository>
      labels:
        - gpu-enabled
      nodeSelector:
        node-pool: gpu-enabled


apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: high-memory-runners
  namespace: actions-runner-system
spec:
  template:
    spec:
      ephemeral: true
      repository: <owner>/<repository>
      labels:
        - high-memory
      nodeSelector:
        node-pool: high-memory
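
The node-pool: <name> selectors above assume each AKS node pool carries a matching Kubernetes node label. A hedged sketch of creating such a pool with the Azure CLI; the resource group, cluster name, pool name, VM size, and label key are example values to adapt:

# example values throughout; --labels attaches Kubernetes node labels to the pool's nodes
az aks nodepool add \
  --resource-group <resource-group> \
  --cluster-name <aks-cluster> \
  --name gpuenabled \
  --node-count 1 \
  --node-vm-size Standard_NC6s_v3 \
  --labels node-pool=gpu-enabled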

  2. Create HorizontalRunnerAutoscalers:

For each RunnerDeployment, create a corresponding HorizontalRunnerAutoscaler to scale the runners automatically based on workflow demand:

apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: cpu-optimized-autoscaler
spec:
  scaleTargetRef:
    name: cpu-optimized-runners
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: '0.75'
      scaleDownThreshold: '0.25'
      scaleUpFactor: '2'
      scaleDownFactor: '0.5'
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - <owner>/<repository>


# Similar HorizontalRunnerAutoscaler configurations for gpu-enabled-runners and high-memory-runners

  3. Use labels in GitHub Actions workflows:

In your GitHub Actions workflows, specify the appropriate labels to target the specific runner types:

jobs:
  build:
    runs-on: [self-hosted, cpu-optimized]
  test:
    runs-on: [self-hosted, gpu-enabled]
  deploy:
    runs-on: [self-hosted, high-memory]
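
For context, a minimal complete workflow file using one of these labels might look like the sketch below; the workflow name, trigger, and steps are illustrative:

# .github/workflows/build.yml -- illustrative example; adapt the trigger and steps
name: build-on-self-hosted
on: [push]

jobs:
  build:
    runs-on: [self-hosted, cpu-optimized]
    steps:
      - uses: actions/checkout@v4
      - run: echo "running on a cpu-optimized self-hosted runner"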