Autoscaling Based on ALB Metrics Using KEDA

TL;DR: KEDA fetches metrics from AWS CloudWatch, so an application can be scaled based on ALB metrics. This helps software run efficiently and smoothly.

a year ago   •   5 min read

By Kamal Acharya, Abhinav Dubey


Autoscaling is one of the key benefits of Kubernetes. It helps reduce resource usage and, with it, the cost of cloud infrastructure: when demand drops, the autoscaling mechanism automatically removes resources to avoid overspending. The number of nodes or pods increases or decreases with the demand on the service.

KEDA is a Kubernetes-based Event Driven Autoscaler. It scales Kubernetes workloads based on the number of events that need to be processed. It is a lightweight, single-purpose component that can be added to a Kubernetes cluster, and it supports many different scalers. In this article, we will autoscale our application using ALB metrics fetched from AWS CloudWatch, and walk through the practical, hands-on setup.

What is an Application Load Balancer?

An Application Load Balancer (ALB) distributes incoming traffic across multiple targets, such as servers or instances. It is typically used to route HTTP and HTTPS requests to specific targets, such as Amazon EC2 instances, containers, and IP addresses.

What are ALB Metrics?

An Application Load Balancer publishes data points to CloudWatch, and CloudWatch lets you retrieve statistics about those data points, known as metrics. These performance and usage metrics are known as ALB metrics.

Some common ALB metrics include:

  1. Request Count: This metric tells us the total number of requests received by the ALB.
  2. HTTP Code Count: This metric tracks the HTTP response codes returned by the ALB, such as 2xx, 3xx, 4xx, and 5xx.
  3. Target Response Time: This metric measures the time taken by the target instances to respond to requests forwarded by the ALB.
  4. Active Connection Count: This metric tracks the number of concurrent TCP connections that are active from clients to the ALB and from the ALB to the target instances.
  5. Target Connection Error Count: This metric counts the number of errors that occur when the ALB tries to establish connections with target instances.
  6. Target Response Error Count: This metric counts the number of errors that occur when the target instances fail to respond to requests forwarded by the ALB.

Now, let's dive into the practical world and set up KEDA for autoscaling based on ALB metrics.

Autoscale based on ALB Metrics using KEDA

To set up autoscaling, we will use KEDA to scale our application on ALB metrics. To execute all the tasks, we will use Devtron, which has a native integration with KEDA (an event-driven autoscaler).

Step-1: Install ALB Controller using Helm Chart
With Devtron's Helm dashboard, you can install any Helm chart and manage it directly from Devtron's intuitive user interface. For the ALB controller, the CRDs can be installed from the Helm chart. To deploy the controller through the chart, navigate to the chart store and search for aws-load-balancer.

ALB chart

Configure the YAML file and choose the cluster where you want to deploy it. To check the status of your chart, you can search for your app among the Helm apps or navigate to the Resource Browser to check the controller pod. To learn more about the Resource Browser, feel free to read this article.
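As a rough illustration of what "configure the YAML file" can mean here, the sketch below shows a few values commonly set for the aws-load-balancer-controller chart. The cluster name, region, and VPC ID are hypothetical placeholders, not values from this setup; check the chart's own values file for the version you install.

```yaml
# Minimal values sketch for the aws-load-balancer-controller chart.
# All concrete values below are hypothetical placeholders.
clusterName: my-eks-cluster        # name of the cluster the controller runs in
region: us-east-1                  # AWS region of the cluster
vpcId: vpc-0123456789abcdef0       # VPC that hosts the cluster
serviceAccount:
  create: true
  name: aws-load-balancer-controller
```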

Pod of ALB Controller

Step-2: Install KEDA controller from Chart Store
Just like the alb-controller chart, you can install the KEDA controller from the chart store and deploy it through Devtron.

KEDA chart

Step-4: Configure the Application
Now let's configure the application where KEDA needs to be used for autoscaling. To learn application deployment with Devtron, feel free to check out how to deploy applications with Devtron.

For this application, you have to enable ingress in the deployment template and configure it according to your requirements. Here's the sample configuration and annotations that we used.

  ingress:
    annotations:
      # annotation values used: <CustomName>, "80", HTTP, internet-facing, <subnet-id>,<subnet-id>, ip, <GroupName>
    className: alb    # Required
    enabled: true
    hosts:
      - host: <YourHostName>
        pathType: Prefix
        paths:
          - /
Ingress configuration
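The annotation values in the configuration above are listed without their keys. As a hedged sketch, they could map onto AWS Load Balancer Controller annotations as shown below. The annotation keys themselves exist in the controller, but which value belongs to which key is an assumption here (in particular for "80" and HTTP), so verify the pairing against your own setup.

```yaml
ingress:
  annotations:
    # Assumed pairing of keys and values; not taken from the original config.
    alb.ingress.kubernetes.io/load-balancer-name: <CustomName>
    alb.ingress.kubernetes.io/healthcheck-port: "80"
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/subnets: <subnet-id>,<subnet-id>
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/group.name: <GroupName>
```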

Step-5: Add your AWS credentials
To add confidential data, you can create secrets directly from Devtron and pass the key-value pairs, as shown in the image below.
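Devtron creates the secret for you from the UI; for orientation, the equivalent Kubernetes object would look roughly like the sketch below. The secret name and the key names are hypothetical choices, and the angle-bracket values are placeholders for your own credentials.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials              # hypothetical secret name
type: Opaque
stringData:
  # Key names are an assumption; use whatever keys you reference from KEDA.
  AWS_ACCESS_KEY_ID: <your-access-key-id>
  AWS_SECRET_ACCESS_KEY: <your-secret-access-key>
```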

Note: Make sure the node group has permission to create a load balancer and to autoscale it.

Step-6: Create KEDA Scaled Object
Now it's time to configure KEDA. KEDA is natively integrated into Devtron's deployment template, so you don't need to worry about managing manifests for it. Just enable the kedaAutoscaling object and pass the necessary configuration. You can also specify the trigger on which you want to autoscale this application; in our case, it's aws-cloudwatch, as you can see in the image below.

KEDA configuration
Note: Make sure you use the correct secret name in the triggerAuthentication of the KEDA object!
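For reference, here is a minimal sketch of the plain KEDA objects that an aws-cloudwatch setup of this kind corresponds to. It is not the exact Devtron template: the object names, the Deployment name demo-app, the Secret name aws-credentials with its keys, the region, and the threshold of 100 are all assumptions made for illustration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-cloudwatch-auth            # hypothetical name
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID
      name: aws-credentials            # assumed Secret name; must match your secret
      key: AWS_ACCESS_KEY_ID
    - parameter: awsSecretAccessKey
      name: aws-credentials
      key: AWS_SECRET_ACCESS_KEY
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: demo-app-alb-scaler            # hypothetical name
spec:
  scaleTargetRef:
    name: demo-app                     # assumed Deployment name
  minReplicaCount: 1
  maxReplicaCount: 5
  pollingInterval: 30                  # how often KEDA checks the trigger, in seconds
  triggers:
    - type: aws-cloudwatch
      metadata:
        namespace: AWS/ApplicationELB
        dimensionName: LoadBalancer
        dimensionValue: app/<CustomName>/<load-balancer-id>
        metricName: RequestCount
        metricStat: Sum
        metricStatPeriod: "60"
        targetMetricValue: "100"       # assumed threshold per replica
        minMetricValue: "0"
        awsRegion: us-east-1           # assumed region
      authenticationRef:
        name: aws-cloudwatch-auth
```

The authenticationRef name has to match the TriggerAuthentication, and the Secret name inside it has to match the secret created earlier; a mismatch in either place is the usual reason the scaler cannot read from CloudWatch.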

You can view your KEDA objects in Custom resources once the application is successfully deployed.

Step-7: Test your HPA by increasing the requests to your application
You will be able to see the number of replicas in your HPA increase with the request load. Run this command to continuously send requests to your load balancer:

  while true; do
    curl <hostname>
  done
Autoscaling resource
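As a rough guide to the replica counts you should see: by default, KEDA hands the metric to the HPA as an average-value target, so the standard HPA rule reduces to dividing the metric by the per-replica target and rounding up. The numbers below are hypothetical (450 requests in the CloudWatch period, a target of 100 per replica), and the real HPA also applies a tolerance and the configured minimum and maximum replica counts.

```latex
\text{desiredReplicas}
  = \left\lceil \frac{\text{metricValue}}{\text{targetMetricValue}} \right\rceil
  = \left\lceil \frac{450}{100} \right\rceil
  = 5
```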


To run efficiently and smoothly, software applications must automatically scale up and down with traffic. Cloud providers offer load balancers to expose applications, and the ALB is one of them. To autoscale on ALB metrics, we use KEDA, which fetches the metrics from AWS CloudWatch and scales the application accordingly.

Handling Kubernetes resources through the command line requires a lot of experience, and debugging and troubleshooting take effort. Devtron provides a Kubernetes dashboard that lets you manage resources without commands, and it offers full-stack observability of resources for easy debugging.
