Introduction to Kubernetes Event-Driven Autoscaling (KEDA)

TL;DR: Explore KEDA (Kubernetes Event-Driven Autoscaling), an effective autoscaling solution for Kubernetes workloads, and learn about its architecture, components, and benefits.

Manual scaling is slowly becoming a thing of the past. Autoscaling is now the norm, and organizations that deploy into Kubernetes clusters get built-in autoscaling features like HPA (Horizontal Pod Autoscaling) and VPA (Vertical Pod Autoscaling). But these solutions have limitations. For example, it is difficult for HPA to scale the number of pods down to zero or to scale pods based on metrics other than CPU or memory usage. KEDA (Kubernetes Event-Driven Autoscaling) was introduced to address some of these challenges in autoscaling K8s workloads.

In this blog, we will delve into KEDA and discuss the following points:
What is KEDA
KEDA architecture and components
KEDA installation and demo
Integrate KEDA in CI/CD pipelines

What is KEDA

KEDA is a lightweight, open-source Kubernetes event-driven autoscaler used by DevOps, SRE, and Ops teams to horizontally scale pods based on external events or triggers. KEDA helps to extend the capability of native Kubernetes autoscaling solutions, which rely on standard resource metrics such as CPU or memory. You can deploy KEDA into a Kubernetes cluster and manage the scaling of pods using custom resource definitions (CRDs).

Tabular comparison between different Kubernetes autoscaling features: VPA, KEDA, and HPA

Built on top of Kubernetes HPA, KEDA scales pods based on information from event sources such as AWS SQS, Kafka, RabbitMQ, etc. These event sources are monitored using scalers, which activate or deactivate deployments based on the rules set for them. KEDA scalers can also feed custom metrics for a specific event source, helping DevOps teams observe metrics relevant to them.

What problems does KEDA solve

KEDA helps SRE and DevOps teams solve a few significant problems:

Freeing up resources and reducing cloud cost:

KEDA scales the number of pods down to zero when there are no events to process. This is harder to achieve with the standard HPA, and it ensures effective resource utilization and cost optimization, ultimately bringing down the cloud bill.
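
For illustration, here is a hedged sketch of how scale-to-zero is configured: setting minReplicaCount to 0 in a ScaledObject lets KEDA remove all pods when the trigger is inactive. The Deployment name and the cron schedule below are hypothetical placeholders, not values from this article's demo.

# Hypothetical sketch: scale-to-zero with KEDA's built-in cron scaler.
# The Deployment name and schedule are placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scale-to-zero-example
spec:
  scaleTargetRef:
    name: my-consumer-deployment   # assumed Deployment name
  minReplicaCount: 0               # allows KEDA to scale the workload down to zero pods
  maxReplicaCount: 5
  triggers:
    - type: cron                   # keep pods up only during business hours
      metadata:
        timezone: Etc/UTC
        start: 0 8 * * *
        end: 0 18 * * *
        desiredReplicas: "2"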

Interoperability with DevOps toolchain:

As of now, KEDA supports 59 built-in scalers and 4 external scalers. External scalers include KEDA HTTP, KEDA Scaler for Oracle DB, etc. Using external events as triggers aids efficient autoscaling, especially for message-driven microservices like payment gateways or order systems. Since KEDA can be extended by developing integrations with any data source, it can easily fit into any DevOps toolchain.
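
To make the message-driven case concrete, below is a hedged ScaledObject sketch using the built-in RabbitMQ scaler. The Deployment name payment-worker, the queue payments, and the RABBITMQ_HOST environment variable are illustrative assumptions, not values from this article's demo.

# Hypothetical sketch: scaling a worker Deployment on RabbitMQ queue depth.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-scaledobject
spec:
  scaleTargetRef:
    name: payment-worker           # assumed Deployment consuming the queue
  triggers:
    - type: rabbitmq
      metadata:
        queueName: payments        # assumed queue to watch
        mode: QueueLength          # scale on the number of waiting messages
        value: "20"                # target messages per replica
        hostFromEnv: RABBITMQ_HOST # AMQP connection string read from the pod's environment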

KEDA interoperability

KEDA architecture and components

As mentioned in the beginning, KEDA and HPA work in tandem to achieve autoscaling. Because of that, KEDA needs only a few components to get started.

KEDA components

Refer to Fig. A and let us explore some of the components of KEDA.

Fig. A - KEDA architecture (source: keda.sh)

Event sources:
These are the external event/trigger sources by which KEDA changes the number of pods. Prometheus, RabbitMQ, and Apache Pulsar are some examples of event sources.

Scalers:
Event sources are monitored using scalers, which fetch metrics and trigger the scaling of Deployments or Jobs based on the events.

Metrics adapter:
The metrics adapter takes metrics from scalers and translates them into a form that the HPA controller can consume; it is registered with Kubernetes as an external metrics server.
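
As a quick, hedged sanity check (not part of the original article), you can confirm that KEDA's metrics adapter has registered itself with the Kubernetes external metrics API:

# Verify that KEDA's metrics adapter is registered as the
# external metrics API server in the cluster.
kubectl get apiservice v1beta1.external.metrics.k8s.io
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"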

Controller:
The controller/operator acts upon the metrics provided by the adapter and brings about the desired deployment state specified in the ScaledObject (refer below).

KEDA CRDs

KEDA offers four custom resources to carry out the autoscaling functions: ScaledObject, ScaledJob, TriggerAuthentication, and ClusterTriggerAuthentication.

ScaledObject and ScaledJob:
ScaledObject represents the mapping between an event source and a workload, and specifies the scaling rules for a Deployment, StatefulSet, or any Custom Resource in a K8s cluster. Similarly, ScaledJob specifies scaling rules for Kubernetes Jobs (a ScaledJob sketch follows the example below).

Below is an example of a ScaledObject that configures KEDA autoscaling based on Prometheus metrics. Here, the Argo Rollout ‘keda-test-demo3’ is scaled based on the trigger threshold (50) for the Prometheus query. KEDA will scale the number of replicas between a minimum of 1 and a maximum of 10, and, since idleReplicaCount is 0, scale down to 0 replicas when there are no events to process.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: demo3
spec:
  scaleTargetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    name: keda-test-demo3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://<prometheus-host>:9090
        metricName: http_request_total
        query: envoy_cluster_upstream_rq{appId="300", cluster_name="300-0", container="envoy", namespace="demo3", response_code="200"}
        threshold: "50"
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 10
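
For comparison, here is a hedged ScaledJob sketch. Instead of scaling a long-running Deployment, KEDA spawns Kubernetes Jobs to drain pending events; the image, queue, and names below are illustrative assumptions, not from this article's demo.

# Hypothetical ScaledJob sketch: Jobs are spawned to process pending events.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: batch-processor-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: example/batch-processor:latest  # assumed worker image
        restartPolicy: Never
  maxReplicaCount: 20            # upper bound on concurrently running Jobs
  triggers:
    - type: rabbitmq             # illustrative trigger; any supported scaler works
      metadata:
        queueName: batch-tasks   # assumed queue
        mode: QueueLength
        value: "5"
        hostFromEnv: RABBITMQ_HOST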

TriggerAuthentication and ClusterTriggerAuthentication:
These resources manage the authentication credentials or secrets that scalers need in order to connect to and monitor event sources; ClusterTriggerAuthentication provides the same capability cluster-wide.
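
As a hedged illustration, the sketch below shows a TriggerAuthentication that pulls a RabbitMQ connection string from a Kubernetes Secret; the Secret name and key are assumptions. A trigger then references it through its authenticationRef field.

# Hypothetical sketch: authentication for a RabbitMQ scaler via a Secret.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
  namespace: demo3
spec:
  secretTargetRef:
    - parameter: host            # trigger metadata parameter to populate
      name: rabbitmq-secret      # assumed Secret holding the connection string
      key: connectionString      # assumed key inside the Secret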

Now let us see how all these KEDA components work together and scale K8s workloads.

How do KEDA components work?

It is easy to deploy KEDA on any Kubernetes cluster, as it doesn’t overwrite or duplicate existing functionality. Once it is deployed and the components are ready, event-based scaling starts with the external event source (refer to Fig. A). The scaler continuously monitors for events from the source defined in the ScaledObject and passes the metrics to the metrics adapter whenever a trigger fires. The metrics adapter then adapts the metrics and serves them to the controller component, which scales the deployment up or down according to the rules set in the ScaledObject.

Note that KEDA activates or deactivates a deployment by scaling the number of replicas to zero or one. It then triggers HPA to scale the number of workloads from one to n based on the cluster resources.
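
This handoff is visible in the cluster: for every ScaledObject, KEDA creates an HPA object (named keda-hpa-<scaledobject-name> in current releases) that performs the one-to-n scaling. A hedged way to inspect both, using the Prometheus example above:

# Inspect the ScaledObject and the HPA that KEDA created for it.
kubectl describe scaledobject prometheus-scaledobject -n demo3
kubectl get hpa -n demo3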

KEDA deployment and demo

KEDA can be deployed in a Kubernetes cluster through Helm charts, operator hub, or YAML declarations. The following method uses Helm to deploy KEDA.

# Add the Helm repo
helm repo add kedacore https://kedacore.github.io/charts

# Update the Helm repo
helm repo update

# Install the KEDA Helm chart
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

To check whether the KEDA operator and metrics API server are up after the deployment, use the following command:

kubectl get pod -n keda

Now, watch the video below to see a hands-on demo of autoscaling using KEDA. The demo uses a small application called TechTalks, with RabbitMQ as the message broker.

Integrate KEDA in CI/CD pipelines

KEDA makes autoscaling of K8s workloads easy and efficient. Its vendor-agnostic approach ensures flexibility in terms of event sources, and it can help DevOps and SRE teams optimize the cost and resource utilization of their Kubernetes clusters by scaling up or down based on event sources and metrics of their choice.

Integrating KEDA into CI/CD pipelines enables DevOps teams to quickly respond to changes in their application’s resource requirements, further streamlining the continuous delivery process. KEDA also supports events generated by different workloads, such as StatefulSet, Job, and Custom Resource. All of this helps reduce downtime and improve the efficiency and user experience of applications.

Check out the following blog to see KEDA autoscaling based on Kafka Lag: Autoscaling Using KEDA Based On Kafka Lag.
