Configuring Kubernetes HPA on a K8s Cluster

4 years ago   •   2 min read

By Anushka Arora

Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.

This blog explains how to configure HPA (Horizontal Pod Autoscaler) on a Kubernetes cluster.

Prerequisites to Configure K8s HPA

  1. Ensure that you have a running Kubernetes cluster and kubectl, version 1.2 or later.
  2. Deploy Metrics Server in the cluster to serve metrics through the resource metrics API, which HPA uses to collect metrics. For deployment instructions, see this GitHub repository: Deploy-Metrics-Server
  3. If you want to use custom metrics, your cluster must be able to communicate with the API server that provides the custom metrics API.
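
Before moving on, it can help to confirm that the resource metrics API from prerequisite 2 is actually being served. The sketch below is one way to check. It assumes Metrics Server registered the usual v1beta1.metrics.k8s.io APIService, and the helper name metrics_api_available is made up for this post.

```shell
# Succeeds when the resource metrics APIService reports Available=True.
# Assumption: Metrics Server is registered as v1beta1.metrics.k8s.io.
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)
  [ "$status" = "True" ]
}

# Usage (needs access to your cluster):
#   metrics_api_available && kubectl top nodes
```

If the check fails, HPA has no CPU metrics to act on, so fix the Metrics Server deployment before continuing.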

Follow the steps below to deploy an application and configure HPA on a Kubernetes cluster:

Deploy an Application using Docker

  • Here, we are using a custom Docker image based on the php-apache image.
  • Create a Dockerfile with the following content:
FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php

Below is the index.php page, which performs calculations to generate intensive CPU load.

<?php   $x = 0.0001;   

Then start a deployment running the image and expose it as a service using the following YAML Configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

Then, run the following command:

kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
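
HPA measures CPU utilization as a percentage of each pod's CPU request, so the deployment must declare one (the manifest above requests 200m). As a rough sketch, the helpers below read that request back from the cluster and show how a usage figure turns into a utilization percentage. The function names are invented for illustration, and to_millicores only handles whole numbers such as "2" or values with an "m" suffix such as "200m".

```shell
# Print the CPU request of the first container in a deployment (empty if unset).
cpu_request() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}'
}

# Convert a CPU quantity such as "200m" or "2" to millicores.
to_millicores() {
  case $1 in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Utilization in percent = usage / request * 100 (both given in millicores).
utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

# Usage (needs access to your cluster):
#   req=$(to_millicores "$(cpu_request php-apache)")
#   utilization_percent 100 "$req"
```

With the 200m request from the manifest, a pod using 100 millicores sits at 50% utilization.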

Create Horizontal Pod Autoscaler

Now that the server is running, create an autoscaler using kubectl autoscale.

When you create a Horizontal Pod Autoscaler with the command below, it maintains between 1 and 10 replicas of the pods controlled by the php-apache deployment that you created in the previous step. HPA increases or decreases the number of replicas to keep the average CPU utilization across all pods at 50%. Because each pod requests 200m of CPU, this corresponds to an average usage of 100m per pod. Create the Horizontal Pod Autoscaler using the following kubectl autoscale command:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
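
If you prefer to keep the autoscaler in version control, the same settings can be written as a manifest. The sketch below writes what should be an equivalent HorizontalPodAutoscaler using the autoscaling/v1 API. The file name php-apache-hpa.yaml is just a placeholder, and newer clusters also offer autoscaling/v2 with a richer metrics section.

```shell
# Write an HPA manifest mirroring: --cpu-percent=50 --min=1 --max=10
# The file name is a placeholder chosen for this example.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF

# Apply it against your cluster with:
#   kubectl apply -f php-apache-hpa.yaml
```

Use either the kubectl autoscale command or the manifest, not both, since each creates an HPA named php-apache.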

  • You can check the current status of the autoscaler using the command:
kubectl get hpa

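How Does HPA React to Increased Load?

Run the following commands in a separate terminal:

  • First, start a pod that acts as the load generator; this opens a shell in a busybox container. Newer kubectl releases no longer accept the deprecated --generator flag; if yours rejects it, omit the flag and add --restart=Never instead.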
kubectl run --generator=run-pod/v1 -it --rm load-generator --image=busybox

  • Then, from the shell inside that pod, send an infinite loop of queries to the php-apache service
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

  • Back in the first terminal, you can check the higher CPU load by running:
kubectl get hpa
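
To make sense of the replica count you see while the load is running, it helps to know the rule of thumb the autoscaler follows: desired replicas = ceil(current replicas × current utilization / target utilization), kept within the configured minimum and maximum. The function below is a simplified sketch of that arithmetic only. The name desired_replicas is made up here, and the real controller additionally applies a tolerance and stabilization logic that this ignores.

```shell
# desired = ceil(current_replicas * current_utilization / target_utilization),
# clamped to [min, max]. Arguments are whole numbers; utilization is in percent.
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  desired=$(( (current * util + target - 1) / target ))   # integer ceiling
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 1 250 50 1 10   # prints 5
desired_replicas 5 0 50 1 10     # prints 1 (clamped to the minimum)
desired_replicas 4 300 50 1 10   # prints 10 (clamped to the maximum)
```

So a single pod running at 250% of its CPU request against a 50% target would be scaled out to roughly 5 pods.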

Terminate the Load

You can now stop the user load. Switch to the terminal where you created the pod with the busybox image and press Ctrl + C.

You can verify that the load has stopped using the command:

kubectl get hpa

The CPU utilization drops to 0%, and HPA scales the number of replicas back down to 1. Scaling down may take a few minutes.
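
Rather than re-running kubectl get hpa by hand while you wait, you could poll until the replica count settles. The helper below is a small sketch of that idea; wait_for_replicas is a hypothetical name, and the defaults (60 attempts, 10 seconds apart) are arbitrary choices. By default the controller waits out a scale-down stabilization window of about five minutes, so some delay is expected.

```shell
# Poll an HPA until it reports the wanted replica count, or give up.
# Arguments: HPA name, wanted replicas, max attempts (default 60),
# seconds between attempts (default 10).
wait_for_replicas() {
  name=$1; want=$2; attempts=${3:-60}; interval=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    have=$(kubectl get hpa "$name" -o jsonpath='{.status.currentReplicas}')
    if [ "$have" = "$want" ]; then
      echo "HPA $name is at $have replica(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "HPA $name still at ${have:-unknown} replica(s)" >&2
  return 1
}

# Usage (needs access to your cluster):
#   wait_for_replicas php-apache 1
```

For a quick interactive look, kubectl get hpa php-apache --watch streams the same information as it changes.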

Result State:

NAME         REFERENCE                     TARGET     MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%   1         10        1          11
