Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.
This blog explains how to configure HPA on a Kubernetes cluster.
Prerequisites to Configure K8s HPA
- Ensure that you have a running Kubernetes Cluster and kubectl, version 1.2 or later.
- Deploy Metrics Server in the cluster so that metrics are available through the resource metrics API, which HPA uses to collect metrics. For deployment instructions, see this GitHub repository: Deploy-Metrics-Server
- If you want to make use of custom metrics, your cluster must be able to communicate with the API server providing the custom metrics API.
The steps below show how to deploy an application and configure HPA on a Kubernetes cluster:
Deploy an Application using Docker
- Here, we are using a custom Docker image based on the php-apache image.
- Create a Dockerfile with the following content:
FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php
Below is the index.php page, which performs calculations to generate intensive CPU load.
<?php $x = 0.0001;
Then start a deployment running the image and expose it as a service using the following YAML Configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
Then, run the following command:
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
Create Horizontal Pod Autoscaler
Now that the server is running, create an autoscaler using kubectl autoscale.
The Horizontal Pod Autoscaler you create will maintain between 1 and 10 replicas of the pods controlled by the php-apache deployment created in the previous step. HPA increases or decreases the number of replicas to keep the average CPU utilization across all pods at 50%. Create the Horizontal Pod Autoscaler with the following kubectl autoscale command:
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
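The same autoscaler can also be described declaratively instead of with kubectl autoscale. The sketch below is only an illustration: it assumes the cluster serves the autoscaling/v2 API (stable since Kubernetes 1.23), the function name hpa_manifest is hypothetical, and the values simply mirror the flags of the command above.

```shell
# Print an HPA manifest mirroring:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# Assumption: the cluster serves autoscaling/v2 (Kubernetes 1.23 or later).
hpa_manifest() {
  cat <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
}

# Print the manifest for review. To apply it, pipe it to kubectl instead:
#   hpa_manifest | kubectl apply -f -
hpa_manifest
```

Keeping the manifest in a file under version control makes later changes to the replica range or the CPU target easier to review than re-running the imperative command.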
- You can check the current status of the autoscaler using the command:
kubectl get hpa
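The percentage reported by kubectl get hpa is measured against each pod's CPU request, not its limit. As a rough sketch, assuming the 200m request from the Deployment above, the helper below converts an average usage in millicores into that percentage. The name cpu_utilization_percent is hypothetical and not part of kubectl, and the result is rounded down for simplicity.

```shell
# usage_m:   average CPU usage per pod, in millicores
# request_m: CPU request per pod, in millicores (200m in the Deployment above)
cpu_utilization_percent() {
  local usage_m=$1 request_m=$2
  # Integer percentage of the request that is in use (rounded down).
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_percent 100 200   # prints 50, exactly at the 50% target
cpu_utilization_percent 500 200   # prints 250, usage at the 500m limit
```

With a 200m request, the 50% target therefore corresponds to roughly 100m of average CPU usage per pod.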
How Does HPA React to Increased Load?
Run the following commands in a separate terminal:
- First, start a busybox container with an interactive shell
kubectl run -it --rm load-generator --image=busybox --restart=Never
- Then, send an infinite loop of queries to the php-apache service
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
- You can check the higher CPU load by executing:
kubectl get hpa
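The replica count that appears under load follows from the scaling rule documented for HPA: desired replicas = ceil(current replicas * current utilization / target utilization), kept within the min and max range. The following is a minimal sketch of that arithmetic only. The function and parameter names are hypothetical, and it ignores the tolerance band and stabilization behavior that the real controller applies.

```shell
# current: current replica count
# util:    observed average CPU utilization, in percent of the request
# target:  target utilization, in percent (50 in this walkthrough)
# min/max: replica bounds (1 and 10 in this walkthrough)
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  # Integer ceiling of current * util / target.
  local desired=$(( (current * util + target - 1) / target ))
  # Clamp the result to the configured replica range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 1 250 50 1 10   # prints 5: one pod at 250% needs five pods
desired_replicas 4 400 50 1 10   # prints 10: capped at the maximum
desired_replicas 5 0 50 1 10     # prints 1: idle load falls back to the minimum
```

The last call matches what happens after the load stops: utilization falls to 0% and the replica count returns to the minimum of 1.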
Terminate the Load
You can now stop the user load. Switch to the terminal where you started the busybox container and press Ctrl + C.
You can verify that the load has stopped using the command:
kubectl get hpa
The CPU utilization will drop to 0%, and HPA will scale the number of replicas back down to 1. Scaling down may take a few minutes.
NAME         REFERENCE                     TARGET     MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%   1         10        1          11