Kubernetes is an open-source orchestration system for automating deployment, scaling, and management of containerized applications.
For more details on Kubernetes, check out the Guide to Kubernetes.
Kubernetes provides a basic resource known as a Pod. A pod is the smallest deployable unit in Kubernetes and is essentially a wrapper around containers. It can hold one or more containers with shared storage and network, plus a specification for how to run the containers.
Pods are considered ephemeral entities. If the node on which a pod is scheduled dies, that pod is scheduled for deletion after a timeout period.
A pod represents a single instance of an application. For horizontal scaling (running multiple instances), an application needs multiple pods, so the pods are replicated. The replicated pods are created and managed through Controllers.
Some key Controller resources used for pod replication
Deployment is the easiest and most commonly used resource for deploying an application. It manages ReplicaSets on your behalf. You can define deployments to create new replicasets, update them, or remove existing deployments. The main role of a deployment is to provide declarative updates to both pods and replicasets.
Deployments help you to achieve the following:
- Roll out a ReplicaSet - The replicaset creates your pods in the background. You can check the status of the rollout to see whether it succeeded.
- Declare the new state of the pods - You can update the PodTemplateSpec of the deployment manifest. A new replicaset is created and the deployment moves the pods from the old replicaset to the new one at a controlled rate. Each new replicaset carries the updated revision of the deployment.
- Roll back to an earlier deployment revision - If the current state turns out not to be stable, the deployment can be rolled back to an earlier revision.
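To make the rollout and update steps above concrete, here is a minimal sketch of a Deployment manifest built as a plain Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). The name `web`, the label, the image tags, and the helper `with_image` are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical label; the selector and the pod template must agree on it.
APP_LABELS = {"app": "web"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,  # three identical, interchangeable pods
        "selector": {"matchLabels": APP_LABELS},
        "strategy": {
            "type": "RollingUpdate",
            # Replace pods gradually: at most one extra pod, none unavailable.
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
        },
        # This is the PodTemplateSpec; changing it triggers a new ReplicaSet.
        "template": {
            "metadata": {"labels": APP_LABELS},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "nginx:1.25",  # illustrative image tag
                        "ports": [{"containerPort": 80}],
                    }
                ]
            },
        },
    },
}


def with_image(manifest, image):
    """Return a deep copy of the manifest whose first container uses `image`."""
    updated = json.loads(json.dumps(manifest))  # simple deep copy via JSON
    updated["spec"]["template"]["spec"]["containers"][0]["image"] = image
    return updated


# Declaring a new state: same manifest, new image in the pod template.
updated_deployment = with_image(deployment, "nginx:1.26")

print(json.dumps(deployment, indent=2))
```

Saving the printed JSON to a file and running `kubectl apply -f` on it would create the deployment. `kubectl rollout status deployment/web` reports on the rollout, and `kubectl rollout undo deployment/web` returns to the previous revision.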
What Deployment doesn’t provide?
- It doesn’t provide a stable identifier for pods.
- It doesn’t provide dedicated storage for each pod, hence it is mostly used for stateless applications (the ones that don’t care which network is being used and don’t need any permanent storage, e.g., web servers such as Apache, Nginx, Tomcat).
StatefulSet is a Kubernetes resource used to manage stateful applications.
It manages the deployment and scaling of a set of pods, and provides a guarantee of ordering and uniqueness of the pods.
- Unlike deployments, statefulset maintains an identity for each of the pods.
Each pod has a persistent identifier that it maintains across any rescheduling.
- For example, if you create a statefulset named “flag” with a single replica, it will create a pod named flag-0. With multiple replicas, the pod names increment: flag-0, flag-1, flag-2, etc.
- Every pod is given its own Persistent Volume. If you delete or scale down the pods, the volumes associated with them are not deleted, so the data persists.
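The naming scheme described above can be sketched in a few lines of Python. The function names and the claim template name `data` are hypothetical. The PVC pattern `<claim template>-<pod name>` follows the convention Kubernetes uses for claims created from a statefulset's volume claim templates.

```python
def stateful_pod_names(name, replicas):
    """Pod names for a statefulset: <name>-<ordinal>, starting at 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def stateful_pvc_names(claim_template, name, replicas):
    """One PVC per pod, named <claim template>-<pod name>."""
    return [f"{claim_template}-{pod}" for pod in stateful_pod_names(name, replicas)]


pods = stateful_pod_names("flag", 3)
pvcs = stateful_pvc_names("data", "flag", 3)

# Scaling down to one replica removes the highest ordinals first.
# By default the PVCs of the removed pods are kept, so the data persists.
pods_after_scale_down = stateful_pod_names("flag", 1)
retained_pvcs = pvcs  # unchanged by the scale-down in this simplified model

print(pods)
print(pvcs)
print(pods_after_scale_down)
```

If flag-1 is later recreated by scaling back up, it gets the same name again and reattaches to data-flag-1, which is what lets a stateful application find its data.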
Difference in attaching volumes for storage in a Deployment and StatefulSet
- Deployments: Used for “stateless applications”. A volume (PVC) referenced in the pod template is shared across all replica pods, so whatever one pod writes to it is exposed to the others.
- StatefulSet: Used for “stateful applications”. Each pod gets its own PVC, so the information stored in it stays with that pod and is not shared across the replicas.
Using PVC across Deployments and StatefulSets
A PVC can be used with both deployments and statefulsets, subject to its Access Mode. The three commonly used Access Modes are:
- ReadWriteOnce: Mount the volume as read-write by a single node.
- ReadOnlyMany: Mount the volume as read-only to many nodes.
- ReadWriteMany: Mount the volume as read-write by many nodes.
When a PVC is specified for a deployment, it is shared across all the replica pods. In that case, the PVC must have the ReadWriteMany or ReadOnlyMany access mode if the replicas can land on different nodes (ReadWriteMany is rare; only a few storage providers offer it).
Suppose you create a PVC with the ReadWriteOnce access mode and then create a deployment that runs a stateful application. It works fine as long as you don’t scale the deployment. Once you scale it, a new pod scheduled on a different node fails to start with an error that the volume is already in use.
So, in such cases, it is better to use a read-only (ReadOnlyMany) volume to avoid errors.
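The scaling problem described above can be illustrated with a deliberately simplified model of the access modes. It only checks how many distinct nodes would need to mount the volume, and ignores everything else a real scheduler and storage driver take into account. The function `can_mount` and the node names are hypothetical.

```python
# Access modes that allow mounting from more than one node.
MULTI_NODE_MODES = {"ReadOnlyMany", "ReadWriteMany"}


def can_mount(access_mode, pod_nodes):
    """Return True if every replica pod could mount one shared PVC.

    pod_nodes lists the node each replica pod runs on, one entry per pod.
    """
    if access_mode in MULTI_NODE_MODES:
        return True
    if access_mode == "ReadWriteOnce":
        # Read-write by a single node: all pods must share that node.
        return len(set(pod_nodes)) <= 1
    raise ValueError(f"unknown access mode: {access_mode}")


# One replica: ReadWriteOnce works.
print(can_mount("ReadWriteOnce", ["node-a"]))
# Scaled to two replicas on different nodes: the second pod cannot attach.
print(can_mount("ReadWriteOnce", ["node-a", "node-b"]))
# The same scale-out with a many-node access mode is fine.
print(can_mount("ReadOnlyMany", ["node-a", "node-b"]))
```

Note that in this model two replicas on the same node can still share a ReadWriteOnce volume, because the restriction is per node rather than per pod.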
When a PVC is used with a statefulset, it typically has the ReadWriteOnce access mode. With a statefulset, you define volumeClaimTemplates, so a new PVC is created for each replica automatically.
Another benefit is that a single file defines both your application and its persistent volume claims, which makes the application easier to scale.
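As a sketch of the single-file idea, the following Python dict describes a StatefulSet together with its volume claim template and prints it as JSON, which kubectl accepts in place of YAML. The name `flag` reuses the earlier example. The image, the headless service name, the claim name `data`, the mount path, and the storage size are assumptions made for illustration.

```python
import json

LABELS = {"app": "flag"}  # hypothetical label

statefulset = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "flag"},
    "spec": {
        # Name of a headless Service (assumed to exist) that gives pods stable DNS names.
        "serviceName": "flag",
        "replicas": 3,
        "selector": {"matchLabels": LABELS},
        "template": {
            "metadata": {"labels": LABELS},
            "spec": {
                "containers": [
                    {
                        "name": "flag",
                        "image": "example/flag:1.0",  # placeholder image
                        # Mount the per-pod volume created from the template below.
                        "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                    }
                ]
            },
        },
        # One PVC is created from this template for each replica.
        "volumeClaimTemplates": [
            {
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "1Gi"}},
                },
            }
        ],
    },
}

print(json.dumps(statefulset, indent=2))
```

Changing `replicas` is then the only edit needed to scale: each additional pod gets its own claim from the template without any separately written PVC.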