What is Kubernetes?
Kubernetes is an open-source container orchestration tool originally developed by Google. The containers it manages can be Docker containers or containers built with other runtimes. In essence, Kubernetes helps you manage applications made up of hundreds or thousands of containers across different deployment environments.
The need for a container orchestration tool
In recent years, the trend has shifted from monolithic architecture to microservice architecture as applications have grown in size.
Microservice architecture solves the issues of:
- Developing and deploying large applications by breaking them down into smaller independent modules.
- Single point of failure in applications.
- Difficulty in adopting new technologies.
The rise of microservices drove up the usage of container technologies, since containers offer the perfect host for small, independent units of an application. Applications now comprise hundreds or even thousands of containers, which makes managing them across multiple environments hard, and sometimes even impossible, by hand. This scenario gave rise to the demand for a proper way of managing large numbers of containers.
Features offered by container orchestration tool
Container orchestration tools like Kubernetes help tackle these issues by guaranteeing the following features:
- High availability, so the application has zero downtime and is always accessible to users.
- Scalability for high performance and faster loading.
- Disaster recovery, to restore data to its latest state after an infrastructure failure and resume running the application from that state.
- Load balancing for proper distribution of traffic.
Additionally, the tool also offers other features like scheduling, self-healing, rollouts and rollbacks.
Along with Kubernetes, there are other orchestration tools such as Docker Swarm and OpenShift.
I will give an overview of the most important components of Kubernetes. Kubernetes has tons of components, but most of the time you will be working with only a handful of them.
A Node is a server, which could be a physical or a virtual machine. There are two types of nodes: Master Nodes and Worker Nodes. Pods run inside the nodes.
A Pod is the smallest unit of Kubernetes. It is an abstraction over a container: the pod creates a running environment, or a layer, on top of the container. A pod is usually meant to run only one container, though it can also run additional helper or sidecar containers. Each pod gets its own IP address, and pods communicate with each other using these internal IP addresses. In Kubernetes, you work with pods, not containers, directly. Pods die frequently; when a pod dies, a new pod is created and its IP address changes. This is where the Service comes into play.
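As a minimal sketch, a pod can be described in a YAML manifest like the one below. The name `my-app` and the `nginx` image are placeholders for illustration, not anything from a real application:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app        # label that a service can later select on
spec:
  containers:
    - name: my-app
      image: nginx:1.25   # the single main container of this pod
```

Applying it with `kubectl apply -f pod.yaml` creates the pod; in practice, as discussed below, pods are usually created through Deployments instead.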
A Service provides a permanent IP address that can be attached to a pod. The lifecycles of a service and a pod are not connected, so if a pod dies, its service remains alive, and the newly created pod attaches to the same service. There are two types of services: external and internal. External services let us access a pod from outside sources, such as a browser; pods attached to internal services are not accessible from outside.
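A service could be sketched like this (names and ports are illustrative placeholders; the `selector` matches pods by label rather than by IP, which is how the service survives pod restarts):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  type: ClusterIP        # internal service; NodePort or LoadBalancer would expose it externally
  selector:
    app: my-app          # routes traffic to any pod carrying this label
  ports:
    - port: 80           # port the service exposes inside the cluster
      targetPort: 8080   # port the container actually listens on
```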
The URL or endpoint of an external service looks something like http://172.31.35.103:8080, which is not suitable for production. We would want our URL to have a secure protocol and a proper domain name. For this, there is another component called Ingress: an external request first goes to the Ingress, which forwards it to the service. Ingress also provides load balancing and SSL termination.
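As a sketch, an Ingress rule mapping a domain to the service might look like this (the host name, service name, and TLS secret are placeholders; a cluster also needs an ingress controller installed for this to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: myapp.example.com        # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service # service that receives the forwarded request
                port:
                  number: 80
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls          # TLS certificate stored as a Secret
```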
A ConfigMap stores the external configuration of your application as key-value pairs (for example, a database URL). You can connect the ConfigMap to the pod that needs this data.
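A minimal sketch, with a placeholder database URL, and the corresponding snippet a pod's container section could use to pull the value in as an environment variable:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  DB_URL: mongodb://mongo-service:27017   # placeholder database URL
---
# Inside a pod's container spec, the value can be referenced like this:
#   env:
#     - name: DB_URL
#       valueFrom:
#         configMapKeyRef:
#           name: my-app-config
#           key: DB_URL
```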
We should not put credentials like usernames and passwords in plain text inside a ConfigMap. A Secret is just like a ConfigMap, but it stores the data Base64-encoded instead of in plain text. Note that Base64 is an encoding, not encryption, so Secrets should still be protected with access controls.
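A Secret could be sketched like this, with made-up example credentials encoded in Base64:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secret
type: Opaque
data:
  DB_USER: YWRtaW4=        # "admin" in Base64
  DB_PASSWORD: c2VjcmV0    # "secret" in Base64
```

Pods reference Secret values the same way they reference ConfigMap values, using `secretKeyRef` instead of `configMapKeyRef`.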
We want our database data to persist long-term: if the database pod or container restarts, the data would otherwise be gone. Volumes are used to persist data by attaching physical storage to the pod.
This physical storage can be located on the same machine where the pod is running or on a remote machine outside the Kubernetes cluster. Another important thing to note is that Kubernetes does not manage data persistence itself: we, as Kubernetes users, are responsible for storing and backing up the data.
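A sketch of a database pod mounting a volume (the image, paths, and claim name are placeholders; the persistent volume claim is assumed to be bound to actual storage elsewhere):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: mongodb
      image: mongo:6
      volumeMounts:
        - name: db-data
          mountPath: /data/db      # where the container sees the storage
  volumes:
    - name: db-data
      persistentVolumeClaim:
        claimName: db-data-claim   # claim bound to the physical storage
```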
A Deployment is the blueprint for creating, replicating, and scaling pods. Instead of relying on only one application pod, database pod, and so on, we replicate everything across multiple servers. We do not create the replica pods ourselves; instead, we define their blueprint.
In practice, you will not create pods directly; you will create deployments, where you can specify how many replicas you want and scale them up or down. A Deployment is another abstraction on top of the pod. Note that replica pods are connected to the same service, so if one replica dies, the service routes requests to another replica.
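The blueprint idea can be sketched as a Deployment manifest; the `template` section is exactly the pod definition, and `replicas` is how many copies Kubernetes keeps running (names and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3              # number of identical pods to keep running
  selector:
    matchLabels:
      app: my-app
  template:                # the pod blueprint
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
```

Scaling up or down is then a one-liner: `kubectl scale deployment my-app --replicas=5`.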
We can't replicate a database pod using a Deployment, because a database has state. All database replicas need a common storage they can read from and write to, as well as a mechanism to avoid data inconsistencies. A StatefulSet offers such a mechanism along with pod replication, so stateful applications like databases should be created using a StatefulSet.
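A StatefulSet looks much like a Deployment, but gives each replica a stable identity and, typically, its own persistent volume. A sketch with placeholder names:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo          # headless service giving each pod a stable network identity
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:6
  volumeClaimTemplates:       # each replica gets its own persistent volume claim
    - metadata:
        name: mongo-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Pods created this way get ordered, stable names (`mongo-0`, `mongo-1`, ...), which is what makes coordinating stateful replicas tractable.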
So this was an overview of what Kubernetes is and a brief tour of its components. In the next article, I will explain its architecture.