Introduction to Kubernetes

This post discusses the basic concepts of Kubernetes and the key terms associated with it.
Before reading it, please read up on containers and cluster technologies.

What is Kubernetes?

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
  • Put simply, Kubernetes deploys and manages containerized applications across a clustered environment.
  • Kubernetes is inspired by Google's Borg system. Before Kubernetes, Borg was the cluster manager Google used to run containerized workloads in production, including Gmail and Google Drive.
  • It is an open-source project that originated at Google.
  • It is written in the Go language.

Kubernetes Features

 - Automatic bin packing

Kubernetes automatically places containers based on their resource requirements and other constraints, without sacrificing availability.
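
To make the idea of bin packing concrete, here is a minimal sketch that assumes a simple first-fit-decreasing heuristic over CPU requests only. This is not how the Kubernetes scheduler is implemented; the function name first_fit_decreasing, the pod names, and the numbers are all hypothetical.

```python
def first_fit_decreasing(pod_cpu_requests, node_capacity):
    """Pack pods onto as few identical nodes as possible (illustrative heuristic).

    pod_cpu_requests: dict of pod name -> CPU request in millicores
    node_capacity:    CPU capacity of each node in millicores
    Returns a list of nodes, each a list of the pod names placed on it.
    """
    nodes = []  # pod names placed on each node
    free = []   # remaining CPU on each node
    # Place the largest pods first; ties are broken by name so the result is stable.
    ordered = sorted(pod_cpu_requests.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cpu in ordered:
        if cpu > node_capacity:
            raise ValueError(f"pod {name} does not fit on any node")
        for i, remaining in enumerate(free):
            if cpu <= remaining:
                # First node with enough room wins.
                nodes[i].append(name)
                free[i] -= cpu
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([name])
            free.append(node_capacity - cpu)
    return nodes


# Hypothetical workload: four pods, nodes with 1000 millicores each.
placement = first_fit_decreasing(
    {"web": 500, "db": 700, "cache": 300, "job": 400}, node_capacity=1000
)
```

With these made-up numbers the four pods fit on two nodes instead of four, which is the whole point of packing by resource request.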

 - Self-healing

Kubernetes automatically replaces and reschedules containers from failed nodes. It also kills and restarts containers that do not respond to health checks, according to the configured rules and policies.

 - Horizontal scaling

Scale your application up and down with a simple command, with a UI, or automatically based on CPU usage.
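
For the automatic case, the Horizontal Pod Autoscaler documentation gives the scaling rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that rule to CPU utilization. The function name and the default bounds are assumptions made for illustration, and the real autoscaler also applies a tolerance and stabilization rules that are left out here.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=10):
    """Apply the documented HPA ratio rule, then clamp to the allowed range."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical numbers: 4 replicas at 90% CPU with a 50% target.
scaled_up = desired_replicas(4, 90, 50)
# Hypothetical numbers: 4 replicas at 25% CPU with a 50% target.
scaled_down = desired_replicas(4, 25, 50)
```

Here scaled_up is 8 (4 * 90 / 50 = 7.2, rounded up) and scaled_down is 2.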

 - Automated rollouts and rollbacks

Kubernetes can roll out and roll back new versions/configurations of an application, without introducing any downtime.

 - Secrets and configuration management

Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.
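
As a rough illustration, a Secret is its own object that lives outside the container image. The sketch below builds a minimal Secret manifest as a Python dictionary. Values under "data" are base64-encoded, which is an encoding and not encryption. The helper name make_secret, the secret name, and the credentials are hypothetical.

```python
import base64
import json


def make_secret(name, values):
    """Build a minimal Opaque Secret manifest with base64-encoded values."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical secret; never hard-code real credentials like this.
secret = make_secret("db-credentials", {"username": "admin"})
print(json.dumps(secret, indent=2))
```

Because the manifest is separate from the image, the value can be changed by updating the Secret object alone.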

 - Storage orchestration

Kubernetes can automatically and seamlessly mount local and external storage solutions into containers, whether they are based on Software Defined Storage (SDS) or provided by public cloud providers such as AWS or GCP.

 - Batch execution

In addition to services, Kubernetes can manage your batch and CI workloads, replacing containers that fail, if desired.

Kubernetes can be deployed on virtual machines or in public, private, hybrid, or multi-cloud setups.

Kubernetes Architecture

Master Node 

The Master Node is responsible for managing the Kubernetes cluster, and it is the entry point for all administrative tasks. We can communicate with the Master Node via the CLI, the GUI (Dashboard), or the APIs.
For fault tolerance, there can be more than one Master Node in the cluster. If we have more than one Master Node, they run in HA (High Availability) mode: for components such as the Scheduler and the Controller Manager, only one instance is the active leader at a time, and the instances on the other Master Nodes stand by as followers.

Master Node Components

 - API Server

All the administrative tasks are performed via the API Server within the Master Node. A user/operator sends REST commands to the API Server, which then validates and processes the requests. After executing the requests, the resulting state of the cluster is stored in the distributed key-value store.

 - Scheduler

The Scheduler assigns work to the different Worker Nodes. It has the resource usage information for each Worker Node. Before scheduling the work, the Scheduler also takes into account quality-of-service requirements, data locality, affinity, anti-affinity, etc. The Scheduler schedules the work in terms of Pods.
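
A rough way to picture this is as two steps: first filter out the nodes that cannot run the Pod, then score the remaining nodes and pick the best one. The sketch below assumes a made-up scoring rule (prefer the node with the most CPU left after placement) and hypothetical field names such as free_cpu and node_selector. It only illustrates the idea and does not reproduce the real Scheduler's logic.

```python
def schedule(pod, nodes):
    """Return the name of the chosen node, or None if no node is feasible."""
    # Step 1: filter. Keep nodes with enough free resources and matching labels.
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"]
        and node["free_mem"] >= pod["mem"]
        and all(node["labels"].get(key) == value
                for key, value in pod.get("node_selector", {}).items())
    ]
    if not feasible:
        return None
    # Step 2: score. Assumed rule: most CPU remaining after placing the pod.
    best = max(feasible, key=lambda node: node["free_cpu"] - pod["cpu"])
    return best["name"]


# Hypothetical cluster state (CPU in millicores, memory in MiB).
nodes = [
    {"name": "node-a", "free_cpu": 500, "free_mem": 1024, "labels": {"disk": "ssd"}},
    {"name": "node-b", "free_cpu": 2000, "free_mem": 4096, "labels": {"disk": "hdd"}},
    {"name": "node-c", "free_cpu": 1500, "free_mem": 2048, "labels": {"disk": "ssd"}},
]

needs_ssd = schedule({"cpu": 1000, "mem": 512, "node_selector": {"disk": "ssd"}}, nodes)
anywhere = schedule({"cpu": 1000, "mem": 512}, nodes)
```

The pod that requires an SSD lands on node-c, because node-a is too small and node-b has the wrong label. Without that constraint, node-b wins on free CPU.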

 - Controller Manager

The Controller Manager manages different non-terminating control loops, which regulate the state of the Kubernetes cluster. A controller is a control loop that watches the shared state of the cluster through the API Server and makes changes attempting to move the current state towards the desired state.
Examples include the replication controller, endpoints controller, namespace controller, and service account controller.
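
The control loop idea can be sketched in a few lines: compare the desired state with the current state and act on the difference. The example below is a simplified, hypothetical reconcile step for a replica count. The function names and the in-memory list of pods are assumptions; a real controller watches the API Server instead of a Python list.

```python
import itertools


def reconcile(desired_replicas, running_pods, start_pod, stop_pod):
    """Run one pass of the loop: move the current state towards the desired state."""
    difference = desired_replicas - len(running_pods)
    if difference > 0:
        # Too few pods: start the missing ones.
        for _ in range(difference):
            running_pods.append(start_pod())
    elif difference < 0:
        # Too many pods: stop the surplus ones.
        for _ in range(-difference):
            stop_pod(running_pods.pop())
    return running_pods


# Hypothetical helpers standing in for the real cluster.
_ids = itertools.count(1)


def start_pod():
    return f"pod-{next(_ids)}"


stopped = []

pods = reconcile(3, [], start_pod, stopped.append)    # starts pod-1, pod-2, pod-3
pods.remove("pod-2")                                  # simulate a failed pod
pods = reconcile(3, pods, start_pod, stopped.append)  # starts pod-4 as a replacement
pods = reconcile(2, pods, start_pod, stopped.append)  # scales down, stops pod-4
```

A real controller keeps running this comparison for as long as the cluster is up, which is why the loops are described as non-terminating.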

 - etcd

etcd is a distributed key-value store which is used to store the cluster state. All the cluster data is stored here.

Worker Node

A Worker Node is a machine (a VM, a physical server, etc.) which runs the applications using Pods and is controlled by the Master Node. Pods are scheduled on the Worker Nodes, which have the necessary tools to run and connect them. A Pod is the scheduling unit in Kubernetes. It is a logical collection of one or more containers which are always scheduled together.

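Worker Node Components

 - Container runtime

To run containers, we need a container runtime on the Worker Node. Early Kubernetes releases were configured to run containers with Docker by default and could also use the rkt container runtime. Current releases work with any runtime that implements the Container Runtime Interface (CRI), such as containerd or CRI-O.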

 - kubelet

The kubelet is an agent which runs on each Worker Node and communicates with the Master Node.
The kubelet works in terms of a PodSpec. A PodSpec is a YAML or JSON object that describes a pod. The kubelet takes these PodSpecs that are provided through various mechanisms (primarily through the apiserver) and ensures that the containers described in those PodSpecs are running and healthy.
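
To show what such a description looks like, here is a minimal Pod manifest built as a Python dictionary and serialized to JSON. The field names (apiVersion, kind, metadata, spec, containers) are standard. The pod name, the label, and the helper container_images are hypothetical choices for this example.

```python
import json

# A minimal Pod description: one container that listens on port 80.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "nginx",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}


def container_images(manifest):
    """Return a mapping of container name -> image for a Pod manifest."""
    return {c["name"]: c["image"] for c in manifest["spec"]["containers"]}


as_json = json.dumps(pod_manifest, indent=2)
print(as_json)
```

The "spec" section is the part that tells the kubelet which containers must be running for this Pod.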

 - kube-proxy

kube-proxy is the network proxy which runs on each Worker Node and listens to the API Server for the creation and deletion of Service endpoints. For each Service endpoint, kube-proxy sets up the routes so that the endpoint can be reached.
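
Conceptually, this means keeping an up-to-date table from each Service to the addresses of its endpoints and spreading traffic across them. The sketch below models that table in plain Python with round-robin selection. The class and method names are hypothetical, and the real kube-proxy usually programs kernel-level rules (for example iptables or IPVS) instead of picking backends in a Python object.

```python
class ServiceRoutes:
    """Toy routing table: Service name -> list of endpoint addresses."""

    def __init__(self):
        self._endpoints = {}
        self._next_index = {}

    def set_endpoints(self, service, addresses):
        """Called when endpoints for a Service are created or changed."""
        self._endpoints[service] = list(addresses)
        self._next_index[service] = 0

    def remove_service(self, service):
        """Called when a Service and its endpoints are deleted."""
        self._endpoints.pop(service, None)
        self._next_index.pop(service, None)

    def pick_backend(self, service):
        """Return the next endpoint in round-robin order, or None if there is none."""
        backends = self._endpoints.get(service)
        if not backends:
            return None
        index = self._next_index[service] % len(backends)
        self._next_index[service] = index + 1
        return backends[index]


# Hypothetical Service with two endpoints.
routes = ServiceRoutes()
routes.set_endpoints("web", ["10.0.0.1:80", "10.0.0.2:80"])
picks = [routes.pick_backend("web") for _ in range(3)]
```

Three consecutive requests alternate between the two endpoints and then wrap around to the first one again.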

Useful Terminology

 - Pods

A Pod is a group of one or more containers (e.g. Docker containers) with shared storage/network and a specification for how to run the containers.
A Pod's contents are always co-located and co-scheduled, and run in a shared context. This means that they can share volumes and IP space, and can be deployed and scaled as a single application.

 - Services

A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them. 

 - Replication controllers

A ReplicationController ensures that a specified number of pod replicas are running at any one time.
If a pod goes down, the ReplicationController starts a replacement, so that the desired number of pods is always up and available.

 - Labels

Labels are key/value pairs that are attached to objects such as pods, replication controllers, and services.
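
Labels become useful through selectors: a Service or a ReplicationController finds "its" pods by asking for all pods whose labels match a given set of key/value pairs. The sketch below shows equality-based matching in plain Python. The function names, pod names, and labels are hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector, pods):
    """Return the names of the pods whose labels match the selector."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


# Hypothetical pods and their labels.
pods = {
    "web-1": {"app": "web", "tier": "stable"},
    "web-2": {"app": "web", "tier": "canary"},
    "db-1": {"app": "db", "tier": "stable"},
}

web_pods = select_pods({"app": "web"}, pods)
canary_pods = select_pods({"app": "web", "tier": "canary"}, pods)
```

Here web_pods contains web-1 and web-2, while canary_pods contains only web-2, because a selector with several pairs requires all of them to match.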

References and Tutorials