DISCOVERY

May 20th, 2019

Orchestrating Containers with Kubernetes Part II: Single Node Cluster

Kubernetes

Docker

Container

Container Orchestrator

AWS

In my previous Kubernetes article, I went over the concepts of container orchestration and the architecture of a Kubernetes cluster. In this article, I'm building a single node Kubernetes cluster that runs a Node.js application.

The Node.js application is the same one I used in my article on Containerization. The Kubernetes cluster environment is very similar to my Docker playground. The single node Kubernetes cluster is created with a CloudFormation template wrapped in Terraform. It installs Docker, kubeadm, kubectl, kubelet, and kubernetes-cni. You can check out the infrastructure code on GitHub.

Let's quickly go over the installed components. Docker is the container runtime used by the Kubernetes cluster. My single node cluster only runs Docker containers, although Kubernetes can be configured to use other runtimes. kubeadm bootstraps and initializes the Kubernetes cluster via an easy-to-use CLI. kubectl is a CLI that interacts with the Kubernetes API. The Kubernetes API runs on the cluster's master node after bootstrapping is completed. My single node cluster only contains a master node (there are no worker nodes).
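My CloudFormation template installs these components on the EC2 instance automatically. For reference, a rough sketch of the equivalent manual installation on Ubuntu, based on the standard kubeadm installation docs (not the exact script my template runs), looks like this:

# Install Docker and prerequisites
sudo apt-get update && sudo apt-get install -y docker.io apt-transport-https curl
# Add the Kubernetes apt repository
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install kubelet, kubeadm, kubectl, and the CNI plugins
sudo apt-get update && sudo apt-get install -y kubelet kubeadm kubectl kubernetes-cni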

Every worker node runs a kubelet instance. The kubelet watches the Kubernetes API and waits for Pods to be scheduled to its node. Pods are the simplest object in Kubernetes - they run one or more containers. Once a Pod is scheduled, the kubelet tells the container runtime (Docker) to start a container in the Pod. Since my cluster has no worker nodes, the kubelet runs on the master node. All my Pods are scheduled to run on the master node.

Finally, kubernetes-cni provides the Container Network Interface (CNI) plugins used for cluster networking.

Once the CloudFormation stack is built, an EC2 instance exists on which we can execute kubeadm commands. The following script bootstraps the Kubernetes cluster, creating a single master node.

# Initialize a new Kubernetes cluster
sudo kubeadm init

# Commands to run as provided by the kubeadm init command
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

With the cluster initialized, we can try running kubectl commands. One simple test is to check the status of all the nodes in the cluster. Only one master node should be returned.

kubectl get nodes
NAME            STATUS     ROLES    AGE   VERSION
ip-10-0-1-154   NotReady   master   23s   v1.14.2

Notice that the node listed has a status of NotReady. The node isn't ready because the cluster isn't set up with networking capabilities yet[1]. The following command sets up the networking with the Weave Net add-on[2].

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

After waiting a few seconds, the master node will be in a Ready state.

kubectl get nodes
NAME            STATUS   ROLES    AGE     VERSION
ip-10-0-1-154   Ready    master   2m13s   v1.14.2
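The Weave networking add-on runs as Pods in the kube-system namespace. One way to confirm it started is to list the system Pods (the exact Pod names and counts will vary):

kubectl get pods -n kube-system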

At this point the Kubernetes cluster is bootstrapped and ready for use! Now let's discuss the different Kubernetes objects needed to deploy the basic Node.js application.

Objects in a Kubernetes cluster are represented as YAML documents. For my basic Node.js application I created two objects on Kubernetes - a Pod and a Service.

In my previous Kubernetes article I displayed the YAML document for the Node.js application Pod. The document is shown again below:

# Version of the Kubernetes API. Pods exist in the stable v1 version
apiVersion: v1
# What kind of object to deploy (in this case a Pod)
kind: Pod
# Metadata describing the Pod object
metadata:
  name: nodejs-app
  labels:
    app: nodejs-app
    version: v1.0.0
    environment: sandbox
spec:
  containers:
    # Unique name for the container in the pod
    - name: nodejs-app
      # Docker image to use for the container from DockerHub
      image: ajarombek/nodejs-docker-app:latest
      # Configure a network port for a container
      ports:
        # Port to expose on the container IP
        - containerPort: 3000
          # Protocol defaults to TCP
          protocol: TCP

Pods are the smallest deployable compute resource in Kubernetes[3]. They are an execution environment for one or more containers[4]. In fact, a Pod can be thought of as a special kind of container that hosts one or more smaller containers[5]. The smaller containers inside a Pod are the ones that run applications.

More often than not, a Pod holds a single container. In the YAML configuration above, the nodejs-app Pod holds a single container based on the ajarombek/nodejs-docker-app:latest Docker image. This image is hosted on DockerHub.
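Since the image is public on DockerHub, it can also be pulled and run directly with Docker, outside of Kubernetes. A quick sketch (assuming the application listens on port 3000, as configured in the Pod):

# Pull the image and run the application locally on port 3000
docker pull ajarombek/nodejs-docker-app:latest
docker run -d -p 3000:3000 ajarombek/nodejs-docker-app:latest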

The Pod definition exposes port 3000 on the container as an entry point to the application. However, the Pod doesn't provide a way to access the container from outside the Kubernetes cluster. Networking configuration for Pods is performed by another object called a Service.

Services provide networking capabilities and a stable IP address for Pods. The following service object provides an entry point to the Pod from the internet.

apiVersion: v1
kind: Service
metadata:
  name: nodejs-app-nodeport-service
  labels:
    version: v1.0.0
    environment: sandbox
spec:
  type: NodePort
  ports:
    - name: tcp
      # Port on nodes in the cluster which proxy to the service
      nodePort: 30001
      # Port exposed on the service for internal cluster use. Requests to this port will be forwarded
      # to the Pods matching the labels selected by this service.
      port: 3000
      # Port on the Pod that requests are sent to. Applications must listen on this port.
      targetPort: 3000
      protocol: TCP
  selector:
    # This service applies to Pods with an 'app' label and a corresponding 'nodejs-app' value.
    app: nodejs-app

There are several different types of services in Kubernetes. The type of the service defined above is NodePort. A NodePort service exposes a port on each node in the Kubernetes cluster[6]. When a request is made to a node's IP address at the exposed port, the request is forwarded to any corresponding Pods.

Since my node (an EC2 instance) has a public IP address, the NodePort service allows me to access the application at port 30001.
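Once the objects are deployed later in this article, the application should be reachable with a request along these lines (the address is a placeholder for the node's public IP):

# Request the Node.js application through the NodePort service
curl http://<node-public-ip>:30001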

At this point I have both my Pod and Service objects expressed in YAML documents. I have a Docker image on DockerHub and a single node Kubernetes cluster. To simplify things, I combined both my YAML documents into a single file called all-in-one.yml.

# all-in-one.yml
apiVersion: v1
kind: Service
metadata:
  name: nodejs-app-service
  labels:
    version: v1.0.0
    environment: sandbox
spec:
  type: NodePort
  ports:
    - port: 3000
      nodePort: 30001
      protocol: TCP
  selector:
    app: nodejs-app
---
apiVersion: v1
kind: Pod
metadata:
  name: nodejs-app
  labels:
    app: nodejs-app
    version: v1.0.0
    environment: sandbox
spec:
  containers:
    - name: nodejs-app
      image: ajarombek/nodejs-docker-app:latest
      ports:
        - containerPort: 3000
          protocol: TCP

Let's try deploying these objects onto the Kubernetes cluster. The following Bash commands clone the repository containing all-in-one.yml and attempt to create the Kubernetes objects.

git clone https://github.com/AJarombek/devops-prototypes.git
cd devops-prototypes/docker/nodejs-docker-app/k8s/

# Create a Pod & Service based on the all-in-one.yml Kubernetes config file
kubectl create -f all-in-one.yml
service/nodejs-app-service created
pod/nodejs-app created

The Pod and Service objects were successfully sent to the Kubernetes API. Now let's confirm that the Pod and Service are running successfully.

kubectl get service
kubectl get pod

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
kubernetes           ClusterIP   10.96.0.1      <none>        443/TCP          123m
nodejs-app-service   NodePort    10.110.40.12   <none>        3000:30001/TCP   22s

NAME         READY   STATUS    RESTARTS   AGE
nodejs-app   0/1     Pending   0          32s

The first command shows that the nodejs-app-service service was created and assigned an internal IP address and port mapping. nodejs-app-service is accessible within the cluster at the private IP address 10.110.40.12 on port 3000. It's also accessible from the node's public IP address at port 30001.
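The service can also be tested from a shell on the node itself using the cluster IP assigned above, although it won't respond until the Pod is actually running:

# Request the service at its internal cluster IP address
curl http://10.110.40.12:3000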

The second command is a bit more concerning. The status of the Pod is Pending, meaning it hasn't been scheduled to run on a node. To debug this issue, we can execute kubectl describe pod nodejs-app. While it returns a lot of info, the relevant piece is the following:

Events:
  Type     Reason            From               Message
  ----     ------            ----               -------
  Warning  FailedScheduling  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

This error message says the scheduler failed because it didn't find any nodes for the Pod to run on. This issue occurred because my Kubernetes cluster only contains a single master node. By default, user Pods are only scheduled to worker nodes, and my cluster doesn't have any. In a production environment it's good practice to keep application Pods on worker nodes; however, for this demo application I want the Pod to run on the master node. This default behavior is controlled by a mechanism called a taint.

In Kubernetes, taints repel Pods from being scheduled to certain nodes[7]. By default, the master node has a taint with a NoSchedule effect. Unless a Pod's YAML configuration specifies a toleration for this taint, the Pod will not be scheduled to the master node. Since I did not specify a toleration, the Pod failed to be scheduled.
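As an alternative to the workaround shown below, a toleration could be added to the Pod's spec instead. A minimal sketch of what that might look like (not part of my actual configuration):

spec:
  # Tolerate the master node's NoSchedule taint so the Pod can be scheduled there
  tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: NoSchedule
  containers:
    - name: nodejs-app
      image: ajarombek/nodejs-docker-app:latest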

A workaround for this issue is to remove the NoSchedule taint from the master node. The following command achieves this (note that the name of your node will be different):

kubectl taint node ip-10-0-1-154 node-role.kubernetes.io/master:NoSchedule-
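One way to confirm the taint was removed is to describe the node and check its taints, which should now show none (the node name will differ):

kubectl describe node ip-10-0-1-154 | grep Taints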

Now if you check the status of the Pod again with kubectl get pod, you will see that its status changed to Running! The Pod was successfully scheduled to the master node. The application setup is complete and the application is accessible on the master node's public IP address at port 30001.
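The output should look roughly like the following (the age will differ):

kubectl get pod
NAME         READY   STATUS    RESTARTS   AGE
nodejs-app   1/1     Running   0          1m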

While a single node cluster is nice for playing around and learning Kubernetes, it has a number of limitations and should never be used in a production environment. Most importantly, a single node is a single point of failure. If the master node goes down, so does the entire cluster. In a production environment you want a cluster with multiple worker nodes and a highly available master node. With this setup, the cluster will not go down if nodes fail. Most cloud providers offer highly available clusters. For example, AWS provides EKS (Elastic Container Service for Kubernetes) and Google Cloud offers GKE (Google Kubernetes Engine). I'm currently creating a Kubernetes prototype which uses EKS. Look forward to articles on that soon!

Along with cluster limitations, the way I designed the application has a number of drawbacks. One of the key reasons to use Kubernetes is to leverage its Pod autoscaling, self-healing, and deployments. I didn't take advantage of these capabilities in my Kubernetes configuration. These features are easy to implement with the ReplicaSet, Deployment, and HorizontalPodAutoscaler Kubernetes objects. I will cover these objects in the future.
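As a preview, a minimal Deployment for this application might look something like the sketch below (the replica count is illustrative and not part of my current configuration):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-app
  labels:
    app: nodejs-app
spec:
  # Maintain three replicas of the Pod at all times
  replicas: 3
  selector:
    matchLabels:
      app: nodejs-app
  template:
    metadata:
      labels:
        app: nodejs-app
    spec:
      containers:
        - name: nodejs-app
          image: ajarombek/nodejs-docker-app:latest
          ports:
            - containerPort: 3000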

Another drawback is the NodePort service object. As you likely noticed, the application wasn't accessible through the default HTTP or HTTPS ports (80 and 443, respectively). This is a limitation of the NodePort service, which by default can only expose ports in the 30000-32767 range. In a production application, either a LoadBalancer service or an Ingress object would typically be used for networking. I will also cover these objects in the future.
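For reference, a LoadBalancer version of the service might look roughly like the sketch below (on AWS this would provision a load balancer in front of the nodes; the object name is illustrative):

apiVersion: v1
kind: Service
metadata:
  name: nodejs-app-lb-service
spec:
  type: LoadBalancer
  ports:
    # Expose the application on the standard HTTP port
    - port: 80
      targetPort: 3000
      protocol: TCP
  selector:
    app: nodejs-app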

In this article I created a barebones application in a single node Kubernetes cluster. Although the application is far from production ready with the current configuration, it's a good step forward in learning Kubernetes! All the code from this article is available on GitHub.

[1] Nigel Poulton, The Kubernetes Book (Nigel Poulton, 2018), 59-60

[2] "Integrating Kubernetes via the Addon: Installation", https://bit.ly/2EiJZjL

[3] "Pods", https://kubernetes.io/docs/concepts/workloads/pods/pod/

[4] Poulton, 64

[5] Poulton, 66

[6] Marko Lukša, Kubernetes in Action (Shelter Island, NY: Manning, 2018), 135

[7] "Taints and Tolerations", https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/