Cloud Native

Real-World Kubernetes Deployments Part 1 - Cloud Native CI/CD

Understanding Kubernetes manifest directives that allow containers to self-heal throughout the pod lifecycle and control the deployment process.
Ryan Wendel | Nov 11 2021
30 min read

I've been working on a few Kubernetes engagements recently, in particular on application deployment processes for Amazon Elastic Kubernetes Service (EKS) clusters. The question came up: how could we build a simple, cloud-native mechanism to deploy applications to EKS clusters? I built a fairly involved proof of concept of one approach to this challenge and want to share it with others who may be interested.

The entire proof-of-concept solution involves the following tasks:

  1. Understanding Kubernetes manifest directives that allow containers to self-heal throughout the pod lifecycle and control the deployment process.
  2. Experimenting with a Docker container to research the self-healing capabilities of Kubernetes to help manage issues encountered during the pod lifecycle.
  3. Creating an AWS cloud-native CI/CD pipeline to facilitate deployments to an EKS cluster.

As working through these three tasks would make for an extremely long read, I’ll only be focusing on the first in this blog post. I’ll go into the remaining two (maybe broken up into three) tasks in a series of follow-up blog posts over the coming month or so.

The links to all three posts in this series are:

Real-World Kubernetes Deployments Part 1 - Cloud Native CI/CD

Real-World Kubernetes Deployments Part 2 - Cloud Native CI/CD

Real-World Kubernetes Deployments Part 3 - Cloud Native CI/CD

The Kubernetes ecosystem is a remarkably powerful toolset and, as one would expect, provides a robust mechanism for rolling out container updates and the other supporting objects required by a given application deployment.

With that said, a good place to start this journey is understanding how Kubernetes probes workload containers throughout the pod lifecycle. In the context of a deployment, this entails understanding the "livenessProbe", "readinessProbe", and "startupProbe" manifest directives. Granted, the first isn't strictly necessary for deployment purposes, but it's good to understand, so we'll go over it anyway.

Some great references for these probes come straight from the Kubernetes docs.

I'll summarize the highlights to help you work through this post quickly.

  • livenessProbe - This is a container health check and indicates whether the container is running. If the liveness probe fails, the container is terminated and then subjected to its restart policy.
  • readinessProbe - This is a "boot" status container check and indicates whether the container is ready to respond to requests. If the readiness probe fails, requests will not be routed to the configured Pod.
  • startupProbe - Indicates whether the application within the container is started. If a startup probe is provided, all other probes are disabled until it succeeds. If the startup probe fails, the container is terminated and then subjected to its restart policy.

The startup probe type was created to deal with legacy applications that might require additional startup time on first initialization. You will, more than likely, opt for a readiness probe over a startup probe and, as such, we won't focus on the startup probe in this post.
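
For reference, though, a startup probe uses the same handler syntax as the other two. A minimal sketch might look like the following, which would give a slow-booting application up to 300 seconds (30 failures at a 10-second period) to come up before the liveness and readiness probes take over. The path, port, and thresholds here are illustrative, not taken from my proof of concept.

startupProbe:
  httpGet:
    path: /healthz
    port: 80
  failureThreshold: 30
  periodSeconds: 10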

Probes feature handlers that allow you to customize how you want to determine the status of your container. The following three handlers are afforded by Kubernetes:

  • ExecAction - Executes a specified command inside the container. The diagnostic is considered successful if the command exits with a status code of 0.
  • TCPSocketAction - Performs a TCP check against a pod's IP address on a specified port. The diagnostic is considered successful if the port is open.
  • HTTPGetAction - Performs an HTTP GET request against a pod's IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.

We'll only be working with the HTTP handler for this blog post as my proof-of-concept dealt with containers running web applications.
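
For completeness, here is roughly what the other two handlers look like in a manifest. The command, port, and delay values are placeholders for illustration only; we won't use these in this series.

livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5

readinessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 5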

Let's take a look at examples of a liveness and readiness probe as they might be used in the wild to test a container's HTTP health check endpoint.

livenessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 20
  successThreshold: 1
  failureThreshold: 1
  
readinessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 5
  successThreshold: 1
  failureThreshold: 1 

Right away we see some interesting extra directives. Of particular interest are the initial delay and the success/failure thresholds. These are important to set in order to help prevent false positives and to codify what levels of success and failure we'll accept from our probes.

  • initialDelaySeconds - Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
  • successThreshold - Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup Probes. Minimum value is 1.
  • failureThreshold - When a probe fails, Kubernetes will try the specified number of times before giving up. Giving up in case of liveness probe means restarting the container. In case of a readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
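
Two related tunables that the example probes above leave at their defaults are "periodSeconds" (how often the probe runs, defaulting to 10 seconds) and "timeoutSeconds" (how long a single probe attempt may take before it counts as a failure, defaulting to 1 second). If you want to be explicit about them, a probe stanza might look something like this, with illustrative values:

livenessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 20
  periodSeconds: 10
  timeoutSeconds: 2
  successThreshold: 1
  failureThreshold: 1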

Looking back at our example probes, we see that when our pod/container starts we:

  • Are delaying our readiness probe by 5 seconds upon container startup
  • Are delaying our liveness probe by 20 seconds upon container startup
  • Require our readiness probe to succeed once before allowing requests to be sent to the pod
  • Are setting our liveness probe such that if a single failure is encountered the underlying container will be subjected to its restart policy.
  • Are checking the /healthz web endpoint on TCP port 80 for status information

Now that we have a basic understanding of how to help Kubernetes gauge the health of our container workloads, we'll want to look at how to deploy new versions of containers, which has us first looking at the Kubernetes "Deployment" object.

Straight from the docs:

“A Deployment provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate.”

Keeping to the more important aspects of a deployment, we'll focus on the following manifest directives:

  • progressDeadlineSeconds
  • strategy

We'll start with the "strategy" directive as this controls how we'll instruct Kubernetes to provision and terminate the containers that comprise our deployments.

The following two types of strategies are at our disposal:

  • Recreate - All existing Pods are terminated before new ones are created.
  • RollingUpdate - Pods are updated in a rolling fashion such that only a portion of a deployment's pods/containers are updated at a time. Controls exist to limit the speed and resources that are consumed and made available during an update.

Seeing that this post is discussing “real-world Kubernetes deployments”, we're only going to focus on the latter. Application uptime is paramount in today's online business environment and terminating an entire deployment's pods/containers prior to replacing them is not something that would likely be viewed as acceptable.

The "RollingUpdate" update type utilizes two of its own directives to control how a deployment's update process is enacted.

  • maxSurge - Is an optional field that specifies the maximum number of new pods that can be provisioned during the update process over the deployment's desired number of pods. The value can be an absolute number or a percentage (for example, 5 or 10%). This value cannot be 0 if maxUnavailable is 0. The absolute number is calculated from the percentage by rounding up. The default value is 25%.
  • maxUnavailable - Is an optional field that specifies the maximum number of pods that can be unavailable during the update process. The value can be an absolute number or a percentage (for example, 5 or 10%). The absolute number is calculated from the percentage by rounding down. This value cannot be 0 if maxSurge is 0. The default value is 25%.

In short, maxSurge is how many new pods we allow Kubernetes to create over the original number specified in the deployment manifest, while maxUnavailable is the number of pods we are willing to forgo when serving requests from the deployment during the update process.
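
To make the arithmetic concrete: with five replicas, a maxUnavailable of 20% rounds down to one pod that may be out of service, and a maxSurge of 1 means at most six pods will exist at any moment during the rollout. That is exactly the strategy stanza we'll use in the deployment manifest below.

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 20%  # 20% of 5 replicas, rounded down, is 1 pod
    maxSurge: 1          # at most 1 extra pod over the desired 5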

And lastly, we need to touch on the "progressDeadlineSeconds" directive. Again, taking info provided to us in the Kubernetes docs:

  • progressDeadlineSeconds - Is an optional field that specifies the number of seconds you want to wait for a Deployment to progress before the system reports back that the Deployment has failed progressing.

It is important to note that this directive is given in terms of deployment "progress", not the overall length of time a deployment spans from start to finish. More specifically, progress is defined as any time the deployment creates or deletes a pod/container. When progress is made, the timeout clock is reset to zero.

In short, this directive allows you to tell Kubernetes to give up and abandon a rollout with a progress deadline different from the default 10 minutes.
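
If you want to see the progress deadline in action, "kubectl rollout status" is a convenient way to follow a rollout; it blocks until the rollout finishes and exits with an error if the deployment exceeds its progress deadline. Something along these lines, using the deployment name we'll define shortly:

$ kubectl rollout status deployment/example-foo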

Putting all of these directives together, we end up with a deployment manifest that looks like the following.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-foo
  namespace: default
  labels:
    app: example-foo
    deployment: foo
spec:
  progressDeadlineSeconds: 60
  replicas: 5
  selector:
    matchLabels:
      app: example-foo
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 20%
      maxSurge: 1
  template:
    metadata:
      labels:
        app: example-foo
        deployment: foo
    spec:
      containers:
      - name: example-foo
        image: public.ecr.aws/nginx/nginx:latest
        command: [ "/bin/sh", "-c" ]
        args:
        - sleep 15;
          echo "1.0.0" > /usr/share/nginx/html/index.html;
          echo healthy > /usr/share/nginx/html/healthz;
          nginx -g "daemon off;";
        livenessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 20
          successThreshold: 1
          failureThreshold: 1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 5
          successThreshold: 1
          failureThreshold: 1

This manifest provides us with a deployment that will utilize every aforementioned Kubernetes pod lifecycle directive. Some of the more important aspects of this deployment include:

  • The creation of five pods in total.
  • Each pod will wait 15 seconds before starting an Nginx webserver.
  • Each Nginx webserver will return 200 status codes for probes made to /healthz.
  • Each Nginx webserver returns a version number in the response provided for index.html.
  • Only a single extra pod will be created at a time during a rollout.
  • Only a single pod will be missing from the fleet servicing requests during a rollout.
  • The entire rollout will be marked as failed if 60 seconds pass without “progress” being made.

In short, this deployment will simulate the creation of an application container fleet where each container requires a few seconds of boot time before being able to accept requests. It will enable us to examine how Kubernetes conducts and controls rollouts given the values we specify for the pod lifecycle directives.

In order to begin using this deployment to examine rollout behavior, we’ll want to provision a NodePort service so we can access the web services being provided by each container. The following service manifest should do the trick.

apiVersion: v1
kind: Service
metadata:
  name: foo-nodeport-svc
  labels:
    deployment: foo
spec:
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 80
      nodePort: 30080
  selector:
    deployment: foo
  type: NodePort

We’ll write both of these manifests to separate files and then apply both to a cluster using kubectl. Once applied, we’ll examine the information associated with the deployment, its pods, and the service. You should see something like the following:

$ kubectl apply -f deployments.yaml
deployment.apps/example-foo created

$ kubectl apply -f services.yaml
service/foo-nodeport-svc created

$ kubectl get deployment.apps/example-foo
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
example-foo   3/5     5            3           24s

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
example-foo-6574bff886-tdd8h   1/1     Running   0          31s
example-foo-6574bff886-25gpw   0/1     Running   0          31s
example-foo-6574bff886-4m44j   1/1     Running   0          31s
example-foo-6574bff886-fb7s4   0/1     Running   0          31s
example-foo-6574bff886-txxqp   1/1     Running   0          31s

$ kubectl get service/foo-nodeport-svc
NAME               TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
foo-nodeport-svc   NodePort   10.105.128.167   <none>        80:30080/TCP   39s

You will quickly notice that not all pods are in the “ready” state. This is the result of the sleep statement we added to the webserver container. After about 45 seconds you should see all five pods servicing requests.
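
If you'd rather not refresh "kubectl get pods" by hand, something like the following should block until every pod carrying the deployment's label reports ready (purely a convenience; the label comes from the manifest above and the 90-second ceiling is arbitrary):

$ kubectl wait --for=condition=Ready pod -l app=example-foo --timeout=90s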

The Kubernetes cluster I utilize for testing purposes has two worker nodes with the following IP addresses. You will see these hosts throughout the remainder of this post.

  • 192.168.0.231
  • 192.168.0.232

Making requests to the exposed services will show that the webservers in our pods have all successfully started and are servicing requests made to /index.html and /healthz.

$ curl -S http://192.168.0.231:30080/healthz
healthy

$ curl -S http://192.168.0.232:30080/healthz
healthy

$ curl -S http://192.168.0.231:30080/index.html
1.0.0

$ curl -S http://192.168.0.232:30080/index.html
1.0.0

Now that we’ve performed our initial deployment, we’ll alter the deployment in such a way as to force Kubernetes to replace all of the pods in the deployment. We’ll accomplish this by incrementing the version number displayed in the response provided by index.html. As in, we’ll change the line in the deployment manifest that writes the version number to /usr/share/nginx/html/index.html:

       command: [ "/bin/sh", "-c" ]
        args:
        - sleep 15;
          echo "1.0.0" > /usr/share/nginx/html/index.html;
          echo healthy > /usr/share/nginx/html/healthz;
          nginx -g "daemon off;";

To look like:

      command: [ "/bin/sh", "-c" ]
        args:
        - sleep 15;
          echo "1.0.1" > /usr/share/nginx/html/index.html;
          echo healthy > /usr/share/nginx/html/healthz;
          nginx -g "daemon off;";

Applying the updated deployment file and examining the output of the following commands over time will illustrate how Kubernetes is managing the rollout of our deployment.

  • kubectl get deployment.apps/example-foo
  • kubectl get pods
  • curl http://192.168.0.231:30080
  • curl http://192.168.0.232:30080
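
To convince yourself that no requests fail mid-rollout, a crude polling loop against the NodePort from a second terminal works well. The IP below is one of my worker nodes; substitute your own. Anything other than a steady stream of 200s would indicate dropped requests during the update.

$ while true; do curl -s -o /dev/null -w "%{http_code} " http://192.168.0.231:30080/index.html; sleep 1; done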

Right after applying the updated manifest we see the following output.

$ kubectl get deployment.apps/example-foo

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
example-foo   5/5     1            5           111s

$ kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
example-foo-59bc65584c-v27q2   0/1     ContainerCreating   0          1s
example-foo-6574bff886-25gpw   1/1     Running             0          111s
example-foo-6574bff886-4m44j   1/1     Running             0          111s
example-foo-6574bff886-fb7s4   1/1     Running             0          111s
example-foo-6574bff886-tdd8h   1/1     Running             0          111s
example-foo-6574bff886-txxqp   1/1     Running             0          111s

$ curl http://192.168.0.231:30080
1.0.0

$ curl http://192.168.0.232:30080
1.0.0

As expected, only a single container was initially provisioned as our maxSurge value was set to “1”. Additionally, the version being displayed in the response for index.html has not changed yet.

Waiting a little more than twenty seconds, we then see the following.

$ kubectl get deployment.apps/example-foo

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
example-foo   5/5     2            5           2m14s

$ kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
example-foo-59bc65584c-84xxc   0/1     ContainerCreating   0          0s
example-foo-59bc65584c-v27q2   1/1     Running             0          24s
example-foo-6574bff886-25gpw   1/1     Running             0          2m14s
example-foo-6574bff886-4m44j   1/1     Running             0          2m14s
example-foo-6574bff886-fb7s4   1/1     Running             0          2m14s
example-foo-6574bff886-tdd8h   1/1     Terminating         0          2m14s
example-foo-6574bff886-txxqp   1/1     Running             0          2m14s

$ curl http://192.168.0.231:30080
1.0.0

$ curl http://192.168.0.232:30080
1.0.0

At this point we see a second new pod being provisioned and a single old pod being terminated. Just what we’d expect by setting maxUnavailable to “20%” with five pods in a deployment. Again, we have yet to see the new version pop up in a response.

A little over a minute in, we see that a total of four new pods have been provisioned, the new version number is being reflected in some responses, and Kubernetes is continuing to terminate old pods.

$ kubectl get deployment.apps/example-foo

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
example-foo   5/5     4            5           3m1s

$ kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
example-foo-59bc65584c-84xxc   1/1     Running             0          48s
example-foo-59bc65584c-9fdp2   1/1     Running             0          23s
example-foo-59bc65584c-v27q2   1/1     Running             0          72s
example-foo-59bc65584c-v59mm   0/1     ContainerCreating   0          2s
example-foo-6574bff886-25gpw   1/1     Running             0          3m2s
example-foo-6574bff886-4m44j   1/1     Terminating         0          3m2s
example-foo-6574bff886-fb7s4   1/1     Terminating         0          3m2s
example-foo-6574bff886-txxqp   1/1     Running             0          3m2s

$ curl http://192.168.0.231:30080
1.0.1

$ curl http://192.168.0.232:30080
1.0.0

Not once during the deployment have we seen a request fail. Granted, we’ve seen differences in the versions (something characteristic of a rolling update) but, so far, we have not experienced any downtime during this deployment.

After a total of about 2.5 minutes we see the deployment completing successfully.

$ kubectl get deployment.apps/example-foo

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
example-foo   5/5     5            5           4m15s

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
example-foo-59bc65584c-84xxc   1/1     Running   0          2m2s
example-foo-59bc65584c-9fdp2   1/1     Running   0          97s
example-foo-59bc65584c-gt6q9   1/1     Running   0          53s
example-foo-59bc65584c-v27q2   1/1     Running   0          2m26s
example-foo-59bc65584c-v59mm   1/1     Running   0          76s

$ curl http://192.168.0.231:30080
1.0.1

$ curl http://192.168.0.232:30080
1.0.1

The final result is that all of the updated containers/pods have been provisioned, every older pod has been terminated, and all requests to index.html are returning the new version number. At this point we can confidently state that our deployment completed successfully with zero downtime!
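
If you want a record of what just happened, "kubectl rollout history" lists the deployment's revisions (and "kubectl rollout undo" would take you back to the previous revision if a new version misbehaved):

$ kubectl rollout history deployment/example-foo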

Some final thoughts to consider are that setting pod lifecycle directives to optimal values won’t always be easy and will, undoubtedly, involve some tweaking, experimentation, and metrics-driven debugging when substantial code-base changes are made and new deployments are rolled out. Rare will be the occasion where you’ll get to “set it and forget it”.

Additionally, responsibility for setting these values needs to be hashed out between the teams involved in the deployment process. Collaboration between developers and Kubernetes admins will, most likely, produce an optimal result. Removing any diffusion of responsibility will go a long way toward ensuring a smooth user experience during deployments.

Thanks for hanging out with me for a bit. Stay tuned for the next installment of this series where I’ll work through a similar deployment process with a container I put together that can produce some of the hiccups you may encounter during a deployment.

Author
Ryan Wendel