Supergiant Blog

Product releases, new features, announcements, and tutorials.

Debugging Kubernetes Applications: How To Guide

Posted by Kirill Goltsman on September 2, 2018

Do your Kubernetes pods or deployments sometimes crash or begin to behave in unexpected ways? 

Without knowing how to inspect and debug them, Kubernetes developers and administrators will struggle to identify the reasons for application failures. Fortunately, Kubernetes ships with powerful built-in debugging tools that let you inspect cluster-level, node-level, and application-level issues. 

In this article, we focus on several application-level issues you might face when you create your pods and deployments. We'll show several examples of using the kubectl CLI to debug pending, waiting, or terminated pods in your Kubernetes cluster. By the end of this tutorial, you'll be able to identify the causes of pod failures in a fraction of the time, making debugging Kubernetes applications much easier. Let's get started!

Tutorial

To complete examples in this tutorial, you need the following prerequisites:

  • A running Kubernetes cluster. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

One of the most common reasons a pod fails to start is that Kubernetes can't find a node on which to schedule it. The scheduling failure might be caused by excessive resource requests from the pod's containers. If you have lost track of how many resources are available in your cluster, the pod's failure to start might confuse and puzzle you. Kubernetes' built-in pod inspection and debugging functionality comes to the rescue, though. Let's see how. Below is a Deployment spec that creates 5 replicas of Apache HTTP Server (5 pods), each requesting 0.3 CPU and 500 Mi of memory (check this article to learn more about the Kubernetes resource model):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd-deployment
  labels:
    app: httpd
spec:
  replicas: 5
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd:latest
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "0.3"
            memory: "500Mi"

Let's save this spec in httpd-deployment.yaml and create the deployment with the following command:

kubectl create -f httpd-deployment.yaml
deployment.extensions "httpd-deployment" created

Now, if we check the replicas created we'll see the following output:

kubectl get pods
NAME                               READY     STATUS      RESTARTS   AGE
httpd-deployment-b644c8654-54fpq   0/1       Pending     0          38s
httpd-deployment-b644c8654-82brr   1/1       Running     0          38s
httpd-deployment-b644c8654-h9cj2   1/1       Running     0          38s
httpd-deployment-b644c8654-jsl85   0/1       Pending     0          38s
httpd-deployment-b644c8654-wkqqx   1/1       Running     0          38s

As you see, only 3 of the 5 replicas are Ready and Running, and the other 2 are in the Pending state. If you are a Kubernetes newbie, you'll probably wonder what all these statuses mean. Some of them are quite easy to understand (e.g., Running) while others are not.

Just to remind the readers, a pod's life cycle includes a number of phases defined in the PodStatus object. Possible values for phase include the following (a quick way to check each pod's phase from the command line follows this list):

  • Pending: The pod has already been accepted by the system, but it has not yet been scheduled to a node, or one or more of its container images have not yet been downloaded or installed.
  • Running: The pod has been scheduled to a specific node, and all its containers are already running.
  • Succeeded: All containers in the pod were successfully terminated and will not be restarted.
  • Failed: At least one container in the pod was terminated with a failure. This means that one of the containers in the pod either exited with a non-zero status or was terminated by the system.
  • Unknown: The state of the pod cannot be obtained for some reason, typically due to a communication error.
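
If you just want a quick overview of which phase each pod is in, you can ask kubectl to print only the pod name and the status.phase field. A small example using the custom-columns output format (the column names are arbitrary):

kubectl get pods -o custom-columns=NAME:.metadata.name,PHASE:.status.phase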

Two replicas of our deployment are Pending, which means that the pods have not yet been scheduled by the system. The next logical question is why that is the case. Let's use the main inspection tool at our disposal -- kubectl describe. Run this command against one of the pods that have a Pending status:

kubectl describe pod  httpd-deployment-b644c8654-54fpq
Name:           httpd-deployment-b644c8654-54fpq
Namespace:      default
Node:           <none>
Labels:         app=httpd
                pod-template-hash=620074210
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/httpd-deployment-b644c8654
Containers:
  httpd:
    Image:      httpd:latest
    Port:       80/TCP
    Host Port:  0/TCP
    Requests:
      cpu:        300m
      memory:     500Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  default-token-9wdtd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9wdtd
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  4m (x37 over 15m)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.

Let's discuss some fields of this description that are useful for debugging:

Namespace -- A Kubernetes namespace in which the pod was created. You might sometimes forget the namespace in which the deployment and pod were created and then be surprised to find no pods when running kubectl get pods. In this case, check all available namespaces by running kubectl get namespaces and access pods in the needed namespace by running kubectl get pods --namespace <your-namespace>.
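
For quick reference, here are those commands together, plus a flag that lists pods across every namespace at once (replace my-namespace with the namespace you are interested in):

# List all namespaces in the cluster
kubectl get namespaces

# List pods in a specific namespace
kubectl get pods --namespace my-namespace

# List pods in all namespaces at once
kubectl get pods --all-namespaces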

Status -- A pod's lifecycle phase defined in the PodStatus object (see the discussion above).

Conditions: PodScheduled -- A Boolean value that tells if the pod was scheduled. The value of this field indicates that our pod was not scheduled.

QoS Class -- Resource guarantees for the pod defined by its quality of service (QoS) class. Depending on how requests and limits are set, pods can be Guaranteed, Burstable, or BestEffort (see the sketch below).
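
A quick YAML sketch of how the class is derived from requests and limits: a pod whose containers set requests equal to limits for both CPU and memory is Guaranteed; setting only requests (as our httpd Deployment does) makes it Burstable; setting neither makes it BestEffort. The fragment below is illustrative and not part of this tutorial's specs:

# Hypothetical container resources that would make the pod Guaranteed
containers:
- name: httpd
  image: httpd:latest
  resources:
    requests:
      cpu: "300m"
      memory: "500Mi"
    limits:
      cpu: "300m"
      memory: "500Mi"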


Events -- pod events emitted by the system. Events are very informative about the potential reasons for the pod's issues. In this example, you can find the event with a FailedScheduling Reason and the informative message indicating that the pod was not scheduled due to insufficient CPU and insufficient memory. Events such as these are stored in etcd to provide high-level information on what is going on in the cluster. To list all events, we can use the following command:

kubectl get events
LAST SEEN   FIRST SEEN   COUNT     NAME                                              KIND      SUBOBJECT                   TYPE      REASON                    SOURCE                 MESSAGE
3h          3h           1         apache-server-558f6f49f6-8bjnc.1541c8c4e84a9d6c   Pod                                   Normal    SuccessfulMountVolume     kubelet, minikube      MountVolume.SetUp succeeded for volume "default-token-9wdtd" 
3h          3h           1         apache-server-558f6f49f6-8bjnc.1541c8c4f1c64e87   Pod                                   Normal    SandboxChanged            kubelet, minikube      Pod sandbox changed, it will be killed and re-created.
3h          3h           1         apache-server-558f6f49f6-8bjnc.1541c8c500f534e9   Pod       spec.containers{httpd}      Normal    Pulled                    kubelet, minikube      Container image "httpd:2-alpine" already present on machine
3h          3h           1         apache-server-558f6f49f6-8bjnc.1541c8c503df3cb5   Pod       spec.containers{httpd}      Normal    Created                   kubelet, minikube      Created container
3h          3h           1         apache-server-558f6f49f6-8bjnc.1541c8c50a061e37   Pod       spec.containers{httpd}      Normal    Started                   kubelet, minikube      Started container
3h          3h           1         apache-server-558f6f49f6-p7mkl.1541c8c4711915e3   Pod                                   Normal    SuccessfulMountVolume     kubelet, minikube      MountVolume.SetUp succeeded for volume "default-token-9wdtd" 
3h          3h           1         apache-server-558f6f49f6-p7mkl.1541c8c475d37603   Pod                                   Normal    SandboxChanged            kubelet, minikube      Pod sandbox changed, it will be killed and re-created.

Please remember that all events are namespaced, so you should indicate the namespace you are searching by typing kubectl get events --namespace=my-namespace.

As you see, the kubectl describe pod <pod-name> command is very powerful for identifying pod issues. It allowed us to find out that the pod was not scheduled due to insufficient memory and CPU. Another way to retrieve extra information about a pod is to pass the -o yaml output flag to kubectl get pod:

kubectl get  pod  httpd-deployment-b644c8654-54fpq -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2018-07-13T10:33:58Z
  generateName: httpd-deployment-b644c8654-
  labels:
    app: httpd
    pod-template-hash: "620074210"
  name: httpd-deployment-b644c8654-54fpq
  namespace: default
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: httpd-deployment-b644c8654
    uid: 40329209-8688-11e8-bf09-0800270c281a
  resourceVersion: "297148"
  selfLink: /api/v1/namespaces/default/pods/httpd-deployment-b644c8654-54fpq
  uid: 40383a20-8688-11e8-bf09-0800270c281a
spec:
  containers:
  - image: httpd:latest
    imagePullPolicy: Always
    name: httpd
    ports:
    - containerPort: 80
      protocol: TCP
    resources:
      requests:
        cpu: 300m
        memory: 500Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-9wdtd
      readOnly: true
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: default-token-9wdtd
    secret:
      defaultMode: 420
      secretName: default-token-9wdtd
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2018-07-13T10:33:58Z
    message: '0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable

This command will output all information that Kubernetes has about this pod. It will contain the description of all spec options and fields you specified including any annotations, restart policy, statuses, phases, and more. The abundance of pod-related data makes this command one of the best tools for debugging pods in Kubernetes.
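
Because this output is structured, you can also extract just the piece you care about with a JSONPath expression instead of scanning the whole document. For example, to print only the message of the PodScheduled condition for our pending pod (kubectl's JSONPath support includes this filter syntax):

kubectl get pod httpd-deployment-b644c8654-54fpq \
  -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].message}'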

That's it! To fix the scheduling issue, you'll need to request an amount of CPU and memory that the node can actually provide. While doing so, keep in mind that Kubernetes runs some default daemons and services, such as kube-proxy, on every node. Therefore, on a single-core node you can't request a full 1.0 CPU for your own apps.
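
Before adjusting the spec, it helps to check how much capacity the node actually has to offer. Assuming the single Minikube node used in this tutorial (named minikube), the Allocatable and Allocated resources sections of the node description show what the node can hand out and what is already requested:

# What the node can hand out to pods
kubectl describe node minikube | grep -A 5 Allocatable

# What has already been requested on the node
kubectl describe node minikube | grep -A 8 "Allocated resources"

Once you know the headroom, lower the requests in httpd-deployment.yaml accordingly (for example, 100m CPU and 128Mi memory per replica would comfortably fit five replicas on a typical Minikube VM) and re-create the Deployment.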

Scheduling failures are only one of the common reasons pods get stuck in the Pending state. Let's create another deployment to illustrate other potential scenarios:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-server
  labels:
    app: httpd
spec:
  replicas: 3
  selector:
    matchLabels:
      app: httpd
  strategy:
    type: RollingUpdate
    rollingUpdate: 
      maxSurge: 40%
      maxUnavailable: 40%
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd:23-alpine
        ports:
        - containerPort: 80

All we need to know about this deployment is that it creates 3 replicas of the Apache HTTP Server and specifies a custom RollingUpdate strategy.

Let's save this spec in httpd-deployment-2.yaml and create the deployment by running the following command:

kubectl create -f httpd-deployment-2.yaml
deployment.apps "apache-server" created

Let's check whether all replicas were successfully created:

kubectl get pods
NAME                            READY     STATUS             RESTARTS   AGE
apache-server-dc9bf8469-bblb4   0/1       ImagePullBackOff   0          53s
apache-server-dc9bf8469-x2wwq   0/1       ErrImagePull       0          53s
apache-server-dc9bf8469-xhmm7   0/1       ImagePullBackOff   0          53s

Oops! As you see, none of the three pods is Ready, and they have ImagePullBackOff and ErrImagePull statuses. These statuses indicate that something went wrong while pulling the httpd image from the Docker Hub repository. Let's describe one of the pods in the deployment to find more information:

kubectl describe pod apache-server-dc9bf8469-bblb4
Name:           apache-server-dc9bf8469-bblb4
Namespace:      default
Node:           minikube/10.0.2.15
Start Time:     Fri, 13 Jul 2018 15:25:02 +0300
Labels:         app=httpd
                pod-template-hash=875694025
Annotations:    <none>
Status:         Pending
IP:             172.17.0.6
Controlled By:  ReplicaSet/apache-server-dc9bf8469
Containers:
  httpd:
    Container ID:   
    Image:          httpd:23-alpine
    Image ID:       
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  default-token-9wdtd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9wdtd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age              From               Message
  ----     ------                 ----             ----               -------
  Normal   Scheduled              3m               default-scheduler  Successfully assigned apache-server-dc9bf8469-bblb4 to minikube
  Normal   SuccessfulMountVolume  3m               kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-9wdtd"
  Normal   Pulling                2m (x4 over 3m)  kubelet, minikube  pulling image "httpd:23-alpine"
  Warning  Failed                 2m (x4 over 3m)  kubelet, minikube  Failed to pull image "httpd:23-alpine": rpc error: code = Unknown desc = Error response from daemon: manifest for httpd:23-alpine not found
  Warning  Failed                 2m (x4 over 3m)  kubelet, minikube  Error: ErrImagePull
  Normal   BackOff                1m (x6 over 3m)  kubelet, minikube  Back-off pulling image "httpd:23-alpine"
  Warning  Failed                 1m (x6 over 3m)  kubelet, minikube  Error: ImagePullBackOff

If you scroll down this description to the bottom, you'll see details on why the pod failed: "Failed to pull image "httpd:23-alpine": rpc error: code = Unknown desc = Error response from daemon: manifest for httpd:23-alpine not found". This means that we specified a container image that does not exist. Let's update our deployment with the right httpd container image version to fix the issue:

kubectl set image deployment/apache-server  httpd=httpd:2-alpine
deployment.apps "apache-server" image updated

Then, let's check the deployment's pods again:

kubectl get pods
NAME                             READY     STATUS             RESTARTS   AGE
apache-server-558f6f49f6-8bjnc   1/1       Running            0          36s
apache-server-558f6f49f6-p7mkl   1/1       Running            0          36s
apache-server-558f6f49f6-q8gj5   1/1       Running            0          36s

Awesome! The Deployment controller has managed to pull the new image and all Pod replicas are now Running.
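
If you want to confirm that the rollout finished, or watch it while it is still in progress, kubectl's rollout subcommands work nicely with Deployments:

# Block until the rollout completes (or report why it is stuck)
kubectl rollout status deployment/apache-server

# Show the revision history of the Deployment
kubectl rollout history deployment/apache-server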

Finding the Reasons Your Pod Crashed

Sometimes, your pod might crash due to syntax errors in the commands and arguments passed to the container. In this case, kubectl describe pod <PodName> will only provide you with the error name, not an explanation of its cause. Let's create a new pod to illustrate this scenario:

apiVersion: v1
kind: Pod
metadata:
  name: pod-crash
  labels:
    app: demo
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sh']
    args: ['-c', 'MIN=5 SEC=45; echo "$(( MIN*60 + SEC + ; ))"']

This pod uses the BusyBox sh command to calculate an arithmetic expression from two variables.

Let's save the spec in pod-crash.yaml and create the pod by running the following command:

kubectl create -f pod-crash.yaml
pod "pod-crash" created

Now, if you check the pod, you'll see the following output:

kubectl get pods
NAME                             READY     STATUS             RESTARTS   AGE
pod-crash                        0/1       CrashLoopBackOff   1          11s

The CrashLoopBackOff status means that you have a pod starting, crashing, starting again, and then crashing again. Kubernetes attempts to restart this pod because the default restartPolicy: Always is enabled. If we had set the policy to Never, the pod would not be restarted.
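
For reference, restartPolicy lives at the pod spec level rather than on individual containers. A minimal, hypothetical fragment showing where it would go if you wanted a failing pod to stay down:

spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sh']
    args: ['-c', 'exit 1']   # a container that always fails, for illustration only
  restartPolicy: Never       # other accepted values: Always (the default) and OnFailure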

The status above, however, does not indicate the precise reason for the pod's crash. Let's try to find more details:

kubectl describe pod pod-crash
Containers:
  busybox:
    Container ID:  docker://f9a67ec6e37281ff16b114e9e5a1f1c0adcd027bd1b63678ac8d09920a25c0ed
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
    Args:
      -c
      MIN=5 SEC=45; echo "$(( MIN*60 + SEC + ; ))"
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Mon, 16 Jul 2018 10:32:35 +0300
      Finished:     Mon, 16 Jul 2018 10:32:35 +0300
    Ready:          False
    Restart Count:  5
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  default-token-9wdtd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9wdtd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age               From               Message
  ----     ------                 ----              ----               -------
  Normal   Scheduled              5m                default-scheduler  Successfully assigned pod-crash to minikube
  Normal   SuccessfulMountVolume  5m                kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-9wdtd"
  Normal   Pulled                 4m (x4 over 5m)   kubelet, minikube  Successfully pulled image "busybox"
  Normal   Created                4m (x4 over 5m)   kubelet, minikube  Created container
  Normal   Started                4m (x4 over 5m)   kubelet, minikube  Started container
  Normal   Pulling                3m (x5 over 5m)   kubelet, minikube  pulling image "busybox"
  Warning  BackOff                2s (x23 over 4m)  kubelet, minikube  Back-off restarting failed container

The description above indicates that the pod is not Ready, that its last state was Terminated with the reason Error, and that it is now waiting in CrashLoopBackOff. However, the description does not provide any further explanation of why the error occurred. Where should we search then?

We are most likely to find the reason for the pod's crash in the BusyBox container logs. You can check them by running kubectl logs ${POD_NAME} ${CONTAINER_NAME}. Note that ${CONTAINER_NAME} can be omitted for pods that contain only a single container (as in our case).
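
For completeness, here are the equivalent forms of that command, including the --previous flag, which shows the logs of the previous (crashed) container instance rather than the current one:

# Logs of the single container in the pod
kubectl logs pod-crash

# The same, with the container name spelled out
kubectl logs pod-crash -c busybox

# Logs of the previous, terminated container instance
kubectl logs pod-crash --previous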

kubectl logs pod-crash
sh: arithmetic syntax error

Awesome! There must be something wrong with our command or argument syntax. Indeed, we made a typo by inserting ; into the expression echo "$(( MIN*60 + SEC + ; ))". Just fix that typo and you are good to go!

Pod Fails Due to the 'Unknown Field' Error

In earlier versions of Kubernetes, a pod could be created even if a mistake was made in a spec field name: the error would be silently ignored if the pod was created with the --validate flag set to false. In newer Kubernetes versions (we are using Kubernetes 1.10.0), the --validate option is set to true by default, so the error for an unknown field is printed and the pod is rejected. This makes debugging much easier. Let's create a pod with a misspelled field name to illustrate this:

apiVersion: v1
kind: Pod
metadata:
  name: pod-field-error
  labels:
    app: demo
spec:
  containers:
  - name: busybox
    image: busybox
    comand: ['sh']
    args: ['-c', 'MIN=5 SEC=45; echo "$(( MIN*60 + SEC))"']

Let's save this spec in pod-field-error.yaml and create the pod with the following command:

kubectl create -f pod-field-error.yaml
error: error validating "pod-field-error.yaml": error validating data: ValidationError(Pod.spec.containers[0]): unknown field "comand" in io.k8s.api.core.v1.Container; if you choose to ignore these errors, turn validation off with --validate=false

As you see, pod creation was blocked because of the unknown 'comand' field (we made a typo in this field name intentionally). If you are using an older version of Kubernetes and the pod is created with the error silently ignored, delete the pod and re-create it with kubectl create --validate -f pod-field-error.yaml. This command will help you find the reason for the error:

kubectl create --validate -f pod-field-error.yaml
I0805 10:43:25.129850   46757 schema.go:126] unknown field: comand
I0805 10:43:25.129973   46757 schema.go:129] this may be a false alarm, see https://github.com/kubernetes/kubernetes/issues/6842
pods/pod-field-error

Cleaning Up

This tutorial is over, so let's clean up after ourselves.

Delete Deployments:

kubectl delete deployment httpd-deployment
deployment.extensions "httpd-deployment" deleted
kubectl delete deployment apache-server 
deployment.extensions "apache-server" deleted

Delete Pods:

kubectl delete pod pod-crash
pod "pod-crash" deleted
kubectl delete pod pod-field-error
pod "pod-field-error" deleted

You may also want to delete files with spec definitions if you don't need them anymore.

Conclusion

As this tutorial demonstrated, Kubernetes ships with great debugging tools that help identify the reasons for pod failures or unexpected behavior in a fraction of the time. The rule of thumb for Kubernetes debugging is to first find out the pod's status and then check event messages by using kubectl describe or kubectl get events. If your pod crashes and a detailed error message is not available, you can check the containers' logs to find container-level errors and exceptions. These simple tools will dramatically increase your debugging speed and efficiency, freeing up time for more productive work. 

Stay tuned to upcoming blogs to learn more about node-level and cluster-level debugging in Kubernetes.

Keep reading

Using Kubernetes Cron Jobs to Run Automated Tasks

Posted by Kirill Goltsman on August 29, 2018

In a previous tutorial, you learned how to use Kubernetes jobs to perform some tasks sequentially or in parallel. However, Kubernetes goes even further with task automation by letting you create cron jobs that perform finite tasks repeatedly at the times you specify. Cron jobs can be used to automate a wide variety of common computing tasks such as creating database backups and snapshots, sending emails, or upgrading Kubernetes applications. Before you learn how to run cron jobs, make sure to consult our earlier tutorial about Kubernetes Jobs. If you are ready, let's delve into the basics of cron jobs, where we'll show you how they work and how to create and manage them. Let's get started!

Definition of Cron Jobs

Cron (whose name comes from the Greek word for time, χρόνος) was originally a time-based job scheduler in Unix-like operating systems. At the OS level, cron files are used to schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals. They are useful for automating system maintenance, administration, or scheduled interaction with remote services (software and repository updates, emails, etc.). First used in Unix-like operating systems, cron job implementations have become ubiquitous today. The CronJob API became a beta feature in Kubernetes 1.8 and is widely supported by the Kubernetes ecosystem for automated backups, synchronization with remote services, system and application maintenance (upgrades, updates, cleaning the cache), and more. Read on, because we will show you a basic example of a cron job used to perform a mathematical operation.

Tutorial

To complete examples in this tutorial, you need the following prerequisites:

  • A running Kubernetes cluster at version >= 1.8 (required for cron jobs). For earlier versions of Kubernetes (< 1.8), you need to explicitly turn on the batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see how to do this in this tutorial) and then restart both the API server and the controller manager. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

Let's assume we have a simple Kubernetes Job that calculates π to 3000 places using Perl and prints the result to stdout.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(3000)"]
      restartPolicy: Never
  backoffLimit: 4

We can easily turn this simple job into a cron job. In essence, a cron job is a type of API resource that creates a standard Kubernetes Job at a specified date or interval. The following template can be used to turn our π job into a full-fledged cron job:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pi-cron
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      completions: 2
      template:
        metadata:
          name: pi
        spec:
          containers:
          - name: pi
            image: perl
            command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(3000)"]
          restartPolicy: Never

Let's look closely at the key fields of this spec:

.spec.schedule -- a scheduled time for the cron job to be created and executed. The field takes a cron format string, such as 0 * * * * or @hourly. The cron format string uses the format of the standard crontab (cron table) file -- a configuration file that specifies shell commands to run periodically on a given schedule. See the format in the example below:

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                       7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * *  command to execute

From left to right, the five asterisks correspond to the minute, hour, day of month, month, and day of week on which to run the cron job; the final field is the command to execute.

In our spec, we combined a slash (/) with an asterisk in the minutes field (*/1) to specify a step, or interval, at which to run the job -- in this case, every minute. For example, */5 written in the minutes field would cause the cron job to calculate π every 5 minutes. Correspondingly, if we wanted to run the cron job hourly, we could write 0 */1 * * * to accomplish that (a few more illustrative schedules follow below). 
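
A few more illustrative schedule values and how they read (the comments describe the resulting behavior):

schedule: "*/5 * * * *"   # every 5 minutes
schedule: "0 */1 * * *"   # at minute 0 of every hour
schedule: "30 2 * * *"    # every day at 02:30
schedule: "0 9 * * 1"     # every Monday at 09:00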

Format note: The question mark (?) in the schedule field has the same meaning as an asterisk (*). That is, it stands for any available value of a given field.

.spec.jobTemplate -- a cron job's template. It has exactly the same schema as a job but is nested into a cron job and does not require an apiVersion or kind.

.spec.startingDeadlineSeconds -- a deadline in seconds for starting the cron job if it misses its schedule for some reason (e.g., node unavailability). A cron job that does not meet its deadline is regarded as failed. Cron jobs do not have any deadlines by default.

.spec.concurrencyPolicy --  specifies how to treat concurrent executions of a Job created by the cron job. The following concurrency policies are allowed:

  1. Allow (default): the cron job supports concurrently running jobs.
  2. Forbid: the cron job does not allow concurrent job runs. If the current job has not finished yet, a new job run will be skipped.
  3. Replace: if the previous job has not finished yet and the time for a new job run has come, the previous job will be replaced by a new one.

In this example, we are using the default Allow policy. Computing π to 3000 places and printing it out takes more than a minute, so we expect our cron job to start a new job even if the previous one has not yet completed.

.spec.suspend -- if the field is set to true, all subsequent job executions are suspended. This setting does not apply to executions which already began. The default value is false.

.spec.successfulJobsHistoryLimit -- the field specifies how many successfully completed jobs should be kept in job history. The default value is 3.

.spec.failedJobsHistoryLimit -- the field specifies how many failed jobs should be kept in job history. The default value is 1. Setting this limit to 0 means that no jobs will be kept after completion.
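
Putting several of these optional fields together, here is an illustrative fragment of what the .spec of the pi-cron example could look like with stricter settings. The values, and the switch from Allow to Forbid, are purely for illustration and differ from the spec used in this tutorial:

spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 20
  concurrencyPolicy: Forbid        # skip a new run if the previous job is still going
  suspend: false                   # set to true to pause future runs
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 2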

That's it! Now you have a basic understanding of available cron job settings and options. 

Let's continue with the tutorial. Open two terminal windows. In the first one, you are going to watch the jobs created by the cron job:

kubectl get jobs --watch

Let's save the spec above in cron-job.yaml and create the cron job by running the following command in the second terminal:

kubectl create -f cron-job.yaml
cronjob.batch "pi-cron" created

In a minute, you should see in the first terminal window that the cron job has created a new job with two desired completions (as per the completions value in our spec):

kubectl get jobs --watch
NAME              DESIRED   SUCCESSFUL   AGE
pi-cron-1531219740   2         0         0s
pi-cron-1531219740   2         0         0s

You can also check that the cron job was successfully created by running:

kubectl get cronjobs
NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
pi-cron   */1 * * * *   False     1         57s             1m

Computing π to 3000 places is computationally intensive and takes more time than our cron job's schedule interval (1 minute). Since we used the default concurrency policy (Allow), you'll see that the cron job starts new jobs even though the previous ones have not yet completed:

kubectl get jobs --watch
NAME              DESIRED   SUCCESSFUL   AGE
pi-cron-1531219740   2         0         0s
pi-cron-1531219740   2         0         0s
pi-cron-1531219800   2         0         0s
pi-cron-1531219800   2         0         0s
pi-cron-1531219860   2         0         0s
pi-cron-1531219860   2         0         0s
pi-cron-1531219740   2         1         2m
pi-cron-1531219800   2         1         1m
pi-cron-1531219860   2         1         57s
pi-cron-1531219920   2         0         0s
pi-cron-1531219920   2         0         0s
pi-cron-1531219740   2         2         3m
pi-cron-1531219800   2         2         2m
pi-cron-1531219860   2         2         1m
pi-cron-1531219920   2         1         20s
pi-cron-1531219920   2         2         35s
pi-cron-1531219740   2         2         3m
pi-cron-1531219740   2         2         3m
pi-cron-1531219740   2         2         3m
pi-cron-1531219980   2         0         0s

As you see, some old jobs are still in progress, and new ones are created without waiting for them to finish. That's how the Allow concurrency policy works!

Now, let's check whether these jobs are computing π correctly. To do this, simply find one of the pods created by the jobs:

kubectl get pods
NAME                       READY     STATUS             RESTARTS   AGE
pi-cron-1531220100-sbqrx   0/1       Completed          0          3m
pi-cron-1531220100-t8l2v   0/1       Completed          0          3m
pi-cron-1531220160-bqcqf   0/1       Completed          0          2m
pi-cron-1531220160-mqg7t   0/1       Completed          0          2m
pi-cron-1531220220-dzmfp   0/1       Completed          0          1m
pi-cron-1531220220-zrh85   0/1       Completed          0          1m
pi-cron-1531220280-k2ttw   0/1       Completed          0          23s

Next, select one pod from the list and check its logs:

kubectl logs pi-cron-1531220220-dzmfp 

You'll see π calculated to 3000 decimal places (that's pretty impressive):

3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316527120190914564856692346034861045432664821339360726024914127372458700660631558817488152092096282925409171536436789259036001133053054882046652138414695194151160943305727036575959195309218611738193261179310511854807446237996274956735188575272489122793818301194912983367336244065664308602139494639522473719070217986094370277053921717629317675238467481846766940513200056812714526356082778577134275778960917363717872146844090122495343014654958537105079227968925892354201995611212902196086403441815981362977477130996051870721134999999837297804995105973173281609631859502445945534690830264252230825334468503526193118817101000313783875288658753320838142061717766914730359825349042875546873115956286388235378759375195778185778053217122680661300192787661119590921642019893809525720106548586327886593615338182796823030195203530185296899577362259941389124972177528347913151557485724245415069595082953311686172785588907509838175463746493931925506040092770167113900984882401285836160356370766010471018194295559619894676783744944825537977472684710404753464620804668425906949129331367702898915210475216205696602405803815019351125338243003558764024749647326391419927260426992279678235478163600934172164121992458631503028618297455570674983850549458858692699569092721079750930295532116534498720275596023648066549911988183479775356636980742654252786255181841757467289097777279380008164706001614524919217321721477235014144197356854816136115735255213347574184946843852332390739414333454776241686251898356948556209921922218427255025425688767179049460165346680498862723279178608578438382796797668145410095388378636095068006422512520511739298489608412848862694560424196528502221066118630674427862203919494504712371378696095636437191728746776465757396241389086583264599581339047802759009946576407895126946839835259570982582262052248940772671947826848260147699090264013639443745530506820349625245174939965143142980919065925093722169646151570985838741059788595977297549893016175392846813826868386894277415599185592524595395943104997252468084598727364469584865383673622262609912460805124388439045124413654976278079771569143599770012961608944169486855584840635342207222582848864815845602850601684273945226746767889525213852254995466672782398645659611635488623057745649803559363456817432411251507606947945109659609402522887971089314566913686722874894056010150330861792868092087476091782493858900971490967598526136554978189312978482168299894872265880485756401427047755513237964145152374623436454285844479526586782105114135473573952311342716610213596953623144295248493718711014576540359027993440374200731057853906219838744780847848968332144571386875194350643021845319104848100537061468067491927819119793995206141966342875444064374512371819217999839101591956181467514269123974894090718649423196

Awesome! Our cron job works as expected. You can imagine how this functionality might be useful for making regular database backups, performing application upgrades, and automating many other tasks. When it comes to automation, cron jobs are gold!

Cleaning Up

If you don't need the cron job anymore, delete it with kubectl delete cronjob:

$ kubectl delete cronjob pi-cron
cronjob "pi-cron" deleted

Deleting the cron job will remove all the jobs and pods it created and stop it from spawning additional jobs.

Conclusion

Hopefully, you now have a better understanding of how cron jobs can help you automate tasks in your Kubernetes application. We used a simple example that can kickstart your thought process. However, when working with real-world Kubernetes cron jobs, please be aware of the following limitations.

A cron job creates a Job object approximately once per execution time of its schedule. There are certain circumstances in which two jobs are created, or no job is created at all. Therefore, to avoid side effects, jobs should be idempotent, which means they should not change the data consumed by other scheduled jobs. If .spec.startingDeadlineSeconds is set to a large value or left unset (the default) and .spec.concurrencyPolicy is set to Allow, the jobs will always run at least once. If starting your job late is better than not starting it at all, set a longer .spec.startingDeadlineSeconds so the job still runs despite the delay. If you keep these limitations and best practices in mind, your cron jobs will never let your application down.

Keep reading

Making Sense of Kubernetes Jobs

Posted by Kirill Goltsman on August 22, 2018

A Kubernetes Job is a special controller that can create one or more pods and manage them in the process of doing some finite work. Jobs ensure the pods' successful completion and allow rescheduling pods if they fail or terminate due to node hardware failure or node reboot. Kubernetes comes with native support for parallel jobs, which allow distributing workloads among multiple worker pods or performing the same task multiple times until the completions count is reached. The ability to reschedule failed pods and built-in parallelism make Kubernetes Jobs a great solution for parallel and batch processing and for managing work queues in your applications.

In this article, we're going to discuss the architecture of and use cases for Kubernetes Jobs and walk you through simple examples demonstrating how to create and run your custom Jobs. Let's get started!

Why Do Kubernetes Jobs Matter?

Let's assume you have a task of calculating all prime numbers between 1 and 110 using a bash script and Kubernetes. The algorithm for calculating prime numbers is not that difficult, and we could easily create a pod with a bash command implementing it. However, using a bare pod for this kind of operation might run us into several problems.

First, the node on which your pod is running may suddenly shut down due to hardware failure or connection issues. Consequently, the pod running on this node will also cease to exist.

Second, if we were to calculate all prime numbers from 1 to 10000, for example, doing this in a single bash instance would be very slow. The alternative would be to split this range into several batches and assign them to multiple pods. To take a real-world example, we could create a work queue in a key-value store like Redis and make our worker pods process items in this queue until it is empty. Using bare pods to accomplish that would be no big deal if we needed just 3-4 pods, but it would be much harder if our work queue were large enough (e.g., if we had thousands of emails, files, and messages to process). And even if this could be done with manually created pods, the first problem would still not be solved.

So what is the solution? Enter Kubernetes Jobs! They elegantly solve the above-mentioned problems. On the one hand, Jobs allow rescheduling pods to another node if the one they were running on fails. On the other hand, Kubernetes Jobs support pod parallelism with multiple pods performing connected tasks in parallel. In what follows, we will walk you through a simple tutorial that will teach you how to leverage these features of Kubernetes jobs.

Tutorial

To complete examples in this tutorial, you'll need the following prerequisites:

  • a running Kubernetes cluster. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

In the example below, we create a Job to calculate prime numbers between 0 and 110. Let's define the Job spec first:

apiVersion: batch/v1
kind: Job
metadata:
  name: primes
spec:
  template:
    metadata:
      name: primes
    spec:
      containers:
      - name: primes
        image: ubuntu
        command: ["bash"]
        args: ["-c",  "current=0; max=110; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done"]
      restartPolicy: Never
  backoffLimit: 4

As you see, the Job uses the batch/v1 apiVersion, which is the first major difference from bare pods and Deployments. However, Jobs use the same PodTemplateSpec as Deployments and other controllers. In our case, we defined a pod running the ubuntu container from the public Docker Hub repository. Inside the container, we use the bash command provided by the image with a script that calculates prime numbers.

Also, we set the spec.template.spec.restartPolicy parameter to Never to prevent the pod from restarting once the operation is completed. Finally, the .spec.backoffLimit field specifies the number of retries before the Job is considered failed. This is useful when you want to fail a Job after some number of retries, for example due to a logical error in the configuration. The default value of .spec.backoffLimit is 6.
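
If you want to double-check what these fields do without leaving the terminal, kubectl explain prints the built-in documentation for any field path, for example:

kubectl explain job.spec.backoffLimit
kubectl explain job.spec.template.spec.restartPolicy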

Let's save this spec in job-prime.yaml and create the job by running the following command:

kubectl create -f job-prime.yaml
job.batch "primes" created

Next, let's check the status of the running job:

kubectl describe jobs/primes
Name:           primes
Namespace:      default
Selector:       controller-uid=2415e4c1-802d-11e8-a389-0800270c281a
Labels:         controller-uid=2415e4c1-802d-11e8-a389-0800270c281a
                job-name=primes
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Thu, 05 Jul 2018 11:26:40 +0300
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=2415e4c1-802d-11e8-a389-0800270c281a
           job-name=primes
  Containers:
   primes:
    Image:      ubuntu
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
    Args:
      -c
      current=0; max=110; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  46s   job-controller  Created pod: primes-bwdt7

Pay attention to several important fields in this description. In particular, the Parallelism key has a value of 1 (the default), indicating that only one pod was started to do this job. In turn, the Completions key tells us that the job made one successful completion of the task (i.e., the prime number calculation). Since the pod successfully completed this task, the job was completed as well. Let's verify this by running:

kubectl get jobs
NAME      DESIRED   SUCCESSFUL   AGE
primes    1         1            5m

You can also easily check the prime numbers calculated by the bash script. At the bottom of the job description, find the name of the pod created by the Job (in our case, it is a pod named primes-bwdt7; pod names follow the [JOB_NAME]-[HASH_VALUE] format). Let's check the logs of this pod:

kubectl logs primes-bwdt7
1
2
3
5
7
11
13
17
19
23
29
31
37
41
....

That's it! The pod created by our Job has successfully calculated all prime numbers between 0 and 110. The above example represents a non-parallel Job. In this type of Job, just one pod is started unless it fails, and the Job is completed as soon as the pod completes successfully.

However, the Job controller also supports parallel Jobs which can create several pods working on the same task. There are two types of parallel jobs in Kubernetes: jobs with a fixed completions count and parallel jobs with a work queue. Let's discuss both of them.

Jobs with a Fixed Completions Count

Jobs with a fixed completions count create one or more pods sequentially, and each pod has to complete its work before the next one is started. This type of Job needs to specify a non-zero positive value for .spec.completions, which refers to the number of successful pod completions required. A Job is considered complete when there is one successful pod for each value in the range 1 to .spec.completions (in other words, each pod started should complete the task). Jobs of this type may or may not specify the .spec.parallelism value; if the field is not specified, the Job runs 1 pod at a time. Let's test how this type of Job works using the same spec as above with some slight modifications:

apiVersion: batch/v1
kind: Job
metadata:
  name: primes-parallel
  labels:
     app: primes
spec:
  completions: 3
  template:
    metadata:
      name: primes
      labels:
        app: primes
    spec:
      containers:
      - name: primes
        image: ubuntu
        command: ["bash"]
        args: ["-c",  "current=0; max=110; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done"]
      restartPolicy: Never

The only major change we made is adding the .spec.completions field set to 3, which asks Kubernetes to run the task to completion 3 times. Also, we set the app: primes label on our pods so we can access them with kubectl later.

Now, let's open two terminal windows.

In the first terminal, we are going to watch the pods created:

kubectl get pods -l app=primes -w

Save this spec in job-prime-2.yaml and create the job by running the following command in the second terminal:

kubectl create -f job-prime-2.yaml
job.batch "primes-parallel" created

Next, let's watch what's happening in the first terminal window:

kubectl get pods -l app=primes -w
NAME                    READY     STATUS    RESTARTS   AGE
primes-parallel-7gsl8   0/1       Pending   0          0s
primes-parallel-7gsl8   0/1       Pending   0         0s
primes-parallel-7gsl8   0/1       ContainerCreating   0         0s
primes-parallel-7gsl8   1/1       Running   0         3s
primes-parallel-7gsl8   0/1       Completed   0         14s
primes-parallel-nsd7k   0/1       Pending   0         0s
primes-parallel-nsd7k   0/1       Pending   0         0s
primes-parallel-nsd7k   0/1       ContainerCreating   0         0s
primes-parallel-nsd7k   1/1       Running   0         4s
primes-parallel-nsd7k   0/1       Completed   0         14s
primes-parallel-ldr7x   0/1       Pending   0         0s
primes-parallel-ldr7x   0/1       Pending   0         0s
primes-parallel-ldr7x   0/1       ContainerCreating   0         0s
primes-parallel-ldr7x   1/1       Running   0         4s
primes-parallel-ldr7x   0/1       Completed   0         14s

As you see, the Job controller started three pods sequentially, waiting for the current pod to complete the operation before starting the next one.

For more details, let's check the status of the Job again:

kubectl describe jobs/primes-parallel
Name:           primes-parallel
Namespace:      default
Selector:       controller-uid=2ec4494f-8035-11e8-a389-0800270c281a
Labels:         controller-uid=2ec4494f-8035-11e8-a389-0800270c281a
                job-name=primes-parallel
Annotations:    <none>
Parallelism:    1
Completions:    3
Start Time:     Thu, 05 Jul 2018 12:24:14 +0300
Pods Statuses:  0 Running / 3 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=2ec4494f-8035-11e8-a389-0800270c281a
           job-name=primes-parallel
  Containers:
   primes:
    Image:      ubuntu
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
    Args:
      -c
      current=0; max=70; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  1m    job-controller  Created pod: primes-parallel-b7d4s
  Normal  SuccessfulCreate  1m    job-controller  Created pod: primes-parallel-rwww4
  Normal  SuccessfulCreate  54s   job-controller  Created pod: primes-parallel-kfpqc

As you see, all three pods successfully completed the task and exited. We can verify that the Job was completed as well by running:

kubectl get jobs/primes-parallel
NAME              DESIRED   SUCCESSFUL   AGE
primes-parallel   3         3            6m

If you look into the logs of each pod, you'll see that each of them completed the prime numbers calculation successfully (to check logs, take the pod name from the Job description):

kubectl logs primes-parallel-b7d4s
1
2
3
5
7
11
13
17
19
23
29
31
...
kubectl logs primes-parallel-rwww4
1
2
3
5
7
11
13
17
19
23
29
31
...

That's it! Parallel jobs with a fixed completions count are very useful when you want to perform the same task multiple times. However, what about a scenario where we want one task to be completed in parallel by several pods? Enter parallel jobs with a work queue!

Parallel Jobs with a Work Queue

Parallel jobs with a work queue can create several pods that coordinate among themselves, or with an external service, to determine which part of the job each should work on. If your application has a work queue backed by some remote data store, for example, this type of Job can create several parallel worker pods that independently access the work queue and process it. Parallel jobs with a work queue come with the following features and requirements:

  • for this type of Job, you should leave .spec.completions unset.
  • each worker pod created by the Job is capable of assessing whether or not all its peers are done and, thus, whether the entire Job is done (e.g., each pod can check whether the work queue is empty and exit if so).
  • when any pod terminates with success, no new pods are created.
  • once at least one pod has exited with success and all pods are terminated, then the job completes with success as well.
  • once any pod has exited with success, other pods should not be doing any work and should also start exiting.

Let's add parallelism to the previous Job spec to see how this type of Job works:

apiVersion: batch/v1
kind: Job
metadata:
  name: primes-parallel-2
  labels:
    app: primes
spec:
  parallelism: 3
  template:
    metadata:
      name: primes
      labels:
        app: primes
    spec:
      containers:
      - name: primes
        image: ubuntu
        command: ["bash"]
        args: ["-c",  "current=0; max=110; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done"]
      restartPolicy: Never

The only difference from the previous spec is that we omitted the .spec.completions field and added the .spec.parallelism field with a value of 3.

Now, let's open two terminal windows as in the previous example. In the first terminal, watch the pods:

kubectl get pods -l app=primes -w

Let's save the spec in job-prime-3.yaml and create the job in the second terminal:

kubectl create -f job-prime-3.yaml
job.batch "primes-parallel-2" created

Next, let's see what's happening in the first terminal window:

kubectl get pods -l app=primes -w
NAME                      READY     STATUS    RESTARTS   AGE
primes-parallel-2-b2whq   0/1       Pending   0          0s
primes-parallel-2-b2whq   0/1       Pending   0         0s
primes-parallel-2-vhvqm   0/1       Pending   0         0s
primes-parallel-2-cdfdx   0/1       Pending   0         0s
primes-parallel-2-vhvqm   0/1       Pending   0         0s
primes-parallel-2-cdfdx   0/1       Pending   0         0s
primes-parallel-2-b2whq   0/1       ContainerCreating   0         0s
primes-parallel-2-vhvqm   0/1       ContainerCreating   0         0s
primes-parallel-2-cdfdx   0/1       ContainerCreating   0         0s
primes-parallel-2-b2whq   1/1       Running   0         4s
primes-parallel-2-cdfdx   1/1       Running   0         7s
primes-parallel-2-vhvqm   1/1       Running   0         10s
primes-parallel-2-b2whq   0/1       Completed   0         17s
primes-parallel-2-cdfdx   0/1       Completed   0         21s
primes-parallel-2-vhvqm   0/1       Completed   0         23s

As you see, the Job controller created three pods simultaneously. Each pod calculated the prime numbers in parallel, and once all of them completed the task, the Job was successfully completed as well.

Let's see more details in the Job description:

kubectl describe jobs/primes-parallel
Name:           primes-parallel
Namespace:      default
Selector:       controller-uid=d8bfbf9c-8038-11e8-a389-0800270c281a
Labels:         controller-uid=d8bfbf9c-8038-11e8-a389-0800270c281a
                job-name=primes-parallel
Annotations:    <none>
Parallelism:    3
Completions:    <unset>
Start Time:     Thu, 05 Jul 2018 12:50:27 +0300
Pods Statuses:  0 Running / 3 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=d8bfbf9c-8038-11e8-a389-0800270c281a
           job-name=primes-parallel
  Containers:
   primes:
    Image:      ubuntu
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
    Args:
      -c
      current=0; max=110; echo 1; echo 2; for((i=3;i<=max;)); do for((j=i-1;j>=2;)); do if [  `expr $i % $j` -ne 0 ] ; then current=1; else current=0; break; fi; j=`expr $j - 1`; done; if [ $current -eq 1 ] ; then echo $i; fi; i=`expr $i + 1`; done
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  1m    job-controller  Created pod: primes-parallel-8ggnq
  Normal  SuccessfulCreate  1m    job-controller  Created pod: primes-parallel-gbwgm
  Normal  SuccessfulCreate  1m    job-controller  Created pod: primes-parallel-66w65

As you see, all three pods succeeded in performing the task. You can also check the pods' logs to see the calculation results:

kubectl logs primes-parallel-8ggnq
1
2
3
5
7
11
...
kubectl logs primes-parallel-gbwgm
1
2
3
5
7
11
...
kubectl logs primes-parallel-66w65
1
2
3
5
7
11
...

From the logs above, it may look like our parallel Job acted the same way as the Job with a fixed completions count. Indeed, all three pods created by the Job calculated prime numbers in the range of 1-110. The difference, however, is that in this example all three pods did their work in parallel. If we had created a work queue for our worker pods and a script to process items in that queue, we could make the pods pick up different batches of numbers (or messages, emails, etc.) in parallel until no items were left in the queue. In this example, we have neither a work queue nor a script to process it, which is why all three pods ran the same task to completion. Still, the example is enough to illustrate the main feature of this type of Job: parallelism.


In a real-world scenario, we could imagine a Redis list holding some work items (e.g., messages or emails) and three parallel worker pods created by the Job (see the image above). Each pod could run a script that requests a new item from the list, processes it, and checks whether any work items are left. If no more items exist in the list, the pod accessing it would exit with success, telling the controller that the work was done. This notification would cause the other pods to exit as well and the entire Job to complete. Given this functionality, parallel Jobs with a work queue are extremely powerful for processing large volumes of data with multiple workers doing their tasks in parallel.
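
To make this more concrete, below is a minimal, hedged sketch of such a setup. It assumes a Redis Service named redis is reachable inside the cluster and that a list called job-queue has already been filled with work items; the Job name, the image choice, and the echo placeholder are illustrative only and are not part of the examples above.

apiVersion: batch/v1
kind: Job
metadata:
  name: queue-workers
spec:
  parallelism: 3
  template:
    metadata:
      labels:
        app: queue-workers
    spec:
      containers:
      - name: worker
        image: redis            # the official redis image ships with bash and redis-cli
        command: ["bash", "-c"]
        args:
        - |
          while true; do
            # pop the next item from the hypothetical 'job-queue' list
            item=$(redis-cli -h redis lpop job-queue)
            # an empty reply means the queue is drained; exit with success
            if [ -z "$item" ]; then
              exit 0
            fi
            echo "Processing $item"   # replace with real processing logic
          done
      restartPolicy: Never

Each worker keeps popping items until the list is empty and then exits successfully; once all three pods terminate, the Job is marked as complete.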

Cleaning Up

As our tutorial is over, let's clean up all resources:

Delete the Jobs

Deleting a Job will cause all associated pods to be deleted as well.

kubectl delete job primes-parallel-2
job.batch "primes-parallel-2" deleted
kubectl delete job primes-parallel
job.batch "primes-parallel" deleted
kubectl delete job primes
job.batch "primes" deleted

Also, delete all files with the Job specs if you don't need them anymore.

Conclusion

As you have learned, Kubernetes Jobs are extremely useful for parallel computation and batch processing of diverse workloads. However, remember that the Job object does not support the closely-communicating parallel processes commonly found in scientific computing. The Job's basic use case is parallel or sequential processing of independent but related work items such as messages, emails, numbers, or files. Whenever you need batch processing functionality in your Kubernetes apps, Jobs will help you implement it, but you'll need to design your own work queue and a script to process it. In the next tutorial, we'll walk you through several Job design patterns that will help you address a number of real-world scenarios for batch processing.

 

Keep reading

Supergiant Is Now a Certified Kubernetes Provider

Posted by Kirill Goltsman on August 14, 2018

We are proud to announce that Supergiant passed the Kubernetes conformance tests on August 13, 2018, and is now a Certified Kubernetes Provider.

Keep reading

Working with Kubernetes Secrets

Posted by Kirill Goltsman on August 6, 2018

As a Kubernetes user or administrator, you may sometimes need to include sensitive information, such as usernames, passwords, or SSH keys, in your pods. However, putting these types of data in your pod specs verbatim might compromise the security of your application. You need to avoid situations where the sensitive data accidentally ends up in the hands of bad actors.

Fortunately, Kubernetes ships with a well-developed Secrets API designed specifically to solve this problem. Kubernetes Secrets are essentially API objects that encode sensitive data and expose it to your pods in a controlled way, letting you encapsulate secrets within specific containers or share them between pods.

In this tutorial, we introduce you to this powerful API and show you several options for creating and using secrets in your Kubernetes applications. Let's get started!

Why Do Secrets Matter?

Kubernetes Secrets come with numerous benefits compared to exposing sensitive data in your pods verbatim:

  • Since Secret objects can be created independently of the pods that use them, there is less risk of your Secret being viewed if somebody gets access to your pod spec.
  • Secrets are not written to disk (they are stored in a tmpfs), and they are sent only to nodes that need them. Also, Secrets are deleted when the pod that is dependent on them is deleted.
  • On most native Kubernetes distributions, communication between users and the apiserver is protected by SSL/TLS. Therefore, Secrets transmitted over these channels are properly protected.
  • Any given pod does not have access to the Secrets used by another pod, which facilitates encapsulation of sensitive data across different pods.
  • Each container in a pod has to request a Secret volume in its volumeMounts for it to be visible inside the container. This feature can be used to construct security partitions at the pod level.

You now understand the benefits of Secrets, so let's move on to the examples of using them in your Kubernetes applications.

Tutorial

To complete examples in this tutorial, you'll need:

  • A running Kubernetes cluster. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

Different Options for Creating Secrets

Secrets can be created in one of the following ways:

  • from local files using the kubectl tool
  • from literal values using the kubectl tool
  • using a manifest file of kind: Secret

Creating Secrets from Files using Kubectl

You can create local files with sensitive data like passwords and ssh keys and then convert them into a Secret stored and managed by the Kubernetes API. Kubernetes will create key-value pairs using the filenames and their contents and encode the sensitive data in base64 format.

To illustrate how this works, let's create a secret that stores username and password needed to access some application (e.g., database). First, let's create the files with the username and password on your local machine:

$ echo -n 'admin' > ./username.txt
$ echo -n 'jiki893kdjnsd9s' > ./password.txt

Next, we can use kubectl create secret command to package these files into a Secret and create a Secret API object on the API server:

$ kubectl create secret generic db-auth --from-file=./username.txt --from-file=./password.txt
secret "db-auth" created

Let's see if the Secret was successfully created by running the following command:

kubectl get secrets 

You should get the following response:

NAME                  TYPE                                  DATA      AGE
db-auth               Opaque                                2         1m

For more detailed information about the secret, we can use kubectl describe:

kubectl describe secrets/db-auth
Name:         db-auth
Namespace:    default
Labels:       <none>
Annotations:  <none>
Type:  Opaque
Data
====
password.txt:  15 bytes
username.txt:  5 bytes

The type: Opaque means that, from Kubernetes' perspective, the contents of the Secret are unstructured (i.e., the Secret can contain arbitrary key-value pairs). Opaque Secrets contrast with Secrets that store ServiceAccount credentials, which have constrained contents.

Also, as you see, kubectl get secrets and kubectl describe secrets/db-auth do not print the data stored in the Secret to the console. However, you can still retrieve your Secret data in base64 encoding by running the following command:

kubectl get secret db-auth -o yaml

This should return something like this:

apiVersion: v1
data:
  password.txt: amlraTg5M2tkam5zZDlz
  username.txt: YWRtaW4=
kind: Secret
metadata:
  creationTimestamp: 2018-07-02T11:26:16Z
  name: db-auth
  namespace: default
  resourceVersion: "225482"
  selfLink: /api/v1/namespaces/default/secrets/db-auth
  uid: bc188842-7dea-11e8-af62-0800270c281a
type: Opaque

Rather than returning the username and password in their plaintext form, the above command returns them in base64 encoding. You can easily decode the Secret if needed using native console tools:

echo "amlraTg5M2tkam5zZDlz" | base64 --decode
jiki893kdjnsd9s

Creating Secrets from Literal Values using Kubectl

The second option is to create Secrets from literal values. In this case, we manually provide kubectl with our credentials like this:

kubectl create secret generic literal-secret --from-literal=username=supergiant --from-literal=password=Niisdfiis
secret "literal-secret" created

When creating Secrets from literals, you need to make sure that special characters such as $, \, *, and ! are escaped using the \ character. If your actual password is N!d$jjj!, for example, you should escape the ! and $ characters like this:

kubectl create secret generic literal-secret-2 --from-literal=username=supergiant --from-literal=password=N\!d\$jjj\!
secret "literal-secret-2" created

Now, let's check and see if those characters were escaped correctly:

kubectl get secret literal-secret-2 -o yaml
apiVersion: v1
data:
  password: TiFkJGpqaiE=
  username: c3VwZXJnaWFudA==
kind: Secret
metadata:
  creationTimestamp: 2018-07-03T20:22:06Z
  name: literal-secret-2
  namespace: default
  resourceVersion: "238847"
  selfLink: /api/v1/namespaces/default/secrets/literal-secret-2
  uid: c122b2ab-7efe-11e8-aee4-0800270c281a

Get the base64 value of the password and decode it by running the following command:

echo "TiFkJGpqaiE=" |base64 --decode
N!d$jjj!

Awesome! The decoded value matches our original plaintext password.

Note: you do not need to escape special characters if you create Secrets using --from-file.
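
As an alternative to backslash escaping, you can also wrap the literal value in single quotes so that the shell passes it to kubectl unchanged (the Secret name literal-secret-3 below is just an illustration):

kubectl create secret generic literal-secret-3 --from-literal=username=supergiant --from-literal=password='N!d$jjj!'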

Creating a Secret Spec

The third option is to create Secrets using a Secret spec. In this case, we need to convert the username and password data into base64 encoding manually. Let's do it like this:

$ echo -n 'admin' | base64
YWRtaW4=
$ echo -n 'jiki893kdjnsd9s' | base64
amlraTg5M2tkam5zZDlz

Now, we can safely use the base64-encoded credentials in the Secret spec:

apiVersion: v1
kind: Secret
metadata:
  name: test-secret
type: Opaque
data:
  username: YWRtaW4=
  password: amlraTg5M2tkam5zZDlz

Save this spec in the test-secret.yaml file, and create the Secret by running the following command:

kubectl create -f test-secret.yaml
secret "test-secret" created

Format Note: Newlines are not permitted in Secret data. Also, when using the base64 utility on Darwin/macOS, avoid the -b option, which splits long lines. Linux users should add the -w 0 option to base64 commands or pipe the output through base64 | tr -d '\n' if the -w option is not available.
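
For example, on Linux either of the following commands should produce the single-line base64 value used in the test-secret spec above:

echo -n 'jiki893kdjnsd9s' | base64 -w 0
echo -n 'jiki893kdjnsd9s' | base64 | tr -d '\n'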

Security Note: In this example, you include the base64-encoded Secret in a JSON or YAML manifest. Sharing this file or checking it into a source repository will therefore compromise the Secret, because base64 encoding is not encryption and should be treated the same as plain text.

Using Secrets in Pods

So far, we have learned how to create Kubernetes secrets using kubectl and Secret spec. It's time to show how existing Secrets can be consumed by containers in your pods. The most common option is mounting Secrets as data volumes at some location within your container. Let's create a pod to see how this works:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-secret
spec:
  containers:
  - name: db
    image: mongo
    volumeMounts:
    - name: myvolume
      mountPath: "/etc/secrets"
      readOnly: true
  volumes:
  - name: myvolume
    secret:
      secretName: test-secret

Let's describe key secret-related fields of this spec:

  • .spec.volumes[] -- a volume to store the Secret
  • .spec.volumes[].secret.secretName -- the Secret object to store in the volume
  • .spec.containers[].volumeMounts[] -- the volume to be mounted into the container and the mountPath where we want our Secret to appear. Once the pod is created, each key in the Secret data map (i.e., 'username' and 'password') becomes a filename under the mountPath.
  • .spec.containers[].volumeMounts[].readOnly = true -- sets the volume access rights to read-only, which prevents changes to the Secret data stored in the volume.

Also, when referencing Secrets in your pods, pay attention to certain restrictions:

  • A Secret needs to exist before any pods depending on it are created; a pod won't start if the Secret does not exist. If the Secret cannot be retrieved for some reason, kubelet will periodically retry, and once the Secret is fetched, it will create and mount a volume with it.
  • Secrets can be referenced only by pods in the same namespace as the Secret object.
  • Individual Secrets are limited to 1MB in size.

Our Secret 'test-secret' meets all of these requirements, so let's save the spec in secret-volume.yaml and create the pod:

kubectl create -f secret-volume.yaml
pod "pod-with-secret" created

Next, let's check if the volume mounted at the /etc/secrets path of our MongoDb container actually contains the Secret.

First, get a shell to the container running the following command:

kubectl exec -it pod-with-secret -- /bin/bash

When inside the container, list the /etc/secrets folder:

ls /etc/secrets
password  username

That's it! Our Secret object with the username and password was projected into two files in the /etc/secrets folder. Note that the values inside these files are stored in decoded (plaintext) form, not base64. To verify that, run the following commands inside the container:

cat /etc/secrets/username
admin
cat /etc/secrets/password 
jiki893kdjnsd9s

In this example, the filenames for our Secret contents were constructed from the Secret keys (i.e., 'username' and 'password'). However, Kubernetes allows changing the default filenames and paths to which the Secret's contents are projected using the volumes.secret.items parameter. For example, in order to mount the username at /etc/secrets/admin-group/admin-username and the password at /etc/secrets/admin-group/admin-password, you need to make the following changes to the volume spec:

volumes:
  - name: myvolume
    secret:
      secretName: test-secret
      items:
      - key: username
        path: admin-group/admin-username
      - key: password
        path: admin-group/admin-password

The updated spec will project the username and password values of the Secret to the paths specified in the items list. Note that when items is used, only the keys listed there are projected: if our Secret had another key that we did not list, it would not appear in the volume at all. Also, take note that if a nonexistent Secret key is specified in items, the volume will not be created.

Update Note: once a Secret is consumed by a volume, Kubernetes runs periodic checks on that Secret. If the Secret is updated, Kubernetes ensures that the projected keys are updated as well, although the update may take some time depending on the kubelet sync period. However, a container that uses the Secret as a subPath volume mount will not receive Secret updates.
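
For reference, a subPath mount of a single Secret key looks like the hedged fragment below (reusing the myvolume volume from the pod-with-secret spec above); a container mounted this way keeps seeing the original value even after the Secret changes:

    volumeMounts:
    - name: myvolume
      mountPath: "/etc/secrets/username"
      subPath: username
      readOnly: true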

Using Secrets as Environment Variables

As you might remember from the previous article about Working with Kubernetes Containers, environment variables offer a convenient way to expose arbitrary data to your application without overcrowding the execution context. Environment variables can hold Secret keys as well. We can expose Secrets as environment variables using the spec.containers.env field with a set of name-value pairs:

apiVersion: v1
kind: Pod
metadata:
  name: secret-env
spec:
  containers:
  - name: db
    image: mongo
    env:
      - name: SECRET_USERNAME
        valueFrom:
          secretKeyRef:
            name: test-secret
            key: username
      - name: SECRET_PASSWORD
        valueFrom:
          secretKeyRef:
            name: test-secret
            key: password
  restartPolicy: Never

The spec above creates two environment variables, SECRET_USERNAME and SECRET_PASSWORD, which take their values from the username and password keys of the test-secret.

Let's save the spec in secret-env.yaml and create the pod as usual:

kubectl create -f secret-env.yaml 
pod "secret-env" created

Now, you can access the components of the Secret as ordinary environment variables inside the container.

First, get a shell to the running container:

kubectl exec -it secret-env -- /bin/bash

Once inside the container, run:

echo $SECRET_USERNAME
admin
echo $SECRET_PASSWORD
jiki893kdjnsd9s

That's it! The components of your Secret are available as environment variables inside the MongoDB container.

Note: If you use envFrom instead of env to create environment variables in the container, the variable names will be derived from the Secret's keys. If a Secret key is not a valid environment variable name, it will be skipped, but the pod will still be allowed to start. Kubernetes uses the same conventions as POSIX for checking the validity of environment variable names, though this behavior might change. According to POSIX:

Environment variable names used by the utilities in the Shell and Utilities volume of IEEE Std 1003.1-2001 consist solely of uppercase letters, digits, and the '_' (underscore) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names.

If a variable name does not pass the check, an InvalidVariableNames event is fired, and a message listing the invalid keys that were skipped is generated.
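
As a quick, hedged illustration, the same test-secret could be exposed with envFrom as shown below; each valid key (username, password) becomes an environment variable of the same name (the pod name is illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: secret-envfrom
spec:
  containers:
  - name: db
    image: mongo
    envFrom:
    - secretRef:
        name: test-secret
  restartPolicy: Never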

Conclusion

As you have already learned, Kubernetes Secrets are extremely powerful in protecting the sensitive data in your pods from ending up in the wrong hands. We've covered several options for creating Secrets, such as using kubectl with --from-file or --from-literal and defining a Secret spec. You also saw how to use Secrets in pods both as data volumes and as environment variables. When working with Kubernetes Secrets, however, users should be aware of the following risks and best practices:

  • Since Secret data is stored by the API server in etcd as plaintext, Kubernetes administrators should limit access to etcd to admin users and delete the disks used by etcd when they are no longer in use.
  • Secrets consumed as data volumes should be protected at the application level from being accidentally logged or transmitted to an untrusted party.
  • Remember that users who can create a pod that uses a Secret can also see the value of that Secret. Additional security measures are needed to avoid exposing your Secrets to the wrong users.
  • If the cluster runs multiple replicas of etcd, the Secrets will be shared between them. By default, etcd does not secure peer-to-peer communication with SSL/TLS; however, this can be configured.

It's crucial to pay attention to the risks outlined above when designing your applications with Kubernetes Secrets. However, once you start using Secrets correctly, the security of your Kubernetes applications will be dramatically enhanced.

Keep reading

Managing Memory and CPU Resources for Kubernetes Namespaces

Posted by Kirill Goltsman on July 14, 2018

From the previous tutorials, you already know that Kubernetes allows specifying CPU and RAM requests and limits for containers running in a pod, a feature that is very useful for the management of resource consumption by individual pods.

However, if you are a Kubernetes cluster administrator, you might also want to control global consumption of resources in your cluster and/or configure default resource requirements for all containers.

Fortunately, Kubernetes supports cluster resource management at the namespace level. As you might already know, Kubernetes namespaces provide scopes for names and resource quotas, which make it possible to divide cluster resources efficiently between multiple users, projects, and teams. In Kubernetes, you can define default resource requests and limits, resource constraints (minimum and maximum resource requests and limits), and resource quotas for all containers running in a given namespace. These features enable efficient resource utilization by the applications in your cluster and help divide resources productively between different teams. For example, using resource constraints for namespaces allows you to control how resources are used by your production and development workloads, letting each consume its fair share of the limited cluster resources. This can be achieved by creating separate namespaces for production and development workloads and assigning different resource constraints to them, as sketched below.
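
For instance, such a separation could start with nothing more than two namespaces, each of which then gets its own LimitRange or ResourceQuota objects (the namespace names below are purely illustrative):

kubectl create namespace production
kubectl create namespace development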

In this tutorial, we show you three strategies for efficient management of your cluster resources: setting default resource requests and limits for containers, defining minimum and maximum resource constraints, and setting resource quotas for all containers in a namespace. These strategies will help you address a wide variety of use cases, leveraging the full power of Kubernetes namespaces and resource management.

Tutorial

To complete examples in this tutorial, you'll need the following prerequisites:

  • A running Kubernetes cluster. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

Example #1: Defining Default Resource Requests and Limits for Containers in a Namespace

In this example, we're going to define default requests and limits for containers in a namespace. These default values are automatically applied to containers that do not specify their own resource requests and limits. In this way, default resource requests and limits impose a binding resource usage policy on the containers in your namespace.

As you already know, default resource requests and limits are defined at the namespace level, so we need to create a new namespace:

kubectl create namespace default-resources-config
namespace "default-resources-config" created

Default values for resource requests and limits for a namespace must be defined in a LimitRange object. We chose to use the following spec:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests-and-limits
spec:
  limits:
  - default:
      memory: 512Mi
      cpu: 0.8
    defaultRequest:
      memory: 256Mi
      cpu: 0.4
    type: Container

The spec.limits.default field of this spec sets default resource limits and the spec.limits.defaultRequest field sets the default requests for the containers running in our namespace.

Save this spec in the limit-range-1.yaml file and create the LimitRange running the following command:

kubectl create -f limit-range-1.yaml --namespace=default-resources-config
limitrange "default-requests-and-limits" created

Now, if we create a pod in the default-resources-config namespace and omit memory or CPU requests and limits for its container, it will be assigned the default values defined for the LimitRange above. Let's create a pod to see how this works:

apiVersion: v1
kind: Pod
metadata:
  name: default-resources-demo
spec:
  containers:
  - name: default-resources-cont
    image: httpd:2.4

Let's save this pod spec in the default-resources-demo-pod.yaml and create a pod in our namespace:

kubectl create -f default-resources-demo-pod.yaml --namespace default-resources-config
pod "default-resources-demo" created

As you see, the Apache HTTP Server container in the pod has no resource requests and limits specified. However, since we have defined default resources for the namespace, they will be assigned to the container automatically:

kubectl get pod default-resources-demo --output=yaml --namespace=default-resources-config

As you see in the output below, the default resource requests and limits were automatically applied to our container:

containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: default-resources-cont
    resources:
      limits:
        cpu: 800m
        memory: 512Mi
      requests:
        cpu: 400m
        memory: 256Mi

It's as simple as that!

However, what happens if we specify only requests or limits but not both? Let's create a new pod with only resource limits specified to check this:

apiVersion: v1
kind: Pod
metadata:
  name: default-resources-demo-2
spec:
  containers:
  - name: default-resources-cont
    image: httpd:2.4
    resources:
      limits:
        memory: "1Gi"
        cpu: 1

Let's save this spec in default-resources-demo-pod-2.yaml and create the pod in our namespace:

kubectl create -f default-resources-demo-pod-2.yaml --namespace default-resources-config
pod "default-resources-demo-2" created

Now, check the container resources assigned:

kubectl get pod default-resources-demo-2 --output=yaml --namespace default-resources-config

The response should be:

containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: default-resources-cont
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: "1"
        memory: 1Gi

As you see, Kubernetes automatically set the resource requests equal to the limits specified by the container. Note that these request values are applied even though the container did not specify any resource requests of its own.

Next, let's see what happens if memory and CPU requests are specified and resource limits are omitted. Create a spec for the third pod:

apiVersion: v1
kind: Pod
metadata:
  name: default-resources-demo-3
spec:
  containers:
  - name: default-resources-cont
    image: httpd:2.4
    resources:
      requests:
        memory: "0.4Gi"
        cpu: 0.6

Let's save the spec in the default-resources-demo-pod-3.yaml and create the pod in our namespace:

kubectl create -f default-resources-demo-pod-3.yaml --namespace default-resources-config
pod "default-resources-demo-3" created

After the pod has been created, check the container resources assigned:

kubectl get pod default-resources-demo-3 --output=yaml --namespace default-resources-config

You should get the following output in your terminal:

containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: default-resources-cont
    resources:
      limits:
        cpu: 800m
        memory: 512Mi
      requests:
        cpu: 600m
        memory: 429496729600m

As you see, the container was assigned the default limits defined for the namespace while keeping the resource requests it specified itself.

Note: if the container's memory or CPU requests are greater than the corresponding default resource limits, the pod won't be created.
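
For example, a pod like the hedged sketch below, whose memory request exceeds the 512Mi default limit while no limit of its own is set, would be rejected by the API server (the pod name is illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: default-resources-demo-4
spec:
  containers:
  - name: default-resources-cont
    image: httpd:2.4
    resources:
      requests:
        memory: "1Gi"
        cpu: 0.5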

Cleaning Up

Let's clean up after this example is completed.

Delete the namespace:

kubectl delete namespace default-resources-config
namespace "default-resources-config" deleted

Example #2 : Setting Min and Max Resource Constraints for the Namespace

In this example, we're going to create resource constraints for a namespace. These constraints are essentially the minimum and maximum resource amounts that containers can use in their resource requests and limits. Let's see how this works!

As in the previous example, create a namespace first:

kubectl create namespace resource-constraints-demo
namespace "resource-constraints-demo" created

Next, you are going to create a LimitRange for this namespace:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-constraints-lr
spec:
  limits:
  - max:
      memory: 1Gi
      cpu: 0.8
    min:
      memory: 500Mi
      cpu: 0.3
    type: Container

Save this LimitRange in the limit-range-2.yaml and create it:

kubectl create -f limit-range-2.yaml --namespace resource-constraints-demo
limitrange "resource-constraints-lr" created

After the LimitRange was created, let's see if our minimum and maximum resource constraints were applied to the namespace:

kubectl get limitrange resource-constraints-lr --namespace resource-constraints-demo --output=yaml 

The response should be:

spec:
  limits:
  - default:
      cpu: 800m
      memory: 1Gi
    defaultRequest:
      cpu: 800m
      memory: 1Gi
    max:
      cpu: 800m
      memory: 1Gi
    min:
      cpu: 300m
      memory: 500Mi
    type: Container

As you see, the default resource requests and limits for the namespace were automatically set equal to the max resource constraints specified in the LimitRange. Now, when we create containers in the resource-constraints-demo namespace, the following rules apply automatically:

  • If the container does not specify its resource request and limit, the default resource request and limit are applied.
  • All containers in the namespace need to have resource requests greater than or equal to 300m for CPU and 500 Mi for memory.
  • All containers in the namespace need to have resource limits less than or equal to 800m for CPU and 1Gi for memory.

Let's create a pod to illustrate how namespace resource constraints are applied to containers:

apiVersion: v1
kind: Pod
metadata:
  name: resource-constraints-pod
spec:
  containers:
  - name: resource-constraints-ctr
    image: httpd:2.4
    resources:
      limits:
        memory: "900Mi"
        cpu: 0.7
      requests:
        memory: "600Mi"
        cpu: 0.4

This spec requests 600Mi of RAM and 0.4 CPU and sets a limit of 900Mi RAM and 0.7 CPU for the httpd container within this pod. These resource requirements meet the minimum and maximum constraints for the namespace.

Let's save this spec in the resource-constraints-pod.yaml and create the pod in our namespace:

kubectl create -f resource-constraints-pod.yaml --namespace resource-constraints-demo
pod "resource-constraints-pod" created

Next, check the resources assigned to the container in the pod:

kubectl get pod resource-constraints-pod --namespace resource-constraints-demo --output=yaml

You should get the following output:

containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: resource-constraints-ctr
    resources:
      limits:
        cpu: 700m
        memory: 900Mi
      requests:
        cpu: 400m
        memory: 600Mi

That's it! The pod was successfully created because the container's request and limit are within the minimum and maximum constraints for the namespace.

Now, let's see what happens if we specify requests and limits beyond the minimum and maximum values defined for the namespace. Let's create a new pod with new requests and limits:

apiVersion: v1
kind: Pod
metadata:
  name: resource-constraints-pod-2
spec:
  containers:
  - name: resource-constraints-ctr-2
    image: httpd:2.4
    resources:
      limits:
        memory: "1200Mi"
        cpu: 1.2
      requests:
        memory: "200Mi"
        cpu: 0.2

Save this spec in the resource-constraints-pod-2.yaml and create the pod in our namespace:

kubectl create -f resource-constraints-pod-2.yaml --namespace resource-constraints-demo
pod "resource-constraints-pod-2" created

Since the resource requests are below the minimum LimitRange values and the resource limits are above the maximum values for this namespace, the pod won't be created, as expected:

Error from server (Forbidden): error when creating "resource-constraints-pod-2.yaml": pods "resource-constraints-pod-2" is forbidden: [minimum memory usage per Container is 500Mi, but request is 200Mi., minimum cpu usage per Container is 300m, but request is 200m., maximum cpu usage per Container is 800m, but limit is 1200m., maximum memory usage per Container is 1Gi, but limit is 1200Mi.]

Cleaning Up

This example is over, so let's delete the namespace with all associated pods and other resources:

kubectl delete namespace resource-constraints-demo
namespace "resource-constraints-demo" deleted

Example # 3: Setting Memory and CPU Quotas for a Namespace

In the previous example, we set resource constraints for individual containers running within a namespace. However, it is also possible to restrict the total resource requests and limits of all containers running in a namespace. This can easily be achieved with a ResourceQuota object defined for the namespace.

To illustrate how resource quotas work, let's first create a new namespace so that resources created in this exercise are isolated from the rest of your cluster:

kubectl create namespace resource-quota-demo
namespace "resource-quota-demo" created

Next, let's create a ResourceQuota object with resources quotas for our namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: resource-quota
spec:
  hard:
    requests.cpu: "1.4"
    requests.memory: 2Gi
    limits.cpu: "2"
    limits.memory: 3Gi

This ResourceQuota sets the following requirements for the namespace:

  • The ResourceQuota imposes a requirement for each container to define its memory and CPU requests and limits.
  • The memory request total for all containers must not exceed 2Gi.
  • The CPU request total for all containers in the namespace should not exceed 1.4 CPU.
  • The memory limit total for all containers in the namespace should not exceed 3Gi.
  • The CPU limit total for all containers in the namespace should not exceed 2 CPU.

Save this spec in the resource-quota.yaml and create the ResourceQuota running the following command:

kubectl create -f resource-quota.yaml --namespace resource-quota-demo
resourcequota "resource-quota" created

The ResourceQuota object was created in our namespace and is now ready to control the total requests and limits of all containers in that namespace. Let's see the ResourceQuota description:

kubectl get resourcequota --namespace resource-quota-demo --output=yaml

The response should be:

hard:
      limits.cpu: "2"
      limits.memory: 3Gi
      requests.cpu: 1400m
      requests.memory: 2Gi
  status:
    hard:
      limits.cpu: "2"
      limits.memory: 3Gi
      requests.cpu: 1400m
      requests.memory: 2Gi
    used:
      limits.cpu: "0"
      limits.memory: "0"
      requests.cpu: "0"
      requests.memory: "0"
kind: List

This output shows that no memory or CPU has been consumed in the namespace yet. Let's create two pods to change this situation.

The first pod will request 1.3Gi of RAM and 0.8 CPU and have a resource limit of 1.2 CPU and 2Gi of RAM.

apiVersion: v1
kind: Pod
metadata:
  name: resource-quota-pod-1
spec:
  containers:
  - name: resource-quota-ctr-1
    image: httpd:2.4
    resources:
      limits:
        memory: "2Gi"
        cpu: 1.2
      requests:
        memory: "1.3Gi"
        cpu: 0.8

Save this spec in the resource-quota-pod-1.yaml and create the pod in our namespace:

kubectl create -f resource-quota-pod-1.yaml --namespace resource-quota-demo
pod "resource-quota-pod-1" created

The pod was successfully created because the container's requests and limits are within the resource quota set for the namespace. Let's verify this by checking the current amount of used resources in the ResourceQuota object:

kubectl get resourcequota --namespace resource-quota-demo --output=yaml

The response should be:

status:
    hard:
      limits.cpu: "2"
      limits.memory: 3Gi
      requests.cpu: 1400m
      requests.memory: 2Gi
    used:
      limits.cpu: 1200m
      limits.memory: 2Gi
      requests.cpu: 800m
      requests.memory: 1395864371200m

As you see, the first pod has consumed some of the resources available in the ResourceQuota. Let's create another pod to increase the consumption of available resources even further:

apiVersion: v1
kind: Pod
metadata:
  name: resource-quota-pod-2
spec:
  containers:
  - name: resource-quota-ctr-2
    image: httpd:2.4
    resources:
      limits:
        memory: "1.3Gi"
        cpu: 0.9
      requests:
        memory: "1Gi"
        cpu: 0.8

Save this spec in the resource-quota-pod-2.yaml and create the pod:

kubectl create -f resource-quota-pod-2.yaml --namespace resource-quota-demo

Running this command will cause the following error:

Error from server (Forbidden): error when creating "resource-quota-pod-2.yaml": pods "resource-quota-pod-2" is forbidden: exceeded quota: resource-quota, requested: limits.cpu=900m,limits.memory=1395864371200m,requests.cpu=800m,requests.memory=1Gi, used: limits.cpu=1200m,limits.memory=2Gi,requests.cpu=800m,requests.memory=1395864371200m, limited: limits.cpu=2,limits.memory=3Gi,requests

As you see, Kubernetes does not allow us to create this pod: adding the container's CPU and RAM requests and limits to the amounts already used would exceed the ResourceQuota set for this namespace.

Cleaning Up

This example is completed, so let's clean up:

Delete the namespace:

kubectl delete namespace resource-quota-demo
namespace "resource-quota-demo" deleted

Conclusion

That's it! We have discussed how to set default resource requests and limits, and how to create resource constraints and resource quotas for containers in Kubernetes namespaces.

As you've seen, by setting default requests and limits for containers in your namespace, you can impose namespace-wide resource policies that are automatically applied to all containers that do not manually specify resource requests and limits.

In addition, you learned how to use resource constraints to limit the quantity of resources consumed by individual containers in your namespace. This feature facilitates the efficient management of resources by different application classes and teams and helps keep free resources constantly available in your cluster. The same effect (but at a larger scale) can be achieved with resource quotas, which define constraints on the total consumption of resources by all containers in the namespace.

 

Keep reading