
Using Kubernetes Cron Jobs to Run Automated Tasks

In a previous tutorial, you learned how to use Kubernetes Jobs to perform tasks sequentially or in parallel. Kubernetes takes task automation a step further with cron jobs, which create Jobs that perform finite, time-related tasks repeatedly on a schedule you specify. Cron jobs can automate a wide variety of common computing tasks, such as creating database backups and snapshots, sending emails, or upgrading Kubernetes applications. Before learning how to run cron jobs, make sure to consult our earlier tutorial about Kubernetes Jobs. If you are ready, let’s delve into the basics of cron jobs: how they work and how to create and manage them. Let’s get started!

Definition of Cron Jobs

Cron (named after χρόνος, the Greek word for time) began as a time-based job scheduler in Unix-like operating systems. At the OS level, cron files schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals. They are useful for automating system maintenance, administration, and scheduled interaction with remote services (software and repository updates, emails, etc.). First used in Unix-like operating systems, cron implementations are now ubiquitous. The CronJob API has been enabled by default in Kubernetes since version 1.8 and is widely used across the Kubernetes ecosystem for automated backups, synchronization with remote services, and system and application maintenance (upgrades, updates, cache cleanup). Read on for a basic example of a cron job that performs a mathematical operation.

Tutorial

To complete the examples in this tutorial, you need the following prerequisites:

  • A running Kubernetes cluster at version 1.8 or later (required for cron jobs). On earlier versions (< 1.8), you need to explicitly enable the batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see how to do this in this tutorial), and then restart both the API server and the controller manager. See the Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. Alternatively, you can install a single-node Kubernetes cluster on a local system using Minikube.
  • The kubectl command-line tool, installed and configured to communicate with the cluster. See how to install kubectl here.

Let’s assume we have a simple Kubernetes job that calculates π to 3000 places using perl and prints the result to stdout.

We can easily turn this simple job into a cron job. In essence, a cron job is an API resource that creates a standard Kubernetes job at a specified time or interval. The following template can be used to turn our π job into a full-fledged cron job:

Let’s look closely at the key fields of this spec:

.spec.schedule: the schedule on which the cron job creates and runs its jobs. The field takes a cron format string, such as 0 * * * * or @hourly. This is the format of the standard crontab (cron table) file, a configuration file that specifies shell commands to run periodically on a given schedule. See the format in the example below:

From left to right, the five fields are the minute, the hour, the day of the month, the month, and the day of the week on which to run the job. In a crontab file, the command to execute follows these fields; in a Kubernetes cron job, the job template plays that role.

In this example, we combined an asterisk with a slash (/) and a step of 1 in the minutes field, so the job runs every minute. Likewise, */5 in the minutes field would cause the cron job to calculate π every 5 minutes, and 0 */1 * * * would run it hourly.

Format Note: The question mark (?) in the schedule field has the same meaning as an asterisk (*). That is, it stands for any available value for a given field.
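To make the field order and the step syntax concrete, here is a minimal Python sketch that checks whether a given moment matches a five-field schedule. It is only an illustration, not how Kubernetes parses schedules: the function names are hypothetical, macros such as @hourly are not handled, and all five fields are simply ANDed together (real cron treats a restricted day of month and day of week differently).

```python
from datetime import datetime

# (lowest, highest) allowed value for each field, left to right:
# minute, hour, day of month, month, day of week (0 = Sunday).
FIELD_BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field, low, high):
    """Return the set of integers that one schedule field matches."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base in ("*", "?"):          # '?' is treated like '*'
            start, end = low, high
        elif "-" in base:               # a range such as 1-5
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:                           # a single number
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return values


def matches(schedule, moment):
    """Return True if `moment` falls on the five-field `schedule`."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    actual = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        (moment.weekday() + 1) % 7,     # Python: Monday=0; cron: Sunday=0
    ]
    return all(
        value in expand_field(field, low, high)
        for field, (low, high), value in zip(fields, FIELD_BOUNDS, actual)
    )


if __name__ == "__main__":
    noon = datetime(2018, 5, 1, 12, 0)
    print(matches("*/1 * * * *", noon))    # every minute
    print(matches("0 */1 * * *", noon))    # top of every hour
    print(matches("*/5 * * * *", datetime(2018, 5, 1, 12, 34)))
```

With these assumptions, */1 in the minutes field matches every minute, while 0 */1 * * * matches only when the minute is 0.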

.spec.jobTemplate: the cron job’s job template. It has exactly the same schema as a job but is nested inside the cron job and does not require an apiVersion or kind.
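The nesting can be sketched by building a manifest as a Python dictionary and printing it as JSON, which kubectl also accepts. This is an illustrative reconstruction, not the original template: the name pi-cron, the container name, the perl image, and the exact command are assumptions, and batch/v1 is the API version served by Kubernetes 1.21 and later (clusters from 1.8 through 1.20 serve CronJob as batch/v1beta1).

```python
import json

# An ordinary Job spec: a pod template that computes pi with perl.
# The container name, image, and command are assumptions for illustration.
job_spec = {
    "template": {
        "spec": {
            "containers": [
                {
                    "name": "pi",
                    "image": "perl",
                    "command": ["perl", "-Mbignum=bpi", "-wle", "print bpi(3000)"],
                }
            ],
            "restartPolicy": "OnFailure",
        }
    }
}

# The cron job wraps the Job spec under spec.jobTemplate.
# Note that jobTemplate carries no apiVersion or kind of its own.
cron_job = {
    "apiVersion": "batch/v1",          # assumption: cluster at 1.21 or later
    "kind": "CronJob",
    "metadata": {"name": "pi-cron"},   # hypothetical name
    "spec": {
        "schedule": "*/1 * * * *",     # run every minute
        "jobTemplate": {"spec": job_spec},
    },
}

if __name__ == "__main__":
    print(json.dumps(cron_job, indent=2))
```

Keeping the Job spec in its own variable mirrors the point above: the same structure that would sit under a standalone Job’s spec is reused unchanged inside the cron job.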

.spec.startingDeadlineSeconds: a deadline, in seconds, for starting a job if it misses its scheduled time for some reason (e.g., node unavailability). A job run that misses this deadline is counted as failed. Cron jobs do not have any deadline by default.

.spec.concurrencyPolicy: specifies how to treat concurrent executions of jobs created by the cron job. The following concurrency policies are allowed:
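The deadline rule can be expressed as a small check. The sketch below is a simplified model with a hypothetical function name; it covers only the comparison described above, not the controller’s full handling of missed schedules.

```python
from datetime import datetime, timedelta


def may_still_start(scheduled_time, now, starting_deadline_seconds=None):
    """Return True if a run scheduled for `scheduled_time` may start at `now`.

    A deadline of None models the default: no deadline at all.
    """
    if starting_deadline_seconds is None:
        return True
    delay = now - scheduled_time
    return delay <= timedelta(seconds=starting_deadline_seconds)


if __name__ == "__main__":
    scheduled = datetime(2018, 5, 1, 12, 0, 0)
    late = scheduled + timedelta(seconds=90)
    print(may_still_start(scheduled, late))        # no deadline set
    print(may_still_start(scheduled, late, 60))    # 90 s late, 60 s deadline
    print(may_still_start(scheduled, late, 120))   # 90 s late, 120 s deadline
```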

  1. Allow (default): the cron job allows concurrently running jobs.
  2. Forbid: the cron job does not allow concurrent job runs. If the current job has not finished yet, the new job run is skipped.
  3. Replace: if the previous job has not finished yet when the time for a new job run comes, the previous job is replaced by the new one.

In this example, we are using the default Allow policy. Computing π to 3000 places and printing it out will take more than a minute. Therefore, we expect our cron job to start a new job even if the previous one has not yet completed.
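The three policies can be compared side by side with a toy model. This sketch only mirrors the descriptions in the list above; the function and job names are hypothetical, and it says nothing about how the Kubernetes controller is actually implemented.

```python
def on_schedule_tick(policy, running_jobs, new_job):
    """Return the jobs that are running after a scheduled time arrives.

    running_jobs: names of jobs from earlier ticks that have not finished.
    new_job:      name of the job the cron job would like to start now.
    """
    if policy == "Allow":
        # Start the new job alongside whatever is still running.
        return running_jobs + [new_job]
    if policy == "Forbid":
        # Skip the new run if an earlier job is still in progress.
        return running_jobs if running_jobs else [new_job]
    if policy == "Replace":
        # Drop the unfinished job(s) and run only the new one.
        return [new_job]
    raise ValueError(f"unknown concurrency policy: {policy}")


if __name__ == "__main__":
    still_running = ["pi-job-1"]
    for policy in ("Allow", "Forbid", "Replace"):
        print(policy, on_schedule_tick(policy, still_running, "pi-job-2"))
```

Under Allow, both jobs end up running at once, which is the behavior this tutorial relies on for the slow π computation.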

.spec.suspend: if the field is set to true, all subsequent job executions are suspended. This setting does not apply to executions that have already begun. The default value is false.

.spec.successfulJobsHistoryLimit: specifies how many successfully completed jobs should be kept in the job history. The default value is 3.

.spec.failedJobsHistoryLimit: specifies how many failed jobs should be kept in the job history. The default value is 1. Setting a limit to 0 means that no jobs of that kind will be kept after they finish.
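The effect of the two history limits can be illustrated with a short pruning function. It is a simplified model under the assumption that the most recent jobs of each kind are the ones retained; the function name and job names are hypothetical.

```python
def prune_history(finished_jobs, successful_limit=3, failed_limit=1):
    """Keep only the most recent finished jobs allowed by the two limits.

    finished_jobs: list of (name, succeeded) tuples, ordered oldest to newest.
    Returns the retained jobs in the same oldest-to-newest order.
    """
    remaining = {True: successful_limit, False: failed_limit}
    kept = []
    # Walk from newest to oldest so the most recent jobs win.
    for name, succeeded in reversed(finished_jobs):
        if remaining[succeeded] > 0:
            kept.append((name, succeeded))
            remaining[succeeded] -= 1
    kept.reverse()
    return kept


if __name__ == "__main__":
    history = [
        ("job-a", True), ("job-b", False), ("job-c", True),
        ("job-d", True), ("job-e", False), ("job-f", True),
    ]
    print(prune_history(history))          # default limits: 3 and 1
    print(prune_history(history, 0, 0))    # keep nothing
```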

That’s it! Now you have a basic understanding of available cron job settings and options.

Let’s continue with the tutorial. Open two terminal windows. In the first one, you are going to watch the jobs created by the cron job:

Save the spec above in cron-job.yaml and create the cron job by running the following command in the second terminal:

In a minute, you should see that two π jobs (as per the Completions value) were successfully created in the first terminal window:

You can also check that the cron job was successfully created by running:

Computing π to 3000 places is computationally intensive and takes longer than the one-minute interval of our cron job schedule. Since we used the default concurrency policy (Allow), you’ll see that the cron job starts new jobs even though the previous ones have not yet completed:

As you can see, some old jobs are still in progress and new ones are created without waiting for them to finish. That’s how the Allow concurrency policy works!

Now, let’s check whether these jobs are computing π correctly. To do this, list the pods created by the job:

Next, select one pod from the list and check its logs:

You’ll see π calculated to 3000 decimal places (that’s pretty impressive):

Awesome! Our cron job works as expected. You can imagine how useful this functionality is for making regular database backups, upgrading applications, and many other tasks. When it comes to automation, cron jobs are gold!

Cleaning Up

If you don’t need a cron job anymore, delete it with kubectl delete cronjob:

Deleting the cron job will remove all the jobs and pods it created and stop it from spawning additional jobs.

Conclusion

Hopefully, you now have a better understanding of how cron jobs can help you automate tasks in your Kubernetes application. We used a simple example to kickstart your thought process. However, when working with real-world Kubernetes cron jobs, be aware of the following limitation.

A cron job creates a job object approximately once per execution time of its schedule. In certain scenarios, two jobs are created or no job is created at all. Therefore, jobs should be idempotent: running the same job twice should leave the system in the same state as running it once. If .spec.startingDeadlineSeconds is set to a large value or left unset (the default) and .spec.concurrencyPolicy is set to Allow, the jobs will always run at least once. If starting a job late is better than not starting it at all, set a longer .spec.startingDeadlineSeconds. Keep these limitations and best practices in mind, and your cron jobs will serve your application reliably.
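One common way to make a task idempotent is to key its work on the scheduled time, so that a duplicate run for the same time slot becomes a no-op. The sketch below illustrates the idea with an in-memory dictionary standing in for real storage; the function name, key format, and data are all hypothetical.

```python
def run_backup(store, scheduled_time):
    """Record a backup for `scheduled_time` unless one already exists.

    Returns True if this call did the work, False if it was a duplicate.
    """
    key = f"backup-{scheduled_time}"
    if key in store:
        # A job for this time slot already ran, so do nothing.
        return False
    store[key] = "snapshot data"   # placeholder for the real backup
    return True


if __name__ == "__main__":
    store = {}
    first = run_backup(store, "2018-05-01T12:00")
    second = run_backup(store, "2018-05-01T12:00")   # duplicate job, same slot
    print(first, second, len(store))
```

If the cron job happens to create two jobs for the same scheduled time, the second one leaves the store unchanged, so the outcome is the same as a single run.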