
Assigning Computing Resources to Containers and Pods in Kubernetes

The Kubernetes resource model is designed to optimize resource utilization by containers, ensure efficient scheduling of Pods, and keep applications highly available. At the core of this approach are the resource requests and limits defined for containers when a Pod is created. In this tutorial, we describe the inner workings of the Kubernetes resource model and walk you through assigning compute resources (CPU and RAM) to containers using native Kubernetes tools (kubectl) and the API. We also discuss how resources can be assigned using the Supergiant platform, which provides a Kubernetes-as-a-Service solution.

How Does the Kubernetes Resource Allocation Model Work?

In Kubernetes, you assign resources to containers by specifying CPU and memory (RAM) requests and limits for each container in a Pod. The scheduler can then use these values to choose the right node to place your Pods on.

Kubernetes documentation defines a resource request as the amount of computing resource (CPU or memory) Kubernetes will guarantee to the container. Correspondingly, a resource limit is defined as the maximum amount of resources Kubernetes will allow the container to consume.

It’s important to note that Kubernetes decides whether a Pod can be scheduled on a given node by summing the resource requests of all containers in that Pod. The Pod will not be scheduled on a node if this sum exceeds the node’s allocatable capacity. It is the task of both the scheduler and the kubelet to make sure that the sum of all requests by all containers stays within the node’s capacity for both types of resources (CPU and memory).

Calculation of Resource Requests and Limits

Requests specified by a container should be equal to or greater than 0 and must not be greater than the Node Allocatable capacity. This rule can be summed up by the following formula: 0 <= request <= Node Allocatable . In turn, a limit should be equal to or greater than the request and has no upper bound: request <= limit <= Infinity .

One should remember, though, that scheduling is based on requests , not limits . In other words, a Pod can be scheduled on a node if its resource requests fit within the node’s allocatable capacity, even if its limits exceed that capacity. Also note that even if the actual memory or CPU utilization on a node is low, the scheduler will still refuse to place a Pod on it if the Pod’s resource requests exceed the node’s remaining allocatable capacity. This protects the system against resource shortages during traffic spikes.

Kubernetes Resource Types

Kubernetes abstracts computing resources from the underlying processor architecture, exposing them on demand in raw values or base units. For the CPU resource the base unit is the core, and for memory it is the byte. A memory resource can be specified as a plain integer or as a fixed-point number using the suffixes E, P, T, G, M, K, or their power-of-two counterparts Ei, Pi, Ti, Gi, Mi, Ki. In turn, one CPU is equivalent to:

  • 1 AWS vCPU
  • 1 GCP Core
  • 1 Azure vCore
  • 1 Hyperthread on an Intel processor that has Hyperthreading

Kubernetes allows specifying CPU values in fractional quantities (e.g., 0.5 CPU). For example, a container with a resource request of 0.5 CPU is guaranteed half of a CPU. 0.5 CPU is also equivalent to 500m , which stands for “five hundred millicpu” or “five hundred millicores”. It is noteworthy that, in Kubernetes, the CPU resource is always requested in absolute quantities, meaning that 0.1 is the same amount of CPU on a single-core, dual-core, or any other machine.
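For instance, here is how such quantities appear in the resources stanza of a container spec (a minimal fragment; the values are illustrative):

    resources:
      requests:
        cpu: 500m       # the same as 0.5, i.e., half a core
        memory: 128Mi   # power-of-two suffix: 128 * 1024 * 1024 bytes; plain 128M would mean 128,000,000 bytes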

Pod Classes Depending on Resource Requests and Limits

Depending on how resource requests and limits are set, a Pod falls into one of three distinct classes: guaranteed Pods, burstable Pods, and best-effort Pods.


Pod Classes

Guaranteed Pods

A Pod is regarded as guaranteed if limits  and optionally requests  (not equal to 0) are set for all resources across all containers and they are equal.

Example:
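A minimal manifest along these lines illustrates the pattern (the container names match the discussion below; the image and the specific values are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: guaranteed-pod
    spec:
      containers:
      - name: first
        image: nginx
        resources:
          requests:        # requests equal limits for every resource...
            cpu: 500m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 256Mi
      - name: second
        image: nginx
        resources:
          requests:        # ...in every container of the Pod
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 128Mi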

In this example, both the ‘first’ and the ‘second’ containers in the Pod have requests equal to their respective limits. This makes the Pod a guaranteed one.

Burstable Pods

A Pod is treated as burstable if requests  and optionally limits  are set (not equal to 0) for one or more resources in one or more containers, and they are not equal. When limits  are not specified, a Pod can use as many resources as the node can allocate.

Example:
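A sketch of such a manifest (again, the names and values are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: burstable-pod
    spec:
      containers:
      - name: first
        image: nginx
        resources:
          requests:
            memory: 256Mi
          limits:
            memory: 512Mi   # limit differs from the request
      - name: second
        image: nginx
        resources:
          requests:
            cpu: 250m       # no CPU limit: the container may use as much CPU as the node allows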

In this example, the ‘first’ and ‘second’ containers have requests and limits that are set for different resources and are not equal. This makes the Pod burstable.

Best-Effort Pods

A Pod is treated as best-effort if neither requests nor limits are set for any resource in any of its containers.

Example:
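For instance (illustrative names and image):

    apiVersion: v1
    kind: Pod
    metadata:
      name: best-effort-pod
    spec:
      containers:
      - name: first
        image: nginx    # no resources section at all
      - name: second
        image: nginx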

As we can see, in this example neither the “first” nor the “second” container has its resources specified.

Kubernetes grants different resource rights and priorities to the above-described Pod classes. Best-Effort Pods have the lowest priority and are the first candidates for eviction if the system runs out of memory. Guaranteed Pods, in turn, have the highest priority and are guaranteed not to be killed or throttled unless they exceed their limits, and then only if there are no lower-priority containers that can be removed instead. Finally, Burstable Pods enjoy minimal resource guarantees but are allowed to use more computing resources when they are available. If no Best-Effort Pods exist, Burstable Pods are the first to be killed when the cluster runs short of resources.

The following rules apply to all Pods regardless of their class:

  • containers can exceed their memory requests if memory is available on the Node. However, they are not allowed to use more memory than specified in the memory limit; a container that does becomes a candidate for termination. If the terminated container is restartable, the kubelet  will restart it.
  • whether containers can exceed their CPU limit for extended periods of time depends on the container runtime (e.g., Docker, rkt): some runtimes let containers run past their limit for a short time while others do not. Whether a container gets “burstable” CPU also depends on how much free CPU is currently available on the node. If other containers are competing for those resources, the container is throttled back down to its request; if no one else wants those resources, the container can use as much CPU as specified in its limit.

Why Use Resource Requests and Limits at All?

We now have a basic understanding of how resource requests and limits affect the fate of Pods. But why should we use them at all?

Securing efficient consumption of computing resources and ensuring that high-priority Pods are always running are two key motivations for using Kubernetes resource requests and limits. More specifically:

  • Pods with low CPU and memory requests have a good chance of being scheduled.
  • If you set resource limits higher than resource requests, you can create Pods that can burst whenever CPU and memory resources are available. At the same time, having resource limits guarantees that resources used during a burst are limited to some amount.

Users can, however, run Pods without specifying resource requests and limits. In this case, the following rules apply:

  • The container can use all resources on the Node if no resource limits are specified.
  • If the container runs in a namespace that has a default memory or CPU limit, the container is automatically assigned that default. Cluster administrators can use a LimitRange  to specify such default values for all containers in the namespace (a sketch follows this list).

Note: If you want to find out how Supergiant further extends the Kubernetes resource model with its cost-effective autoscaling algorithm, you will want to read this article.
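Here is a minimal sketch of such a LimitRange (the namespace name and the values are hypothetical):

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-resources
      namespace: assigning-resources-tut   # hypothetical namespace
    spec:
      limits:
      - type: Container
        default:           # default limits for containers that specify none
          cpu: 500m
          memory: 256Mi
        defaultRequest:    # default requests for containers that specify none
          cpu: 250m
          memory: 128Mi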

Tutorial

In what follows, we are going to show how you can easily assign resource requests and limits to containers in a Pod using Kubernetes native tools.

To complete this tutorial, you’ll need the following prerequisites:

  • A running Kubernetes cluster. See Supergiant GitHub wiki for the details on deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster locally using Minikube.
  • A kubectl CLI installed and configured to communicate with the cluster. See how to install kubectl here.
  • A Heapster service running in your cluster. Note: Supergiant clusters are deployed with the Heapster service by default. You can verify whether Heapster is running with kubectl get services --namespace=kube-system , which should show a heapster service in the output if you are running a cluster deployed with Supergiant.

We are now set and can proceed to assigning resources to our Kubernetes containers.

Step 1: Create a new namespace

Creating a new namespace is a good practice that helps isolate computing resources and Pods used in a project from the rest of the cluster. We can create a new namespace for this tutorial using the following command:
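    kubectl create namespace assigning-resources-tut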

If it works out, the console will produce output similar to the following (the exact format depends on your kubectl version):
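    namespace "assigning-resources-tut" created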

Step 2: Specify CPU requests and limits

For this tutorial, we are creating a single-container Pod, which is the most common type of Pod in Kubernetes. The container image used is the NGINX web server pulled from the Docker Hub container repository. To specify a CPU and memory request for a given container, use the spec.containers[].resources.requests  field in the Pod manifest. For resource limits, use the spec.containers[].resources.limits  field. We decided to set the resource request for the NGINX container at 0.5 CPU (or 500 millicpu) and the CPU limit at 1 CPU. Correspondingly, we set a memory request of 500 MiB and a memory limit of 700 MiB for this container.

Here’s the configuration file for the Pod (the Pod is named test-pod, as we’ll refer to it below; the container name is illustrative):
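    apiVersion: v1
    kind: Pod
    metadata:
      name: test-pod
    spec:
      containers:
      - name: test-container
        image: nginx
        resources:
          requests:
            cpu: 500m       # 0.5 CPU
            memory: 500Mi
          limits:
            cpu: "1"        # 1 CPU
            memory: 700Mi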

Step 3: Create a Pod

To create the Pod, first save the configuration above in a file (e.g., test-pod.yaml ), and run the following command (note: use your own path to the file):
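    kubectl create -f ./test-pod.yaml --namespace=assigning-resources-tut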

Step 4: Check whether the pod’s container is running in our namespace

To accomplish this, use the get pod  command with the Pod’s name and the --namespace  argument set to “assigning-resources-tut”, like this:
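    kubectl get pod test-pod --namespace=assigning-resources-tut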

You should see output similar to this:
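    NAME       READY     STATUS    RESTARTS   AGE
    test-pod   1/1       Running   0          8s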

It indicates that the test-pod  is running with no restarts and has an age of 8 seconds.

You can also check detailed information about the Pod with the following command, which outputs the Pod’s data in YAML format:
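    kubectl get pod test-pod --output=yaml --namespace=assigning-resources-tut

Among other fields, the output should contain a resources section mirroring the manifest, along these lines:

    resources:
      limits:
        cpu: "1"
        memory: 700Mi
      requests:
        cpu: 500m
        memory: 500Mi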

Among other things, this output shows that the container is running with the resource limits and requests we specified, so everything works as expected.

Step 5: Checking the actual resource usage

It’s very convenient to be able to track the resources your Pod is actually consuming. We can use the Heapster service to check this. Heapster enables Cluster Monitoring and Performance Analysis for Kubernetes and is installed by default on clusters deployed with Supergiant.

To use Heapster, we should first start a proxy:
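    kubectl proxy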

The proxy will start in the current terminal window, so you should open another terminal to use Heapster. Now, to get the CPU usage rate, run the following in the new terminal window (assuming the proxy listens on its default port 8001 and Heapster runs as the heapster service in the kube-system namespace):
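    curl http://localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/assigning-resources-tut/pods/test-pod/metrics/cpu/usage_rate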

As you see, we are curling the cpu/usage_rate  endpoint of the Heapster model API through the Kubernetes API proxy, for the “test-pod” running in our namespace. You should get output similar to this (the timestamps and values below are illustrative):
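    {
      "metrics": [
        {
          "timestamp": "2018-07-11T12:31:00Z",
          "value": 0
        },
        {
          "timestamp": "2018-07-11T12:32:00Z",
          "value": 0
        }
      ],
      "latestTimestamp": "2018-07-11T12:32:00Z"
    }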

It shows a series of timestamps and the corresponding CPU usage (in millicores) for each, which helps us track CPU usage dynamics over time.

Similarly, to get the memory usage, we can run a command against the memory/usage  endpoint:
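    curl http://localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/assigning-resources-tut/pods/test-pod/metrics/memory/usage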

The output below indicates that the Pod is using 13246454 bytes (roughly 13 MB) of RAM (the timestamp is illustrative):
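    {
      "metrics": [
        {
          "timestamp": "2018-07-11T12:32:00Z",
          "value": 13246454
        }
      ],
      "latestTimestamp": "2018-07-11T12:32:00Z"
    }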

Step 6: Deleting the pod

Now that our tutorial is over, let’s clean up the cluster by deleting the Pod:
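    kubectl delete pod test-pod --namespace=assigning-resources-tut

If you no longer need the namespace either, you can delete it as well:

    kubectl delete namespace assigning-resources-tut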

That’s it! You’ve learned how to assign resources to containers in a Pod. It’s as simple as that!

Managing Resource Requests and Limits with Supergiant

Supergiant is a very flexible system that combines an easy-to-use UI for deploying clusters and resources with access to the low-level Kubernetes APIs and tools. For example, once you’ve deployed a Kube with Supergiant, you can use kubectl on the master the same way as described in this tutorial.

However, what if you don’t have time to learn kubectl  and the low-level Kubernetes API but still want to set resource limits and requests for the containers in your Pods?

Supergiant solves this problem by providing access to ~160 Helm charts in the /stable  Kubernetes Helm repository. This repository stores curated and well-tested Helm charts, which dramatically simplify the deployment of apps in Kubernetes (e.g., see a partial view of the Helm charts in the image below).

Supergiant Helm charts


Unfortunately, not all of these charts expose the ability to set resource requests and limits. However, if they do, you can use Supergiant’s Dashboard to configure them. One of the charts that offer such functionality is Sumokube, a hosted logging platform by Sumologic.

To find this app, click “Apps” in the main navigation menu and then “Deploy New App“. Enter “Sumokube” in the search field and select the chart to open its configuration.

Sumokube requests and limits


As you can see, the chart contains default values for resource requests and limits, which can easily be changed to match your needs. After all edits are made, just click “Submit”, and the app will be deployed on your cluster within a minute or so.

As we’ve mentioned, not all Helm charts expose resource requests and limits. However, you can always add your custom Helm repositories to Supergiant or use native Kubernetes tools like kubectl  as discussed above.

Conclusion

We hope that this article has shed some light on how the Kubernetes resource model actually works. As we’ve seen, the concepts of resource requests and limits are quite simple yet powerful. Used judiciously, they let you control how your Pods consume computing resources in the cluster. Kubernetes gives you the flexibility to define high-priority and low-priority Pods and to distribute limited resources in the most efficient way, ensuring that the most critical applications and containers keep running notwithstanding unexpected traffic spikes and node-level failures.
