Container technology and container orchestration are revolutionizing the deployment and management of applications at scale in multi-node distributed environments. Since Google open sourced Kubernetes in 2014, a number of reputable tech companies have moved their container workloads to the platform, contributing to its growing popularity and recognition in the community.
In 2018 we see a broad consensus that containers powered by container orchestration frameworks make application CI/CD, deployment, and management at scale much more efficient and productive.
However, the real benefits of containerization are often hidden beneath a layer of complex terminology known only to a few experts in the field.
In this article, we are going to educate business leaders and IT managers about the actual cost-saving potential of container technology and lift the veil on the complexity of container orchestration. The article is organized as follows. In the first part, we explain container architecture and the advantages of containers over virtual machines. In the second part, we focus on the economic benefits of container orchestration and present case studies of two IT companies that have successfully adopted container technologies and reaped their benefits. Let’s get started!
Linux containers are a technology for packaging and isolating applications with their entire runtime environment (e.g., binaries, files, dependencies). This makes it easy to move containerized applications between various environments while retaining full functionality. Sounds familiar? Don’t virtual machines (VMs) offer the same functionality? The answer is “yes” and “no.”
To make a long story short, container runtimes were developed as an alternative to immutable virtual machine (VM) images. VM images are heavier (i.e., consume more resources) than containers because they require a full OS to operate. Because of that, VMs are slower to start up, and just a few of them can occupy an entire server. In contrast, containers do not require a full OS packaged into them to work. Since they use OS-level virtualization as opposed to hardware-level virtualization in VMs, multiple containers can share a single host OS.
Containers typically include a small snapshot of the host filesystem and the dependencies they need. When that is not enough, containers can request additional resources and services from the host OS. Thanks to this self-contained design and flexibility, containers disentangle applications from the underlying infrastructure and isolate them from the host environment, thereby making them portable and environment-agnostic.
But how does this translate into cost saving? That’s not so difficult to understand. Let’s first look at the image below to get an idea of how VMs and containers differ.
As you can see, a server hosting 3 applications in 3 virtual machines under the standard VM approach would require three copies of the guest OS running on the server. To clarify, a guest OS is a virtual OS managed by the hypervisor, which translates guest OS instructions into the host OS’ system calls. In most cases, a guest OS is different from the host OS on which it runs. Virtualizing an OS typically requires a lot of resources. This means that the VMs running our three apps will be very heavy and will consume a lot of memory, disk, and CPU resources.
How is the container approach different? In a container world, all 3 applications could be managed by the container engine (e.g., Docker) and share one single version of the host OS.
Now you get an idea of the basic advantage here: with containers, more applications can run on the same hardware because we avoid duplication of heavy OS images. So instead of requiring multiple hosts to deploy your apps, you can use a single host. Sounds like an immense cost saving? It really is!
Note: There is a widespread misconception that Linux containers are, in essence, mini versions of VMs. Indeed, we might erroneously assume that when we look inside a Linux container: there we find the familiar filesystem structure, devices, and software used in any Linux distribution. However, the contents of the container’s filesystem and its runtime environment are not a full OS but a small representation of the target OS needed for the container to work. The kernel and underlying resources are still provided by the host OS, whereas the system devices and software are provided by the image. A host Linux OS is, therefore, able to run a container even when the container appears to be an entirely different Linux distribution.
It is important to understand that both containers and VMs have their unique place in the IT world. The arrival of containers does not mean that VMs became obsolete and that we don’t need them anymore. There are a number of scenarios in which you should still consider using VMs. For example:
Users should also remember that containers are not flawless. The most widely cited problem with containers is security. Contrary to popular misconception, containers are not fully self-contained: a container shares the host’s OS kernel and may gain access to devices and system files. If a container runs with superuser privileges, host security might be compromised, which means you can’t safely run arbitrary container applications on your system as root. Kubernetes has recently done a very good job of providing security tools to users (SELinux, AppArmor, highly configurable policies and networking options, etc.), but they aren’t set up by default, and it takes time and training to configure them properly.
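As a sketch of what this hardening can look like in practice, a Kubernetes Pod spec can refuse to run as root and drop Linux capabilities. The names and image below are illustrative, not a recommendation for any specific workload:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app           # illustrative name
spec:
  containers:
    - name: app
      image: example/app:1.0   # hypothetical image
      securityContext:
        runAsNonRoot: true             # refuse to start if the image runs as root
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true   # block writes to the container filesystem
        capabilities:
          drop: ["ALL"]                # drop all Linux capabilities
```

None of these settings are applied by default; each one must be added deliberately per workload.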
All things considered, containers have certain limitations, but all of them are ultimately solvable. In what follows, we continue with our analysis of container cost-saving benefits that immediately appear when the caveats discussed in this section are addressed.
You can leverage various container design patterns to regulate how many resources your container requires and consumes. There are three basic patterns to choose from: “scratch” containers, container OS, and full OS containers.
Popular container runtimes (like Docker) allow creating very lightweight, fast-booting containers known as “scratch” containers. They are based on a “scratch environment” that includes only minimal resources and dependencies. By default, scratch containers have no shell, SSH, or higher-level OS functions, which means they are somewhat limited. However, if you need just a minimal environment to perform some tasks without full access to the host OS, a scratch container is the way to go. Scratch containers can dramatically reduce your infrastructure costs.
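A minimal sketch of the pattern as a multi-stage Dockerfile, assuming a statically linked Go binary (the module and names are illustrative):

```dockerfile
# Build stage: compile a statically linked binary (illustrative)
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Final stage: the "scratch" image contains nothing but the binary
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

The resulting image is typically only a few megabytes, since it carries no shell, package manager, or base distribution at all.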
If you want more exposure to the host OS, you can opt for a container OS. These container images provide a package manager for installing dependencies. A good example of a container OS is the Alpine image available in the official Docker Hub repository. These images are also very small: usually no more than 5-8 megabytes (the Docker Alpine image is only about 5 MB). However, since you can install dependencies, the size of a container OS can grow quickly if it’s not kept in check.
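A sketch of the container OS pattern as a Dockerfile (the installed package and script name are illustrative):

```dockerfile
# Container OS pattern: small Alpine base plus a package manager
FROM alpine:3.19
# --no-cache avoids storing the apk index, keeping the image small;
# every added package inflates the image, so install only what you need
RUN apk add --no-cache curl
COPY app.sh /usr/local/bin/app.sh
ENTRYPOINT ["/usr/local/bin/app.sh"]
```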
Finally, major container systems like Docker allow creating containers with a full OS. For instance, you can create a container based on Ubuntu or any other Linux distribution. Full OS containers will be significantly larger than container OS and “scratch” containers, so they are less desirable if you want to save on infrastructure resources.
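For comparison, a full OS container starts from a complete distribution userland (the package and script here are illustrative):

```dockerfile
# Full OS pattern: a complete Ubuntu userland, far larger than Alpine or scratch
FROM ubuntu:22.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/*   # trim the apt cache to limit image size
COPY app.py /opt/app.py
CMD ["python3", "/opt/app.py"]
```

Even before installing anything, the Ubuntu base image is tens of megabytes, an order of magnitude larger than the Alpine or scratch variants above.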
Leveraging these three container design patterns, you can create a perfect mix of containers in your deployment, minimizing unnecessary infrastructure costs.
At this point, you know how container architecture secures more efficient utilization of resources than VMs. However, containerization cost-saving benefits do not end there. Let’s list the most important of them:
Containers are great, but they are just “units of deployment” that must be efficiently managed if you run them at scale. When you run multiple containers distributed across multiple hosts, manual updates and scaling become error-prone and non-trivial tasks. Without automation, companies using containers at scale run the risk of longer downtimes, slower update cycles, and a growing gap between development, test, and production environments. Aware of these risks, medium and large companies alike are incorporating container orchestration into their application management. Kubernetes (or K8s) is widely regarded as one of the best container management platforms on the market.
Kubernetes is an open source platform for the deployment and management of containerized applications at scale. It automates the deployment, scaling, scheduling, updating, and networking of containerized applications. The platform simplifies grouping multiple hosts running containerized applications into a homogeneous cluster managed by the orchestration engine. Since 2014, when Google open sourced Kubernetes, a number of companies and developers have contributed to the project, building dozens of integrations with popular cloud providers, storage systems, networking infrastructures, and more. Kubernetes is supported by a growing ecosystem and community and is currently the most popular container orchestration tool around.
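To illustrate this automation, a single Kubernetes Deployment manifest declares the replica count, the update strategy, and the container image in one place; Kubernetes then keeps the cluster converged on that state. The names and image below are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app              # illustrative name
spec:
  replicas: 3                # Kubernetes keeps three copies running across the cluster
  selector:
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate      # replace Pods gradually so updates cause no downtime
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: example/web-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```

If a node dies or a Pod crashes, the scheduler automatically recreates the missing replicas elsewhere in the cluster; no manual intervention is needed.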
Kubernetes is a mature platform that ships with all the features needed to run containers in public, private, hybrid, and multi-cloud environments as well as on-premises: networking, support for stateful apps and various storage systems, DNS, service discovery, microservices support, and more. If you follow best practices, you can expect Kubernetes to become a significant cost-savings component of your business. We compiled a list of cost benefits that can be achieved with Kubernetes:
A number of companies appreciate the immense cost benefits of using Kubernetes for their containerized workloads. Let’s briefly discuss two case studies of companies that benefited from moving their workloads to containers and Kubernetes: Qbox and Pinterest.
Qbox Inc. provides a hosted Elasticsearch service that simplifies the deployment and management of Elasticsearch clusters on major cloud providers (e.g., AWS). Initially, the company was single-tenant, with each ES node being its own dedicated machine on AWS (which itself is a VM). This approach was based on hand-picking certain instance types optimized for Elasticsearch and leaving it up to users to configure single-tenant, multi-node clusters running on isolated VMs in any region. Qbox added a markup on the per-compute-hour price for DevOps support and monitoring. However, Qbox’s AWS bills quickly got out of hand as the company grew to thousands of clusters. In addition, support began spending most of its time replacing dead nodes and answering support tickets. To make things worse, clusters were allocated far more resources than they actually used: Qbox had thousands of servers with a collective CPU utilization under 5%. VMs turned out to be extremely inefficient when deployed at scale.
Facing the problem of inefficient resource usage, squeezing profit margins, and fierce competition from cloud-hosted Elasticsearch providers (Google and AWS), Qbox decided to adopt the container-first approach based on Kubernetes, Docker, and Supergiant, the tool developed by the company to manage its Kubernetes deployments. The transition to a containerized architecture was worth the effort. Performance improvement came almost immediately. With Kubernetes and Supergiant, Qbox could “pack” more applications on a single host, which translated to more efficient use of its infrastructure and reduction of cloud costs.
To provide granular control over resource sharing while avoiding the problem of “noisy neighbors,” Qbox also took advantage of Kubernetes requests and limits. By setting container-specific requests and limits, Qbox achieved more fine-grained control over resource utilization in its clusters and successfully moved to more practical, performant, and cost-effective multi-tenancy. Thus, Kubernetes solved both the utilization problem and the noisy neighbor problem. Although Qbox is multi-tenant, everyone gets what they pay for without interference from other users (and they may even get more, because Qbox sets only CPU, not RAM, limits, so tenants can use more than they paid for if no one else is on the server).
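Requests and limits are set per container in the Pod spec. Here is a sketch of the mechanism; the values are illustrative, not Qbox’s actual settings:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: es-node              # illustrative name
spec:
  containers:
    - name: elasticsearch
      image: elasticsearch:7.17.0
      resources:
        requests:
          cpu: "500m"        # guaranteed share; the scheduler uses this for placement
          memory: "2Gi"
        limits:
          cpu: "2"           # hard cap; usage above it is throttled, curbing noisy neighbors
          # no memory limit set here, mirroring the CPU-only limits described above
```

The scheduler packs Pods onto nodes based on requests, while limits cap what any one tenant can consume at runtime.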
Qbox’s transition to Kubernetes gave birth to the Supergiant Kubernetes-as-a-Service platform, originally used by the company to simplify the deployment of containers on Kubernetes. As a major component of Supergiant, Qbox developed a cost-reduction packing algorithm that efficiently packs containers onto nodes, avoiding under-utilization of resources, and spins up new nodes or removes old ones depending on the load. Using Supergiant resulted in an immediate 25% drop in Qbox’s infrastructure footprint. Overall, the company saved 50% (about $600k per year).
Pinterest is a web application that operates a system designed to discover and share information on the web, mostly using images, GIFs, and videos. Pinterest had 200 million monthly active users in September 2017.
The challenge Pinterest faced in 2015 was managing over 1,000 microservices, multiple layers of infrastructure, and diverse setup tools. Back then, the company’s deployment process looked as follows. It had one base Amazon Machine Image (AMI) with an OS, common shared packages, and installed tools. For some services, mostly large and complex ones, the company also had a service-specific AMI built on the base AMI and containing all of the service’s dependency packages. In addition, the company used two deployment tools: Puppet for provisioning cron jobs and infra components, and Teletraan for deploying production service code and some ML models.
Using this architecture at scale resulted in several hard challenges:
In response to these challenges, in early 2016, Pinterest decided to move its microservices to Docker containers and chose Kubernetes as the orchestration system. The immediate impact of this decision was:
As the evidence demonstrates, containers and Kubernetes have an immense cost-savings potential appreciated by a number of major companies. In particular, containers can significantly decrease infrastructure costs because they are more lightweight than VMs and can share a single OS. Other benefits of containers include faster CI/CD pipelines, better coordination between development and engineering teams, and low maintenance costs. If you add Kubernetes to the equation, you can save even more with autoscaling, efficient cluster-level resource management, rolling updates, and efficient application scheduling. Thanks to containerized applications and container orchestration, you can expect double-digit cost savings and more productive use of your development, operations, and support teams’ time.
Read some articles that introduce the key concepts of Kubernetes:
Find out more about Supergiant: