Introduction to the Supergiant 2.0.0 Analyze Tool

K8s clusters are complex and dynamic environments where nodes and applications exist in a constant state of flux. Supergiant Capacity and Supergiant Analyze are two components of the Supergiant 2.0.0 toolkit that help K8s administrators manage this complexity.

While Supergiant Capacity enables intelligent auto-scaling of nodes to reduce infrastructure costs, this is just one part of the puzzle.

Administrators who manage multi-user clusters with multiple namespaces and applications installed need a fine-grained and real-time view of the utilized resources, application metrics and performance, and — even more important — actionable insights that help fix issues as they arise. This is exactly what Supergiant Analyze does.

In this blog post, we’ll discuss the architecture of Supergiant Analyze and show how to use the Resource Requests and Limits plugin, which is designed to improve resource optimization in your Kubernetes clusters.

What Is Supergiant Analyze?

Supergiant Analyze may be compared to artificial intelligence or a smart advisor that helps Kubernetes administrators identify problems in their cluster in real time and fix them based on Analyze recommendations.

The tool collects metrics, checks configuration, and suggests actions that improve the health, performance, and optimization of clusters and applications. These tasks are performed by a “virtual” team of “workers”: Analyze plugins, each responsible for a different type of issue or scenario. The Analyze Control Plane periodically invokes each plugin and stores the results of each plugin’s checks and analysis of the K8s cluster and its hosted apps in the etcd key-value store.

Supergiant Analyze Architecture

How Does Analyze Work?

Analyze may be defined as a Service that interacts with each plugin using a well-defined API based on gRPC protobufs. In general, the Analyze Service:

  • Provides all needed configuration to plugins. Such information may include cloud credentials and access tokens for communicating with the cluster or other cluster-specific details.
  • Schedules periodic checks of pods, configurations, or any custom metrics provided by a given plugin. This task is implemented in the Scheduler (see the image above), which periodically invokes each plugin’s API; the plugin in turn checks the cluster and/or its components. The check interval is defined by the individual plugin’s configuration.
  • Updates plugin statuses and aggregates plugin notifications for users.
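
To make the scheduling loop described above more concrete, here is a minimal Python sketch. It is not the actual Analyze implementation: the real service talks to plugins over gRPC, while this sketch uses plain function calls and a simulated clock. The `Plugin` and `Scheduler` classes, the plugin names, and the intervals are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Plugin:
    """Hypothetical stand-in for an Analyze plugin (real plugins speak gRPC)."""
    name: str
    interval_s: int                          # check interval from the plugin's config
    check: Callable[[Dict[str, str]], str]   # receives config, returns a status
    next_run: int = 0                        # simulated time of the next check


@dataclass
class Scheduler:
    config: Dict[str, str]                   # e.g. credentials handed to plugins
    plugins: List[Plugin] = field(default_factory=list)
    statuses: Dict[str, str] = field(default_factory=dict)

    def tick(self, now: int) -> List[str]:
        """Run every plugin whose interval has elapsed; return the names that ran."""
        ran = []
        for plugin in self.plugins:
            if now >= plugin.next_run:
                # Keep the latest result per plugin (Analyze stores these in etcd).
                self.statuses[plugin.name] = plugin.check(self.config)
                plugin.next_run = now + plugin.interval_s
                ran.append(plugin.name)
        return ran


scheduler = Scheduler(
    config={"kubeconfig": "/path/to/kubeconfig"},  # placeholder value
    plugins=[
        Plugin("requests-limits", interval_s=60, check=lambda cfg: "red"),
        Plugin("underutilized-nodes", interval_s=120, check=lambda cfg: "green"),
    ],
)

for now in (0, 60, 120):  # simulated clock, in seconds
    print(now, scheduler.tick(now))
```

Each plugin carries its own interval, which mirrors the point that the check frequency comes from the plugin’s configuration rather than from the service.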

In turn, each plugin analyzes metrics and informs the user about the state of a given metric, object, or component. Depending on the individual plugin’s algorithm, each state can be:

  • Green: No user action is required.
  • Yellow: Some actions are required.
  • Red: Urgent actions are required.

Each plugin determines if the check falls within any of these categories based on its specific requirements.

After the check is done, the plugin suggests a set of actions for the user to improve the state of the cluster or application(s). Users can either approve an action or dismiss it. If the action is approved, the plugin interacts with the Kubernetes API or uses other means to modify the cluster state. This can involve removing a node, rescheduling applications to another node, or changing Pod configuration.

Each plugin is an autonomous application that can be integrated with different external environments. For example, in the near future, Analyze will allow plugin integrations with:

  1. Cloud provider API to analyze and modify virtualized cloud infrastructure.
  2. Kubernetes clusters to get information about cluster state and resources/applications and modify the cluster state.
  3. Sibling Supergiant tools such as Supergiant Capacity (horizontal integration).
  4. Any service that is accessible from inside a Kubernetes cluster and secure for integration.

Let’s get a feel for how Analyze and its plugins work by looking at its UI.

Tutorial: Using Supergiant Analyze Plugins

To complete the examples in this tutorial, you’ll need:

  • A running Supergiant 2.0.0 Control tool. See our SG Control installation guide to find out more.
  • An AWS or DigitalOcean cloud account with access to security credentials.
  • Your AWS or DigitalOcean account linked to Supergiant, following the steps in the Linking a Cloud Account tutorial.
  • A Kubernetes cluster deployed with Supergiant Control. Read our tutorial to learn how to do this.
  • Supergiant Analyze deployed using Supergiant Control. See the installation guide here.

At the moment, the Supergiant Analyze UI has two main pages: Home and Plugins.

On the Home page, you can see the list of plugin checks, each stamped with the date when it was run (see the image below).

Supergiant Analyze: Plugin Checks

On the Plugins page, you can see the list of installed plugins. By default, Supergiant Analyze ships with two plugins developed by our team: the “Underutilized nodes sunsetting” plugin and the “Resources (CPU/RAM) requests and limits” plugin (or simply the “Requests/Limits plugin”). We’ll discuss the latter in a moment.

Supergiant Analyze: Installed Plugins

Let’s go back to the Home page and look at a plugin notification. In the image below, you can see a single notification from the Requests/Limits plugin.

Supergiant Analyze: single plugin notification

Each notification has the following elements:

  • Name of the plugin that ran a check
  • Plugin’s check status (e.g., green, yellow, or red) with a date of the last check. In the image above, the plugin’s check status is red, which indicates some problems.
  • Plugin check’s details. You can expand the check’s details by clicking on the “Details” tab.
  • Plugin actions. Users can either dismiss notifications or approve the recommended actions.

Let’s discuss these features using a real example of the Requests and Limits plugin.

Resource Requests and Limits Plugin: Overview

As you may remember from our previous tutorial, “Assigning Computing Resources to Containers and Pod in Kubernetes,” Kubernetes resource requests and limits are important tools for optimizing resource utilization in your Kubernetes cluster. In short, requests ensure that containers get the minimum amount of resources they need and are scheduled on nodes that have these resources, while limits cap the amount of resources (RAM and CPU) that a container can use. Setting a limit above the request allows your applications to burst when traffic grows, for instance.

The Requests/Limits plugin checks whether resource requests and limits are properly configured for containers deployed in the cluster managed by Supergiant. Based on these findings, the plugin suggests actions to take. The plugin can improve resource utilization in your cluster by aligning the requests/limits configuration with the best practices of the Kubernetes resource model.

Here’s how the Requests/Limits plugin works. For each node in the cluster and for each Pod running on these nodes, the plugin checks container requests and limits and compiles a detailed table describing the status of the requests/limits configuration for each container (see the image below). You can see this table by expanding the “Details” section inside the plugin notification.
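
The table-building step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin’s code: it works on plain dicts shaped like Kubernetes Pod objects (`spec.nodeName`, `spec.containers[].resources`) instead of calling the API, and the node name, images, the `grafana` container name, and the resource values are made up. Only the Pod names and the `alertmanager` container come from this article.

```python
from typing import Dict, List

NOT_SET = "is not set"  # the text the details table shows for a missing value


def build_rows(pods: List[dict]) -> List[Dict[str, str]]:
    """Flatten Pod objects into one table row per container."""
    rows = []
    for pod in pods:
        for container in pod["spec"]["containers"]:
            resources = container.get("resources") or {}
            requests = resources.get("requests") or {}
            limits = resources.get("limits") or {}
            rows.append({
                "node": pod["spec"]["nodeName"],
                "pod": pod["metadata"]["name"],
                "container": container["name"],
                "image": container["image"],
                "cpu_request": requests.get("cpu", NOT_SET),
                "ram_request": requests.get("memory", NOT_SET),
                "cpu_limit": limits.get("cpu", NOT_SET),
                "ram_limit": limits.get("memory", NOT_SET),
            })
    return rows


# Hypothetical sample data; only the Pod names are taken from the article.
pods = [
    {
        "metadata": {"name": "alertmanager-prometheus-operator-alertmanager-0"},
        "spec": {
            "nodeName": "node-1",
            "containers": [{
                "name": "alertmanager",
                "image": "example/alertmanager:latest",
                "resources": {"requests": {"memory": "200Mi"}},
            }],
        },
    },
    {
        "metadata": {"name": "prometheus-operator-grafana-7654f69d89-mhhkg"},
        "spec": {
            "nodeName": "node-1",
            "containers": [{"name": "grafana", "image": "example/grafana:latest"}],
        },
    },
]

rows = build_rows(pods)
for row in rows:
    print(row["pod"], row["container"], row["cpu_request"], row["ram_request"])
```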

Supergiant Analyze: Request/Limits Details

As you can see in the table above, the plugin analyzed containers in two Pods residing on a node in our cluster: alertmanager-prometheus-operator-alertmanager-0 and prometheus-operator-grafana-7654f69d89-mhhkg. For each container in the Pod, the plugin displays the container name, container image, and requests/limits configuration for both RAM and CPU.

For example, let’s take a look at the requests and limits check for the alertmanager container running in the first Pod. As you can see, the container has a properly configured RAM request, but the CPU request is not set. The plugin treats this case as a major error, indicated by the red highlighting of the text “is not set.” In contrast, if only limits are not set, the plugin assigns the yellow status to the container. That’s because setting requests is considered more important than setting limits: without a request, the scheduler cannot account for the container’s needs when it places the Pod on a node.
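
The per-container rule described above fits in a short function. The sketch below assumes the container’s `resources` stanza is available as a plain dict; the function name `container_status` and the sample value are hypothetical and not part of the plugin’s API.

```python
REQUIRED = ("cpu", "memory")  # the two resources the plugin checks


def container_status(resources: dict) -> str:
    """Return "red", "yellow", or "green" for one container's resources stanza."""
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    if any(name not in requests for name in REQUIRED):
        return "red"      # a missing request is the more serious problem
    if any(name not in limits for name in REQUIRED):
        return "yellow"   # requests are fine, but at least one limit is missing
    return "green"


# RAM request set, CPU request missing, as for the alertmanager container.
alertmanager = {"requests": {"memory": "200Mi"}}  # the value is hypothetical
print(container_status(alertmanager))
```

Requests are tested before limits, so a container that lacks both is reported as red rather than yellow.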

As you see in the image above, the general status of the plugin check is red. How does the Requests/Limits plugin decide on what general status to assign to the full check? In general, the following rules apply.

  • If at least one container in the entire cluster has its requests undefined, the plugin sets the red status for the check.
  • If all requests are properly configured for each container in the cluster and if at least one container has its limits undefined, the overall status of the check is yellow.
  • If all containers have their requests and limits set for both CPU and RAM, the check gets the green status.
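
These three rules amount to taking the most severe status found in the cluster. A minimal Python sketch, assuming each container has already been labeled red (a request is missing), yellow (only a limit is missing), or green; the `SEVERITY` mapping, the function name, and the choice to treat an empty cluster as green are assumptions made for illustration.

```python
from typing import Iterable

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def overall_status(container_statuses: Iterable[str]) -> str:
    """Return the most severe per-container status across the whole cluster."""
    # An empty cluster is treated as green here; that default is an assumption.
    return max(container_statuses, key=SEVERITY.__getitem__, default="green")


print(overall_status(["green", "yellow", "red"]))
print(overall_status(["green", "yellow"]))
print(overall_status(["green", "green"]))
```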

Plugin Action

Users can choose to dismiss or approve the actions suggested by the plugin. If you want to dismiss the notification, select the “Dismiss Notification” tab and click “Run.” Analyze will delete this plugin notification from the notifications list.

In contrast, if you want to approve the actions suggested by the plugin, select the “Set missing requests/limits” tab and click “Run.” The plugin will set missing requests/limits automatically or apply custom requests/limits that the user specifies for each container.
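
The article does not describe how the plugin chooses the values it sets, so the following Python sketch only illustrates the general idea: fill in whatever is missing, prefer user-supplied values over defaults, and never overwrite a value that is already configured. The `DEFAULTS` values and the `fill_missing` function are hypothetical.

```python
import copy
from typing import Optional

# Hypothetical fallback values; the plugin's real defaults are not documented here.
DEFAULTS = {
    "requests": {"cpu": "100m", "memory": "128Mi"},
    "limits": {"cpu": "200m", "memory": "256Mi"},
}


def fill_missing(resources: dict, overrides: Optional[dict] = None) -> dict:
    """Return a copy of a resources stanza with missing requests/limits filled in."""
    result = copy.deepcopy(resources)
    for section in ("requests", "limits"):
        # User-supplied values take precedence over the defaults.
        chosen = {**DEFAULTS[section], **(overrides or {}).get(section, {})}
        target = result.setdefault(section, {})
        for name, value in chosen.items():
            target.setdefault(name, value)  # never overwrite an existing value
    return result


before = {"requests": {"memory": "200Mi"}}  # CPU request and both limits missing
after = fill_missing(before, overrides={"requests": {"cpu": "250m"}})
print(after)
```

Working on a copy keeps the original stanza intact, which makes it easy to show the user a before/after comparison ahead of approval.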

Supergiant Analyze: Plugin Notification Approve

Conclusion

In this article, we introduced you to Supergiant Analyze, a tool in the Supergiant 2.0.0 toolkit that runs smart cluster checks and makes recommendations to improve the efficiency of resource utilization in your cluster and its components. At the moment, Analyze ships with two built-in plugins, but the Supergiant team is working on making Analyze pluggable and extendable. In the near future, users will be able to create their own plugins following a set of Supergiant plugin interface standards. Supergiant users will be able to develop plugins that cover the following use cases:

  • Application-level plugins for interacting with specific applications. Example: Nginx server plugin for managing logs, monitoring open connections, ports, etc.
  • Cluster-level plugins
  • Resource utilization plugins

What’s Next?

Learn more about the Supergiant toolkit using the following resources:

Getting started with the Supergiant Toolkit

Supergiant Control on GitHub

Supergiant Capacity on GitHub

Supergiant Analyze on GitHub

Supergiant 2.0.0 Documentation