Version: 2.15.0

Inference Agent

Installing the Smart Scaler Agent installs the Event Agent and the Inference Agent. This topic describes the Inference Agent that collects real-time data.

Overview

The Inference Agent is a key component of the Smart Scaler platform. Its main job is to collect real-time data from your systems and send it to the Smart Scaler platform for inference.

Inference is the process by which the platform uses real-time data to make predictive decisions. Here, it calculates the number of pods (running instances of your application) that must be deployed to meet your Service Level Agreement (SLA). The SLA is a commitment between you and your users about the level of service they can expect.

The calculated number of pods serves as the Horizontal Pod Autoscaler (HPA) value for each service. The HPA is a Kubernetes feature that automatically adjusts the number of pods in a workload, such as a Deployment, based on observed metrics such as CPU utilization.
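As a rough illustration of how a reactive, metric-driven HPA arrives at a pod count, the sketch below applies the scaling formula described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name is hypothetical, and the sketch leaves out the tolerance band and the minReplicas/maxReplicas bounds that a real HPA also applies. It does not represent how Smart Scaler computes its recommendation.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Core HPA formula: ceil(currentReplicas * currentMetric / targetMetric).

    Simplified sketch: ignores the HPA tolerance band and the
    minReplicas/maxReplicas limits.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# Example: 4 pods averaging 90% CPU against a 60% target.
print(hpa_desired_replicas(4, 90.0, 60.0))  # 6
```

Because the formula reacts to a metric that has already changed, the HPA scales only after load has arrived. That is the behavior referred to as reactive in this topic.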

With Smart Scaler, you can see in real time what the platform recommends in terms of pod deployment compared to what the HPA is doing. This gives you a clear picture of how the platform is optimizing your resources.

If you want to use Smart Scaler to manage your pod deployment, you need to configure the HPA to scale based on Smart Scaler metrics instead of the CPU or memory metrics it may otherwise be reacting to.
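To make the idea of scaling on a Smart Scaler metric more concrete, the following sketch builds an autoscaling/v2 HorizontalPodAutoscaler manifest that targets an external metric instead of CPU or memory. The metric name, the Deployment name, and the helper function are hypothetical placeholders, and the sketch assumes the recommendation is exposed to the cluster through the Kubernetes external metrics API. Use the names and setup steps from your Smart Scaler installation rather than the values shown here.

```python
import json


def build_hpa_manifest(deployment: str, metric_name: str,
                       min_replicas: int, max_replicas: int) -> dict:
    """Return an autoscaling/v2 HPA manifest driven by an external metric.

    metric_name is a hypothetical placeholder, not a documented
    Smart Scaler metric name.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "External",
                "external": {
                    "metric": {"name": metric_name},
                    # With an average target of 1, the HPA aims for roughly
                    # one pod per unit of the metric, so a metric that reports
                    # a pod count is followed more or less directly.
                    "target": {"type": "AverageValue", "averageValue": "1"},
                },
            }],
        },
    }


# Hypothetical names, for illustration only.
manifest = build_hpa_manifest(
    "checkout", "example_smart_scaler_pod_recommendation", 2, 20)
print(json.dumps(manifest, indent=2))
```

Keeping a copy of the original HPA manifest before making a change like this gives you a known configuration to restore later.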

If for any reason you want to switch back to using reactive HPA (without Smart Scaler), you can do so by reverting your HPA configuration to its previous settings.