Version: 2.17.0

Configure High Availability for the Inference Agent

Smart Scaler Agent versions 2.9.40 and later provide High Availability (HA) through priority-based custom metrics coordination. This coordination ensures that the Horizontal Pod Autoscaler (HPA) always queries the pod delivering the highest-quality data.

Key Benefits

Removes any single point of failure in the system
Ensures the HPA consistently receives the highest-quality available metrics
Maintains service continuity with graceful degradation during partial outages
Automatically fails over to the healthiest data source based on data quality

Data Priority in Custom Metrics Coordination

Smart Scaler Agent versions 2.9.40 and later rank data sources in the following order, from highest to lowest priority, when coordinating custom metrics.

Priority Level	Description
SaaS-connected (highest priority)	Real-time recommendations from SaaS
Dynamic-fallback	Calculated by the HPA algorithm using real-time metrics
Static-fallback (lowest priority)	Uses fixed fallback values
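
The priority order above can be illustrated with a minimal sketch. This is not the agent's actual implementation: the Replica class, the pod names, the mode strings, and the pick_source function are all hypothetical and exist only to show how a highest-priority healthy source might be chosen among several replicas.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower number means higher priority, mirroring the table above.
# The mode strings are illustrative labels, not real agent settings.
PRIORITY = {
    "saas-connected": 0,
    "dynamic-fallback": 1,
    "static-fallback": 2,
}


@dataclass
class Replica:
    name: str             # hypothetical pod name
    mode: str             # one of the keys in PRIORITY
    healthy: bool = True  # whether the replica can currently serve metrics


def pick_source(replicas: List[Replica]) -> Optional[Replica]:
    """Return the healthy replica with the highest-priority data, or None."""
    candidates = [r for r in replicas if r.healthy and r.mode in PRIORITY]
    if not candidates:
        return None
    return min(candidates, key=lambda r: PRIORITY[r.mode])


replicas = [
    Replica("agent-0", "static-fallback"),
    Replica("agent-1", "saas-connected"),
    Replica("agent-2", "dynamic-fallback"),
]
best = pick_source(replicas)
print(best.name if best else "no healthy replica")
```

In this sketch, marking agent-1 as unhealthy makes the selection degrade to agent-2 (dynamic-fallback), which mirrors the graceful degradation described under Key Benefits.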

Configure HA

To use HA, configure multiple replicas in your ss-agent-values.yaml file. For production environments, we recommend configuring at least two replicas.

By default, the deployed agent runs a single replica.

Enable HA for the Smart Scaler Agent by adding the following configuration to the ss-agent-values.yaml file:

inferenceAgent:
  replicas: 2  # Enable HA with 2 replicas
