Version: 2.18.0

Configure Scale to Zero

Scale to Zero is a scaling capability that lets a CPU-based workload (pod, service, job, or microservice) scale down to zero running instances at a scheduled time. The workload consumes no compute while it is scaled down and resumes only after the scheduled time has passed.

Configure Scale to Zero in the Smart Scaler Agent

Scale to Zero is enabled through the agent configuration, and scale-to-zero schedules are created from the SaaS management console.

After this feature is enabled, the agent scales scheduled deployments down to zero replicas at the scheduled time and scales them back up afterward.

Enable Scale to Zero

  1. Enable HPA auto-apply in the agent's ss-agent-values.yaml file as shown below. This setting is required for the custom metrics APIService that HPAs use.

    eventAutoscaler:
      autoscalerProperties:
        hpaAutoApply:
          enabled: true
  2. Enable Scale to Zero and allow namespaces as shown in the following snippet:

    eventAutoscaler:
      autoscalerProperties:
        hpaAutoApply:
          scaleToZero:
            enabled: true
            allowNamespaces: ["*"] # or list specific namespaces instead:
            # allowNamespaces:
            #   - frontend
            #   - cartservice
            #   - checkoutservice
            excludeNamespaces:
              - kube-system
              - kube-public
              - kube-node-lease
            excludeDeployments:
              - redis-cart
              - frontend
              # Never scale/restore these (controller needs inference-agent for metrics)
              - inference-agent
              - agent-controller-manager
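
Taken together, the two steps might look like the following in a single ss-agent-values.yaml file. This is a minimal sketch, not a verified default: it assumes that scaleToZero nests under hpaAutoApply, as the key order in the snippets suggests, and it keeps only the agent's own deployments in the exclusion list. Check the key hierarchy against the default values shipped with your agent version before applying it.

```yaml
# ss-agent-values.yaml (combined sketch of steps 1 and 2)
# Assumption: scaleToZero is a child of hpaAutoApply; confirm against
# the default values for your agent version.
eventAutoscaler:
  autoscalerProperties:
    hpaAutoApply:
      enabled: true              # step 1: HPA auto-apply
      scaleToZero:
        enabled: true            # step 2: Scale to Zero
        allowNamespaces: ["*"]   # "*" allows every namespace not excluded below
        excludeNamespaces:
          - kube-system
          - kube-public
          - kube-node-lease
        excludeDeployments:
          # The controller needs inference-agent for metrics
          - inference-agent
          - agent-controller-manager
```

Keeping both settings in one block makes it harder to enable Scale to Zero while accidentally dropping the hpaAutoApply `enabled: true` flag from step 1.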

Schedule Scale to Zero on the SaaS Management Console

You must create scale-to-zero schedules on the SaaS management console. For more information, see Manage Scale to Zero Schedules.