Configure Scale to Zero
Scale to Zero is a scaling capability that allows a CPU-based workload (pod, service, job, or microservice) to scale down to zero running instances at a scheduled time. The workload consumes no compute while it is scaled to zero and resumes only after the scheduled period ends.
Configure Scale to Zero in the Smart Scaler Agent
Scale to Zero is enabled through the agent configuration, and scale-to-zero schedules are created from the SaaS management console.
After this feature is enabled, the agent scales scheduled deployments down to zero replicas at the scheduled time and scales them back up afterward.
Enable Scale to Zero
- Enable HPA auto-apply in the agent's ss-agent-values.yaml file. It is required for the custom metrics APIService used by HPAs:
  eventAutoscaler:
    autoscalerProperties:
      hpaAutoApply:
        enabled: true
- Enable Scale to Zero and allow namespaces as shown in the following snippet:
  eventAutoscaler:
    autoscalerProperties:
      hpaAutoApply:
        scaleToZero:
          enabled: true
          allowNamespaces: ["*"]
          # Or list specific namespaces instead of the wildcard:
          # allowNamespaces:
          #   - frontend
          #   - cartservice
          #   - checkoutservice
          excludeNamespaces:
            - kube-system
            - kube-public
            - kube-node-lease
          excludeDeployments:
            - redis-cart
            - frontend
            # Never scale/restore these (controller needs inference-agent for metrics)
            - inference-agent
            - agent-controller-manager
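Both steps edit the same eventAutoscaler.autoscalerProperties.hpaAutoApply block, so in practice they end up merged in a single ss-agent-values.yaml file. The following is a minimal sketch of that merged result. It assumes that scaleToZero nests under hpaAutoApply, as the key order in the second step suggests; check the nesting against your agent chart's default values before applying it.

```yaml
# ss-agent-values.yaml (merged sketch; the nesting of scaleToZero is an assumption)
eventAutoscaler:
  autoscalerProperties:
    hpaAutoApply:
      enabled: true              # Step 1: required for the custom metrics APIService
      scaleToZero:               # Step 2: assumed to sit under hpaAutoApply
        enabled: true
        allowNamespaces: ["*"]   # wildcard; replace with a list to restrict scope
        excludeNamespaces:
          - kube-system
          - kube-public
          - kube-node-lease
        excludeDeployments:
          - redis-cart
          - frontend
          # Never scale/restore these (controller needs inference-agent for metrics)
          - inference-agent
          - agent-controller-manager
```

Keeping enabled: true and scaleToZero in one block avoids a second hpaAutoApply key overriding the first when the file is parsed.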
Schedule Scale to Zero on the SaaS Management Console
You must create scale-to-zero schedules on the SaaS management console. For more information, see Manage Scale to Zero Schedules.