Version: 1.1.0

Best Practices

This topic outlines best practices for secure and efficient production-grade operations.

Graceful Node Consolidation with Smart Karpenter on OCI

To ensure reliable and predictable node consolidations when using Smart Karpenter on Oracle Cloud Infrastructure (OCI), review the following recommended practices:

  1. Configure an Appropriate consolidateAfter Value

    The consolidateAfter parameter defines a buffer period before Smart Karpenter initiates consolidation or voluntary disruptions. OCI nodes typically require around 3 minutes to provision, so this buffer allows new nodes to initialize and applications to stabilize first.

    If this parameter is not configured, Smart Karpenter may consolidate nodes too aggressively. To prevent overly aggressive node consolidation:

    1. Confirm whether consolidateAfter is explicitly defined in the NodePool configuration.
    2. Set consolidateAfter to approximately 10 minutes.
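
    The two steps above could look like the following fragment. This is a minimal sketch that assumes Smart Karpenter accepts the upstream Karpenter v1 NodePool schema; the apiVersion, the NodePool name, and the omitted fields are assumptions, so compare it against the NodePool definition in your own cluster.

    ```yaml
    # Sketch only: assumes the upstream Karpenter v1 NodePool schema.
    apiVersion: karpenter.sh/v1      # assumption: your installation may use a different API version
    kind: NodePool
    metadata:
      name: example-nodepool         # hypothetical name
    spec:
      # template (node class reference, requirements) omitted for brevity
      disruption:
        # Buffer before consolidation is considered; roughly 10 minutes,
        # compared with the ~3 minutes an OCI node takes to provision.
        # Other disruption fields are omitted here.
        consolidateAfter: 10m
    ```

    If the NodePool custom resource is installed under its usual name, `kubectl get nodepool example-nodepool -o yaml` is one way to confirm whether consolidateAfter is currently set.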
  2. Use Pod Disruption Budgets (PDBs) and Anti-Affinity Rules

    Implement PDBs and anti-affinity rules for workloads, especially mission-critical or stateful applications. They help maintain availability during rescheduling events initiated by Smart Karpenter or Kubernetes, and they are standard best practice even outside Smart Karpenter environments.
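
    As an illustration, a PDB and a matching anti-affinity rule might be declared as below. This is a sketch using standard Kubernetes APIs; the workload name `web`, the replica count, the image, and the `minAvailable` value are hypothetical and should be adapted to the application.

    ```yaml
    # Hypothetical workload "web" with 3 replicas.
    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: web-pdb
    spec:
      minAvailable: 2                # with 3 replicas, at most one pod may be evicted voluntarily at a time
      selector:
        matchLabels:
          app: web
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          affinity:
            podAntiAffinity:
              # Prefer spreading replicas across nodes so that removing
              # one node does not take down several replicas at once.
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  podAffinityTerm:
                    labelSelector:
                      matchLabels:
                        app: web
                    topologyKey: kubernetes.io/hostname
          containers:
            - name: web
              image: nginx:1.27      # placeholder image
    ```

    The preferred (soft) form of anti-affinity is used here so that pods can still be scheduled when too few nodes are available; the required form is stricter and can leave pods pending.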

  3. Protect Highly Sensitive Workloads from Voluntary Disruption

    For workloads that must never be disrupted during consolidation, add the following annotation to the pod's metadata (for a Deployment, under spec.template.metadata.annotations):

    karpenter.sh/do-not-disrupt: "true"

    This prevents Smart Karpenter from evicting the pod during voluntary disruption workflows, including consolidation operations.
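
    For a workload managed by a Deployment, the annotation belongs on the pod template rather than on the Deployment itself. The following is a minimal sketch; the workload name and image are hypothetical placeholders.

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: critical-worker          # hypothetical name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: critical-worker
      template:
        metadata:
          labels:
            app: critical-worker
          annotations:
            # Applied to each pod created from this template.
            karpenter.sh/do-not-disrupt: "true"
        spec:
          containers:
            - name: worker
              image: registry.k8s.io/pause:3.9   # placeholder image
    ```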

Note

If node consolidation issues continue, share the Smart Karpenter logs, the NodePool definition, and the relevant application manifests with Avesha Support. This information helps the support team diagnose the issue accurately and offer more targeted guidance.