Version: 1.15.0

Install Using Helm Charts

This topic describes how to manually install EGS using Helm charts.

note
  • The EGS Controller is also referred to as the KubeSlice Controller in some diagrams and in the YAML files.
  • The EGS Admin Portal is also referred to as the KubeSlice Manager (UI) in some diagrams and in the YAML files.

Prerequisites

Before you begin the installation, ensure that you have completed the following prerequisites:

  • Have access to the Kubernetes cluster where you will install EGS and have the necessary permissions to create namespaces, deploy applications, and manage resources.
  • Installed prerequisites for the EGS controller. For more information, see Install EGS Controller Prerequisites.
  • Installed prerequisites for the worker cluster. For more information, see Install EGS Worker Prerequisites.
  • Applied a valid EGS license received from Avesha. For more information, see EGS Registration.
  • Have the required command-line tools installed, including kubectl and Helm. For more information, see Install Command Line Tools.

Add and Update the Helm Repository

Use the following command to add and update the Helm repository:

helm repo add kubeslice-egs-helm-ent-prod https://kubeslice.aveshalabs.io/repository/kubeslice-egs-helm-ent-prod/
helm repo update
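
To confirm that the repository was added and its charts are visible, you can search the repository (chart versions in the output vary by release):

helm search repo kubeslice-egs-helm-ent-prod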

Create a Namespace

To create a kubeslice-controller namespace on the controller cluster, use the following command:

kubectl create namespace kubeslice-controller
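
If the namespace might already exist, for example during a re-install, you can make the command idempotent:

kubectl create namespace kubeslice-controller --dry-run=client -o yaml | kubectl apply -f -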

Apply the License Secret File

To apply the license secret file, use the following command:

kubectl apply -f <license-secret-file> -n kubeslice-controller

After the license is applied, it is stored securely in EGS and can be managed through the CLI. The license secret is stored in the kubeslice-controller namespace and can be viewed using:

kubectl get secret <license-secret-name> -n kubeslice-controller -o yaml
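
To confirm that the license secret exists without printing its values, you can describe it; replace <license-secret-name> with the name of your secret:

kubectl describe secret <license-secret-name> -n kubeslice-controller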

Install the EGS Controller

Get the PostgreSQL secret details, the Kubernetes control plane endpoint, and the Prometheus endpoint URL. You need these details to configure the values-controller.yaml file.

  1. Get the PostgreSQL secret details:

    info

    The PostgreSQL secret is created in the kubeslice-controller namespace as part of the prerequisites installation. Use this secret, not the one in the kt-postgresql namespace. The secret name is kubetally-db-credentials. The values are base64-encoded and must be decoded, as in the following example.

    Example

    kubectl get secret kubetally-db-credentials -n kubeslice-controller -o json | jq -r '.data | to_entries[] | "\(.key)=\(.value|@base64d)"'
  2. Get the Kubernetes control plane endpoint:

    kubectl cluster-info

    Example Output

    Kubernetes control plane is running at https://pu.mk8scluster-e00w111mv8rn8em35z.mk8s.eu-north1.nebius.cloud:443
    CoreDNS is running at https://pu.mk8scluster-e00w111mv8rn8em35z.mk8s.eu-north1.nebius.cloud:443/api/v1/namespaces/kube-system/services/coredns:udp-53/proxy
  3. Get the Prometheus endpoint URL:

    kubectl get svc -n egs-monitoring

    Example Output

    NAME                                    TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)         AGE
    prometheus-operated                     ClusterIP      None            <none>           9090/TCP        10m
    prometheus-kube-prometheus-prometheus   LoadBalancer   10.43.240.123   129.1XX.116.71   443:32000/TCP   10m

    In this example, the PrometheusUrl value is "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090", which follows the http://<service-name>.<namespace>.svc.cluster.local:<port> pattern. You can verify that the endpoint is reachable using the connectivity check shown after this procedure.

  4. Create a values-controller.yaml file with the following properties:

    global:
      imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems
      # You can configure labels or annotations that the EGS Controller namespaces should have
      namespaceConfig:
        labels: {}
        annotations: {}
      kubeTally:
        enabled: false # Requires PostgreSQL to be available. Set to true if you want to enable KubeTally.
        postgresSecretName: POSTGRES_SECRET_NAME
        existingSecret: false # Set to true if the secret is pre-created externally
        postgresAddr: POSTGRES_ADDR
        postgresPort: POSTGRES_PORT
        postgresUser: POSTGRES_USER
        postgresPassword: POSTGRES_PASSWORD
        postgresDB: POSTGRES_DB
        postgresSslmode: POSTGRES_SSLMODE
        prometheusUrl: PROMETHEUS_ENDPOINT # Prometheus endpoint URL
    kubeslice:
      controller:
        endpoint: CONTROLLER_ENDPOINT # Kubernetes control plane endpoint from kubectl cluster-info

    The following is an example of the values-controller.yaml file:

    global:
      imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems
      # You can configure labels or annotations that the EGS Controller namespaces should have
      namespaceConfig:
        labels: {}
        annotations: {}
      kubeTally:
        enabled: true # Requires PostgreSQL to be available. Set to true if you want to enable KubeTally.
        postgresSecretName: kubetally-db-credentials
        existingSecret: false # Set to true if the secret is pre-created externally
        postgresAddr: "kt-postgresql.kt-postgresql.svc.cluster.local"
        postgresPort: "5432"
        postgresUser: "postgres"
        postgresPassword: "postgres"
        postgresDB: "postgres"
        postgresSslmode: "disable"
        prometheusUrl: "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090"
    kubeslice:
      controller:
        endpoint: https://pu.mk8scluster-e00w111mv8rn8em35z.mk8s.eu-north1.nebius.cloud:443
    note

    In a multi-cluster deployment, the controller cluster must be able to reach the Prometheus endpoints running on the worker clusters.

    warning

    If the Prometheus endpoints are not configured, you may experience issues with the dashboards (for example, missing or incomplete metric displays).

  5. Use the values-controller.yaml file in the following command to install the EGS Controller:

    helm install egs-controller kubeslice-egs-helm-ent-prod/kubeslice-controller-egs -f <values-controller.yaml> -n kubeslice-controller
  6. Verify the installation by checking the status of the EGS Controller pods:

    kubectl get pods -n kubeslice-controller

    Example Output

    NAME                                                READY   STATUS    RESTARTS   AGE
    egs-gpr-manager-74f64ff8c7-h8dr2                    1/1     Running   0          2m25s
    egs-inventory-controller-manager-86b7c49fdf-975bb   1/1     Running   0          2m25s
    egs-queue-manager-cfcfb9d85-s8qlb                   1/1     Running   0          2m25s
    kubeslice-controller-manager-678985bdd6-75w5t       2/2     Running   0          2m25s
    kubetally-pricing-service-b69c65c7-7gq7x            1/1     Running   0          2m25s
    kubetally-pricing-updater-job-rdpnf                 1/1     Running   0          2m25s
    kubetally-report-7c6fdbcb88-6bvdz                   1/1     Running   0          2m25s
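
If the dashboards later show missing or incomplete metrics, first check that the controller cluster can reach the configured Prometheus endpoint. The following is a minimal sketch that runs a temporary curl pod against the example endpoint from step 3; replace the URL with your own prometheusUrl value:

# Query the Prometheus readiness endpoint from inside the cluster
kubectl run prom-check --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090/-/ready

A ready response indicates that the endpoint is reachable from the controller cluster.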

Install the EGS Admin Portal

  1. Create a file called values-ui.yaml with the following properties:

    global:
      imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems # Docker registry for the UI images
    kubeslice:
      prometheus:
        url: PROMETHEUS_ENDPOINT
      uiproxy:
        service:
          type: ClusterIP # Service type for the UI proxy
          ## If the type is set to NodePort, set the nodePort value if required
          # nodePort:
          # port: 443
          # targetPort: 8443
        labels:
          app: kubeslice-ui-proxy
        annotations: {}
        ## Ingress support is only available from EGS version 1.12.1.
        ingress:
          ## If true, a ui-proxy Ingress is created
          enabled: false
          ## Port on the Service to route to
          servicePort: 443
          ## Ingress class name (e.g. "nginx"), if you are using a custom ingress controller
          className: ""
          hosts:
            - host: ui.kubeslice.com # replace with your FQDN
              paths:
                - path: / # base path
                  pathType: Prefix # Prefix | Exact
          ## TLS configuration (you must create these Secrets ahead of time)
          tls: []
          # - hosts:
          #     - ui.kubeslice.com
          #   secretName: uitlssecret
          annotations: []
          ## Extra labels to add onto the Ingress object
          extraLabels: {}
      egsCoreApis:
        enabled: true # Enable EGS core APIs for the UI
        service:
          type: ClusterIP # Service type for the EGS core APIs
  2. Use the values-ui.yaml in the following command to install the EGS Admin Portal:

    helm install egs-ui kubeslice-egs-helm-ent-prod/kubeslice-ui-egs -f <values-ui.yaml> -n kubeslice-controller
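
To verify the Admin Portal installation, check that the UI pods are running in the kubeslice-controller namespace (exact pod names vary by release):

kubectl get pods -n kubeslice-controller | grep kubeslice-ui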

Create a Project

  1. Create a file called project.yaml with the following properties:

    apiVersion: controller.kubeslice.io/v1alpha1
    kind: Project
    metadata:
      name: avesha
      namespace: kubeslice-controller
    spec:
      serviceAccount:
        readWrite:
          - admin
  2. Apply the project.yaml file on the controller cluster using the following command:

    kubectl apply -f project.yaml -n kubeslice-controller
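
To confirm that the project was created, list the Project resources on the controller cluster. Creating a project also creates a kubeslice-<project-name> namespace (kubeslice-avesha in this example) containing the project's service-account secrets, which are used later to log in to the Admin Portal:

kubectl get project -n kubeslice-controller
kubectl get secrets -n kubeslice-avesha | grep kubeslice-rbac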

Log in to the Admin Portal

To access the EGS Admin Portal, you need to retrieve the Admin Portal URL and the admin access token.

  1. Get the Admin Portal URL. First, check the UI proxy pod status:

    # Check EGS UI pod status
    kubectl get pods -n kubeslice-controller | grep kubeslice-ui-proxy

    Ensure that the kubeslice-ui-proxy pod is in the Running state before proceeding.

    Depending on the service type configured in the values-ui.yaml file, use one of the following methods to get the Admin Portal URL:

    1. For LoadBalancer service type:

      kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null || \
      kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null

      Example

      kubectl get svc kubeslice-ui-proxy -n kubeslice-controller

      Example Output

      NAME                 TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)         AGE
      kubeslice-ui-proxy   LoadBalancer   10.128.144.231   139.144.167.243   443:32185/TCP   9m23s

      Example

      kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null || \
      kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null

      Example Output

      139-144-167-243.ip.linodeusercontent.com
    2. For NodePort service type:

      NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}' 2>/dev/null | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)
      if [ -z "$NODE_IP" ]; then
      NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}' 2>/dev/null | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)
      fi
      NODE_PORT=$(kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.spec.ports[0].nodePort}' 2>/dev/null)
      echo "https://$NODE_IP:$NODE_PORT"

      Example Output

      NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}' 2>/dev/null | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)
      echo $NODE_IP
      139.177.207.126

      if [ -z "$NODE_IP" ]; then
      NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}' 2>/dev/null | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)
      fi

      echo $NODE_IP
      139.177.207.126

      NODE_PORT=$(kubectl get svc kubeslice-ui-proxy -n kubeslice-controller -o jsonpath='{.spec.ports[0].nodePort}' 2>/dev/null)
      echo "https://$NODE_IP:$NODE_PORT"
      https://139.177.207.126:32185
    3. For ClusterIP service type (port-forward required):

      kubectl port-forward -n kubeslice-controller svc/kubeslice-ui-proxy 8080:443
      echo "https://localhost:8080"

      The Admin Portal URL depends on the service type: for example, https://<LOAD-BALANCER-IP>, https://<NODE-IP>:<NODE-PORT>, or https://localhost:8080 when port-forwarding a ClusterIP service.

      Example Output

      kubectl port-forward -n kubeslice-controller svc/kubeslice-ui-proxy 8080:443
      echo "https://localhost:8080"
      Forwarding from 127.0.0.1:8080 -> 8443
      Forwarding from [::1]:8080 -> 8443


  2. To get the admin access token, use the following command:

    kubectl get secret kubeslice-rbac-rw-admin -o jsonpath="{.data.token}" -n kubeslice-avesha
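
    The token value in .data.token is base64-encoded, as with any Kubernetes Secret. To retrieve and decode it in one step:

    kubectl get secret kubeslice-rbac-rw-admin -n kubeslice-avesha -o jsonpath="{.data.token}" | base64 --decode

    Use the decoded token to log in to the Admin Portal.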

Register a Worker Cluster

After a successful login, you can start using the EGS Admin Portal to manage your Kubernetes clusters and applications. You must provide EGS with at least one worker cluster for running GPU jobs.

There are two options:

  • If your controller cluster also has GPU nodes, you can register the controller as a worker cluster by following the instructions in the Manual Cluster Registration topic.

  • If you have separate clusters with GPU nodes in addition to the controller cluster, you can register a worker cluster, create workspaces, and manage applications within the EGS environment. For more information on how to register a worker cluster, see Register a Worker Cluster.

Create a Workspace

After registering a worker cluster, you can create a workspace. A workspace is a logical boundary for a user or a team, where they can deploy their AI workloads. For more information on how to create a workspace, see Create a Workspace.