Version: 1.15.0

Install Using the Script

This topic describes the steps to install EGS on the Kubernetes cluster using the script provided in the egs-installation repository.

note
  • The EGS Controller is also referred to as the KubeSlice Controller in some diagrams and in the YAML files.
  • The EGS Admin Portal is also referred to as the KubeSlice Manager (UI) in some diagrams and in the YAML files.

EGS Installation Script Overview

The EGS installation script automates the process of deploying EGS components on a Kubernetes cluster. The egs-installation repository includes the script to install EGS. The egs-installer script orchestrates the installation process by handling prerequisites, configuration, and installation steps based on the parameters defined in the egs-installer-config.yaml file.

The installation process involves cloning the repository, checking prerequisites, modifying the configuration file, and running the installation script.

Prerequisites

Before you begin the installation, ensure that you have completed the following prerequisites:

  • Have access to the Kubernetes cluster where you will install EGS and have the necessary permissions to create namespaces, deploy applications, and manage resources.
  • Installed prerequisites for the EGS controller. For more information, see Install EGS Controller Prerequisites.
  • Installed prerequisites for the worker cluster. For more information, see Install EGS Worker Prerequisites.
  • Applied a valid EGS license received from Avesha. For more information, see EGS Registration.
  • Have the required command-line tools installed, including kubectl and Helm. For more information, see Install Command Line Tools.
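To quickly confirm that the command-line tools are available before proceeding, you can run a basic version check (the exact versions required are listed in the prerequisites pages referenced above):

kubectl version --client
helm version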

Clone the Repository

Clone the EGS installation repository using the following command:

git clone https://github.com/kubeslice-ent/egs-installation.git
note
  • Ensure the YAML configuration file is properly formatted and includes all required fields.
  • The installation script will terminate with an error if any critical step fails, unless explicitly configured to skip on failure.
  • All file paths specified in the YAML must be relative to the base_path, unless absolute paths are provided.

Create Namespaces

If your cluster enforces namespace creation policies, pre-create the namespaces required for installation before running the script. This step is optional and is necessary only if your cluster has such policies in place.

Navigate to the cloned egs-installation repository and locate the create-namespaces.sh script and the namespace-input.yaml file. Use the namespace-input.yaml file to specify the namespaces to be created. You must ensure that all required annotations and labels for policy enforcement are correctly configured in the namespace-input.yaml file.

Use the following command to create namespaces:

./create-namespaces.sh --input-yaml <NAMESPACE_INPUT_YAML> --kubeconfig <ADMIN KUBECONFIG> --kubecontext-list <KUBECTX>

Example Command:

./create-namespaces.sh --input-yaml namespace-input.yaml --kubeconfig ~/.kube/config --kubecontext-list context1,context2

For more information, see the Namespace Creation readme file in the egs-installation repository.
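After the script completes, you can optionally confirm that the namespaces exist and carry the expected labels and annotations. The commands below are a minimal check; substitute the namespace names defined in your namespace-input.yaml file:

kubectl get namespace <NAMESPACE> --show-labels
kubectl get namespace <NAMESPACE> -o jsonpath='{.metadata.annotations}'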

Run the EGS Preflight Check Script

Use the egs-preflight-check.sh script to verify the prerequisites for installing EGS.

To run the preflight check script, use the following command:

./egs-preflight-check.sh --kubeconfig <ADMIN KUBECONFIG> --kubecontext-list <KUBECTX>

Example command:

./egs-preflight-check.sh --kubeconfig ~/.kube/config --kubecontext-list context1,context2

The script performs the following checks:

  • Validates the presence of required binaries (for example, kubectl, helm, jq, yq, curl)
  • Verifies access to the Kubernetes clusters specified in the kubecontext-list
  • Validates namespaces, permissions, PVCs, and services, helping to identify and resolve potential issues before installation
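If the preflight check reports missing binaries, you can verify them manually before re-running the script. This is a plain shell check, not part of the script itself:

command -v kubectl helm jq yq curl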

Run the EGS Prerequisites Installer Script

Use the egs-install-prerequisites.sh script to configure additional applications required for EGS, such as GPU Operator, Prometheus, and PostgreSQL.

To run the prerequisites installer script:

  1. Navigate to the cloned egs-installation repository and locate the input configuration file named egs-installer-config.yaml.

  2. Edit the egs-installer-config.yaml file with the global kubeconfig and kubecontext parameters:

    global_kubeconfig: "" # Relative path to the global kubeconfig file from base_path (defaults to the script directory) (MANDATORY)
    global_kubecontext: "" # Global kubecontext (MANDATORY)
    use_global_context: true # If true, use the global kubecontext for all operations by default

  3. Enable additional applications installation by setting the following parameters in the egs-installer-config.yaml file:

    # Enable or disable specific stages of the installation
    enable_install_controller: true # Enable the installation of the Kubeslice controller
    enable_install_ui: true # Enable the installation of the Kubeslice UI
    enable_install_worker: true # Enable the installation of Kubeslice workers

    # Enable or disable the installation of additional applications (prometheus, gpu-operator, postgresql)
    enable_install_additional_apps: true # Set to true to enable additional apps installation

    # Enable custom applications
    # Set this to true if you want to allow custom applications to be deployed.
    # This is specifically useful for enabling NVIDIA driver installation on your nodes.
    enable_custom_apps: false

    # Command execution settings
    # Set this to true to allow the execution of commands for configuring NVIDIA MIG.
    # This includes modifications to the NVIDIA ClusterPolicy and applying node labels
    # based on the MIG strategy defined in the YAML (e.g., single or mixed strategy).
    run_commands: false

    # Node labeling automation for KubeSlice networking
    # Set this to true to automatically label nodes with 'kubeslice.io/node-type=gateway'
    # Priority: 1) Nodes with external IPs, 2) Any available nodes (up to 2 nodes)
    # This is required when kubesliceNetworking is enabled in worker clusters
    add_node_label: true
    note

    Important configuration in the egs-installer-config.yaml file:

    • Set enable_custom_apps to true if you need NVIDIA driver installation on your nodes.
    • Set run_commands to true if you need NVIDIA MIG configuration and node labeling.
    • Set add_node_label to true to enable automatic node labeling for KubeSlice networking.
  4. After configuring the YAML file, run the egs-install-prerequisites.sh script to set up GPU Operator, Prometheus, and PostgreSQL:

    ./egs-install-prerequisites.sh --input-yaml egs-installer-config.yaml

    This step installs the required infrastructure components before the main EGS installation.
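After the prerequisites installer completes, you can optionally confirm the node labeling and the monitoring stack before moving on. The following commands are a minimal sanity check and assume the Prometheus stack was installed into the egs-monitoring namespace referenced in the sample configuration:

kubectl get nodes -l kubeslice.io/node-type=gateway
kubectl get pods -n egs-monitoring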

Single Cluster Installation

For single-cluster deployments, you can skip the worker cluster registration step. The controller and worker components are installed in the same cluster.

To install EGS in a single-cluster setup, follow these steps:

  1. Navigate to the cloned egs-installation repository and locate the input configuration file named egs-installer-config.yaml.

  2. Edit the egs-installer-config.yaml with basic configuration parameters:

    # Kubernetes Configuration (Mandatory)
    global_kubeconfig: "" # Relative path to the global kubeconfig file from base_path (defaults to the script directory) (MANDATORY)
    global_kubecontext: "" # Global kubecontext (MANDATORY)
    use_global_context: true # If true, use the global kubecontext for all operations by default

    # Installation Flags (Mandatory)
    enable_install_controller: true # Enable the installation of the Kubeslice controller
    enable_install_ui: true # Enable the installation of the Kubeslice UI
    enable_install_worker: true # Enable the installation of Kubeslice workers
    enable_install_additional_apps: true # Set to true to enable additional apps installation
    enable_custom_apps: true # Set to true if you want to allow custom applications to be deployed
    run_commands: false # Set to true to allow the execution of commands for configuring NVIDIA MIG
  3. Run the EGS installation script to deploy EGS components in the single cluster:

    ./egs-installer.sh --input-yaml egs-installer-config.yaml
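Once the script finishes, you can quickly confirm that the core components are running. The check below assumes the default namespaces from the sample configuration (kubeslice-controller and kubeslice-system):

kubectl get pods -n kubeslice-controller
kubectl get pods -n kubeslice-system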

Multi-Cluster Installation

To install EGS in a multi-cluster setup, follow these steps:

  1. Navigate to the cloned egs-installation repository and locate the input configuration file named egs-installer-config.yaml.

  2. Edit the egs-installer-config.yaml file with the global kubeconfig and kubecontext parameters:

    global_kubeconfig: "" # Relative path to the global kubeconfig file from base_path (defaults to the script directory) (MANDATORY)
    global_kubecontext: "" # Global kubecontext (MANDATORY)
    use_global_context: true # If true, use the global kubecontext for all operations by default
  3. (AirGap installation only) If you are performing an AirGap installation, update the image_pull_secrets section in the config file with the appropriate registry credentials or secret references. Skip this step otherwise.

    # From the email received after registration with Avesha 
    IMAGE_REPOSITORY="https://index.docker.io/v1/"
    USERNAME="xxx"
    PASSWORD="xxx"
  4. (Optional) These settings are required only if you are not using local Helm charts and instead pulling them from a remote Helm repository.

    1. Set use_local_charts to false

      use_local_charts: false
    2. Set the global Helm repository URL

      global_helm_repo_url: "https://smartscaler.nexus.aveshalabs.io/repository/kubeslice-egs-helm-ent-prod"
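If you want to confirm the remote repository is reachable before running the installer, you can add and refresh it manually with Helm. This is an optional check; the repository alias kubeslice-egs below is an arbitrary local name, and the installer itself handles chart retrieval based on these settings:

helm repo add kubeslice-egs https://smartscaler.nexus.aveshalabs.io/repository/kubeslice-egs-helm-ent-prod
helm repo update
helm search repo kubeslice-egs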

EGS Controller Configuration

  1. Update the EGS Controller (KubeSlice Controller) configuration, in the egs-installer-config.yaml file:

    #### Kubeslice Controller Installation Settings ####
    kubeslice_controller_egs:
      skip_installation: false # Do not skip the installation of the controller
      use_global_kubeconfig: true # Use global kubeconfig for the controller installation
      specific_use_local_charts: true # Override to use local charts for the controller
      kubeconfig: "" # Path to the kubeconfig file specific to the controller; if empty, uses the global kubeconfig
      kubecontext: "" # Kubecontext specific to the controller; if empty, uses the global context
      namespace: "kubeslice-controller" # Kubernetes namespace where the controller will be installed
      release: "egs-controller" # Helm release name for the controller
      chart: "kubeslice-controller-egs" # Helm chart name for the controller
      #### Inline Helm Values for the Controller Chart ####
      inline_values:
        global:
          imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems # Docker registry for the images
          namespaceConfig: # Labels or annotations that EGS Controller namespaces should have
            labels: {}
            annotations: {}
          kubeTally:
            enabled: false # Enable KubeTally in the controller
            #### PostgreSQL Connection Configuration for KubeTally ####
            # If all the values below are specified, the installer creates a secret with this name
            # in the kubeslice-controller namespace. Alternatively, leave the values below empty
            # and provide the name of a pre-created secret that uses the same connection-detail format.
            postgresSecretName: kubetally-db-credentials # Secret name for the PostgreSQL credentials
            postgresAddr: "kt-postgresql.kt-postgresql.svc.cluster.local" # Change this address to your PostgreSQL endpoint
            postgresPort: 5432 # Change this port to your PostgreSQL service port
            postgresUser: "postgres" # Change this to your PostgreSQL username
            postgresPassword: "postgres" # Change this to your PostgreSQL password
            postgresDB: "postgres" # Change this to your PostgreSQL database name
            postgresSslmode: disable # Change this to the SSL mode for your PostgreSQL connection
            prometheusUrl: http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090 # Prometheus URL for monitoring
        kubeslice:
          controller:
            endpoint: "" # Endpoint of the controller API server; auto-fetched if left empty
      #### Helm Flags and Verification Settings ####
      helm_flags: "--wait --timeout 5m --debug" # Additional Helm flags for the installation
      verify_install: false # Verify the installation of the controller
      verify_install_timeout: 30 # Timeout for the controller installation verification (in seconds)
      skip_on_verify_fail: true # If verification fails, skip the step and continue instead of exiting
      #### Troubleshooting Settings ####
      enable_troubleshoot: false # Enable troubleshooting mode for additional logs and checks
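After the installer applies these settings, you can verify the controller release and its pods. This is a minimal check using the release name and namespace from the sample configuration above:

helm list -n kubeslice-controller
kubectl get pods -n kubeslice-controller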

KubeTally Configuration

  1. (Optional) Configure PostgreSQL to use the KubeTally (Cost Management) feature. The PostgreSQL connection details required by the controller are stored in a Kubernetes Secret in the kubeslice-controller namespace.

    You can configure the secret in one of the following ways:

    • To use your own Kubernetes Secret, enter only the secret name in the configuration file and leave other fields blank. Confirm the secret exists in the kubeslice-controller namespace and uses the required key-value format.

      postgresSecretName: kubetally-db-credentials   # Existing secret in kubeslice-controller namespace

      postgresAddr: ""
      postgresPort: ""
      postgresUser: ""
      postgresPassword: ""
      postgresDB: ""
      postgresSslmode: ""
    • To automatically create a secret, provide all connection details and the secret name. The installer will then create a Kubernetes Secret in the kubeslice-controller namespace.

      postgresSecretName: kubetally-db-credentials   # Secret to be created in kubeslice-controller namespace

      postgresAddr: "kt-postgresql.kt-postgresql.svc.cluster.local" # PostgreSQL service endpoint
      postgresPort: 5432 # PostgreSQL service port (default 5432)
      postgresUser: "postgres" # PostgreSQL username
      postgresPassword: "postgres" # PostgreSQL password
      postgresDB: "postgres" # PostgreSQL database name
      postgresSslmode: disable # SSL mode for PostgreSQL connection (for example, disable or require)
      info

      You can add the kubeslice.io/managed-by-egs=false label to GPU nodes. This label excludes or filters the associated GPU nodes from the EGS inventory.
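If you bring your own secret, the sketch below shows how it could be created with kubectl. The key names are assumed to mirror the field names shown in the configuration above; confirm the exact key format expected by your EGS version before relying on this:

kubectl create secret generic kubetally-db-credentials \
  -n kubeslice-controller \
  --from-literal=postgresAddr=<POSTGRES_HOST> \
  --from-literal=postgresPort=5432 \
  --from-literal=postgresUser=<POSTGRES_USER> \
  --from-literal=postgresPassword=<POSTGRES_PASSWORD> \
  --from-literal=postgresDB=<POSTGRES_DB> \
  --from-literal=postgresSslmode=disable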

EGS Admin Portal Configuration

  1. The EGS Admin Portal (KubeSlice UI) provides a web-based interface for managing and monitoring the EGS environment. To configure the EGS Admin Portal installation, update the following settings in the egs-installer-config.yaml file:

    note

    The DCGM_METRIC_JOB_VALUE must match the Prometheus scrape job name configured in your Prometheus configuration. Without a proper Prometheus scrape configuration, GPU metrics will not be collected, and UI visualization will not work. Ensure your Prometheus configuration includes the corresponding scrape job.

    # Kubeslice UI Installation Settings
    kubeslice_ui_egs:
      skip_installation: false # Do not skip the installation of the UI
      use_global_kubeconfig: true # Use global kubeconfig for the UI installation
      kubeconfig: "" # Path to the kubeconfig file specific to the UI; if empty, uses the global kubeconfig
      kubecontext: "" # Kubecontext specific to the UI; if empty, uses the global context
      namespace: "kubeslice-controller" # Kubernetes namespace where the UI will be installed
      release: "egs-ui" # Helm release name for the UI
      chart: "kubeslice-ui-egs" # Helm chart name for the UI

      # Inline Helm Values for the UI Chart
      inline_values:
        global:
          imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems # Docker registry for the UI images
        kubeslice:
          prometheus:
            url: http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090 # Prometheus URL for monitoring
          uiproxy:
            service:
              type: ClusterIP # Service type for the UI proxy
              ## If type is set to NodePort, set the nodePort value if required
              # nodePort:
              # port: 443
              # targetPort: 8443
            labels:
              app: kubeslice-ui-proxy
            annotations: {}
            ingress:
              ## If true, the ui-proxy Ingress will be created
              enabled: false
              ## Port on the Service to route to
              servicePort: 443
              ## Ingress class name (e.g. "nginx"), if you're using a custom ingress controller
              className: ""
              hosts:
                - host: ui.kubeslice.com # replace with your FQDN
                  paths:
                    - path: / # base path
                      pathType: Prefix # Prefix | Exact
              ## TLS configuration (you must create these Secrets ahead of time)
              tls: []
              # - hosts:
              #     - ui.kubeslice.com
              #   secretName: uitlssecret
              annotations: []
              ## Extra labels to add onto the Ingress object
              extraLabels: {}
          apigw:
            env:
              - name: DCGM_METRIC_JOB_VALUE
                value: nvidia-dcgm-exporter # This value must match the Prometheus scrape job name for GPU metrics collection
          egsCoreApis:
            enabled: true # Enable EGS core APIs for the UI
            service:
              type: ClusterIP # Service type for the EGS core APIs

      # Helm Flags and Verification Settings
      helm_flags: "--wait --timeout 5m --debug" # Additional Helm flags for the UI installation
      verify_install: false # Verify the installation of the UI
      verify_install_timeout: 50 # Timeout for the UI installation verification (in seconds)
      skip_on_verify_fail: true # If UI verification fails, skip the step and continue instead of exiting

      # Chart Source Settings
      specific_use_local_charts: true # Override to use local charts for the UI
Worker Cluster: Monitoring Endpoint Configuration

  1. In multi-cluster deployments, you must configure the global_auto_fetch_endpoint parameter in the egs-installer-config.yaml file. This configuration is essential for proper monitoring and dashboard URL management across multiple clusters.

    note
    • In single-cluster deployments, this step is not required, as the controller and worker are in the same cluster.
    • In a multi-cluster deployment, the controller cluster must be able to reach the Prometheus endpoint running on the worker clusters.
    warning

    If the Prometheus endpoints are not configured, you may experience issues with the dashboards (for example, missing or incomplete metric displays).

    To configure monitoring endpoints for multi-cluster setups, follow these steps to update the inline values in your egs-installer-config.yaml file:

    1. Set the global_auto_fetch_endpoint parameter to true.

    2. Use the following commands to get the Grafana and Prometheus LoadBalancer External IPs or NodePorts from the worker clusters:

      kubectl get svc prometheus-grafana -n monitoring
      kubectl get svc prometheus-kube-prometheus-prometheus -n monitoring

      Example Output

      NAME                                    TYPE           CLUSTER-IP   EXTERNAL-IP       PORT(S)          AGE
      prometheus-grafana                      LoadBalancer   10.96.0.1    <grafana-lb>      80:31380/TCP     5d
      prometheus-kube-prometheus-prometheus   LoadBalancer   10.96.0.2    <prometheus-lb>   9090:31381/TCP   5d
    3. Update the Prometheus and Grafana LoadBalancer IPs or NodePorts in the inline_values section of your egs-installer-config.yaml file:

      inline_values: # Inline Helm values for the worker chart
        global:
          imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems # Docker registry for worker images
        operator:
          env:
            - name: DCGM_EXPORTER_JOB_NAME
              value: gpu-metrics # This value must match the Prometheus scrape job name for GPU metrics collection
        egs:
          prometheusEndpoint: "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090" # Prometheus endpoint
          grafanaDashboardBaseUrl: "http://<grafana-lb>/d/Oxed_c6Wz" # Grafana dashboard base URL
        egsAgent:
          secretName: egs-agent-access
          agentSecret:
            endpoint: ""
            key: ""
        metrics:
          insecure: true # Allow insecure connections for metrics
        kserve:
          enabled: true # Enable KServe for the worker
          kserve: # KServe chart options
            controller:
              gateway:
                domain: kubeslice.com
                ingressGateway:
                  className: "nginx" # Ingress class name for the KServe gateway
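Before running the installer, you can verify that the controller cluster can actually reach the worker's Prometheus endpoint. The check below uses the external address gathered in the previous step and the readiness endpoint that Prometheus exposes by default:

curl -s http://<prometheus-lb>:9090/-/ready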

Register a Worker Cluster

note

The EGS installation script installs an EGS worker on the controller cluster by default for quick installation and testing purposes.

  1. The installation script allows you to register multiple worker clusters at the same time. To register an additional worker cluster, follow these steps to update the egs-installer-config.yaml file:

    1. Add worker cluster configuration under:
      • kubeslice_worker_egs array
      • cluster_registration array
    2. Repeat the configuration for each worker cluster you want to register.

    To update the configuration file:

    1. Add a new worker configuration to the kubeslice_worker_egs array in your configuration file. The following is an example configuration for a new worker:

      kubeslice_worker_egs:
        - name: "worker-1" # Existing worker
          # ... existing configuration ...

        - name: "worker-2" # New worker
          use_global_kubeconfig: true # Use global kubeconfig for this worker
          kubeconfig: "" # Path to the kubeconfig file specific to the worker; if empty, uses the global kubeconfig
          kubecontext: "" # Kubecontext specific to the worker; if empty, uses the global context
          skip_installation: false # Do not skip the installation of the worker
          specific_use_local_charts: true # Override to use local charts for this worker
          namespace: "kubeslice-system" # Kubernetes namespace for this worker
          release: "egs-worker-2" # Helm release name for the worker (must be unique)
          chart: "kubeslice-worker-egs" # Helm chart name for the worker
          inline_values: # Inline Helm values for the worker chart
            global:
              imageRegistry: harbor.saas1.smart-scaler.io/avesha/aveshasystems # Docker registry for worker images
            egs:
              prometheusEndpoint: "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090" # Prometheus endpoint
              grafanaDashboardBaseUrl: "http://<grafana-lb>/d/Oxed_c6Wz" # Grafana dashboard base URL
            egsAgent:
              secretName: egs-agent-access
              agentSecret:
                endpoint: ""
                key: ""
            metrics:
              insecure: true # Allow insecure connections for metrics
            kserve:
              enabled: true # Enable KServe for the worker
              kserve: # KServe chart options
                controller:
                  gateway:
                    domain: kubeslice.com
                    ingressGateway:
                      className: "nginx" # Ingress class name for the KServe gateway
          helm_flags: "--wait --timeout 5m --debug" # Additional Helm flags for the worker installation
          verify_install: true # Verify the installation of the worker
          verify_install_timeout: 60 # Timeout for the worker installation verification (in seconds)
          skip_on_verify_fail: false # Do not skip if worker verification fails
          enable_troubleshoot: false # Enable troubleshooting mode for additional logs and checks
    2. Add a worker cluster registration configuration in the cluster_registration array in your configuration file. The following is an example configuration for a new worker cluster:

      cluster_registration:
        - cluster_name: "worker-1" # Existing cluster
          project_name: "avesha" # Name of the project to associate with the cluster
          telemetry:
            enabled: true # Enable telemetry for this cluster
            endpoint: "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090" # Telemetry endpoint
            telemetryProvider: "prometheus" # Telemetry provider (Prometheus in this case)
          geoLocation:
            cloudProvider: "" # Cloud provider for this cluster (e.g., GCP)
            cloudRegion: "" # Cloud region for this cluster (e.g., us-central1)

        - cluster_name: "worker-2" # New cluster
          project_name: "avesha" # Name of the project to associate with the cluster
          telemetry:
            enabled: true # Enable telemetry for this cluster
            endpoint: "http://prometheus-kube-prometheus-prometheus.egs-monitoring.svc.cluster.local:9090" # Telemetry endpoint
            telemetryProvider: "prometheus" # Telemetry provider (Prometheus in this case)
          geoLocation:
            cloudProvider: "" # Cloud provider for this cluster (e.g., GCP)
            cloudRegion: "" # Cloud region for this cluster (e.g., us-central1)
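After the installer registers the clusters, you can confirm the registration from the controller cluster. This is a hedged check that assumes the default project name avesha (KubeSlice creates a kubeslice-<project> namespace, kubeslice-avesha in this case):

kubectl get clusters.controller.kubeslice.io -n kubeslice-avesha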

Run the EGS Installation Script

note

The installation script creates a default project workspace and registers a worker cluster.

Use the following command to install EGS:

./egs-installer.sh --input-yaml egs-installer-config.yaml

Access the Admin Portal

After the successful installation, the script displays the LoadBalancer external IP address and the admin access token to log in to the Admin Portal.


Make a note of the LoadBalancer external IP address and the admin access token required to log in to the Admin Portal. The KubeSlice UI Proxy LoadBalancer URL value is your Admin Portal URL, and the token for project avesha (username: admin) is your login token.

Use the URL and the admin access token from the previous step to log in to the Admin Portal.


Retrieve Admin Credentials Using kubectl

If you missed the LoadBalancer external IP address or the admin access token displayed after installation, you can retrieve them using kubectl commands.

Perform the following steps to retrieve the admin access token and the Admin Portal URL:

  1. Use the following command to retrieve the admin access token:

    kubectl get secret kubeslice-rbac-rw-admin -o jsonpath="{.data.token}" -n kubeslice-avesha | base64 --decode

    Example Output:

    eyJhbGciOiJSUzI1NiIsImtpZCI6IjE2YjY0YzYxY2E3Y2Y0Y2E4YjY0YzYxY2E3Y2Y0Y2E4YjYiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2UtYWNjb3VudCIsImt1YmVybmV0ZXM6c2VydmljZS1hY2NvdW50Om5hbWUiOiJrdWJlc2xpY2UtcmJhYy1ydy1hZG1pbiIsImt1YmVybmV0ZXM6c2VydmljZS1hY2NvdW50OnVpZCI6Ijg3ZjhiZjBiLTU3ZTAtMTFlYS1iNmJlLTRmNzlhZTIyMWI4NyIsImt1YmVybmV0ZXM6c2VydmljZS1hY2NvdW50OnNlcnZpY2UtYWNjb3VudC51aWQiOiI4N2Y4YmYwYi01N2UwLTExZWEtYjZiZS00Zjc5YWUyMjFiODciLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXNsaWNlLXJiYWMtcnctYWRtaW4ifQ.MEYCIQDfXoX8v7b8k7c3
    4mJpXHh3Zk5lYzVtY2Z0eXlLQAIhAJi0r5c1v6vUu8mJxYv1j6Kz3p7G9y4nU5r8yX9fX6c
  2. Use the following command to access the Load Balancer IP:

    Example

    kubectl get svc -n kubeslice-controller | grep kubeslice-ui-proxy

    Example Output

    NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)         AGE
    kubeslice-ui-proxy   LoadBalancer   10.96.2.238   172.18.255.201   443:31751/TCP   24h

Note down the LoadBalancer external IP of the kubeslice-ui-proxy service. In the above example, 172.18.255.201 is the external IP. The EGS Portal URL will be https://<ui-proxy-ip>.

Upload Custom Pricing for Cloud Resources

To upload custom pricing for cloud resources, you can use the custom-pricing-upload.sh script provided in the EGS installation repository. This script allows you to upload custom pricing data for various cloud resources, which can be used for cost estimation and budgeting. Ensure you have installed curl to upload the CSV file.

To upload custom pricing data:

  1. Navigate to the cloned egs-installation repository and change the file permission using the following command:

    chmod +x custom-pricing-upload.sh
  2. Use the customer-pricing-data.yaml file to specify the custom pricing data. The file should contain the following structure:

    kubernetes:
      kubeconfig: "" # Absolute path of the kubeconfig
      kubecontext: "" # Kubecontext name
      namespace: "kubeslice-controller"
      service: "kubetally-pricing-service"

    # You can add as many cloud providers and instance types as needed
    cloud_providers:
      - name: "gcp"
        instances:
          - region: "us-east1"
            component: "Compute Instance"
            instance_type: "a2-highgpu-2g"
            vcpu: 1
            price: 20
            gpu: 1
          - region: "us-east1"
            component: "Compute Instance"
            instance_type: "e2-standard-8"
            vcpu: 1
            price: 5
            gpu: 0
  3. Run the script to upload the custom pricing data:

    ./custom-pricing-upload.sh 

This script automates the process of loading custom cloud pricing data into the pricing API running inside a Kubernetes cluster.

Script Workflow:

  • Reads the cluster connection details (kubeconfig, context) from the YAML input file.

  • Identifies the target service and its exposed port (for example, kubetally-pricing-service:80).

  • Selects a random available local port on the host machine.

  • Establishes a port-forwarding tunnel from the selected local port to the Kubernetes service. Runs in the background to keep the tunnel active during upload.

  • Converts the pricing data from YAML format into CSV format for API ingestion.

  • Uploads the generated CSV file to the pricing API at:

    http://localhost:<random_port>/api/v1/prices
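If you prefer to reproduce the upload manually (for example, while debugging the script), the same flow can be approximated with kubectl and curl. The service name, port, and API path come from the workflow above; the CSV layout and the exact upload format are not documented here, so treat the curl invocation as illustrative only:

kubectl port-forward svc/kubetally-pricing-service 18080:80 -n kubeslice-controller &
curl -s -X POST --data-binary @prices.csv http://localhost:18080/api/v1/prices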

Uninstall EGS

The uninstallation script removes all resources associated with EGS, including:

  • Workspaces
  • GPU Provision Requests (GPRs)
  • All custom resources provisioned by EGS
warning

Before running the uninstallation script, ensure that you have backed up any important data or configurations. The script will remove all EGS-related resources, and this action cannot be undone.

Use the following command to uninstall EGS:

./egs-uninstall.sh --input-yaml egs-installer-config.yaml
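After the uninstall script completes, you can confirm that the EGS Helm releases are gone. This is a simple check; the release names correspond to those used during installation, such as egs-controller, egs-ui, and the worker releases:

helm list -A | grep egs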

Troubleshooting

  • Missing Binaries

    Ensure all required binaries are installed and available in your system’s PATH.

  • Cluster Access Issues

    Verify that your kubeconfig files are correctly configured so the script can access the clusters defined in the YAML configuration.

  • Timeout Issues

    If a component fails to install within the specified timeout, increase the verify_install_timeout value in the YAML file.