Prerequisites for Installation
This topic describes the prerequisites for installing Elastic GPU Service (EGS). Make sure your environment meets every requirement listed here before you start the installation.
Kubernetes Cluster Requirements
- A Kubernetes cluster with GPU-enabled nodes.
- To support fractional GPU usage, the GPUs must support MIG (Multi-Instance GPU). The NVIDIA A100 is one example.
- The NVIDIA GPU Operator must be installed on the cluster.
- If MIG capability and shared GPU features are required, ensure that the GPU Operator is properly configured. See GPU Operator with MIG for more details.
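As a rough way to confirm the GPU requirements above, the sketch below inspects the JSON printed by `kubectl get nodes -o json`. It assumes that GPU nodes advertise allocatable resources under the `nvidia.com/` prefix and that GPU Feature Discovery (deployed by the GPU Operator) has set the `nvidia.com/mig.capable` label; the function names are illustrative and are not part of EGS.

```python
import json


def gpu_nodes(node_list):
    """Return names of nodes with at least one allocatable nvidia.com/* resource."""
    names = []
    for node in node_list.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        has_gpu = any(
            key.startswith("nvidia.com/") and int(value) > 0
            for key, value in allocatable.items()
        )
        if has_gpu:
            names.append(node["metadata"]["name"])
    return names


def mig_capable_nodes(node_list):
    """Return names of nodes labeled nvidia.com/mig.capable=true (assumed label)."""
    names = []
    for node in node_list.get("items", []):
        labels = node.get("metadata", {}).get("labels", {})
        if labels.get("nvidia.com/mig.capable") == "true":
            names.append(node["metadata"]["name"])
    return names


# Typical use: save the output of `kubectl get nodes -o json` to nodes.json, then
#   with open("nodes.json") as handle:
#       nodes = json.load(handle)
#   print(gpu_nodes(nodes), mig_capable_nodes(nodes))
```

An empty result from `gpu_nodes` usually means the GPU Operator device plugin is not yet advertising GPUs, so it is worth checking the operator pods before continuing.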
Monitoring and Observability
- NVIDIA GPU Operator is running and the `nvidia-dcgm-exporter` pod is active.
- Prometheus is deployed with Persistent Volume Claims (PVCs) for data retention.
- Prometheus is configured to scrape metrics from the DCGM exporter.
- Grafana is installed and configured with the NVIDIA DCGM dashboard.
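The first monitoring check above can be sketched as a small helper over the JSON printed by `kubectl get pods --all-namespaces -o json`. This is a minimal sketch: it assumes the exporter pod name begins with `nvidia-dcgm-exporter`, and the function name is hypothetical.

```python
import json


def dcgm_exporter_running(pod_list):
    """Return True if any pod named nvidia-dcgm-exporter* is in the Running phase."""
    for pod in pod_list.get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        phase = pod.get("status", {}).get("phase", "")
        if name.startswith("nvidia-dcgm-exporter") and phase == "Running":
            return True
    return False


# Typical use: save `kubectl get pods --all-namespaces -o json` to pods.json, then
#   with open("pods.json") as handle:
#       print(dcgm_exporter_running(json.load(handle)))
```

The helper only confirms that the pod is running. Whether Prometheus actually scrapes it still has to be verified in the Prometheus targets view.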
Kubernetes Cluster Access and Permissions
- An admin `kubeconfig` file to access the Kubernetes cluster.
- The cluster must have outbound internet access to pull container images.
- Permissions to create the `kubeslice-controller`, `kubeslice-system`, and `kubeslice-<project_name>` (replace `<project_name>` with your project name) namespaces.
- Permissions to create Kubernetes LoadBalancer services.
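To make the namespace and permission requirements concrete, the sketch below builds the three namespace names for a project and the `kubectl auth can-i` commands that could be used to check the matching permissions. The project name `demo` and the function names are hypothetical. Note that Kubernetes RBAC has no separate permission for LoadBalancer services, so the check asks about creating services in general.

```python
def required_namespaces(project_name):
    """Return the namespaces that EGS needs for the given project name."""
    return [
        "kubeslice-controller",
        "kubeslice-system",
        f"kubeslice-{project_name}",
    ]


def permission_checks():
    """Return kubectl argument lists that ask whether the current user may
    create namespaces and services."""
    return [
        ["kubectl", "auth", "can-i", "create", "namespaces"],
        ["kubectl", "auth", "can-i", "create", "services"],
    ]


# Hypothetical project name, used only for illustration.
for namespace in required_namespaces("demo"):
    print(namespace)
for command in permission_checks():
    print(" ".join(command))
```

Each printed command can be run as is against the target cluster; `kubectl auth can-i` answers `yes` or `no`.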
PostgreSQL Database and Ingress
- A PostgreSQL database with persistent storage backed by PVCs.
- An Ingress controller (for example, NGINX) installed on the cluster.
Required Tools
The following command-line tools must be installed on the system or workstation where EGS will be deployed. These tools enable interaction with the Kubernetes cluster and help manage the installation process.
Tool | Description | Installation Guide | Minimum Version |
---|---|---|---|
Helm | Kubernetes package manager | Install Helm | 3.15.0 |
kubectl | Kubernetes CLI | Install kubectl | 1.23.6 |
kubectx and kubens | CLI tools for switching clusters and namespaces | Install kubectx & kubens | — |
jq | Command-line JSON processor | Refer to jq installation | 1.6.0 |
yq | Command-line YAML processor | Refer to yq installation | 4.44.2 |
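The minimum versions in the table can be checked with a short script. The sketch below is one possible approach, not an official EGS tool: it extracts the first dotted version number from whatever a tool prints (for example, the output of `helm version --short` or `jq --version`) and compares it with the minimum from the table. The function names are illustrative.

```python
import re

# Minimum versions copied from the table above; kubectx and kubens list none.
MINIMUM_VERSIONS = {
    "helm": "3.15.0",
    "kubectl": "1.23.6",
    "jq": "1.6.0",
    "yq": "4.44.2",
}


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(tool, reported):
    """Return True if the reported version string satisfies the table minimum."""
    found = parse_version(reported)
    required = parse_version(MINIMUM_VERSIONS[tool])
    # Pad with zeros so that (1, 6) compares equal to (1, 6, 0).
    width = max(len(found), len(required))
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `"3.9.0"` sorts after `"3.15.0"` alphabetically.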
EGS License
A valid license is required to install EGS. To get a trial license, register with Avesha on the EGS registration page.
To learn how to install a license, see License Management.