Version: 1.14.0

Features

The following table describes the key features of the EGS platform.

| Main Feature | Sub Feature | Description |
|---|---|---|
| Multi-Cluster and Multi-Cloud Fleet Management | Cluster registration | Supports registration and management of EKS, GKE, OKE, AKS, Akamai, and on-premises clusters. |
| Multi-Tenancy: Workspaces | Create or delete workspace | A workspace has one or more namespaces associated with it. Users can deploy AI workloads in the workspace namespaces. Each workspace has a GPU VPC; by default, no GPUs are available to a workspace. A GPU Provision Request (GPR) is required to provision GPUs for a workspace. |
| | One or more namespaces per workspace | One or more namespaces can be associated with a workspace. |
| | One or more worker clusters per workspace | A workspace can span clusters; one or more clusters can be associated with a slice workspace. |
| Admin Access to Portal | Admin profile and RBAC | RBAC for Admin access. |
| | Access to Admin Portal | An individual portal for an admin. |
| User Access to Portal | User per workspace | Users are scoped to a specific workspace. |
| | Access to User Portal | An individual portal for a user to manage GPRs and AI workloads within a specific slice workspace. |
| | User profile and RBAC | Workspace-scoped RBAC that lets a user manage that workspace. |
| Inventory Management | GPU node pools, inventory list, and allocation status | Inventory management for NVIDIA GPUs and GPU node pools. By default, all GPU pools and nodes are managed by EGS. |
| Failover and Resilience | | Automatically bursts workloads to secondary clusters during hardware failures or resource exhaustion in the primary cluster. |
| GPU Provisioning Requests (GPRs) | Create/delete/update GPR | A GPU provisioning request for a workspace. A request includes parameters such as GPU type, number of GPUs, memory, nodes, cluster, and so on. |
| | Auto eviction of a GPR | An admin can enable auto eviction of a GPR while registering a worker cluster. |
| | GPR auto remediation | To prevent GPU downtime, auto remediation can be configured during GPR creation. |
| GPR Templates | GPR templates | An admin can create GPR templates for a workspace so that users and other admins can use them. |
| | Auto GPR | Auto GPR can be enabled with a default template to create GPRs automatically in a slice workspace. |
| GPR Queue Management | GPR queue/list | List and view GPR queues and GPR status. |
| | Change GPR priority | Change the priority of a queued GPR. |
| | Early release a GPR | Release a provisioned GPR early, returning the GPUs provisioned to a workspace. |
| | Multi-cluster support; GPR scoped to a single cluster | Users can select the cluster for a GPR. |
| GPU Allocation Management | Full node allocations | A full node is allocated to only one workspace. |
| | Idle timeout of allocations | After the idle timeout, the GPUs are returned to the free pool. |
| | Multi-instance GPU (MIG) node support | Users can configure the memory for MIG nodes during GPR creation. |
| GPU Nodes Cost Exploration | | The Admin portal provides a cost analysis dashboard and an individual page to explore GPU node costs. |
| EGS Core API Support | API tokens | Users can create and manage API tokens, which are required to access the Core APIs. |
| | Core APIs | EGS Core APIs. |
| EGS Core SDK Support | Python SDK package | Supports all the EGS Core APIs. |
| Deep AI Workload Observability | AI workloads visualization | AI workload details in a workspace. |
| | Pods/model details | List of pods and jobs running in the workspace, with model details. |
| | GPU usage/allocations | GPU/CPU allocations and usage for the workspace. |
| Dashboards | GPU allocations/usage and GPRs | GPU nodes and GPUs allocated and available, cost associated with allocations, total cost, leaderboards, and breakdowns by cluster, GPU type, and so on. |
| Inference Endpoints | Deploy and manage inference endpoints | Deploy one or more inference endpoints in a workspace. |
| | Model selection: Hugging Face, Mistral, and other models | Support for various commercial and open source LLM models. |
| | Custom model deployment | Upload a custom model for deployment. |
| | GPU provisioning for inference endpoints | GPU presets or advanced custom configuration. |
| | OpenAI-compatible APIs | OpenAI-compatible APIs for inference: chat, completions, and so on. |
| Events | GPR events | Visualization of events. |
| Admin Workflows | Register clusters, workspaces, and users | Using the Admin access token. |
| | Manage workspaces, GPRs, and inference endpoints | Manage workspaces, GPRs, and inference endpoints; explore GPU costs. |
| User Workflows | Manage GPRs and inference endpoints | Using the User access token. |
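Because inference endpoints expose OpenAI-compatible APIs, a standard chat-completions request can be sent to them with any HTTP client. The sketch below builds such a request payload in Python; the endpoint URL, model name, and token placeholder are illustrative assumptions, not values from this page — substitute the details shown for your workspace's inference endpoint and API token.

```python
import json

# Hypothetical endpoint URL; use the URL shown for your inference endpoint.
ENDPOINT = "https://egs.example.com/v1/chat/completions"

# Standard OpenAI-style chat-completions payload. The model name here is an
# assumption; use a model deployed in your workspace.
payload = {
    "model": "mistral-7b-instruct",
    "messages": [
        {"role": "user", "content": "Summarize GPU allocation in one sentence."}
    ],
    "max_tokens": 128,
}

# Requests are authenticated with an EGS API token (see "API Tokens" above).
headers = {
    "Authorization": "Bearer <API_TOKEN>",  # placeholder, not a real token
    "Content-Type": "application/json",
}

body = json.dumps(payload)
```

The request can then be sent with any HTTP client, for example `requests.post(ENDPOINT, headers=headers, data=body)`, and the response parsed as a standard chat-completions object.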