Manage GPU Requests
This topic describes how to view, create, and manage GPU Requests (GPRs) in a project. A GPR requests GPU nodes for a project; the nodes are provisioned based on the GPRs created by workspace users or admins. GPRs are queued and processed based on the priority of each request.
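Conceptually, the queue behaves like a priority queue: GPRs with higher priority numbers are served first. The following minimal Python sketch illustrates that behavior; the GPRQueue class, the sample request names, and the tie-breaking by arrival order are assumptions for illustration, not the EGS implementation.

```python
import heapq
import itertools

class GPRQueue:
    """Minimal sketch of a GPR priority queue.

    Assumes higher priority numbers are served first (bands as described
    in this topic) and that ties are broken by arrival order (FIFO).
    """

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: arrival order

    def enqueue(self, name, priority):
        # Negate the priority because heapq is a min-heap.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def next_gpr(self):
        # Return the highest-priority GPR, or None if the queue is empty.
        if not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)
        return name

queue = GPRQueue()
queue.enqueue("train-llm", priority=250)   # High band
queue.enqueue("batch-job", priority=50)    # Low band
queue.enqueue("notebook", priority=101)    # Medium band (the default)
print(queue.next_gpr())  # train-llm
```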
View GPU Requests
To view the GPU Requests in a project, you must be an admin or a workspace user with the required permissions.
To view the GPU Requests:
- Log in to the Admin Portal.
- Go to GPU Requests on the left sidebar.
- On the GPU Requests page, the All GPU Requests tab shows all the GPU requests across all workspaces.
- Use the Search text box or the Filter option to filter the GPRs.
View the GPRs Specific to a Workspace
You can view the GPRs specific to a workspace. On the GPU Requests page, go to the GPU Requests Per Workspace tab, and click a workspace to see its GPRs.
Create a GPU Request
You can create GPU Requests (GPRs) as a workspace user or an admin.
To create a GPU Request:
- On the GPU Requests page, go to the GPU Requests Per Workspace tab.
- In the workspace list, click the workspace for which you want to create a GPR.
- On the GPU Requests page, click the Create GPU Request button in the top-right corner.
- On the Create GPU Request pane, enter the following information to configure the GPU request:
- For GPU Configuration:
  - Enter a request name in the GPU Request Name text box.
  - Choose how GPU Shape and Node Type are selected:
    - Manual Selection:
      - Leave the Auto Select GPU checkbox unselected.
      - Select a memory value from the Memory (GB) per GPU drop-down list. The list displays the available memory (GB) per cluster.
      - Select a GPU shape from the GPU Shape drop-down list.
      - Select a node type from the Node Type drop-down list.
    - Auto Selection:
      - Select the Auto Select GPU checkbox. The GPU shape and node type are selected automatically based on the specified memory (see the configuration sketch after this procedure).
      - If you manually enter GPU memory in the Memory (GB) per GPU text box, the Auto Select GPU checkbox is selected automatically. The minimum memory value is 1 GB, and the maximum value is the highest allocated memory available on any worker cluster in the workspace.
  - Set GPUs Per Node if you want to change its default value of 1.
  - Set GPU Nodes if you want to change its default value of 1.
- For Cluster Selection:
  - Select a cluster from the Available Clusters list for the GPR.
  - To include all available clusters automatically, select the Auto Select Cluster checkbox. By default, this checkbox is unselected.
- For Priority Configuration:
  - GPU User defaults to the user name of the workspace user.
  - Set Priority. The default value is Medium (101-200).
    You can change the priority of a GPR in the queue by changing its priority number (Low: 1-100, Medium: 101-200, High: 201-300). When a GPR moves to the top of the queue, it is provisioned as soon as the required resources are available.
  - Set Priority Number. The default value is 101.
  - Specify the Reserve For duration in Days (ddd), Hours (hh), and Minutes (mm).
- Expand Advanced Configuration and configure the following:
  - (Optional) Set the idle timeout so that provisioned GPU nodes that remain idle for the configured length of time can be reclaimed. This lets other GPRs use unused provisioned GPU nodes. The idle timeout duration must always be less than the Reserve For duration (see the configuration sketch after this procedure).
  - (Optional) The Enforce Idle Timeout checkbox is selected automatically so that the idle timeout takes effect. To configure the timeout without enforcing it, unselect this checkbox.
  - (Optional) Select the Requeue on Failure checkbox if you want this GPR to be requeued if it fails. EGS automatically detects issues with one or more GPUs in the provisioned GPR, removes the affected GPR from the workspace, and requeues it.
  - (Optional) Select Evict Low-Priority GPRs to automatically reallocate GPUs from low-priority GPRs to high-priority GPRs when resources are limited. The worker displays a list of low-priority GPRs that are eligible for auto-eviction. Unselect this checkbox if you do not want auto-eviction.
    Info: If the admin configures auto-eviction of low-priority GPRs at the cluster level, this checkbox is selected automatically in the Create GPU Request pane.
- Click the Get Wait Time button. The button is enabled only after you fill in the required fields. Clicking Get Wait Time shows the estimated wait time for GPU node provisioning.
  In the Requested GPUs section, under Available GPUs, you can manually select the clusters for the GPR.
- Click the Request GPUs button. The GPR is created and queued for provisioning. The GPR is processed based on the priority of the request and the availability of GPU nodes in the cluster.
  - The status of the GPR changes to Queued if the GPU node allocation is in the queue.
  - The status of the GPR changes to Running if the queued GPR is provisioned successfully.
  - The status of the GPR changes to Released Early if the GPR is released earlier than the scheduled time.
  - The status of the GPR changes to Completed if the GPR completes its scheduled time.
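The auto-selection and idle-timeout rules in this procedure can be summarized in code. The sketch below is a minimal illustration, not the EGS implementation: the GPU_SHAPES catalog, the function names, and the smallest-sufficient-shape selection rule are assumptions; the idle-timeout check mirrors the rule that the timeout must be shorter than the Reserve For duration.

```python
from datetime import timedelta

# Hypothetical shape catalog: shape name -> memory (GB) per GPU.
GPU_SHAPES = {"T4": 16, "A10G": 24, "A100-40": 40, "A100-80": 80}

def auto_select_shape(memory_gb):
    """Pick the smallest shape whose per-GPU memory meets the request.

    Mirrors the Auto Select GPU behavior described above; the catalog
    and the selection rule are assumptions for illustration.
    """
    candidates = [(mem, shape) for shape, mem in GPU_SHAPES.items()
                  if mem >= memory_gb]
    if not candidates:
        raise ValueError(f"No GPU shape offers {memory_gb} GB per GPU")
    return min(candidates)[1]

def validate_idle_timeout(idle_timeout, reserve_for):
    """The idle timeout must always be less than the Reserve For duration."""
    if idle_timeout >= reserve_for:
        raise ValueError("Idle timeout must be less than the reservation")

print(auto_select_shape(memory_gb=40))  # A100-40
validate_idle_timeout(timedelta(hours=1), timedelta(days=1))  # passes
```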
View GPU Request Details
After you create a GPR, it is queued for provisioning and processed based on the priority of the request and the availability of GPU nodes in the cluster. You can view the details of a GPR on the GPU Requests page or on the GPU Requests Per Workspace tab.
On the GPU Requests page, select the GPU Request to view request details.
Create a GPR from a Template
You can create a GPR from a template that is available for the parent workspace.
To create a GPR from a template:
- On the GPU Requests page, go to the GPU Requests Per Workspace tab.
- In the workspace list, click the workspace for which you want to create a GPR.
- Click Create GPU Request.
- Click Select Template. The workspace must have at least one template to apply to the new GPU request.
- On the Template Selection pane, select a template and click Apply Template.
- The template is applied. You can review the template settings and make any necessary adjustments.
- Click Get Wait Time. Clicking Get Wait Time automatically switches to the Request GPU tab.
  EGS shows the estimated wait time for GPU node provisioning (a conceptual sketch of wait-time estimation follows this procedure).
- In the Available GPUs table, select a GPU with an acceptable estimated wait time.
- Click Request GPUs.
- View the GPR in that workspace's GPU Requests queue or on the main GPU Requests page.
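For intuition only: one naive way a wait-time estimate could be computed is to assume that every queued GPR ahead of yours holds its nodes for its full reservation. EGS's actual estimator is not described in this topic; the function below is purely a conceptual sketch.

```python
from datetime import timedelta

def naive_wait_estimate(reservations_ahead):
    """Rough illustration only: if every GPR ahead in the queue runs to
    the end of its reservation before nodes free up, the wait time is
    the sum of those reservations. Not the EGS estimator.
    """
    return sum(reservations_ahead, timedelta())

ahead = [timedelta(hours=2), timedelta(minutes=30)]  # reservations ahead
print(naive_wait_estimate(ahead))  # 2:30:00
```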
Manage GPR Queues
The GPR Queue helps you visualize and control how GPU requests created under various workspaces are processed. As an admin, you can track queues for each cluster and node instance and change the execution order by adjusting priorities.
Change GPR Priority
To change the priority of a GPR, use the Priority Queue page.
Expand GPU Requests on the left sidebar to see Priority Queue. The Priority Queue page shows the priority of each GPR.
You can change the priority of a GPR in the queue. Select a GPR and increase its priority number (Low: 1-100, Medium: 101-200, High: 201-300) to move it higher in the queue. When a GPR is moved to the top of the queue, it is provisioned as soon as the required resources are available.
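As an illustration, the priority bands can be expressed as a small helper. The function is hypothetical; the ranges are the ones described above.

```python
def priority_band(priority):
    """Map a GPR priority number to its band (ranges as described above)."""
    if 1 <= priority <= 100:
        return "Low"
    if 101 <= priority <= 200:
        return "Medium"
    if 201 <= priority <= 300:
        return "High"
    raise ValueError("Priority must be between 1 and 300")

print(priority_band(101))  # Medium (the default priority number)
print(priority_band(250))  # High: moves the GPR higher in the queue
```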
Edit a GPR
- For a queued GPR, under Actions, expand the vertical ellipsis menu, and click Edit.
- After editing the values, click Update.
Early Release a Provisioned GPR
If a GPR is provisioned, you can release it early. Early releasing a GPR removes the associated GPU nodes from the workspace.
This is useful when you want to free up GPU resources for a higher-priority GPR or when the provisioned GPR is no longer needed. You can early release a GPR only if it is in the Provisioned state. You can also use this workflow for other admin operations or when GPU resources are underutilized.
After the GPR is early released, the GPU nodes are no longer available to any AI workloads running on the workspace. Any running workloads (pods and so on) that use GPUs on the node go into a Pending state.
To early release a provisioned GPR:
- On the GPU Requests page, under Actions, expand the vertical ellipsis, and click Early Release from the menu.
- On the confirmation dialog, enter RELEASE and click Release GPR.
View Cost Analysis
From a GPR, you can directly view cost analysis.
To view cost analysis:
- On the GPU Requests page, go to the All GPU Requests tab and select a workspace or a cluster.
- Under Actions, expand the vertical ellipsis, and click View Cost Analysis from the menu.
- You are redirected to the Cost Analysis tab of the Dashboard page.
GPR Eviction
You can early release provisioned GPRs to make the required nodes available for the high-priority GPR at the top of the queue.
You can see a list of GPRs that need to be evicted to provision the top GPR, and you can manually early release those GPRs to make room for it.
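Conceptually, eviction frees just enough nodes for the top GPR by releasing the lowest-priority provisioned GPRs first. The sketch below shows one way such a candidate list could be computed; the tuple data model and the greedy strategy are assumptions for illustration, not the EGS algorithm.

```python
def eviction_candidates(provisioned, nodes_needed, below_priority):
    """Greedily pick lowest-priority provisioned GPRs to free enough nodes.

    `provisioned` is a list of (name, priority, node_count) tuples.
    Only GPRs with a priority below the incoming GPR are eligible;
    both the data model and the strategy are illustrative assumptions.
    """
    eligible = [g for g in provisioned if g[1] < below_priority]
    candidates, freed = [], 0
    for name, priority, nodes in sorted(eligible, key=lambda g: g[1]):
        if freed >= nodes_needed:
            break
        candidates.append(name)
        freed += nodes
    # Return None if eviction alone cannot free enough nodes.
    return candidates if freed >= nodes_needed else None

provisioned = [("etl", 20, 2), ("dev-notebook", 90, 1), ("prod-train", 280, 4)]
# A high-priority GPR (priority 250) needs 3 nodes:
print(eviction_candidates(provisioned, nodes_needed=3, below_priority=250))
# -> ['etl', 'dev-notebook']
```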
Delete a GPR
As an admin, you can delete a queued GPR.
To delete a queued GPR:
- Go to GPU Requests on the left sidebar.
- Identify the GPR that is Queued.
- Under the Actions column of that GPR, click the x mark to delete it, or expand the vertical ellipsis and choose Delete from the Actions menu.
- On the confirmation dialog, enter DELETE and click Delete GPR.