Version: 1.15.0

Manage GPU Requests

This topic describes how to create a GPU Provision Request (GPR), manage a GPR, and early-release GPU nodes.

GPUs are not assigned to a workspace by default. Use the portal to create a GPU provision request so that AI workloads in the namespaces associated with the workspace can use one or more GPUs.

The following are the GPU Provision Request (GPR) features:

  • Users can create one or more GPU provision requests as needed.
  • Only one GPR can be provisioned to a workspace at any given time.
  • Each GPR includes defined entry and exit times for GPU nodes from a workspace.
  • GPU nodes are isolated per workspace to ensure dedicated access.
  • GPUs assigned to a workspace cannot be accessed by other users or workspaces.
  • Users can manage GPRs through the portal and have visibility into the wait time for GPUs.
  • Users can delete or edit a GPR before it is provisioned.
  • If GPUs are no longer needed, users can early-release their GPR to free up resources.

Create a GPU Request

To create a GPU provision request (GPR), you can use the available GPR templates or manually configure a GPU.

To create a GPU request:

  1. On the GPU Requests page, go to the GPU Requests Per Workspace tab.

  2. In the workspace list, click the workspace for which you want to create a GPR.

  3. On the GPU Requests page, click the Create GPU Request button on the top-right corner.

  4. On the Create GPU Request pane, enter the following information to configure GPU request:

    1. For GPU Configuration:

      1. Enter a request name in the GPU Request Name text box.

      2. Choose how GPU Shape and Node Type are selected:

        • Manual Selection:
          • Leave the Auto Select GPU checkbox unselected.
          • Select a memory value from the Memory (GB) per GPU drop-down list. The list displays the available memory (GB) per cluster.
          • Select a GPU shape from the GPU Shape drop-down list.
          • Select a node type from the Node Type drop-down list.
        • Auto Selection:
          • Select the Auto Select GPU checkbox. The GPU shape and node type values are automatically selected based on the memory specified.
          • If you manually enter GPU memory in the Memory (GB) per GPU text box, the Auto Select GPU checkbox is selected automatically. The minimum memory value is 1 GB, and the maximum value is the highest allocated memory available for any worker cluster in the workspace.
      3. Set the GPUs Per Node if you want to change its default value, 1.

      4. Set the GPU Nodes if you want to change its default value, 1.

    2. For Cluster Selection:

      • Select a cluster from the Available Clusters list for the GPU request.
      • To let all available clusters be considered automatically, select the Auto Select Cluster checkbox. By default, this checkbox is unselected.
    3. For priority configuration of the requesting GPU, enter the following information:

      1. The GPU User defaults to the name of the workspace user.

      2. The Priority is assigned automatically and cannot be modified by the user. The default value is Medium (51-150).

        note

        The admin sets the priority for each workspace. As a user, you cannot modify the priority of a GPU request. If you see a long wait time and want your request processed earlier, ask your admin to promote the GPR to a higher priority. The priority is assigned based on the following ranges:

        • High (1-50)
        • Medium (51-150)
        • Low (151-200)
      3. Specify the Reserve For duration in Days (ddd), Hours (hh), and Minutes (mm).

  5. For advanced configuration, enter the following information:

    1. (Optional) Set the Idle Timeout to release GPU nodes after they have been idle for the configured duration. This allows other GPRs to use unused provisioned GPU nodes. The idle timeout duration must always be less than the reservation duration.

    2. (Optional) Enforce Idle Timeout: After the idle timeout is set, the Enforce Idle Timeout checkbox is enabled. Select this checkbox to enforce the idle timeout setting.

    3. (Optional) Select the Requeue on Failure checkbox if you want this GPR to be requeued if it fails.

    4. Only the admin can select Evict low priority GPRs to configure automatic eviction of low-priority GPRs.


  6. Click the Get Wait Time button. This button is enabled only after you fill in the required fields. Clicking Get Wait Time shows the estimated wait time for GPU node provisioning.


    In the Requested GPUs section, under Available GPUs, you can manually select the clusters for GPU provisioning if you have not selected the Auto Select GPU checkbox.


  7. Click the Request GPUs button.

    The GPR is created and queued for provisioning. The GPR is processed based on the priority of the request and the availability of GPU nodes in the cluster.
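The constraints called out in the steps above (the 1 GB minimum memory, the priority bands from the note, and the rule that the idle timeout must be less than the reservation) can be sketched as a simple validation routine. This is an illustration only, not EGS code; all function and parameter names are hypothetical.

```python
# Illustrative sketch (not EGS code) of the GPR configuration rules
# described in the procedure above. All names here are hypothetical.

def priority_band(priority):
    """Map a numeric priority to its band: High (1-50),
    Medium (51-150), or Low (151-200)."""
    if 1 <= priority <= 50:
        return "High"
    if 51 <= priority <= 150:
        return "Medium"
    if 151 <= priority <= 200:
        return "Low"
    raise ValueError("priority must be between 1 and 200")

def validate_gpr(memory_gb, max_cluster_memory_gb,
                 reserve_minutes, idle_timeout_minutes=None):
    """Check the constraints the portal enforces on a GPU request."""
    # Minimum memory is 1 GB; the maximum is the highest memory
    # available on any worker cluster in the workspace.
    if not 1 <= memory_gb <= max_cluster_memory_gb:
        raise ValueError("Memory (GB) per GPU is out of range")
    # The idle timeout must always be less than the reservation duration.
    if idle_timeout_minutes is not None and idle_timeout_minutes >= reserve_minutes:
        raise ValueError("idle timeout must be less than the Reserve For duration")
```

The portal performs these checks for you; the sketch only makes the stated bounds explicit.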

info
  • The status of the GPR changes to Queued when the GPU node allocation is queued.
  • The status of the GPR changes to Running when the queued GPR is provisioned successfully.
  • The status of the GPR changes to Released Early when the GPR is released earlier than the scheduled time.
  • The status of the GPR changes to Completed when the GPR completes its scheduled time.
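The statuses above can be summarized as a small lifecycle sketch. This is illustrative only; the names are the portal's display statuses, and the code is not an EGS API.

```python
# Illustrative sketch of the GPR status lifecycle described above.
# These are portal display statuses, not an EGS API.

GPR_TRANSITIONS = {
    "Queued": ["Running"],         # provisioned when GPU nodes become available
    "Running": ["Released Early",  # the user early-releases the GPR
                "Completed"],      # the reservation reaches its scheduled end
    "Released Early": [],          # terminal state
    "Completed": [],               # terminal state
}

def can_transition(current, target):
    """Return True if a GPR may move from `current` to `target`."""
    return target in GPR_TRANSITIONS.get(current, [])
```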

View GPU Requests

The user can manage the GPRs in their workspaces.

To view a GPU request:

  1. On the GPU Requests page, select the workspace whose GPU requests you want to view.


  2. For the selected workspace, select a GPR to view its details.


Create a GPR from a Template

To create a GPU request from a template:

  1. On the GPU Requests page, go to the GPU Requests Per Workspace tab.

  2. In the workspace list, click the workspace for which you want to create a GPR.

  3. On the GPU Requests page, click the Create GPU Request button on the top-right corner.

  4. On the Create GPU Request pane, click Select Template on the top-right corner to select the available GPR template (GPR configuration) to create a GPU request.


  5. Select the available template and click Apply Template.


  6. The template is applied. Review the template settings and make any necessary adjustments.

  7. Click Get Wait Time. Clicking Get Wait Time automatically switches to the Request GPU tab.

    EGS shows the estimated wait time for GPU node provisioning.

  8. In the Available GPUs table, select the GPU with an acceptable estimated wait time.

  9. Click Request GPUs.

  10. View the GPR in that workspace's GPU Requests queue or on the main GPU Requests landing page.

info

To view the templates assigned to a workspace, see View GPR Templates. If the available templates do not have the required configuration, you can manually configure a GPU.

Manage GPR Queues

The user can manage the GPRs in their workspace's GPR queue.

The following operations can be performed:

  • The user can delete a pending GPR. This will remove the GPR from the queue.
  • The user can early-release a provisioned GPR. This will end the GPR early (early exit of GPU nodes).
  • The user can edit a pending GPR.
  • The user can extend a GPR with a small grace period.
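As a rough sketch, the operations above map onto GPR states as follows. This is illustrative only; the action and state names are hypothetical labels for the portal behavior described above, and the portal, not this code, enforces the rules.

```python
# Illustrative mapping of queue operations to GPR states, based on the
# list above: pending GPRs can be deleted or edited; provisioned GPRs
# can be early-released or extended with a small grace period.
# Names are hypothetical, not an EGS API.

ALLOWED_ACTIONS = {
    "Queued": {"delete", "edit"},            # pending GPR in the queue
    "Running": {"early-release", "extend"},  # provisioned GPR
}

def allowed(status):
    """Return the set of queue operations permitted for a GPR status."""
    return ALLOWED_ACTIONS.get(status, set())
```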

Edit GPU Requests

To edit the GPU request:

  1. On the GPU Requests page, select the GPU request you want to edit.

  2. On the top-right, click the Actions button and select Edit. You can edit only the GPU request name.


  3. Edit the request name and click the Update button.

Early Release the GPU Nodes

If you want to release the GPU nodes associated with the workspace for any reason, you can early-release the GPR. From the portal, you can perform an early release of a provisioned GPR.

To early release the provisioned GPU nodes:

  1. On the GPU Requests page, select the request you want to release.

  2. On the top-right, click the Actions button and select Early Release.

  3. Enter RELEASE to confirm the early release of the nodes.

warning

After the GPR is early-released, the GPU nodes are no longer available for any AI workloads running on the workspace. Any running workloads (pods, and so on) that use GPUs on those nodes go into a pending state.