GPR APIs
This topic describes the SDK APIs for managing GPU requests (GPRs).
Create a GPU Request
Use this API to request GPU nodes that are provisioned according to the specifications you provide.
Syntax
egs.request_gpu(request_name, workspace_name, cluster_name, node_count, gpu_per_node_count, memory_per_gpu, instance_type, gpu_shape, exit_duration, priority, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_name | String | The name of the GPU request to be shown on the EGS dashboard. | Mandatory |
workspace_name | String | The name of the workspace for which the GPU is requested. | Mandatory |
cluster_name | String | The name of the cluster to which the GPU should be assigned. | Mandatory |
node_count | Integer | The number of nodes required to run the workload. | Mandatory |
gpu_per_node_count | Integer | The number of GPUs required per node to run the workload. | Mandatory |
memory_per_gpu | Integer | The memory requirement in GB per GPU. | Mandatory |
instance_type | String | The type of instance requested. | Mandatory |
gpu_shape | String | The name of the GPU type, as listed in the Inventory details. | Mandatory |
exit_duration | String | The duration for which the GPU is requested, in the format 0d0h0m (days, hours, minutes). | Mandatory |
priority | Integer | The priority of the request in the queue. A higher number moves the GPR higher in the queue (low: 1-100, medium: 101-200, high: 201-300). | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
String | The unique GPU request ID. Use this ID in subsequent GPU-related operations. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
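The exit_duration argument in the example above is a hand-written string. As a minimal sketch, the helpers below build and check a value in the documented 0d0h0m format. The names format_exit_duration and is_valid_exit_duration are hypothetical and not part of the EGS SDK, and rejecting an all-zero duration is an assumption of this sketch rather than documented SDK behavior.

```python
import re

# Pattern for the documented 0d0h0m layout: days, hours, minutes.
_EXIT_DURATION_RE = re.compile(r"^(\d+)d(\d+)h(\d+)m$")


def format_exit_duration(days=0, hours=0, minutes=0):
    """Build an exit_duration string such as "0d6h0m".

    Hypothetical helper, not part of the EGS SDK.
    """
    parts = (days, hours, minutes)
    if any(not isinstance(p, int) or p < 0 for p in parts):
        raise ValueError("days, hours and minutes must be non-negative integers")
    if sum(parts) == 0:
        # Assumption: a zero-length request is never what the caller wants.
        raise ValueError("exit duration must be greater than zero")
    return f"{days}d{hours}h{minutes}m"


def is_valid_exit_duration(value):
    """Return True if value matches the 0d0h0m layout."""
    return bool(_EXIT_DURATION_RE.match(value))
```

For instance, format_exit_duration(hours=6) produces the same "0d6h0m" value that the example above passes to egs.request_gpu.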
Cancel GPU Request
Use this API to cancel a previously created GPU request. The operation succeeds only if the GPU has not been provisioned yet.
Syntax
egs.cancel_gpu_request(request_id, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_id | String | The unique GPU request ID of the GPR that you want to cancel. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
void | There is no response object. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
exceptions.GpuAlreadyProvisioned | Raised when the GPU request has already succeeded and the node is provisioned. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
egs.cancel_gpu_request(gpu_request_id, auth)
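Because a request can be provisioned between the time it is created and the time it is cancelled, callers may want to handle GpuAlreadyProvisioned explicitly. The following is a minimal sketch. cancel_if_still_queued is a hypothetical helper, and it receives the cancel function and the exception class as arguments so that it does not depend on where the SDK exposes them.

```python
def cancel_if_still_queued(cancel, request_id, already_provisioned, session=None):
    """Try to cancel a GPU request.

    cancel: a callable with the shape of egs.cancel_gpu_request.
    already_provisioned: the exception class raised when the GPU is provisioned.
    Returns True if the request was cancelled, False if it was already provisioned.
    Hypothetical helper, not part of the EGS SDK.
    """
    try:
        cancel(request_id, session)
    except already_provisioned:
        return False
    return True


# Usage sketch. Assumption: the exception class is reachable as
# egs.exceptions.GpuAlreadyProvisioned; adjust to match your SDK version.
#
#   cancelled = cancel_if_still_queued(
#       egs.cancel_gpu_request, gpu_request_id,
#       egs.exceptions.GpuAlreadyProvisioned, auth)
#   if not cancelled:
#       egs.release_gpu(gpu_request_id, auth)
```

Falling back to egs.release_gpu when the cancel fails is one possible policy; whether it is appropriate depends on the workload.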
Update GPU Request
Use this API to update the priority of a GPU request that is still in the queue. Changing the priority affects how soon the GPU node is provisioned: the higher the value, the higher the priority.
Syntax
egs.update_gpu_request_priority(request_id, new_priority, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_id | String | The unique GPU request ID. | Mandatory |
new_priority | Integer | The new priority to assign to the GPU request. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
void | There is no response object. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
exceptions.GpuAlreadyProvisioned | Raised when the GPU request has already succeeded and the node is provisioned. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
egs.update_gpu_request_priority(gpu_request_id, 300, auth)
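The priority bands documented for the priority parameter (low: 1-100, medium: 101-200, high: 201-300) can be checked before a value is sent. The sketch below maps a number to its band. priority_band is a hypothetical helper, not an SDK function, and raising ValueError for values outside 1-300 is this sketch's choice; the SDK's own behavior for such values is not described here.

```python
# Inclusive priority bands as documented for the priority parameter.
PRIORITY_BANDS = (
    ("low", 1, 100),
    ("medium", 101, 200),
    ("high", 201, 300),
)


def priority_band(priority):
    """Return "low", "medium" or "high" for a priority value.

    Hypothetical helper, not part of the EGS SDK.
    """
    for name, lowest, highest in PRIORITY_BANDS:
        if lowest <= priority <= highest:
            return name
    raise ValueError(f"priority {priority} is outside the documented range 1-300")
```

Calling priority_band on a candidate value before egs.update_gpu_request_priority is a cheap way to catch an out-of-range number early.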
Update GPU Request Name
Use this API to update the name of the GPU request displayed on the EGS dashboard. The name can be updated only while the request is still in the queue. After the node is provisioned, the GPR name cannot be changed.
Syntax
egs.update_gpu_request_name(request_id, new_name, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_id | String | The unique GPU request ID. | Mandatory |
new_name | String | The new name to assign to the GPU request. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
void | There is no response object. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
exceptions.GpuAlreadyProvisioned | Raised when the GPU request has already succeeded and the node is provisioned. |
Example
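import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
egs.update_gpu_request_name(gpu_request_id, "gpt-4o-workload-renamed", auth)  # the new name is illustrative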
Release GPU
GPUs are released after their exit duration has elapsed. However, there can be instances where the task running on the GPU completes earlier than anticipated. In such situations, you can request an early release so that the GPU can be used for other workloads.
Use this API to release GPUs early.
Syntax
egs.release_gpu(request_id, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_id | String | The unique GPU request ID. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
void | There is no response object. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
exceptions.GpuAlreadyReleased | Raised when the GPU has already been released. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
egs.release_gpu(gpu_request_id, auth)
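To judge whether an early release is worthwhile, it can help to know when the GPU would otherwise be released automatically. The sketch below parses the documented 0d0h0m exit_duration format and adds it to a start time. parse_exit_duration and automatic_release_time are hypothetical helpers, not SDK functions, and the sketch assumes that the exit duration is counted from the moment the GPU is allocated.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_exit_duration(value):
    """Convert an exit_duration string such as "0d6h0m" to a timedelta.

    Hypothetical helper, not part of the EGS SDK.
    """
    match = re.fullmatch(r"(\d+)d(\d+)h(\d+)m", value)
    if match is None:
        raise ValueError(f"{value!r} is not in the 0d0h0m format")
    days, hours, minutes = (int(group) for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


def automatic_release_time(started_at, exit_duration):
    """Return the time at which the GPU would be released automatically.

    Assumption: the duration starts counting at started_at (allocation time).
    """
    return started_at + parse_exit_duration(exit_duration)


# Example with a fixed, made-up start time in UTC.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
release_at = automatic_release_time(start, "0d6h0m")
```

If the workload finishes well before release_at, calling egs.release_gpu frees the GPU for other requests in the queue.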
GPU Request Status
Use this API to get the status of a GPU request.
Syntax
egs.gpu_request_status(request_id, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
request_id | String | The unique GPU request ID. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
class GpuRequestData | The GPU request including the internal states and status. |
Class GpuRequestData
Parameter | Type | Description |
---|---|---|
request_id | String | The GPU request ID. |
workspace_name | String | The name of the workspace for which the GPU was requested. |
cluster_name | String | The name of the cluster for which the GPU was requested. |
number_of_gpus | Integer | The number of GPUs requested. |
number_of_gpu_nodes | Integer | The number of GPU nodes requested. |
instance_type | String | The instance type of the node. |
memory_per_gpu | Integer | The memory requested per GPU. |
priority | Integer | The priority of the GPU request. |
gpu_sharing_mode | String | The sharing mode of the GPU. |
estimated_start_time | String | The estimated time at which the GPU is expected to be provisioned. |
estimated_wait_time | String | The estimated wait time before the GPU is provisioned. |
exit_duration | String | The duration of the GPU request, after which the GPU is automatically released. |
early_release | Boolean | Indicates whether the GPU was released early. |
gpr_name | String | The name of the GPU request. |
gpu_shape | String | The shape of the GPU. |
multi_node | Boolean | Indicates whether the request was multi-node. |
dedicated_nodes | Boolean | Indicates whether the request was for dedicated nodes. |
enable_rdma | Boolean | Indicates whether RDMA was enabled. |
enable_secondary_network | Boolean | Indicates whether the secondary network was enabled on the node. |
status | class GpuRequestStatus | The status of the GPU request. |
Class GpuRequestStatus
Parameter | Type | Description |
---|---|---|
provisioning_status | String | The provisioning status of the GPU request. |
failure_reason | String | The reason the GPU provisioning failed. |
num_gpus_allocated | Integer | The actual number of GPUs allocated under this GPU request. |
start_timestamp | String | The timestamp at which the GPU was allocated. |
completion_timestamp | String | The timestamp at which the GPU was released. |
cost | String | The cost accumulated due to GPU provisioning. |
nodes | List[String] | The nodes to which the GPU was attached. |
internal_state | String | The state of the GPU. |
retry_count | Integer | The number of times the GPU provisioning was retried. |
delayed_count | Integer | The number of times the GPU provisioning was delayed. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
gpu_request = egs.gpu_request_status(gpu_request_id, auth)
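A single status call returns a snapshot, so a caller that needs to wait for provisioning has to poll. The following is a minimal polling sketch under stated assumptions: wait_for_provisioning is a hypothetical helper, the status getter is passed in as a callable, and the set of terminal provisioning_status values must be supplied by the caller because the possible values are not listed in this topic.

```python
import time


def wait_for_provisioning(get_status, request_id, terminal_statuses,
                          interval_seconds=30.0, timeout_seconds=3600.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll a GPU request until its provisioning status is terminal.

    get_status: a callable that takes a request ID and returns an object
        shaped like GpuRequestData (it must expose status.provisioning_status).
    terminal_statuses: status strings that end the wait (caller-supplied).
    Raises TimeoutError if no terminal status is seen before the deadline.
    Hypothetical helper, not part of the EGS SDK.
    """
    deadline = clock() + timeout_seconds
    while True:
        request = get_status(request_id)
        if request.status.provisioning_status in terminal_statuses:
            return request
        if clock() >= deadline:
            raise TimeoutError(f"GPU request {request_id} did not reach a terminal status")
        sleep(interval_seconds)


# Usage sketch. The status strings are placeholders, not documented values:
#
#   request = wait_for_provisioning(
#       lambda rid: egs.gpu_request_status(rid, auth),
#       gpu_request_id, {"<provisioned>", "<failed>"})
```

The sleep and clock parameters exist only so the loop can be exercised without real delays; in normal use the defaults are sufficient.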
GPU Request Status by Workspace Name
Use this API to get the status of the GPU requests that were made for a given workspace.
Syntax
egs.gpu_request_status_for_workspace(workspace_name, authenticated_session=None)
Parameters
Parameter | Parameter Type | Description | Required |
---|---|---|---|
workspace_name | String | The name of the workspace for which you want to get the GPR status. | Mandatory |
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional |
Response Returned
Returns | Description |
---|---|
List of class GpuRequestData | The list of GPU requests including the internal states and status. |
Exceptions Raised
Raises | Description |
---|---|
exceptions.Unauthorized | Raised when the API key is not permitted to perform the operation. |
Example
import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)
gpu_requests = egs.gpu_request_status_for_workspace(workspace_name, auth)
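The returned list can be summarized with plain Python. The sketch below counts requests per provisioning status and lists request names from highest to lowest priority. It reads only the documented GpuRequestData fields status.provisioning_status, priority, and gpr_name, and it assumes they are available as attributes. summarize_requests is a hypothetical helper, not an SDK function.

```python
from collections import Counter


def summarize_requests(requests):
    """Summarize a list of GpuRequestData-like objects.

    Returns a (counts, names) tuple: counts maps each provisioning status
    to the number of requests in it, and names lists gpr_name values from
    highest to lowest priority.
    Hypothetical helper, not part of the EGS SDK.
    """
    counts = Counter(r.status.provisioning_status for r in requests)
    by_priority = sorted(requests, key=lambda r: r.priority, reverse=True)
    return counts, [r.gpr_name for r in by_priority]
```

With the example above, summarize_requests(gpu_requests) would give a quick view of how many requests in the workspace are in each provisioning status.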