Version: 1.10.0

GPR APIs

This topic describes the SDK APIs to manage GPU requests.

Create a GPU Request

Use this API to request the provisioning of GPU nodes that match the given specifications.

Syntax

egs.request_gpu(request_name, workspace_name, cluster_name, node_count, gpu_per_node_count, memory_per_gpu, instance_type, gpu_shape, exit_duration, priority, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_name | String | The name of the GPU request to be shown on the EGS dashboard. | Mandatory
workspace_name | String | The name of the workspace for which the GPU is requested. | Mandatory
cluster_name | String | The name of the cluster to which the GPU should be assigned. | Mandatory
node_count | Integer | The number of nodes required to run the workload. | Mandatory
gpu_per_node_count | Integer | The number of GPUs required per node to run the workload. | Mandatory
memory_per_gpu | Integer | The memory requirement in GB per GPU. | Mandatory
instance_type | String | The instance type requested. | Mandatory
gpu_shape | String | The name of the GPU type, which you can get from the Inventory details. | Mandatory
exit_duration | String | The duration for which the GPU is requested, in the format 0d0h0m. | Mandatory
priority | Integer | The priority of the request, which determines its position in the queue. A higher number moves the GPR higher in the queue (low: 1-100, medium: 101-200, high: 201-300). | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is set, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional
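The exit_duration string follows the 0d0h0m pattern described above. As a minimal sketch, the hypothetical helper below (not part of the EGS SDK) builds that string from a standard timedelta. It assumes whole minutes and a normalized days/hours/minutes split; whether the controller also accepts non-normalized values such as 0d30h0m is not stated on this page.

```python
from datetime import timedelta


def format_exit_duration(duration: timedelta) -> str:
    """Render a timedelta as a `0d0h0m` string (hypothetical helper, not an SDK call)."""
    total_minutes = int(duration.total_seconds() // 60)
    if total_minutes < 0:
        raise ValueError("exit duration cannot be negative")
    # Split the total into days, then hours and minutes within the last day.
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d{hours}h{minutes}m"


# Six hours, the duration used in the example on this page.
print(format_exit_duration(timedelta(hours=6)))  # 0d6h0m
```

The result can be passed as the exit_duration argument of egs.request_gpu.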

Response Returned

Returns | Description
String | The unique GPU request ID. Use this ID in subsequent GPU-related operations.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.

Example

import egs

# Authenticate against the EGS Controller endpoint with an API key.
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
# Fetch the inventory to look up an instance type and GPU shape.
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

# Request 1 node with 1 GPU for 6 hours at priority 201.
gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

Cancel GPU Request

Use this API to cancel a previously created GPU request. The operation succeeds only if the GPU has not been provisioned yet.

Syntax

egs.cancel_gpu_request(request_id, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_id | String | The unique GPU request ID of the GPR that you want to cancel. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
void | There is no response object.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.
exceptions.GpuAlreadyProvisioned | This exception is raised when the GPU request has already succeeded and the node is provisioned.

Example

import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

egs.cancel_gpu_request(gpu_request_id, auth)

Update GPU Request

Use this API to update the priority of a GPU request that is in the queue. Changing the priority affects how soon the GPU node is provisioned: the higher the value, the higher the priority.
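The priority parameter of egs.request_gpu documents three bands (low: 1-100, medium: 101-200, high: 201-300). As a rough sketch, the hypothetical helper below maps a value to its band so that a new priority can be checked on the client before it is sent. It treats values outside 1-300 as errors, which is an assumption: this page does not say how the controller handles them.

```python
def priority_band(priority: int) -> str:
    """Return the documented band for a priority value (hypothetical helper)."""
    if 1 <= priority <= 100:
        return "low"
    if 101 <= priority <= 200:
        return "medium"
    if 201 <= priority <= 300:
        return "high"
    # Assumption: anything outside the documented bands is rejected client-side.
    raise ValueError(f"priority {priority} is outside the documented 1-300 range")


print(priority_band(201))  # high
```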

Syntax

egs.update_gpu_request_priority(request_id, new_prority, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_id | String | The unique GPU request ID. | Mandatory
new_prority | Integer | The new priority to set for the GPU request. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
void | There is no response object.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.
exceptions.GpuAlreadyProvisioned | This exception is raised when the GPU request has already succeeded and the node is provisioned.

Example

import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

# Raise the priority to the top of the documented high band (201-300).
egs.update_gpu_request_priority(gpu_request_id, 300, auth)

Update GPU Request Name

Use this API to update the name of the GPU request displayed on the EGS Dashboard. The name can be updated only while the request is still in the queue. After the node is provisioned, you cannot change the GPR name.

Syntax

egs.update_gpu_request_name(request_id, new_name, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_id | String | The unique GPU request ID. | Mandatory
new_name | String | The new name to set for the GPU request. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
void | There is no response object.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.
exceptions.GpuAlreadyProvisioned | This exception is raised when the GPU request has already succeeded and the node is provisioned.

Example

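import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

egs.update_gpu_request_name(gpu_request_id, "gpt-4o-workload-renamed", auth)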

Release GPU

GPUs are released after their exit duration has elapsed. However, a task running on the GPU can finish earlier than anticipated. In such situations, you can request an early release so that the GPU can be used for other workloads.

Use this API to release GPUs early.

Syntax

egs.release_gpu(request_id, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_id | String | The unique GPU request ID. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
void | There is no response object.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.
exceptions.GpuAlreadyReleased | This exception is raised when the GPU has already been released.

Example

import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

egs.release_gpu(gpu_request_id, auth)
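Cancelling only works before provisioning and releasing only applies afterwards, so a caller that wants to stop a request in either state can try one and fall back to the other. The sketch below is a hypothetical wrapper, not an SDK function. The cancel and release callables and the exception class are passed in so that the snippet runs without the SDK installed.

```python
def stop_gpu_request(request_id, cancel, release, already_provisioned_error, session=None):
    """Cancel a queued request, or release it early if it is already provisioned.

    Returns "cancelled" or "released" to show which path was taken.
    """
    try:
        cancel(request_id, session)
        return "cancelled"
    except already_provisioned_error:
        # The node is already provisioned, so cancelling is no longer possible.
        release(request_id, session)
        return "released"
```

With the SDK, the call might look like stop_gpu_request(gpu_request_id, egs.cancel_gpu_request, egs.release_gpu, egs.exceptions.GpuAlreadyProvisioned, auth). The egs.exceptions import path is an assumption; this page only names the class as exceptions.GpuAlreadyProvisioned.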

GPU Request Status

Use this API to get the status of a GPU request.

Syntax

egs.gpu_request_status(request_id, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
request_id | String | The unique GPU request ID. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
class GpuRequestData | The GPU request, including the internal states and status.

Class GpuRequestData

Parameter | Type | Description
request_id | String | The GPU request ID.
workspace_name | String | The name of the workspace for which the GPU was requested.
cluster_name | String | The name of the cluster for which the GPU was requested.
number_of_gpus | Integer | The number of GPUs requested.
number_of_gpu_nodes | Integer | The number of GPU nodes requested.
instance_type | String | The instance type of the node.
memory_per_gpu | Integer | The memory requested per GPU.
priority | Integer | The priority of the GPU request.
gpu_sharing_mode | String | The sharing mode of the GPU.
estimated_start_time | String | The estimated time at which the GPU is expected to be provisioned.
estimated_wait_time | String | The estimated wait time before the GPU is provisioned.
exit_duration | String | The duration of the GPU request, after which the GPU is automatically released.
early_release | Boolean | Indicates whether the GPU was released early.
gpr_name | String | The name of the GPU request.
gpu_shape | String | The shape of the GPU.
multi_node | Boolean | Indicates whether the request was multi-node.
dedicated_nodes | Boolean | Indicates whether the request was for dedicated nodes.
enable_rdma | Boolean | Indicates whether RDMA was enabled.
enable_secondary_network | Boolean | Indicates whether the secondary network was enabled on the node.
status | class GpuRequestStatus | The status of the GPU request.

Class GpuRequestStatus

Parameter | Type | Description
provisioning_status | String | The provisioning status of the GPU request.
failure_reason | String | The reason the GPU provisioning failed.
num_gpus_allocated | Integer | The actual number of GPUs allocated under this GPU request.
start_timestamp | String | The timestamp at which the GPU was allocated.
completion_timestamp | String | The timestamp at which the GPU was released.
cost | String | The cost accumulated due to GPU provisioning.
nodes | List[String] | The nodes to which the GPU was attached.
internal_state | String | The state of the GPU.
retry_count | Integer | The number of times the GPU provisioning was retried.
delayed_count | Integer | The number of times the GPU provisioning was delayed.
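Because provisioning_status changes over time, a caller usually polls the status until it reaches a state of interest. The sketch below shows one way to do that, assuming the returned object exposes status.provisioning_status as listed above. The helper name is hypothetical, and the set of terminal status values is supplied by the caller because this page does not list the possible values.

```python
import time


def wait_for_status(fetch_status, request_id, terminal_statuses,
                    interval_s=30.0, max_polls=20, sleep=time.sleep):
    """Poll until provisioning_status is in terminal_statuses or polls run out."""
    for attempt in range(max_polls):
        data = fetch_status(request_id)
        if data.status.provisioning_status in terminal_statuses:
            return data
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(
        f"request {request_id} did not reach any of {sorted(terminal_statuses)}"
    )
```

With the SDK, fetch_status could be lambda rid: egs.gpu_request_status(rid, auth), and terminal_statuses would hold whichever provisioning_status values your deployment reports for finished or failed requests.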

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.

Example

import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

gpu_request = egs.gpu_request_status(gpu_request_id, auth)

GPU Request Status by Workspace Name

Use this API to get the status of the GPU requests that were made for a given workspace.

Syntax

egs.gpu_request_status_for_workspace(workspace_name, authenticated_session=None)

Parameters

Parameter | Parameter Type | Description | Required
workspace_name | String | The name of the workspace for which you want to get the GPR status. | Mandatory
authenticated_session | [AuthenticatedSession] | The authenticated session with the EGS Controller. The default value is None. If no authenticated session is provided, the SDK tries to use the SDK default. If no SDK default is found, an exception is raised. | Optional

Response Returned

Returns | Description
List of class GpuRequestData | The list of GPU requests, including the internal states and status.

Exceptions Raised

Raises | Description
exceptions.Unauthorized | This exception is raised when the API key is not allowed to perform the operation.

Example

import egs
auth = egs.authenticate("https://egs-core-apis.example.com", "5067bd55-1aef-4c84-8987-3e966e917f07")
inventory = egs.inventory(None, auth)
workspace_name = egs.create_workspace("llm-gpt-4o", ["worker-1"], ["llm-gpt-40-dev"], "John Doe", "john.doe@avesha.io", auth)

gpu_request_id = egs.request_gpu("gpt-4o-workload", workspace_name, "worker-1", 1, 1, 81920, inventory[0].instance_type, inventory[0].gpu_shape, "0d6h0m", 201, auth)

gpu_requests = egs.gpu_request_status_for_workspace(workspace_name, auth)
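The returned list can be summarized with ordinary Python. As a small sketch, the hypothetical helpers below count requests per provisioning status and order them by priority, assuming each item exposes the priority and status.provisioning_status fields listed for GpuRequestData. The ordering is by the priority value only and may not match the actual queue order on the controller.

```python
from collections import Counter


def count_by_provisioning_status(gpu_requests):
    """Count GPU requests per provisioning_status value."""
    return Counter(req.status.provisioning_status for req in gpu_requests)


def sort_by_priority(gpu_requests):
    """Return the requests ordered from highest to lowest priority."""
    return sorted(gpu_requests, key=lambda req: req.priority, reverse=True)
```

For example, count_by_provisioning_status(gpu_requests) gives a quick view of how many requests in the workspace are in each state.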