GPR
This topic describes how to create and release a GPR using Python scripts. A workspace user issues a GPR to request GPU resource allocation for a period of time. If you want to release the GPR before its allocated time ends, for example because the job finished sooner than expected, you can release it early using the script.
Prerequisites
- A Kubernetes cluster with EGS installed. For more information, see Install EGS.
- Install the EGS SDK package in a Linux environment that has access to the Internet.
- Install the kubectl tool.
Install the EGS-SDK Package
Before installing the package, ensure that you have Python version 3.7 or later installed on your system.
To install the egs-sdk package, use the following command:
pip install git+https://github.com/kubeslice-ent/egs-sdk.git
To install the egs-sdk package on a specific branch or version, use the following command:
pip install git+https://github.com/kubeslice-ent/egs-sdk.git@<branch_or_tag_name>
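The Python version requirement above can be checked before running pip. The following is a minimal sketch that uses only the standard library; the function name meets_minimum is hypothetical and is not part of the egs-sdk package.

```python
import sys

# Minimum Python version stated in this topic.
MIN_VERSION = (3, 7)


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (major, minor, ...) version is at least `minimum`.

    When no version is passed, the running interpreter's version is used.
    """
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum():
        print("Python version is sufficient for egs-sdk")
    else:
        print("Python 3.7 or later is required")
```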
Set Environment Variables
Expose EGS_ENDPOINT and EGS_API_KEY as environment variables:
- Use the following command to get EGS_ENDPOINT:
  kubectl get svc -n kubeslice-controller
  Copy the external IP of the egs-core-apis service and append :8080 to it. For example, the endpoint will be http://<EXTERNAL-IP>:8080.
- EGS_API_KEY can be obtained from the EGS UI.
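As an illustration of the endpoint format described above, the following sketch builds the EGS_ENDPOINT value from an external IP and reports which of the two variables are still unset. The helper names build_endpoint and missing_env_vars are hypothetical, and the IP address in the usage lines is a placeholder, not a real service address.

```python
import os


def build_endpoint(external_ip, port=8080):
    """Return the endpoint in the form http://<EXTERNAL-IP>:8080."""
    return f"http://{external_ip}:{port}"


def missing_env_vars(env=None, required=("EGS_ENDPOINT", "EGS_API_KEY")):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Placeholder IP; replace it with the external IP of the egs-core-apis service.
    print(build_endpoint("203.0.113.10"))
    missing = missing_env_vars()
    if missing:
        print("Set these environment variables first:", ", ".join(missing))
```

Running a check like this before the scripts can make a missing variable easier to spot.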
Clone the Repository
You must clone the egs-sdk.git repository to download the Python scripts. The scripts are located in the examples directory of the repository.
- Clone the egs-sdk.git repository using the following command:
  git clone https://github.com/kubeslice-ent/egs-sdk.git
- Go to the examples directory using the following command:
  cd egs-sdk/examples/examples01
GPR Parameters
Parameter | Description |
---|---|
cluster_name | The name of the worker cluster. |
workspace_name | The name of the slice workspace. |
priority | The priority assigned to the GPR. |
exit_duration | The duration after which the GPU is released. |
request_name | The name of the GPR request. |
gpu_shape | The name of the GPU shape. |
request_id | The GPU request ID. |
For more information, see SDK Parameters.
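The make_gpr.py syntax in this topic shows the exit_duration value as <0d0h0m>. The following sketch converts a Python timedelta into a string of that shape. It assumes the script accepts the full days, hours, and minutes form; the examples in this topic pass the shorter 5m, so confirm the accepted formats in SDK Parameters. The function name format_exit_duration is hypothetical.

```python
from datetime import timedelta


def format_exit_duration(duration):
    """Format a timedelta as <days>d<hours>h<minutes>m, for example 0d0h5m.

    Seconds are truncated because the format has no seconds field.
    """
    total_minutes = int(duration.total_seconds() // 60)
    if total_minutes < 0:
        raise ValueError("exit duration must not be negative")
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d{hours}h{minutes}m"


if __name__ == "__main__":
    print(format_exit_duration(timedelta(minutes=5)))
```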
Create a GPR
The make_gpr.py script is used to create a GPR. You can create a GPR only if the workspace is available.
Syntax
python make_gpr.py --cluster_name <cluster-name> --workspace_name <workspace-name> --priority <priority-number> --exit_duration <0d0h0m> --request_name <gpr-request-name> --gpu_shape <GPU shape>
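When the script is called from other Python code, the argument list in the syntax above can be assembled programmatically. The following is a minimal sketch: the function name build_make_gpr_command is hypothetical, and it assumes make_gpr.py is in the current working directory. It only uses the flags shown in the syntax.

```python
import sys


def build_make_gpr_command(cluster_name, workspace_name, priority,
                           exit_duration, request_name, gpu_shape=None,
                           script="make_gpr.py"):
    """Return the argument list for invoking make_gpr.py with the given values."""
    command = [
        sys.executable, script,
        "--cluster_name", cluster_name,
        "--workspace_name", workspace_name,
        "--priority", str(priority),
        "--exit_duration", exit_duration,
        "--request_name", request_name,
    ]
    # --gpu_shape is optional, so it is added only when a value is given.
    if gpu_shape:
        command += ["--gpu_shape", gpu_shape]
    return command


if __name__ == "__main__":
    print(build_make_gpr_command("worker-1", "test1", 100, "5m", "test-gpr1"))
```

The resulting list can be passed to subprocess.run(command, capture_output=True, text=True) to run the script and capture its output.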
Examples
The --gpu_shape parameter is optional. If the provided GPU shape is not found, or if the --gpu_shape parameter is not passed, the script defaults to the first GPU on the first node.
- The following is an example command to create a GPR (passing the --gpu_shape parameter):
  python make_gpr.py --cluster_name "worker-1" --workspace_name test1 --priority 100 --exit_duration 5m --request_name test-gpr4 --gpu_shape Tesla-P100-PCIE-16GB
  Example Output
  Workspace test1 exists in worker-1 cluster
  Tesla-P100-PCIE-16GB
  {"cluster_name": "worker-1", "gpu_per_node": 2, "gpu_shape": "Tesla-P100-PCIE-16GB", "instance_type": "n1-highcpu-2", "memory_per_gpu": 16, "total_gpu_nodes": 1}
  GPR Created Successfully with gpu_request_id: gpr-7bdeec73-06be
- The following is an example command to create a GPR (without the --gpu_shape parameter):
  python make_gpr.py --cluster_name "worker-1" --workspace_name test1 --priority 100 --exit_duration 5m --request_name test-gpr1
  Example Output
  Workspace test1 exists in worker-1 cluster
  GPR Created Successfully with gpu_request_id: gpr-4d8e77d1-632d
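The request ID printed at the end of the output is the value that release_gpr.py later expects. The following sketch extracts it from captured output. It assumes the output keeps the "gpu_request_id: <id>" wording shown in the examples above, which may change between SDK versions; the function name extract_request_id is hypothetical.

```python
import re

# Matches the ID that follows "gpu_request_id:" in the example output.
_REQUEST_ID_PATTERN = re.compile(r"gpu_request_id:\s*(\S+)")


def extract_request_id(output):
    """Return the GPR request ID found in the script output, or None."""
    match = _REQUEST_ID_PATTERN.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Workspace test1 exists in worker-1 cluster\n"
        "GPR Created Successfully with gpu_request_id: gpr-4d8e77d1-632d\n"
    )
    print(extract_request_id(sample))
```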
Release a GPR
The release_gpr.py script is used to release a GPR. The GPR request ID is provided as an input parameter to the script.
Syntax
python release_gpr.py --request_id <gpr-request-id>
Example
The following is an example command to release a GPR:
python release_gpr.py --request_id gpr-191b3dd5-f110
Example Output (if the GPR is still in the queue)
Current GPR Provisioning status :Queued
Hence Canceling GPU Request
Example Output (If the GPR is provisioned)
Current GPR Provisioning status :Successful
Hence Releasing GPU Request
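The two outputs above differ by provisioning status: a queued GPR is canceled, and a provisioned GPR is released. The following sketch reads the status from captured output and maps it to the action the script reports. It assumes the "Current GPR Provisioning status :<status>" wording shown above and covers only the two statuses in these examples; the function names are hypothetical.

```python
import re

# Matches the status word after "Current GPR Provisioning status :".
_STATUS_PATTERN = re.compile(r"Current GPR Provisioning status\s*:\s*(\w+)")

# Action reported by the script for each status shown in the examples.
_ACTION_BY_STATUS = {"Queued": "cancel", "Successful": "release"}


def parse_provisioning_status(output):
    """Return the provisioning status found in the output, or None."""
    match = _STATUS_PATTERN.search(output)
    return match.group(1) if match else None


def expected_action(status):
    """Return 'cancel' or 'release' for a known status, otherwise None."""
    return _ACTION_BY_STATUS.get(status)


if __name__ == "__main__":
    sample = "Current GPR Provisioning status :Queued\nHence Canceling GPU Request\n"
    status = parse_provisioning_status(sample)
    print(status, expected_action(status))
```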
There are other example scripts available in the examples folder. Try them out to create and delete workspaces.