Version: 1.11.0

GPR

This topic describes how to create and release a GPR using Python scripts. A workspace user issues a GPR to request GPU resource allocation for a period of time. If the job finishes sooner than expected, or you no longer need the GPUs for the full allocated time, you can release the GPR early using the script.

Prerequisites

  • A Kubernetes cluster with EGS installed. For more information, see Install EGS.

  • Install the EGS SDK package in a Linux environment that has Internet access.

  • Install the kubectl tool.

Install the EGS-SDK Package

Before installing the package, ensure that you have Python version 3.7 or later installed on your system.

To install the egs-sdk package, use the following command:

pip install git+https://github.com/kubeslice-ent/egs-sdk.git

To install the egs-sdk package on a specific branch or version, use the following command:

pip install git+https://github.com/kubeslice-ent/egs-sdk.git@<branch_or_tag_name>
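
The Python version requirement above can be checked before running pip. The following is a minimal sketch; the helper name python_is_supported is illustrative and is not part of the egs-sdk package.

```python
import sys

# Minimum version stated in this topic.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version is sufficient for egs-sdk")
else:
    print("Python 3.7 or later is required")
```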

Set Environment Variables

Expose EGS_ENDPOINT and EGS_API_KEY as environment variables:

  • Use the following command to get EGS_ENDPOINT:

    kubectl get svc -n kubeslice-controller

    Copy the external IP of the egs-core-apis service and append :8080 to it. For example, the endpoint will be http://<EXTERNAL-IP>:8080.

  • EGS_API_KEY can be obtained from the EGS UI.
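
As a rough illustration of the two settings above, the following sketch builds the endpoint string from an external IP and checks that both variables are set before a script runs. The helper names build_endpoint and load_egs_settings are hypothetical and are not part of the EGS SDK; the IP address is a placeholder.

```python
import os


def build_endpoint(external_ip, port=8080, scheme="http"):
    """Build the endpoint URL from the egs-core-apis external IP."""
    return f"{scheme}://{external_ip}:{port}"


def load_egs_settings(environ=os.environ):
    """Return (endpoint, api_key), or raise if either variable is unset."""
    names = ("EGS_ENDPOINT", "EGS_API_KEY")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["EGS_ENDPOINT"], environ["EGS_API_KEY"]


# Placeholder IP from the documentation address range.
print(build_endpoint("203.0.113.10"))
```

Failing early on a missing variable gives a clearer error than a connection failure later in the script.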

Clone the Repository

You must clone the egs-sdk.git repository to download the Python scripts. The scripts are located in the examples/examples01 directory of the repository.

  1. Clone the egs-sdk.git repository using the following command:

    git clone https://github.com/kubeslice-ent/egs-sdk.git
  2. Go to the examples directory using the following command:

    cd egs-sdk/examples/examples01

GPR Parameters

Parameter         Description
cluster_name      The name of the worker cluster.
workspace_name    The name of the slice workspace.
priority          The priority assigned to the GPR.
exit_duration     The duration after which the GPU is released.
request_name      The name of the GPR request.
gpu_shape         The name of the GPU shape.
request_id        The GPU request ID.

For more information, see SDK Parameters.
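
To make the parameter list concrete, the following sketch shows how a script could declare these flags with argparse. It is an illustration only and is not the source of make_gpr.py. Treating every flag except --gpu_shape as required, and parsing --priority as an integer, are assumptions.

```python
import argparse


def build_gpr_parser():
    """Illustrative parser that mirrors the GPR creation parameters."""
    parser = argparse.ArgumentParser(description="Illustrative GPR arguments")
    parser.add_argument("--cluster_name", required=True, help="Worker cluster name")
    parser.add_argument("--workspace_name", required=True, help="Slice workspace name")
    parser.add_argument("--priority", required=True, type=int, help="GPR priority")
    parser.add_argument("--exit_duration", required=True, help="Duration, for example 5m")
    parser.add_argument("--request_name", required=True, help="GPR request name")
    # gpu_shape is optional according to this topic.
    parser.add_argument("--gpu_shape", default=None, help="GPU shape name")
    return parser


args = build_gpr_parser().parse_args([
    "--cluster_name", "worker-1",
    "--workspace_name", "test1",
    "--priority", "100",
    "--exit_duration", "5m",
    "--request_name", "test-gpr1",
])
print(args.cluster_name, args.priority, args.gpu_shape)
```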

Create a GPR

The make_gpr.py script is used to create a GPR. You can create a GPR only if the workspace is available.

Syntax

python make_gpr.py --cluster_name <cluster-name> --workspace_name <workspace-name> --priority <priority-number> --exit_duration <0d0h0m> --request_name <gpr-request-name> --gpu_shape <GPU shape>
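
The <0d0h0m> placeholder suggests a days, hours, and minutes notation for --exit_duration. The following sketch converts such a string to a timedelta. The exact grammar that the script accepts is an assumption inferred from the placeholder and from the 5m value used in this topic; parse_exit_duration is a hypothetical helper, not an SDK function.

```python
import re
from datetime import timedelta

# Optional day, hour, and minute components, in that order (assumed grammar).
_DURATION = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?$")


def parse_exit_duration(text):
    """Convert a string such as '1d2h30m' or '5m' to a timedelta."""
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"Unrecognized duration: {text!r}")
    days, hours, minutes = (int(part) if part else 0 for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


print(parse_exit_duration("5m"))
print(parse_exit_duration("1d2h30m"))
```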

Examples

note

The --gpu_shape parameter is optional. If the provided GPU shape is not found, or if the --gpu_shape parameter is not passed, the script defaults to the first GPU on the first node.

  1. The following is an example command to create a GPR (passing the --gpu_shape parameter):

    python make_gpr.py --cluster_name "worker-1" --workspace_name test1 --priority 100 --exit_duration 5m --request_name test-gpr4 --gpu_shape Tesla-P100-PCIE-16GB

    Example Output

    Workspace test1 exists in worker-1 cluster
    Tesla-P100-PCIE-16GB
    {"cluster_name": "worker-1", "gpu_per_node": 2, "gpu_shape": "Tesla-P100-PCIE-16GB", "instance_type": "n1-highcpu-2", "memory_per_gpu": 16, "total_gpu_nodes": 1}
    GPR Created Successfully with gpu_request_id: gpr-7bdeec73-06be
  2. The following is an example command to create a GPR (without the --gpu_shape parameter):

    python make_gpr.py --cluster_name "worker-1" --workspace_name test1 --priority 100 --exit_duration 5m --request_name test-gpr1

    Example Output

    Workspace test1 exists in worker-1 cluster
    GPR Created Successfully with gpu_request_id: gpr-4d8e77d1-632d
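
The request ID printed on the last line of the output is the value that the release script expects. The following sketch extracts it from captured output text. It assumes that the output keeps the "gpu_request_id: <id>" wording shown in the examples above; extract_request_id is a hypothetical helper.

```python
import re

# Matches the ID that follows "gpu_request_id:" in the example output.
_REQUEST_ID = re.compile(r"gpu_request_id:\s*(\S+)")


def extract_request_id(output):
    """Return the GPR ID found in the script output, or None if absent."""
    match = _REQUEST_ID.search(output)
    return match.group(1) if match else None


sample_output = (
    "Workspace test1 exists in worker-1 cluster\n"
    "GPR Created Successfully with gpu_request_id: gpr-4d8e77d1-632d\n"
)
print(extract_request_id(sample_output))
```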

Release a GPR

The release_gpr.py script is used to release a GPR. The GPR request ID is provided as an input parameter to the script.

Syntax

python release_gpr.py --request_id <gpr-request-id>

Example

The following is an example command to release a GPR:

python release_gpr.py --request_id gpr-191b3dd5-f110

Example Output (If the GPR request is in queue)

Current GPR Provisioning status :Queued
Hence Canceling GPU Request

Example Output (If the GPR is provisioned)

Current GPR Provisioning status :Successful
Hence Releasing GPU Request
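
The two example outputs suggest that the action depends on the provisioning status: a queued request is canceled, and a provisioned request is released. The following sketch reads the status line and maps it to that action. It illustrates the observed behavior only and is not the logic of release_gpr.py; only the two statuses shown above are handled, and any other status is treated as unknown.

```python
import re

# Status values and actions taken from the example outputs in this topic.
ACTION_BY_STATUS = {"Queued": "cancel", "Successful": "release"}

_STATUS = re.compile(r"Provisioning status\s*:\s*(\w+)")


def action_for_output(output):
    """Return 'cancel', 'release', or None for an unrecognized status."""
    match = _STATUS.search(output)
    if match is None:
        return None
    return ACTION_BY_STATUS.get(match.group(1))


print(action_for_output("Current GPR Provisioning status :Queued"))
print(action_for_output("Current GPR Provisioning status :Successful"))
```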

info

There are other example scripts available in the examples folder. Try them out to create and delete workspaces.