For instructions on how to access and launch containers, please review "Launching Containers From the Command Line".
Each launch script on dsmlp-login.ucsd.edu (or ieng6) in ../public/bin specifies the default Docker image to use, the required number of CPU cores, GPU cards, and GB RAM assigned to its containers. Instructors and TAs may directly modify the coursewide scripts located in ../public/bin. Otherwise, users may copy an existing launch script into their home directory, then modify that private copy:
$ cp -p `which launch-pytorch.sh` $HOME/my-launch-pytorch.sh
$ nano $HOME/my-launch-pytorch.sh
$ $HOME/my-launch-pytorch.sh
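If you prefer a scripted edit over opening the copy in nano, a sed substitution is one option. The sketch below is illustrative only: it assumes the private copy contains plain NAME=value assignments like those in the example configuration in this article, it works on a stand-in file in a temporary directory rather than a real launch script, and set_default is a hypothetical helper name.

```shell
# Hypothetical stand-in for a private copy of a launch script. A real copy
# would be made with: cp -p `which launch-pytorch.sh` $HOME/my-launch-pytorch.sh
workdir=$(mktemp -d)
script="$workdir/my-launch-pytorch.sh"
cat > "$script" <<'EOF'
K8S_NUM_GPU=1 # max of 1 (contact ETS to raise limit)
K8S_NUM_CPU=4 # max of 8 ("")
K8S_GB_MEM=32 # max of 64 ("")
EOF

# set_default FILE NAME VALUE (hypothetical helper)
# Rewrites the value of a NAME=... assignment and leaves any trailing
# comment on that line untouched. A backup is kept as FILE.bak.
set_default() {
  sed -i.bak -E "s|^($2=)[^ #]*|\1$3|" "$1"
}

set_default "$script" K8S_NUM_CPU 8
set_default "$script" K8S_GB_MEM 64

# Show the resulting assignments.
grep -E '^K8S_' "$script"
```

Against a real private copy, the same helper would be pointed at $HOME/my-launch-pytorch.sh; check the result with grep before launching.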
An example launch configuration follows. It launches the DSMLP machine learning notebook server and calls the run_jupyter.sh entrypoint script defined in that image's Dockerfile (see "An Overview of DSMLP Containers"):
K8S_DOCKER_IMAGE="ucsdets/scipy-ml-notebook:2021.1-stable"
K8S_ENTRYPOINT="./run_jupyter.sh"
K8S_NUM_GPU=1 # max of 1 (contact ETS to raise limit)
K8S_NUM_CPU=4 # max of 8 ("")
K8S_GB_MEM=32 # max of 64 ("")
# Controls whether an interactive Bash shell is started
SPAWN_INTERACTIVE_SHELL=YES
# Sets up proxy URL for Jupyter notebook inside
PROXY_ENABLED=YES
PROXY_PORT=8888
Defaults set within launch scripts' environment variables may be overridden using the following command-line options:
Option   | Description                                 | Example
---------|---------------------------------------------|---------------------
-c N     | Adjust # CPU cores                          | -c 8
-g N     | Adjust # GPU cards                          | -g 2
-m N     | Adjust # GB RAM                             | -m 64
-i IMG   | Docker image name                           | -i nvidia/cuda:latest
-e ENTRY | Docker image ENTRYPOINT/CMD                 | -e /setup.sh
-n N     | Request specific cluster node (1-10)        | -n 7
-v       | Request specific GPU (gtx1080ti,k5200,titan) | -v k5200
-b       | Request background pod                      | (see below)
Example:
[cs190f @ieng6-201]:~:56$ launch-py3torch-gpu.sh -m 64 -v k5200
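The real launch scripts are not reproduced in this article, so the following is only a sketch of how options such as these are commonly layered over defaults using the POSIX getopts builtin. It is not the DSMLP implementation: parse_launch_opts, BACKGROUND, and GPU_TYPE are hypothetical names, and only a subset of the options in the table is handled.

```shell
# Hypothetical sketch, NOT the actual DSMLP launch script.
# Defaults, taken from the example configuration in this article.
K8S_DOCKER_IMAGE="ucsdets/scipy-ml-notebook:2021.1-stable"
K8S_NUM_GPU=1
K8S_NUM_CPU=4
K8S_GB_MEM=32
BACKGROUND=NO   # hypothetical variable for the -b flag
GPU_TYPE=""     # hypothetical variable for the -v option

# parse_launch_opts ARGS... (hypothetical helper)
# Each option that is present replaces the matching default;
# options that are absent leave the default in place.
parse_launch_opts() {
  OPTIND=1   # reset so the function can be called more than once
  while getopts "c:g:m:i:v:b" opt; do
    case "$opt" in
      c) K8S_NUM_CPU=$OPTARG ;;
      g) K8S_NUM_GPU=$OPTARG ;;
      m) K8S_GB_MEM=$OPTARG ;;
      i) K8S_DOCKER_IMAGE=$OPTARG ;;
      v) GPU_TYPE=$OPTARG ;;
      b) BACKGROUND=YES ;;
      *) echo "unknown option" >&2; return 1 ;;
    esac
  done
}

# Same options as the example command above.
parse_launch_opts -m 64 -v k5200
echo "cpu=$K8S_NUM_CPU gpu=$K8S_NUM_GPU mem=$K8S_GB_MEM gpu_type=$GPU_TYPE background=$BACKGROUND"
```

With the options from the example, only the memory and GPU type change; the CPU and GPU counts keep their defaults.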
The maximum limits (8 CPU, 64GB, 1 GPU) apply to all of your running containers.
Increases to GPU allocations require the consent of a TA, instructor, or advisor.
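As a rough illustration of the arithmetic, the sketch below checks whether a new request would fit alongside containers that are already running. It assumes the limits are counted as a total across your containers, which is one reading of the statement above; the cluster itself does the real enforcement, and fits_quota is a hypothetical helper name.

```shell
# Limits quoted above. ASSUMPTION: they are totals across running containers.
MAX_CPU=8
MAX_GB_MEM=64
MAX_GPU=1

# fits_quota USED_CPU USED_MEM USED_GPU NEW_CPU NEW_MEM NEW_GPU (hypothetical)
# Succeeds (exit status 0) only when the running total plus the new
# request stays within every limit.
fits_quota() {
  [ $(( $1 + $4 )) -le "$MAX_CPU" ] &&
  [ $(( $2 + $5 )) -le "$MAX_GB_MEM" ] &&
  [ $(( $3 + $6 )) -le "$MAX_GPU" ]
}

# Suppose one container with 4 CPU, 32 GB, and 1 GPU is already running.
if fits_quota 4 32 1 4 32 0; then
  echo "a second CPU-only container (4 CPU, 32 GB) fits"
fi
if ! fits_quota 4 32 1 4 32 1; then
  echo "a second GPU container would exceed the GPU limit"
fi
```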
To support longer training runs, we permit background execution of student containers for up to 12 hours via the "-b" command-line option.
Use the ‘kubesh <pod-name>’ command to connect or reconnect to a background container, and ‘kubectl delete pod <pod-name>’ to terminate.
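Put together, a background session might look like the sequence below. This is a sketch under assumptions: the pod name is a made-up placeholder (the real name is assigned when the pod is created, and `kubectl get pods` lists it), and the run wrapper with its DRY_RUN switch is a hypothetical convenience that prints each command instead of executing it, so the sequence can be read or tried anywhere.

```shell
# Hypothetical dry-run wrapper: prints commands unless DRY_RUN is set to 0.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

pod="myuser-12345"   # placeholder; use the name reported for your own pod

run launch-pytorch.sh -b        # start the container in the background
run kubectl get pods            # list your pods to find the pod name
run kubesh "$pod"               # connect or reconnect to the container
run kubectl delete pod "$pod"   # terminate it when the work is finished
```

On the cluster, setting DRY_RUN=0 would execute the commands rather than print them.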
Please be considerate and terminate idle containers: while containers share system RAM and CPU resources under the standard Linux/Unix model, the cluster’s 80 GPU cards are assigned to users on an exclusive basis. When attached to a container they become unusable by others even if completely idle.
If you need to create a custom container from one of the standard containers, please see "Instructions on Building a Custom Image". Custom images are typically used for entire courses on the DSMLP platform, or by individual users who need to install operating system-level packages (e.g. 'apt-get install', 'yum'). To create and use a custom image, follow those instructions, then run:
launch.sh -i {MY_DOCKER_ACCOUNT/REPO:TAG}
to launch your container. More information: "Launching Containers from the Command Line".
It may take some time for your container to start, because the image must first be downloaded onto the DSMLP servers. Run the command:
kubectl describe pod -n {USERNAME}
to see the state of your container.
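If you would rather wait in a loop than re-run the command by hand, a small polling helper is one way to do it. The sketch below is an assumption-laden illustration: wait_for_phase and POLL_SECONDS are hypothetical names, and a stand-in command replaces kubectl so the example runs without cluster access.

```shell
# wait_for_phase WANTED TRIES CMD... (hypothetical helper)
# Runs CMD up to TRIES times, pausing between attempts, and succeeds as
# soon as CMD prints exactly WANTED.
wait_for_phase() {
  wanted=$1
  tries=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    phase=$("$@" 2>/dev/null)
    [ "$phase" = "$wanted" ] && return 0
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-5}"
  done
  return 1
}

# On the cluster, the call might look like this (POD is a placeholder):
#   wait_for_phase Running 60 \
#     kubectl get pod "$POD" -n "$USER" -o jsonpath='{.status.phase}'

# Stand-in for kubectl so the sketch can run anywhere.
fake_phase() { echo Running; }
POLL_SECONDS=0
if wait_for_phase Running 3 fake_phase; then
  echo "pod is running"
fi
```

If the loop gives up, `kubectl describe pod` remains the place to look for the reason, such as an image that is still downloading.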
To customize your environment within your existing container, please see: How To: Customize your environment in DSMLP/Datahub (including jupyter notebooks)
For more information about datahub.ucsd.edu, check out the FAQ.
Your instructor or TA will be your best resource for course-specific questions.
If you still have questions or need additional assistance, please email dsmlp@ucsd.edu or visit support.ucsd.edu.