IT Services - How to Select and Configure Your Container on the Data Science/Machine Learning Platform (DSMLP)

Overview

Please review "Launching Containers From the Command Line" for instructions on accessing and launching containers. Launch scripts include:

launch-scipy-ml.sh (data science/machine learning container)
launch.sh (custom container; see below)

Launch Script Command Line Options

Defaults set within launch scripts' environment variables may be overridden using the following command-line options:

Option	Description	Example
-h	List all command line options	-h
-c <# CPU>	Adjust # CPU cores	-c 8
-g <# GPU>	Adjust # GPU cards	-g 2
-m <GB RAM>	Adjust # GB RAM	-m 64
-i <IMAGE>	Docker image name	-i nvidia/cuda:latest (see below)
-e <COMMAND>	Docker image ENTRYPOINT/CMD. Review the Dockerfile for the name of the launch script.	-e /setup.sh
-n <NODE>	Request specific cluster node (1-10)	-n 7
-v <GPU TYPE>	Request specific GPU type (gtx1080ti, k5200, titan)	-v k5200
-b	Request background pod (implies -J)	(see below)
-B	Request batch (non-interactive) pod (implies -J)	(see below)
-j	Launch Jupyter notebook server within container (default)
-J	Inhibit launch of Jupyter notebook server
-W <COURSE>	Run in course-specific workspace directory	-W DSC10_FA22_A00
-G <GROUP ID>	Launch with /teams folder	-G 100001234
-s	Launch only CLI shell; do not launch Jupyter notebook server
-S	Do not launch container CLI shell
-q	Quiet mode - suppress informational messages during container launch.
-f <COMMAND>	Execute command within container, dump job output to stdout; implies -S (no shell), -J (no Jupyter). If the command is a shell script, any file paths in the script must be relative to the root directory inside the container (not your dsmlp-login home directory).	-f ./myscript.sh OR -f ./private/myscript.sh
-H	Launch sshd within container for use with ProxyCommand (see VS Code documentation)
-P <POLICY>	Specify image pull policy (ifnotpresent|always|never) assigned to container	-P Always
-N <name>	Specify alternate Pod name	-N mypod
-d	Dump Kubernetes Pod spec (JSON) - do not execute
--	End processing of command line; remaining arguments are passed to container	-- /mycommand.sh arg1

Launch a machine learning container and run in the DSC10 FA22 course workspace:

launch-scipy-ml.sh -g 1 -m 64 -W DSC10_FA22_A00 -v k5200

Launch a custom container (see section below) and run a custom script in your home (non-course workspace) directory:

launch.sh -i dockerhub-username/dockerhub-repo-name:tag -f ./private/script.sh

Adjusting CPU/GPU/RAM limits

The maximum limits (8 CPU cores, 64 GB RAM, 1 GPU) apply to the combined total of all of your running containers:

You may run eight 1-core containers, one 8-core container, or anything in between.
Contact ETS to request increases to these default limits.

Increases to GPU allocations require consent of TA, instructor or advisor.
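
As a rough illustration of how the combined limits work, the sketch below adds up the CPU, RAM and GPU requests of a planned set of containers and compares the totals with the default limits quoted above. The function name fits_limits and the CPU:RAM:GPU argument format are invented for this example; they are not part of the DSMLP tooling.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSMLP): check whether a planned set of
# containers fits within the default per-user limits quoted in this article.
MAX_CPU=8
MAX_RAM_GB=64
MAX_GPU=1

# Each argument describes one container as CPU:RAM_GB:GPU, e.g. "2:16:0".
# The body runs in a subshell so its variables do not leak to the caller.
fits_limits() (
  cpu=0; ram=0; gpu=0
  for spec in "$@"; do
    c=${spec%%:*}; rest=${spec#*:}
    r=${rest%%:*}; g=${rest#*:}
    cpu=$((cpu + c)); ram=$((ram + r)); gpu=$((gpu + g))
  done
  echo "total: ${cpu} CPU, ${ram} GB RAM, ${gpu} GPU"
  # Succeed only if every total is within its limit.
  [ "$cpu" -le "$MAX_CPU" ] && [ "$ram" -le "$MAX_RAM_GB" ] && [ "$gpu" -le "$MAX_GPU" ]
)

# Two 4-core containers sharing the single GPU allowance fit exactly.
fits_limits 4:32:1 4:32:0 && echo "within limits"
# One full-size container plus anything else exceeds the limits.
fits_limits 8:64:1 1:8:0 || echo "over the limits"
```

If your limits have been raised, adjust the three MAX_ variables to match.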

Background Execution / Long-Running Jobs

To support longer training runs, we permit background execution of student containers for up to 12 hours of execution time, via the "-b" command line option.

Use the "kubesh <pod-name>" command to connect or reconnect to a background container, and "kubectl delete pod <pod-name>" to terminate it.

Please be considerate and terminate idle containers: while containers share system RAM and CPU resources under the standard Linux/Unix model, the cluster’s 80 GPU cards are assigned to users on an exclusive basis. When attached to a container they become unusable by others even if completely idle.
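
A typical background session might look like the sketch below, which wraps the three commands mentioned above in small shell functions. The pod name my-training-run is a placeholder, and combining "-N" with "-b" is an assumption based on the options table (without "-N" the launch script chooses a pod name for you and prints it).

```shell
#!/bin/sh
# Placeholder pod name for this example; override by setting POD_NAME.
POD_NAME=${POD_NAME:-my-training-run}

# Start a background pod with one GPU (allowed to run for up to 12 hours).
# Assumption: -N sets the pod name exactly as given.
start_background() { launch-scipy-ml.sh -g 1 -b -N "$POD_NAME"; }

# Attach (or re-attach) a shell to the background pod.
reconnect() { kubesh "$POD_NAME"; }

# Terminate the pod when finished so its GPU is released for other users.
stop_background() { kubectl delete pod "$POD_NAME"; }
```

Run start_background once, reconnect as often as needed, and stop_background as soon as the job is done.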

Batch or non-interactive jobs

Batch or unattended jobs may be launched via the "-B" command line option; you must specify a script or program to be executed within the container. Use the "--" option to separate 'launch.sh' options such as "-g 1" from those of your command:

[user@dsmlp-login]:~:611$ launch-scipy-ml.sh -g 1 -B -- python ./f2.py
Wed Mar 27 14:45:46 PDT 2024 INFO job was successfully submitted
Please remember to shut down via: "kubectl delete pod user-455" ; "kubectl get pods" to list running pods.
You may retrieve output from your pod via: "kubectl logs user-455"

Pod status ("kubectl get pods") will remain "Pending" until resources become available and the job is scheduled on a node. Run "kubectl describe pod <pod-name>" for more detailed messages regarding scheduling or execution.

Review job output via "kubectl logs <pod-name>", or redirect your command to a file within your container:

[user@dsmlp-login]:~:616$ launch-scipy-ml.sh -g 1 -B -- bash -c 'python ./f2.py > output.txt'
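
Because a batch pod may sit in "Pending" for a while, it can be convenient to poll its status and fetch the logs once it finishes. The sketch below is one way to do that, assuming the pod reports the standard Kubernetes phases (Pending, Running, Succeeded, Failed); the function name wait_and_fetch and the 30-second default interval are choices made for this example.

```shell
#!/bin/sh
# Poll a batch pod until it finishes, then print its output.
# Usage: wait_and_fetch <pod-name> [poll-seconds]
# The body runs in a subshell, so "exit" leaves only this function.
wait_and_fetch() (
  pod=$1
  interval=${2:-30}
  while :; do
    # Ask Kubernetes for the pod's current phase; give up if the query fails.
    phase=$(kubectl get pod "$pod" -o jsonpath='{.status.phase}') || exit 1
    case $phase in
      Succeeded|Failed) break ;;
    esac
    echo "pod $pod is $phase; checking again in ${interval}s" >&2
    sleep "$interval"
  done
  echo "pod $pod finished with phase $phase" >&2
  kubectl logs "$pod"
  # Report success only if the job itself succeeded.
  [ "$phase" = Succeeded ]
)
```

For example, "wait_and_fetch user-455 > output.txt" would save the job output on dsmlp-login once the pod from the transcript above completes. Progress messages go to stderr, so they do not end up in the saved file. Remember to delete the pod afterwards.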

Creating and Specifying a Custom Docker Image

If you need to create a custom container from one of the standard containers, please see Instructions on Building a Custom Image. Custom images are typically used for entire courses on the DSMLP platform, or when individual users need to install operating system-level packages (e.g. with 'apt-get install' or 'yum'). To create and use a custom image:

Select any standard image (ucsdets/datahub-base-notebook, ucsdets/datascience-notebook, etc.) as the base for your image.
Follow the Instructions on Building a Custom Image guide above to create, build and host a Docker image. We currently recommend GitHub Actions.
ssh to dsmlp-login.ucsd.edu
NOTE: If you are using Visual Studio Code, or launching a container with a Jupyter notebook after you log in to dsmlp-login, you must be on the UC San Diego campus network. If you are off campus, connect to the VPN before you ssh to dsmlp-login.
Run the command launch.sh -i {MY_DOCKER_ACCOUNT/REPO:TAG} to launch your container. More information: Launching Containers from the Command Line

It may take some time for your container to start, because the image must first be downloaded to the DSMLP servers. Run the command kubectl describe pod -n {USERNAME} to see the state of your container.
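
For a quicker overview than the full "describe" output, the sketch below lists only the pods that are not yet running, together with their status (for example ContainerCreating while the image is still being pulled). It assumes, as in the command above, that your namespace is your username, and it relies on the default "kubectl get pods" column layout (NAME, READY, STATUS, ...). The function name pods_not_running is made up for this example.

```shell
#!/bin/sh
# List pods in a namespace that are not in the Running state, with their
# status. Usage: pods_not_running [namespace]   (defaults to $USER)
pods_not_running() {
  kubectl get pods -n "${1:-$USER}" --no-headers |
    awk '$3 != "Running" { print $1, $3 }'
}
```

An empty result means every pod in the namespace has started.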

Adding Custom Python Packages to Your Container

To customize your environment within your existing container, please see: How To: Customize your environment in DSMLP/Datahub (including jupyter notebooks)

Original version of this guide (warning: may be outdated).

For more information about datahub.ucsd.edu, check out the FAQ.

Your instructor or TA will be your best resource for course-specific questions.

If you still have questions or need additional assistance, please email datahub@ucsd.edu or visit support.ucsd.edu.