An overview of Standard Datahub/DSMLP Containers maintained by UCSD Educational Technology Services


Overview


The Data Science/Machine Learning Platform (DSMLP) utilizes Docker containers to maintain consistent course/research environments for users. The following documentation is an overview of standard containers maintained by UC San Diego Educational Technology Services.

Standard Containers


UCSD Educational Technology Services supports and maintains 4 notebook containers for courses and research, all based on a stable version of jupyter/datascience-notebook. The figure below shows the inheritance relationships among the maintained containers. Each child container has all the features of its parent container.

Container Inheritance Diagram

Each container may contain one or more Anaconda environments and Jupyter kernels, which are described in the subsections below. The conda environments may be used 3 ways:

Updated list of packages in each container

To see the current list of packages in the stable version of each container, visit the "Stable Tag" wiki page of the IT Services datahub-docker-stacks repository. Find the container of interest that is labeled with the "stable" tag and click "Link" under "Manifest" next to it. Scroll down to the "Conda packages" and/or "System packages" section, then click "Details" to see package versions.

Dockerfile for each image

To see the current Dockerfile used to build each container image, see the individual image directories under https://github.com/ucsd-ets/datahub-docker-stack/tree/main/images


Features by Container


ucsdets/datahub-base-notebook

Features


Kernels 

Python 3

Python 3 with the Python libraries pandas, numpy, scipy, statsmodels, datascience, and matplotlib. For a complete list of installed Python libraries, run the command pip list in the Jupyter terminal.
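As a complement to pip list, the versions of the libraries named above can also be checked from inside a notebook cell. The following is a minimal sketch using only the Python standard library; the helper name installed_versions and the LISTED constant are illustrative and are not part of any DSMLP tooling.

```python
from importlib import metadata

# Libraries this article lists for the Python 3 kernel.
LISTED = ["pandas", "numpy", "scipy", "statsmodels", "datascience", "matplotlib"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(LISTED).items():
        print(f"{name}: {version or 'not installed'}")
```

A value of None means the package was not found in the active kernel's environment, which can be a quick way to confirm which kernel a notebook is actually running.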

Python 3 (Clean)

Use this kernel to test your notebooks in case you run into notebook errors. 

Julia

Run Julia code inside a jupyter notebook.

R

Run R code inside a jupyter notebook.


ucsdets/datascience-notebook

Features

All the features from the ucsdets/datahub-base-notebook container.


Kernels

All the kernels from the ucsdets/datahub-base-notebook container, with these additional Python libraries added to the base kernel: okpy, dpkt, nose, datascience.


ucsdets/scipy-ml-notebook


Features

All the features from the ucsdets/datahub-base-notebook container, plus common Python machine learning libraries.


Kernels

All the features of the base kernel from the ucsdets/datahub-base-notebook container, with these additional Python libraries: tensorflow, tensorboard, PyQt5, pytorch, torchvision, nltk, scapy, gym, opencv.

Additional software support: cuda, cudnn, nccl
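Because this container bundles cuda, cudnn, and nccl alongside tensorflow and pytorch, it can be useful to confirm from a notebook whether a GPU is actually visible to each framework. The sketch below assumes the standard public calls torch.cuda.is_available() and tf.config.list_physical_devices("GPU"); the function name gpu_report is hypothetical, and a result of False may simply mean the session was launched without a GPU.

```python
def gpu_report():
    """Report GPU visibility per framework.

    Each value is True/False if the framework imports, or None if it is
    not installed in the current kernel.
    """
    report = {}

    try:
        import torch
        report["pytorch"] = bool(torch.cuda.is_available())
    except ImportError:
        report["pytorch"] = None

    try:
        import tensorflow as tf
        report["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        report["tensorflow"] = None

    return report


if __name__ == "__main__":
    for framework, visible in gpu_report().items():
        status = "not installed" if visible is None else ("GPU visible" if visible else "no GPU visible")
        print(f"{framework}: {status}")
```

The imports are placed inside the function so the same cell also runs in containers that do not ship these libraries.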


ucsdets/datascience-rstudio-docker


Features


Kernels

All the features from the ucsdets/datahub-base-notebook container.


ucsdets/rl-notebook


Features

All the features of datahub-base-notebook and popular reinforcement learning libraries.

Note: this standard container is not available through datahub.ucsd.edu. It can only be launched by connecting to dsmlp-login over SSH and running launch.sh.


Kernels

base

All the features of the base kernel from the ucsdets/datahub-base-notebook container.


gym

Machine learning libraries: tensorflow and pytorch

Reinforcement learning libraries: gym and pybullet

Additional software support: cuda, cudnn, nccl


Creating a Custom Container 


Please see Instructions on Building a Custom Image to create your own container from one of the examples above. 

Your instructor or TA will be your best resource for course-specific questions.

If you still have questions or need additional assistance, please see our list of Knowledge Base articles or email datahub@ucsd.edu.