Big Data Science Workshop

Thursday, April 26, 2018

The Center for Scientific Computing invites you to our Big Data Science Workshop on Thursday, April 26, 2018, from 8:30 AM to 4:30 PM in Elings Hall 1601. This free, all-day workshop will be presented by experts from the San Diego Supercomputer Center (SDSC).

Topics will cover big data technologies and the latest SDSC resources. While these talks relate directly to working on the clusters at SDSC, much of the information will also apply to working on the UCSB campus clusters, which are free to all UCSB researchers.

The workshop will consist of lectures and a hands-on session (bring a laptop if you would like to participate in the hands-on session). Participants are not required to attend the entire day and may RSVP for specific sessions. Some experience with Linux on a cluster is recommended. For those interested in more introductory computing tutorials, we will soon post our Fall 2018 workshop schedule.

Coffee, refreshments and lunch will be provided. PLEASE RSVP HERE.


Details and tentative schedule:

8:30 - 10:00:  Introduction to SDSC Comet

This talk will cover the SDSC Comet hardware architecture and software stack, and will provide guidance on scheduling options, tips on filesystem usage, and details on using Singularity containers.

10:30 - 12:00:  Data Analysis using Python notebooks

We will cover the use of Jupyter notebooks on Comet, use Python for data preparation, and explore pandas (the Python data analysis toolkit).

12:00 - 1:00:  Lunch 

1:00 - 2:30:  Running and Programming on SDSC Comet

This talk will cover the setup of Spark within the Comet scheduling framework, default configuration with myHadoop, tuning options, and RDMA-Spark.

3:00 - 4:30:  Introduction to Machine Learning on Comet

An overview of the machine learning and deep learning tools available on Comet, with examples using R, Python, and TensorFlow.


For any questions, please contact Paul Weakliem at (805) 893-4205.

Related Links