Learning Objectives:
- Familiarity with data wrangling (structured and unstructured data) using pandas/Python
- Developing interactive visualization tools using R/Shiny
- Conducting Exploratory Data Analysis to evaluate model assumptions and assess data processing
- Implementing and evaluating machine learning algorithms using the Kaggle challenge framework
- Understanding limitations and dependencies of classification algorithms
Schedule for Advanced Data After Dark : May 23rd through 26th, 2016
Location: CLSB 3A003B Tiered Lecture Hall South (map and directions)
Pre-Workshop Refreshers
Monday, May 23rd : 5 to 7pm
- Data Wrangling (Pandas and Python) (data notebook)
- Day 1 Presentation (presentation)
Tuesday, May 24th : 5 to 7pm
- Data tells a Story: QA/QC
- EDA and Interactive Visualization (Building R/Shiny Dashboard) (git clone)
Wednesday, May 25th : 5 to 7pm
- Supervised Learning Algorithms (focused - 2) (git clone)
Please install the knitr, MASS, and tree packages.
Thursday, May 26th : 5 to 7pm
- Handle with Care: Caveat and Advice for Machine Learning, Dimensionality reduction, Validation and Evaluation (presentation)