An Introduction to Statistical Machine Learning using R (Online)
September 20 @ 9:30 am - 12:00 pm
Sessions begin at 9:30 am on Tuesdays and Thursdays: September 20 and 22, and October 4 and 6, 2022.
PLEASE NOTE: This course will be divided into two parts: September 20 & 22 and October 4 & 6
Statistical machine learning and data mining form an interdisciplinary research area closely related to statistics, computer science, engineering, and bioinformatics. Many of their techniques and algorithms are useful across a range of scientific areas. This short course will provide an overview of statistical machine learning and data mining techniques with applications to the analysis of real data. Supervised learning techniques will be covered, including penalized regression (such as LASSO and its variants) and support vector machines. The main emphasis will be on the analysis of real data sets from various scientific fields. The techniques discussed will be demonstrated in R.
This course is intended for researchers who have some knowledge of statistics and want an introduction to statistical machine learning and data mining, and for practitioners who would like to apply statistical machine learning techniques to their own problems.
Participants should be familiar with linear regression and basic statistical and probability concepts, and should have some familiarity with R programming.
Part I Outline (R exercises will be included): September 20 & 22
Fundamentals of Statistical Learning
- Training versus Test Error Rates
- Supervised versus Unsupervised Methods
- Linear Regression and Penalized Regression
- Ridge Regression
- Further Extensions (if time permits)
- Logistic Regression and Penalized Logistic Regression
- Nearest Neighbors Classification
- Support Vector Machines
Part II Outline (R exercises will be included): October 4 & 6
Supervised Learning: Tree-based Methods
- Random Forests
Unsupervised Learning: Dimension Reduction
- Principal Component Analysis
- Other Dimension Reduction Techniques
Unsupervised Learning: Clustering
- Hierarchical Clustering
- Other Selected Topics (if time permits)
Instructor: Yufeng Liu
Yufeng Liu is currently a professor in the Department of Statistics and Operations Research, the Department of Biostatistics, and the Department of Genetics at UNC-Chapel Hill. His current research interests include statistical machine learning, high dimensional data analysis, personalized medicine, and bioinformatics. He has taught statistical machine learning courses multiple times at UNC, as well as short courses on this subject at the Joint Statistical Meetings, ENAR, the FDA, and the Biostatistics Summer Institutes at the University of Washington. Dr. Liu received the CAREER Award from the National Science Foundation in 2008, the Ruth and Phillip Hettleman Prize for Artistic and Scholarly Achievement in 2010, and the inaugural Leo Breiman Junior Award in 2017. He is currently an elected fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), and an elected member of the International Statistical Institute (ISI).
Registration fees:
- UNC-CH Students: $0, with a $20 deposit to hold your spot (the deposit is refunded if you attend at least 66% of the course)
- UNC-CH Faculty/Staff/Postdoc/Resident/Visiting Scholars: $40
- Non-UNC-CH: $40
Additional course information:
- Registration will close at 12:01 am on 9/17/2022. No late registrations will be accepted.
- Cancellation/Refund Policy: A full refund will be given to those who cancel their registration no later than 10 days prior to the course. If you cancel within the 10 days prior to the course, no refund will be given. Please allow 30 days to receive your refund.
- The Zoom link for this course will be sent before the course begins. You must register at least 3 days prior to the course date to receive the Zoom link.
- For questions regarding this class, please contact Jill Stevens at firstname.lastname@example.org