Statistical Learning
About this Course
This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; neural networks and deep learning; survival models; multiple testing. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical). This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data science. Computing is done in R. There are lectures devoted to R, giving tutorials from the ground up, and progressing with more detailed sessions that implement the techniques in each chapter. The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R (second edition) by James, Witten, Hastie and Tibshirani (Springer, 2021). The PDF for this book is available for free on the book website.
Created by: Stanford University
Level: Introductory

Related Online Courses
Develop the skills necessary to create structured database environments using a relational database management system (RDBMS), such as MySQL, that incorporates basic processing functionality and...
Demystify complex big data technologies. Compared to traditional data processing, modern tools can be complex to grasp. Before we can use these tools effectively, we need to know how to handle big...
Exploratory Data Analysis (EDA) is the statistical process or treatment applied to the data of a sample with which...
Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that could be hard to rectify. In this course we...
Statistics 1 Part 1 is a self-paced course from LSE which aims to introduce you to and develop your understanding of essential statistical concepts, methods and techniques, emphasising the...