
DAT203.2 - Principles of Machine Learning

Principles of Machine Learning (DAT203.2) is the 7th in a series of 10 courses that form the Microsoft Professional Program in Data Science. For me, it confirmed that the further you get into this 10-course sequence, the more enjoyable the classes become. As with Data Science Orientation, this class is co-led by Cynthia Rudin and Steve Elston.

Principles of Machine Learning

The lectures consist of 60 videos totaling 8 hours. Watching them and working through the exercises reveals the practical value of the data science tools. The course forces you to make real use of the Azure Machine Learning environment with Python or R scripts. All told, it took me about 30 hours to complete.

The course covers the principles of classification and regression, then spends considerable time on more advanced approaches such as tree and ensemble methods, optimization-based methods, clustering, and recommenders.

Decision Tree Example

Cynthia discusses feature selection and regularization, both of which help you build models using the most relevant features. While watching the videos, I felt like I was getting real-world knowledge of the value of these techniques. The presenters make an effort to explain the practical limitations of the different approaches. They work through an extended decision tree example about whether a customer is likely to wait for a table at a restaurant.
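To give a feel for how a decision tree chooses what to split on, here is a minimal sketch in plain Python. It uses information gain, which is one common splitting criterion; I am not claiming it is the exact criterion used in the lectures. The rows, the feature names (`patrons`, `hungry`), and the `wait` label are all made up for illustration and are not the course's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, feature, target="wait"):
    """Reduction in entropy of `target` after splitting `rows` on `feature`."""
    base = entropy([r[target] for r in rows])
    total = len(rows)
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return base - remainder


# Hypothetical toy data: did the customer wait for a table?
rows = [
    {"patrons": "some", "hungry": True, "wait": True},
    {"patrons": "full", "hungry": True, "wait": False},
    {"patrons": "some", "hungry": False, "wait": True},
    {"patrons": "full", "hungry": True, "wait": True},
    {"patrons": "none", "hungry": False, "wait": False},
    {"patrons": "full", "hungry": False, "wait": False},
]

# Score each candidate feature and pick the one with the highest gain.
gains = {f: information_gain(rows, f) for f in ("patrons", "hungry")}
best = max(gains, key=gains.get)
print(gains, "-> split first on:", best)
```

On this toy data the tree would split on `patrons` first, because knowing how busy the restaurant is removes more uncertainty about waiting than knowing whether the customer is hungry.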

Given the breadth of topics covered, each technique and approach gets only a superficial treatment. But I felt the broad nature of the discussion helped me better understand the options, and it should guide how I apply these techniques to future problems. These topics deserve more study and practice before you can apply them reliably.

Module assessments are 60% of the grade, and you have two tries to answer each question. I found the modules particularly helpful because they come with a detailed step-by-step PDF guide that walks you through Azure ML and the Python or R scripts.

AzureML Dataview

The final challenge is 40% of the grade. It took me 6 hours to complete. It is not overly difficult because they guide you a bit on the project. It is an occasion to synthesize and apply what you've learned in the course. Your goal is to design a predictive model to determine arrival time for airline flights. You are provided with a data set that contains historic flights, airline carrier, routes, time of day, day of the week, etc. Your score is based primarily on how accurately your model predicts the arrival times of 25 flights.
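For a sense of what "apply your model and get scored on accuracy" looks like in practice, here is a rough sketch of a deliberately naive baseline: predict each flight's arrival delay as the historical average for its carrier, then measure the error on held-out flights. Everything here is an assumption for illustration: the column names (`carrier`, `arr_delay`), the numbers, and the use of RMSE as the metric. It is not the challenge data set or the official scoring method.

```python
import math
from collections import defaultdict


def fit_group_means(rows, key, target):
    """Return (mean of target per group, overall mean) from training rows."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r[key]] += r[target]
        counts[r[key]] += 1
    overall = sum(sums.values()) / sum(counts.values())
    means = {k: sums[k] / counts[k] for k in sums}
    return means, overall


def predict(row, means, overall, key):
    """Use the group's mean, falling back to the overall mean for unseen groups."""
    return means.get(row[key], overall)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical historic flights: arrival delay in minutes.
train = [
    {"carrier": "AA", "arr_delay": 10.0},
    {"carrier": "AA", "arr_delay": 20.0},
    {"carrier": "DL", "arr_delay": 0.0},
    {"carrier": "DL", "arr_delay": 4.0},
]
# Hypothetical held-out flights, including a carrier not seen in training.
held_out = [
    {"carrier": "AA", "arr_delay": 18.0},
    {"carrier": "DL", "arr_delay": -1.0},
    {"carrier": "UA", "arr_delay": 8.5},
]

means, overall = fit_group_means(train, "carrier", "arr_delay")
preds = [predict(r, means, overall, "carrier") for r in held_out]
error = rmse([r["arr_delay"] for r in held_out], preds)
print(preds, error)
```

A baseline like this is mainly useful as a yardstick: any model built with routes, time of day, and day of the week should beat it on held-out flights before you trust it.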

This is an altogether informative and enjoyable course, and I highly recommend it.