
DAT204 - Intro to R for Data Science

As a developer, I'm drawn to terse, concise languages that are purpose-built for an objective. Regular expressions are a prime example. There is something beautiful about expressing things in few words (something I try to do in blogging, with only partial success!). In this context, I was eagerly anticipating "Intro to R for Data Science." This course (and this language) did not disappoint.

Before going further, I should note something: within the Microsoft Professional Program (MPP) for Data Science, it is at the student's discretion to take either the Python track or the R track. Your decision will be shaped by whether you have prior familiarity with one of those environments, and whether you want to reinforce what you already know or venture into a new tool. Your choice here also dictates which second, "advanced" course you will take later in the MPP track.

Intro to R for Data Science

Wiki Notes

The course lays the foundation with data types, vectors, matrices, factors, lists, and data frames. It then briefly describes the graphics capabilities of R. These fundamentals of the language are covered early, and the presenter does a great job of walking through the topics progressively.
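For readers who have not seen R before, here is a minimal sketch of those building blocks. It is my own illustration rather than material from the course, and every name and value in it (scores, sizes, student, df) is made up for the example.

```r
# Atomic vector: every element shares one type (numeric here)
scores <- c(88, 92, 75)

# Matrix: a vector with two dimensions, filled column by column by default
m <- matrix(1:6, nrow = 2)

# Factor: categorical data stored as integer codes plus a set of levels
sizes <- factor(c("small", "large", "small"), levels = c("small", "large"))

# List: elements may differ in type and length
student <- list(name = "Ann", scores = scores)

# Data frame: a list of equal-length columns, one row per observation
df <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  score = scores,
  size = sizes,
  stringsAsFactors = FALSE  # keep the name column as plain character strings
)

str(df)          # compact summary of each column's type
mean(df$score)   # `$` extracts a column as a vector
```

Running this in RStudio or any R console and then inspecting each object with str() is a quick way to see how the types differ.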

To get the most out of each of these courses, I write detailed notes in our Atlassian Confluence wiki, including relevant screenshots from the videos. Provided below is an example of notes taken about R data frames. Taking these courses in succession, as I have done, demands this sort of note-taking. The Confluence wiki is my fully searchable second brain for when I apply these skills down the road. Given the volume of information coming at you in these classes, I'd strongly recommend a comparable note-taking habit.

For this particular course, the author provides 16 PDF handouts with the slides for each video. This is the first class in the MPP sequence to offer them, and they are a useful resource for those taking the course.

The course presenter "Philip" is undoubtedly the best presenter to this point in the MPP sequence. He is enthusiastic and works from a carefully written and delivered script. The class is composed of 16 videos with a total running time of about 100 minutes.

DataCamp Instructor

The course uses a DataCamp learning environment, an online IDE simulator that walks you through guided labs to test your comprehension. It is simple to learn and worked without a glitch. Though it is not mandatory, I'd recommend installing RStudio to experiment with the language offline.

A reasonable number of concepts are presented. In total, the course took me 20 hours to complete.