DataBaselines Blog

Our company acquired a data file containing over 15,000 rows and 300 columns. We are trying to identify patterns in the data. Where do we begin evaluating such a large dataset? Would using R be helpful?

Our organization produces member directories every year. They are tedious to create and prone to typos and inaccuracies. How can I use a database to create a member directory or product catalog?

Microsoft Professional Program in Data Science

A database analyst's experience with the Microsoft Professional Program in Data Science. This page is an index of 10 blog entries describing coursework included in this new edX.org curriculum aligned with Microsoft data science tools. This entry concerns the overview course.

DAT213 - Analyzing Big Data with Microsoft R Server

Jul 25, 2017

This course teaches exploratory data analysis skills using the Microsoft R Server implementation known as RevoScaleR. This product is in most ways functionally equivalent to the open source CRAN-R. RevoScaleR offers three significant benefits over its open source counterpart: the ability to run analyses in parallel across different servers, the ability to "chunk" data for evaluation and so bypass the in-memory limitation of R, and the ability to read more directly from data sources like SQL Server, Hadoop, and Spark.
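To make the "chunking" idea concrete, here is a minimal sketch in plain Python rather than RevoScaleR. It is not how Microsoft R Server is implemented; it only illustrates the general technique of computing a statistic over a file a few rows at a time so the whole dataset never has to fit in memory. The function name `chunked_mean`, the `chunk_size` parameter, and the sample data are all hypothetical.

```python
import csv
import io
from itertools import islice


def chunked_mean(lines, column, chunk_size=1000):
    """Mean of one numeric CSV column, holding at most chunk_size rows in memory.

    Hypothetical helper for illustration only; not a RevoScaleR API.
    """
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    while True:
        # Pull the next block of rows; an empty block means end of input.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        # Keep only running totals, then let the chunk be discarded.
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Made-up sample data standing in for a large file on disk.
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n4,40\n")
result = chunked_mean(sample, "value", chunk_size=2)
print(result)
```

With a real file, the same function could be given an open file handle in place of the `io.StringIO` object. Only statistics that can be built up from running totals (counts, sums, means) fit this simple pattern directly.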

DAT209 - Programming with R for Data Science

Jul 10, 2017

Programming with R for Data Science is taught by Anders Stockmarr (on the faculty of the Technical University of Denmark). For US audiences, his accent takes some getting used to. He places emphasis on unexpected syllables and has a unique way of pronouncing many things. I found it helpful to use headphones and to adjust the playback speed of the recordings. It is worth making the effort to understand Dr.

DAT203 - Data Science Essentials

Jun 21, 2017

Data Science Essentials (DAT203) marks the point where we have enough foundation to start forming a bigger picture of data science. To that end, the course provides this definition:

Data Science is the exploration and quantitative analysis of all available structured and unstructured data to develop understanding, extract knowledge, and formulate actionable results.

DAT204 - Intro to R for Data Science

A SQL developer's notes about the Intro to R for Data Science course within the Microsoft Professional Program in Data Science. Great language, great course...