John Pearson

A minimal curriculum for learning R

26 November 2014

I like R. At least, well enough. I find the common lore to be true: the language is inconsistent and crufty, objects are bizarre, and the data structures can be hard to work with. But the package system is excellent, with significantly better support for advanced statistical methods and the analysis of categorical data than I’ve found elsewhere. It’s clear that several of my favorite Python packages, pandas and statsmodels in particular, deliberately borrow from the best R has to offer, and ggplot2 produces, in my opinion, the best-looking off-the-shelf plots available. So, collected here for the benefit of my friends learning R, is my shortlist of recommendations.

None of this will teach you what analyses to run on your data, and it’s light on the nuts and bolts of actual statistical analysis. But it will give you a better toolchain with which to clean, organize, and massage your data, which, if the pundits are to be believed, is ~80% of the work of being a data scientist.

Once you get this far, I think it’s easy enough to download whatever package is relevant to your analysis of choice (and surely there is one), get your data in order, and run it. As always, doing the real work is up to you.