Supplemental readings

Tidying computational biology models with biobroom

by David Robinson

http://varianceexplained.org/r/tidy-genomics-biobroom/

Getting Started with R Markdown

https://www.rstudio.com/resources/webinars/getting-started-with-r-markdown/

This 60-minute webinar introduces you to R Markdown. It covers content similar to the readings, in video form.

Expert data wrangling with R

by Garrett Grolemund

http://proquest.safaribooksonline.com.ezproxy.stanford.edu/video/programming/r/9781491917046

A video introduction to dplyr.

Happy Git and GitHub for the useR

http://happygitwithr.com

Refer to the first two parts of this book (“Installation” and “Connect Git, GitHub, RStudio”) if you have problems getting set up with Git and GitHub.

Secrets of a happy graphing life

http://stat545.com/block016_secrets-happy-graphing.html

Len Kiefer’s blog

http://lenkiefer.com

Len Kiefer is Deputy Chief Economist at Freddie Mac and posts many interesting analyses of economic data.

Non-tidy data

http://simplystatistics.org/2016/02/17/non-tidy-data/

An example, by Jeff Leek, of when tidy data isn’t as useful as a non-tidy structure.

naniar

by Nicholas Tierney

https://github.com/njtierney/naniar

This is a ggplot2 extension that adds comprehensive tools for visualising missing data.

How humans see data

by John Rauser

https://youtu.be/fSgEeI2Xpdc

Slides at http://www.slideshare.net/JohnRauser/how-humans-see-data

RMarkdown homepage

http://rmarkdown.rstudio.com/

The official RStudio source of RMarkdown documentation. We’re only scratching the surface of RMarkdown’s capabilities in this course; read here to unlock more of its power.

A Tale of Twenty-Two Million Citi Bike Rides: Analyzing the NYC Bike Share System

by Todd W. Schneider

http://toddwschneider.com/posts/a-tale-of-twenty-two-million-citi-bikes-analyzing-the-nyc-bike-share-system/

This analysis uses R (along with other tools) to analyse over 20 million bike rides in NYC. As you read through the analysis, think about how you would do it yourself using the techniques you have learned so far.

Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance

by Todd W. Schneider

http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/

This analysis uses much more data than the previous analysis. What additional challenges did Todd need to solve to allow him to work with this much data?

RStudio conference purrr tutorial

by Charlotte Wickham

https://github.com/cwickham/purrr-tutorial

This purrr tutorial doesn’t assume any knowledge of functions, providing an approachable introduction.

Tidy Data

https://www.jstatsoft.org/v59/i10/

This article lays out more of the details behind tidy data, and works through a few more case studies. However, it uses a previous iteration of the tools, using reshape2 instead of tidyr. To see the code translated into tidyr, read the tidy data vignette.

Data Visualisation in R with ggplot2

by Kara Woo

http://proquest.safaribooksonline.com.ezproxy.stanford.edu/video/programming/r/9781491963661

A video introduction to ggplot2.