Data Workflow + Tidy data + Wrangling

MP223 - Applied Econometrics Methods for the Social Sciences

Eduard Bukin

R setup

library(tidyverse)       # for data wrangling

# set default theme and larger font size for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 16))

# set default figure parameters for knitr
knitr::opts_chunk$set(
  fig.width = 8,
  fig.asp = 0.618,
  fig.retina = 3,
  dpi = 300,
  out.width = "80%"
)

Introduction

Data analysis workflow

Tidy data (1/4)

Tidy data (2/4) wide format

Tidy data (3/4) long format

Tidy data (4/4) transformation
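
As a minimal sketch of moving between the wide and long formats, the example below uses `tidyr::pivot_longer()` and `tidyr::pivot_wider()`. The data set `gdp_wide`, its country labels, and its values are hypothetical and exist only for illustration; they are not taken from the course materials.

```r
library(tidyverse)

# Hypothetical wide-format data: one row per country, one column per year.
gdp_wide <- tribble(
  ~country, ~`2019`, ~`2020`,
  "A",      100,     110,
  "B",      200,     190
)

# Wide -> long: one row per country-year pair.
gdp_long <- gdp_wide %>%
  pivot_longer(
    cols = -country,        # every column except `country` holds values
    names_to = "year",      # old column names go into `year`
    values_to = "gdp"       # cell values go into `gdp`
  )

# Long -> wide: spread the years back out into columns.
gdp_wide_again <- gdp_long %>%
  pivot_wider(names_from = year, values_from = gdp)

print(gdp_long)
print(gdp_wide_again)
```

In the long table each variable (`country`, `year`, `gdp`) has its own column and each observation its own row, which is the layout most tidyverse functions expect.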

Wrangling
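
One possible illustration of wrangling is a short chain of `dplyr` verbs. The sketch below assumes a hypothetical long-format table `gdp_long` with made-up values; the column names and the chosen steps (filter, sort, derive a variable, summarise) are illustrative assumptions, not the course's prescribed workflow.

```r
library(tidyverse)

# Hypothetical long-format data: one row per country-year pair.
gdp_long <- tribble(
  ~country, ~year, ~gdp,
  "A",      2019,  100,
  "A",      2020,  110,
  "B",      2019,  200,
  "B",      2020,  190
)

gdp_summary <- gdp_long %>%
  filter(year >= 2019) %>%                    # keep the rows of interest
  arrange(country, year) %>%                  # sort so that lag() follows time order
  group_by(country) %>%                       # compute within each country
  mutate(growth = gdp / lag(gdp) - 1) %>%     # year-on-year growth rate
  summarise(
    mean_gdp = mean(gdp),                     # average GDP per country
    last_growth = last(growth),               # growth in the most recent year
    .groups = "drop"
  )

print(gdp_summary)
```

Each verb takes a data frame and returns a data frame, which is why the steps can be chained with the pipe.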

Tidy data and Wrangling: See also

R4DS: R for Data Science, by Hadley Wickham and Garrett Grolemund (book’s source code) (Wickham and Grolemund 2017)

Takeaways

  • Tidy data: Wide + long formats

  • Data analysis workflow

  • Learning materials

References

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science. O’Reilly Media. http://r4ds.had.co.nz/.