(Builds on: Exploratory data analysis (1D))
(Leads to: dplyr and databases)
An analysis rarely needs only a single table of data. Far more often you will need to combine multiple sources of information. Interconnected datasets are often called relational because the relationships between the datasets matter, not just the individual datasets.
Here you’ll first learn about the keys that define the relationship. You’ll then learn about mutating joins, so called because their primary impact is to add new columns, like a mutate(). It’s also useful to learn about the filtering joins, semi_join() and anti_join(), which work primarily like a filter(), restricting the rows.
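As a minimal sketch of the difference between the two kinds of join, the example below uses two small made-up tables. The tables flights and airlines, and their columns, are hypothetical toy data for illustration and are not the nycflights13 tables. It assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy tables (assumed for illustration, not nycflights13 data).
# Each flight refers to an airline through the carrier code.
flights <- tibble(
  flight  = c(1, 2, 3, 4),
  carrier = c("AA", "UA", "AA", "ZZ")
)
airlines <- tibble(
  carrier = c("AA", "UA", "DL"),
  name    = c("American", "United", "Delta")
)

# Key check: carrier should identify each row of airlines uniquely,
# so counting by carrier should find no value that occurs more than once.
dup_keys <- airlines %>% count(carrier) %>% filter(n > 1)

# Mutating join: keeps every row of flights and adds the name column.
# Flights whose carrier has no match in airlines get NA for name.
with_names <- flights %>% left_join(airlines, by = "carrier")

# Filtering joins: no columns are added, only the rows of flights change.
# semi_join() keeps flights that have a match in airlines.
matched <- flights %>% semi_join(airlines, by = "carrier")
# anti_join() keeps flights that have no match in airlines.
unmatched <- flights %>% anti_join(airlines, by = "carrier")
```

Here with_names has the same rows as flights plus one extra column, while matched and unmatched have the same columns as flights but fewer rows. An anti_join() like this one is a common way to spot key values that are missing from the other table.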
Introduction [r4ds-13.1]
nycflights13 [r4ds-13.2]
Keys [r4ds-13.3]
Mutating joins [r4ds-13.4]
Filtering joins [r4ds-13.5]