In this Challenge, you will use the data manipulation and visualization tools you’ve learned to interrogate a dataset. Your write-up will take the form of an R Markdown document and the associated PDF file. This document explains the Challenge and also serves as a template to get you started with R Markdown and ggplot2. The source is here. Open the source file in RStudio, where you can edit it and knit it into a PDF file. Use the help provided by RStudio to learn about R Markdown.

U.S. mortality data

The following code downloads the data and loads them into the R session.

library(ggplot2)
library(plyr)
library(reshape2)
library(magrittr)

course.url <- "https://kinglab.eeb.lsa.umich.edu/202/data"
datafile <- paste(course.url,"mortality_us.rds",sep='/')
download.file(datafile,destfile="./mortality_us.rds",mode="wb")
mort <- readRDS("mortality_us.rds")

The data are numbers of deaths in the U.S., by year, age group, and cause. In addition, the number of individuals in each age group in each year is given. The data come from the U.S. Centers for Disease Control and Prevention (link to data).

Specific instructions

  1. Formulate an interesting question that you might try to answer with these data.
  2. State your question as clearly as possible.
  3. Design and execute a visualization to shed light on the question.
  4. Explain your reasoning and your calculation carefully.
  5. Interpret your visualization in terms of your question.
  6. Upload your write-up to the course Canvas site. Include both .Rmd and .pdf files.
  7. After you’ve uploaded your write-up, peer-review another student’s write-up.

To help you get started, the figure below gives an example. What insights can you glean from it?
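
Separately from that example, here is a minimal sketch of how steps 3 and 4 might look: compute a death rate per 100,000 people, then plot it over time. It runs on a small made-up data frame, `toy`, rather than on `mort`. The column names `year`, `age`, `cause`, `deaths`, and `pop`, and all of the numbers, are assumptions for illustration only, so check `names(mort)` and `str(mort)` before adapting it.

```r
library(ggplot2)

# Hypothetical stand-in for `mort`: column names and values are invented
# for illustration and may not match the real dataset.
toy <- expand.grid(
  year = 2000:2004,
  age = c("45-54", "55-64"),
  cause = c("cause A", "cause B"),
  stringsAsFactors = FALSE
)
toy$pop <- ifelse(toy$age == "45-54", 40e6, 30e6)
base_rate <- ifelse(toy$cause == "cause A", 1e-4, 2e-4)
toy$deaths <- round(toy$pop * base_rate * (1 + 0.05 * (toy$year - 2000)))

# Deaths per 100,000 people, so that age groups of different sizes
# can be compared on the same scale.
toy$rate <- 1e5 * toy$deaths / toy$pop

# One line per cause, one panel per age group.
p <- ggplot(toy, aes(x = year, y = rate, colour = cause)) +
  geom_line() +
  facet_wrap(~age) +
  labs(x = "year", y = "deaths per 100,000", colour = "cause")
print(p)
```

Plotting a rate rather than a raw count is a design choice: raw counts mostly reflect how many people are in each age group, which may or may not be what your question is about.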

Dealing with errors

Sometimes you have an error that you would like to display, perhaps because you seek advice in understanding and remediating it. Setting the chunk option error=TRUE displays the error message in the knitted output instead of stopping knitr. For example:

ddply(mort,~cause+age,summarize,pop=sum(Pop))
## Error in eval(cols[[col]], .data, parent.frame()): object 'Pop' not found

(The R Markdown source shows how this is achieved.)
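
For a single chunk, the option goes in the chunk header, as in `{r error=TRUE}`. To apply it to every chunk at once, one possibility is to call knitr's `opts_chunk$set` from a setup chunk near the top of the document. The sketch below is an assumption about how you might set this up, not a copy of the template's source, and it uses a made-up data frame `toy` in place of `mort`. It also shows a quick check that often helps with "object not found" errors, since R names are case-sensitive.

```r
library(knitr)

# Show error messages in the knitted output instead of halting the knit.
opts_chunk$set(error = TRUE)

# Hypothetical stand-in for `mort`; the real column names may differ.
toy <- data.frame(age = c("45-54", "55-64"), pop = c(40e6, 30e6))

# `Pop` and `pop` are different names in R. Listing the columns is a
# quick first check when a variable is reported as "not found".
names(toy)
"Pop" %in% names(toy)  # FALSE: no column with this capitalization
"pop" %in% names(toy)  # TRUE
```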


Important Note: You will turn in each draft of your report via the course Canvas site. Upload both the R Markdown document and a PDF version of your report. Name the files according to the pattern Challenge_X-Y.Z, where X is the challenge problem number (here, 1), Y is the draft number (1, 2, or 3), and Z is the appropriate extension (Rmd or pdf). For example, the first draft of this Challenge would be submitted as Challenge_1-1.Rmd and Challenge_1-1.pdf.