In this Challenge, you are asked to use the data manipulation and visualization tools you’ve learned to interrogate a dataset. Your write-up will be in the form of an Rmarkdown document and the associated PDF file. This document explains the Challenge, and also serves as a template to get you started with Rmarkdown and ggplot2. The source is here. Open the source file with RStudio. You can edit it and knit it into a PDF file. Use the help provided by RStudio to learn about Rmarkdown.
The following code downloads the data and loads it into the R session.
library(ggplot2)
library(plyr)
library(reshape2)
library(magrittr)
course.url <- "https://kinglab.eeb.lsa.umich.edu/202/data"
datafile <- paste(course.url,"mortality_us.rds",sep='/')
download.file(datafile,destfile="./mortality_us.rds",mode="wb")
readRDS("mortality_us.rds") -> mort
The data are numbers of deaths in the U.S., by year, age group, and cause. In addition, the number of individuals of each age group each year is given. The data come from the U.S. Centers for Disease Control and Prevention (link to data).
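Before plotting, it may help to look at what was actually loaded. The following is a minimal sketch, assuming the download step above succeeded and that mortality_us.rds sits in the working directory. The column names are not listed in this document, so the code prints them rather than assuming them.

```r
# Inspect the mortality data; assumes mortality_us.rds was downloaded above
if (file.exists("mortality_us.rds")) {
  mort <- readRDS("mortality_us.rds")
  str(mort)             # column names and types
  print(head(mort))     # first few rows
  print(summary(mort))  # quick per-column summaries
} else {
  message("mortality_us.rds not found; run the download step first")
}
```

R is case-sensitive, so checking the exact spelling of each column name here can save trouble later.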
.Rmd and .pdf files.
To help you get started, the figure below gives an example. What insights can you glean from it?
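For readers new to ggplot2, here is a small sketch of how a plot of this general kind might be built. It is not the code behind the figure in this document. The data frame toy holds invented numbers (not the CDC data), and the column names year, age, deaths, and pop are hypothetical stand-ins that may not match the columns of mort.

```r
library(ggplot2)

# Toy data with invented numbers (NOT the CDC data); the column names
# year, age, deaths, and pop are hypothetical stand-ins.
toy <- data.frame(
  year   = rep(2000:2004, times = 2),
  age    = rep(c("25-34", "65-74"), each = 5),
  deaths = c(10, 11, 12, 11, 13, 200, 195, 190, 188, 180),
  pop    = c(rep(10000, 5), rep(8000, 5))
)

# Deaths per 1000 individuals, so that age groups of different
# sizes can be compared on one scale
toy$rate <- 1000 * toy$deaths / toy$pop

# One line per age group, showing the rate over time
pl <- ggplot(toy, aes(x = year, y = rate, colour = age)) +
  geom_line() +
  geom_point() +
  labs(x = "year", y = "deaths per 1000", colour = "age group")
print(pl)
```

Dividing deaths by population before plotting is a design choice: raw counts would mostly reflect how many people are in each age group.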
Sometimes you have an error that you would like to display, perhaps because you seek advice in understanding and remediating it. By setting the error chunk option to TRUE, you enable display of the error message without crashing knitr. For example:
ddply(mort,~cause+age,summarize,pop=sum(Pop))
## Error in eval(cols[[col]], .data, parent.frame()): object 'Pop' not found
(The Rmarkdown source shows how this is achieved.)
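As a hedged sketch of typical usage, and not a copy of this document's source: the option can be written as error=TRUE in the header of a single chunk, or it can be set for all subsequent chunks from an R chunk near the top of the file using knitr's opts_chunk object.

```r
library(knitr)

# Apply error = TRUE to every chunk that follows, so that an error
# message is printed in the output instead of halting the knit
opts_chunk$set(error = TRUE)

# Confirm the current default value of the option
opts_chunk$get("error")
```

Setting the option on one chunk keeps unexpected errors elsewhere from passing silently, so the per-chunk form may be the safer choice for a report.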
Important Note: You will turn in each draft of your report via the course Canvas site. Upload both the Rmarkdown document and a PDF version of your report. Choose the filenames according to the following formula: Challenge_X-Y.Z, where X is the challenge problem number (here, 1), Y is the draft number (1, 2, or 3), and Z is the appropriate extension (.Rmd or .pdf).
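The naming formula can be sketched in a few lines of R. The helper challenge_filename is hypothetical, written only to illustrate the pattern; it is not part of the course materials.

```r
# Hypothetical helper that builds a filename of the form Challenge_X-Y.Z
challenge_filename <- function(challenge, draft, ext) {
  # Draft numbers are 1, 2, or 3; extensions are Rmd or pdf
  stopifnot(draft %in% 1:3, ext %in% c("Rmd", "pdf"))
  sprintf("Challenge_%d-%d.%s", challenge, draft, ext)
}

challenge_filename(1, 2, "Rmd")  # "Challenge_1-2.Rmd"
challenge_filename(1, 2, "pdf")  # "Challenge_1-2.pdf"
```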