The final project will consist of a poster-length report summarizing an analysis of your choice, coded in R, and presented within an R Markdown document. The topic can be related to your research interests or a separate topic.
Your project does not have to involve big data or complicated analyses. I’m looking for a demonstration of your creativity and programming skills. A small, well-designed project is better than a large, incomplete one. Please focus on quality over quantity.
One limitation is that your project must use publicly accessible data. If you want to use your own data, you must make it available on a website (e.g. Figshare or GitHub) so that others are able to re-run your code. Also check out rOpenSci, which has libraries that can pull data from many different sources. You can also download data from publicly accessible servers.
The project proposal will be about 1 page in length and include the following:
Finding good data takes time, and can take longer than tidying it. Expect to spend 3-6 hours finding the data you need for your project. After you find good data sources, make sure to complete the remaining tasks.
Use this template (which is an R Markdown File) for your project proposal. You can see what this file looks like ‘after rendering’ here.
The first draft of your project will be assessed by your peers in GitHub. The objectives of the peer evaluation are:
You should use the project website template (or similar) to generate an HTML version of your project report. If your project requires any data not available in public repositories, you should put it in a folder called /data in your project’s home directory and then import it into R with read.csv('data/filename.csv') or similar so that anyone with a copy of the repository can re-create the HTML output.
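As a minimal sketch of the import step described above, the helper below reads a CSV from the /data folder using a path relative to the project's home directory. The function name read_project_data and the file name are illustrative assumptions, not part of the project template.

```r
# Hypothetical helper: read a CSV stored in the project's data folder.
# The default file name is a placeholder; substitute your own file.
read_project_data <- function(path = file.path("data", "filename.csv")) {
  # Fail early with a helpful message if the relative path cannot be resolved.
  if (!file.exists(path)) {
    stop("Cannot find '", path,
         "'. Is the working directory the project's home directory?")
  }
  read.csv(path)
}
```

Using a relative path (rather than something like C:/Users/...) is what lets the same code run from any copy of the repository.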
Select two repositories and evaluate them according to the instructions and rubric below.
Open index.Rmd in RStudio and click Knit or Build Website in the Build tab in the upper right.
Evaluate the following and provide any feedback via pull request.
1) Website
   1) Introduction [~200 words]: Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way.
   2) Data: Script downloads at least one dataset automatically through the internet. This could use a direct download (e.g. download.file()) or an API (anything from rOpenSci).
   3) Figure: The HTML file includes at least one figure of the data.
2) Output: The .Rmd produces HTML output with
   1) section headers for all the major sections of the paper
   2) a draft of the complete introduction.
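One possible way to meet the Data and Figure items is sketched below with base R only. The helper fetch_data, the URL, and the file names are assumptions made for illustration; replace them with your own public data source.

```r
# Hypothetical helper: download a file once and reuse the local copy afterwards.
fetch_data <- function(url, dest) {
  # Create the destination folder (e.g. data/) if it does not exist yet.
  dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)
  # Skip the download when a local copy is already present.
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Placeholder URL (an assumption): replace with the address of your public dataset.
data_url <- "https://example.com/path/to/dataset.csv"

# Not run here because the URL above is only a placeholder:
# csv_path <- fetch_data(data_url, file.path("data", "dataset.csv"))
# dat <- read.csv(csv_path)
# plot(dat[[1]], dat[[2]], xlab = names(dat)[1], ylab = names(dat)[2])
```

Caching the download keeps repeated knitting fast while still letting a reviewer with a fresh copy of the repository fetch the data automatically.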
Be sure to install any required libraries (do not complain if it fails because you don’t have a library installed).
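A short sketch of one way to check for missing libraries before knitting a peer's project follows. The helper name missing_packages and the package names in the commented usage are assumptions; use the packages the project actually loads.

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example usage (package names are placeholders; not run here):
# to_install <- missing_packages(c("ggplot2", "dplyr"))
# if (length(to_install) > 0) install.packages(to_install)
```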
You will have two opportunities to submit your final project, and your final grade on the project will be the higher of the two submissions.
Links to your project website will be uploaded to UBLearns at the end of the semester and posted on the course website.
The final project will include and be graded as follows:
See the project rubric below for more details and examples.
Note that the word counts are quite short (~200 words per section). This does not mean it’s easy! In fact, conveying all the necessary information succinctly requires extra effort. If English is not your first language, you are encouraged to contact the UB Writing Center to get help writing succinctly and clearly. They schedule 45-minute sessions to go over your writing, which can dramatically improve the quality of your project. Plan ahead to schedule this before upcoming deadlines.
The more complete the second draft, the more feedback I’ll be able to provide to ensure an excellent final project. So it’s in your interest to finish as much as possible. In addition to the details from the first draft, I would like to see drafts of the text and figures/tables/etc in each section.
When submitting your second draft, you can include any questions or comments in the draft (e.g., “I’m planning to do X, but I’m not sure how to organize the data appropriately”) or as a comment in the UBLearns submission webpage. Please do not include these comments in the final submission.
The final project will be produced as an RMarkdown website that includes all the steps necessary to run the analysis and produce the output (figures, tables, etc.). For examples of similar documents, explore the RPubs website.
See the RMarkdown page for ideas on different HTML output designs. In particular, check out the flexdashboard options if you want to include interactive displays.
Figures (maps and other graphics) are a vital component of scientific communication and you should carefully plan your figures to convey the results of your analysis.
You should cite any relevant materials (including data sources and methods) in the text using a standard author-date citation format (e.g. Wilson, 2015) and then list them in a References section. You can either compile the references manually (e.g. cutting and pasting the citation into the references section) or use the automated system in RMarkdown explained here. Other citation styles are acceptable as long as they are consistent, complete, and easy to understand.
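If the software you used is among the materials you cite, base R can generate those citations for you. The sketch below uses the built-in citation() and toBibtex() functions; the package "stats" and the temporary output file are stand-ins chosen for illustration.

```r
# Citation for R itself, printed as plain text for manual copying.
print(citation(), style = "text")

# BibTeX entry for a package; "stats" ships with R and stands in for your own packages.
bib <- toBibtex(citation("stats"))

# Write the entry to a temporary .bib file (use a file in your project in practice).
bib_file <- tempfile(fileext = ".bib")
writeLines(bib, bib_file)
```

The plain-text form suits a manually compiled References section, while the BibTeX form suits an automated bibliography.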
Sites with examples of visual display of quantitative information
Suggestions for lightning presentations adapted from the Software Sustainability Institute.