Description

The final project will consist of a poster-length report summarizing an analysis of your choice, coded in R, and presented within a R Markdown document. The topic can be related to your research interests or a separate topic.

Your project does not have to involve big data or complicated analyses. I’m looking for a demonstration of your creativity and programming skills. A small, well designed project is better than a large, incomplete one. Please focus on quality over quantity.

One limitation is that your project must use publicly accessible data. If you want to use your own data, you must make it available on a website (e.g. Figshare or Github) so that others are able to re-run your code. Also check out ROpenSci which has libraries that can pull data from many different sources. You can also download data from publicly accessible servers.

Project Phases

Project Proposal

The project proposal will about 1 page in length and include the following:

  1. Introduction to problem/question
  2. Links to inspiring examples: Include links and screenshots of a few (~3-5) example graphics found on the internet that convey what you want to do. Include a few sentences about why you selected each link.
  3. Proposed data sources: Be as specific as possible.
  4. Proposed methods: List the approach you will use. Don’t simply list packages, tell me what you will do with them.
  5. Expected results: Describe what you want to produce (graphics, analyses, etc.)

Finding good data takes time, and can take longer than the time to tidy your data. This task could easily take 3-6 hours to find the data you need for your project. After you find good data sources make sure to complete the remaining tasks.

Use this template (which is an R Markdown File) for your project proposal. You can see what this file looks like ‘after rendering’ here.

First Draft

The first draft of your project will be assessed by your peers in GitHub. The objectives of the peer evaluation are:

  • Expose you to the work of your peers so you know what others are working on
  • Provide an opportunity to share your knowledge to improve their project

You should use the project website template (or similar) to generate a html version of your project report. If your project requires any data not available in public repositories, you should put it in a folder called /data in your project’s home directory and then import it into R with read.csv('data/filname.csv') or similar so that anyone with a copy of the repository can re-create the HTML output.

Submission Instructions

  1. PREFERRED OPTION: Submit links to your GitHub repository and website.
    • Your project repository looks something like this:
    • Your public website link looks something like this:
    • Your submission screen in UBLearns will look like this:

Evaluation Instructions

Select two repositories and evaluate them according to the instructions and rubric below.

  1. Explore the final projects in the class repositor
  2. Open the repository and check if there have already been two reviews by checking if there are 2 (or more) “Pull Requests”. For example, in the image below, there are 0 pull requests, so this repository would be available for you to review. If there are already 2 pull requests, select another repository.
  3. Go to the github page linked in the assignment and download the repository as a zip file (click on the button).
  4. Unzip the file after it downloads
  5. Open the project or index.Rmd in RStudio and click knit or Build Website in the Build tab in the upper right.

Evaluate the following provide any feedback via pull request. 1) Website 1) Introduction [~ 200 words]: Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way. 2) Data: Script downloads at least one dataset automatically through the internet. This could use a direct download (e.g. download.file()) or an API (anything from ROpenSci). 3) Figure: The HTML file includes at least one figure of the data. 2) Output: The .Rmd produces HTML output with 1) section headers for all the major sections of the paper 2) a draft of the complete introduction.

Be sure to install any required libraries (do not complain if it fails because you don’t have a library installed).

Final Draft

You will have two opportunites to submit your final project and your final grade on the project will be the highest of the two submissions.

Links to your project website will be uploaded to UBLearns at the end of the semester and posted on the course website.

The final project will include and be graded as follows:

  1. Title (<25 words)
  2. Introduction [~ 200 words, 10%]
    • Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way.
  3. Materials and methods [~ 200 words]
    • Narrative (10%): Clear narrative description of the data sources and methods. Includes data from at least two sources that were integrated / merged in R.
    • Code (25%): The code associated with the project is well organized and easy to follow. Demonstrates mastery of R graphics and functions.
    • Data (5%): The underlying data are publicly accessible via the web and downloaded/accessed within the Rmd script. If you want to use your own data, you must make it available on a website (e.g. Figshare) so that others are able to re-run your code.
  4. Results [~200 words, 25%]
    • Tables and figures (maps and other graphics) are carefully planned to convey the results of your analysis. Intense exploration and evidence of many trials and failures. The author looked at the data in many different ways before coming to the final presentation of the data.
  5. Conclusions [~200 words, 5%]
    • Clear summary adequately describing the results and putting them in context. Discussion of further questions and ways to continue investigation.
  6. References [5%]
    • All sources are cited in a consistent manner
  7. General Scores
    • General organization (10%): Clear labels/headings demarcate separate sections. Excellent flow from one section to the next. Tables and graphics carefully tuned and placed for desired purpose.The ‘story’ is very well organized, makes good use of graphics, and is easy to understand.
    • General Grammar (5%): All sentences are well constructed and have varied structure and length. The author makes no errors in grammar, mechanics, and/or spelling.

See the project rubric below for more details and examples.

Note that the word counts are quite short (~200 words per section). This does not mean it’s easy! In fact, conveying all the necessary information succinctly requires extra effort. If English is not your first language, you are encouraged to contact the UB Writing Center to get help writing succinctly and clearly. They schedule 45 minute sessions to go over your writing which can dramatically improve the quality of your project. Plan ahead to schedule this before upcoming deadlines.

The more complete the second draft, the more feedback I’ll be able to provide to ensure an excellent final project. So it’s in your interest to finish as much as possible. In addition to the details from the first draft, I would like to see drafts of the text and figures/tables/etc in each section.

When submitting your your second draft, you can include any questions or comments in the draft (e.g., “I’m planning to do X, but I’m not sure how to organize the data appropriately”) or as a comment in the UBLearns submission webpage. Please do not include these comments in the final submission.

Formatting

The final project will be produced as a RMarkdown Website that includes all the steps necessary to run the analysis and produce the output (figures, tables,etc.). For examples of similar documents, explore the RPubs website.

See the RMarkdown page for ideas on different html output designs. In particular, check out the FlexaDashboard options if you want to include interactive displays.

Figures

Figures (maps and other graphics) are a vital component of scientific communication and you should carefully plan your figures to convey the results of your analysis.

References

You should cite any relevant materials (including data sources and methods) in the text using a standard author-date citation format (e.g. Wilson, 2015) and then described in a References section. You can either compile the references manually (e.g. cutting and pasting the citation into the references section) or use the automated system in RMarkdown explained here. Other citation styles are acceptable as long as they are consistent, complete, and easy to understand.

Resources

Sites with examples of visual display of quantitative information

References

Suggestions for lightening presentations adapted from the Software Sustainability Institute.