Homework assignments

These will be assigned sporadically. (Click the reload button on your browser to make sure you are seeing the most recent version of this page.)

1) Graph repair (due Friday, Jan 20)


Find a flawed graph in a publication by your graduate supervisor.

If your supervisor is flawless, pick a graph by someone else in your group or department.
  • Explain what the graph is intending to display.
  • Explain how the graph is flawed in a few sentences.
  • Redraw the graph in R using principles of effective display.
  • Explain how your improvements display the pattern more effectively.
  • Attach your clean R code.
  • Email it to me as a PDF file (not Word, etc.).
Students from the same lab must consult one another so that they don’t choose the same or very similar graphs.
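
As one possible starting point for the redraw step, here is a minimal base R sketch. It assumes the flawed original was a bar chart of group means and uses the built-in ToothGrowth data purely as a stand-in for your supervisor's data; the colours and axis labels are illustrative choices, not requirements.

```r
# Stand-in data: ToothGrowth ships with R and is NOT part of the assignment.
data(ToothGrowth)
ToothGrowth$dose <- factor(ToothGrowth$dose)

# Show every observation instead of hiding the data behind bars of means.
stripchart(len ~ dose, data = ToothGrowth,
           vertical = TRUE, method = "jitter",
           pch = 16, col = "grey40", las = 1,
           xlab = "Dose (mg/day)", ylab = "Tooth length")

# Overlay the group means as short horizontal dashes.
means <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)
points(seq_along(means), means, pch = "-", cex = 3, col = "firebrick")
```

Plotting the raw points with the means on top lets the reader judge both the pattern and the scatter around it.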


2) Fit a linear model (due Tuesday, March 6)

Obtain a data set and analyze it by fitting a linear model in R. The aims of this assignment are (1) to gain experience conducting a statistical analysis of real data and interpreting the results, and (2) to see how reproducible the published results are.

To begin:
  • Search for a publicly available data set, for example on Dryad or in the Ecological Archives, preferably from a study that used R. Google Scholar might help here (search for, e.g., R and linear model and Dryad).
  • Add your name and the reference to the Google Docs, and send me a PDF of the paper plus the data set. NOTE: if you find multiple useful references or data sets, please add them to the docs for others to use: everyone needs a good data set to make the reproducibility project work.
  • Explain the purpose of the study that yielded the data (short paragraph).
  • Explain the specific data set you are using. For example, say where the data are from, give the meaning of the variables, and so on.
  • (Mandatory) State what parameters you want to estimate with these data (1-2 sentences).
  • (Optional) State what you want to test with these data (1-2 sentences).
Rules:
  • Use only 1 response variable.
  • Include at least 1 proper fixed factor, such as an experimental or observational treatment. It can be categorical or numeric.
  • Include at least 1, and no more than 2, additional explanatory variables (random or fixed factors, blocks, covariates, etc).
  • Try to keep the sample size manageable to avoid computational issues (use a subset if the real data set is too large). Contact me if this might be an issue for your data set.
  • Don't use the same paper as someone else: consult the Google Docs and work together so that everyone has a unique data set.
NOTE: the above applies to the assignment you have to hand in. For the reproducibility part, the entire analysis should be redone (even if it has more explanatory variables).
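
If the data set is too large, the following sketch shows two common ways to cut it down in base R. The built-in airquality data stand in for a published data set, and the months, columns, and sample size of 50 are arbitrary illustrative choices. Which approach is appropriate depends on the study design, so treat this as an assumption to check rather than a recipe.

```r
# Stand-in data: airquality ships with R and is NOT part of the assignment.
data(airquality)

# Option 1: keep only the rows and columns needed for the analysis.
aq <- subset(airquality, Month %in% c(7, 8),
             select = c(Ozone, Temp, Month))

# Option 2: draw a random sample of rows; set.seed() makes it reproducible.
set.seed(1)
aq_sample <- airquality[sample(nrow(airquality), 50), ]
```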

Results:
  • Create a graph to visualize the data.
  • Fit a linear model to the data. Write down the model you fit.
  • Interpret the output: Explain the parameter estimates and (optionally) the test results.
  • Create a graph to visualize the model fit to the data. Explain the graph.
  • Address how well the statistical assumptions of your analysis were met, and how you handled violations.
  • State the conclusions reached from your analyses. What did your estimates or tests tell you?
  • Did you manage to reproduce the results? Were all the data needed available?
  • Include your clean R code in an appendix.
  • Send it to me as a single PDF file.
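
The fitting, interpretation, and diagnostic steps listed under Results might look like the sketch below. It is a minimal example under stated assumptions: the built-in ToothGrowth data replace the published data, with len as the single response, supp as the fixed factor, and dose as one additional covariate. The object names (fit, cols, newdat) are made up for illustration.

```r
# Stand-in data: ToothGrowth ships with R and is NOT part of the assignment.
data(ToothGrowth)

# Model: len = b0 + b1 * (supp is VC) + b2 * dose + error
fit <- lm(len ~ supp + dose, data = ToothGrowth)

summary(fit)   # parameter estimates and standard errors
confint(fit)   # 95% confidence intervals for the parameters
anova(fit)     # (optional) tests

# Visualize the model fit: raw data plus one fitted line per supplement.
cols <- c(OJ = "darkorange", VC = "steelblue")
plot(ToothGrowth$dose, ToothGrowth$len,
     col = cols[as.character(ToothGrowth$supp)], pch = 16, las = 1,
     xlab = "Dose (mg/day)", ylab = "Tooth length")
newdat <- expand.grid(supp = levels(ToothGrowth$supp), dose = c(0.5, 2))
newdat$pred <- predict(fit, newdata = newdat)
for (s in levels(ToothGrowth$supp)) {
  with(newdat[newdat$supp == s, ], lines(dose, pred, col = cols[s], lwd = 2))
}
legend("bottomright", legend = names(cols), col = cols, pch = 16, lwd = 2)

# Check assumptions: residuals vs. fitted, normal Q-Q, scale-location, leverage.
par(mfrow = c(2, 2))
plot(fit)
par(mfrow = c(1, 1))
```

The four diagnostic plots give a quick visual check of linearity, normality of residuals, and equal variance; what to do about violations depends on your data.
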
Presentation:
  • The last day of the course will focus on the reproducibility part. Each person will present their paper, approach, and results in a couple of slides, 'elevator speech' style (a 3-minute synopsis of 'who you are, what you do and what you found': useful when you talk to the people you've always wanted to work with).

3) Reproducibility of research (Thursday, April 5)

In the previous assignment, a data set from a published study was re-analysed. Science relies on newly acquired insights being reproducible, but published results cannot always be reproduced. The aim of this assignment is to compare the original study with your own analysis. These may or may not differ.
Each student will present, in a 3-minute talk, the central question of the original study, the hypothesis tested, the 'original' results, the results of their own analysis, and whether or not the results were reproducible.
As the presentation will be very short, you will inevitably have to keep it general. Try to captivate the audience and keep it simple!

General:
  • if multiple analyses for different (sub)questions were conducted, pick one
  • use visual aids
  • you may use as many slides as you want but consider this carefully (more is not always better)
  • there will be time for a couple of questions, be prepared
  • think of ways to get people's attention

The rules:
  • maximum of three minutes
  • at least one graph
  • no lengthy, detailed stats output (however beautiful your analysis is....)