Homework assignments
These will be assigned sporadically. (Click the reload button on your browser to make sure you are seeing the most recent version of this page.)
1) Graph repair (due Friday, Jan 20)
Find a flawed graph in a publication by your graduate supervisor.
If your supervisor is flawless, pick a graph by someone else in your group or department.
- Explain what the graph is intending to display.
- Explain how the graph is flawed in a few sentences.
- Redraw the graph in R using principles of effective display.
- Explain how your improvements display the pattern more effectively.
- Attach your clean R code.
- Email it to me as a PDF file (not Word, etc.).
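As a rough illustration of the redrawing step, the sketch below replaces a typical bar chart of group means with a plot that shows every observation plus the group means. It uses `ToothGrowth`, a data set that ships with R, purely as a stand-in for the published data; the output file name is hypothetical, and the design choices are one option among many.

```r
# Stand-in data: ToothGrowth ships with R (assumption: your own data will differ).
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Hypothetical output file; a PDF is easy to attach to the assignment.
out_file <- file.path(tempdir(), "graph_repair_sketch.pdf")
pdf(out_file, width = 5, height = 4)

# Show the raw observations instead of hiding them behind bars.
stripchart(len ~ dose, data = tg, vertical = TRUE, method = "jitter",
           pch = 16, col = "grey40",
           xlab = "Dose (mg/day)", ylab = "Tooth length")

# Overlay the group means as short horizontal segments.
group_means <- tapply(tg$len, tg$dose, mean)
x_pos <- seq_along(group_means)
segments(x_pos - 0.2, group_means, x_pos + 0.2, group_means, lwd = 2)

invisible(dev.off())
```

Plotting the individual points makes sample size and spread visible, which a bar of means alone would hide.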
2) Fit a linear model (due Tuesday, March 6)
Obtain a data set and analyze it by fitting a linear model in R. The aims of this assignment are 1) to gain experience with conducting a statistical analysis using real data and interpreting the results, and 2) to see how reproducible the (published) results are.
To begin:
- Search for a publicly available data set, for example on Dryad or the Ecological Archives, preferably from a study that was analyzed in R. Google Scholar might help here (search for e.g. R and linear model and Dryad).
- Add your name and the reference to the Google Docs and send a pdf of the paper plus the data set to me. NOTE: if you find multiple useful references/data sets, please add them to the docs for others to use: everyone needs a good data set to make the reproducibility project work.
- Explain the purpose of the study that yielded the data (short paragraph).
- Explain the specific data set you are using. For example, say where the data are from, give the meaning of the variables, and so on.
- (Mandatory) State what parameters you want to estimate with these data (1-2 sentences).
- (Optional) State what you want to test with these data (1-2 sentences).
- Use only 1 response variable.
- Include at least 1 proper fixed factor, such as an experimental or observational treatment. Can be categorical or numeric.
- Include at least 1, and no more than 2, additional explanatory variables (random or fixed factors, blocks, covariates, etc).
- Try to keep the sample size manageable to avoid computational issues (use a subset if the real data set is too large). Contact me if this might be an issue for your data set.
- Don't use the same paper as someone else: consult the Google Docs and work together so that everyone has a unique data set.
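If the data set turns out to be too large, taking a subset might look roughly like the sketch below. The data frame here is simulated and every name in it (`big`, `site`, `treatment`, `biomass`) is hypothetical; it only stands in for a large published data set.

```r
# Simulated stand-in for a large published data set (all names are hypothetical).
set.seed(42)
big <- data.frame(
  site      = factor(sample(c("A", "B", "C"), 1e5, replace = TRUE)),
  treatment = factor(sample(c("control", "warmed"), 1e5, replace = TRUE)),
  biomass   = rnorm(1e5, mean = 10, sd = 2)
)

# Option 1: keep only the rows relevant to your question.
one_site <- subset(big, site == "A")

# Option 2: draw a random sample of rows; the seed above keeps it reproducible.
keep  <- sample(nrow(big), size = 500)
small <- big[keep, ]

print(nrow(small))
```

Either way, analyzing a subset means your estimates need not match the published ones exactly, so state in your report which rows you kept and why.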
Results:
- Create a graph to visualize the data.
- Fit a linear model to the data. Write down the model you fit.
- Interpret the output: Explain the parameter estimates and (optionally) the test results.
- Create a graph to visualize the model fit to the data. Explain the graph.
- Address how well the statistical assumptions of your analysis were met, and how you handled violations.
- State the conclusions reached from your analyses. What
did your estimates or tests tell you?
- Did you manage to reproduce the results? Were all the
data needed available?
- Include your clean R code in an appendix.
- Send to me in a single pdf file.
- The last day of the course will focus on the reproducibility part. Each person will present their paper, approach and results in a couple of slides, 'elevator speech' style (a 3-minute synopsis of 'who you are, what you do and what you found', which is useful when you talk to the people you've always wanted to work with).
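The analysis steps listed under Results could be sketched in R roughly as follows. This is a minimal example, not a template for your report: it uses `ToothGrowth`, a data set that ships with R, as a stand-in for your published data, and the model and output file name are assumptions chosen only to match the requirements (one response, one fixed factor, one additional explanatory variable).

```r
# Stand-in data: ToothGrowth ships with R (assumption: replace with your own data set).
tg <- ToothGrowth

# One response (len), one fixed factor (supp) and one additional covariate (dose).
fit <- lm(len ~ supp + dose, data = tg)

print(summary(fit))   # parameter estimates, standard errors, t-tests
print(confint(fit))   # 95% confidence intervals for the parameters
print(anova(fit))     # optional: sequential F-tests for each term

# Hypothetical output file holding the model-fit graph and the diagnostic plots.
out_file <- file.path(tempdir(), "linear_model_sketch.pdf")
pdf(out_file, width = 6, height = 6)

# Data with the fitted line for each supplement type.
plot(len ~ dose, data = tg, pch = ifelse(tg$supp == "OJ", 16, 1),
     xlab = "Dose (mg/day)", ylab = "Tooth length")
dose_grid <- seq(min(tg$dose), max(tg$dose), length.out = 50)
for (s in levels(tg$supp)) {
  new_data <- data.frame(supp = factor(s, levels = levels(tg$supp)),
                         dose = dose_grid)
  lines(dose_grid, predict(fit, newdata = new_data),
        lty = ifelse(s == "OJ", 1, 2))
}
legend("bottomright", legend = levels(tg$supp), pch = c(16, 1), lty = c(1, 2))

# Standard diagnostic plots for checking the model assumptions.
par(mfrow = c(2, 2))
plot(fit)

invisible(dev.off())
```

The model written down here is len = b0 + b1 * (supp is VC) + b2 * dose + error, so `suppVC` estimates the difference between supplement types at a given dose and `dose` estimates the common slope. The four diagnostic panels give a first look at linearity, normality of residuals, constant variance and influential points.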
3) Reproducibility of research (Thursday, April 5)
In the previous assignment, a data set from a published study was re-analyzed. Although science relies on the reproducibility of newly acquired insights, results are not always reproducible. The aim of this assignment is to compare the original study with your own analysis. These may or may not differ. Each student will present, in a 3-minute talk, the central question of the original study, the hypothesis tested, the 'original' results, the results of their own analysis and whether or not the results were reproducible.
As the presentation will be very short, you will inevitably have to keep it general. Try to captivate the audience and keep it simple!
General:
- if multiple analyses for different (sub)questions were conducted, pick one
- use visual aids
- you may use as many slides as you want but consider this carefully (more is not always better)
- there will be time for a couple of questions, be prepared
- think of ways to get people's attention
The rules:
- maximum of three minutes
- at least one graph
- no lengthy, detailed stats output (however beautiful your analysis is...)