In this workshop we will fit linear mixed-effects models to data in R. See the “Fit model” tab of the R tips page for help. You might also need to refer to the R tips “Graphs & Tables” page to help visualize your results.
Linear mixed-effects models are used when you have random effects, which occurs when multiple measurements are made on randomly sampled groups. The measurements from the same natural groups are not independent random samples on their own. Instead, the units or groups are assumed to be randomly sampled from a “population” of groups. Example situations include
Linear models for mixed effects are implemented in the R
lme4
and lmerTest
packages
(lmerTest
includes lme4
plus additional
functions). An alternative option is to use the lme()
method in the nmle
package. The methods used to calculate
approximate degrees of freedom in lme4
are a bit more
accurate than those used in the nmle
package, especially
when sample size is not large.
To begin using methods for fitting linear mixed-effects models,
install (if you have not already done so) and load the
lmerTest
package.
This first data set was extracted from a paper by Griffith and Sheldon (2001, Animal Behaviour 61: 987–993), who measured the white forehead patch of 30 male collared flycatchers in two years on the Swedish island of Gotland. The patch is important in mate attraction, but varies in size from year to year. Our goal here will be to estimate the repeatability of patch length (mm). The data are here.
Read the data from the file.
View the first few lines of data to make sure it was read correctly.
Create a plot showing the pair of measurements for each individual flycatcher in the two years of study. You can try to make the kind of dot plot I showed in lecture, or see the R tips “Graphs & Tables” page for suggestions on how to plot paired data. Is there evidence of measurement variability between years?
Fit a linear mixed-effects model to the data, treating the
individual birds as the random groups and the multiple measurements per
bird as repeated measures.
Note: The two measurements on each bird were taken in successive years
of the study. For simplicity here, do not include year in the model.
(Okay, if you really want to try including year in the model, go ahead.
Just make sure to convert it to a character or factor in R so it is not
treated as a numeric variable. Recalculate repeatability with this model
as described in steps (2) and (3) below. How is the interpretation of
repeatability changed?)
Extract parameter estimates (coefficients) from the saved
lmer()
object (the command is the same one we used with
lm()
to get the coefficients table). Inspect the output for
the random effects. What are the two sources of random variation? What
does the fixed effect refer to?
In the output, examine the standard deviations for the random effects. There should be two standard deviations: one for “(Intercept)” and one for “Residual”. This is because the mixed effects model has two sources of random variation: variation among repeat measurements within birds, and variation among birds in their patch lengths. Which of these two sources corresponds to “(Intercept)” and which to “Residual”?
Also examine the output for the fixed effect results. The only fixed effect in the model formula is the grand mean of all the patch length measurements. It is called “(Intercept)”, but don’t confuse with the intercepts for the random groups. The fixed effect output gives you the estimate of the grand mean and a standard error for that estimate. Notice how the fixed effect output provides estimates of means, whereas the random effects output provides estimates of variances (or standard deviations).
Extract the variance components from the fitted model and estimate the repeatability of patch length from year to year*.
Interpret the measure of repeatability obtained in the previous step. If the repeatability you obtained is less than 1.0, what is the source of the variation among measurements within individuals. Is it measurement error alone?
Produce a plot of residuals against fitted values. Notice anything odd? There sees to be a slightly positive trend. This isn’t a mistake, but results from “shrinkage” of the best linear unbiased predictors (BLUPs). Consult the lecture notes and the “Fit model” tab at the R tips pages (see the repeatability example under “Fit a linear mixed-effects model”) for additional information on what is happening.
* 0.776.
Cronly-Dillon and Muntz (1965; J. Exp. Biol 42: 481-493) used the optomotor response to measure color vision in the goldfish. We looked at a plot of some of these data already in lecture. Here we will fit a model to the data and include the full set of wavelengths tested. Each of 5 fish was tested at all the wavelength in random order. A large value of sensitivity indicates that the fish can detect the given wavelength at very dim light levels. A curious feature of the optomotor response is that fish don’t habituate, and so it is unlikely that a measurement of visual sensitivity under one wavelength would carry over and have an effect on later measurements at another wavelength. The data are here.
Read the data from the file, and view the first few lines to make sure it was read correctly.
Use an interaction plot to compare the responses of individual fish across the different light wavelength treatments.
What type of experimental design was used?* This will determine the linear mixed model to use when fitting the data.
Fit a linear mixed-effects model to the data. R will give you the message: “boundary (singular) fit”. This happens to me sometimes when I try to fit a mixed effects model. The reason will become clearer below. Meantime, ignore the message and proceed cautiously.
Plot the fitted (predicted) values**. The difference between the predicted and observed values for each fish represent the residuals.
What assumptions are you making in (1)? Create a plot of residuals against fitted values to check one of these assumptions.
Extract parameter estimates from the saved lmer()
object. Inspect the results for the fixed effects. The coefficients
given have the same interpretation as in the case of a categorical
variable analyzed using lm()
(arbitrarily, the light
treatment “nm426” is set as the “control”).
Inspect the output for the random effects. Once again we have two
sources of random error in our mixed effects model. What are they? Which
of them corresponds to the (Intercept)
and which to the
Residual
in the output? Notice that the estimated standard
deviation for one of the sources of variation is zero in this data set.
This is the reason behind the “boundary (singular) fit” message. A true
variance of 0 is not biologically realistic, but estimates from samples
can sometimes be 0 when data sets are small by sampling error. See point
8 below for a solution.
Generate the model-based estimates of the mean sensitivities for each wavelength.
Are the differences among wavelengths significant? Generate the
ANOVA table for the lmer()
object. What effects are tested
here, the random effects or the fixed effects?*** Interpret the ANOVA
results.
Estimates of variance can go to zero because of sampling error
when the number of groups is small, causing a “boundary (singular) fit”
error. Various solutions are possible, but an easy one is to use the
blmer()
command in the blme
package instead of
lmer()
. The blmer()
function behaves similarly
to lmer()
but it incorporates a penalty on the variance in
the likelihood that reduces the chances that estimates will go to zero.
To try this you’ll have to install the blme
package. When I
ran it on these data the results were almost identical. The estimated
variance among fish was small but not zero.
*It is a “subjects-by-treatment” repeated measures design, since each fish is measured once under each treatment. It is essentially the same as a randomized complete block design (think of the individual fish as “blocks”).
**visreg()
is preferred, because both the data and the
fitted values are plotted. Note how the predicted values are very
similar between fish. This indicates that there was very little
estimated variance among individual fish in this study.
***Generally, only the fixed effects are tested in an ANOVA table. It
is possible to test the null hypothesis of no variance in a random
effect using lmerTest
, but I’ve yet to think of a
compelling reason why one would ever do this.
The Kluane project experimentally investigated the effects of fertilization and herbivory on vegetation dynamics in the boreal forest ecosystem of Kluane National Park in the Yukon (Krebs, C.J., Boutin, S. & Boonstra, R., eds (2001a) Ecosystem dynamics of the Boreal Forest. The Kluane Project. Oxford University Press, New York). The current data are from a study of the effects of plant resources and herbivory on the defensive chemistry of understory plant species.
Each of sixteen 5x5 m plots was randomly assigned one of four treatments: 1) surrounded by a fence exclosure to exclude herbivores; 2) fertilized with N-P-K fertilizer; 3) fenced and fertilized; and 4) untreated control. Each of the 16 plots was then divided in two. One side of each plot (randomly chosen) received the treatment continually over the 20 years of study. The other half of each plot received the treatment for the first ten years, after which it was left to revert to its untreated state. The data to be analyzed here record the concentration of phenolics (a crude measure of plant defense compounds) in yarrow (Achillea millefolium), a herb common in the plots. The measurement units are mg Tannic Acid Equivalents per g dry weight. The data were kindly provided by P. de Koning and R. Turkington. The data are here.
Read the data from the file.
Inspect the first few lines of data. Plot and treatment are
self-explanatory. Treatment is given as a single variable with four
levels (let’s stick with this approach rather than model as two
variables, enclosure and fertilizer, with a 2x2 factorial design).
Duration indicates whether the half-plots received the treatment for the
full 20 years or whether the treatment was stopped (“reversed”) after 10
years. The variable phen.ach
is the concentration of
phenolics in yarrow.
Draw a graph to illustrate the concentrations of phenolics in yarrow in the different treatment and duration categories. There aren’t many data points in each combination of treatment and duration levels, so a strip chart by groups is probably a better choice than a box plot by groups (see “Graphs & Tables” page of R tips). Can you create a plot that indicates the pairing between plot-halves?
What type of experimental design was used?* This will determine the linear mixed model to fit to the data.
Fit a linear mixed model to the data without an interaction between treatment and duration. Use the log of phenolics as the response variable, as the log-transformation improved the fit of the data to linear model assumptions. For our purposes here, ignore the error message “boundary (singular) fit: see help(‘isSingular’)”.
Visualize the model fit to the data. Use the by =
argument with visreg()
to separate panels by duration (if
xvar
is treatment) or treatment (if xvar
is
duration). visreg()
won’t preserve the pairing, but will
allow you to inspect residuals.
Now repeat the model fitting, but this time include the interaction between treatment and duration. Visualize the model fit to the data. What is the most noticeable difference between the two model fits, one with the interaction and the other without? Describe what including the interaction term “allows” that the model without an interaction term does not. Judging by eye, which model appears to fit the data best?
Use the diagnostic plot to check a key assumption of linear mixed models for the model including the interaction term.
Estimate the parameters of the linear model (including
interaction) using the fitted model object. Notice that there are now
many coefficients in the table of fixed effects. In principle, these can
be understood by examining how R models terms behind the scenes, but the
task is made more challenging with two factors and an interaction. It
might be more useful to use emmeans()
instead to obtain
model fitted means, which are more readily interpretable.
In the output from the previous step, you will see two quantities given for “Std.Dev” under the label “Random effects”. Explain what these quantities refer to.
Use emmeans()
to estimate the model fitted means for
all combinations of the fixed effects.
Generate the ANOVA table for the fixed effects. Which terms were statistically significant?
Notice that, by default, lmerTest
will test model
terms using Type 3 sums of squares (the “drop one” approach) rather than
sequentially (Type 1), which is the default in lm()
. (Why
can’t statisticians all get together and come to a consensus?) Repeat
the ANOVA table using Type 1 instead. Are the results any
different?**
*The experiment used a split-plot design, in which whole plots were randomly assigned different treatments, and then different levels of a second treatment (duration) were assigned to plot halves.
**There should be no difference because the design is completely balanced.
© 2009-2025 Dolph Schluter