This document discusses linear mixed models and the estimation methods BLUP and BLUE. It provides an introduction to random and fixed effects, as well as the mixed model equations used to derive BLUP and BLUE simultaneously. BLUP provides the best linear unbiased predictions of random effects, while BLUE gives the best linear unbiased estimates of fixed effects. The document also provides an example using the orthodontic growth data set to demonstrate fitting a linear mixed model and estimating variance components with REML.
BLUP (Best Linear Unbiased Predictors)
Estimation in Generalized Mixed Models
Lim, Kyuson
Department of Mathematics and Statistics
McMaster University
November 29th, 2021
Kyuson Lim 1 / 16
Outline
1 BLUP and BLUE in Mixed Model
2 Bibliography
Introduction to linear model
In a linear model, sets of variables that categorize objects or observations are referred to as factors/effects.
Random: values drawn from some probability distribution with mean 0 and unknown variance ⇒ covariance structure.
BLUP (Best Linear Unbiased Prediction): random effects are predicted by BLUP.
Random effects have a covariance structure.
Fixed: no variance; a chosen, fixed set of factor levels.
BLUE (Best Linear Unbiased Estimation): fixed effects are estimated by BLUE.
Restricted Maximum Likelihood (REML): the ML estimator of Σ is biased, considerably so in small or moderate samples, because it ignores the degrees of freedom used to estimate β; REML is the recommended approach for estimating Σ.
REML is also referred to as residual maximum likelihood.
Residual Maximum Likelihood (REML) is based on the notion of separating the likelihood used for estimating Σ from that used for estimating β.
REML log-likelihoods for models with different fixed effects are not comparable.
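The bias that motivates REML is easiest to see in the simplest special case, an intercept-only model with independent errors, where the ML estimator of the variance divides the residual sum of squares by n and the REML estimator divides it by n − 1. The sketch below illustrates this under those assumptions; the simulated data, the sample size and all variable names are hypothetical and not part of the original example.

```r
# Intercept-only model y_i = mu + e_i, e_i ~ N(0, sigma^2).
# ML divides the residual sum of squares by n; REML divides by n - p,
# where p = 1 is the number of fixed effects (here just the mean).
set.seed(1)
n <- 8
sigma2_true <- 4
n_rep <- 5000

estimates <- replicate(n_rep, {
  y <- rnorm(n, mean = 10, sd = sqrt(sigma2_true))
  rss <- sum((y - mean(y))^2)
  c(ml = rss / n, reml = rss / (n - 1))
})

# Average of each estimator over the simulated data sets
avg <- rowMeans(estimates)
avg
```

On average the REML-type estimator should sit close to the true variance, while the ML estimator is smaller by the factor (n − 1)/n, which matters most when n is small.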
General interest (Mixed Models): comparison of outcomes within each subject over time as well as
comparisons across subjects or groups of subjects.
Linear Mixed model
Idea: obtain approximate residual maximum likelihood (REML) estimates.
Goal: BLUP is the best predictor (under normality) of u based on error contrasts, i.e. transformations of y with mean 0.
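$$
y = \underbrace{X\beta + Zu}_{\eta,\; u \sim N(0,\, G^{*})} + \underbrace{e}_{\text{random error},\; N(0,\, R^{*})},
\quad \text{where} \quad
\begin{pmatrix} u \\ e \end{pmatrix} \sim N\!\left( 0,\; \begin{pmatrix} G\sigma_u^2 & 0 \\ 0 & R\sigma_e^2 \end{pmatrix} \right),
\quad R^{*} = R V_e, \quad G^{*} = G V_u
$$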
u : random components to be predicted
G : genomic relationship matrix
X : design matrix
β : fixed effects to be estimated (unknown)
Z : Incidence matrix for random effects
A : additive relationship matrix, $V(u) = A V_A$ ⇔ $A^{-1} = G^{-1}$
Non-zero off-diagonal elements in A reflect the use of information from relatives in BLUP.
$R V_e = V(e)$ : $V_e$ is the within-environment error variance
R : diagonal matrix (0s in the off-diagonal) holding the reciprocal of the number of environments in each set
BLUP in linear model
MME produces BLUE and BLUP simultaneously.
In a Bayesian framework, BLUP can be derived as the posterior mean of u under a non-informative prior for β.
BLUP estimator of the variance component:
$$\tilde{\sigma}^2 = (y - X\beta - Zu)' R^{-1} (y - X\beta - Zu) / n$$
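BLUP consists of maximizing the sum of two log-likelihoods, i.e. the logarithm of the joint density of y and u, $f(y, u) = g(y \mid u)\, h(u)$: $l = l_1 + l_2$.
1 $l_1$: log-likelihood for $y \mid u$, the logarithm of
$$L(R^{*} \mid y) = g(y \mid u) = (2\pi)^{-\frac{1}{2} n}\, |R^{*}|^{-\frac{1}{2}} \exp\left\{ -\frac{(y - X\beta - Zu)' R^{*-1} (y - X\beta - Zu)}{2} \right\}$$
2 $l_2$: log-likelihood for $u$, the logarithm of
$$L(G^{*} \mid u) = h(u) = (2\pi)^{-\frac{1}{2} q}\, |G^{*}|^{-\frac{1}{2}} \exp\left\{ -\frac{u' G^{*-1} u}{2} \right\}$$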
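As a rough illustration of the two pieces $l_1$ and $l_2$, the sketch below evaluates the joint log-likelihood for a toy data set. It assumes, purely for illustration, that $R^{*} = \sigma_e^2 I$ and $G^{*} = \sigma_u^2 I$; the function name `joint_loglik`, the data and the parameter values are hypothetical.

```r
# Joint log-likelihood l = l1 + l2 for y = X beta + Z u + e,
# assuming (for illustration only) R* = sigma2_e * I and G* = sigma2_u * I.
joint_loglik <- function(beta, u, y, X, Z, sigma2_e, sigma2_u) {
  n <- length(y)
  q <- length(u)
  res <- y - X %*% beta - Z %*% u
  # log g(y | u): -(1/2) * { n log(2 pi) + log|R*| + res' R*^{-1} res }
  l1 <- -0.5 * (n * log(2 * pi) + n * log(sigma2_e) + sum(res^2) / sigma2_e)
  # log h(u): -(1/2) * { q log(2 pi) + log|G*| + u' G*^{-1} u }
  l2 <- -0.5 * (q * log(2 * pi) + q * log(sigma2_u) + sum(u^2) / sigma2_u)
  c(l1 = l1, l2 = l2, l = l1 + l2)
}

# Toy data: 3 groups with 2 observations each (hypothetical values)
y <- c(5.1, 4.8, 6.3, 6.0, 5.5, 5.9)
X <- matrix(1, nrow = 6, ncol = 1)
Z <- diag(3)[rep(1:3, each = 2), ]
ll <- joint_loglik(beta = 5.6, u = c(-0.5, 0.4, 0.1), y, X, Z,
                   sigma2_e = 0.1, sigma2_u = 0.3)
ll
```

With diagonal covariance matrices each piece reduces to a sum of univariate normal log-densities, which gives a simple way to check the formulas.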
MME (Mixed Model Equation) differentiation
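Differentiate $f(\beta, u) = (y - X\beta - Zu)' R^{*-1} (y - X\beta - Zu) + u' G^{*-1} u$, which equals $-2l$ up to terms that do not involve β or u, and set the derivatives to zero:
$$\frac{\partial f(\beta, u)}{\partial \beta} = -2X' R^{*-1} y + 2X' R^{*-1} X\beta + 2X' R^{*-1} Zu = 0$$
$$\frac{\partial f(\beta, u)}{\partial u} = 2G^{*-1} u + 2Z' R^{*-1} X\beta + 2Z' R^{*-1} Zu - 2Z' R^{*-1} y = 0$$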
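Solve the two equations to get Henderson's Mixed Model Equations (MME):
$$
\begin{pmatrix} X' R^{*-1} X & X' R^{*-1} Z \\ Z' R^{*-1} X & Z' R^{*-1} Z + G^{*-1} \end{pmatrix}
\begin{pmatrix} \beta \\ u \end{pmatrix}
=
\begin{pmatrix} X' R^{*-1} y \\ Z' R^{*-1} y \end{pmatrix}
$$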
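Substitute in $R^{*} = R V_e$ and $G^{*} = G V_u$ (with $R = I$, so that R drops out), multiply both sides by $V_e$, and write $\lambda = V_e / V_u$:
$$
\begin{pmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}\lambda \end{pmatrix}
\begin{pmatrix} \beta \\ u \end{pmatrix}
=
\begin{pmatrix} X'y \\ Z'y \end{pmatrix}
$$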
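Then the coefficient matrix is inverted to solve for β and u:
$$
\begin{pmatrix} \beta \\ u \end{pmatrix}
=
\begin{pmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}\lambda \end{pmatrix}^{-1}
\begin{pmatrix} X'y \\ Z'y \end{pmatrix}
$$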
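The λ form of the mixed model equations can be assembled and solved directly. The following is a minimal sketch, assuming R = I, unrelated genotypes (G = I) and a known ratio λ; the data, the factor names `env` and `gen`, and the value of `lambda` are hypothetical.

```r
# Henderson's mixed model equations with R = I and lambda = Ve / Vu.
# Toy data (hypothetical): 3 genotypes observed in 2 environments.
y   <- c(10.2, 11.5, 9.8, 12.1, 12.9, 11.0)
env <- factor(c(1, 1, 1, 2, 2, 2))
gen <- factor(c(1, 2, 3, 1, 2, 3))

X <- model.matrix(~ 0 + env)   # fixed effects: one mean per environment
Z <- model.matrix(~ 0 + gen)   # random effects: one column per genotype
G <- diag(3)                   # unrelated genotypes (assumption)
lambda <- 2                    # Ve / Vu, treated as known here

# Coefficient matrix and right-hand side of the MME
C <- rbind(cbind(crossprod(X),    crossprod(X, Z)),
           cbind(crossprod(Z, X), crossprod(Z) + solve(G) * lambda))
rhs <- c(crossprod(X, y), crossprod(Z, y))

sol <- solve(C, rhs)
beta_hat <- sol[1:2]   # BLUE of the environment means
u_hat    <- sol[3:5]   # BLUP of the genotype effects
```

Using `solve(C, rhs)` rather than forming the inverse explicitly is a numerical convenience; the result is the same as multiplying the right-hand side by the inverse coefficient matrix.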
BLUE and BLUP
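The REML (Residual Maximum Likelihood) estimators of the variance components are defined using the following partition of the inverse of the MME coefficient matrix:
$$
\begin{pmatrix} X' R^{-1} X & X' R^{-1} Z \\ Z' R^{-1} X & Z' R^{-1} Z + G^{-1}\lambda \end{pmatrix}^{-1}
=
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},
\quad \text{where } A^{-1} = G^{-1}
$$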
MME produces BLUE and BLUP simultaneously; λ controls how strongly the BLUP is shrunk.
BLUE: sum / $n_x$, simply an average: the sum of phenotypes in each environment divided by the number of phenotypes observed in each environment.
BLUP: sum / ($n_z$ + λ), shrunk toward 0: the sum of phenotypes for each genotype divided by the number of phenotypes observed for each genotype plus λ.
This is referred to as “borrowing strength” from the mean.
More observations, less shrinkage:
Less weight should be given to the ith individual’s average response when it is more variable, as it is when based on few observations.
More shrinkage when $\sigma_u^2$ is relatively small and $\sigma_e^2$ is relatively large:
Less weight is given to the ith individual’s average response when there is little variability between subjects but high variability within subjects.
Generally, both the ML and REML estimators of β have the same form as the BLUE, evaluated at the corresponding estimate of Σ.
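The contrast between a plain average and a shrunken BLUP can be sketched numerically. The example below assumes a model with only random genotype effects, G = I, a known λ, and phenotypes already expressed as deviations from the overall mean; all data values and names are hypothetical.

```r
# Shrinkage of BLUP relative to a plain average, for a model with only
# random genotype effects: y = Z u + e, G = I, lambda = Ve / Vu.
# Phenotypes are assumed to be already expressed as deviations from the mean.
y   <- c(2.0, 1.0, 3.0, -1.5, -0.5, 4.0)          # hypothetical deviations
gen <- factor(c("g1", "g1", "g1", "g2", "g2", "g3"))
lambda <- 2

sums <- tapply(y, gen, sum)       # sum of phenotypes per genotype
n_z  <- tapply(y, gen, length)    # number of phenotypes per genotype

plain_mean <- sums / n_z              # "sum / n": no shrinkage
blup       <- sums / (n_z + lambda)   # "sum / (n + lambda)": shrunk toward 0
shrinkage  <- n_z / (n_z + lambda)    # fraction of the plain mean retained

round(cbind(n_z, plain_mean, blup, shrinkage), 3)
```

The genotype with a single observation keeps the smallest fraction of its plain mean, which is the "more observations, less shrinkage" behaviour described above.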
BLUE (Best Linear Unbiased Estimation)
For a mixed model, the distribution of the response (y) depends on a vector quantity, $\eta = X\beta + Zu$.
Hence, the log-likelihood of $y \mid u$ is $l_1 = \log f(y; \beta \mid u)$ and the log-likelihood of $u$ (random component) is
$$l_2 = \text{constant} - \frac{1}{2} \sum_{j=1}^{k} \left\{ v_j \log(2\pi\sigma_j^2) + \frac{1}{\sigma_j^2}\, u_j' A_j^{-1} u_j \right\},$$
with $v_j \sim N(0, G\sigma_u^2)$ (but independent of $u$) and a component of $X$, which is an $n \times v$ matrix.
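The joint log-likelihood of y and u is $l = l_1 + l_2$, which is maximized iteratively by the Newton-Raphson procedure, as in the method of scoring:
$$
\begin{pmatrix} \beta \\ u \end{pmatrix}
=
\begin{pmatrix} \beta_0 \\ u_0 \end{pmatrix}
+ V^{-1} \begin{pmatrix} \partial l_1 / \partial \beta_0 \\ \partial l_1 / \partial u_0 \end{pmatrix}
- V^{-1} \begin{pmatrix} 0 \\ \sigma^{-2} A^{-1} u_0 \end{pmatrix},
\quad
V = \begin{pmatrix}
-\dfrac{\partial^2 l_1}{\partial \beta\, \partial \beta'} & -\dfrac{\partial^2 l_1}{\partial \beta\, \partial u'} \\
-\dfrac{\partial^2 l_1}{\partial u\, \partial \beta'} & -\dfrac{\partial^2 l_1}{\partial u\, \partial u'} + \dfrac{1}{\sigma^2} A^{-1}
\end{pmatrix}
$$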
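Apart from the extra term $\sigma^{-2} A^{-1}$ in the (u, u) block, the second-order derivatives of $l$ are the same as the second-order derivatives of $l_1$.
Idea of the Newton-Raphson method, written with the Jacobian matrix J of the function whose root is sought (denoted V(β) in this formula):
$$\beta^{k} = \beta^{k-1} - J(\beta^{k-1})^{-1}\, V(\beta^{k-1})$$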
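Given $\hat{\Sigma}$ obtained from REML, minimizing the non-constant term of the log-likelihood gives the GLS (Generalized Least Squares) estimator $\hat{\beta}_{GLS}$ of β. A single-trait linear mixed model has
BLUP: $\hat{u} = G Z' V^{-1} (Y - X\hat{\beta})$, with $V = \Sigma = Z G Z' + R$
BLUE: $\hat{\beta} = (X' V^{-1} X)^{-1} X' V^{-1} Y$
where $V^{-1}$ is evaluated at the estimated variance components.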
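The closed-form BLUE and BLUP expressions based on V can be evaluated directly when the variance components are treated as known. The sketch below does this for a small hypothetical data set; the variance values 0.5 and 1.0 and the names `G_star` and `R_star` (standing for the full covariance matrices of u and e) are assumptions made for illustration.

```r
# BLUE and BLUP from V = Z G Z' + R, for known variance components.
# Toy data and variance values are hypothetical.
y <- c(10.2, 11.5, 9.8, 12.1, 12.9, 11.0)
X <- cbind(1, c(0, 0, 0, 1, 1, 1))       # intercept + environment indicator
Z <- diag(3)[c(1, 2, 3, 1, 2, 3), ]      # 3 genotypes, each observed twice
G_star <- diag(3) * 0.5                  # covariance of u (Vu = 0.5, G = I)
R_star <- diag(6) * 1.0                  # covariance of e (Ve = 1, R = I)

V     <- Z %*% G_star %*% t(Z) + R_star
V_inv <- solve(V)

# BLUE (GLS estimator of the fixed effects)
beta_hat <- solve(t(X) %*% V_inv %*% X, t(X) %*% V_inv %*% y)
# BLUP of the random effects
u_hat <- G_star %*% t(Z) %*% V_inv %*% (y - X %*% beta_hat)
```

For the same variance components these values should coincide with the solution of the mixed model equations, which avoid inverting the n × n matrix V.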
Newton-Raphson iterative procedure
The Newton-Raphson algorithm originates from the Taylor series
$$f(x) \approx f(x_k) + (x - x_k) f'(x_k) + \frac{1}{2!}(x - x_k)^2 f''(x_k) + \cdots + \frac{1}{n!}(x - x_k)^n f^{(n)}(x_k)$$
about some point $x_k$; a system of non-linear equations is solved by the Newton-Raphson procedure. The Newton-Raphson method keeps the first two terms of the expansion,
$$f(x) \approx f(x_k) + (x - x_k) f'(x_k),$$
and assumes that $x = x_{k+1}$ is the solution of the equation $f(x) = 0$, so that $0 = f(x_k) + (x_{k+1} - x_k) f'(x_k)$, which rearranges to $x_{k+1} = x_k - f(x_k) / f'(x_k)$. In this scalar form the method solves a non-linear equation in a single variable; for a system of equations, $f'$ is replaced by the Jacobian matrix. We approximate the roots of $f(\beta) = 0$.
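1 Start with an initial value $\beta^{(0)}$ of β.
2 First-order linear approximation of f at $\beta^{(0)} + h$:
$$f(\beta^{(0)} + h) \approx f(\beta^{(0)}) + h f'(\beta^{(0)})$$
3 Set the approximation to zero to find the updated solution $\beta^{(1)} = \beta^{(0)} + h$ of $f(\beta) = 0$, so that $f(\beta^{(1)}) = 0$ under the linear approximation: $h = -\{f'(\beta^{(0)})\}^{-1} f(\beta^{(0)})$ and thus
$$\beta^{(1)} = \beta^{(0)} - \{f'(\beta^{(0)})\}^{-1} f(\beta^{(0)})$$
4 Iterate until the process converges, $\beta^{(k+1)} \approx \beta^{(k)}$.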
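The four steps above can be sketched as a short R function for the scalar case. The function name `newton_raphson`, its arguments and the test equation are hypothetical choices made for illustration.

```r
# Newton-Raphson for a scalar equation f(beta) = 0.
newton_raphson <- function(f, f_prime, beta0, tol = 1e-10, max_iter = 100) {
  beta <- beta0                       # step 1: initial value beta^(0)
  for (k in seq_len(max_iter)) {
    h <- -f(beta) / f_prime(beta)     # steps 2-3: root of the linear approximation
    beta <- beta + h                  # beta^(k+1) = beta^(k) + h
    if (abs(h) < tol) {               # step 4: stop when the update is negligible
      return(list(root = beta, iterations = k, converged = TRUE))
    }
  }
  list(root = beta, iterations = max_iter, converged = FALSE)
}

# Example: solve beta^2 - 2 = 0 starting from beta^(0) = 1
fit <- newton_raphson(function(b) b^2 - 2, function(b) 2 * b, beta0 = 1)
fit$root
```

The stopping rule compares successive iterates, mirroring the convergence criterion in step 4; a vector-valued version would replace the division by a linear solve with the Jacobian.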
R Example: orthodontic growth data in nlme package
Potthoff and Roy (1964) first reported a data set from a study undertaken at the Department of Orthodontics of the University of North Carolina Dental School.
Investigators followed the growth of 27 children (16 males and 11 females). At ages 8, 10, 12 and 14 investigators
measured the distance from the center of the pituitary to the pterygomaxillary fissure. Interest centers on developing a
model for these distances in terms of age and sex.
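$$\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + b_i + e_{ij}, \quad b_i \sim N(0, \sigma_b^2), \quad e_{ij} \sim N(0, \sigma_e^2)$$
Utilizing the independence between the random effect and the random error terms,
$\mathrm{Cov}(y_{ij}, y_{ik}) = \mathrm{Cov}(b_i + e_{ij},\, b_i + e_{ik}) = \mathrm{Var}(b_i) = \sigma_b^2$ for $j \neq k$, and $\mathrm{Var}(y_{ij}) = \sigma_b^2 + \sigma_e^2$.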
Conclusion: examine the female and male groups separately, then combine the two separate models by allowing the error variance to differ by sex while assuming the random effect variance is constant across sex ⇒ this raises the question of whether the growth rate differs across sex.
$$\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + \beta_2 \text{Sex} + \beta_3 \text{Sex} \times \text{Age}_j + b_i + e_{ij}^{\mathrm{sex}}, \quad b_i \sim N(0, \sigma_b^2), \quad e_{ij}^{\mathrm{sex}} \sim N(0, \sigma_{e,\mathrm{sex}}^2)$$
Allowing the error variance to differ across sex produces estimated correlations in line with those obtained from the separate models for males and females reported earlier.
$\mathrm{corr}_{\text{male}}(\text{Distance}_{ij}, \text{Distance}_{ik}) = 0.55$
$\mathrm{corr}_{\text{female}}(\text{Distance}_{ij}, \text{Distance}_{ik}) = 0.85$
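These two correlations can be reproduced from the standard deviations reported in the REML output of the combined model shown later in this document. The sketch below is only arithmetic on those reported values; the helper name `icc` is hypothetical.

```r
# Within-subject correlation sigma_b^2 / (sigma_b^2 + sigma_e^2),
# using the standard deviations quoted in the REML output of the combined model.
sd_b        <- 1.84757      # random-intercept SD
sd_e_male   <- 1.669823     # residual SD for males (reference stratum)
ratio_f     <- 0.4678944    # female residual SD relative to males
sd_e_female <- sd_e_male * ratio_f

icc <- function(sd_b, sd_e) sd_b^2 / (sd_b^2 + sd_e^2)

corr_male   <- icc(sd_b, sd_e_male)
corr_female <- icc(sd_b, sd_e_female)
round(c(male = corr_male, female = corr_female), 2)
```

Because the random-intercept variance is shared, the difference between the two correlations comes entirely from the smaller error variance estimated for females.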
R Example: orthodontic growth data, female groups
Compound symmetry: the correlation of $y_{ij}$ and $y_{ik}$ is $\sigma_b^2 / (\sigma_b^2 + \sigma_e^2)$ for the same subject (i), constant whatever the difference between j and k.
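# Female distances at each age, one column per age
dist_f <- cbind(DistFAge8, DistFAge10, DistFAge12, DistFAge14)
corr_f <- cor(dist_f)   # pairwise correlations between ages
round(corr_f, 3)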
DistFAge8 DistFAge10 DistFAge12 DistFAge14
DistFAge8 1.000 0.830 0.862 0.841
DistFAge10 0.830 1.000 0.895 0.879
DistFAge12 0.862 0.895 1.000 0.948
DistFAge14 0.841 0.879 0.948 1.000
Female group: the correlations are similar, ranging from 0.830 to 0.948, which is consistent with a constant correlation within subjects over any chosen time interval.
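Equivalently,
$$\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + \varepsilon_{ij}, \quad \varepsilon_{ij} = b_i + e_{ij} \;\Leftrightarrow\; Y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \Sigma)$$
Σ is a symmetric block-diagonal matrix whose diagonal blocks are D, one block per subject:
$$
D = \begin{pmatrix}
\sigma_b^2 + \sigma_e^2 & \sigma_b^2 & \sigma_b^2 & \sigma_b^2 \\
\sigma_b^2 & \sigma_b^2 + \sigma_e^2 & \sigma_b^2 & \sigma_b^2 \\
\sigma_b^2 & \sigma_b^2 & \sigma_b^2 + \sigma_e^2 & \sigma_b^2 \\
\sigma_b^2 & \sigma_b^2 & \sigma_b^2 & \sigma_b^2 + \sigma_e^2
\end{pmatrix}
$$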
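The structure of D and Σ can be built explicitly. The sketch below plugs in the standard deviations reported in the REML fit for females later in this document and assumes 11 subjects with 4 visits each; the variable names are hypothetical.

```r
# Compound-symmetry covariance block D for one subject with 4 visits,
# and the block-diagonal Sigma for all subjects.
sigma2_b <- 2.06847^2      # random-intercept variance (female REML fit)
sigma2_e <- 0.7800331^2    # error variance (female REML fit)
n_times  <- 4
n_subj   <- 11

D <- sigma2_b * matrix(1, n_times, n_times) + sigma2_e * diag(n_times)
Sigma <- kronecker(diag(n_subj), D)   # zero covariance between subjects

cov2cor(D)[1, 2]   # common within-subject correlation
dim(Sigma)
```

Every off-diagonal element of `cov2cor(D)` is the same value, which is the constant-correlation property of compound symmetry.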
R Example: REML fit
REML fit of model for females:
The error variance is $\hat{\sigma}_e^2 = 0.78^2 = 0.608$ and $\hat{\sigma}_b^2 = 2.0685^2 = 4.279$ ⇒ $\mathrm{Corr}(y_{ij}, y_{ik}) = 4.279 / (4.279 + 0.608) = 0.88$
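library(nlme)
# FOrthodont is assumed to hold the rows of Orthodont with Sex == "Female"
mFRI <- lme(distance ~ age, data = FOrthodont, random = ~1 | Subject, method = "REML")
summary(mFRI)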
Linear mixed-effects model fit by REML
Data: FOrthodont
AIC BIC logLik
149.2183 156.169 -70.60916
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 2.06847 0.7800331
Fixed effects: distance ~ age
Value Std.Error DF t-value p-value
(Intercept) 17.372727 0.8587419 32 20.230440 0
age 0.479545 0.0525898 32 9.118598 0
Correlation:
(Intr)
age -0.674
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.2736479 -0.7090164 0.1728237 0.4122128 1.6325181
Number of Observations: 44
Number of Groups: 11
Plot of distance against age for females with fits from the model
coef(mFRI)
(Intercept) age
F10 13.36740 0.4795455
F09 15.90228 0.4795455
F06 15.90228 0.4795455
F01 16.14369 0.4795455
F05 17.35078 0.4795455
F07 17.71291 0.4795455
F02 17.71291 0.4795455
F08 18.07503 0.4795455
F03 18.43716 0.4795455
F04 19.52353 0.4795455
F11 20.97204 0.4795455
The estimated random intercept is higher than one may initially expect for subject F10 and lower than one may initially expect for subject F11, due to the so-called shrinkage associated with random effects.
There is more shrinkage when $n_i$, the number of observations on the ith subject, is small: less weight should be given to the ith individual’s average response when it is based on few observations.
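The size of the shrinkage can be gauged from the fitted variance components. For a random-intercept model, the predicted intercept deviation equals a weight $n_i \sigma_b^2 / (n_i \sigma_b^2 + \sigma_e^2)$ times the subject's mean deviation from the fixed-effects fit; the sketch below evaluates that weight using the standard deviations and coefficients printed above. The function name `shrink_weight` is hypothetical.

```r
# Shrinkage weight for a random-intercept model with n_i observations:
# w_i = n_i * sigma_b^2 / (n_i * sigma_b^2 + sigma_e^2).
sigma2_b <- 2.06847^2      # from the REML output for females
sigma2_e <- 0.7800331^2

shrink_weight <- function(n_i) n_i * sigma2_b / (n_i * sigma2_b + sigma2_e)
w <- shrink_weight(1:4)
round(w, 3)

# Predicted random intercept for F10: subject intercept minus fixed intercept
b_F10 <- 13.36740 - 17.372727
# Implied unshrunken mean deviation for F10 (n_i = 4 observations)
b_F10 / shrink_weight(4)
```

With four observations per girl and a large between-subject variance, the weight is close to 1, so the shrinkage in this data set is mild; it would be much stronger for a subject measured only once.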
R Example: REML fit
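The R output for $\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + \beta_2 \text{Sex} + \beta_3 \text{Sex} \times \text{Age}_j + b_i + e_{ij}^{\mathrm{sex}}$, $b_i \sim N(0, \sigma_b^2)$, $e_{ij}^{\mathrm{sex}} \sim N(0, \sigma_{e,\mathrm{sex}}^2)$: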
summary(m10.5)
Linear mixed-effects model fit by REML
Data: Orthodont
AIC BIC logLik
429.2205 447.7312 -207.6102
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 1.84757 1.669823
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | Sex
Parameter estimates:
Male Female
1.0000000 0.4678944
Fixed effects: distance ~ age * Sex
Value Std.Error DF t-value p-value
(Intercept) 16.340625 1.1450945 79 14.270111 0.0000
age 0.784375 0.0933459 79 8.402883 0.0000
SexFemale 1.032102 1.4039842 25 0.735124 0.4691
age:SexFemale -0.304830 0.1071828 79 -2.844016 0.0057
Correlation:
(Intr) age SexFml
age -0.897
SexFemale -0.816 0.731
age:SexFemale 0.781 -0.871 -0.840
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.00556474 -0.63419474 0.01890475 0.55016878 3.06446971
Number of Observations: 108
Number of Groups: 27
The fixed effect for the interaction between Sex and Age in the model is highly statistically significant (p-value = 0.0057).
The estimated coefficient of this interaction term, −0.305, implies that the growth rate of females (0.784 − 0.305 = 0.480 per year of age) is significantly less than that of males (0.784),
while the standard errors of these estimates differ a little across the two models.
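The sex-specific regression lines implied by the fixed effects can be written out with a few lines of arithmetic on the coefficients in the output above. The vector and element names below are hypothetical labels for those reported values.

```r
# Sex-specific intercepts and growth rates implied by the fixed effects
# of distance ~ age * Sex (Male is the reference level).
b <- c(intercept = 16.340625, age = 0.784375,
       sex_female = 1.032102, age_sex_female = -0.304830)

male   <- c(intercept = b[["intercept"]], slope = b[["age"]])
female <- c(intercept = b[["intercept"]] + b[["sex_female"]],
            slope     = b[["age"]] + b[["age_sex_female"]])
rbind(male, female)
```

The female intercept and slope obtained this way agree with the fixed effects of the females-only model reported earlier (17.372727 and 0.479545).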
R Example: REML fit
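REML log-likelihoods for models with the same fixed effects can be used to produce a likelihood ratio test (LRT) to compare two nested covariance models.
The test compares twice the difference in the two maximized REML log-likelihoods to a $\chi^2$ distribution.
1 Model 1: $\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + \beta_2 \text{Sex} + \beta_3 \text{Sex} \times \text{Age}_j + b_i + e_{ij}^{\mathrm{sex}}$, $b_i \sim N(0, \sigma_b^2)$, $e_{ij}^{\mathrm{sex}} \sim N(0, \sigma_{e,\mathrm{sex}}^2)$
2 Model 2: $\text{Distance}_{ij} = \beta_0 + \beta_1 \text{Age}_j + \beta_2 \text{Sex} + \beta_3 \text{Sex} \times \text{Age}_j + b_i + e_{ij}$
Given below is the output from R comparing the REML fits of the two models, which differ only in whether the error variance depends on sex.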
library(nlme)
# Model 1: error variance differs by sex
m10.5 <- lme(distance ~ age * Sex, data = Orthodont, random = ~1 | Subject, method = "REML",
             weights = varIdent(form = ~1 | Sex))
# Model 2: common error variance
m10.6 <- lme(distance ~ age * Sex, data = Orthodont, random = ~1 | Subject, method = "REML")
# Comparison
anova(m10.6, m10.5)
      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m10.6     1  6 445.7572 461.6236 -216.8786
m10.5     2  7 429.2205 447.7312 -207.6102 1 vs 2 18.53677  <.0001
The likelihood ratio test is highly statistically significant, indicating that Model 1 provides a significantly better model for the variance-covariance structure than Model 2 does.
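The test statistic in the table can be recomputed by hand from the two reported REML log-likelihoods. The sketch below only uses numbers printed by `anova()`; the variable names are hypothetical.

```r
# Likelihood ratio test from the two REML log-likelihoods reported by anova().
loglik_m10.6 <- -216.8786   # common error variance, 6 parameters
loglik_m10.5 <- -207.6102   # sex-specific error variance, 7 parameters

lrt_stat <- 2 * (loglik_m10.5 - loglik_m10.6)
df_diff  <- 7 - 6
p_value  <- pchisq(lrt_stat, df = df_diff, lower.tail = FALSE)

c(L.Ratio = lrt_stat, df = df_diff, p.value = p_value)
```

The comparison is legitimate here because both models share the same fixed effects and differ only in the covariance structure.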
For model validation, box plots of the Cholesky residuals from the models should be examined.
References
Sheather, S. (2009). A modern approach to regression with R. Springer Science & Business Media. https://gattonweb.uky.edu/sheather/book/
Morota lab, http://morotalab.org/HUJI2019/day1/day1.html#61
Jyang lab, https://jyanglab.com/AGRO-932/chapters/a2.1-qg/rex11_gwas1.html#20
Thank you for your participation and understanding!