Homework 6
1.
Prove that $\hat \sigma^2 = \dfrac{SS_E}{(a-1)(b-1)}$ is the unbiased estimation of $\sigma^2$
What we learned is that
$$
\begin{align}
SST &= \sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y)^2 \
&= \sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y{i\cdot} - \bar y{\cdot j} + \bar y + \bar y{i\cdot} - \bar y + \bar y{j\cdot} - \bar y)^2 \
&= \underbrace{\sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y{i\cdot} - \bar y{\cdot j} + \bar y)^2}_{SSE} + \underbrace{\sum{i=1}^a b(\bar y{i\cdot} - \bar y)^2}{SSA} + \underbrace{\sum{j=1}^b a(\bar y{\cdot j} - \bar y)^2}{SS_B} \
\end{align}
$$
For clarity, we give the corresponding matrix form:
$$
\begin{align}
SST &= y’(I{ab} - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y \
&= y’\Big( (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) + (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) + (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \Big)y \
&= \underbrace{y’(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y}_{SSE} + \underbrace{ y’(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y }{SSA} + \underbrace{ (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) }{SS_B}
\end{align}
$$
where
$$
\begin{align}
&A = \underbrace{\frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ \oplus \cdots \oplus \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}b’}{a \text{ items}} = (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \
&B = \underbrace{\frac{1}{b}\boldsymbol{1}_a\boldsymbol{1}_a’ \oplus \cdots \oplus \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}a’}{b \text{ items}} = (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’)
\end{align}
$$
By property of tensor product, we declare that
$A$ and $B$ are symmetric and idempotent.
$$
\begin{align}
A^2 &= (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \cdot (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) = (I_a\cdot I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ \cdot \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \
&= I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ = A \
B^2 &= (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \cdot (I_a \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) = (I_b\cdot I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’ \cdot \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \
&= I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’ = B \
\end{align}
$$
Hence, we claim that $I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$, $A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$ and $B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’$ are symmetric and idempotent, with their rank equal to trace:
$$
\begin{align}
&\mathrm{rank}(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = (a-1)(b-1) \
&\mathrm{rank}(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = a-1 \
&\mathrm{rank}(B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’) = b-1
\end{align}
$$
Thus, under the normal hypothesis $Y \sim N(0, \sigma^2 I)$, we have
$$
\begin{align}
&\frac{Y’(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)Y}{\sigma^2} \sim \chi^2\Big((a-1)(b-1)\Big) \
&\frac{Y’(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)Y}{\sigma^2} \sim \chi^2(a-1) \quad \frac{Y’(B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’)Y}{\sigma^2} \sim \chi^2(b-1)
\end{align}
$$
Now, it is sufficient to say $\dfrac{SS_E}{(a-1)(b-1)}$ is the unbiased estimation of $\sigma^2$.
2.
Prove that both of $SS_A$ and $SS_B$ are independent with $SS_E$
Note that $\boldsymbol{1}_{ab} = I_a \otimes \boldsymbol{1}_b\boldsymbol{1}_b’ = I_b \otimes \boldsymbol{1}_a\boldsymbol{1}_a’$, it is easy to find that
$$
\begin{align}
&A\cdot \boldsymbol{1}_{ab} = (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \cdot I_a \otimes \boldsymbol{1}_b\boldsymbol{1}_b’ = I_a \otimes \boldsymbol{1}_b\boldsymbol{1}b’ = \boldsymbol{1}{ab} \
&B\cdot \boldsymbol{1}_{ab} = (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \cdot I_b \otimes \boldsymbol{1}_a\boldsymbol{1}_a’ = I_b \otimes \boldsymbol{1}_a\boldsymbol{1}a’ = \boldsymbol{1}{ab}
\end{align}
$$
Now, based on the derivation of question 1, it suffices to show that
$$
\begin{align}
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = 0 \
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = 0
\end{align}
$$
By symmetry, we only give the derivation of the first equation, the second is similar to this work.
$$
\begin{align}
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= [I{ab}-B-(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)] (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= (I-B)(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 \
&= A - BA - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 \
&= A - BA - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - BA \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’)(I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}b’)\
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ - \frac{1}{ab} (I_b\boldsymbol{1}_b\boldsymbol{1}_b’ \otimes I_a\boldsymbol{1}_a\boldsymbol{1}a’ ) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ - \frac{1}{ab} (\boldsymbol{1}_b\boldsymbol{1}_b’ \otimes \boldsymbol{1}_a\boldsymbol{1}a’ ) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ \
&= 0
\end{align}
$$
The detailed matrix $A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$ is idempotent since
$$
\begin{align}
(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 &= (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= A^2 - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’A - \frac{1}{ab}A\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ \
&= A^2 - \frac{2}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ \
&= A - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’
\end{align}
$$
Hence the independency is proved, which furthermore indicates the fact that under the hypothesis $H_1$
$$
F_A = \frac{SS_A/(a-1)}{SSE/(a-1)(b-1)} \sim F{a-1, (a-1)(b-1)}
$$
Similarly, under the hypothesis $H_2$
$$
F_B = \frac{SS_B/(b-1)}{SSE/(a-1)(b-1)} \sim F{b-1, (a-1)(b-1)}
$$
3.
Prove that $c’\beta$ is an estimable function $ \iff\, c \in \mathcal{M}(X’)$
$c’\beta$ is said to be estimable if and only if there is a linear function of $y$, such that $a’y$ is an unbiased estimator of $c’\beta$ (for arbitrary $\beta$ ), i.e.
$$
\begin{align}
E(a’y) = c’\beta \quad \forall \beta\in\mathbb{R}^p &\iff a’X\beta = c’\beta \quad \forall \beta\in\mathbb{R}^p \
&\iff a’X = c \
&\iff c = X’a \in \mathcal{M}(X’)
\end{align}
$$
4. Project
One-way ANCOVA
This example comes from the litter dataset in the multcomp package. Pregnant mice were divided into four treatment groups; each group received a different dose of a drug (0, 5, 50, or 500). The mean post-birth weight for each litter was the dependent variable and gestation time was included as a covariate.
1 | data(litter, package="multcomp") |
1 | dose |
From the table()
function we can see that there are an unequal number of litters at each dosage level.
1 | aggregate(weight, by=list(dose), FUN=mean) |
1 | Group.1 x |
Based on the group means provided by the aggregate()
function, the no-drug group had the highest mean litter weight.
1 | fit <- aov(weight ~ gesttime + dose) |
1 | Df Sum Sq Mean Sq F value Pr(>F) |
The ANCOVA F tests indicate that
- gestation time was related to birth weight
- drug dosage was related to birth weight after controlling for gestation time.
Furthermore, the effects
package provides a powerful method of obtaining the adjusted means.
1 | library(effects) |
1 | dose effect |
But in this case, the adjusted means are similar to unadjusted one produced by the aggregate()
function.
Now, suppose we are interested in whether the no-drug condition differs from the three-drug condition. We can use the multcomp package to test specific defined hypothesis about the means.
1 | library(multcomp) |
1 |
|
The contrast c(3, -1, -1, -1)
specifies a comparison of the first group with the average of the other three. The hypothesis is tested with a t-statistics, which is significant at the $p<0.05$ level.
Therefore, we can conclude that the no-drug group has a higher birth weight than drug conditions.
Assessing test assumptions
In this case, it is assumed that the regression slope for predicting birth weight from gestation time is the same in each of four treatment groups. A test for the homogeneity of regression slopes can be obtained by including a gestation*dose
interaction term in ANCOVA model.
1 | fit2 <- aov(weight ~ gesttime*dose, data=litter) |
1 | Df Sum Sq Mean Sq F value Pr(>F) |
The interaction is nonsignificant, supporting the assumption of equality of slopes. That’s to say, the relationship between gestation and birth weight does not depend on the level of dose variable.
Visualizing the results
The ancova()
function in the HH
package provides a plot of the relationship between the dependent variable, the covariate, and the factor.
1 | library(HH) |
Here we can see that the regression lines for predicting birth weight from gestation time are parallel in each group but have differetn intercepts. As gestation time increases, birth weight increases. Additionally, we can see that 0-dose group has the largest intercept and the 5-dose group has the lowest intercept.
Now we use the statement ancova(weight ~ gesttime*dose)
instead to generate a plot that allows both the slopes and intercepts to vary by group.
Two-way factorial ANOVA
This example uses the ToothGrowth
dataset in the base installation to demonstrate a two-way between-groups ANOVA. 60 guinea pigs are randomly assigned to receive one of three levels of ascorbic acid ($0.5$, $1$, or $2$ mg) and one of two delivery methods (orange juice or Vitamin C), under the restriction that each treatment combination has 10 guinea pigs. The dependent variable is tooth length.
1 | attach(ToothGrowth) |
1 | dose |
The table
statement indicates that we have a balanced design (equal sample sizes in each cell of the design), and the next two aggregate
statements provide the cell means and standard deviations.
1 | aggregate(len, by=list(supp, dose), FUN=mean) |
1 | Group.1 Group.2 x |
1 | aggregate(len, by=list(supp, dose), FUN=sd) |
1 | Group.1 Group.2 x |
The below ANOVA table provided by the summary()
function indicates that both main effects (supp
and dose
) and the interaction between these factors are significant.
1 | dose <- factor(dose) |
1 | Df Sum Sq Mean Sq F value Pr(>F) |
Furthermore, we can use the interaction2wt()
function in the HH
package to produce a plot of both main effects and two-way interactions for any factorial design of any order.
1 | library(HH) |
The above graph indicates that tooth growth increases with the dose of ascorbic acid for both orange juice and Vitamin C. For the $0.5$ and $1$ mg doses, orange juice produced more tooth growth than Vitamin C. For $2$ mg of ascorbic acid, both delivery methods produced identical growth.
Kai Liu, wrote on 2017/12/14
Student ID: 116071910015
Mail: liu1995@sjtu.edu.cn