Project 4 of Statistics.md

Homework 6

1.

Prove that $\hat \sigma^2 = \dfrac{SS_E}{(a-1)(b-1)}$ is the unbiased estimation of $\sigma^2$

What we learned is that

$$
\begin{align}
SST &= \sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y)^2 \
&= \sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y{i\cdot} - \bar y{\cdot j} + \bar y + \bar y{i\cdot} - \bar y + \bar y{j\cdot} - \bar y)^2 \
&= \underbrace{\sum{i=1}^a \sum{j=1}^b (y{ij} - \bar y{i\cdot} - \bar y{\cdot j} + \bar y)^2}_{SSE} + \underbrace{\sum{i=1}^a b(\bar y{i\cdot} - \bar y)^2}{SSA} + \underbrace{\sum{j=1}^b a(\bar y{\cdot j} - \bar y)^2}{SS_B} \
\end{align}
$$

For clarity, we give the corresponding matrix form:

$$
\begin{align}
SST &= y’(I{ab} - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y \
&= y’\Big( (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) + (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) + (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \Big)y \
&= \underbrace{y’(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y}_{SSE} + \underbrace{ y’(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)y }{SSA} + \underbrace{ (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) }{SS_B}
\end{align}
$$

where

$$
\begin{align}
&A = \underbrace{\frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ \oplus \cdots \oplus \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}b’}{a \text{ items}} = (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \
&B = \underbrace{\frac{1}{b}\boldsymbol{1}_a\boldsymbol{1}_a’ \oplus \cdots \oplus \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}a’}{b \text{ items}} = (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’)
\end{align}
$$

By property of tensor product, we declare that
$A$ and $B$ are symmetric and idempotent.

$$
\begin{align}
A^2 &= (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \cdot (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) = (I_a\cdot I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ \cdot \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \
&= I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’ = A \
B^2 &= (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \cdot (I_a \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) = (I_b\cdot I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’ \cdot \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \
&= I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’ = B \
\end{align}
$$

Hence, we claim that $I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$, $A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$ and $B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’$ are symmetric and idempotent, with their rank equal to trace:

$$
\begin{align}
&\mathrm{rank}(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = (a-1)(b-1) \
&\mathrm{rank}(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = a-1 \
&\mathrm{rank}(B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’) = b-1
\end{align}
$$

Thus, under the normal hypothesis $Y \sim N(0, \sigma^2 I)$, we have

$$
\begin{align}
&\frac{Y’(I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)Y}{\sigma^2} \sim \chi^2\Big((a-1)(b-1)\Big) \
&\frac{Y’(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)Y}{\sigma^2} \sim \chi^2(a-1) \quad \frac{Y’(B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’)Y}{\sigma^2} \sim \chi^2(b-1)
\end{align}
$$

Now, it is sufficient to say $\dfrac{SS_E}{(a-1)(b-1)}$ is the unbiased estimation of $\sigma^2$.

2.

Prove that both of $SS_A$ and $SS_B$ are independent with $SS_E$

Note that $\boldsymbol{1}_{ab} = I_a \otimes \boldsymbol{1}_b\boldsymbol{1}_b’ = I_b \otimes \boldsymbol{1}_a\boldsymbol{1}_a’$, it is easy to find that

$$
\begin{align}
&A\cdot \boldsymbol{1}_{ab} = (I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}_b’) \cdot I_a \otimes \boldsymbol{1}_b\boldsymbol{1}_b’ = I_a \otimes \boldsymbol{1}_b\boldsymbol{1}b’ = \boldsymbol{1}{ab} \
&B\cdot \boldsymbol{1}_{ab} = (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’) \cdot I_b \otimes \boldsymbol{1}_a\boldsymbol{1}_a’ = I_b \otimes \boldsymbol{1}_a\boldsymbol{1}a’ = \boldsymbol{1}{ab}
\end{align}
$$

Now, based on the derivation of question 1, it suffices to show that

$$
\begin{align}
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = 0 \
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (B-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) = 0
\end{align}
$$

By symmetry, we only give the derivation of the first equation, the second is similar to this work.

$$
\begin{align}
& (I{ab}-A-B+\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= [I{ab}-B-(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)] (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= (I-B)(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 \
&= A - BA - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 \
&= A - BA - (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - BA \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - (I_b \otimes \frac{1}{a}\boldsymbol{1}_a\boldsymbol{1}_a’)(I_a \otimes \frac{1}{b}\boldsymbol{1}_b\boldsymbol{1}b’)\
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ - \frac{1}{ab} (I_b\boldsymbol{1}_b\boldsymbol{1}_b’ \otimes I_a\boldsymbol{1}_a\boldsymbol{1}a’ ) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ - \frac{1}{ab} (\boldsymbol{1}_b\boldsymbol{1}_b’ \otimes \boldsymbol{1}_a\boldsymbol{1}a’ ) \
&= \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}_{ab}’ \
&= 0
\end{align}
$$

The detailed matrix $A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’$ is idempotent since

$$
\begin{align}
(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)^2 &= (A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’)(A-\frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’) \
&= A^2 - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’A - \frac{1}{ab}A\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ \
&= A^2 - \frac{2}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ + \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’ \
&= A - \frac{1}{ab}\boldsymbol{1}{ab}\boldsymbol{1}{ab}’
\end{align}
$$

Hence the independency is proved, which furthermore indicates the fact that under the hypothesis $H_1$

$$
F_A = \frac{SS_A/(a-1)}{SSE/(a-1)(b-1)} \sim F{a-1, (a-1)(b-1)}
$$

Similarly, under the hypothesis $H_2$

$$
F_B = \frac{SS_B/(b-1)}{SSE/(a-1)(b-1)} \sim F{b-1, (a-1)(b-1)}
$$

3.

Prove that $c’\beta$ is an estimable function $ \iff\, c \in \mathcal{M}(X’)$

$c’\beta$ is said to be estimable if and only if there is a linear function of $y$, such that $a’y$ is an unbiased estimator of $c’\beta$ (for arbitrary $\beta$ ), i.e.

$$
\begin{align}
E(a’y) = c’\beta \quad \forall \beta\in\mathbb{R}^p &\iff a’X\beta = c’\beta \quad \forall \beta\in\mathbb{R}^p \
&\iff a’X = c \
&\iff c = X’a \in \mathcal{M}(X’)
\end{align}
$$

4. Project

One-way ANCOVA

This example comes from the litter dataset in the multcomp package. Pregnant mice were divided into four treatment groups; each group received a different dose of a drug (0, 5, 50, or 500). The mean post-birth weight for each litter was the dependent variable and gestation time was included as a covariate.

1
2
3

data(litter, package="multcomp")
attach(litter)
table(dose)

1
2
3

dose
  0   5  50 500
 20  19  18  17

From the table() function we can see that there are an unequal number of litters at each dosage level.

1	aggregate(weight, by=list(dose), FUN=mean)

  Group.1        x
1       0 32.30850
2       5 29.30842
3      50 29.86611
4     500 29.64647

Based on the group means provided by the aggregate() function, the no-drug group had the highest mean litter weight.

1 2	fit <- aov(weight ~ gesttime + dose) summary(fit)

Df Sum Sq Mean Sq F value  Pr(>F)
gesttime     1  134.3  134.30   8.049 0.00597 **
dose         3  137.1   45.71   2.739 0.04988 *
Residuals   69 1151.3   16.69
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The ANCOVA F tests indicate that

gestation time was related to birth weight
drug dosage was related to birth weight after controlling for gestation time.

Furthermore, the effects package provides a powerful method of obtaining the adjusted means.

1 2	library(effects) effect("dose", fit)

 dose effect
dose
       0        5       50      500
32.35367 28.87672 30.56614 29.33460

But in this case, the adjusted means are similar to unadjusted one produced by the aggregate() function.

Now, suppose we are interested in whether the no-drug condition differs from the three-drug condition. We can use the multcomp package to test specific defined hypothesis about the means.

1
2
3

library(multcomp)
contrast <- rbind("no drug vs. drug" = c(3, -1, -1, -1))
summary(glht(fit, linfct=mcp(dose=contrast)))


	 Simultaneous Tests for General Linear hypothesis

Multiple Comparisons of Means: User-defined Contrasts


Fit: aov(formula = weight ~ gesttime + dose)

Linear hypothesis:
                      Estimate Std. Error t value Pr(>|t|)
no drug vs. drug == 0    8.284      3.209   2.581    0.012 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

The contrast c(3, -1, -1, -1) specifies a comparison of the first group with the average of the other three. The hypothesis is tested with a t-statistics, which is significant at the $p<0.05$ level.

Therefore, we can conclude that the no-drug group has a higher birth weight than drug conditions.

Assessing test assumptions

In this case, it is assumed that the regression slope for predicting birth weight from gestation time is the same in each of four treatment groups. A test for the homogeneity of regression slopes can be obtained by including a gestation*dose interaction term in ANCOVA model.

1 2	fit2 <- aov(weight ~ gesttime*dose, data=litter) summary(fit2)

              Df Sum Sq Mean Sq F value  Pr(>F)
gesttime       1  134.3  134.30   8.289 0.00537 **
dose           3  137.1   45.71   2.821 0.04556 *
gesttime:dose  3   81.9   27.29   1.684 0.17889
Residuals     66 1069.4   16.20
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The interaction is nonsignificant, supporting the assumption of equality of slopes. That’s to say, the relationship between gestation and birth weight does not depend on the level of dose variable.

Visualizing the results

The ancova() function in the HH package provides a plot of the relationship between the dependent variable, the covariate, and the factor.

1 2	library(HH) ancova(weight ~ gesttime + dose, data=litter)

Here we can see that the regression lines for predicting birth weight from gestation time are parallel in each group but have differetn intercepts. As gestation time increases, birth weight increases. Additionally, we can see that 0-dose group has the largest intercept and the 5-dose group has the lowest intercept.

Now we use the statement ancova(weight ~ gesttime*dose) instead to generate a plot that allows both the slopes and intercepts to vary by group.

Two-way factorial ANOVA

This example uses the ToothGrowth dataset in the base installation to demonstrate a two-way between-groups ANOVA. 60 guinea pigs are randomly assigned to receive one of three levels of ascorbic acid ($0.5$, $1$, or $2$ mg) and one of two delivery methods (orange juice or Vitamin C), under the restriction that each treatment combination has 10 guinea pigs. The dependent variable is tooth length.

1 2	attach(ToothGrowth) table(supp, dose)

    dose
supp 0.5  1  2
  OJ  10 10 10
  VC  10 10 10

The table statement indicates that we have a balanced design (equal sample sizes in each cell of the design), and the next two aggregate statements provide the cell means and standard deviations.

1	aggregate(len, by=list(supp, dose), FUN=mean)

  Group.1 Group.2     x
1      OJ     0.5 13.23
2      VC     0.5  7.98
3      OJ     1.0 22.70
4      VC     1.0 16.77
5      OJ     2.0 26.06
6      VC     2.0 26.14

1	aggregate(len, by=list(supp, dose), FUN=sd)

  Group.1 Group.2        x
1      OJ     0.5 4.459709
2      VC     0.5 2.746634
3      OJ       1 3.910953
4      VC       1 2.515309
5      OJ       2 2.655058
6      VC       2 4.797731

The below ANOVA table provided by the summary() function indicates that both main effects (supp and dose) and the interaction between these factors are significant.

1
2
3

dose <- factor(dose)
fit <- aov(len ~ supp*dose)
summary(fit)

            Df Sum Sq Mean Sq F value   Pr(>F)
supp         1  205.4   205.4  15.572 0.000231 ***
dose         2 2426.4  1213.2  92.000  < 2e-16 ***
supp:dose    2  108.3    54.2   4.107 0.021860 *
Residuals   54  712.1    13.2
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Furthermore, we can use the interaction2wt() function in the HH package to produce a plot of both main effects and two-way interactions for any factorial design of any order.

1 2	library(HH) interaction2wt(len ~ supp*dose)

The above graph indicates that tooth growth increases with the dose of ascorbic acid for both orange juice and Vitamin C. For the $0.5$ and $1$ mg doses, orange juice produced more tooth growth than Vitamin C. For $2$ mg of ascorbic acid, both delivery methods produced identical growth.

Kai Liu, wrote on 2017/12/14

Student ID: 116071910015

Mail: liu1995@sjtu.edu.cn