Project 3 of Statistics.md

Homework 5

13.13

The gamma probability density function is

$$
f(y,r,\lambda) = \frac{\lambda^r}{\Gamma(r)} e^{-\lambda y} y^{r-1}, \quad\text{for } y,\lambda\geq 0
$$

Show that the gamma is a member of the exponential family.

Proof:

Since

$$
\begin{align}
f(y,r,\lambda) &= \exp\lbrace r\log\lambda - \log\Gamma(\lambda) - \lambda y + (r-1)\log y \rbrace \
&= \exp\lbrace \underbrace{(\frac{\lambda}{r} y - \log\frac{\lambda}{r})/(-\frac1r)}{[\theta y - b(\theta)]/a(\phi)} + \underbrace{(r-1)\log y - \log\Gamma(r) + r\log r}{h(y,\phi)} \rbrace
\end{align}
$$

Set $\theta = \dfrac{\lambda}{r}$, $\phi = r$, we can derive that $b(\theta) = \log(\theta)$, $a(\phi) = -\dfrac1r$

$$
h(y,\phi) = (\phi-1)\log y - \log\Gamma(\phi) + \phi\log\phi
$$

Thus
$$
\begin{align}
&E(y) = \frac{\mathrm{d} b(\theta)}{\mathrm{d} b(\theta)} = \frac{1}{\theta} = \frac{r}{\lambda} \
&\mathrm{Var}(y) = \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2} \cdot a(\phi) = \frac{r}{\lambda^2}
\end{align}
$$

13.14

The exponential porbabilty density function is

$$
f(y,\lambda) = \lambda e^{-\lambda y}, \quad\text{for } y,\lambda\geq 0
$$

Show that the exponential distribution is a member of the exponential family.

Proof:

Notice the exponential distribution is a special case of gamma distribution.

$$
f(y, \lambda) = \exp(-\lambda y + \log\lambda)
$$

Set $\theta = -\lambda$, $b(\theta) = -\log\lambda = -\log(-\theta)$, with respect to $\phi = 1$, $a(\phi) = 1$ and $h(y,\phi)=0$

Thus, similar result can be derived

$$
\begin{align}
&E(y) = \frac{\mathrm{d}b(\theta)}{\mathrm{d}\theta} = \frac{1}{\lambda} \
&\mathrm{Var}(y) = \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2}\cdot1 = \frac{1}{\lambda^2}
\end{align}
$$

which indicates the distribution to be a member of NEF.

13.15

The negative binomial probabilty mass function is

$$
f(y,\pi,\alpha) = \binom{y+\alpha-1}{\alpha-1} \pi^{\alpha}(1-\pi)^y \quad\text{for } y=0,1,\ldots,\alpha>0 \text{ and } 0\leq\pi\leq 1
$$

Show that the negative binomial is a memeber of the exponential family.

Proof:

$$
\begin{align}
f(y,\pi,\alpha) &= \binom{y+\alpha-1}{\alpha-1} \pi^{\alpha}(1-\pi)^y \
&= \frac{\alpha}{y+\alpha} \binom{y+\alpha}{y}\pi^{\alpha}(1-\pi)^y \
\text{Set }n=y+\alpha\quad &= (1-\frac{y}{n})\binom{n}{y}\pi^{n-y}(1-\pi)^y \
&= \exp\Big{ (n-y)\log\pi+y\log(1-\pi)+\log (1-\frac{y}{n})\binom{n}{y} \Big} \
&= \exp\Big{ \underbrace{\Big( \frac{y}{n}\log\frac{1-\pi}{\pi} + \log\pi \Big)}{y\theta + b(\theta)}/\frac1n + \underbrace{\log (1-\frac{y}{n})\binom{n}{y}}{h(y,\phi)} \Big}
\end{align}
$$

Set
$$
\begin{align}
&\theta=\log\frac{1-\pi}{\pi} \
&b(\theta)=-\log\pi = \log(1+e^{\theta})
\end{align}
$$

with respect to $\phi=1$, $a(\phi)=\dfrac1n$, $h(y,\phi)=\log (1-\dfrac{y}{n})\binom{n}{y}$

Thus

$$
\begin{align}
&E(y) = \frac{\mathrm{d}b(\theta)}{\mathrm{d}\theta} = \frac{e^\theta}{1+e^\theta} = 1-\pi \
&\mathrm{Var}(y) = \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2}\cdot a(\phi) = \frac1n\cdot\frac{e^\theta}{(1+e^\theta)^2} = \frac1n\pi(1-\pi)
\end{align}
$$

13.16

The data in the table below are from an experiment designed to study the advance rate $y$ of a drill. The four design factors are $x_1$=load, $x_2$=flow, $x_3$=drilling speed, and $x_4$=type of drilling mud (the original experiment is described by Cuthbert Daniel in his 1976 book on industrial experimentation).

| Observation | $x_1$ | $x_2$ | $x_3$ | $x_4$ | Advance Rate, $y$ |
| ———– | —– | —– | —– | —– | —————– |
| 1 | - | - | - | - | 1.68 |
| 2 | + | - | - | - | 1.98 |
| 3 | - | + | - | - | 3.28 |
| 4 | + | + | - | - | 3.44 |
| 5 | - | - | + | - | 4.98 |
| 6 | + | - | + | - | 5.70 |
| 7 | - | + | + | - | 9.97 |
| 8 | + | + | + | - | 9.07 |
| 9 | - | - | - | + | 2.07 |
| 10 | + | - | - | + | 2.44 |
| 11 | - | + | - | + | 4.09 |
| 12 | + | + | - | + | 4.53 |
| 13 | - | - | + | + | 7.77 |
| 14 | + | - | + | + | 9.43 |
| 15 | - | + | + | + | 11.75 |
| 16 | + | + | + | + | 16.30 |

Fit a generalized linear model to the advance rate response. Use a gamma response distribution and a log link, and include all four regressors in the linear predictor.

First, we write all these data into a csv file named drill and import it into R dataframe.

1	data <- read.table("/path/to/drill.csv", header = TRUE, sep = ",")

To study the advance rate of a drill, we need to transform the columns into dichotomous factor called load, flow, speed, type with the following code.

data$load[data$x1=="+"] <- 1
data$load[data$x1=="-"] <- 0

data$flow[data$x2=="+"] <- 1
data$flow[data$x2=="-"] <- 0

data$speed[data$x3=="+"] <- 1
data$speed[data$x3=="-"] <- 0

data$type[data$x4=="+"] <- 1
data$type[data$x4=="-"] <- 0

table(data$load)

1
2

0 1
8 8

The dichotomous factor can now be used as the outcome variable in a regression model using gamma response distribution and a log link.

1 2	fit.full <- glm(y ~ load + flow + speed + type, data, family=Gamma(link = "log")) summary(fit.full)

Call:
glm(formula = y ~ load + flow + speed + type, family = Gamma(link = "log"),
    data = data)

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-0.162931  -0.057552   0.006942   0.060239   0.128465

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50269    0.05435   9.250 1.60e-06 ***
load         0.13126    0.04861   2.700   0.0206 *
flow         0.58123    0.04861  11.957 1.21e-07 ***
speed        1.15728    0.04861  23.808 8.18e-11 ***
type         0.32690    0.04861   6.725 3.26e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.009451408)

    Null deviance: 7.02517  on 15  degrees of freedom
Residual deviance: 0.10527  on 11  degrees of freedom
AIC: 28.17

Number of Fisher Scoring iterations: 4

From the p-values for the regression coefficients (last column), we can see that load, flow, speed and type may make a significant contribution to the equation. Thus, we can reject the hypothesis that the parameters are $0$. That is to say, each regression coefficient in the model is statistically significant $(p<0.5)$.

Let’s take a look at the regression coefficients:

1	coef(fit.full)

1 2	(Intercept) load flow speed type 0.5026869 0.1312552 0.5812258 1.1572823 0.3268990

Since log(odds) are difficult to interpret, we can exponentiate them to put the result on an odds scale:

1	exp(coef(fit.full))

1 2	(Intercept) load flow speed type 1.653157 1.140259 1.788229 3.181276 1.386661

Now, we can see that the advance rate of a drill is increased by a factor of 1.140259 for one-unit increase in drill load (holding flow, speed and type constant). Similar increasing trend can be detected in other factors. Because the predictor variables can’t equal 0, the intercept isn’t meaningfull in this case.

Also, we can use the confint() function to obtain confident intervals for the coefficients.

1	exp(confint(fit.full))

               2.5 %   97.5 %
(Intercept) 1.488639 1.839411
load        1.036550 1.254344
flow        1.625624 1.967101
speed       2.891712 3.499836
type        1.260412 1.525555

97.5 percent confidence intervals fro each of the coefficients on an odds scale are printed above.

Finally, we use compare the residual deviance with the residual degrees of freedom to detect the overdispersion.

1	deviance(fit.full)/df.residual(fit.full)

1	[1] 0.009570289

which is far away from 1, suggesting the overdispersion.

Deviance can be draw by the command

1	fit.full$deviance

1	[1] 0.1052732

Bonus Problems

1.

Prove that the deduction matrix of variance $\mathrm{Var}(\hat\beta^*{q+1}) - \mathrm{Var}(\hat\beta{q+1})$ is positive semidefinite.

What we knew is that

$$
\begin{align}
&\mathrm{Var}(\hat\beta^) = \sigma^2(X’X)^{-1} \
&\mathrm{Var}(\hat\beta^{q+1}) = \sigma^2(X{q+1}’X_{q+1})^{-1}
\end{align}
$$

where $X = \left( \begin{array}{c:c} X_{q+1} & X_r \end{array}\right)$.

Consider the normal matrix

$$
X’X = \begin{pmatrix}
X{q+1}’X{q+1} & X_{q+1}’X_r \
Xr’X{q+1} & X_r’Xr
\end{pmatrix} = \begin{pmatrix}
\Sigma{11} & \Sigma{12} \
\Sigma{21} & \Sigma_{22}
\end{pmatrix}
$$

Inverse of this bolck matrix is already given as

$$
\begin{pmatrix}
\Sigma{11} & \Sigma{12} \
\Sigma{21} & \Sigma{22} \
\end{pmatrix}^{-1} =\, \begin{pmatrix}
\Sigma{11}^{-1}+\Sigma{11}^{-1}\Sigma{12}\Sigma{22.1}^{-1}\Sigma{21}\Sigma{11}^{-1} & -\Sigma{11}^{-1}\Sigma{12}\Sigma{22.1}^{-1} \
-\Sigma{22.1}^{-1}\Sigma{21}\Sigma{11}^{-1} & \Sigma_{22.1}^{-1} \
\end{pmatrix}
$$

where
$$
\Sigma{22.1}=\Sigma{22}-\Sigma{21}\Sigma{11}^{-1}\Sigma_{12}
$$

Thus, we say the first entry of the inverse matrix should be

$$
\Sigma{11}^{-1}+\Sigma{11}^{-1}\Sigma{12}\Sigma{22.1}^{-1}\Sigma{21}\Sigma{11}^{-1}
$$

Set $A = \Sigma{11}^{-1}\Sigma{12} = (X{q+1}’X{q+1})^{-1}\cdot(X_{q+1}’X_r)$ to be the alias matrix. Hence we have

$$
\mathrm{Var}(\hat\beta^*{q+1}) - \mathrm{Var}(\hat\beta{q+1}) = \sigma^2 A\big(X_r’X_r - Xr’X{q+1}A\big)^{-1}\,A’
$$

which is obviously positive semidefinite.

2.

Prove that

$$
F = \dfrac{SS_A/(a-1)}{SSE/(n-a)} \sim F{a-1, n-a}
$$

under the hypothesis $H_0$.

What we learned is that

$$
\begin{align}
SST &= \sum{i=1}^a \sum_{j=1}^{ni} (y{ij} - \bar y)^2 \
&= \sum{i=1}^a \sum{j=1}^{ni} (y{ij} - \bar y{i\cdot} + \bar y{i\cdot} - \bar y)^2 \
&= \underbrace{\sum{i=1}^a \sum{j=1}^{ni} (y{ij} - \bar y{i\cdot})^2}{SSE} + \underbrace{\sum{i=1}^a \sum_{j=1}^{ni} (\bar y{i\cdot} - \bar y)^2}_{SS_A}
\end{align}
$$

For clarity, we give the coresponding matrix form:

$$
\begin{align}
SST &= y’(I-\frac1n \boldsymbol{1}\boldsymbol{1}’)y \
&= y’(I - A + A - \frac1n \boldsymbol{1}\boldsymbol{1}’)y \
&= \underbrace{y’(I - A)y}{SSE} + \underbrace{y’(A - \frac1n \boldsymbol{1}\boldsymbol{1}’)y}{SS_A}
\end{align}
$$

where $A = \dfrac{1}{n1}\boldsymbol{1}{n1}\boldsymbol{1}{n_1}’ \oplus \cdots \oplus \dfrac{1}{na}\boldsymbol{1}{na}\boldsymbol{1}{n_a}’$

Since $\boldsymbol{1}_{ni}’\boldsymbol{1}{n_i} = ni$, we know that $I{n_i} - \dfrac{1}{ni}\boldsymbol{1}{ni}\boldsymbol{1}{n_i}’$ is a Household matrix, which is definitely symetric and idempotent. i.e.

$$
A’=A \qquad A^2 = A
$$

Next, we assume that $B=I-A$, $C=A-\dfrac1n\boldsymbol{1}\boldsymbol{1}’$

We claim that

$$
\begin{align}
&B’=B \qquad B^2 = B \
&C’=C \qquad C^2 = C
\end{align}
$$

with its rank

$$
\mathrm{rank}(B)=n-a \quad \mathrm{rank}(C)=a-1
$$

Thus, under the normal hyperthesis $Y\sim N(0,\sigma^2I)$, we have

$$
\begin{align}
&\frac{Y’BY}{\sigma^2} \sim \chi^2(n-a) \
&\frac{Y’CY}{\sigma^2} \sim \chi^2(a-1)
\end{align}
$$

together with

$$
\begin{align}
BC &= (I-A)(A-\frac1n\boldsymbol{1}\boldsymbol{1}’) \
&= A-\frac1n\boldsymbol{1}\boldsymbol{1}’-A^2+\frac1n A\boldsymbol{1}\boldsymbol{1}’ \
&= A-\frac1n\boldsymbol{1}\boldsymbol{1}’-A+\frac1n\boldsymbol{1}\boldsymbol{1}’ \
&= 0
\end{align}
$$

The last two equation follows from the fact that

$$
\begin{align}
A \cdot \boldsymbol{1} &= \big( \frac{1}{n1}\boldsymbol{1}{n1}\boldsymbol{1}{n_1}’ \oplus \cdots \oplus \frac{1}{na}\boldsymbol{1}{na}\boldsymbol{1}{na}’ \big) \cdot \big(\boldsymbol{1}{n1} \oplus \cdots \oplus \boldsymbol{1}{n_a}\big) \
&= \frac{1}{n1}\boldsymbol{1}{n1}\boldsymbol{1}{n1}’\boldsymbol{1}{n_1} \oplus \cdots \oplus \frac{1}{na}\boldsymbol{1}{na}\boldsymbol{1}{na}’\boldsymbol{1}{na} \
&= \boldsymbol{1}{n1} \oplus \cdots \oplus \boldsymbol{1}{n_a} = \boldsymbol{1}
\end{align}
$$

Hence, we know that $Y’(I-A)Y$ and $Y’(A-\dfrac1n\boldsymbol{1}\boldsymbol{1}’)Y$ are mutually independent, which indicates

$$
F=\frac{Y’CY/(a-1)}{Y’BY/(n-a)} = \frac{SS_A/(a-1)}{SSE/(n-a)} \sim F{a-1, n-a}
$$

Furthermore, we can derive two unbiased estimates.

$$
\begin{align}
E(SS_E) &= E\big(Y’(I-A)Y\big) \
&= \mu’(I-A)\mu + \mathrm{tr}\big((I-A)\sigma^2I\big) \
&= \mu’\mu - \mu’A\mu + \sigma^2(n-a) \
&= \sigma^2(n-a)
\end{align}
$$

The last equation is derived from the hypothesis.

Similarly,

$$
\begin{align}
E(SS_A) &= E\big(Y’(A-\frac1n\boldsymbol{1}\boldsymbol{1}’)Y\big) \
&= \mu’(A-\frac1n\boldsymbol{1}\boldsymbol{1}’)\mu + \mathrm{tr}\big((A-\frac1n\boldsymbol{1}\boldsymbol{1}’)\sigma^2I\big) \
&= \mu’\mu - \mu’A\mu + \sigma^2(a-1) \
&= \sigma^2(a-1)
\end{align}
$$

That’s to say

$$
E\big(SS_E/(n-a)\big) = E\big(SS_A/(a-1)\big) = \sigma^2
$$

Kai Liu, wrote on 2017/11/28

Student ID: 116071910015

Mail: liu1995@sjtu.edu.cn