Project 3 of Statistics

Homework 5

13.13

The gamma probability density function is

$$
f(y,r,\lambda) = \frac{\lambda^r}{\Gamma(r)} e^{-\lambda y} y^{r-1}, \quad\text{for } y,\lambda\geq 0
$$

Show that the gamma is a member of the exponential family.

Proof:

Since

$$
\begin{align}
f(y,r,\lambda) &= \exp\lbrace r\log\lambda - \log\Gamma(r) - \lambda y + (r-1)\log y \rbrace \\
&= \exp\Big\lbrace \underbrace{\Big(\frac{\lambda}{r}y - \log\frac{\lambda}{r}\Big)\Big/\Big(-\frac1r\Big)}_{[\theta y - b(\theta)]/a(\phi)} + \underbrace{(r-1)\log y - \log\Gamma(r) + r\log r}_{h(y,\phi)} \Big\rbrace
\end{align}
$$

Setting $\theta = \dfrac{\lambda}{r}$ and $\phi = r$, we can read off $b(\theta) = \log\theta$, $a(\phi) = -\dfrac1\phi$, and

$$
h(y,\phi) = (\phi-1)\log y - \log\Gamma(\phi) + \phi\log\phi
$$

Thus
$$
\begin{align}
E(y) &= \frac{\mathrm{d}b(\theta)}{\mathrm{d}\theta} = \frac{1}{\theta} = \frac{r}{\lambda} \\
\mathrm{Var}(y) &= \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2} \cdot a(\phi) = -\frac{1}{\theta^2}\cdot\Big(-\frac1r\Big) = \frac{r}{\lambda^2}
\end{align}
$$
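As a quick numerical sanity check (a sketch, not part of the proof; the shape $r=2.5$ and rate $\lambda=1.7$ are arbitrary choices), we can simulate gamma draws in R and compare the sample moments with $r/\lambda$ and $r/\lambda^2$:

```r
# Simulation check of E(y) = r/lambda and Var(y) = r/lambda^2
set.seed(1)
r <- 2.5; lambda <- 1.7                  # arbitrary shape and rate
y <- rgamma(1e6, shape = r, rate = lambda)
c(mean(y), r / lambda)                   # both approximately 1.47
c(var(y), r / lambda^2)                  # both approximately 0.87
```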

13.14

The exponential probability density function is

$$
f(y,\lambda) = \lambda e^{-\lambda y}, \quad\text{for } y,\lambda\geq 0
$$

Show that the exponential distribution is a member of the exponential family.

Proof:

Notice that the exponential distribution is a special case of the gamma distribution (with $r=1$). We can write

$$
f(y, \lambda) = \exp(-\lambda y + \log\lambda)
$$

Set $\theta = -\lambda$, $b(\theta) = -\log\lambda = -\log(-\theta)$, with $\phi = 1$, $a(\phi) = 1$, and $h(y,\phi)=0$.

Thus, a similar result can be derived:

$$
\begin{align}
E(y) &= \frac{\mathrm{d}b(\theta)}{\mathrm{d}\theta} = -\frac{1}{\theta} = \frac{1}{\lambda} \\
\mathrm{Var}(y) &= \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2}\cdot a(\phi) = \frac{1}{\theta^2} = \frac{1}{\lambda^2}
\end{align}
$$

which confirms that the exponential distribution is a member of the natural exponential family (NEF).
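Since the exponential is the gamma with $r=1$, the two densities can also be compared numerically (a quick sketch; the grid and the rate $\lambda = 0.8$ are arbitrary):

```r
# The exponential density coincides with the gamma density at shape r = 1
y <- seq(0.1, 5, by = 0.1)
lambda <- 0.8
all.equal(dexp(y, rate = lambda),
          dgamma(y, shape = 1, rate = lambda))   # TRUE
```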

13.15

The negative binomial probability mass function is

$$
f(y,\pi,\alpha) = \binom{y+\alpha-1}{\alpha-1} \pi^{\alpha}(1-\pi)^y, \quad\text{for } y=0,1,\ldots,\ \alpha>0, \text{ and } 0\leq\pi\leq 1
$$

Show that the negative binomial is a member of the exponential family.

Proof:

$$
\begin{align}
f(y,\pi,\alpha) &= \binom{y+\alpha-1}{\alpha-1} \pi^{\alpha}(1-\pi)^y \\
&= \frac{\alpha}{y+\alpha} \binom{y+\alpha}{y}\pi^{\alpha}(1-\pi)^y \\
\text{Set } n=y+\alpha\quad &= \Big(1-\frac{y}{n}\Big)\binom{n}{y}\pi^{n-y}(1-\pi)^y \\
&= \exp\Big\lbrace (n-y)\log\pi + y\log(1-\pi) + \log\Big[\Big(1-\frac{y}{n}\Big)\binom{n}{y}\Big] \Big\rbrace \\
&= \exp\Big\lbrace \underbrace{\Big( \frac{y}{n}\log\frac{1-\pi}{\pi} + \log\pi \Big)\Big/\frac1n}_{[\frac{y}{n}\theta - b(\theta)]/a(\phi)} + \underbrace{\log\Big[\Big(1-\frac{y}{n}\Big)\binom{n}{y}\Big]}_{h(y,\phi)} \Big\rbrace
\end{align}
$$

Set
$$
\begin{align}
&\theta=\log\frac{1-\pi}{\pi} \\
&b(\theta)=-\log\pi = \log(1+e^{\theta})
\end{align}
$$

with $\phi=1$, $a(\phi)=\dfrac1n$, and $h(y,\phi)=\log\Big[\Big(1-\dfrac{y}{n}\Big)\dbinom{n}{y}\Big]$.

Thus, with $\dfrac{y}{n}$ playing the role of the response (as in the binomial case),

$$
\begin{align}
E\Big(\frac{y}{n}\Big) &= \frac{\mathrm{d}b(\theta)}{\mathrm{d}\theta} = \frac{e^\theta}{1+e^\theta} = 1-\pi \\
\mathrm{Var}\Big(\frac{y}{n}\Big) &= \frac{\mathrm{d}^2 b(\theta)}{\mathrm{d}\theta^2}\cdot a(\phi) = \frac1n\cdot\frac{e^\theta}{(1+e^\theta)^2} = \frac{\pi(1-\pi)}{n}
\end{align}
$$
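As a numerical check of the factorization above (a sketch; the values of $y$, $\alpha$, and $\pi$ are arbitrary), we can compare R's built-in negative binomial pmf with the exponential-family expression:

```r
# Compare dnbinom() with the exponential-family form derived above
y <- 3; alpha <- 4; p <- 0.35            # p plays the role of pi
n <- y + alpha
theta <- log((1 - p) / p)
b <- log(1 + exp(theta))                 # b(theta) = -log(pi)
h <- log((1 - y / n) * choose(n, y))     # h(y, phi)
exp((y / n * theta - b) / (1 / n) + h)   # approximately 0.0824
dnbinom(y, size = alpha, prob = p)       # same value
```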

13.16

The data in the table below are from an experiment designed to study the advance rate $y$ of a drill. The four design factors are $x_1$=load, $x_2$=flow, $x_3$=drilling speed, and $x_4$=type of drilling mud (the original experiment is described by Cuthbert Daniel in his 1976 book on industrial experimentation).


| Observation | $x_1$ | $x_2$ | $x_3$ | $x_4$ | Advance Rate, $y$ |
| ----------- | ----- | ----- | ----- | ----- | ----------------- |
| 1 | - | - | - | - | 1.68 |
| 2 | + | - | - | - | 1.98 |
| 3 | - | + | - | - | 3.28 |
| 4 | + | + | - | - | 3.44 |
| 5 | - | - | + | - | 4.98 |
| 6 | + | - | + | - | 5.70 |
| 7 | - | + | + | - | 9.97 |
| 8 | + | + | + | - | 9.07 |
| 9 | - | - | - | + | 2.07 |
| 10 | + | - | - | + | 2.44 |
| 11 | - | + | - | + | 4.09 |
| 12 | + | + | - | + | 4.53 |
| 13 | - | - | + | + | 7.77 |
| 14 | + | - | + | + | 9.43 |
| 15 | - | + | + | + | 11.75 |
| 16 | + | + | + | + | 16.30 |

Fit a generalized linear model to the advance rate response. Use a gamma response distribution and a log link, and include all four regressors in the linear predictor.

First, we write the data into a csv file named drill.csv and import it into an R data frame.

```r
data <- read.table("/path/to/drill.csv", header = TRUE, sep = ",")
```
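Alternatively (a sketch, in case you prefer not to create the csv file by hand), the same data frame can be built directly in R; expand.grid() varies the first factor fastest, which matches the run order of the table above:

```r
# Build the 2^4 design and the response directly, in the order of the table
signs <- c("-", "+")
data <- expand.grid(x1 = signs, x2 = signs, x3 = signs, x4 = signs)
data$y <- c(1.68, 1.98, 3.28, 3.44, 4.98, 5.70, 9.97, 9.07,
            2.07, 2.44, 4.09, 4.53, 7.77, 9.43, 11.75, 16.30)
```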

To study the advance rate of the drill, we transform the $\pm$ columns into 0/1 indicator variables called load, flow, speed, and type with the following code.

```r
data$load[data$x1=="+"] <- 1
data$load[data$x1=="-"] <- 0

data$flow[data$x2=="+"] <- 1
data$flow[data$x2=="-"] <- 0

data$speed[data$x3=="+"] <- 1
data$speed[data$x3=="-"] <- 0

data$type[data$x4=="+"] <- 1
data$type[data$x4=="-"] <- 0

table(data$load)
```

```
0 1
8 8
```
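A more compact equivalent (a sketch, assuming the columns are named x1 through x4 as above) recodes all four columns in one step:

```r
# Recode all four +/- columns to 0/1 indicators at once
data[c("load", "flow", "speed", "type")] <-
  lapply(data[c("x1", "x2", "x3", "x4")], function(x) as.integer(x == "+"))
```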

These 0/1 indicators can now be used as predictor variables in a generalized linear model with a gamma response distribution and a log link.

```r
fit.full <- glm(y ~ load + flow + speed + type, data, family=Gamma(link = "log"))
summary(fit.full)
```
```
Call:
glm(formula = y ~ load + flow + speed + type, family = Gamma(link = "log"),
    data = data)

Deviance Residuals:
      Min        1Q    Median        3Q       Max
-0.162931 -0.057552  0.006942  0.060239  0.128465

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50269    0.05435   9.250 1.60e-06 ***
load         0.13126    0.04861   2.700   0.0206 *
flow         0.58123    0.04861  11.957 1.21e-07 ***
speed        1.15728    0.04861  23.808 8.18e-11 ***
type         0.32690    0.04861   6.725 3.26e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Gamma family taken to be 0.009451408)

    Null deviance: 7.02517  on 15  degrees of freedom
Residual deviance: 0.10527  on 11  degrees of freedom
AIC: 28.17

Number of Fisher Scoring iterations: 4
```

From the p-values for the regression coefficients (last column), we can see that load, flow, speed, and type each make a significant contribution to the model. Thus, for each coefficient we can reject the hypothesis that the parameter is $0$; that is to say, every regression coefficient in the model is statistically significant $(p<0.05)$.

Let’s take a look at the regression coefficients:

```r
coef(fit.full)
```

```
(Intercept)        load        flow       speed        type
  0.5026869   0.1312552   0.5812258   1.1572823   0.3268990
```

Since coefficients on the log scale are difficult to interpret, we can exponentiate them to obtain multiplicative effects on the mean response:

```r
exp(coef(fit.full))
```

```
(Intercept)        load        flow       speed        type
   1.653157    1.140259    1.788229    3.181276    1.386661
```

Now we can see that the mean advance rate is multiplied by a factor of 1.140259 when load moves from its low to its high level (holding flow, speed, and type constant). Similar increases can be read off for the other factors. Since all predictors are coded 0/1, the intercept term $\exp(0.5027)\approx 1.65$ estimates the mean advance rate when all four factors are at their low level.

Also, we can use the confint() function to obtain confidence intervals for the coefficients.

```r
exp(confint(fit.full))
```

```
               2.5 %   97.5 %
(Intercept) 1.488639 1.839411
load        1.036550 1.254344
flow        1.625624 1.967101
speed       2.891712 3.499836
type        1.260412 1.525555
```

The 95 percent confidence intervals for each of the coefficients, exponentiated onto the response scale, are printed above.

Finally, we compare the residual deviance with the residual degrees of freedom to check for overdispersion.

```r
deviance(fit.full)/df.residual(fit.full)
```

```
[1] 0.009570289
```

This ratio is far below 1, so there is no evidence of overdispersion.

The residual deviance itself can be extracted with

```r
fit.full$deviance
```

```
[1] 0.1052732
```
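Because of the log link, fitted means are the exponentiated linear predictor. As an illustration (a sketch; the chosen setting corresponds to run 16, with all factors at their high level), predict() with type = "response" returns the fitted advance rate directly:

```r
# Fitted mean advance rate with all four factors at their high level
newrun <- data.frame(load = 1, flow = 1, speed = 1, type = 1)
predict(fit.full, newdata = newrun, type = "response")
# equivalently exp(sum(coef(fit.full))), about 14.9, vs. the observed 16.30
```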

Bonus Problems

1.

Prove that the difference of variance matrices $\mathrm{Var}(\hat\beta^*_{q+1}) - \mathrm{Var}(\hat\beta_{q+1})$ is positive semidefinite.

What we know is that (writing $\hat\beta^*$ for the full-model estimator and $\hat\beta_{q+1}$ for the reduced-model estimator)

$$
\begin{align}
&\mathrm{Var}(\hat\beta^*) = \sigma^2(X'X)^{-1} \\
&\mathrm{Var}(\hat\beta_{q+1}) = \sigma^2(X_{q+1}'X_{q+1})^{-1}
\end{align}
$$

where $X = \begin{pmatrix} X_{q+1} & X_r \end{pmatrix}$.

Consider the normal matrix

$$
X'X = \begin{pmatrix}
X_{q+1}'X_{q+1} & X_{q+1}'X_r \\
X_r'X_{q+1} & X_r'X_r
\end{pmatrix} = \begin{pmatrix}
\Sigma_{11} & \Sigma_{12} \\
\Sigma_{21} & \Sigma_{22}
\end{pmatrix}
$$

The inverse of this block matrix is given by

$$
\begin{pmatrix}
\Sigma_{11} & \Sigma_{12} \\
\Sigma_{21} & \Sigma_{22}
\end{pmatrix}^{-1} = \begin{pmatrix}
\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1} & -\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1} \\
-\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1} & \Sigma_{22.1}^{-1}
\end{pmatrix}
$$

where
$$
\Sigma_{22.1}=\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}
$$

Thus, the upper-left block of the inverse, which equals $\frac{1}{\sigma^2}\mathrm{Var}(\hat\beta^*_{q+1})$, is

$$
\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1}
$$

Set $A = \Sigma_{11}^{-1}\Sigma_{12} = (X_{q+1}'X_{q+1})^{-1}(X_{q+1}'X_r)$, the alias matrix. Hence we have

$$
\mathrm{Var}(\hat\beta^*_{q+1}) - \mathrm{Var}(\hat\beta_{q+1}) = \sigma^2 A\big(X_r'X_r - X_r'X_{q+1}A\big)^{-1}A'
$$

This is positive semidefinite: $X_r'X_r - X_r'X_{q+1}A = \Sigma_{22.1} = X_r'(I-H_{q+1})X_r$ with $H_{q+1} = X_{q+1}(X_{q+1}'X_{q+1})^{-1}X_{q+1}'$ a projection matrix, so $\Sigma_{22.1}$ is positive definite (assuming $X$ has full column rank), and therefore $A\Sigma_{22.1}^{-1}A'$ is positive semidefinite.
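A quick numerical check (a sketch with a random full-rank design; the dimensions $n=20$, $q+1=3$, $r=2$ are arbitrary) confirms that the difference is positive semidefinite (taking $\sigma^2=1$):

```r
# Check that Var(beta*_{q+1}) - Var(beta_{q+1}) is positive semidefinite
set.seed(42)
X1 <- cbind(1, matrix(rnorm(40), 20, 2))   # X_{q+1}: n = 20, q + 1 = 3
Xr <- matrix(rnorm(40), 20, 2)             # X_r: r = 2 extra regressors
X  <- cbind(X1, Xr)
V.full    <- solve(t(X) %*% X)[1:3, 1:3]   # Var of beta*_{q+1} (sigma^2 = 1)
V.reduced <- solve(t(X1) %*% X1)           # Var of beta_{q+1}
eigen(V.full - V.reduced)$values           # all >= 0 up to rounding
```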

2.

Prove that

$$
F = \dfrac{SS_A/(a-1)}{SS_E/(n-a)} \sim F_{a-1,\,n-a}
$$

under the hypothesis $H_0$.

What we learned is that

$$
\begin{align}
SST &= \sum_{i=1}^a \sum_{j=1}^{n_i} (y_{ij} - \bar y)^2 \\
&= \sum_{i=1}^a \sum_{j=1}^{n_i} (y_{ij} - \bar y_{i\cdot} + \bar y_{i\cdot} - \bar y)^2 \\
&= \underbrace{\sum_{i=1}^a \sum_{j=1}^{n_i} (y_{ij} - \bar y_{i\cdot})^2}_{SS_E} + \underbrace{\sum_{i=1}^a \sum_{j=1}^{n_i} (\bar y_{i\cdot} - \bar y)^2}_{SS_A}
\end{align}
$$

For clarity, we give the corresponding matrix form:

$$
\begin{align}
SST &= y'(I-\frac1n \boldsymbol{1}\boldsymbol{1}')y \\
&= y'(I - A + A - \frac1n \boldsymbol{1}\boldsymbol{1}')y \\
&= \underbrace{y'(I - A)y}_{SS_E} + \underbrace{y'(A - \frac1n \boldsymbol{1}\boldsymbol{1}')y}_{SS_A}
\end{align}
$$

where $A = \dfrac{1}{n_1}\boldsymbol{1}_{n_1}\boldsymbol{1}_{n_1}' \oplus \cdots \oplus \dfrac{1}{n_a}\boldsymbol{1}_{n_a}\boldsymbol{1}_{n_a}'$

Since $\boldsymbol{1}_{n_i}'\boldsymbol{1}_{n_i} = n_i$, each block $\dfrac{1}{n_i}\boldsymbol{1}_{n_i}\boldsymbol{1}_{n_i}'$ is an averaging (projection) matrix, which is symmetric and idempotent; hence so is $A$, i.e.

$$
A’=A \qquad A^2 = A
$$

Next, we set $B=I-A$ and $C=A-\dfrac1n\boldsymbol{1}\boldsymbol{1}'$.

We claim that

$$
\begin{align}
&B'=B \qquad B^2 = B \\
&C'=C \qquad C^2 = C
\end{align}
$$

with ranks

$$
\mathrm{rank}(B)=n-a \quad \mathrm{rank}(C)=a-1
$$

Thus, under the normality assumption and $H_0$ we have $Y\sim N(\mu\boldsymbol{1},\sigma^2I)$; since $B\boldsymbol{1}=C\boldsymbol{1}=0$ (both follow from $A\boldsymbol{1}=\boldsymbol{1}$, verified below), the noncentrality parameters vanish and

$$
\begin{align}
&\frac{Y'BY}{\sigma^2} \sim \chi^2(n-a) \\
&\frac{Y'CY}{\sigma^2} \sim \chi^2(a-1)
\end{align}
$$

together with

$$
\begin{align}
BC &= (I-A)(A-\frac1n\boldsymbol{1}\boldsymbol{1}') \\
&= A-\frac1n\boldsymbol{1}\boldsymbol{1}'-A^2+\frac1n A\boldsymbol{1}\boldsymbol{1}' \\
&= A-\frac1n\boldsymbol{1}\boldsymbol{1}'-A+\frac1n\boldsymbol{1}\boldsymbol{1}' \\
&= 0
\end{align}
$$

The last two equalities follow from the fact that

$$
\begin{align}
A \cdot \boldsymbol{1} &= \Big( \frac{1}{n_1}\boldsymbol{1}_{n_1}\boldsymbol{1}_{n_1}' \oplus \cdots \oplus \frac{1}{n_a}\boldsymbol{1}_{n_a}\boldsymbol{1}_{n_a}' \Big) \cdot \big(\boldsymbol{1}_{n_1} \oplus \cdots \oplus \boldsymbol{1}_{n_a}\big) \\
&= \frac{1}{n_1}\boldsymbol{1}_{n_1}\boldsymbol{1}_{n_1}'\boldsymbol{1}_{n_1} \oplus \cdots \oplus \frac{1}{n_a}\boldsymbol{1}_{n_a}\boldsymbol{1}_{n_a}'\boldsymbol{1}_{n_a} \\
&= \boldsymbol{1}_{n_1} \oplus \cdots \oplus \boldsymbol{1}_{n_a} = \boldsymbol{1}
\end{align}
$$

Hence, since $BC=0$, the quadratic forms $Y'(I-A)Y$ and $Y'(A-\dfrac1n\boldsymbol{1}\boldsymbol{1}')Y$ are mutually independent (by Craig's theorem), which gives

$$
F=\frac{Y'CY/(a-1)}{Y'BY/(n-a)} = \frac{SS_A/(a-1)}{SS_E/(n-a)} \sim F_{a-1,\,n-a}
$$

Furthermore, we can derive two unbiased estimators of $\sigma^2$.

$$
\begin{align}
E(SS_E) &= E\big(Y'(I-A)Y\big) \\
&= \mu'(I-A)\mu + \mathrm{tr}\big((I-A)\sigma^2I\big) \\
&= \mu'\mu - \mu'A\mu + \sigma^2(n-a) \\
&= \sigma^2(n-a)
\end{align}
$$

The last equality holds because $\mu$ is constant within each group, so $A\mu=\mu$ and $\mu'\mu - \mu'A\mu = 0$.

Similarly,

$$
\begin{align}
E(SS_A) &= E\big(Y'(A-\frac1n\boldsymbol{1}\boldsymbol{1}')Y\big) \\
&= \mu'(A-\frac1n\boldsymbol{1}\boldsymbol{1}')\mu + \mathrm{tr}\big((A-\frac1n\boldsymbol{1}\boldsymbol{1}')\sigma^2I\big) \\
&= \mu'A\mu - \frac1n\mu'\boldsymbol{1}\boldsymbol{1}'\mu + \sigma^2(a-1) \\
&= \sigma^2(a-1)
\end{align}
$$

where the last equality uses $H_0$: with $\mu=\mu\boldsymbol{1}$, both $\mu'A\mu$ and $\frac1n\mu'\boldsymbol{1}\boldsymbol{1}'\mu$ equal $n\mu^2$.

That is to say,

$$
E\big(SS_E/(n-a)\big) = E\big(SS_A/(a-1)\big) = \sigma^2
$$
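As a numerical illustration (a sketch with an arbitrary unbalanced design of $a=3$ groups of sizes 3, 4, and 5), we can verify the matrix identities used above:

```r
# Verify the projection-matrix identities behind the ANOVA decomposition
ni <- c(3, 4, 5); n <- sum(ni); a <- length(ni)
A <- matrix(0, n, n)
groups <- split(seq_len(n), rep(seq_along(ni), ni))
for (g in groups) A[g, g] <- 1 / length(g)   # block-diagonal averaging matrix
B <- diag(n) - A
C <- A - matrix(1/n, n, n)
all.equal(A %*% A, A)                        # A is idempotent: TRUE
max(abs(B %*% C))                            # BC = 0 (up to rounding)
c(sum(diag(B)), sum(diag(C)))                # traces = ranks: n - a = 9, a - 1 = 2
```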


Kai Liu, written on 2017/11/28

Student ID: 116071910015

Mail: liu1995@sjtu.edu.cn
