Assignment 1 of Advanced Computational Method

Homework 1

The Monte Carlo method is a technique for computing the expectation of a random variable.

The basic steps are as follows:

  1. draw random samples $\{x_1, x_2, \cdots, x_n\}$ according to $p(x)$;

  2. compute the Monte Carlo estimator of $E(f)$ from

$$
\hat{f}_{\text{mc}}=\frac{1}{n}\sum_{i=1}^n f(x_i)
$$
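As a minimal generic sketch of these two steps (here `sampler` is a hypothetical helper, assumed to draw $n$ samples from $p$):

```julia
# Generic Monte Carlo estimator (sketch): average f over n samples from p.
# `sampler(n)` is an assumed user-supplied function returning n draws from p.
mc_estimate(f::Function, sampler::Function, n::Int) = mean(f.(sampler(n)))
```

For example, `mc_estimate(x -> x^2, n -> randn(n), 10^6)` estimates $E[x^2]=1$ for a standard normal.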

Problem 1

Suppose $x_1, x_2, \cdots, x_{10}$ are independent and identically distributed, i.e. $x_i\sim N(0,1)$, $i=1,2,\cdots,10$.

Set $f(x)=\sum_{i=1}^{10}x_i^2$ and use the Monte Carlo method to compute $E[f(x)]$.

Compare your results with the exact value.

Solution:

From basic statistics, we can deduce that
$$
\begin{aligned}
E[f(x)] &= E\sum_{i=1}^{10}x_i^2 = 10\,E[x_i^2] \\
&= 10\,[\operatorname{Var}(x_i)+(E[x_i])^2] \\
&= 10\times(1+0) \\
&= 10
\end{aligned}
$$

Thus, the exact value of $E[f(x)]$ is 10; indeed, $f(x)\sim\chi^2_{10}$, and the mean of a chi-squared distribution equals its degrees of freedom.
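As a quick sanity check, the Distributions package (also used for sampling below) can return this moment directly:

```julia
using Distributions
mean(Chisq(10))   # mean of the chi-squared distribution with 10 degrees of freedom
# 10.0
```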

Next, we implement a Monte Carlo estimator of $E[f(x)]$:

```julia
using Distributions

@everywhere function MC_mean(N::Int)
    # N = number of Monte Carlo points
    x = rand(Normal(0, 1), N)   # draw random samples
    return 10 * mean(x.^2)      # the Monte Carlo estimator of E[f]
end
```

```julia
MC_mean(10^6)
```
```
9.98580160514502
```

A billion Monte Carlo steps yield an even better estimate:

```julia
MC_mean(10^9)
```
```
10.000288570895224
```

Now, let’s take a brief look at the time cost.

```julia
@elapsed MC_mean(10^6)
```
```
0.013395715
```
```julia
@elapsed MC_mean(10^9)
```
```
50.939757386
```

When it comes to Monte Carlo simulation, our goal is to get good data as fast as possible, so let's look at how to speed up our code. First, notice that the time cost grows at least in proportion to the sample size: the one-million-sample calculation is over 1000 times faster than the billion-sample one.

```julia
(@elapsed MC_mean(10^9)) / (@elapsed MC_mean(10^6))
```
```
1286.3957285556933
```

Of course, we can do better than this. In preparation for larger calculations, let us introduce parallel computation for the Monte Carlo simulation. We build a function that takes MC_mean and splits the job evenly across our processors:

```julia
function parallel_calc(f::Function, N::Int, proc::Int=4)

    # f = some predefined function
    # N = number of Monte Carlo steps
    # proc = number of additional processors, preset to four

    n = round(Int, N/proc)  # split the Monte Carlo steps equally between processors

    proc_sum = @parallel (+) for i = 1:proc  # loop over the processors and distribute jobs
        f(n)
    end

    return proc_sum / proc  # return the average of the parallel jobs
end
```
```
parallel_calc (generic function with 2 methods)
```

Now, we can quickly and efficiently calculate a billion MC samples:

```julia
@elapsed parallel_calc(MC_mean, 10^9)
```
```
21.05725137
```

An easier way to see the speedup from parallel processing is to plot the elapsed times. Below, we define a simple function that plots the elapsed times of MC_mean (in blue) and parallel_calc (in red) against the number of MC samples:

```julia
using PyPlot
PyPlot.svg(true)

function mean_time_plotter(N::Int)
    n = Int64(N/10)  # ensure that we only plot 10 points

    plot(collect(1:n:N), [@elapsed MC_mean(i) for i in 10:n:N],
         linestyle="--", marker=".", color="blue", label="Single Processor")
    # plot the elapsed time for 10 values of MC steps from 10 to N

    plot(collect(1:n:N), [@elapsed parallel_calc(MC_mean, i) for i in 10:n:N],
         linestyle="--", marker=".", color="red", label="Parallel Processor")

    title("Comparison of Parallel vs. Single Processor MC Efficiency")  # title and labels
    ylabel("Elapsed Time (sec)")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)  # anchor the legend
end
```
```
mean_time_plotter (generic function with 1 method)
```
```julia
mean_time_plotter(10^9)
```

(Plot: elapsed times of the single-processor and parallel runs vs. the number of MC samples.)

We can see that the elapsed times are approximately equal around 0.7 billion MC samples, after which parallel processing begins to win over the single processor. This trend continues as we approach a billion MC samples.

Now, we turn to presenting this MC data and visualizing the error. To do this, let's write a simple program to plot the output of parallel_calc against the number of Monte Carlo samples. Because we already know the exact mean, we also plot the accepted value as a red horizontal line. Finally, we calculate the percent error and display it in the title. The format used here will be useful in later code where we compare against an exact analytical value; the plotting functions that follow are quite similar to one another.

Unfortunately, the parallel efficiency is not high below roughly a billion samples. In many practical cases, when we just need a quick result, a single processor is good enough.

```julia
function MC_mean_plotter(f::Function, N::Int, m::Int)

    n = Int64(N/m)  # step size between plotted points

    mean_list = [parallel_calc(f, i) for i in n:n:N]  # parallel MC estimates for i samples

    plot(collect(n:n:N), mean_list, linestyle="--", marker=".",
         color="blue", label=string("MC Estimator: ", round(mean_list[end], 6)))
    # plot the MC estimate of 10 vs. the number of MC steps

    axhline(y=10, linestyle="-", color="red", label=string("Exact Value: ", 10))
    # horizontal line at the exact value

    per_error = 100*abs((mean_list[end]-10)/(mean_list[end]))  # percent error

    title(string(L"Monte Carlo Estimation of $10$--within ", round(per_error, 5), "%"))
    ylabel("Estimation of 10")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end
```
```
MC_mean_plotter (generic function with 1 method)
```

To get a nice plot, we take a maximum of a billion Monte Carlo samples and plot a total of a hundred points:

```julia
MC_mean_plotter(MC_mean, 10^9, 100)
```

(Plot: MC estimates of 10 vs. the number of samples, with the exact value and percent error shown.)

Although we appear to obtain a reasonable value for 10 (within 0.00421%!), it is difficult to see how well the data is converging. Therefore, let's build a new function that calculates a list of $M$ Monte Carlo estimates of 10 at a given value of $N$. From this list, we can obtain the average and standard deviation:

```julia
function stat_calc(f::Function, N::Int, M::Int)

    # f = some general function
    # N = number of Monte Carlo steps
    # M = number of repetitions of a single MC run for statistical analysis

    avg_list = [parallel_calc(f, N) for j in 1:M]  # list of MC data of length M

    avg = sum(avg_list)/M    # average of the MC data
    stdev = std(avg_list)    # standard deviation of the MC data

    return avg, stdev
end
```
```
stat_calc (generic function with 1 method)
```

Now, we can modify MC_mean_plotter to include error bars to better visualize the convergence of our data:

```julia
function MC_errorbar(f::Function, N::Int, m::Int, M::Int)

    # f = some general function
    # N = number of Monte Carlo steps
    # m = number of points to plot
    # M = number of repetitions of a single MC run for statistical analysis

    n = Int64(N/m)  # step size between plotted points

    stat_list = [stat_calc(f, i, M) for i in n:n:N]  # averages and standard deviations from stat_calc
    avg_list = [stat_list[i][1] for i in 1:m]        # list of averages
    stdev_list = [stat_list[i][2] for i in 1:m]      # list of standard deviations

    errorbar(collect(n:n:N), avg_list, yerr=stdev_list, linestyle="--", marker=".",
             color="blue", label=string("MC Estimate: ", round(avg_list[end], 6)))
    # plot the averages vs. sample size, with error bars of one standard deviation

    axhline(y=10, linestyle="-", color="red", label=string("Exact Value: ", 10))
    # horizontal line at the exact value

    per_error = 100*abs((avg_list[end]-10)/(avg_list[end]))  # percent error

    title(string(L"Monte Carlo Estimation of $10$--within ", round(per_error, 5), "%"))
    ylabel("Estimation of 10")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end
```
```
MC_errorbar (generic function with 1 method)
```

With the MC_mean_plotter format, we can easily generalize the program in future modifications.

Now, we plot a hundred values of MC samples with a maximum of a million, with each set of samples repeated fifteen times for statistical analysis. The resulting data shows a clear convergence of the error.

```julia
MC_errorbar(MC_mean, 10^6, 100, 15)
```

(Plot: MC estimates of 10 with error bars vs. the number of samples.)

We can test our result by looking at the convergence of the standard deviation with increasing sample size. Recall the Monte Carlo estimation of the integral of some general function $f(x)$:

$$
\int_a^b f(x)\,dx = \frac{b-a}{N}\sum_{i=0}^{N-1}E(f(x_i))
$$

The st.dev. squared can easily be calculated:

$$
\sigma_{\text{int}}^2 = \frac{b-a}{N^2}\,\sigma^2\!\left( \sum_{i=0}^{N-1}E(f(x_i)) \right)
$$

If the random variables $x_i$ are uncorrelated, we can easily simplify the above further:

$$
\begin{split}
\sigma_{\text{int}}^2 &= \frac{b-a}{N^2}\sum_{i=0}^{N-1}\sigma^2(E(f(x_i))) \\
&= \frac{b-a}{N}\,\sigma^2(E(f(x)))
\end{split}
$$

Therefore, we find that the standard deviation goes as the inverse square root of the sample size:

$$
\sigma_{\text{int}}\propto\frac{1}{\sqrt{N}}
$$
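Before plotting, a quick numerical check is possible with stat_calc from above: quadrupling $N$ should roughly halve the standard deviation. (A sketch; the sample sizes and repetition count here are arbitrary choices, not from the original run.)

```julia
# Sketch: verify the 1/sqrt(N) scaling using stat_calc defined above.
s1 = stat_calc(MC_mean, 10^4, 30)[2]     # st.dev. at N = 10^4
s2 = stat_calc(MC_mean, 4*10^4, 30)[2]   # st.dev. at N = 4*10^4
s1 / s2                                  # expected to be close to 2
```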

To test this result, we plot the numerically calculated standard deviation alongside $1/\sqrt{N}$:

```julia
function MC_errorbar_visual(f::Function, N::Int, m::Int, M::Int)

    # f = some general function
    # N = number of Monte Carlo steps
    # m = number of points to plot
    # M = number of repetitions of a single MC run for statistical analysis

    n = Int64(N/m)  # step size between plotted points

    stat_list = [stat_calc(f, i, M) for i in n:n:N]  # averages and standard deviations from stat_calc
    stdev_list = [stat_list[i][2] for i in 1:m]      # list of standard deviations

    plot(collect(n:n:N), stdev_list, linestyle="--", marker=".", color="blue",
         label="MC data")  # plot the st.dev. of the MC data
    plot(collect(n:n:N), [1/i^(1/2) for i in n:n:N], linestyle="--", color="red",
         label=L"1/\sqrt{N}")  # plot the predicted scaling of the st.dev.

    title("Convergence of Error Bars in MC Data")  # title and labels
    ylabel("Standard Deviation")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end
```
```
MC_errorbar_visual (generic function with 1 method)
```

Running a maximum of a million samples with a hundred plotted points, and with each sample repeated fifteen times for statistical analysis, we find roughly the expected behavior.

```julia
MC_errorbar_visual(MC_mean, 10^6, 100, 15)
```

Problem 2

Here we want to estimate the probability that a point falls in a region $\Omega$, i.e. $P[x\in\Omega]$.

We have already learned that

$$
P[x\in\Omega] = \int_{\Omega}p(x)\,dx=\int I(x)p(x)\,dx = E_p[I]
$$

Now, set $\Omega: (x_1-5)^2+(x_2-5)^2\leq 1$, and let $p(x)$ be the PDF (probability density function) of a bivariate normal distribution, i.e. $x_1, x_2 \sim N(0,1)$ independently.

Use the importance sampling (IS) method, choosing the best IS distribution $q(x)$, to get the best result.
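For contrast, note first why plain Monte Carlo fails here. The exact probability (computed below) is about $2.2\times10^{-10}$, so a million standard-normal samples will almost surely contain no point of $\Omega$ at all. A minimal sketch of the naive estimator:

```julia
# Naive Monte Carlo baseline (sketch): sample directly from p = N(0, I_2)
# and average the indicator of Ω. With P ≈ 2.2e-10, this almost surely
# returns 0.0 at any feasible sample size.
x = randn(10^6)
y = randn(10^6)
mean((x .- 5).^2 .+ (y .- 5).^2 .<= 1)   # indicator average; ≈ 0.0
```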

The importance sampling method comes from a simple identity:

$$
\begin{aligned}
P[x\in\Omega] &= E_p[I] \\
&= \int I(x)p(x)\,dx \\
&= \int I(x)\frac{p(x)}{q(x)}\, q(x)\,dx = \int I(x)w(x)\, q(x)\,dx \\
&= E_q[I\cdot w]
\end{aligned}
$$

Thus

$$
\hat{I}_\text{IS} = \frac{1}{n} \sum_{i=1}^n I(x_i)w(x_i), \qquad x_i\sim q(x)
$$

where

$\hat{I}_\text{IS}$: IS estimator; $q(x)$: IS distribution; $w(x)$: IS weight.
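In code, the identity becomes a small generic routine. A minimal sketch (the names `is_estimate`, `pdf_p`, `pdf_q`, and `sampler` are illustrative, not from the original post):

```julia
# Generic importance-sampling estimator (sketch).
# I       = indicator (or integrand) function
# pdf_p   = target density p
# pdf_q   = proposal (IS) density q
# sampler = draws n samples from q
function is_estimate(I::Function, pdf_p::Function, pdf_q::Function,
                     sampler::Function, n::Int)
    xs = sampler(n)                                       # x_i ~ q
    return mean([I(x) * pdf_p(x)/pdf_q(x) for x in xs])   # mean of I(x_i) w(x_i)
end
```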

Assume that $q(x)$ is the PDF of a bivariate normal distribution, i.e. $q\sim N((\mu_1,\mu_2), I_2)$.

The problem is then solved in the following steps:

  1. draw samples according to $q(x)$;
  2. use these samples to evaluate the weight function $w(x)$;
  3. apply the Monte Carlo method to compute $E[I\cdot w]$;
  4. compare the result with the exact probability and tune the parameters.

To get the best result, the fourth step is always needed.

Actually, we can directly calculate the exact probability in Mathematica:

```mathematica
p = Integrate[
  PDF[BinormalDistribution[{0, 0}, {1, 1}, 0], {x, y}],
  {x, 4, 6}, {y, 5 - Sqrt[1 - (x - 5)^2], 5 + Sqrt[1 - (x - 5)^2]}] // N
```

which gives

```
2.23465*10^-10
```
```julia
using Distributions

@everywhere function MC_prob(μ1::Float64, μ2::Float64, N::Int)

    # N = number of Monte Carlo points

    # step 1. set initial values and draw random samples
    r = -1
    I = zeros(N,1)
    x = rand(Normal(μ1, 1), N)
    y = rand(Normal(μ2, 1), N)

    # step 2. use these samples to get the function values I, p, q, w
    g = -(x-5).^2 - (y-5).^2
    ## construct the indicator function
    for i = 1:N
        if g[i] > r
            I[i] = 1
        else
            I[i] = 0
        end
    end

    p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
    q = 1/(2*π) * exp.(-1/2 * ((x-μ1).^2 + (y-μ2).^2))  # q must be the actual sampling density N((μ1,μ2), I_2)
    w = p./q

    # step 3. apply Monte Carlo to calculate the expectation
    return mean(I.*w)
end
```

A natural thought is to set $\mu_1=\mu_2=5$ and $N=10^6$, but the simulated probability still deviates slightly from the exact value:

```julia
MC_prob(5.0, 5.0, 10^6)
```
```
2.2415303094407912e-10
```

Guided by the parameter-choosing rules, we set $\mu_1=\mu_2$ and make a simple plot of the estimates along the line $\mu_2=\mu_1$.

```julia
using PyPlot
PyPlot.svg(true)

function error_plotter(N::Int)
    plot(collect(1.0:0.1:5.0), [MC_prob(i, i, N) for i in 1.0:0.1:5.0],
         linestyle="--", marker=".", color="blue", label="Single Processor")

    axhline(y=2.23465e-10, linestyle="-", color="red",
            label=string("Exact Value: ", 2.23465e-10))

    title("Monte Carlo Estimation of probability")
    ylabel("Estimation of prob")
    xlabel("μ1 = μ2")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end
```
```
error_plotter (generic function with 1 method)
```
```julia
error_plotter(10^6)
```

(Plot: IS estimates along the line μ1 = μ2, with the exact value shown as a horizontal line.)

As shown in the plot, $\mu_1=\mu_2=3.9$ is the best choice here, though it is not always perfect.

For fixed parameters, we can run the calculation in parallel by defining a new function:

```julia
function fixed_MC_prob(N::Int)
    MC_prob(3.9, 3.9, N)
end
```
```
fixed_MC_prob (generic function with 1 method)
```
```julia
fixed_MC_prob(10^6)
```
```
2.3208602460224826e-10
```

Recalling the parallel_calc function from before, it is much more convenient to obtain results for large simulations by running:

```julia
parallel_calc(fixed_MC_prob, 10^9)
```

Also, we can measure the elapsed time in a similar way to mean_time_plotter (not shown here).
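A minimal timing comparison in that spirit might look like this (a sketch, not from the original post; the sample size is an arbitrary choice):

```julia
# Sketch: compare single-processor and parallel timings for the IS estimator.
@elapsed fixed_MC_prob(10^8)                  # single processor
@elapsed parallel_calc(fixed_MC_prob, 10^8)   # parallel, via parallel_calc
```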

Homework 2

Problem 3

Redo Problem 2 with the cross-entropy (CE) method. Find the best distribution $q(x)$ to obtain a convincing result.

CE Method for rare probability estimation

Consider a rare probability

$$
P(x\in\Omega) \ll 1, \quad\Omega=\{x\mid g(x)>r\}
$$

When applying the CE method to solve the optimization problem

$$
\max_{q'\in Q}\frac{1}{n} \sum_{i=1}^n I(x_i)\ln q'(x_i)\,w(x_i)
$$

we can easily find that $x_i\sim q\approx p$.

Thus, $I(x_i)=0$ for almost all $x_i$, and the objective carries no information; this naive approach does nothing for our work.

In practice, we usually use a variant of the CE method:

Multilevel / Multistage CE Method

Denote

$$
I_r(x)=\begin{cases}
1 & g(x)>r \\
0 & g(x)\leq r
\end{cases}
$$

First we do the sampling work, then choose $r$ according to the samples. Sorting the values ascendingly, we can pick out those $g(x)$ satisfying $g(x)< r$:

$$
g(x_1)\leq g(x_2)\leq\cdots\leq g(x_n)
$$

Let $0< \rho\leq 1$ be the confidence coefficient, and define the $(1-\rho)$-th quantile of a random variable $y$ as the $r$ satisfying

$$
P[y\geq r] \geq \rho, \quad P[y\leq r]\geq 1-\rho
$$

Hence, we can derive an estimation of $r$

$$
\hat{r}=y_{\lfloor(1-\rho)N\rfloor}
$$

where $y_i=g(x_i)$ and $\{y_i\}$ is sorted in ascending order.
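As a standalone illustration, this quantile estimate is a one-liner (a sketch with stand-in data, mirroring the indexing used in the CE code below):

```julia
# Sketch: empirical (1-ρ)-th quantile r̂ = y_⌊(1-ρ)N⌋ of sorted values y_i = g(x_i).
ρ = 0.2
N = 1000
y = sort(randn(N))              # stand-in for the sorted values g(x_i)
rhat = y[floor(Int, (1-ρ)*N)]   # the quantile estimate
```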

The basic algorithm goes as follows:

Algorithm

  1. let $q=p$;
  2. draw samples $\{x_1,\cdots,x_n\}$ from $q$;
  3. compute $\hat{r} = $ the $(1-\rho)$-th quantile of $\{g(x_i)\}_{i=1}^n$; if $\hat{r}>r$, let $\hat{r}=r$;
  4. solve $\max_{q'\in Q}\frac1n \sum_{i=1}^n I_{\hat{r}}(x_i)\ln q'(x_i)\,w(x_i)$; let $q=q'$;
  5. if $\hat{r}<r$, return to step 2, otherwise proceed to step 6;
  6. estimate $\hat{P}=\frac1n \sum_{i=1}^n I_{r}(x_i)w(x_i)$ with $x_i\sim q(x)$.

As for Problem 2, the corresponding functions and parameters should be:

$$
p\sim N(0,I_2), \quad g(x,y)=-(x-5)^2-(y-5)^2, \quad r=-1
$$

For fixed variance $\Sigma=I_2$, we can directly give the optimal mean value in step 4 as taught in lecture 2:
$$
\mu=\frac{E_q[xIw]}{E_q[Iw]}
$$

Hence, it suffices to give an implementation:

```julia
@everywhere function CE_prob(ρ::Float64, N::Int)

    # step 1. set initial values
    μ1 = 0.0
    μ2 = 0.0
    iter = 5000
    r = -1
    rhat = -10
    p = zeros(N,1)
    q = zeros(N,1)
    w = zeros(N,1)
    I = zeros(N,1)
    k = 0

    # step 5. loop until the level reaches r
    while rhat < r && k < iter
        # step 2. draw random samples on the two coordinates
        x = rand(Normal(μ1, 1), N)
        y = rand(Normal(μ2, 1), N)

        # step 3. sort the function values and compute the quantile
        g = -(x-5).^2 - (y-5).^2
        orderg = sort(g)
        ix = floor(Int, (1-ρ)*N)
        rhat = orderg[ix]
        ## constrain the level to narrow the search area
        if rhat > r
            rhat = r
        end

        # step 4. solve the optimization problem
        p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
        q = 1/(2*π) * exp.(-1/2 * ((x-μ1).^2 + (y-μ2).^2))  # q must match the current sampling distribution
        w = p./q
        ## construct the indicator function, which defines the search region
        for i = 1:N
            if g[i] > rhat
                I[i] = 1
            else
                I[i] = 0
            end
        end
        ## update the sampling distribution
        μ1 = mean(x .* I .* w)/mean(I .* w)
        μ2 = mean(y .* I .* w)/mean(I .* w)

        k += 1
    end

    # step 6. apply Monte Carlo to calculate the probability
    mean(I.*w)  # append μ1, μ2 to also return the optimal mean
end
```

Let's make a single simulation by setting $\rho = 0.2$ and $N=10^6$; the probability is shown below.

```julia
CE_prob(0.2, 10^6)
```
```
2.9121974873535745e-10
```

The result is fair enough, but it raises the question of how to choose the parameter $\rho$. Now, let's examine the effect of the quantile with the program below:

```julia
using PyPlot
PyPlot.svg(true)

function quantile_effect(N::Int)
    plot(collect(0.1:0.05:0.9), [CE_prob(i, N) for i in 0.1:0.05:0.9],
         linestyle="--", marker=".", color="blue", label="Single Processor")

    axhline(y=2.23465e-10, linestyle="-", color="red",
            label=string("Exact Value: ", 2.23465e-10))

    title("Monte Carlo Estimation of probability")
    ylabel("Estimation of prob")
    xlabel("parameter: ρ")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end
```
```
quantile_effect (generic function with 1 method)
```
```julia
quantile_effect(10^6)
```

(Plot: CE estimates as a function of ρ, with the exact value shown as a horizontal line.)

As shown in the plot, it is reasonable to choose $\rho$ in $[0, 0.5]$. For clarity, we set $\rho=0.2$ here.

Rewriting the code in CE_prob to append $\mu_1, \mu_2$ to the return statement, we obtain the corresponding distribution parameters along with the probability:

```julia
using Distributions

@everywhere function CE(N::Int)

    # step 1. set initial values
    μ1 = 0.0
    μ2 = 0.0
    iter = 5000
    ρ = 0.2
    r = -1
    rhat = -10
    p = zeros(N,1)
    q = zeros(N,1)
    w = zeros(N,1)
    I = zeros(N,1)
    k = 0

    # step 5. loop until the level reaches r
    while rhat < r && k < iter
        # step 2. draw random samples on the two coordinates
        x = rand(Normal(μ1, 1), N)
        y = rand(Normal(μ2, 1), N)

        # step 3. sort the function values and compute the quantile
        g = -(x-5).^2 - (y-5).^2
        orderg = sort(g)
        ix = floor(Int, (1-ρ)*N)
        rhat = orderg[ix]
        ## constrain the level to narrow the search area
        if rhat > r
            rhat = r
        end

        # step 4. solve the optimization problem
        p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
        q = 1/(2*π) * exp.(-1/2 * ((x-μ1).^2 + (y-μ2).^2))  # q must match the current sampling distribution
        w = p./q
        ## construct the indicator function, which defines the search region
        for i = 1:N
            if g[i] > rhat
                I[i] = 1
            else
                I[i] = 0
            end
        end
        ## update the sampling distribution
        μ1 = mean(x .* I .* w)/mean(I .* w)
        μ2 = mean(y .* I .* w)/mean(I .* w)

        k += 1
    end

    # step 6. apply Monte Carlo to calculate the probability, also returning the optimal mean
    μ1, μ2, mean(I.*w)
end
```
```julia
CE(10^6)
```
```
(4.426701996662324, 4.423183271573601, 2.8762783318578916e-10)
```


Kai Liu, written on 2017/10/28

Student ID: 116071910015

Mail: liu1995@sjtu.edu.cn
