Assignment 1 of Advanced Computational Method

Homework 1

Monte Carlo method is a method to compute the expectation of a random variable.

The basic steps follow like this:

draw random samples ${x_1, x_2, \cdots, x_n}$ according to $p(x)$.
derive the monte carlo estimator of $E(f)$ from

$$
\hat{f}{\text{mc}}=\frac{1}{n}\sum{i=1}^n f(x_i)
$$

Problem 1

Suppose $x_1, x2, \cdots, x{10}$ satisfy independent identical distributions i.e. $x_i\sim N(0,1)\quad i=1,2,\cdots,10$

Set $f(x)=\sum_{i=1}^{10}x_i^2$, use Monte Carlo method to compute $E[f(x)]$.

Compare your results with the exact value.

Solution:

From basic statistics, we can deduce that
$$
\begin{align}
Ef(x)&=E\sum_{i=1}^{10}x_i^2=10\,Ex_i^2\
&=10\,[Dx_i+(Ex_i)^2]\
&=10\times(1+0)\
&=10
\end{align}
$$

Thus, the exact value of $E[f(x)]$ equals 10.

Continuely, we derive a Monte Carlo method for evaluating $E[f(x)]$

using Distributions
@everywhere function MC_mean(N::Int)
    # N = number of Monto Carlo points

    x = rand(Normal(0, 1), N); # draw random samples

    return 10 * mean(x.^2) # return the Monte Carlo estimator of f
end

MC_mean(10^6)

1	9.98580160514502

A billion Monte Carlo steps yields an even better estimate:

1	MC_mean(10^9)

1	10.000288570895224

Now, let’s take a brief look at the time cost.

1	@elapsed MC_mean(10^6)

1	0.013395715

1	@elapsed MC_mean(10^9)

1	50.939757386

When it comes to Monte Carlo simulation, our goal is to get good data as fast as possible. For now, let’s look at how to speed up our code. First of all, we can easily notice the time cost grows out of all proportion. The calculation for 1 million Monte Carlo cost is over 1000 times faster than the billion’s.

1	(@elapsed MC_mean(10^9))/(@elapsed MC_mean(10^6))

1286.3957285556933

Of course, we can do better than this. In preparation for these calculations, let us introduce a parallel computation for Monte Carlo simulation. We build a function that takes MC_mean and evenly splits the jobs to our processors:

function parallel_calc(f::Function, N::Int, proc::Int=4)

    # f = some predefined function
    # N = number of Monte Carlo steps
    # proc = number of additional processors, preset to four

    n = round(Int, N/proc); # split the number of Monte Carlo steps equally between processors

    proc_sum = @parallel (+) for i=1:proc # loop through the processors and distribute jobs
        f(n)
    end

    return proc_sum / proc # return the average of the parallel jobs

end

parallel_calc (generic function with 2 methods)

Now, we can quickly and efficiently calculate a billion MC samples:

1	@elapsed parallel_calc(MC_mean, 10^9)

21.05725137

There is also an easier way to see the speed up of parallel processing by plotting the elapsed time. Below, we define a simple code to plot the elapsed times of MC_mean(in blue) and parallel (in red) vs. the number of MC samplings:

using PyPlot
PyPlot.svg(true)

function mean_time_plotter(N::Int)
    n = Int64(N/10) # ensure that you only plot 10 points

    plot(collect(1:n:N),[@elapsed MC_mean(i) for i in 10:n:N], linestyle="--", marker=".", color="blue", label="Single Processor")

    # the above plots mean for 10 values of MC steps from 10 to N vs. the number of MC steps

    plot(collect(1:n:N),[@elapsed parallel_calc(MC_mean,i) for i in 10:n:N], linestyle="--", marker=".", color="red", label="Parallel Processor")

    title("Comparison of Parallel vs. Single Processor MC Efficiency") # make title and labels
    ylabel("Elapsed Time (sec)")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.) # anchor the legend

end

mean_time_plotter (generic function with 1 method)

1	mean_time_plotter(10^9)

PyObject <matplotlib.legend.Legend object at 0x132777510>

We can see that the elapsed time of simulation is approximately equal around 0.7 billion MC samplings, after which parallel processing begins to win in favor of the single processor. This trend is continued when we look at a billion MC samples.

Now, we turn to present this MC data and visualize the error. To do this, let’s write a simple program to plot the output of parallel vs. the number of Monte Carlo samplings. Because we already know what the mean is, let’s also plot the accepted value as a straight line in red. Finally, let’s also calculate the percent error and display it in the title. The format used here will be useful in later code where we have to compare with some exact analytical value. The following tedious codes (if I remember to present) are quite similar to each other.

However, one thing unfortunate is that the parallel efficiency does not high within one billion samples. In proper case, for example, we need fast result, single processor is fairly enough.

function MC_mean_plotter(f::Function, N::Int, m::Int)

    n = Int64(N/m) # define the step size between points to plot

    mean_list = [parallel_calc(f,i) for i in n:n:N] # generate list of mean values from parallel for i MC steps

    plot(collect(n:n:N), mean_list, linestyle="--", marker=".",
        color="blue", label=string(L"MC Estimator: ", round(mean_list[end],6)))
    # in the above, we plot the MC estimate of 10 vs. the number of MC steps

    axhline(y=10, linestyle="-", color="red", label=string("Exact Value: ", 10))
    # in the above, we plot a horizontal line at the 10 value

    per_error = 100*abs((mean_list[end]-10)/(mean_list[end])) # calculate percent error from mean_list

    title(string(L"Monte Carlo Estimation of $10$--within ", round(per_error,5), "%")) # make titles and labels
    ylabel("Estimation of 10")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05,1), loc=2, borderaxespad=0.) # make legend

end

MC_mean_plotter (generic function with 1 method)

To get a nice plot, we take a maximum of a million Monte Carlo samples and plot a total of a hundred points:

1	MC_mean_plotter(MC_mean, 10^9, 100)

PyObject <matplotlib.legend.Legend object at 0x1326e2cd0>

Although we appear to obtain a reasonable value for 10(below 0.00421%!), it is difficult to see how well the data is converging. Therefore, let’s build a new function that calculates a list of $M$ Monte Carlo estimates of 10 at a given value of $N$. From this list, we can obtain the average and standard deviation:

function stat_calc(f::Function, N::Int, M::Int)

# f = some general function
# N = number of Monte Carlo steps
# M = number of times to repeat a single MC step to for statistical analysis

    avg_list = [parallel_calc(f,N) for j in 1:M] # create list of MC data of length M

    avg = sum(avg_list)/M # calculate average of MC data

    stdev = std(avg_list) # calculate standard deviation of MC data

    return avg, stdev

end

stat_calc (generic function with 1 method)

Now, we can modify MC_mean_plotter to include error bars to better visualize the convergence of our data:

function MC_errorbar(f::Function, N::Int, m::Int, M::Int)

# f = some general function
# N = number of Monte Carlo steps
# m = number of points to plot
# M = number of times to repeat a single MC step to for statistical analysis

    n = Int64(N/m) # define the step size between points to plot

    stat_list = [stat_calc(f,i,M) for i in n:n:N] # make a list of averages and standard deviations from stat_calc
    avg_list = [stat_list[i][1] for i in 1:m] # create list of averages
    stdev_list = [stat_list[i][2] for i in 1:m] # create list of standard deviations

    errorbar(collect(n:n:N), avg_list, yerr=stdev_list, linestyle="--", marker=".",
        color="blue", label=string("MC Estimate: ", round(avg_list[end], 6)))
    # in the above, we plot the averages vs. MC step size with error bars of the standard deviation

    axhline(y=10, linestyle="-", color="red", label=string("Exact Value: ", 10))
    # in the above, we plot a horizontal line at the 10 value

    per_error = 100*abs((avg_list[end]-10)/(avg_list[end])) # calculate percent error

    title(string(L"Monte Carlo Estimation of $10$--within ", round(per_error,5), "%")) # make titles and labels
    ylabel("Estimation of 10")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05,1), loc=2, borderaxespad=0.) # make legend

end

MC_errorbar (generic function with 1 method)

With MC_mean_plot, we can easily generalize the program in future modification.

Now, we plot a hundred values of MC samples with a maximum of a billion, with each set of samples performed fifteen times for statistical analysis. The resulting data shows a clear convergence of error.

1	MC_errorbar(MC_mean, 10^6, 100, 15)

PyObject <matplotlib.legend.Legend object at 0x1262c3fd0>

We can test our result by looking at the convergence of the standard deviation with increasing sample size. Recall the Monte Carlo estimation of the integral of some general function $f(x)$

$$
\inta^b f(x)dx = \frac{b-a}{N}\sum{i=0}^{N-1}E(f(x_i))
$$

The st.dev. squared can easily be calculated:

$$
\sigma{int}^2 = \frac{b-a}{N^2}\left( \sum{i=0}^{N-1}E(f(x_i)) \right)
$$

If the random variables $x_i$ are uncorrelated, we can easily simplify the above further:

$$
\begin{split}
\sigma{int}^2 &= \frac{b-a}{N^2}\sum{i=0}^{N-1}\sigma^2(E(f(x_i))) \
&= \frac{b-a}{N}\sigma^2(E(f(x)))
\end{split}
$$

Therefore, we find that the st.dev. goes as the inverse of the square of the sample size:

$$
\sigma_{int}\propto\frac{1}{\sqrt{N}}
$$

To test this result, we plot the numerically calculated st.dev. alongside $1/\sqrt{N}$

function MC_errorbar_visual(f::Function, N::Int, m::Int, M::Int)

# f = some general function
# N = number of Monte Carlo steps
# m = number of points to plot
# M = number of times to repeat a single MC step to for statistical analysis

    n = Int64(N/m) # define the step size between points to plot

    stat_list = [stat_calc(f,i,M) for i in n:n:N] # make a list of averages and standard deviations from stat_calc
    stdev_list = [stat_list[i][2] for i in 1:m] # create list of standard deviations

    plot(collect(n:n:N), stdev_list, linestyle="--", marker=".", color="blue") # plot the list of st.dev from MC data

    plot(collect(n:n:N), [1/i^(1/2) for i in n:n:N], linestyle="--", color="red") # plot the prediction for the st.dev

    title(string("Convergence of Error bars in MC Data")) # make titles and labels
    ylabel("Standard Deviation")
    xlabel("# of random points")

    legend(bbox_to_anchor=(1.05,1), loc=2, borderaxespad=0.) # make legend

end

MC_errorbar_visual (generic function with 1 method)

Running a maximum of a million samples in steps of a hundred, we find roughly the expected behavior, with each sample repeated fifteen times for statistical analysis.

1	MC_errorbar_visual(MC_mean, 10^6, 100, 15)

Problem 2

Here we want to estimate the probability of a point location, i.e. $P[x\in\Omega]$

we have already learned that

$$
P[x\in\Omega] = \int_{\Omega}p(x)\,dx=\int I(x)p(x)\,dx = E_p[I]
$$

Now, set $\Omega: (x_1-5)^2+(x_2-5)^2\leq 1 $ , $p(x)$ to be the PDF (probability density function) of a binormal distribution i.e. $x_1, x_2 \sim N(0,1)$

Use Importance Sampling method (choose best IS distribution $q(x)$) to get the best result.

The importance sampling method comes from a single idea like this

$$
\begin{align}
P[x\in\Omega]&= E_p[I]\
&= \int I(x)p(x)\,dx \
&= \int I(x)\frac{p(x)}{q(x)}\cdot q(x)\,dx = \int I(x)w(x)\cdot q(x)\,dx\
&= E_q[I\cdot w]
\end{align}
$$

Thus

$$
\hat{I}\text{IS} = \frac1n \sum{i=1}^n I(x_i)w(x_i) \qquad x_i\in q(x)
$$

where

$\hat{I}_\text{IS}: $ IS estimator ; $q(x): $ IS distribution ; $w(x) :$ IS weight ;

Assume that $q(x)$ is the PDF of a binormal distribution i.e. $q\sim N({\mu_1,\mu_2},1)$

the problem should be solve in following steps:

choose samples according to $q(x)$ ;
use these samples to get the value of weight function $w(x)$ ;
apply monte carlo method to the calculation of $E[I\cdot w]$ ;
compare our result with the exact probability and filter the parameters ;

To get the best result, the fourth step is always needed.

Actually, we can directly calculate the exact probability

1
2
3

p=Integrate[
  PDF[BinormalDistribution[{0, 0}, {1, 1}, 0], {x, y}], {x, 4, 6}, {y,
    5 - Sqrt[1 - (x - 5)^2], 5 + Sqrt[1 - (x - 5)^2]}] // N

which gives

1	2.23465*10^-10

using Distributions
@everywhere function MC_prob(μ1::Float64, μ2::Float64, N::Int)

# N = number of Monto Carlo points

# step 1. set initial values and draw random samples
    r = -1
    I = zeros(N,1)
    x = rand(Normal(μ1, 1), N)
    y = rand(Normal(μ2, 1), N)

# step 2. use these samples to get function values I, p, q, w
    g = -(x-5).^2-(y-5).^2
    ## construct the indicator function
    for i=1:1:N
        if g[i] > r
            I[i] = 1
        else
            I[i] = 0
        end
    end

    p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
    q = 1/(2*π) * exp.(-1/2 * ((x-5).^2 + (y-5).^2))
    w = p./q

# step 3. apply Monte Carlo to calculate the expectation
    return mean(I.*w)
end

A natural thought is to set $\mu1=\mu2=5$, $N=10^5$, but the simulated probability seems to be flawed:

1	MC_prob(5.0, 5.0, 10^6)

2.2415303094407912e-10

Indicated by the parameter choosing rules, we can set $\mu1=\mu2$ and make a simple plot of errors on the specific line $y=x$.

using PyPlot
PyPlot.svg(true)
function error_plotter(N::Int)
    plot(collect(1.0:0.1:5.0), [MC_prob(i,i,N) for i in 1.0:0.1:5.0],
    linestyle="--", marker=".", color="blue", label="Single Processor")

    axhline(y=2.23465e-10, linestyle="-", color="red", label=string("Exact Value: ", 2.23465e-10))

    title(string("Monte Carlo Estimation of probability"))
    ylabel("Estimation of prob")
    xlabel("μ1=μ2")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end

error_plotter (generic function with 1 method)

1	error_plotter(10^6)

1	PyObject <matplotlib.legend.Legend object at 0x123259fd0>

As shown in the plot, $\mu1=\mu2=3.9$ is the best choice, though it is not always perfect.

For fixed parameters, we can do parallel calculation by setting a new function:

1
2
3

function fixed_MC_prob(N::Int)
    MC_prob(3.9, 3.9, N)
end

fixed_MC_prob (generic function with 1 method)

1	fixed_MC_prob(10^6)

1	2.3208602460224826e-10

Recall the parallel_calc function before, it will be much more convenient to get the result in large number simulations by running:

1	parallel_calc(fixed_MC_prob, 10^9)

Also, we can obtain the elapsed time in the similar way with mean_time_plotter. (and not explained here)

Homework 2

Problem 3

Redo Problem 2 by Cross Entropy method. Find the best distribution $q(x)$ to get the persuasive result.

CE Method for rare probability estimation

Consider a rare probability

$$
P(x\in\Omega) \ll 1 \quad\Omega={x\mid g(x)>r }
$$

While applying CE method to solve the optimization problem

$$
\max{q’\in Q}\frac1n \sum{i=1}^nf(x_i)\ln q’(x_i)u
$$

we can easily find that $x_i\sim q\approx p$

Thus, $I(x_i)=0$ almost for all $x_i$. Obviously, it does nothing to our work.

Pratically, we usually use the variant of CE method i.e.

Multilevel / Multistage CE Method

Denote

$$
I_r(x)=\begin{cases}
1 & g(x)>r\
0 & g(x)<r
\end{cases}
$$

First we do sampling work, then choose $r$ according to the samples. Thus, we can fetch the specific number of $g(x)$ satisfying $g(x)< r$

$$
g(x_1)\leq g(x_2)\cdots\leq g(x_n)
$$

Set $0< \rho\leq 1$ to be the confidence coefficient, define that the $(1-\rho)$-th quantile of a r.v. $y$ is the $r$ satisfying that

$$
P[y\geq r] \geq \rho, \quad P[y\leq r]\leq 1-\rho
$$

Hence, we can derive an estimation of $r$

$$
r=y[(1-\rho)N]
$$

if ${y_i}$ is ordered assendingly $y_i=g(x_i)$

The basic algorithm goes like below:

Algorithom

let $q=p$
draw samples from $q:{ x_1,\cdot,x_n }$
compute $\hat{r} = (1-\rho)$-th quantile of ${g(xi)}{i=1}^n$ if $\hat{r}>r$, let $\hat{r}=r$
solve $\max{q’\in Q}\frac1n \sum{i=1}^nI_{\hat{r}}(x_i)\ln q’(x_i)w(x_i)$; let $q=q’$
if $\hat{r}<r$, return to step 2, otherwise proceed to step 6
estimate $\hat{P}=\frac1n \sum{i=1}^n I{r}(x_i)w(x_i)$ with $x_i\sim q(x)$

As for Problem 2, the corresponding functions and parameters should be:

$$
p\sim N(0,I_2) \quad g(x)=-(x-5)^2-(y-5)^2 \quad r=-1
$$

For fixed variance $\Sigma=I_2$, we can directly give the optimal mean value in step 4 as taught in lecture 2:
$$
\mu=\frac{E_q[xIw]}{E_q[Iw]}
$$

Hence, it is sufficient enough to give an implement:

@everywhere function CE_prob(ρ::Float64, N::Int)

# step 1. set initial values
μ1 = 0
μ2 = 0
iter = 5000
r = -1
rhat = -10
p = zeros(N,1)
q = zeros(N,1)
w = zeros(N,1)
I = zeros(N,1)
k = 0

# step 5. set loops for calculation
    while rhat < r && k < iter
# step 2. draw random samples on two diff coordinates
        x = rand(Normal(μ1, 1), N)
        y = rand(Normal(μ2, 1), N)
# step 3.
        ## sort the corresponding function value
        g = -(x-5).^2-(y-5).^2
        orderg = sort(g)
        ## comupte the quantile
        ix = floor(Int,(1-ρ)*N)
        rhat = orderg[ix]
        ## set constraint to narrow the search area
        if rhat > r
            rhat = r
        end
# step 4.
        ## derive the optimization problem
        p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
        q = 1/(2*π) * exp.(-1/2 * ((x-5).^2 + (y-5).^2))
        w = p./q
        ## construct the indicator function, which implies the search region
        for i=1:1:N
            if g[i] > rhat
                I[i] = 1
            else
                I[i] = 0
            end
        end
        ## update the sampling distribution
        μ1 = mean(x .* I .* w)/mean(I .* w)
        μ2 = mean(y .* I .* w)/mean(I .* w)

        k += 1
    end
# step 6. apply Monte Carlo to calculate the prob
    mean(I.*w) # append μ1, μ2 to get the optimal value
end

Let’s make single simulation by setting $\rho = 0.2$ and $N=10^6$, the probability is shown below.

1	CE_prob(0.2,10^6)

1	2.9121974873535745e-10

The result is fair enough, but may raise the question in parameter choice of $\rho$. Now, let’s consider the quantile effect by program below:

using PyPlot
PyPlot.svg(true)
function quantile_effect(N::Int)
    plot(collect(0.1:0.05:0.9), [CE_prob(i,N) for i in 0.1:0.05:0.9],
    linestyle="--", marker=".", color="blue", label="Single Processor")

    axhline(y=2.23465e-10, linestyle="-", color="red",
            label=string("Exact Value: ", 2.23465e-10))

    title(string("Monte Carlo Estimation of probability"))
    ylabel("Estimation of prob")
    xlabel("parameter: ρ")

    legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
end

quantile_effect (generic function with 1 method)

1	quantile_effect(10^6)

PyObject <matplotlib.legend.Legend object at 0x136c6e050>

As shown in the picture, it is resonable for us to choose $\rho$ in $[0,0.5]$. For clarity, here we set $\rho=0.2$

Rewrite the code in CE_prob by appending $\mu1, \mu2$ to the return command, we will obtain the corresponding distribution and probability:

using Distributions
@everywhere function CE(N::Int)

# step 1. set initial values
μ1 = 0
μ2 = 0
iter = 5000
ρ = 0.2
r = -1
rhat = -10
p = zeros(N,1)
q = zeros(N,1)
w = zeros(N,1)
I = zeros(N,1)
k = 0

# step 5. set loops for calculation
    while rhat < r && k < iter
# step 2. draw random samples on two diff coordinates
        x = rand(Normal(μ1, 1), N)
        y = rand(Normal(μ2, 1), N)
# step 3.
        ## sort the corresponding function value
        g = -(x-5).^2-(y-5).^2
        orderg = sort(g)
        ## comupte the quantile
        ix = floor(Int,(1-ρ)*N)
        rhat = orderg[ix]
        ## set constraint to narrow the search area
        if rhat > r
            rhat = r
        end
# step 4.
        ## derive the optimization problem
        p = 1/(2*π) * exp.(-1/2 * (x.^2 + y.^2))
        q = 1/(2*π) * exp.(-1/2 * ((x-5).^2 + (y-5).^2))
        w = p./q
        ## construct the indicator function, which implies the search region
        for i=1:1:N
            if g[i] > rhat
                I[i] = 1
            else
                I[i] = 0
            end
        end
        ## update the sampling distribution
        μ1 = mean(x .* I .* w)/mean(I .* w)
        μ2 = mean(y .* I .* w)/mean(I .* w)

        k += 1
    end
# step 6. apply Monte Carlo to calculate the prob
    μ1, μ2, mean(I.*w)
end

CE(10^6)

(4.426701996662324, 4.423183271573601, 2.8762783318578916e-10)

Reference

Blogs of Monte Carlo work in Github: http://onlookerliu.leanote.com/post/5c355c764900
Gonçalo, Abecasis. “Monte Carlo Integration”, Biostatistics 615/815, Lecture 22: http://csg.sph.umich.edu/abecasis/class/2006/615.22.pdf
Ho, Shirley. “Introduction to Monte Carlo”, Astro 542 at Princeton University: http://www.phys.ubbcluj.ro/~zneda/edu/mc/mcshort.pdf
Julia language guide: http://julia-cn.readthedocs.io/zh_CN/latest/

Kai Liu, wrote on 2017/10/28

Student ID: 116071910015

Mail: liu1995@sjtu.edu.cn