Compute integral using Monte-Carlo

In this post we show how an integral can be computed using Monte-Carlo method and how combination with bootstrapping can increase precision.

Reference

Lets say we wan to integrate the cardinal sinus over the interval [-4, 4]. Graphically it is about finding the area under the curve by taking care to respect sign of positive (above x axis) and negative area (below x axis).

The area of this curve can be numerically computed by trapezoid or quadrature, or analytically.

def model(x):
    return np.sinc(x)

xlin = np.linspace(-4., 4., 20000)
ylin = model(xlin)

Itrap = integrate.trapezoid(ylin, x=xlin)
#  0.9499393464346436
Iquad = integrate.quad(model, xlin.min(), xlin.max())
# (0.9499393397673102, 9.485684460130983e-11)

Both method agrees with the 6 first significant figures and the quadrature value agrees up to its precision to the approximation given by Wolfram Alpha:

    \[0.9499393397673101546536691161536980057544800186610779643699317841\]

Method

The Monte-Carlo method works as follow:

  • Uniformly randomly sample N points on the curve extent [x_{\min} , x_{\max}] \times [y_{\min} , y_{\max}];
  • Determine whether the point lies inside the area of the function and in which part (positive or negative);
  • Compute the percentage of points lying in those areas and multiply by the area of the curve extent;
  • Compute the integral estimation by subtracting the negative area from the positive area;
  • Repeat the process N_b time to bootstrap the procedure and let the CLT come into action;

The function below implements such a procedure except bootstrapping:

def monte_carlo(x, y, model, N=1000, seed=12345):
    
    # Uniformly sample over [0;1]x[0;1]
    np.random.seed(seed)
    u = np.random.uniform(size=N)
    v = np.random.uniform(size=N)
    
    # Function extent:
    xsize = x.max() - x.min()
    ysize = y.max() - y.min()
    
    # Transform sample to span the extent:
    xmc = u * xsize + x.min()
    ymc = v * ysize + y.min()
    
    # Compute reference function for x sampled values:
    yref = model(xmc)
    
    # Identify negative points:
    qyneg = ymc < 0.
    
    # Identify postive area points:
    qIpos = ymc <= yref
    qIpos[qyneg] = False
    Ipos = np.sum(qIpos)
    
    # Identify negative area points:
    qIneg = ymc >= yref
    qIneg[~qyneg] = False
    Ineg = np.sum(qIneg)
    
    # Identify outside area points:
    qInot = ~(qIpos | qIneg)

    # Estimate integral:
    A = xsize * ysize    
    I = (Ipos - Ineg) / N * A

    return xmc, ymc, qInot, qIpos, qIneg, I

A typical output of such Monte-Carlo trial is shown below:

A single trial gives a poor estimate of the integral, hence the need of bootstrapping:

*_, I = monte_carlo(xlin, ylin, model, N=300_000, seed=1234567890)
# 0.9454496520667456

Now we will repeat the procedure for different sample size N and for each population size we will bootstrap to enhance the estimation:

Nb = 3000
np.random.seed(12345)
estimates = []
for s in np.random.randint(0, 999999999, size=Nb):
    for n in np.logspace(3, 6, 25, base=10):
        n = int(n)
        xmc, ymc, qInot, qIpos, qIneg, I = generate(xlin, ylin, model, N=n, seed=s)
        estimates.append({"s": s, "n": n, "I": I})
data = pd.DataFrame(estimates)

After few minutes the procedure finishes and we can aggregate bootstrap samples to see how it helps to converge towards the expected integral value:

summary = data.groupby(["n"])["I"].quantile([0.1, 0.25, 0.5, 0.75, 0.9]).unstack()
summary = summary.merge(data.groupby(["n"])["I"].mean().rename("mean"), left_index=True, right_index=True)

Which graphically leads to:

This figure confirms bootstrapping is necessary to make the Monte-Carlo process to converge towards the expected values. Percentiles give us an estimate on the range where a single Monte-Carlo trial can lie. Another observation is the bootstrap distribution seems relatively symmetrical.

Figures below illustrates this affirmation:

So in this case we show that it is possible to estimate the value of the integral using the Bootstrapped Monte-Carlo method up to 4 significant figures with an error reasonably symmetrical.

jlandercy
jlandercy
Articles: 22