Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Lattice Rules for Quasi-Monte Carlos - Pierre L’Ecuyer, Aug 28, 2017

Draft
1
Lattice Rules
for Quasi-Monte Carlo
Pierre L’Ecuyer
SAMSI Opening workshop on QMC, Duke University, August 2017

Draft
2
Monte Carlo integration
Want to estimate
µ = µ(f ) =
[0,1)s
f (u) du = E[f (U)]
where f : [0, 1)s → R and U is a uniform r.v. over [0, 1)s.
Standard (or crude) Monte Carlo:
Generate n independent copies of U, say U1, . . . , Un;
estimate µ by ˆµn = 1
n
n
i=1 f (Ui ).

Draft
2
Monte Carlo integration
Want to estimate
µ = µ(f ) =
[0,1)s
f (u) du = E[f (U)]
where f : [0, 1)s → R and U is a uniform r.v. over [0, 1)s.
Standard (or crude) Monte Carlo:
Generate n independent copies of U, say U1, . . . , Un;
estimate µ by ˆµn = 1
n
n
i=1 f (Ui ).
Almost sure convergence as n → ∞ (strong law of large numbers).
For conﬁdence interval of level 1 − α, can use central limit theorem:
P µ ∈ ˆµn −
cαSn
√
n
, ˆµn +
cαSn
√
n
≈ 1 − α,
where S2
n is any consistent estimator of σ2 = Var[f (U)].

Draft
3
Quasi-Monte Carlo (QMC)
Replace the independent random points Ui by a set of deterministic points
Pn = {u0, . . . , un−1} that cover [0, 1)s more evenly.
Estimate
µ =
[0,1)s
f (u)du by Qn =
1
n
n−1
i=0
f (ui ).
Integration error En = Qn − µ.
Pn is called a highly-uniform point set or low-discrepancy point set if some measure of
discrepancy between the empirical distribution of Pn and the uniform distribution converges
to 0 faster than O(n−1/2) (the typical rate for independent random points).
Main construction methods: lattice rules and digital nets.

Draft
4
Simple case: one dimension (s = 1)
Obvious solutions:
Pn = Zn/n = {0, 1/n, . . . , (n − 1)/n} (left Riemann sum):
0 10.5
which gives Qn =
1
n
n−1
i=0
f (i/n), and En = O(n−1
) if f is bounded,

Draft
4
Simple case: one dimension (s = 1)
Obvious solutions:
Pn = Zn/n = {0, 1/n, . . . , (n − 1)/n} (left Riemann sum):
0 10.5
which gives Qn =
1
n
n−1
i=0
f (i/n), and En = O(n−1
) if f is bounded,
or Pn = {1/(2n), 3/(2n), . . . , (2n − 1)/(2n)} (midpoint rule):
0 10.5
for which En = O(n−2) if f is bounded.

Draft
5
Simplistic solution for s > 1: rectangular grid
Pn = {(i1/d, . . . , is/d) such that 0 ≤ ij < d ∀j} where n = ds.
0 1
1
ui,1
ui,2

Draft
5
Simplistic solution for s > 1: rectangular grid
Pn = {(i1/d, . . . , is/d) such that 0 ≤ ij < d ∀j} where n = ds.
0 1
1
ui,1
ui,2
Midpoint rule in s dimensions. Quickly becomes impractical when s increases.
Moreover, each one-dimensional projection has only d distinct points,
each two-dimensional projections has only d2 distinct points, etc.

Draft
6
Lattice rules
[Korobov 1959; Sloan and Joe 1994; etc.] Integration lattice:
Ls =



v =
s
j=1
zj vj such that each zj ∈ Z



,
where v1, . . . , vs ∈ Rs are linearly independent over R and where Ls contains Zs.
Lattice rule: Take Pn = {u0, . . . , un−1} = Ls ∩ [0, 1)s.
Each coordinate of each point must be a multiple of 1/n.

Draft
6
Lattice rules
Ls =



v =
s
j=1



,
Lattice rule of rank 1: ui = iv1 mod 1 for i = 0, . . . , n − 1,
where nv1 = a = (a1, . . . , as) ∈ {0, 1, . . . , n − 1}s.
Korobov rule: a = (1, a, a2 mod n, . . . ).

Draft
6
Lattice rules
Ls =



v =
s
j=1



,
Lattice rule of rank 1: ui = iv1 mod 1 for i = 0, . . . , n − 1,
where nv1 = a = (a1, . . . , as) ∈ {0, 1, . . . , n − 1}s.
Korobov rule: a = (1, a, a2 mod n, . . . ).
For any u ⊂ {1, . . . , s}, the projection Ls(u) of Ls is also a lattice.
Fully projection-regular: Each Pn(u) = Ls(u) ∩ [0, 1)|u| contains n distinct points.
For a rule of rank 1: true iﬀ gcd(n, aj ) = 1 for all j.

Draft
7
Example: lattice with s = 2, n = 101, v1 = (1, 12)/n
0 1
1
ui,1
ui,2
v1
Here, each one-dimensional projection is {0, 1/n, . . . , (n − 1)/n}.

Draft
8
Another example: s = 2, n = 1021, v1 = (1, 90)/n
Pn = {ui = iv1 mod 1 : i = 0, . . . , n − 1}
= {(i/1021, (90i/1021) mod 1) : i = 0, . . . , 1020}.
0 1
1
ui,1
ui,2
v1

Draft
9
A bad lattice: s = 2, n = 101, v1 = (1, 51)/n
0 1
1
ui,1
ui,2
v1
Good uniformity for one-dimensional projections, but not in two dim.

Draft
10
Sequence of imbedded lattices
Sequence of lattices L1
s ⊂ L2
s ⊂ L3
s ⊂ . . . .
Lξ
s ∩ [0, 1)s contains nξ points: nξ−1 divides nξ for all ξ.

Draft
10
Sequence of imbedded lattices
Sequence of lattices L1
s ⊂ L2
s ⊂ L3
s ⊂ . . . .
Lξ
s ∩ [0, 1)s contains nξ points: nξ−1 divides nξ for all ξ.
Simple case: lattices of rank 1, nξ = 2ξ, aξ mod nξ−1 = aξ−1.
Then, aξ,j = aξ−1,j or aξ,j = aξ−1,j + nξ−1.
[Cranley and Patterson 1976, Joe 1990, Hickernell et al. 2001, Kuo et al. 2006]

Draft
11
Error in terms of Fourier expansion of f
f (u) =
h∈Zs
ˆf (h) exp(2πi h · u),
with Fourier coeﬃcients
ˆf (h) =
[0,1)s
f (u) exp(−2πi h · u)du.
If this series converges absolutely, then
En = Qn − µ =
0=h∈L∗
s
ˆf (h)
where L∗
s = {h ∈ Rs : htv ∈ Z for all v ∈ Ls} ⊆ Zs is the dual lattice.

Draft
12
Let α > 0 be an even integer. If f has square-integrable mixed partial derivatives up to order
α/2 > 0, and the periodic continuation of its derivatives up to order α/2 − 1 is continuous
across the unit cube boundaries, then
|ˆf (h)| = O((max(1, h1) · · · max(1, hs))−α/2
).
Moreover, there are rank-1 integration lattices for which
Pα/2 :=
0=h∈L∗
s
(max(1, h1) · · · max(1, hs))−α/2
= O(n−α/2+
),
and they are easy to ﬁnd via CBC construction. This criterion was proposed long ago as a
ﬁgure of merit, usually with α = 2. This is the error for a worst-case f for which
|ˆf (h)| = (max(1, |h1|) · · · max(1, |hs|))−α/2.

Draft
12
|ˆf (h)| = O((max(1, h1) · · · max(1, hs))−α/2
).
Moreover, there are rank-1 integration lattices for which
Pα/2 :=
0=h∈L∗
s
(max(1, h1) · · · max(1, hs))−α/2
= O(n−α/2+
),
and they are easy to find via CBC construction. This criterion was proposed long ago as a
figure of merit, usually with α = 2. This is the error for a worst-case f for which
|ˆf (h)| = (max(1, |h1|) · · · max(1, |hs|))−α/2.
Unfortunately, this bound becomes rapidly useless (in many ways) when s increases. But it
can be generalized in various directions: put different weights w(h) on vectors h, or on
subsets of coordinates; truncate the series if α is not even; etc.
Notions of tractability... Also hard to estimate the error.

Draft
13
Randomized quasi-Monte Carlo (RQMC)
ˆµn,rqmc =
1
n
n−1
i=0
f (Ui ),
with Pn = {U0, . . . , Un−1} ⊂ (0, 1)s an RQMC point set:
(i) each point Ui has the uniform distribution over (0, 1)s;
(ii) Pn as a whole is a low-discrepancy point set.
E[ˆµn,rqmc] = µ (unbiased).
Var[ˆµn,rqmc] =
Var[f (Ui )]
n
+
2
n2
i<j
Cov[f (Ui ), f (Uj )].
We want to make the last sum as negative as possible.
Weaker attempts to do the same: antithetic variates (n = 2), Latin hypercube sampling
(LHS), stratiﬁcation, ...

Draft
14
Variance estimation:
Can compute m independent realizations X1, . . . , Xm of ˆµn,rqmc, then estimate µ and
Var[ˆµn,rqmc] by their sample mean ¯Xm and sample variance S2
m. Could be used to compute a
conﬁdence interval.
Temptation: assume that ¯Xm has the normal distribution.
Beware: often wrong unless m → ∞.
CLT for ﬁxed m and n → ∞ holds for nets with Owen nested scrambling, but not for
randomly-shifted lattice rules.

Draft
15
Randomly-Shifted Lattice
Example: lattice with s = 2, n = 101, v1 = (1, 12)/101
0 1
1
ui,1
ui,2

Draft
15
Randomly-Shifted Lattice
Example: lattice with s = 2, n = 101, v1 = (1, 12)/101
0 1
1
ui,1
ui,2
U

Draft
16
A small lattice shifted by the red vector, modulo 1.
0 1
1
ui,2
ui,1
0
1
2
3
4
5
6
7

Draft
16
A small lattice shifted by the red vector, modulo 1.
0 1
1
ui,2
ui,1
0
1
2
3
4
5
6
7
0 1
1
Ui,2
Ui,1
5
6
7
0
1
2
3
4

Draft
17
Can generate the shift uniformly in the parallelotope determined by basis vectors,
0 1
1
ui,2
ui,1

Draft
17
Can generate the shift uniformly in the parallelotope determined by basis vectors,
0 1
1
ui,2
ui,1
0 1
1
Ui,2
Ui,1

Draft
17
Can generate the shift uniformly in the parallelotope determined by basis vectors, or in any
shifted copy of it.
0 1
1
ui,2
ui,1
0 1
1
Ui,2
Ui,1

Draft
18
Perhaps less obvious: Can generate it in any of the colored shapes below.
0 1
1
Ui,2
Ui,1

Draft
18
Perhaps less obvious: Can generate it in any of the colored shapes below.
0 1
1
Ui,2
Ui,1
0 1
1
Ui,2
Ui,1

Draft
19
Generating the shift uniformly in one tile
Proposition. Let R ⊂ [0, 1)s such that
{Ri = (R + ui ) mod 1, i = 0, . . . , n − 1}
is a partition of [0, 1)s in n regions of volume 1/n.
Then, sampling the random shift U uniformly in any given Ri is equivalent to sampling it
uniformly in [0, 1)s.
The error function
gn(U) = ˆµn,rqmc − µ
over any Ri is the same as over R.

Draft
20
Error function gn(u) for f (u1, u2) = (u1 − 1/2) (u2 − 1/2).
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
u1
u2
−0.2
−0.1
0
0.1
0.2
0.3

Draft
21
Error function gn(u) for f (u1, u2) = (u1 − 1/2) + (u2 − 1/2).
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
u1
u2
−1
−0.5
0
0.5
1

Draft
22
Error function gn(u) for f (u1, u2) = u1u2 (u1 − 1/2) (u2 − 1/2).
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
ux
uy
−1
0
1
·10−2

Draft
23
Variance for randomly-shifted lattice
Suppose f has Fourier expansion
f (u) =
h∈Zs
ˆf (h)e2πihtu
.
For a randomly shifted lattice, the exact variance is (always)
Var[ˆµn,rqmc] =
0=h∈L∗
s
|ˆf (h)|2
,
where L∗
s is the dual lattice.
From the viewpoint of variance reduction, an optimal lattice for given f minimizes the
square “discrepancy” D2(Pn) = Var[ˆµn,rqmc].

Draft
24
Var[ˆµn,rqmc] =
0=h∈L∗
s
|ˆf (h)|2
.
|ˆf (h)|2
= O((max(1, h1) · · · max(1, hs))−α
).
Moreover, there is a vector v1 = v1(n) such that
Pα :=
0=h∈L∗
s
(max(1, h1) · · · max(1, hs))−α
= O(n−α+
).
This Pα is the variance for a worst-case f having
|ˆf (h)|2
= (max(1, |h1|) · · · max(1, |hs|))−α
.
A larger α means a smoother f and a faster convergence rate.

Draft
25
If α is an even integer, this worst-case f is
f ∗
(u) =
u⊆{1,...,s} j∈u
(2π)α/2
(α/2)!
Bα/2(uj ).
where Bα/2 is the Bernoulli polynomial of degree α/2.
In particular, B1(u) = u − 1/2 and B2(u) = u2 − u + 1/6.
Easy to compute Pα and to search for good lattices in this case!
However: This worst-case function is not necessarily representative of what happens in
applications. Also, the hidden factor in O increases quickly with s, so this result is not very
useful for large s.
To get a bound that is uniform in s, the Fourier coeﬃcients must decrease rapidly with the
dimension and “size” of vectors h; that is, f must be “smoother” in high-dimensional
projections. This is typically what happens in applications for which RQMC is eﬀective!

Draft
26
A very general weighted Pα
Pα can be generalized by giving diﬀerent weights w(h) to the vectors h:
˜Pα :=
0=h∈L∗
s
w(h)(max(1, |h1|) · · · max(1, |hs|))−α
.
But how do we choose these weights? There are too many!
The optimal weights to minimize the variance are:
w(h) = (max(1, |h1|) · · · max(1, |hs|))α
|ˆf (h)|2
.

Draft
27
ANOVA decomposition
The Fourier expansion has too many terms to handle.
As a cruder expansion, we can write f (u) = f (u1, . . . , us) as:
f (u) =
u⊆{1,...,s}
fu(u) = µ +
s
i=1
f{i}(ui ) +
s
i,j=1
f{i,j}(ui , uj ) + · · ·
where
fu(u) =
[0,1)|¯u|
f (u) du¯u −
v⊂u
fv(uv),
and the Monte Carlo variance decomposes as
σ2
=
u⊆{1,...,s}
σ2
u, where σ2
u = Var[fu(U)].
The σ2
u’s can be estimated (perhaps very roughly) by MC or RQMC.
Intuition: Make sure the projections Pn(u) are very uniform for subsets u with large σ2
u.

Draft
28
Weighted Pγ,α with projection-dependent weights γu
Denote u(h) = u(h1, . . . , hs) the set of indices j for which hj = 0.
Pγ,α =
0=h∈L∗
s
γu(h)(max(1, |h1|) · · · max(1, |hs|))−α
.
For α/2 integer > 0, with ui = (ui,1, . . . , ui,s) = iv1 mod 1,
Pγ,α =
∅=u⊆{1,...,s}
1
n
n−1
i=0
γu
−(−4π2
)α/2
(α)!
|u|
j∈u
Bα(ui,j ),
and the corresponding variation is
V 2
γ (f ) =
∅=u⊆{1,...,s}
1
γu(4π2)α|u|/2
[0,1]|u|
∂α|u|/2
∂uα/2
fu(u)
2
du,
for f : [0, 1)s
→ R smooth enough. Then,
Var[ˆµn,rqmc] =
u⊆{1,...,s}
Var[ˆµn,rqmc(fu)] ≤ V 2
γ (f )Pγ,α .

Draft
29
Weighted Pγ,α:
Pγ,α =
0=h∈L∗
s
γu(h)(max(1, h1), . . . , max(1, hs))−α
Variance for a worst-case function whose square Fourier coeﬃcients are
|ˆf (h)|2
= γu(h)(max(1, h1), . . . , max(1, hs))−α
.
This is the RQMC variance for the function
f ∗
(u) =
u⊆{1,...,s}
√
γu
j∈u
(2π)α/2
(α/2)!
Bα/2(uj ).
We also have for this f :
σ2
u = γu Var[Bα/2(U)]
(4π2
)α/2
((α/2)!)2
|u|
= γu |Bα(0)|
(4π2
)α/2
(α)!
|u|
.

Draft
29
Weighted Pγ,α:
Pγ,α =
0=h∈L∗
s
γu(h)(max(1, h1), . . . , max(1, hs))−α
Variance for a worst-case function whose square Fourier coeﬃcients are
|ˆf (h)|2
= γu(h)(max(1, h1), . . . , max(1, hs))−α
.
This is the RQMC variance for the function
f ∗
(u) =
u⊆{1,...,s}
√
γu
j∈u
(2π)α/2
(α/2)!
Bα/2(uj ).
We also have for this f :
σ2
u = γu Var[Bα/2(U)]
(4π2
)α/2
((α/2)!)2
|u|
= γu |Bα(0)|
(4π2
)α/2
(α)!
|u|
.
For α = 2, we should take γu = (3/π2)|u|σ2
u ≈ (0.30396)|u|σ2
u.
For α = 4, we should take γu = [45/π4]|u|σ2
u ≈ (0.46197)|u|σ2
u.
For α → ∞, we have γu → (0.5)|u|σ2
u.
The ratios weight / variance should decrease exponentially with |u|.

Draft
30
Heuristics for choosing the weights
For f ∗, we should take γu = ρ|u|σ2
u for some constant ρ.
But there are still 2s − 1 subsets u to consider!
One could deﬁne a simple parametric model for the square variations and then estimate the
parameters by matching the ANOVA variances σ2
u
[Wang and Sloan 2006, L. and Munger 2012].
For example, product weights: γu = j∈u γj for some constants γj ≥ 0.
Order-dependent weights: γu depends only on |u|.
Example: γu = 1 for |u| ≤ d and γu = 0 otherwise. Wang (2007) suggests this with d = 2.
Mixture: POD weights (Kuo et al. 2011).
Note that all one-dimensional projections (before random shift) are the same.
So the weights γu for |u| = 1 are irrelevant.

Draft
31
Weighted Rγ,α
When α is not even, one can take
Rγ,α(Pn) =
∅=u⊆{1,...,s}
γu
1
n
n−1
i=0 j∈u


n/2
h=− (n−1)/2
max(1, |h|)−α
e2πihui,j
− 1

 .
Upper bounds on Pγ,α can be computed in terms of Rγ,α.
Can be computed for any α > 0 (ﬁnite sum). For example, can take α = 1.
We can compute it using FFT.

Draft
32
Figure of merit based on the spectral test
Compute the shortest vector u(Pn) in dual lattice for each projection u and normalize by an
upper bound ∗
|u|(n) (with Euclidean length):
Du(Pn) =
∗
|u|(n)
u(Pn)
≥ 1.

Draft
32
Figure of merit based on the spectral test
Compute the shortest vector u(Pn) in dual lattice for each projection u and normalize by an
upper bound ∗
|u|(n) (with Euclidean length):
Du(Pn) =
∗
|u|(n)
u(Pn)
≥ 1.
L. and Lemieux (2000), etc., maximize
Mt1,...,td
= min

 min
2≤r≤t1
{1,...,r}(Pn)
∗
r (n)
, min
2≤r≤d
min
u={j1,...,jr }⊂{1,...,s}
1=j1<···<jr ≤tr
u(Pn)
∗
r (n)

 .
Computing time of u(Pn) is almost independent of n, but exponential in |u|. Poor lattices
can be eliminated quickly.
Can use a diﬀerent norm, compute shortest vector in primal lattice, etc.

Draft
33
Search methods
Korobov lattices. Search over all admissible a, for a = (1, a, a2, . . . , ...).
Random Korobov. Try r random values of a.

Draft
33
Search methods
Rank 1, exhaustive search.
Pure random search. Try admissible vectors a at random.

Draft
33
Search methods
Component by component (CBC) construction. (Sloan, Kuo, etc.).
Let a1 = 1;
For j = 2, 3, . . . , s, ﬁnd z ∈ {1, . . . , n − 1}, gcd(z, n) = 1, such that
(a1, a2, . . . , aj = z) minimizes D(Pn({1, . . . , j})).
Fast CBC construction for Pγ,α: use FFT. (Nuyens, Cools).

Draft
33
Search methods
Component by component (CBC) construction. (Sloan, Kuo, etc.).
Let a1 = 1;
For j = 2, 3, . . . , s, ﬁnd z ∈ {1, . . . , n − 1}, gcd(z, n) = 1, such that
(a1, a2, . . . , aj = z) minimizes D(Pn({1, . . . , j})).
Fast CBC construction for Pγ,α: use FFT. (Nuyens, Cools).
Randomized CBC construction.
Let a1 = 1;
For j = 2, . . . , s, try r random z ∈ {1, . . . , n − 1}, gcd(z, n) = 1,
and retain (a1, a2, . . . , aj = z) that minimizes D(Pn({1, . . . , j})).
Can add ﬁlters to eliminate poor lattices more quickly.

Draft
34
Embedded latticesPn1 ⊂ Pn2 ⊂ . . . Pnm with n1 < n2 < · · · < nm, for some m > 0.
Usually: nk = bc+k for integers c ≥ 0 and b ≥ 2, typically with b = 2, ak = ak+1 mod nk for
all k < m, and the same random shift.
We need a measure that accounts for the quality of all m lattices.
We standardize the merit at all levels k so they have a comparable scale:
Eq(Pn) = Dq(Pn)/D∗
q(n),
where D∗
q(n) is a normalization factor, e.g., a bound on Dq(Pn) or a bound on its average
over all (a1, . . . , as) under consideration.
For Pγ,α, bounds by Sinescu and L. (2012) and Dick et al. (2008).
For CBC, we do this for each coordinate j = 1, . . . , s (replace s by j).
Then we can take as a global measure (with sum or max):
¯Eq,m(Pn1 , . . . , Pnm )
q
=
m
k=1
wk [Eq(Pnk
)]q
.

Draft
35
Available software tools
Construction: Nuyens (2012) provides Matlab code for fast-CBC construction of lattice
rules based on Pγ,α, with product and order-dependent weights.
Precomputed tables for ﬁxed criteria: Maisonneuve (1972), Sloan and Joe (1994), L. and
Lemieux (2000), Kuo (2012), etc.
Software for using (randomized) lattice rules in simulations is also available in many places
(e.g., in SSJ).

Draft
36
Lattice Builder
Implemented as C++ library, modular, object-oriented, accessible from a program via API.
Various choices of figures of merit, arbitrary weights, construction methods, etc. Easily
extensible.
For better run-time efficiency, uses static polymorphism, via templates, rather than dynamic
polymorphism. Several other techniques to reduce computations and improve speed.
Offers a pre-compiled program with Unix-like command line interface. Also graphical
interface.
Available for download on GitHub, with source code, documentation, and precompiled
executable codes for Linux or Windows, in 32-bit and 64-bit versions.
Coming very soon: Construction of polynomial lattice rules as well.
Show graphical interface

Draft
37
Baker’s (or tent) transformation
To make the periodic continuation of f continuous.
If f (0) = f (1), deﬁne ˜f by ˜f (1 − u) = ˜f (u) = f (2u) for 0 ≤ u ≤ 1/2.
This ˜f has the same integral as f and ˜f (0) = ˜f (1).
0 1
1/2
.

Draft
37
Baker’s (or tent) transformation
To make the periodic continuation of f continuous.
If f (0) = f (1), deﬁne ˜f by ˜f (1 − u) = ˜f (u) = f (2u) for 0 ≤ u ≤ 1/2.
This ˜f has the same integral as f and ˜f (0) = ˜f (1).
0 1
1/2
For smooth f , can reduce the variance to O(n−4+ ) (Hickernell 2002).
The resulting ˜f is symmetric with respect to u = 1/2.
In practice, we transform the points Ui instead of f .

Draft
38
One-dimensional case
Random shift followed by baker’s transformation.
Along each coordinate, stretch everything by a factor of 2 and fold.
Same as replacing Uj by min[2Uj , 2(1 − Uj )].
0 10.5

Draft
38
0 10.5
U/n

Draft
38
0 10.5
Gives locally antithetic points in intervals of size 2/n.
This implies that linear pieces over these intervals are integrated exactly.
Intuition: when f is smooth, it is well-approximated by a piecewise linear function, which is
integrated exactly, so the error is small.

Draft
39
Example: A stochastic activity network
Gives precedence relations between activities. Activity k has random duration Yk (also length
of arc k) with known cumulative distribution function (cdf) Fk(y) := P[Yk ≤ y].
Project duration T = (random) length of longest path from source to sink.
May want to estimate E[T], P[T > x], a quantile, density of T, etc.
0source 1
Y0
2
Y1
Y2
3
Y3
4
Y7
5
Y9
Y4
Y5
6
Y6
7
Y11
Y8
8 sink
Y12
Y10

Draft
40
Simulation
Algorithm: to generate T:
for k = 0, . . . , 12 do
Generate Uk ∼ U(0, 1) and let Yk = F−1
k (Uk)
Compute X = T = h(Y0, . . . , Y12) = f (U0, . . . , U12)
Monte Carlo: Repeat n times independently to obtain n realizations X1, . . . , Xn of T.
Estimate E[T] = (0,1)s f (u)du by ¯Xn = 1
n
n−1
i=0 Xi .
To estimate P(T > x), take X = I[T > x] instead.
RQMC: Replace the n independent points by an RQMC point set of size n.

Draft
40
Simulation
Algorithm: to generate T:
for k = 0, . . . , 12 do
Generate Uk ∼ U(0, 1) and let Yk = F−1
k (Uk)
Compute X = T = h(Y0, . . . , Y12) = f (U0, . . . , U12)
Monte Carlo: Repeat n times independently to obtain n realizations X1, . . . , Xn of T.
Estimate E[T] = (0,1)s f (u)du by ¯Xn = 1
n
n−1
i=0 Xi .
To estimate P(T > x), take X = I[T > x] instead.
RQMC: Replace the n independent points by an RQMC point set of size n.
Numerical illustration from Elmaghraby (1977):
Yk ∼ N(µk , σ2
k ) for k = 0, 1, 3, 10, 11, and Vk ∼ Expon(1/µk ) otherwise.
µ0, . . . , µ12: 13.0, 5.5, 7.0, 5.2, 16.5, 14.7, 10.3, 6.0, 4.0, 20.0, 3.2, 3.2, 16.5.

Draft
41
Naive idea: replace each Yk by its expectation. Gives T = 48.2.
Results of an experiment with n = 100 000.
Histogram of values of T is a density estimator that gives more information than a conﬁdence
interval on E[T] or P[T > x]. Values range from 14.4 to 268.6; 11.57% exceed x = 90.
T
0 25 50 75 100 125 150 175 200
Frequency
0
5000
10000
T = x = 90
T = 48.2
mean = 64.2
ˆξ0.99 = 131.8

Draft
41
Naive idea: replace each Yk by its expectation. Gives T = 48.2.
Results of an experiment with n = 100 000.
Histogram of values of T is a density estimator that gives more information than a conﬁdence
interval on E[T] or P[T > x]. Values range from 14.4 to 268.6; 11.57% exceed x = 90.
RQMC can also reduce the error (e.g., the MISE) of a density estimator!
T
0 25 50 75 100 125 150 175 200
Frequency
0
5000
10000
T = x = 90
T = 48.2
mean = 64.2
ˆξ0.99 = 131.8

Draft
42
Alternative estimator of P[T > x] = E[I(T > x)] for SAN.
Naive estimator: Generate T and compute X = I[T > x].
Repeat n times and average.

Draft
42
Alternative estimator of P[T > x] = E[I(T > x)] for SAN.
Naive estimator: Generate T and compute X = I[T > x].
Repeat n times and average.
0source 1
Y0
2
Y1
Y2
3
Y3
4
Y7
5
Y9
Y4
Y5
6
Y6
7
Y11
Y8
8 sink
Y12
Y10

Draft
43
Conditional Monte Carlo estimator of P[T > x]. Generate the Yj ’s only for the 8 arcs
that do not belong to the cut L = {4, 5, 6, 8, 9}, and replace I[T > x] by its conditional
expectation given those Yj ’s,
Xe = P[T > x | {Yj , j ∈ L}].
This makes the integrand continuous in the Uj ’s.

Draft
43
Xe = P[T > x | {Yj , j ∈ L}].
To compute Xe: for each l ∈ L, say from al to bl , compute the length αl of the longest path
from 1 to al , and the length βl of the longest path from bl to the destination.
The longest path that passes through link l does not exceed x iﬀ αl + Yl + βl ≤ x, which
occurs with probability P[Yl ≤ x − αl − βl ] = Fl [x − αl − βl ].

Draft
43
Xe = P[T > x | {Yj , j ∈ L}].
To compute Xe: for each l ∈ L, say from al to bl , compute the length αl of the longest path
from 1 to al , and the length βl of the longest path from bl to the destination.
The longest path that passes through link l does not exceed x iﬀ αl + Yl + βl ≤ x, which
occurs with probability P[Yl ≤ x − αl − βl ] = Fl [x − αl − βl ].
Since the Yl are independent, we obtain
Xe = 1 −
l∈L
Fl [x − αl − βl ].
Can be faster to compute than X, and always has less variance.

Draft
44
ANOVA Variances for estimator of P[T > x] in Stochastic Activity
Network
0 10 20 30 40 50 60 70 80 90 100
x = 64
x = 100
CMC, x = 64
CMC, x = 100
% of total variance for each cardinality of u
Stochastic Activity Network

Draft
45
Variance for estimator of P[T > x] for SAN
28.66
211.54
214.43
217.31
220.2
10−7
10−6
10−5
10−4
10−3
variance
Stochastic Activity Network (x = 64)
MC
Sobol
Lattice (P2) + baker
n−2

Draft
46
Variance for estimator of P[T > x] with CMC
28.66
211.54
214.43
217.31
220.2
10−8
10−7
10−6
10−5
10−4
variance
Stochastic Activity Network (CMC x = 64)
MC
Sobol
n−2

Draft
47
Histograms, with n = 8191 and m = 10, 000
0 0.5 1
0
0.2
0.4
0.6
0.8
1
probability single MC draw (x = 100)
6 7
·10−2
0
5 · 10−2
0.1
0.15
probability
MC estimator (x = 100)
6.5 7
0
5 · 10−2
0.1
probability
RQMC estimator (x = 100)

Draft
48
Histograms, with n = 8191 and m = 10, 000
0 0.5 1
0
0.1
0.2
0.3
probability single MC draw (CMC x = 100)
6 6.5 7
·10−2
0
5 · 10−2
0.1
0.15
probability
MC estimator (CMC x = 100)
6.4 6.5 6.6 6.7
0
5 · 10−2
0.1
0.15
probability
RQMC estimator (CMC x = 100)

Draft
49
Effective dimension
(Caflisch, Morokoff, and Owen 1997).
A function f has effective dimension d in proportion ρ in the superposition sense if
|u|≤d
σ2
u ≥ ρσ2
.
It has effective dimension d in the truncation sense if
u⊆{1,...,d}
σ2
u ≥ ρσ2
.
High-dimensional functions with low effective dimension are frequent.
One may change f to make this happen.

Draft
50
Example: Function of a Multinormal vector
Let µ = E[f (U)] = E[g(Y)] where Y = (Y1, . . . , Ys) ∼ N(0, Σ).

Draft
50
For example, if the payoﬀ of a ﬁnancial derivative is a function of the values taken by a
c-dimensional geometric Brownian motion (GMB) at d observations times
0 < t1 < · · · < td = T, then we have s = cd.

Draft
50
To generate Y: Decompose Σ = AAt, generate Z = (Z1, . . . , Zs) ∼ N(0, I) where the
(independent) Zj ’s are generated by inversion: Zj = Φ−1(Uj ), and return Y = AZ.

Draft
50
Choice of A?

Draft
50
Choice of A?
Cholesky factorization: A is lower triangular.

Draft
51
Principal component decomposition (PCA) (Ackworth et al. 1998):
A = PD1/2 where D = diag(λs, . . . , λ1) (eigenvalues of Σ in decreasing order) and the
columns of P are the corresponding unit-length eigenvectors.

Draft
51
columns of P are the corresponding unit-length eigenvectors. With this A, Z1 accounts for
the max amount of variance of Y, then Z2 the max amount of variance cond. on Z1, etc.

Draft
51
Function of a Brownian motion (or other L´evy process):
Payoﬀ depends on c-dimensional Brownian motion {X(t), t ≥ 0} observed at times
0 = t0 < t1 < · · · < td = T.

Draft
51
0 = t0 < t1 < · · · < td = T.
Sequential (or random walk) method: generate X(t1), then X(t2) − X(t1), then
X(t3) − X(t2), etc.

Draft
51
0 = t0 < t1 < · · · < td = T.
X(t3) − X(t2), etc.
Bridge sampling (Moskowitz and Caﬂisch 1996). Suppose d = 2m.
generate X(td ), then X(td/2) conditional on (X(0), X(td )),

Draft
51
0 = t0 < t1 < · · · < td = T.
X(t3) − X(t2), etc.
then X(td/4) conditional on (X(0), X(td/2)), and so on.
The ﬁrst few N(0, 1) r.v.’s already sketch the path trajectory.

Draft
51
0 = t0 < t1 < · · · < td = T.
X(t3) − X(t2), etc.
then X(td/4) conditional on (X(0), X(td/2)), and so on.
The ﬁrst few N(0, 1) r.v.’s already sketch the path trajectory.
Each of these methods corresponds to some matrix A.
Choice has a large impact on the ANOVA decomposition of f .

Draft
52
Example: Pricing an Asian basket option
We have c assets, d observation times. Want to estimate E[f (U)], where
f (U) = e−rT
max

0,
1
cd
c
i=1
d
j=1
Si (tj ) − K


is the net discounted payoﬀ and Si (tj ) is the price of asset i at time tj .

Draft
52
f (U) = e−rT
max

0,
1
cd
c
i=1
d
j=1
Si (tj ) − K


Suppose (S1(t), . . . , Sc(t)) obeys a geometric Brownian motion.
Then, f (U) = g(Y) where Y = (Y1, . . . , Ys) ∼ N(0, Σ) and s = cd.

Draft
52
f (U) = e−rT
max

0,
1
cd
c
i=1
d
j=1
Si (tj ) − K


Suppose (S1(t), . . . , Sc(t)) obeys a geometric Brownian motion.
Then, f (U) = g(Y) where Y = (Y1, . . . , Ys) ∼ N(0, Σ) and s = cd.
Even with Cholesky decompositions of Σ, the two-dimensional projections often account for
more than 99% of the variance: low effective dimension in the superposition sense.
With PCA or bridge sampling, we get low effective dimension in the truncation sense. In
realistic examples, the first two coordinates Z1 and Z2 often account for more than 99.99%
of the variance!

Draft
53
Numerical experiment with c = 10 and d = 25
This gives a 250-dimensional integration problem.
Let ρi,j = 0.4 for all i = j, T = 1, σi = 0.1 + 0.4(i − 1)/9 for all i, r = 0.04, S(0) = 100,
and K = 100. (Imai and Tan 2002).

Draft
53
Numerical experiment with c = 10 and d = 25
This gives a 250-dimensional integration problem.
Let ρi,j = 0.4 for all i = j, T = 1, σi = 0.1 + 0.4(i − 1)/9 for all i, r = 0.04, S(0) = 100,
and K = 100. (Imai and Tan 2002).
Variance reduction factors for Cholesky (left) and PCA (right)
(experiment from 2003):
Korobov Lattice Rules
n = 16381 n = 65521 n = 262139
a = 5693 a = 944 a = 21876
Lattice+shift 18 878 18 1504 9 2643
Lattice+shift+baker 50 4553 46 3657 43 7553
Sobol’ Nets
n = 214
n = 216
n = 218
Sobol+Shift 10 1299 17 3184 32 6046
Sobol+LMS+Shift 6 4232 4 9219 35 16557
Note: The payoﬀ function is not smooth and also unbounded!

Draft
54
ANOVA Variances for ordinary Asian Option
0 20 40 60 80 100
s = 3, seq.
s = 3, BB
s = 3, PCA
s = 6, seq.
s = 6, BB
s = 6, PCA
s = 12, seq.
s = 12, BB
s = 12, PCA
% of total variance for each cardinality of u
Asian Option with S(0) = 100, K = 100, r = 0.05, σ = 0.5

Draft
55
Total Variance per Coordinate for the Asian Option
0 20 40 60 80 100
sequential
BB
PCA
% of total variance
Asian Option (s = 6) with S(0) = 100, K = 100, r = 0.05, σ = 0.5
Coordinate 1
Coordinate 2
Coordinate 3
Coordinate 4
Coordinate 5
Coordinate 6

Draft
56
Variance with good lattices rules and Sobol points
26
28
210
212
214
10−6
10−5
10−4
10−3
10−2
10−1
100
variance
Asian Option (PCA) s = 12, S(0) = 100, K = 100, r = 0.05, σ = 0.5
MC
Sobol
n−2

Draft
57
Polynomial lattice rules
Integers and real numbers are replaced by polynomials and formal series, respectively.
Select prime base b ≥ 2. Usually b = 2.
Replace Z by Fb[z], the ring of polynomials over finite field Fb ≡ Zb;
Replace R by Lb = Fb((z−1)), the field of formal Laurent series over Fb, of the form
∞
=ω x z− , where x ∈ Fb.
Polynomial lattice
Ls =



v(z) =
s
j=1
qj (z)vj (z) such that each qj (z) ∈ Fb[z]



,
where v1(z), . . . , vs(z) are independent vectors in Ls
b, of the form vj (z) = aj (z)/P(z), where
P(z) = zk + α1zk−1 + · · · + αk ∈ Zb[z] and each aj (z) is a vector of polynomials of degree
less than k. Note that (Zb[z])s ⊆ Ls (integration lattice) and Ls mod Fb[z] contains exactly
bk points in Ls
b.

Draft
57
Polynomial lattice rules
Integers and real numbers are replaced by polynomials and formal series, respectively.
Select prime base b ≥ 2. Usually b = 2.
Replace Z by Fb[z], the ring of polynomials over finite field Fb ≡ Zb;
Replace R by Lb = Fb((z−1)), the field of formal Laurent series over Fb, of the form
∞
=ω x z− , where x ∈ Fb.
Polynomial lattice
Ls =



v(z) =
s
j=1
qj (z)vj (z) such that each qj (z) ∈ Fb[z]



,
where v1(z), . . . , vs(z) are independent vectors in Ls
b, of the form vj (z) = aj (z)/P(z), where
P(z) = zk + α1zk−1 + · · · + αk ∈ Zb[z] and each aj (z) is a vector of polynomials of degree
less than k. Note that (Zb[z])s ⊆ Ls (integration lattice) and Ls mod Fb[z] contains exactly
bk points in Ls
b.
For a rule of rank 1, v2(z), . . . , vs(z) are the unit vectors.

Draft
58
Deﬁne ϕ : L → R by
ϕ
∞
=ω
x z−
=
∞
=ω
x b−
.
The polynomial lattice rule (PLR) uses the node set Pn = ϕ(Ls) ∩ [0, 1)s = ϕ(Ls mod Fb[z]).

Draft
58
Deﬁne ϕ : L → R by
ϕ
∞
=ω
x z−
=
∞
=ω
x b−
.
The polynomial lattice rule (PLR) uses the node set Pn = ϕ(Ls) ∩ [0, 1)s = ϕ(Ls mod Fb[z]).
PLRs were ﬁrst studied by Niederreiter, Larcher, Tezuka (circa 1990), with rank 1. They
were generalized and further studied by Lemieux and L’Ecuyer (circa 2000), then by Dick,
Pillischammer, Nuyens, Goda, and others. Most of the properties of ordinary lattice rules
have counterparts for the polynomial rules.
The Fourier expansion is replaced by a Walsh expansion, the weighted Pγ,α has a counterpart
Pγ,α,PRL, CBC constructions can provide good parameters, fast CBC also works, etc.

Draft
59
Walsh expansion
For h ≡ h(z) = (h1(z), . . . , hs(z)) ∈ (Fb[z])s
and u = (u1, . . . , us) ∈ [0, 1)s
, where
hi (z) =
i
j=1
hi,j zj−1
and ui =
j≥1
ui,j b−j
∈ [0, 1), deﬁne h, u =
s
i=1
i
j=1
hi,j ui,j in Fb.
The Walsh expansion in Fb of f : [0, 1)s
→ R is
f (u) =
h∈(Fb[z])s
˜f (h)e2πi h,u /b
,
with Walsh coeﬃcients
˜f (h) =
[0,1)s
f (u)e−2πi h,u /b
du.
Theorem: For a PLR with a random digital shift, Var[Qn] =
0=h∈L∗
s
|˜f (h)|2
.
Again, we want to kick out of the dual lattice the h’s for which |˜f (h)|2
is large. For smooth f , the
small h are the most important.

Draft
60
Version of Pγ,α for PLRs
A similar reasoning as for ordinary lattice rules leads to
Pγ,α,PLR =
u⊆{1,...,s}
γu
j∈u, hj =0
2α log2 hj
=
u⊆{1,...,s}
γu
1
n
i = 0n−1
j∈u
µ(α) − 2(1+ log2(xi,j ) )(α−1)
(µ(α) + 1) .
where µ(α) = (1 − 21−α)−1.
For α = 2, this simpliﬁes to µ(2) = 2 and
Pγ,2,PLR =
u⊆{1,...,s}
γu
1
n
n−1
i=0 j∈u
2 − 6 · 2 log2(xi,j )
.

Draft
61
Example in s = 2 dimensions
Base b = 2, k = 8, n = 28 = 256,
P(z) = 1 + z + z3 + z5 + z8 ≡ [110101001],
q1(z) = 1, q2(z) = 1 + z + z2 + z3 + z5 + z7 ≡ [11110101].
0 1
1
ui,1
ui,2

Draft
62
A PLR is also a special case of a digital net in base b, and this can be used to generate the
points eﬃciently: compute the generating matrices and use the digital net implementation.
This is particularly fast in base b = 2.
Random shift in space of formal series: equivalent to a random digital shift in base b, applied
to all the points. It preserves equidistribution.

Draft
63
Random digital shift for digital net
Equidistribution in digital boxes is lost with random shift modulo 1,
but can be kept with a random digital shift in base b.
In base 2: Generate U ∼ U(0, 1)s and XOR it bitwise with each ui .
Example for s = 2:
ui = (0.01100100..., 0.10011000...)2
U = (0.01001010..., 0.11101001...)2
ui ⊕ U = (0.00101110..., 0.01110001...)2.
Each point has U(0, 1) distribution.
Preservation of the equidistribution (k1 = 3, k2 = 5):
ui = (0.***, 0.*****)
U = (0.010, 0.11101)2
ui ⊕ U = (0.***, 0.*****)

Draft
64
Example with
U = (0.1270111220, 0.3185275653)10
= (0. 0010 0000100000111100, 0. 0101 0001100010110000)2.
Changes the bits 3, 9, 15, 16, 17, 18 of ui,1
and the bits 2, 4, 8, 9, 13, 15, 16 of ui,2.
0 1
1
un+1
un
0 1
1
un+1
un
Red and green squares are permuted (k1 = k2 = 4, ﬁrst 4 bits of U).

Draft
65
Array-RQMC for Markov Chains
Setting: A Markov chain with state space X ⊆ R , evolves as
X0 = x0, Xj = ϕj (Xj−1, Uj ), j ≥ 1,
where the Uj are i.i.d. uniform r.v.’s over (0, 1)d . Want to estimate
µ = E[Y ] where Y =
τ
j=1
gj (Xj ).
Ordinary MC: n i.i.d. realizations of Y . Requires s = τd uniforms.
Array-RQMC: L., L´ecot, Tuﬃn, et al. [2004, 2006, 2008, etc.]
Simulate an “array” (or population) of n chains in “parallel.”
Goal: Want small discrepancy between empirical distribution of states
Sn,j = {X0,j , . . . , Xn−1,j } and theoretical distribution of Xj , at each step j.
At each step, use RQMC point set to advance all the chains by one step.

Draft
66Some RQMC insight: To simplify, suppose Xj ∼ U(0, 1) .
We estimate
µj = E[gj (Xj )] = E[gj (ϕj (Xj−1, U))] =
[0,1) +d
gj (ϕj (x, u))dxdu
by
ˆµarqmc,j,n =
1
n
n−1
i=0
gj (Xi,j ) =
1
n
n−1
i=0
gj (ϕj (Xi,j−1, Ui,j )).
This is (roughly) RQMC with the point set Qn = {(Xi,j−1, Ui,j ), 0 ≤ i < n} .
We want Qn to have low discrepancy (LD) over [0, 1) +d
.

Draft
66Some RQMC insight: To simplify, suppose Xj ∼ U(0, 1) .
We estimate
µj = E[gj (Xj )] = E[gj (ϕj (Xj−1, U))] =
[0,1) +d
gj (ϕj (x, u))dxdu
by
ˆµarqmc,j,n =
1
n
n−1
i=0
gj (Xi,j ) =
1
n
n−1
i=0
gj (ϕj (Xi,j−1, Ui,j )).
This is (roughly) RQMC with the point set Qn = {(Xi,j−1, Ui,j ), 0 ≤ i < n} .
We want Qn to have low discrepancy (LD) over [0, 1) +d
.
We do not choose the Xi,j−1’s in Qn: they come from the simulation.
We select a LD point set
˜Qn = {(w0, U0,j ), . . . , (wn−1, Un−1,j )} ,
where the wi ∈ [0, 1) are ﬁxed and each Ui,j ∼ U(0, 1)d
.
Permute the states Xi,j−1 so that Xπj (i),j−1 is “close” to wi for each i (LD between the two sets), and
compute Xi,j = ϕj (Xπj (i),j−1, Ui,j ) for each i.
Example: If = 1, can take wi = (i + 0.5)/n and just sort the states.
For > 1, there are various ways to deﬁne the matching (multivariate sort).

Draft
67
Array-RQMC algorithm
Xi,0 ← x0 (or Xi,0 ← xi,0) for i = 0, . . . , n − 1;
for j = 1, 2, . . . , τ do
Compute the permutation πj of the states (for matching);
Randomize afresh {U0,j , . . . , Un−1,j } in ˜Qn;
Xi,j = ϕj (Xπj (i),j−1, Ui,j ), for i = 0, . . . , n − 1;
ˆµarqmc,j,n = ¯Yn,j = 1
n
n−1
i=0 g(Xi,j );
Estimate µ by the average ¯Yn = ˆµarqmc,n = τ
j=1 ˆµarqmc,j,n.

Draft
67
Array-RQMC algorithm
Xi,0 ← x0 (or Xi,0 ← xi,0) for i = 0, . . . , n − 1;
for j = 1, 2, . . . , τ do
Compute the permutation πj of the states (for matching);
Randomize afresh {U0,j , . . . , Un−1,j } in ˜Qn;
Xi,j = ϕj (Xπj (i),j−1, Ui,j ), for i = 0, . . . , n − 1;
ˆµarqmc,j,n = ¯Yn,j = 1
n
n−1
i=0 g(Xi,j );
Estimate µ by the average ¯Yn = ˆµarqmc,n = τ
j=1 ˆµarqmc,j,n.
Proposition: (i) The average ¯Yn is an unbiased estimator of µ.
(ii) The empirical variance of m independent realizations gives an unbiased estimator of
Var[ ¯Yn].

Draft
68
Some generalizations
L., Lécot, and Tuffin [2008]: τ can be a random stopping time w.r.t. the filtration
F{(j, Xj ), j ≥ 0}.
L., Demers, and Tuffin [2006, 2007]: Combination with splitting techniques (multilevel and
without levels), combination with importance sampling and weight windows. Covers particle
filters.
L. and Sanvido [2010]: Combination with coupling from the past for exact sampling.
Dion and L. [2010]: Combination with approximate dynamic programming and for optimal
stopping problems.
Gerber and Chopin [2015]: Sequential QMC.

Draft
69
Convergence results and applications
L., Lécot, and Tuffin [2006, 2008]: Special cases: convergence at MC rate, one-dimensional,
stratification, etc. O(n−3/2
) variance.
Lécot and Tuffin [2004]: Deterministic, one-dimension, discrete state.
El Haddad, Lécot, L. [2008, 2010]: Deterministic, multidimensional. O(n−1/( +1)
) worst-case error
under some conditions.
Fakhererredine, El Haddad, Lécot [2012, 2013, 2014]: LHS, stratification, Sudoku sampling, ...
L., Lécot, Munger, and Tuffin [2016]: Survey, comparing sorts, and further examples, some with
O(n−3
) empirical variance.
Wächter and Keller [2008]: Applications in computer graphics.
Gerber and Chopin [2015]: Sequential QMC (particle filters), Owen nested scrambling and Hilbert
sort. o(n−1
) variance.

Draft
70
A (4,4) mapping
States of the chains
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
s
s
s
s
s ss
s
s
s
s
Sobol’ net in 2 dimensions after
random digital shift
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0 s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s

Draft
71
A (4,4) mapping
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
ss
s
s
s
s
s
s
s
s
s
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s

Draft
72
A (4,4) mapping
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
ss
s
s
s
s
s
s
s
s
s
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s

Draft
72
A (4,4) mapping
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
z
z
s
s
s
s
s
ss
s
s
s
s
s
s
s
s
s
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
z
z
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s

Draft
73
Hilbert curve sort
Map the states to [0, 1], then sort.
0.0
0.0
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.4
0.5
0.5
0.6
0.6
0.7
0.7
0.8
0.8
0.9
0.9
1.0
1.0
s
s
s
s
s
ss
s
s
s
s
s
s
s
s
s

Draft
74
Example: Asian Call Option
S(0) = 100, K = 100, r = 0.05, σ = 0.15, tj = j/52, j = 0, . . . , τ = 13.
RQMC: Sobol’ points with linear scrambling + random digital shift.
Similar results for randomly-shifted lattice + baker’s transform.
log2 n
8 10 12 14 16 18 20
log2 Var[ˆµRQMC,n]
-40
-30
-20
-10
n−2
array-RQMC, split sort
RQMC sequential
crude MC
n−1

Draft
75
Example: Asian Call Option
Sort RQMC points
log2 Var[ ¯Yn,j ]
log2 n VRF CPU (sec)
Batch sort SS -1.38 2.0 × 102
744
(n1 = n2) Sobol -2.03 4.2 × 106
532
Sobol+NUS -2.03 2.8 × 106
1035
Korobov+baker -2.04 4.4 × 106
482
Hilbert sort SS -1.55 2.4 × 103
840
(logistic map) Sobol -2.03 2.6 × 106
534
Sobol+NUS -2.02 2.8 × 106
724
Korobov+baker -2.01 3.3 × 106
567
VRF for n = 220. CPU time for m = 100 replications.

Draft
76
Conclusion, discussion, etc.
RQMC can improve the accuracy of estimators considerably in some applications.
Cleverly modifying the function f can often bring huge statistical eﬃciency
improvements in simulations with RQMC.
There are often many possibilities for how to change f to make it smoother, periodic,
and reduce its eﬀective dimension.
Point set constructions should be based on discrepancies that take that into account.
Nonlinear functions of expectations: RQMC also reduces the bias.
RQMC for density estimation.
RQMC for optimization.
Array-RQMC and other QMC methods for Markov chains. Sequential RQMC.
Still a lot to learn and do ...

Draft
76Some references on QMC, RQMC, and lattice rules:
Monte Carlo and Quasi-Monte Carlo Methods 2014, 2012, 2010, ... Springer-Verlag, Berlin,
2016, 2014, 2012, ...
J. Dick and F. Pillichshammer. Digital Nets and Sequences: Discrepancy Theory and
Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge, U.K., 2010.
F. J. Hickernell. Lattice rules: How well do they measure up? In P. Hellekalek and G. Larcher,
editors, Random and Quasi-Random Point Sets, volume 138 of Lecture Notes in Statistics, pages
109–166. Springer-Verlag, New York, 1998.
F. J. Hickernell, H. S. Hong, P. L’Ecuyer, and C. Lemieux. Extensible lattice sequences for
quasi-Monte Carlo quadrature. SIAM Journal on Scientiﬁc Computing, 22(3):1117–1138, 2001.
J. Imai and K. S. Tan. A general dimension reduction technique for derivative pricing. Journal
of Computational Finance, 10(2):129–155, 2006.
P. L’Ecuyer. Polynomial integration lattices. In H. Niederreiter, editor, Monte Carlo and
Quasi-Monte Carlo Methods 2002, pages 73–98, Berlin, 2004. Springer-Verlag.
P. L’Ecuyer. Quasi-Monte Carlo methods with applications in ﬁnance. Finance and
Stochastics, 13(3):307–349, 2009.

Draft
76P. L’Ecuyer. Randomized quasi-monte carlo: An introduction for practitioners. In P. W. Glynn
and A. B. Owen, editors, Monte Carlo and Quasi-Monte Carlo Methods 2016, 2017.
P. L’Ecuyer and C. Lemieux. Variance reduction via lattice rules. Management Science,
46(9):1214–1235, 2000.
P. L’Ecuyer and D. Munger. On ﬁgures of merit for randomly-shifted lattice rules. In
H. Wo´zniakowski and L. Plaskota, editors, Monte Carlo and Quasi-Monte Carlo Methods 2010,
pages 133–159, Berlin, 2012. Springer-Verlag.
P. L’Ecuyer and D. Munger. Algorithm 958: Lattice builder: A general software tool for
constructing rank-1 lattice rules. ACM Trans. on Mathematical Software, 42(2):Article 15,
2016.
C. Lemieux. Monte Carlo and Quasi-Monte Carlo Sampling. Springer-Verlag, New York, NY,
2009.
C. Lemieux and P. L’Ecuyer. Randomized polynomial lattice rules for multivariate integration
and simulation. SIAM Journal on Scientiﬁc Computing, 24(5):1768–1789, 2003.
H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods, volume 63 of
SIAM CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, PA,
1992.

Draft
76D. Nuyens. The construction of good lattice rules and polynomial lattice rules. In Peter
Kritzer, Harald Niederreiter, Friedrich Pillichshammer, and Arne Winterhof, editors, Uniform
Distribution and Quasi-Monte Carlo Methods: Discrepancy, Integration and Applications, pages
223–255. De Gruyter, 2014.
D. Nuyens and R. Cools. Fast algorithms for component-by-component construction of rank-1
lattice rules in shift-invariant reproducing kernel Hilbert spaces. Mathematics of Computation,
75:903–920, 2006.
I. H. Sloan and S. Joe. Lattice Methods for Multiple Integration. Clarendon Press, Oxford,
1994.

Draft
76Some references on Array-RQMC:
M. Gerber and N. Chopin. Sequential quasi-Monte Carlo. Journal of the Royal Statistical
Society, Series B, 77(Part 3):509–579, 2015.
P. L’Ecuyer, V. Demers, and B. Tuffin. Rare-events, splitting, and quasi-Monte Carlo. ACM
Transactions on Modeling and Computer Simulation, 17(2):Article 9, 2007.
P. L’Ecuyer, C. Lécot, and A. L’Archevêque-Gaudet. On array-RQMC for Markov chains:
Mapping alternatives and convergence rates. Monte Carlo and Quasi-Monte Carlo Methods
2008, pages 485–500, Berlin, 2009. Springer-Verlag.
P. L’Ecuyer, C. Lécot, and B. Tuffin. A randomized quasi-Monte Carlo simulation method for
Markov chains. Operations Research, 56(4):958–975, 2008.
P. L’Ecuyer, D. Munger, C. Lécot, and B. Tuffin. Sorting methods and convergence rates for
array-rqmc: Some empirical comparisons. Mathematics and Computers in Simulation, 2016.
http://dx.doi.org/10.1016/j.matcom.2016.07.010.
P. L’Ecuyer and C. Sanvido. Coupling from the past with randomized quasi-Monte Carlo.
Mathematics and Computers in Simulation, 81(3):476–489, 2010.
C. Wächter and A. Keller. Efficient simultaneous simulation of Markov chains. Monte Carlo
and Quasi-Monte Carlo Methods 2006, pages 669–684, Berlin, 2008. Springer-Verlag.

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Lattice Rules for Quasi-Monte Carlos - Pierre L’Ecuyer, Aug 28, 2017

More Related Content

Similar to Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Lattice Rules for Quasi-Monte Carlos - Pierre L’Ecuyer, Aug 28, 2017 (20)

More from The Statistical and Applied Mathematical Sciences Institute (20)

Recently uploaded (20)

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Lattice Rules for Quasi-Monte Carlos - Pierre L’Ecuyer, Aug 28, 2017