
THE BAKER-CAMPBELL-HAUSDORFF FORMULA AND SIMPLY

CONNECTED LIE GROUP HOMOMORPHISMS

AASHIR MEHROTRA

Abstract. In this paper, we explore matrix Lie groups, which are closed subgroups of the group of
invertible real or complex matrices. The Lie group structure can be used to prove properties of many
important groups, such as the unitary and orthogonal groups, among others. We explore
matrix Lie groups along with Lie algebras, which are non-associative, skew-symmetric algebras
satisfying a property known as the Jacobi identity. Lie groups and Lie algebras go hand in
hand, as locally they can be bijectively mapped to one another by using the exponential and
logarithm functions on matrices, which we define and whose properties we prove in this paper. We
prove that every Lie algebra homomorphism ϕ gives rise to a unique Lie group homomorphism Φ,
compatible with ϕ through the exponential function in the sense that Φ(e^X) = e^{ϕ(X)}.
In order to prove this result, we first prove the Baker-Campbell-Hausdorff
formula, which shows that log(e^X e^Y) can be expressed as an infinite series of
nested brackets in X and Y, provided both matrices are sufficiently small in norm.

1. Introduction
Lie groups are groups that are also differentiable manifolds, such that the product and
inversion operations are smooth. In this paper, we focus on a special class of Lie groups,
namely matrix Lie groups.
We define a topology on the space of n × n complex matrices, which allows us to define
topological properties such as connectedness and simple connectedness for groups of matrices.
By using a matrix norm and absolute convergence, it is possible to define the exponential
of a matrix to be the convergent infinite sum that is identical to the complex Taylor series
expansion of e^z, with z replaced by X. While the scalar and matrix exponentials satisfy
many common properties, it is, in general, not true that e^{X+Y} = e^X e^Y for complex matrices X and
Y. The logarithm can also be defined for matrices, though just as in the complex
case, we must restrict the domain to the open ball ∥A − I∥ < 1 in order to prevent the
logarithm from being a multi-valued function. Just as in the complex case, it is possible to prove
that the exponential and logarithm functions are inverses of one another in local neighbourhoods
of 0 and I.
A (real) Lie algebra is a (real) vector space g together with a product map [·, ·] : g × g → g
that is bilinear, skew-symmetric, and satisfies the Jacobi identity:
[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0.
The Lie bracket [·, ·] need not be associative.
Every matrix Lie group G has an associated Lie algebra, which is the set of matrices X
such that e^{tX} ∈ G for all real numbers t. We show that a Lie algebra defined in this
way is indeed a real Lie algebra as defined earlier. We also prove that for any matrix Lie
group homomorphism (which is a continuous group homomorphism) Φ : G → H between
Date: July 15, 2023.

two matrix Lie groups G and H, there exists a unique Lie algebra homomorphism (which is
a linear map that preserves the Lie bracket [·, ·]) ϕ : g → h, where g and h are the Lie algebras
of G and H respectively. Moreover, the functions Φ and ϕ satisfy:
eϕ(X) = Φ(eX )
for all X ∈ g.
Our main theorem for this paper is a partial converse to this result: it provides a
canonical matrix Lie group homomorphism given a Lie algebra homomorphism, provided
that the domain matrix Lie group G is simply connected.
Theorem 1.1. Let G and H be matrix Lie groups with associated Lie algebras g and h
respectively. If ϕ : g → h is a Lie algebra homomorphism, and G is simply connected, then
there exists a unique Lie group homomorphism Φ : G → H such that
Φ(e^X) = e^{ϕ(X)}
for all X ∈ g.
In order to prove this result, we first prove the Baker-Campbell-Hausdorff Formula (or
BCH formula for short), which has other key results in Lie theory as consequences. Suppose
g(A) = log A / (1 − A^{−1}),
which is defined for ∥A − I∥ < 1. Then the BCH formula states that if X, Y are n × n complex
matrices with ∥X∥ and ∥Y∥ sufficiently small, then
log(e^X e^Y) = X + ∫₀¹ g(e^{ad_X} e^{t ad_Y})(Y) dt,
where ad_X is the linear map sending Y to [X, Y]. The above integral can be expressed
in the form of a series of nested brackets in X and Y.
This series formulation of the BCH formula is useful in the proof of Theorem 1.1, as it is
used to prove that a "local homomorphism" with the desired properties exists given a Lie
algebra homomorphism. To extend this local homomorphism to a global one, the condition
that G is simply connected is required.

2. Lie Groups
We first start with definitions concerning matrix Lie groups. In what follows, we denote
by Mn(F) the ring of n × n matrices over the field F, and by GLn(F) the group of n × n
invertible matrices over F. Also, log will mean the logarithm in base e.
Definition 2.1. If A ∈ Mn(C), then we define the norm of A to be
(2.1)    ∥A∥ = sup_{x ∈ Cⁿ\{0}} ∥Ax∥/∥x∥,
where ∥v∥ for v ∈ Cⁿ is √(Σ_{i=1}ⁿ |vᵢ|²).
Note that other norms can be given to matrices, such as the largest absolute value
of the entries, or the square root of the sum of the squares of the absolute values of the entries,
similar to the vector norm. Nonetheless, all such norms can be proven to be equivalent
to the norm defined above, so there is no ambiguity when using ∥ · ∥.

The expression (2.1) satisfies the usual properties of a norm (such as the triangle inequality).
However, the norm is not multiplicative, but rather sub-multiplicative.
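As a quick numerical sketch (the 2 × 2 real example matrices are hypothetical; the operator norm of a real 2 × 2 matrix is computed here as the square root of the largest eigenvalue of AᵀA, which agrees with (2.1)):

```python
import math

def mmul(A, B):
    # product of two 2x2 matrices
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(A):
    # ||A|| = sqrt(largest eigenvalue of A^T A) for a real 2x2 matrix A
    a, b = A[0]
    c, d = A[1]
    m11 = a * a + c * c          # entries of the symmetric matrix A^T A
    m12 = a * b + c * d
    m22 = b * b + d * d
    tr, det = m11 + m22, m11 * m22 - m12 * m12
    lam_max = (tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2
    return math.sqrt(lam_max)

A = [[1.0, 2.0], [0.0, 1.0]]
R = [[0.0, -1.0], [1.0, 0.0]]   # a rotation preserves lengths, so ||R|| = 1

print(op_norm(R))                                              # 1.0
print(op_norm(mmul(A, R)) <= op_norm(A) * op_norm(R) + 1e-12)  # True
```

The second check is exactly sub-multiplicativity, ∥AR∥ ≤ ∥A∥∥R∥ (with a tiny floating-point tolerance).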
Claim 2.2. For all A, B ∈ Mn (C), ∥AB∥ ≤ ∥A∥∥B∥.
Proof. We have
∥AB∥ = sup_{v≠0} ∥ABv∥/∥v∥ = sup_{Bv≠0} ∥ABv∥/∥v∥ = sup_{Bv≠0} (∥ABv∥/∥Bv∥)(∥Bv∥/∥v∥)
≤ sup_{w≠0} ∥Aw∥/∥w∥ · sup_{v≠0} ∥Bv∥/∥v∥ = ∥A∥∥B∥. ■
The norm gives rise to a metric, and hence a topology, on Mn(C). We can thus
give the subspace topology to any subset of Mn(C), including GLn(C).
Definition 2.3. If G is a subgroup of GLn(C), then G is said to be a matrix Lie group if it is
a closed set with respect to the subspace topology of GLn(C). In other words, given any
convergent sequence of matrices in G, its limit must either not be invertible or must remain in G.
For example, the set of all n × n complex matrices with determinant 1 is a matrix
Lie group. This is because, along with being a group, the set is the
pre-image of {1} under the determinant map, and hence is closed. This set is denoted
by SLn(C).
Of course GLn(C) is also a matrix Lie group, as are GLn(R) and SLn(R), since
real matrices still satisfy the group axioms, and R is a closed subset of C.
Suppose we have a vector space (either Rn or Cn ) with the following inner product:
⟨x, y⟩ = x1 y1 + x2 y2 + . . . + xn yn .
The group of operators that preserve the above inner product is called the n × n orthogonal
group of R (or C), denoted by On (R) and On (C).
If we impose the additional condition that the determinant must be 1, we get the special
orthogonal groups SOn (R) and SOn (C).
Another inner product, this time being only applicable to Cn , is as follows:
⟨x, y⟩ = x1 ȳ1 + x2 ȳ2 + . . . + xn ȳn .
The group of matrices that preserve this inner product is known as the unitary group, or
U (n). Imposing the determinant to be 1, we get the special unitary group SU (n).
Topological properties such as compactness, connectedness, and simple connectedness ap-
ply to matrix Lie groups. Note that since the norm topology on Mn(C) is identical to
that of C^{n²}, matrix Lie groups are locally path-connected, and hence all connected matrix Lie
groups are path-connected.
To conclude this section, we define matrix Lie group homomorphisms.
Definition 2.4. Let G and H be matrix Lie groups. A matrix Lie group homomorphism
is a map Φ : G → H such that
• Φ is a group homomorphism
• Φ is continuous.
A matrix Lie group isomorphism is a matrix Lie group homomorphism that is a group
isomorphism and a homeomorphism.

3. The Matrix Exponential and Logarithm


Recall that for any complex number z, the exponential is defined as:
e^z = 1 + z + z²/2! + z³/3! + ··· .
Applying the root test,
1/R = lim sup_{n→∞} |aₙ|^{1/n},
to the coefficients of zⁿ shows that the radius of convergence of the exponential series is infinite, i.e. the series con-
verges for all z ∈ C. By absolute convergence, the infinite series above, with z substituted
by a complex matrix A, will also converge, and it does so for all A ∈ Mn(C). Hence, we
write
e^A = I + A + A²/2! + A³/3! + ··· .
One of the earliest observations one can make is that e^A commutes with A, as e^A is a power
series in A, and multiplication by A results in the exponents of the power series increasing
by 1. Similarly, e^{sA} commutes with tA for all s, t ∈ C.
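The power-series definition can be implemented directly by truncating the sum. The minimal sketch below (with two hypothetical nilpotent 2 × 2 matrices) also illustrates the warning from the introduction that e^{X+Y} need not equal e^X e^Y when X and Y do not commute:

```python
import math

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def expm(A, terms=40):
    # truncated series I + A + A^2/2! + A^3/3! + ...
    S = [[1.0, 0.0], [0.0, 1.0]]      # running sum, starts at I
    T = [[1.0, 0.0], [0.0, 1.0]]      # current term A^k / k!
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mmul(T, A)]
        S = madd(S, T)
    return S

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
lhs = mmul(expm(X), expm(Y))     # e^X e^Y = [[2, 1], [1, 1]] here (X, Y nilpotent)
rhs = expm(madd(X, Y))           # e^{X+Y} has cosh(1) ≈ 1.543 on the diagonal
print(lhs[0][0], rhs[0][0])      # the two clearly differ
```

Since X and Y are nilpotent, e^X = I + X and e^Y = I + Y exactly, which makes the discrepancy easy to see by hand.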
Claim 3.1. The function exp is continuous over Mn(C).
Proof. The sum
e^A = Σ_{n=0}^∞ Aⁿ/n!
is uniformly convergent over any ball ∥A∥ ≤ R by the Weierstrass M-test, as each summand can be
bounded by the term Rⁿ/n! of a convergent series. Since each partial sum is continuous, it
must follow that e^A is continuous on the ball centered at the origin of radius R.
Since R can be taken arbitrarily large, e^A is continuous over all of Mn(C). ■
We now prove some basic properties of the matrix exponential.
Proposition 3.2. Let X, Y ∈ Mn(C). Then
(1) e^0 = I
(2) e^X is invertible for all X, and (e^X)^{−1} = e^{−X}
(3) if X and Y commute, e^{X+Y} = e^X e^Y = e^Y e^X
(4) for S ∈ GLn(C), e^{S^{−1}XS} = S^{−1} e^X S.
Proof. Point (1) follows by plugging A = 0 into the power series.
For point (3), note that since X and Y commute, we can mimic the proof of e^{z₁+z₂} =
e^{z₁} e^{z₂} in the case of C by using the Cauchy product formula. The reason that X and
Y must commute is that when expanding the power series of e^{X+Y}, we require the
factors of X and Y in each term to collect into a single power, which is not possible
if X and Y do not commute since, for example, XYX ≠ X²Y. Hence, the Cauchy product
argument cannot apply in the general case.
Point (2) follows from point (3), as
I = e^0 = e^{X+(−X)} = e^X e^{−X}.
In order to prove point (4), note that
(S^{−1}XS)ⁿ = (S^{−1}XS)(S^{−1}XS)···(S^{−1}XS) = S^{−1}XⁿS
for any n ∈ N. In fact, we may conclude it to be true for n ∈ Z by using point (2) of this
proposition. Thus
e^{S^{−1}XS} = Σ_{n=0}^∞ (S^{−1}XS)ⁿ/n! = Σ_{n=0}^∞ S^{−1}XⁿS/n! = S^{−1} [Σ_{n=0}^∞ Xⁿ/n!] S = S^{−1} e^X S. ■

Suppose A ∈ Mn(C) is diagonalizable, meaning there exists an invertible matrix S and a
diagonal matrix Λ such that A = S^{−1}ΛS. The set of diagonalizable matrices can be proven
to be dense in Mn(C).
As shown above, e^A = S^{−1} e^Λ S. This makes computing exp for diagonalizable matrices very
convenient, as e^Λ is obtained by exponentiating each diagonal entry of Λ. Also, this representation
implies that the eigenvalues of e^A are the exponentials of the eigenvalues of A.
We now define the logarithm for matrices. Recall that for complex numbers, we define log z by
the Taylor series centered at z = 1:
log z = Σ_{n=1}^∞ (−1)^{n−1} (z − 1)ⁿ/n.
By the root test, its radius of convergence is 1. Let
log A = Σ_{n=1}^∞ (−1)^{n−1} (A − I)ⁿ/n
for a matrix A whenever the series converges. Again by absolute convergence, the series above
converges for all A such that ∥A − I∥ < 1.
Note that the series might still converge for some A that do not fall into the aforementioned
disk. For example, when A − I is nilpotent, log A converges, as the series terminates.
By a similar argument to the exponential case, log A is continuous over ∥A − I∥ < 1, as
each summand can be bounded by rⁿ/n, where r = ∥A − I∥ < 1. Since such a series
converges, the Taylor series of log A is continuous over the aforementioned disk.
We now prove that exp and log are local inverses of one another.
Lemma 3.3. (1) For z ∈ C such that |z − 1| < 1,
e^{log z} = z.
(2) For u ∈ C such that |u| < log 2, we have |e^u − 1| < 1, and
log(e^u) = u.
Proof. We have exp(log z) = z for all positive reals z, in particular for z ∈ (0, 2). Since 1 ∈
(0, 2) is an accumulation point, the identity theorem from complex analysis applies, and
exp(log z) = z for all z such that |z − 1| < 1.
Likewise, if |u| < log 2, then
|e^u − 1| = |u + u²/2! + u³/3! + ···| ≤ |u| + |u|²/2! + ··· = e^{|u|} − 1 < 1.
Hence, log(exp(u)) is well-defined for |u| < log 2. We have log(exp(u)) = u for all u ∈
(−log 2, log 2). Since 0 ∈ (−log 2, log 2) is an accumulation point, log(exp(u)) = u for all
|u| < log 2. ■

Proposition 3.4. (1) For A ∈ Mn(C) such that ∥A − I∥ < 1,
e^{log A} = A.
(2) For X ∈ Mn(C) such that ∥X∥ < log 2, we have ∥e^X − I∥ < 1, and
log(e^X) = X.
Lemma 3.5. If λ is an eigenvalue of a complex matrix X, then |λ| ≤ ∥X∥. In particular,
if ∥A − I∥ < 1 and ∥X∥ < log 2, then
|z − 1| < 1 and |u| < log 2
for all eigenvalues z of A and all eigenvalues u of X.
Proof. Let v be a unit eigenvector for the eigenvalue λ. We have
|λ| = ∥λv∥ = ∥Xv∥ = ∥Xv∥/∥v∥ ≤ ∥X∥. ■
Proof of Proposition 3.4. Suppose A ∈ Mn(C) is diagonalizable and ∥A − I∥ < 1, so that log A is
defined. By Lemma 3.5, |z − 1| < 1 for all eigenvalues z of A. If A = S^{−1}ΛS, then
A − I = S^{−1}Λ′S, where Λ′ is Λ with each diagonal entry decreased by 1. This is because the
eigenvalues of A − I are one less than the eigenvalues of A, and the eigenvectors of A and
A − I are the same.
By all of this, we must have, similarly to the exponential case,
log(A) = S^{−1} log(Λ) S,
where log(Λ) is obtained by taking the logarithm of each diagonal entry. By Lemma 3.3, e^{log z} = z for
|z − 1| < 1, and thus
e^{log A} = S^{−1} e^{log(Λ)} S = S^{−1}ΛS = A.
This proves the claim for all matrices, as the set of diagonalizable matrices is dense in Mn(C).
For ∥X∥ < log 2, an argument similar to that of Lemma 3.3 shows that ∥e^X − I∥ < 1. By
Lemma 3.5, |u| < log 2 for all eigenvalues u of X, so by Lemma 3.3 we must have |e^u − 1| < 1.
If X = S^{−1}ΛS, then
e^X = S^{−1} e^Λ S,
and since ∥e^X − I∥ and |e^u − 1| are less than 1, log is well-defined for both the LHS and the
RHS, and thus
log(e^X) = S^{−1} log(e^Λ) S = S^{−1}ΛS = X. ■
Note that a key ingredient used in the proof was how the logarithm and exponential
functions acted on diagonal matrices.
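The local inverse relationship of Proposition 3.4 can be checked numerically by truncating both series; the 2 × 2 matrix below is a hypothetical example chosen small enough that ∥X∥ < log 2:

```python
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    # truncated series I + A + A^2/2! + ...
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mmul(T, A)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

def logm(A, terms=80):
    # log A = sum_{n>=1} (-1)^{n-1} (A - I)^n / n, valid for ||A - I|| < 1
    M = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(2)] for i in range(2)]
    P = [row[:] for row in M]                 # P holds (A - I)^n
    S = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(1, terms):
        sign = 1.0 if n % 2 == 1 else -1.0
        S = [[S[i][j] + sign * P[i][j] / n for j in range(2)] for i in range(2)]
        P = mmul(P, M)
    return S

X = [[0.1, 0.2], [0.3, -0.1]]                # small enough that ||X|| < log 2
back = logm(expm(X))
err = max(abs(back[i][j] - X[i][j]) for i in range(2) for j in range(2))
print(err < 1e-10)                            # log(e^X) recovers X
```

Here ∥e^X − I∥ < 1, so the logarithm series converges and recovers X up to floating-point error.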
Recall the asymptotic notation f (x) = O(g(x)) for f, g : R → R, which is that there exists
M, x0 ∈ R such that M > 0 and
|f (x)| ≤ M |g(x)|
for all x ≥ x0 .

If f : Mn(C) → Mn(C), we define f(A) = O(g(∥A∥)), where g : R → R, if ∥f(A)∥ =
O(g(∥A∥)). We now give an asymptotic identity for log A.
Claim 3.6. For all A ∈ Mn(C) such that ∥A∥ < 1/2,
∥log(I + A) − A∥ = O(∥A∥²).
Equivalently,
(3.1)    log(I + A) = A + O(∥A∥²).
Proof. We have
log(I + A) − A = Σ_{n=2}^∞ (−1)^{n+1} Aⁿ/n = A² Σ_{n=2}^∞ (−1)^{n+1} A^{n−2}/n
⟹ ∥log(I + A) − A∥ ≤ ∥A∥² Σ_{n=2}^∞ (1/2)^{n−2}/n,
and we are done, as the series on the right converges (it is dominated by a geometric series), and hence
∥log(I + A) − A∥ is asymptotically bounded by ∥A∥². ■
Note that the choice ∥A∥ < 1/2 is arbitrary for the purposes of applying the above claim to the next
result.
We now state another important theorem regarding exp.
Theorem 3.7 (Lie Product Formula). For all X, Y ∈ Mn(C), we have
e^{X+Y} = lim_{m→∞} (e^{X/m} e^{Y/m})^m.
Proof. By multiplying the power series of e^{X/m} and e^{Y/m}, we see that all except three terms
will be asymptotically bounded by O(1/m²). Specifically,
e^{X/m} e^{Y/m} = I + X/m + Y/m + O(1/m²).
As m → ∞, e^{X/m} e^{Y/m} gets sufficiently close to I, hence falling into the domain of the
logarithm. Also, ∥X/m + Y/m + O(1/m²)∥ < 1/2 if m is sufficiently large. Thus we get (using Claim
3.6)
log(e^{X/m} e^{Y/m}) = log(I + X/m + Y/m + O(1/m²))
= X/m + Y/m + O(1/m²) + O(∥X/m + Y/m + O(1/m²)∥²)
= X/m + Y/m + O(1/m²) + O(1/m²) = X/m + Y/m + O(1/m²).
Exponentiating the logarithm and letting m tend to ∞, we get
e^{X/m} e^{Y/m} = exp(X/m + Y/m + O(1/m²))
⟹ lim_{m→∞} (e^{X/m} e^{Y/m})^m = lim_{m→∞} exp(X + Y + O(1/m)) = exp(X + Y),
which is what was desired. ■
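A minimal numerical sketch of the Lie product formula, using a hypothetical pair of non-commuting nilpotent matrices; since the error in the exponent is O(1/m), doubling m roughly halves the discrepancy:

```python
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mscale(c, A):
    return [[c * x for x in row] for row in A]

def expm(A, terms=40):
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mmul(T, A)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

def mpow(A, m):
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(m):
        R = mmul(R, A)
    return R

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
exact = expm([[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)])

def discrepancy(m):
    # max-entry distance between (e^{X/m} e^{Y/m})^m and e^{X+Y}
    approx = mpow(mmul(expm(mscale(1.0 / m, X)), expm(mscale(1.0 / m, Y))), m)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))

print(discrepancy(50) > discrepancy(100) > discrepancy(200))  # error shrinks as m grows
```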

We now consider the differentiation and integration of matrix-valued functions. The de-
rivative of a function A : R → Mn(C) is defined entrywise as
(dA/dt)_{ij} = dA_{ij}/dt.
The linearity and product rule follow the usual proofs from the scalar case.
Theorem 3.8. For X ∈ Mn(C), e^{tX} is a smooth function of t, and
d/dt e^{tX} = X e^{tX} = e^{tX} X.
In particular,
d/dt e^{tX} |_{t=0} = X.

Proof. If Λ is diagonal, then
(d/dt e^{tΛ})_{ij} = d(e^{tΛ_{ij}})/dt = Λ_{ij} e^{tΛ_{ij}}.
Thus, the identity holds for diagonal matrices.
It suffices to prove the result for diagonalizable matrices X, as they are dense in Mn(C).
If X = S^{−1}ΛS, then
d/dt e^{tX} = d/dt S^{−1} e^{tΛ} S = S^{−1} Λ e^{tΛ} S = (S^{−1}ΛS)(S^{−1} e^{tΛ} S) = X e^{tX}. ■


The integral of a matrix-valued function is defined similarly to the derivative, by taking
the integral of each entry. This will be used in the Baker-Campbell-Hausdorff formula.
To conclude this section, we define and prove a result on one-parameter subgroups, which
will be useful later.
Definition 3.9. A continuous function A : R → GLn(C) is called a one-parameter sub-
group of GLn(C) if
• A(0) = I
• A(t + s) = A(t)A(s) for all s, t ∈ R.
Theorem 3.10 (Characterisation of One-Parameter Subgroups). If A is a one-parameter
subgroup of GLn (C), then there exists a unique X ∈ Mn (C) such that
A(t) = etX
for all t ∈ R.
Lemma 3.11. Fix ϵ < log 2. Let B_{ϵ/2} be the open ball of radius ϵ/2 centered at the origin,
and let U = exp(B_{ϵ/2}). Then for every B ∈ U, there exists a unique C ∈ U such that
C² = B, and it is given by C = exp(½ log B).
Proof. Since ∥B − I∥ < 1 (by Proposition 3.4), the matrix C = exp(½ log B) is defined; moreover
½ log B ∈ B_{ϵ/4} ⊂ B_{ϵ/2}, so C ∈ U, and by Proposition 3.4, C² = exp(2 · ½ log B) = exp(log B) = B.

In order to establish uniqueness, suppose C′ ∈ U satisfies
(C′)² = B. Let Y = log C′, so that by Proposition 3.4, Y ∈ B_{ϵ/2}, which in turn implies
2Y ∈ B_ϵ. We also have exp Y = C′, and hence
exp(2Y) = (C′)² = B = exp(log B).
Note that log B ∈ B_{ϵ/2} ⊂ B_ϵ. By point (2) of Proposition 3.4, exp is injective over B_ϵ, and
since exp(2Y) = exp(log B), we get 2Y = log B. Hence
C′ = exp(Y) = exp(½ log B) = C. ■
Proof of Theorem 3.10. The fact that X is unique, if it exists, is clear, as
X = d/dt A(t) |_{t=0}.
In order to prove existence, let B_{ϵ/2} and U be as described in the previous lemma. Note that
U is an open neighbourhood of I, as it is the pre-image of B_{ϵ/2} under the function log. Hence, by
the continuity of A, there exists t₀ > 0 such that A(t) ∈ U for all t ∈ R with |t| ≤ t₀. Let
X = (1/t₀) log(A(t₀)),
so that t₀X = log(A(t₀)). This means that t₀X ∈ B_{ϵ/2}, and thus
e^{t₀X} = A(t₀).
Now, by the definition of t₀, A(t₀/2) ∈ U, with A(t₀/2)² = A(t₀) by the axioms of a one-
parameter subgroup. By Lemma 3.11, A(t₀) has a unique square root in U, given by exp(t₀X/2).
Thus, we have
A(t₀/2) = e^{t₀X/2}.
By inductively applying this argument, we get, for all positive integers k,
A(t₀/2ᵏ) = e^{t₀X/2ᵏ}.
Also, for all integers n,
A(nt₀/2ᵏ) = (A(t₀/2ᵏ))ⁿ = e^{nt₀X/2ᵏ}.
Hence A(t) = exp(tX) for all t = (n/2ᵏ)t₀, with n ∈ Z and k ∈ N. Since all such t are dense
in R, and A along with exp are continuous, we can conclude that A(t) = exp(tX) for all
t ∈ R. ■

4. Directional Derivatives
The total derivative of a function f : U → Cⁿ (where U is an open subset of some Cᵐ) at
a point x ∈ U is the unique complex linear transformation J_x : Cᵐ → Cⁿ such that
lim_{y→x} ∥f(y) − f(x) − J_x(y − x)∥ / ∥y − x∥ = 0.

The above should hold regardless of how y approaches x. The directional derivative of f
with respect to a vector v ∈ Cᵐ is defined by the limit
∇_v f(x) = lim_{h→0} (f(x + hv) − f(x))/h
for h ∈ R. If we set y(t) = x + tv (t ∈ R), then we have
0 = lim_{y→x} ∥f(y) − f(x) − J_x(y − x)∥ / ∥y − x∥
= lim_{t→0} ∥f(x + tv) − f(x) − J_x(tv)∥ / ∥tv∥
= lim_{t→0} ∥(f(x + tv) − f(x))/t − J_x(v)∥ / ∥v∥
= ∥∇_v f(x) − J_x v∥ / ∥v∥,
hence ∇_v f(x) = J_x v for all x ∈ U and v ∈ Cᵐ.

We use the directional derivative in the proofs for a couple of crucial theorems, including
the Baker-Campbell-Hausdorff Formula.

5. Lie Algebra
Definition 5.1. A Lie algebra g is a real (or complex) vector space together with a bilinear,
skew-symmetric map [·, ·] : g × g → g that satisfies the Jacobi identity, meaning
[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0
for all X, Y, Z ∈ g.
A subalgebra h ⊆ g is a subspace such that [h₁, h₂] ∈ h for all h₁, h₂ ∈ h.
For this paper, we only consider real Lie algebras.
One example of a Lie algebra is R3 with the cross product. The validity of the axioms can
be verified, but won’t be relevant for this paper.
The more important example of a Lie algebra is Mn(C) with an operation known
as the commutator:
[X, Y] = XY − YX.
The reason for the name commutator is that [X, Y] equals 0 if and only if XY = YX, i.e. X
and Y commute.
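The commutator's skew-symmetry and the Jacobi identity can be verified mechanically; with the hypothetical integer matrices below, the cancellation is exact:

```python
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def bracket(X, Y):
    # commutator [X, Y] = XY - YX
    P, Q = mmul(X, Y), mmul(Y, X)
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

X = [[1, 2], [3, 4]]
Y = [[0, 1], [1, 0]]
Z = [[2, 0], [0, -1]]

# skew-symmetry: [X, Y] + [Y, X] = 0
skew = madd(bracket(X, Y), bracket(Y, X))
# Jacobi identity: [X,[Y,Z]] + [Z,[X,Y]] + [Y,[Z,X]] = 0
jacobi = madd(madd(bracket(X, bracket(Y, Z)),
                   bracket(Z, bracket(X, Y))),
              bracket(Y, bracket(Z, X)))

print(skew)    # [[0, 0], [0, 0]]
print(jacobi)  # [[0, 0], [0, 0]]
```

Of course, both identities hold for the commutator on all of Mn(C), not just for these particular matrices.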
We shall denote the above Lie algebra as gln (C), whose naming shall become clear in due
course.
Note that gln (C) can be interpreted as either a real or complex Lie algebra, which are
distinct. In this paper, we’ll consider the real Lie algebra.
Suppose sln (C) (whose naming will also be explained) is the set of n × n complex matrices
with trace zero. Then it can be checked that sln (C) along with the commutator as a Lie
bracket is a real Lie algebra.

Definition 5.2. If g and h are real (or complex) Lie algebras, then a linear map ϕ : g → h
is called a Lie algebra homomorphism if [ϕ(X), ϕ(Y )] = ϕ([X, Y ]) for all X, Y ∈ g. If
the linear map is also invertible, then ϕ is a Lie algebra isomorphism.
Definition 5.3. If g1 and g2 are real (or complex) Lie algebras, then define the direct sum
Lie algebra g of g1 and g2 to be the vector direct sum of g1 and g2 , along with the bracket
given by:
(5.1) [(X1 , X2 ), (Y1 , Y2 )] = ([X1 , Y1 ], [X2 , Y2 ]),
for X1 , Y1 ∈ g1 and X2 , Y2 ∈ g2 . We denote the direct sum as g1 ⊕ g2 .
It can be shown that (5.1) adheres to the Lie algebra axioms.
Definition 5.4. If g is a Lie algebra, and g1 and g2 are Lie subalgebra of g, then g decom-
poses as the Lie algebra direct sum of g1 and g2 if g is the vector space direct sum of
g1 and g2 , and [X1 , X2 ] = 0 for all X1 ∈ g1 and X2 ∈ g2 .
Claim 5.5. If g decomposes as the Lie algebra direct sum of g₁ and g₂, then the direct sum
Lie algebra g₁ ⊕ g₂ is isomorphic to g.
Proof. Since g is the vector space direct sum of g₁ and g₂, every A ∈ g can be expressed uniquely as
A = X₁ + Y₁, where X₁ ∈ g₁ and Y₁ ∈ g₂. Since elements of g₁ and g₂ commute with one another,
(5.2)    [A, B] = [X₁ + Y₁, X₂ + Y₂] = [X₁, X₂] + [X₁, Y₂] + [Y₁, X₂] + [Y₁, Y₂] = [X₁, X₂] + [Y₁, Y₂]
for all A = X₁ + Y₁ and B = X₂ + Y₂ in g. Equation (5.2) is essentially identical to (5.1), and hence
the map ϕ : g₁ ⊕ g₂ → g given by ϕ((X, Y)) = X + Y provides the required Lie algebra isomorphism. ■
We now explain the connection between matrix Lie groups and Lie algebras.
Definition 5.6. Let G be a matrix Lie group. The Lie algebra of G, denoted by g, is the set
of all complex matrices (not necessarily invertible) X such that e^{tX} ∈ G for all t ∈ R.
Another way of formulating the above definition is that the entire one-parameter
subgroup generated by X lies in G.
This is the explanation behind the notation gln (C) and sln (C). Note that in physics, the
definition demands eitX to be in G, rather than etX , causing the formulations of some Lie
groups to be off by a factor of i.
It is possible to define the Lie algebra for a general Lie group (as the tangent space of G
at the identity), though that won’t be relevant here.
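The notation sl_n(C) can be illustrated concretely. By the standard identity det(e^X) = e^{tr X} (not proven in this paper, but immediate for diagonalizable X, since the eigenvalues of e^X are the exponentials of those of X), every trace-zero X satisfies e^{tX} ∈ SL_n(C) for all t, which is why sl_n(C) is the Lie algebra of SL_n(C). A sketch with a hypothetical trace-zero 2 × 2 matrix:

```python
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=60):
    # truncated series I + A + A^2/2! + ...
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mmul(T, A)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

X = [[0.3, 0.5], [0.2, -0.3]]       # trace zero, so X lies in sl_2
for t in (0.5, 1.0, 2.0):
    E = expm([[t * x for x in row] for row in X])
    det = E[0][0] * E[1][1] - E[0][1] * E[1][0]
    print(abs(det - 1.0) < 1e-10)    # det(e^{tX}) = e^{t tr X} = 1, so e^{tX} ∈ SL_2
```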
Proposition 5.7. Let G be a matrix Lie group, with Lie algebra g. For all X, Y ∈ g, A ∈ G,
and s ∈ R, we have
(1) A−1 XA ∈ g
(2) sX ∈ g
(3) X + Y ∈ g
(4) XY − Y X ∈ g.
Thus, g is a real Lie algebra in the way defined earlier.
Proof. (1) Recalling point (4) from Proposition 3.2,
e^{t(A^{−1}XA)} = A^{−1} e^{tX} A ∈ G
for all t ∈ R, as all three of A^{−1}, e^{tX}, and A are in G. Hence A^{−1}XA ∈ g.

(2) Observe that e^{t(sX)} = e^{tsX} ∈ G for all t, thus showing sX ∈ g.
(3) In order to prove this point, we use the Lie product formula:
e^{t(X+Y)} = lim_{n→∞} (e^{tX/n} e^{tY/n})ⁿ.
Now (e^{tX/n} e^{tY/n})ⁿ ∈ G for all n, and since G is closed in GLn(C), the limit e^{t(X+Y)} is
either in G or is not invertible. By point (2) of Proposition 3.2, e^{t(X+Y)} is invertible,
and hence e^{t(X+Y)} ∈ G.
(4) By the product rule and Theorem 3.8,
d/dt e^{tX} Y e^{−tX} |_{t=0} = (XY)e^0 + (e^0 Y)(−X) = XY − YX.
Now
XY − YX = lim_{h→0} (e^{hX} Y e^{−hX} − Y)/h
for h ∈ R. By point (1) of this result, e^{hX} Y e^{−hX} ∈ g. Since we have already
proven that g is a real vector space, the expression inside the limit is in g for all h. Since
g, being a finite-dimensional subspace, is a closed set, the limit XY − YX must be in g. ■

We now prove a converse to our main result (Theorem 1.1), though this result is more
generally applicable.
Theorem 5.8. Let G and H be matrix Lie groups, with corresponding Lie algebras g and h
respectively. If Φ : G → H is a matrix Lie group homomorphism, then there exists a Lie
algebra homomorphism ϕ : g → h such that
Φ(e^X) = e^{ϕ(X)}
for all X ∈ g. Additionally, for all X, Y ∈ g and A ∈ G,
(1) ϕ(A^{−1}XA) = Φ(A^{−1})ϕ(X)Φ(A)
(2) ϕ([X, Y]) = [ϕ(X), ϕ(Y)]
(3) ϕ(X) = d/dt Φ(e^{tX}) |_{t=0}.

Proof. Since Φ is a continuous function and a homomorphism, Φ(e^{tX}) is a one-parameter
subgroup of GLn(C). Hence, by Theorem 3.10, there exists a unique matrix Z such that
(5.3)    Φ(e^{tX}) = e^{tZ}
for all t ∈ R. We define ϕ(X) = Z. If we substitute t = 1 into equation (5.3), we easily see that
Φ(e^X) = e^Z = e^{ϕ(X)}.
If another linear map ϕ′ : g → h existed satisfying Φ(e^X) = e^{ϕ′(X)} for all X ∈ g, then
e^{tϕ(X)} = Φ(e^{tX}) = e^{tϕ′(X)}.
Differentiating at t = 0 shows that ϕ(X) = ϕ′(X). We now prove linearity. Since Φ(e^{tX}) =
e^{tϕ(X)} for all t ∈ R and all X ∈ g, we get
e^{tϕ(sX)} = Φ(e^{t(sX)}) = Φ(e^{(ts)X}) = e^{tsϕ(X)}
for all s, t ∈ R. Differentiating at t = 0 yields ϕ(sX) = sϕ(X).


e^{tϕ(X+Y)} = Φ( lim_{n→∞} (e^{tX/n} e^{tY/n})ⁿ )
= lim_{n→∞} (Φ(e^{tX/n} e^{tY/n}))ⁿ = lim_{n→∞} (Φ(e^{tX/n}) Φ(e^{tY/n}))ⁿ
= lim_{n→∞} (e^{tϕ(X)/n} e^{tϕ(Y)/n})ⁿ = e^{t(ϕ(X)+ϕ(Y))},
using the continuity of Φ in the second step.
Differentiating the above result at t = 0 proves ϕ(X + Y) = ϕ(X) + ϕ(Y). We now prove
the remaining properties (1), (2), and (3) of ϕ.
(1) If A ∈ G, then
e^{tϕ(A^{−1}XA)} = e^{ϕ(tA^{−1}XA)} = Φ(e^{tA^{−1}XA}) = Φ(A^{−1}) Φ(e^{tX}) Φ(A) = Φ(A)^{−1} e^{tϕ(X)} Φ(A).
Differentiating at t = 0 gives ϕ(A^{−1}XA) = Φ(A)^{−1} ϕ(X) Φ(A).
(2) As in Proposition 5.7, we have (using the fact that a linear transformation commutes
with the derivative, along with (1))
ϕ([X, Y]) = ϕ( d/dt e^{tX} Y e^{−tX} |_{t=0} ) = d/dt ϕ(e^{tX} Y e^{−tX}) |_{t=0}
= d/dt Φ(e^{tX}) ϕ(Y) Φ(e^{−tX}) |_{t=0} = d/dt e^{tϕ(X)} ϕ(Y) e^{−tϕ(X)} |_{t=0} = [ϕ(X), ϕ(Y)].
(3) Since
Φ(e^{tX}) = e^{ϕ(tX)} = e^{tϕ(X)}
and
d/dt e^{tϕ(X)} |_{t=0} = ϕ(X),
we get (3). ■

This theorem shows that a matrix Lie group homomorphism gives rise to a unique Lie
algebra homomorphism. The converse is, in general, not true; however, we later use the Baker-
Campbell-Hausdorff formula to prove that the converse holds if we assume G is simply
connected.
Next, we define some important maps in Lie theory and for the proof of the BCH formula.
Definition 5.9 (The Adjoint Map). Let G be a matrix Lie group and g its Lie algebra. For
A ∈ G, define the adjoint map Ad_A : g → g by
Ad_A(X) = AXA^{−1}.
In what follows, let GL(g) be the space of invertible linear operators on g, with gl(g) as its
associated Lie algebra. Equivalently, GL(g) can be identified with GLm(R), and gl(g) can
be identified with Mm(R), where m is the vector space dimension of g.
Proposition 5.10. (1) For all A ∈ G, AdA ∈ GL(g).
(2) The map Ad : G → GL(g) is a Lie group homomorphism.
(3) The map AdA satisfies AdA ([X, Y ]) = [AdA (X), AdA (Y )] for all X, Y ∈ g.

Proof. (1) We have Ad_A(tX) = A(tX)A^{−1} = t(AXA^{−1}) = tAd_A(X).
Also, Ad_A(X + Y) = A(X + Y)A^{−1} = AXA^{−1} + AYA^{−1} = Ad_A(X) + Ad_A(Y).
This shows that Ad_A is linear.
To show that it is invertible, consider Ad_{A^{−1}}.
Then (Ad_{A^{−1}} ∘ Ad_A)(X) = A^{−1}(AXA^{−1})A = X, and by symmetry the composition the
other way around also gives X. Hence Ad_A ∈ GL(g).
(2) Note that the entries of the matrix of Ad_A are rational functions of the entries
of A (polynomials in the entries of A divided by det A). Thus, if A and B are sufficiently
close, then Ad_A and Ad_B are sufficiently close, implying that Ad is continuous.
We also have Ad_{AB}(X) = ABX(AB)^{−1} = A(BXB^{−1})A^{−1} = (Ad_A ∘ Ad_B)(X), hence Ad
is a homomorphism.
Hence Ad, being both continuous and a homomorphism, is a matrix Lie group homomor-
phism.
(3) We have
Ad_A([X, Y]) = A(XY − YX)A^{−1} = AXYA^{−1} − AYXA^{−1}
= AXA^{−1}AYA^{−1} − AYA^{−1}AXA^{−1} = [Ad_A(X), Ad_A(Y)]. ■

By Theorem 5.8, there exists a Lie algebra homomorphism ad : g → gl(g) such that
e^{ad_X} = Ad_{e^X}
for all X ∈ g. We shall come to what e^{ad_X} means (see (5.4)). For now, we view ad_X as an
element of Mm(R).
Proposition 5.11. For all X, Y ∈ g,
ad_X(Y) = [X, Y].
Proof. By point (3) of Theorem 5.8,
ad_X = d/dt Ad_{e^{tX}} |_{t=0}.
Hence,
ad_X(Y) = d/dt Ad_{e^{tX}}(Y) |_{t=0} = d/dt e^{tX} Y e^{−tX} |_{t=0} = [X, Y]. ■
We now revisit the meaning of e^{ad_X}. Observe that
e^{ad_X} = I + ad_X + ad_X²/2! + ··· ,
where ad_Xⁿ is ad_X composed with itself n times. We now have
(5.4)    e^{ad_X}(Y) = I(Y) + ad_X(Y) + (ad_X²/2!)(Y) + ··· = Y + [X, Y] + [X, [X, Y]]/2! + ··· .
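The identity e^{ad_X} = Ad_{e^X}, combined with the series (5.4), can be checked numerically: summing nested brackets reproduces e^X Y e^{−X}. The matrices below are hypothetical examples:

```python
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mmul(T, A)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

def bracket(X, Y):
    P, Q = mmul(X, Y), mmul(Y, X)
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

X = [[0.0, 0.4], [0.3, 0.0]]
Y = [[1.0, 0.0], [0.0, -1.0]]

# left side: Ad_{e^X}(Y) = e^X Y e^{-X}
negX = [[-x for x in row] for row in X]
lhs = mmul(mmul(expm(X), Y), expm(negX))

# right side: e^{ad_X}(Y) = Y + [X, Y] + [X, [X, Y]]/2! + ...
rhs = [row[:] for row in Y]
T = [row[:] for row in Y]            # T holds ad_X^k(Y) / k!
for k in range(1, 30):
    T = [[x / k for x in row] for row in bracket(X, T)]
    rhs = [[rhs[i][j] + T[i][j] for j in range(2)] for i in range(2)]

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(err < 1e-10)
```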

We will now interpret e^{ad_X} as a linear operator on g. In fact, e^{ad_X} ∈ GL(g), since e^T is
invertible for any linear operator T.
We now show that exp and log are local inverses in the context of Lie theory. If X ∈ g,
then exp(X) is in G by definition. Hence we can consider and analyse the map exp : g → G.
Lemma 5.12. Let G be a matrix Lie group with Lie algebra g. Suppose Bn (n ∈ N) are
elements of G such that Bn → I. Let Yn = log Bn , which is defined for sufficiently large n.
If Yn is non-zero for all n where it’s defined, and Yn /∥Yn ∥ → Y , then Y ∈ g.
Proof. For all t ∈ ℝ, tY_n/∥Y_n∥ → tY. Since B_n → I, we must also have Y_n → 0 and hence ∥Y_n∥ → 0. Hence for a fixed t,
{t/∥Y_n∥} ∥Y_n∥ → 0,
where {·} is the fractional part of a real number; this holds because the fractional part is always bounded in the interval [0, 1). We now have
(t/∥Y_n∥) ∥Y_n∥ − {t/∥Y_n∥} ∥Y_n∥ → t
⟹ (t/∥Y_n∥ − {t/∥Y_n∥}) ∥Y_n∥ → t
⟹ ⌊t/∥Y_n∥⌋ ∥Y_n∥ → t,
where ⌊·⌋ is the floor (or greatest integer) function, which always takes integer values. Thus we define the integers a_n(t) as
a_n(t) = ⌊t/∥Y_n∥⌋
for sufficiently large n (since Y_n is defined for sufficiently large n). Thus,
e^{a_n(t)Y_n} = exp((a_n(t)∥Y_n∥) · Y_n/∥Y_n∥) → e^{tY}.
On the other hand,
e^{a_n(t)Y_n} = (e^{Y_n})^{a_n(t)} = (B_n)^{a_n(t)} ∈ G
for all t ∈ ℝ, as the a_n(t) are integers. Since G is closed and e^{tY} is invertible, it must be the case that e^{tY} ∈ G for all t ∈ ℝ, which implies Y ∈ g. ■
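The counting trick above can be seen numerically. In the following sketch (an illustration under the simplest possible setup, where Y_n = Y/n for a fixed unit-norm Y; the norm used is the Frobenius norm, an assumption of this sketch), the integer powers (B_n)^{a_n(t)} approach e^{tY}:

```python
import numpy as np
from scipy.linalg import expm

# With Y_n = Y/n -> 0 and a_n(t) = floor(t/||Y_n||), the powers
# (e^{Y_n})^{a_n(t)} converge to e^{tY} for the unit vector Y = Y_n/||Y_n||.
Y0 = np.array([[0.0, -1.0], [1.0, 0.0]])
Y = Y0 / np.linalg.norm(Y0)        # normalise so that ||Y|| = 1
t = 0.73

errs = []
for n in (10, 100, 10000):
    Yn = Y / n                     # Y_n -> 0 and Y_n/||Y_n|| = Y
    an = int(np.floor(t / np.linalg.norm(Yn)))     # the integers a_n(t)
    power = np.linalg.matrix_power(expm(Yn), an)   # (B_n)^{a_n(t)}, B_n = e^{Y_n}
    errs.append(np.linalg.norm(power - expm(t * Y)))
assert errs[-1] < 1e-2             # error shrinks like 1/n
```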
Theorem 5.13. For 0 < ϵ < log 2, let Uϵ = {X ∈ Mn (C)|∥X∥ < ϵ}, and let Vϵ = exp(Uϵ ).
Let G be a matrix Lie group with Lie algebra g. Then there exists ϵ ∈ (0, log 2) such that for
any A ∈ Vϵ , we have A ∈ G if and only if log A ∈ g.
Note that if log A ∈ g and A is sufficiently close to I, then e^{log A} = A ∈ G (by the definition of g). So it suffices to prove only one direction of the above theorem, namely A ∈ V_ϵ ∩ G ⟹ log A ∈ g.
Proof. We currently view g as a real vector subspace of M_n(ℂ) (which can also be viewed as the real vector space ℝ^{2n²}). We can define a symmetric inner product on ℝ^{2n²}, like the one we defined in order to describe O_n and SO_n. Let S be the orthogonal complement of g with respect to this inner product. Thus every n × n matrix Z can be written as Z = X + Y, where X ∈ g and Y ∈ S.
AASHIR MEHROTRA
Let Φ : M_n(ℂ) → M_n(ℂ) be defined as
Φ(Z) = e^X e^Y.
We interpret the above map over ℝ^{2n²}. We have X = proj_g(Z) and Y = proj_S(Z), where proj is the projection map, which is linear and hence continuously differentiable. Since
Φ(Z) = e^{proj_g(Z)} e^{proj_S(Z)},
we can conclude that Φ is continuously differentiable. We now calculate directional derivatives of Φ at Z = 0 in the directions X ∈ g and Y ∈ S (which can be viewed as input vectors). We have
(5.5)  ∇_X Φ(0) = lim_{h→0} (e^{hX} − I)/h = d/dt|_{t=0} Φ(tX) = X.
We used the fact that the S component of tX is 0, as X ∈ g. Similarly, ∇_Y Φ(0) = Y.


Since ∇_X Φ(0) = J_0 X = X and ∇_Y Φ(0) = J_0 Y = Y, we have
∇_Z Φ(0) = J_0 Z = J_0(X + Y) = J_0 X + J_0 Y = X + Y = Z
for all Z ∈ M_n(ℂ). This shows that J_0, the total derivative of Φ at Z = 0, is the identity. Since the identity is invertible, the inverse function theorem tells us that Φ is a diffeomorphism from a neighbourhood of 0 to a neighbourhood of I. Hence there exists a local inverse of Φ in a neighbourhood of I.
If there were no ϵ ∈ (0, log 2) such that A ∈ V_ϵ ∩ G ⟹ log A ∈ g for all A, then there would exist a sequence of matrices A_n in G such that A_n → I and log A_n ∉ g for all n. By using the local inverse of Φ, we may write A_n for sufficiently large n as
A_n = e^{X_n} e^{Y_n},
where X_n ∈ g and Y_n ∈ S for all n. Since A_n → I, we have X_n, Y_n → 0. It must also be that Y_n ≠ 0 for every n, as otherwise
A_n = e^{X_n} ⟹ log A_n = X_n ∈ g,
which is a contradiction. Note that X_n will be sufficiently close to 0, and hence log(e^{X_n}) = X_n. Since X_n ∈ g and A_n ∈ G, we get that
B_n = e^{−X_n} A_n = e^{Y_n}
is in G. Since Y_n and B_n are sufficiently close to 0 and I respectively,
log B_n = Y_n.
This implies that B_n → I. Consider the unit sphere in the subspace S. This set is compact, and thus there exists a subsequence of the Y_n along which
Y_n/∥Y_n∥ → Y
for some Y in the unit sphere of S, so that ∥Y∥ = 1. By Lemma 5.12, Y ∈ g. But g and S are orthogonal complements, thus Y = 0, which contradicts the conclusion we derived above that ∥Y∥ = 1. ■
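Theorem 5.13 can be illustrated concretely. In the following sketch, G = SO(3) is taken as the matrix Lie group, and the fact that its Lie algebra so(3) consists of the real skew-symmetric matrices is an assumption of this sketch (it is not proved here):

```python
import numpy as np
from scipy.linalg import expm, logm

def in_so3(X, tol=1e-10):
    # so(3) = real skew-symmetric 3x3 matrices
    return np.linalg.norm(X + X.T) < tol

L3 = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 0]])
A = expm(0.1 * L3)                 # a rotation close to I, so A is in SO(3)
assert in_so3(logm(A).real)        # ... and log A lands in so(3)

B = np.diag([1.05, 1.0, 1.0])      # invertible and close to I, but not in SO(3)
assert not in_so3(logm(B).real)    # ... and log B is not skew-symmetric
```

Membership of a near-identity matrix in the group is thus detected at the level of its logarithm, exactly as the theorem states.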
This remarkably powerful theorem has some major consequences, particularly for con-
nected matrix Lie groups.
Corollary 5.14. If G is a connected matrix Lie group, then for every A ∈ G, there exist X_1, X_2, . . . , X_n ∈ g such that
A = e^{X_1} e^{X_2} ⋯ e^{X_n}.
Lemma 5.15. Suppose a ≤ b ∈ R, and f : [a, b]n → GLn (C) is continuous. Then for all
ϵ > 0 there exists δ > 0 such that for all s, t ∈ [a, b]n with ∥s − t∥ < δ,
∥f (s)f (t)−1 − I∥ < ϵ.
Proof. Observe that
(5.6) ∥f (s)f (t)−1 − I∥ = ∥(f (s) − f (t))f (t)−1 ∥ ≤ ∥f (s) − f (t)∥∥f (t)−1 ∥,
where we used the sub-multiplicativity of ∥ · ∥. Since [a, b]n is compact, ∥f (t)−1 ∥ is bounded
above by a constant, say by C > 0. Another consequence of [a, b]^n being compact is that f is uniformly continuous, which implies that for any ϵ > 0 there exists δ > 0 such that for all s, t ∈ [a, b]^n with ∥s − t∥ < δ, ∥f(s) − f(t)∥ < ϵ/C. Multiplying this inequality with ∥f(t)^{−1}∥ < C, we get our desired result. ■
Proof of Corollary 5.14. Let V_ϵ be as defined in Theorem 5.13. For any A ∈ G, let f : [0, 1] → G be a continuous path such that f(0) = I and f(1) = A (such a path exists since G is connected). By Lemma 5.15, there exists δ > 0 such that if |s − t| < δ, then f(s)f(t)^{−1} ∈ V_ϵ (the same bound from Lemma 5.15 also applies to f(t)^{−1}f(s)). We partition [0, 1] into n pieces, such that 1/n < δ. Thus, for i = 1, 2, . . . , n, we have f((i−1)/n)^{−1} f(i/n) ∈ V_ϵ, which implies
f((i−1)/n)^{−1} f(i/n) = e^{X_i}
for all i and some X_1, X_2, . . . , X_n ∈ g, which follows by Theorem 5.13. Hence
A = f(0)^{−1} f(1) = [f(0)^{−1} f(1/n)][f(1/n)^{−1} f(2/n)] ⋯ [f((n−1)/n)^{−1} f(1)] = e^{X_1} e^{X_2} ⋯ e^{X_n}. □
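The construction in the proof can be carried out numerically. The following sketch (an illustration in SO(3), using the simplest path f(t) = e^{t log A}, which is an assumption of this sketch rather than a general path) recovers A as a product of exponentials of Lie algebra elements:

```python
import numpy as np
from scipy.linalg import expm, logm

# Each increment f((i-1)/n)^{-1} f(i/n) is close to I, so it equals
# e^{X_i} with X_i = log of the increment, and X_i lands in so(3).
W = np.array([[0.0, -0.9, 0.4], [0.9, 0.0, -0.2], [-0.4, 0.2, 0.0]])  # skew
A = expm(W)                        # an element of SO(3)
LA = logm(A).real
f = lambda t: expm(t * LA)         # continuous path with f(0) = I, f(1) = A

n = 8
Xs = [logm(np.linalg.inv(f((i - 1) / n)) @ f(i / n)).real
      for i in range(1, n + 1)]

prod = np.eye(3)
for Xi in Xs:
    prod = prod @ expm(Xi)         # A = e^{X_1} e^{X_2} ... e^{X_n}
assert np.linalg.norm(prod - A) < 1e-10
assert all(np.linalg.norm(Xi + Xi.T) < 1e-10 for Xi in Xs)  # each X_i in so(3)
```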
6. The Baker-Campbell-Hausdorff Formula


Consider the complex function
g(z) = log z / (1 − z^{−1}).
g is analytic at z = 1, and its Taylor series there has radius of convergence 1, since the closest singularity of g to 1 is at z = 0. If
g(z) = Σ_{m=0}^{∞} a_m (z − 1)^m
for some a_m ∈ ℂ, then by absolute convergence the above series can be applied to any matrix A ∈ M_n(ℂ) such that ∥A − I∥ < 1. This can be generalized to any linear operator of a finite-dimensional complex vector space V, by identifying V with ℂ^n and hence defining the norm of a linear operator on V. To summarise, for any linear operator A of a finite-dimensional complex vector space V,
(6.1)  g(A) = log A / (I − A^{−1}) = Σ_{m=0}^{∞} a_m (A − I)^m
is well-defined provided ∥A − I∥ < 1.
We shall now state and prove the Baker-Campbell-Hausdorff Formula, which is the main tool that will be used to prove Theorem 1.1.
Theorem 6.1 (BCH Formula). For all X, Y ∈ M_n(ℂ) with ∥X∥ and ∥Y∥ sufficiently small, we have
log(e^X e^Y) = X + ∫_0^1 g(e^{ad_X} e^{t ad_Y})(Y) dt.
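Before turning to the proof, the formula can be checked numerically. The following sketch (an illustration only; the row-major vectorisation convention and the truncation order are assumptions of this sketch) represents ad_X and ad_Y as n² × n² matrices, evaluates g by its Taylor series at I, and integrates with Gauss-Legendre quadrature:

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)
n = 3
X = 0.03 * rng.standard_normal((n, n))   # small, so every series converges
Y = 0.03 * rng.standard_normal((n, n))

def ad(M):
    # ad_M as an n^2 x n^2 matrix on the row-major vectorisation:
    # vec(MZ - ZM) = (M (x) I - I (x) M^T) vec(Z)
    I = np.eye(n)
    return np.kron(M, I) - np.kron(I, M.T)

def g(A):
    # g(A) = I + sum_{m>=1} (-1)^{m+1} (A - I)^m / (m(m+1)),
    # the Taylor series of z log z/(z - 1) at z = 1, truncated
    W = A - np.eye(A.shape[0])
    out, P = np.eye(A.shape[0]), np.eye(A.shape[0])
    for m in range(1, 40):
        P = P @ W
        out = out + ((-1) ** (m + 1) / (m * (m + 1))) * P
    return out

adX, adY, y = ad(X), ad(Y), Y.reshape(-1)
nodes, weights = np.polynomial.legendre.leggauss(20)   # quadrature on [-1, 1]
integral = np.zeros(n * n)
for x, w in zip(nodes, weights):
    t = (x + 1) / 2                                    # map node to [0, 1]
    integral += (w / 2) * (g(expm(adX) @ expm(t * adY)) @ y)

bch = X + integral.reshape(n, n)
direct = logm(expm(X) @ expm(Y))
assert np.linalg.norm(bch - direct) < 1e-8
```

The right-hand side of the BCH formula agrees with the directly computed logarithm to high precision.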
Consider the derivative
Δ(X, Y) = d/dt|_{t=0} e^{X+tY},
where X, Y ∈ M_n(ℂ) and t ∈ ℝ. Also consider the directional derivative of exp : ℂ^{n²} → ℂ^{n²} in the direction of a complex n × n matrix Y acting as an n²-dimensional vector. We have:
(6.2)  ∇_Y exp(X) = lim_{t→0} (e^{X+tY} − e^X)/t = Δ(X, Y).
Therefore, we must also have Δ(X, Y) = J_X Y, where J_X is a linear operator on ℂ^{n²}. Hence for fixed X, Δ(X, Y) is a linear function of Y.
Theorem 6.2 (Derivative of exp). For all X, Y ∈ M_n(ℂ), we have
Δ(X, Y) = e^X ((I − e^{−ad_X})/ad_X)(Y) = e^X (Y − [X, Y]/2! + [X, [X, Y]]/3! − ⋯).
Note that ad_X need not be invertible; by (I − e^{−ad_X})/ad_X we mean the power series expansion of the expression, which is
(I − e^{−ad_X})/ad_X = I − ad_X/2! + ad_X²/3! − ⋯.
Lemma 6.3. If Z is a linear operator on a complex finite-dimensional vector space, then
(6.3)  lim_{n→∞} (1/n) Σ_{m=0}^{n−1} (e^{−Z/n})^m = (I − e^{−Z})/Z.
Similar to above, the RHS is to be interpreted as a power series so that it is well-defined for all operators Z.
Proof. Using the geometric series Σ_{m=0}^{n−1} T^m = (I − T^n)(I − T)^{−1} with T = e^{−Z/n}, we have
lim_{n→∞} (1/n) Σ_{m=0}^{n−1} (e^{−Z/n})^m = lim_{n→∞} (1/n)(I − e^{−Z})(I − e^{−Z/n})^{−1}
= lim_{n→∞} (I − e^{−Z}) (n((1/n)Z − (1/2!)(1/n²)Z² + ⋯))^{−1}
= (I − e^{−Z})/Z
for all operators Z none of whose eigenvalues is 0 (this condition guarantees that I − e^{−Z/n} is invertible for all sufficiently large n), equivalently all Z with det(Z) ≠ 0, as det(Z) = 0 if and only if Z has a 0 eigenvalue.
Since the set of all operators with non-zero determinant is dense in M_n, and both sides of 6.3 depend continuously on Z when viewed as power series, 6.3 is true for all operators Z. ■
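The limit in Lemma 6.3 is just a Riemann-sum approximation of ∫_0^1 e^{−tZ} dt, and it can be observed numerically. The following sketch (an illustration for an invertible Z, an assumption of this sketch) evaluates the averaged powers via the geometric-sum formula:

```python
import numpy as np
from scipy.linalg import expm

# The averaged powers of e^{-Z/n} approach (I - e^{-Z}) Z^{-1} at rate O(1/n).
Z = np.array([[1.0, 0.3, 0.0], [0.0, 0.8, -0.2], [0.1, 0.0, 1.2]])  # invertible
target = np.linalg.solve(Z, np.eye(3) - expm(-Z))   # (I - e^{-Z})/Z

n = 100000
E = expm(-Z / n)
# geometric sum: sum_{m=0}^{n-1} E^m = (I - E)^{-1} (I - E^n)
S = np.linalg.solve(np.eye(3) - E,
                    np.eye(3) - np.linalg.matrix_power(E, n)) / n
assert np.linalg.norm(S - target) < 1e-3
```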
Proof of Theorem 6.2. For every positive integer n, we have
e^{X+tY} = (exp(X/n + tY/n))^n.
Δ(X, Y) is the derivative of the above expression at t = 0. By the product rule, we have:
Δ(X, Y) = Σ_{m=0}^{n−1} (e^{X/n})^{n−m−1} [d/dt|_{t=0} exp(X/n + tY/n)] (e^{X/n})^m
= e^{(n−1)X/n} Σ_{m=0}^{n−1} (e^{X/n})^{−m} Δ(X/n, Y/n) (e^{X/n})^m
= e^{(n−1)X/n} (1/n) Σ_{m=0}^{n−1} Ad_{(e^{X/n})^{−m}}(Δ(X/n, Y))
= e^{(n−1)X/n} (1/n) Σ_{m=0}^{n−1} [exp(−ad_X/n)]^m (Δ(X/n, Y)).
Note that we also used the linearity of Δ with respect to Y above, namely Δ(X/n, Y/n) = (1/n)Δ(X/n, Y).
If we let n tend to ∞ in the final expression, we see that e^{(n−1)X/n} tends to e^X and Δ(X/n, Y) tends to Δ(0, Y) = Y. Lastly, by Lemma 6.3,
lim_{n→∞} (1/n) Σ_{m=0}^{n−1} [exp(−ad_X/n)]^m = (I − e^{−ad_X})/ad_X.
Hence we have shown that
d/dt|_{t=0} e^{X+tY} = e^X ((I − e^{−ad_X})/ad_X)(Y). ■
This means that the total derivative (or Jacobian) of exp at X equals
J_X = e^X (I − e^{−ad_X})/ad_X.
Let U be an open subset of ℝ. Suppose X : U → M_n(ℂ) is a smooth matrix-valued function. By the chain rule, we must have
d/dt e^{X(t)} = J_{X(t)}(dX/dt) = e^{X(t)} ((I − e^{−ad_{X(t)}})/ad_{X(t)})(dX/dt)
for all smooth matrix-valued functions X(t).
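The derivative formula of Theorem 6.2 can be checked numerically. The following sketch (an illustration, not a proof; the truncation order of the series is an assumption of this sketch) compares a central-difference derivative of exp with the nested-bracket series:

```python
import numpy as np
from scipy.linalg import expm

# d/dt|_{t=0} e^{X+tY} should equal e^X applied to the series
# (I - e^{-ad_X})/ad_X (Y) = Y - [X,Y]/2! + [X,[X,Y]]/3! - ...
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

# left-hand side by central difference
h = 1e-6
lhs = (expm(X + h * Y) - expm(X - h * Y)) / (2 * h)

# right-hand side by the truncated series sum_{m>=0} (-1)^m ad_X^m(Y)/(m+1)!
term, series = Y, Y.copy()
for m in range(1, 60):
    term = -(X @ term - term @ X) / (m + 1)   # (-1)^m ad_X^m(Y)/(m+1)!
    series = series + term
rhs = expm(X) @ series
assert np.linalg.norm(lhs - rhs) < 1e-5
```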
Recall that ad_X(Y) = [X, Y] for a complex matrix Y. Hence, ad_X^n(Y) equals the nested bracket expression [X, [X, . . . [X, Y]] . . .], where there are n Xs. Thus
(I − e^{−ad_X})/ad_X
is a series of nested brackets, as shown in Theorem 6.2.
If X is sufficiently small, then ad_X(Y) = XY − YX is sufficiently close to the zero operator on M_n(ℂ), and thus
(6.4)  (I − e^{−ad_X})/ad_X = I − ad_X/2! + ad_X²/3! − ⋯
approaches I as X approaches 0. Specifically, for sufficiently small X, 6.4 will be so close to I that its determinant cannot be zero (since det I = 1), and hence it will be invertible.
Proof of Theorem 6.1. For sufficiently small X, Y ∈ M_n(ℂ), let
(6.5)  Z(t) = log(e^X e^{tY}),
so that Z(t) is defined on [0, 1]. By the generalization of Theorem 6.2 above, we have
e^{−Z(t)} (d/dt) e^{Z(t)} = ((I − e^{−ad_{Z(t)}})/ad_{Z(t)})(dZ/dt).
Since e^{Z(t)} = e^X e^{tY}, we must also have
e^{−Z(t)} (d/dt) e^{Z(t)} = (e^X e^{tY})^{−1} e^X e^{tY} Y = Y.
Hence,
((I − e^{−ad_{Z(t)}})/ad_{Z(t)})(dZ/dt) = Y.
If X and Y are sufficiently close to 0, then Z(t) is also sufficiently close to 0 for 0 ≤ t ≤ 1. Hence by the arguments above,
(I − e^{−ad_{Z(t)}})/ad_{Z(t)}
is invertible for sufficiently small X and Y. Hence,
dZ/dt = ((I − e^{−ad_{Z(t)}})/ad_{Z(t)})^{−1}(Y).
Since e^{Z(t)} = e^X e^{tY}, we may use the properties of Ad and ad to conclude
Ad_{e^{Z(t)}} = Ad_{e^X} Ad_{e^{tY}}
⟹ e^{ad_{Z(t)}} = e^{ad_X} e^{t ad_Y}
⟹ ad_{Z(t)} = log(e^{ad_X} e^{t ad_Y}).
The last implication follows as X, Y, and Z(t) are small, and hence log is defined for e^{ad_{Z(t)}} and e^{ad_X} e^{t ad_Y}.
This implies that
dZ/dt = ((I − (e^{ad_X} e^{t ad_Y})^{−1}) / log(e^{ad_X} e^{t ad_Y}))^{−1}(Y).
Recall from equation 6.1 that
g(A) = log A / (I − A^{−1}),
so the operator inverted above is exactly g(e^{ad_X} e^{t ad_Y})^{−1}. Thus,
dZ/dt = g(e^{ad_X} e^{t ad_Y})(Y)
is a differential equation with initial condition Z(0) = log(e^X e^0) = X. Integrating, we conclude that
log(e^X e^Y) = Z(1) = X + ∫_0^1 g(e^{ad_X} e^{t ad_Y})(Y) dt,
which proves Theorem 6.1. ■

We conclude this section by proving the series form of the BCH formula, which involves a series of nested commutators. By using the Taylor series expansion of g at z = 1, it can be shown that
g(A) = I + (1/2)(A − I) − (1/6)(A − I)² + (1/12)(A − I)³ + ⋯
for all A such that ∥A − I∥ < 1. We also have
e^{ad_X} e^{t ad_Y} − I = (I + ad_X + ad_X²/2! + ⋯)(I + t ad_Y + t² ad_Y²/2! + ⋯) − I
= ad_X + t ad_Y + t ad_X ad_Y + ad_X²/2 + t² ad_Y²/2 + ⋯.
Since e^{ad_X} e^{t ad_Y} − I has no zeroth-order term in ad_X and ad_Y, the power (e^{ad_X} e^{t ad_Y} − I)^m has no term of order lower than m. Thus, computing g up to degree 2 gives
g(e^{ad_X} e^{t ad_Y}) = I + (1/2)(ad_X + t ad_Y + t ad_X ad_Y + ad_X²/2 + t² ad_Y²/2) − (1/6)(ad_X² + t² ad_Y² + t ad_X ad_Y + t ad_Y ad_X) + ⋯.
Applying the above expression to Y and integrating (as well as noting that ad_Y(Y) = 0, hence every term that applies ad_Y first annihilates Y), we get
log(e^X e^Y) = X + ∫_0^1 (Y + (1/2)[X, Y] + (1/4)[X, [X, Y]] − (1/6)[X, [X, Y]] − (t/6)[Y, [X, Y]] + ⋯) dt
= X + Y + (1/2)[X, Y] + (1/12)[X, [X, Y]] − (1/12)[Y, [X, Y]] + ⋯,
continuing as an infinite series of nested brackets.
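The truncated series above can be tested numerically: for small X and Y, the terms through third order should match log(e^X e^Y) up to a fourth-order error. The following sketch is an illustration of that truncation, not a derivation:

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)
br = lambda A, B: A @ B - B @ A            # commutator [A, B]
X = 0.01 * rng.standard_normal((3, 3))     # small so the truncation error
Y = 0.01 * rng.standard_normal((3, 3))     # is roughly of size 0.01^4

series = (X + Y + br(X, Y) / 2
          + br(X, br(X, Y)) / 12 - br(Y, br(X, Y)) / 12)
direct = logm(expm(X) @ expm(Y))
assert np.linalg.norm(series - direct) < 1e-6
```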
7. Simply Connected Lie Group Homomorphisms


Recall that if (X, τ ) is a topological space, then it is said to be simply connected if for all
continuous maps f0 , f1 : [0, 1] → X such that f0 (0) = f1 (0) and f0 (1) = f1 (1), there must
exist a continuous map F : [0, 1] × [0, 1] → X such that
• F (0, t) = f0 (t)
• F (1, t) = f1 (t)
• F (s, 0) = f0 (0) = f1 (0)
• F (s, 1) = f0 (1) = f1 (1)
for all s, t ∈ [0, 1].
Equivalently, for all closed loops f (t) in X, there exists a map F (s, t) that contracts f (t)
into a single point as s ranges from 0 to 1.
We now prove a partial converse to Theorem 5.8, which produces a unique canonical matrix Lie group homomorphism from a given Lie algebra homomorphism. This is Theorem 1.1. We first prove the result locally.
Definition 7.1. If G and H are matrix Lie groups, then a local homomorphism of G to H is a pair (Φ′, U), where U is a connected neighbourhood of the identity in G, and Φ′ : U → H is a continuous map such that Φ′(AB) = Φ′(A)Φ′(B) for all A, B ∈ U with AB ∈ U.
Note that U doesn’t necessarily need to be a subgroup of G.
Theorem 7.2. Let G and H be matrix Lie groups, with Lie algebras g and h. Let ϕ : g → h be a Lie algebra homomorphism. Define U_ϵ ⊆ G as
U_ϵ = {A ∈ G | ∥A − I∥ < 1 and ∥log A∥ < ϵ}.
Then there exists ϵ > 0 such that the map Φ′ : U_ϵ → H given by
Φ′(A) = e^{ϕ(log A)}
is a local homomorphism.
Note that for this theorem, G needn’t be simply connected.
Proof. Firstly, observe that Φ′ must be continuous, as it is the composition of the continuous functions log, ϕ (which is continuous as it is a linear map), and exp.
Let ϵ > 0 be small enough that Theorem 5.13 holds, and small enough that for any A, B ∈ U_ϵ, the BCH formula holds for X = log A and Y = log B, as well as for ϕ(X) and ϕ(Y) (note that X, Y ∈ g and ϕ(X), ϕ(Y) ∈ h). If AB ∈ U_ϵ, then
Φ′(AB) = Φ′(e^X e^Y) = e^{ϕ(log(e^X e^Y))}.
We now apply the BCH formula, which can be expressed as an infinite series of nested brackets. We have
ϕ(log(e^X e^Y)) = ϕ(X + Y + (1/2)[X, Y] + (1/12)[X, [X, Y]] − (1/12)[Y, [X, Y]] + ⋯)
= ϕ(X) + ϕ(Y) + (1/2)[ϕ(X), ϕ(Y)] + (1/12)[ϕ(X), [ϕ(X), ϕ(Y)]] − (1/12)[ϕ(Y), [ϕ(X), ϕ(Y)]] + ⋯
= log(e^{ϕ(X)} e^{ϕ(Y)}),
where we've used the fact that ϕ([X, Y]) = [ϕ(X), ϕ(Y)]. We now get
Φ′(AB) = e^{ϕ(log(e^X e^Y))} = e^{log(e^{ϕ(X)} e^{ϕ(Y)})} = e^{ϕ(X)} e^{ϕ(Y)} = Φ′(A)Φ′(B),
which proves the result. Note that exp ◦ log is the identity in this case, as e^{ϕ(X)} e^{ϕ(Y)} is sufficiently close to I. ■
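Theorem 7.2 can be illustrated with a concrete Lie algebra homomorphism. The following sketch uses ϕ : su(2) → so(3) sending the basis e_j = −iσ_j/2 to the standard rotation generators L_j (both triples satisfy [b_1, b_2] = b_3 cyclically); the explicit bases and the coordinate-extraction formula are assumptions of this sketch, not taken from the text:

```python
import numpy as np
from scipy.linalg import expm, logm

sig = [np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], complex)]
e = [-0.5j * s for s in sig]                 # basis of su(2)
L = [np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], float),
     np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], float),
     np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float)]  # basis of so(3)

def phi(X):
    # coordinates of X in the basis e_j (via tr(e_j e_k) = -delta_jk/2),
    # then map e_j -> L_j linearly
    c = [-2 * np.trace(X @ ej).real for ej in e]
    return sum(cj * Lj for cj, Lj in zip(c, L))

def Phi_local(A):
    return expm(phi(logm(A)))      # Phi'(A) = e^{phi(log A)}, for A near I

rng = np.random.default_rng(0)
small = lambda: sum(c * ej for c, ej in zip(0.1 * rng.standard_normal(3), e))
A, B = expm(small()), expm(small())
assert np.linalg.norm(Phi_local(A @ B) - Phi_local(A) @ Phi_local(B)) < 1e-10
```

The assertion is precisely the local homomorphism property Φ′(AB) = Φ′(A)Φ′(B) for A, B near the identity.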
We now wish to extend Φ′ to a global homomorphism on G, which will be possible thanks to the group's simple connectedness.
Theorem 7.3. Suppose G and H are matrix Lie groups with G simply connected. If (Φ′, U) is a local homomorphism of G into H, then there exists a unique Lie group homomorphism Φ : G → H such that Φ|_U = Φ′ (i.e. Φ agrees with Φ′ on U).
Proof. Let A ∈ G, and define a continuous path f : [0, 1] → G such that f(0) = I and f(1) = A. We say that a partition 0 = t_0 < t_1 < ⋯ < t_n = 1 is a good partition if for each s, t ∈ [t_i, t_{i+1}], for any 0 ≤ i ≤ n − 1,
f(s)f(t)^{−1} ∈ U.
By Lemma 5.15, a good partition exists (by making the mesh of the partition less than the δ defined in Lemma 5.15). We define
(7.1)  Φ(A) = Φ′(f(t_n)f(t_{n−1})^{−1}) ⋯ Φ′(f(t_2)f(t_1)^{−1}) Φ′(f(t_1)f(t_0)^{−1}).
We now prove that Φ(A) is independent of the partition and path chosen.
If t_0, . . . , t_n is a good partition, then inserting any s into some [t_i, t_{i+1}] leaves the partition good, as the mesh either stays the same or decreases. The effect the addition of s has on 7.1 is to replace the factor Φ′(f(t_{i+1})f(t_i)^{−1}) with
Φ′(f(t_{i+1})f(s)^{−1}) Φ′(f(s)f(t_i)^{−1}).
Since Φ′ is a local homomorphism, and f(t_{i+1})f(s)^{−1}, f(s)f(t_i)^{−1}, and f(t_{i+1})f(t_i)^{−1} are all in U, we have
Φ′(f(t_{i+1})f(t_i)^{−1}) = Φ′(f(t_{i+1})f(s)^{−1}) Φ′(f(s)f(t_i)^{−1}),
hence the value of Φ(A) is unchanged. Consequently, Φ(A) remains the same under any finite number of insertions into its defining partition. Now, any two good partitions have a common refinement, namely their union. Since Φ(A) is unchanged in passing from each partition to their union, both partitions yield the same Φ(A), and hence Φ is partition independent.
We now prove Φ(A) is path independent. Let f0 (t) and f1 (t) be continuous paths in G
joining I to A. Since G is simply-connected, there exists a function F such that for all
s, t ∈ [0, 1],
(1) F (0, t) = f0 (t) and F (1, t) = f1 (t)
(2) F (s, 0) = I and F (s, 1) = A.
As in Lemma 5.15, there exists N ∈ ℕ such that for all s, s′, t, t′ ∈ [0, 1] with |s − s′|, |t − t′| < 2/N,
F(s, t)F(s′, t′)^{−1} ∈ U.
Let g_{j,k}(t) (with j = 0, 1, . . . , N − 1 and k = 0, 1, . . . , N) be continuous paths from [0, 1] to G that connect I to A. For k ≠ 0, they are defined as:
g_{j,k}(t) = F((j+1)/N, t)  for 0 ≤ t ≤ (k−1)/N,
g_{j,k}(t) = F((j+k)/N − t, t)  for (k−1)/N ≤ t ≤ k/N,
g_{j,k}(t) = F(j/N, t)  for k/N ≤ t ≤ 1.
If k = 0, then define g_{j,k}(t) = F(j/N, t) for all t ∈ [0, 1]. In particular, we have g_{0,0} = f_0.
We now deform f_0 to g_{0,1}, then g_{0,1} to g_{0,2}, and so on until we reach g_{0,N}, which we then deform to g_{1,0}, and continue in a lexicographical manner until we reach g_{N−1,N}, which we deform into f_1. The claim is that Φ(A) is the same over all such deformations. Note that the pair g_{j,k}(t) and g_{j,k+1}(t), for some 0 ≤ j, k ≤ N − 1, are identical over [0, 1] except on the interval
((k−1)/N, (k+1)/N).
We have proven above that Φ(A) is partition independent, and we are free to choose any good partition. Let that partition be
0, 1/N, . . . , (k−1)/N, (k+1)/N, (k+2)/N, . . . , 1,
which is good by the definition of N. By 7.1, Φ(A) only depends on the values of the path at the partition points. Since g_{j,k}(t) and g_{j,k+1}(t) agree at the above partition points, both paths must yield the same Φ(A).
Similarly, the pairs g_{j,N} and g_{j+1,0} for 0 ≤ j ≤ N − 2, along with the pair g_{N−1,N} and f_1, are identical over all of [0, 1] except on
((N−1)/N, 1).
Choosing the partition
0, 1/N, . . . , (N−1)/N, 1
will suffice, as the two paths agree at all of the partition points. Hence both paths yield the same Φ(A). All of this together implies that deforming f_0 to f_1 via this process preserves Φ(A), and hence Φ is path independent.
We observe that Φ is continuous, as Φ′ is continuous and two sufficiently close points A and B admit paths f_A and f_B from I to A and B respectively whose maximum distance from one another is sufficiently small.
If f_A(t) and f_B(t) are paths connecting I to A and B respectively, define f_{AB}(t) as
f_{AB}(t) = f_B(2t)  for 0 ≤ t ≤ 1/2,
f_{AB}(t) = f_A(2t − 1)B  for 1/2 ≤ t ≤ 1.
If t_0, . . . , t_n is a good partition of f_A and s_0, . . . , s_m is a good partition of f_B, then
s_0/2, . . . , s_m/2, (1 + t_0)/2, . . . , (1 + t_n)/2
is a good partition of f_{AB}, noting that s_m/2 = 1/2 = (1 + t_0)/2. Then
Φ(AB) = Φ′(f_{AB}((1+t_n)/2) f_{AB}((1+t_{n−1})/2)^{−1}) ⋯ Φ′(f_{AB}((1+t_1)/2) f_{AB}((1+t_0)/2)^{−1}) · Φ′(f_{AB}((1+t_0)/2) f_{AB}(s_m/2)^{−1}) · Φ′(f_{AB}(s_m/2) f_{AB}(s_{m−1}/2)^{−1}) ⋯ Φ′(f_{AB}(s_1/2) f_{AB}(s_0/2)^{−1})
= Φ(A)Φ(B),
since s_m/2 = (1 + t_0)/2, and thus
f_{AB}((1 + t_0)/2) f_{AB}(s_m/2)^{−1} = I,
while f_{AB}((1+t_i)/2) = f_A(t_i)B gives f_{AB}((1+t_{i+1})/2) f_{AB}((1+t_i)/2)^{−1} = f_A(t_{i+1}) f_A(t_i)^{−1}, and f_{AB}(s_j/2) = f_B(s_j).
Hence Φ(AB) = Φ(A)Φ(B), and Φ is a Lie group homomorphism.


Lastly, we prove that Φ agrees with Φ′ on U. Let A ∈ U, and let t_0, . . . , t_n be a good partition for a path f from I to A lying in U (f exists since we assume U is connected). We use induction to prove Φ(A) = Φ′(A) = Φ′(f(t_n)).
For the base case, we note that
Φ(f(t_1)) = Φ′(f(t_1)f(t_0)^{−1}) = Φ′(f(t_1)),
as t_0 = 0 and f(t_0) = I.
For the inductive step, assume Φ(f(t_j)) = Φ′(f(t_j)). We have
Φ(f(t_{j+1})) = Φ′(f(t_{j+1})f(t_j)^{−1}) Φ′(f(t_j)f(t_{j−1})^{−1}) ⋯ Φ′(f(t_1)) = Φ′(f(t_{j+1})f(t_j)^{−1}) Φ(f(t_j)) = Φ′(f(t_{j+1})f(t_j)^{−1}) Φ′(f(t_j)) = Φ′(f(t_{j+1})),
where we used the fact that Φ′ is a local homomorphism. Hence Φ(A) = Φ′(A) for all A ∈ U. ■

Proof of Theorem 1.1. For existence, let Φ′ be the local homomorphism in Theorem 7.2, and Φ the global homomorphism in Theorem 7.3. For X ∈ g, the matrix e^{X/m} will be in U for sufficiently large m, and
Φ(e^{X/m}) = Φ′(e^{X/m}) = e^{ϕ(X)/m}.
Since Φ is a homomorphism,
Φ(e^X) = Φ(e^{X/m})^m = e^{ϕ(X)},
as desired.
For uniqueness, suppose Φ_1 and Φ_2 are two such homomorphisms, and let A ∈ G. Since G is connected, by Corollary 5.14, we have A = e^{X_1} ⋯ e^{X_n} for some X_i ∈ g. Since Φ_1 and Φ_2 both satisfy Φ(e^{X_i}) = e^{ϕ(X_i)}, we have
Φ_1(A) = e^{ϕ(X_1)} ⋯ e^{ϕ(X_n)} = Φ_2(A).
Hence Φ is unique, and our main theorem has been proven. ■
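As a concluding illustration of Theorem 1.1: SU(2) is simply connected, and the Lie algebra isomorphism ϕ : su(2) → so(3) lifts to the covering homomorphism Φ(U)_{jk} = tr(σ_j U σ_k U*)/2, which satisfies Φ(e^X) = e^{ϕ(X)} even for X far from 0. The explicit bases and the formula for Φ are assumptions of this sketch, not derived in the text:

```python
import numpy as np
from scipy.linalg import expm

sig = [np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], complex)]
e = [-0.5j * s for s in sig]                 # basis of su(2)
L = [np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], float),
     np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], float),
     np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float)]  # basis of so(3)

def phi(X):
    # Lie algebra isomorphism su(2) -> so(3), e_j -> L_j
    c = [-2 * np.trace(X @ ej).real for ej in e]
    return sum(cj * Lj for cj, Lj in zip(c, L))

def Phi(U):
    # covering map SU(2) -> SO(3)
    return np.array([[0.5 * np.trace(sig[j] @ U @ sig[k] @ U.conj().T).real
                      for k in range(3)] for j in range(3)])

X = 2.0 * e[0] - 1.5 * e[1] + 0.7 * e[2]     # deliberately not small
assert np.linalg.norm(Phi(expm(X)) - expm(phi(X))) < 1e-10
```

The identity Φ(e^X) = e^{ϕ(X)} holds globally, exactly as the existence part of the proof asserts.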

8. Conclusion and Acknowledgements


The above exploration into Lie Theory can be applied to the representation theory of Lie
groups and algebras, as representations are a special case of homomorphisms.
Using the Baker-Campbell-Hausdorff formula, one can also prove that if G is a matrix Lie group with Lie algebra g, then for any subalgebra h ⊆ g, there exists a unique connected Lie subgroup H of G such that the Lie algebra of H is h.
By the above result and Ado's theorem (which states that every Lie algebra is isomorphic to a Lie algebra of matrices with the Lie bracket being the commutator), it follows that for every real Lie algebra g, there exists a matrix Lie group G such that g is isomorphic to the Lie algebra of G.
I would like to thank my instructor Simon Rubinstein-Salzedo, my peer Atticus Kuhn and
my mentor Annika Mauro for their guidance and feedback on this paper, which was done
for an independent research writing class of Euler Circle.
References
[1] Lars V. Ahlfors. Complex analysis. 1979.
[2] Alen Alexanderian. Matrix Lie groups and their Lie algebras.
[3] Robert Gilmore. Lie groups, Lie algebras, and some of their applications. Courier Corporation, 2006.
[4] Brian C. Hall. Lie groups, Lie algebras, and representations. Springer, 2013.
[5] J. A. Oteo. The Baker–Campbell–Hausdorff formula and nested commutator identities. Journal of Mathematical Physics, 32(2):419–424, 1991.
