Every Lie Algebra Homomorphism Gives Rise to a Canonical Lie Group Homomorphism if the Domain Group is Simply Connected
AASHIR MEHROTRA
Abstract. In this paper, we explore matrix Lie groups, which are closed subgroups of the group of
invertible real or complex matrices. The Lie group structure can be used to prove properties of many
important groups, such as the unitary and orthogonal groups, among others. We explore matrix Lie
groups along with Lie algebras, non-associative algebras equipped with a bilinear, skew-symmetric
product satisfying a property known as the Jacobi identity. Lie groups and Lie algebras go hand in
hand, as locally they can be mapped bijectively to one another by the matrix exponential and
logarithm functions, which we define and whose properties we prove in this paper. We prove that,
when the domain group is simply connected, every Lie algebra homomorphism gives rise to a unique
Lie group homomorphism compatible with the exponential map, in the sense that Φ(e^X) = e^{ϕ(X)}.
In order to prove this result, we first prove the Baker-Campbell-Hausdorff formula, which shows that
log(e^X e^Y) can be expressed as an infinite series of nested brackets in X and Y, provided both
matrices are sufficiently small in norm.
1. Introduction
Lie groups are groups that are also differentiable manifolds, such that the product and
inversion operations are smooth. In this paper, we focus on a special class of Lie groups,
namely matrix Lie groups.
We define a topology on the space of n × n complex matrices, which allows us to define
topological properties such as connectedness and simple connectedness for groups of matrices.
By using a matrix norm and absolute convergence, it is possible to define the exponential
of a matrix X to be the convergent infinite sum obtained from the complex Taylor series
expansion of e^z, with z replaced by X. While the scalar and matrix exponentials share many
properties, it is, in general, not true that e^{X+Y} = e^X e^Y for complex matrices X and
Y. The matrix logarithm can also be defined for matrices, though just as in the complex
case, we must restrict the domain to the open ball ∥A − I∥ < 1 in order to keep the
logarithm from being multi-valued. Just as in the complex case, it is possible to prove
that the exponential and logarithm functions are inverses of one another in local neighbourhoods
of I and 0.
A (real) Lie algebra is a (real) vector space g along with a product map [·, ·] : g × g → g
that is bilinear, skew-symmetric, and satisfies the Jacobi identity:
$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0.$$
The Lie bracket [·, ·] need not be associative.
Every matrix Lie group G has an associated Lie algebra, which is the set of matrices X
such that e^{tX} ∈ G for all real numbers t. We show that the Lie algebra defined in this
way is indeed a real Lie algebra in the sense above. We also prove that for any matrix Lie
group homomorphism (that is, a continuous group homomorphism) Φ : G → H between
two matrix Lie groups G and H, there exists a unique Lie algebra homomorphism (which is
a linear map that preserves the Lie bracket) ϕ : g → h, where g and h are the Lie algebras
of G and H respectively. Moreover, the functions Φ and ϕ satisfy
$$e^{\phi(X)} = \Phi(e^X)$$
for all X ∈ g.
Our main theorem for this paper is a partial converse to this result, which provides a
canonical matrix Lie group homomorphism given a Lie algebra homomorphism, provided that
the domain matrix Lie group G is simply connected.
Theorem 1.1. Let G and H be matrix Lie groups with associated Lie algebras g and h
respectively. If ϕ : g → h is a Lie algebra homomorphism, and G is simply connected, then
there exists a unique Lie group homomorphism Φ : G → H such that
$$\Phi(e^X) = e^{\phi(X)}$$
for all X ∈ g.
In order to prove this result, we first prove the Baker-Campbell-Hausdorff formula (or
BCH formula for short), which has as consequences several other key results in Lie theory. Let
$$g(A) = \frac{\log A}{1 - A^{-1}},$$
which is defined for ∥A − I∥ < 1. Then the BCH formula states that if X, Y are n × n complex
matrices with ∥X∥ and ∥Y∥ sufficiently small, then
$$\log(e^X e^Y) = X + \int_0^1 g\!\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y)\, dt,$$
where ad_Y is the linear map sending Z to [Y, Z]. The above integral can be expressed
in the form of a series of nested brackets in X and Y.
This series formulation of the BCH formula is useful in the proof of Theorem 1.1, as it is
used to prove that a "local homomorphism" with the desired properties exists given a Lie
algebra homomorphism. To extend this local homomorphism to a global one, the condition
that G is simply connected is required.
2. Lie Groups
We first start with definitions concerning matrix Lie groups. In what follows we write
M_n(F) for the ring of n × n matrices over the field F, and GL_n(F) for the group of n × n
invertible matrices over F. Also, log denotes the natural logarithm (base e).
Definition 2.1. If A ∈ M_n(C), then we define the norm of A to be
(2.1) $$\|A\| = \sup_{x \in \mathbb{C}^n \setminus \{0\}} \frac{\|Ax\|}{\|x\|},$$
where $\|v\| = \sqrt{\sum_{i=1}^n |v_i|^2}$ for $v \in \mathbb{C}^n$.
Note that other norms can be given to matrices, such as the largest absolute value of an entry,
or the square root of the sum of the squares of the magnitudes of the entries (the analogue of
the vector norm). Nonetheless, all such norms can be proven to be equivalent to the norm
defined above, so there is no ambiguity when using ∥ · ∥.
The expression 2.1 satisfies the usual properties of a norm (such as the triangle inequality).
However, the norm is not multiplicative, but rather sub-multiplicative.
Claim 2.2. For all A, B ∈ Mn (C), ∥AB∥ ≤ ∥A∥∥B∥.
Proof. We have
$$\|AB\| = \sup_{v \neq 0} \frac{\|ABv\|}{\|v\|} = \sup_{Bv \neq 0} \frac{\|ABv\|}{\|v\|} = \sup_{Bv \neq 0} \frac{\|ABv\|}{\|Bv\|}\cdot\frac{\|Bv\|}{\|v\|} \leq \sup_{w \neq 0} \frac{\|Aw\|}{\|w\|}\,\sup_{v \neq 0} \frac{\|Bv\|}{\|v\|} = \|A\|\|B\|. \qquad \blacksquare$$
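The operator norm in Definition 2.1 coincides with the largest singular value of the matrix. As a hedged illustration (assuming NumPy is available; this sketch is not part of the paper's development), the following snippet checks the sub-multiplicativity of Claim 2.2 on random complex matrices.

```python
# Minimal numerical check of Definition 2.1 and Claim 2.2: the operator norm of
# (2.1) is the spectral norm, which numpy.linalg.norm computes with ord=2.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

op_norm = lambda M: np.linalg.norm(M, 2)           # spectral (operator) norm
print(op_norm(A @ B) <= op_norm(A) * op_norm(B))   # True: sub-multiplicativity
```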
The norm gives rise to a metric, and hence a topology, on M_n(C). We can thus give any
subset of M_n(C), including GL_n(C), the subspace topology.
Definition 2.3. If G is a subgroup of GL_n(C), then G is said to be a matrix Lie group if it is
a closed subset of GL_n(C) in the subspace topology. In other words, given any sequence of
matrices in G converging in M_n(C), its limit must either not be invertible or remain in G.
For example, the set of all n × n complex matrices with determinant 1 is a matrix
Lie group. This is because, along with being a group, the set is the preimage of {1} under
the (continuous) determinant map, and hence is closed. This set is denoted by SL_n(C).
Of course GL_n(C) is also a matrix Lie group, as are GL_n(R) and SL_n(R), since real
matrices still satisfy the group axioms and R is a closed subset of C.
Suppose we have a vector space (either R^n or C^n) with the following bilinear form:
$$\langle x, y\rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
The group of matrices that preserve this form is called the n × n orthogonal group
over R (or C), denoted by O_n(R) or O_n(C).
If we impose the additional condition that the determinant must be 1, we get the special
orthogonal groups SOn (R) and SOn (C).
Another product, this time applicable only to C^n, is the Hermitian inner product:
$$\langle x, y\rangle = x_1 \bar{y}_1 + x_2 \bar{y}_2 + \cdots + x_n \bar{y}_n.$$
The group of matrices that preserve this inner product is known as the unitary group, or
U (n). Imposing the determinant to be 1, we get the special unitary group SU (n).
Topological properties such as compactness, connectedness, and simple connectedness apply
to matrix Lie groups. Note that the topology we use on matrices is the norm topology, which
agrees with the usual topology on C^{n^2}; for matrix Lie groups, connectedness turns out to
be equivalent to path-connectedness.
To conclude this section, we define matrix Lie group homomorphisms.
Definition 2.4. Let G and H be matrix Lie groups. A matrix Lie group homomorphism
is a map Φ : G → H such that
• Φ is a group homomorphism
• Φ is continuous.
A matrix Lie group isomorphism is a matrix Lie group homomorphism that is a group
isomorphism and a homeomorphism.
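As a simple illustration of Definition 2.4, the determinant map det : GL_n(C) → GL_1(C) ≅ C^× is a matrix Lie group homomorphism: it is multiplicative, and it is continuous because it is a polynomial in the matrix entries. The sketch below (a hypothetical check using NumPy, not part of the paper) verifies multiplicativity numerically.

```python
# Hedged sketch: det is multiplicative, hence a group homomorphism GL_n(C) -> C^*.
import numpy as np

rng = np.random.default_rng(1)
A = np.eye(3) + 0.5 * rng.standard_normal((3, 3))   # random matrices; det is
B = np.eye(3) + 0.5 * rng.standard_normal((3, 3))   # multiplicative regardless
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))
```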
for any n ∈ N. Actually, we may conclude it to be true for n ∈ Z by using point (2) of this
proposition. Thus
$$e^{S^{-1}XS} = \sum_{n=0}^{\infty} \frac{(S^{-1}XS)^n}{n!} = \sum_{n=0}^{\infty} \frac{S^{-1}X^nS}{n!} = S^{-1}\left[\sum_{n=0}^{\infty} \frac{X^n}{n!}\right]S = S^{-1}e^X S.$$
■
Suppose A ∈ Mn (C) is diagonalizable, meaning there exists an invertible matrix S and a
diagonal matrix Λ such that A = S −1 ΛS. This set of diagonalizable matrices can be proven
to be dense in Mn (C).
As shown above, e^A = S^{-1} e^Λ S. This makes computing exp for diagonalizable matrices very
convenient, as e^Λ is the diagonal matrix obtained by exponentiating the diagonal entries of Λ.
Also, this representation implies that the eigenvalues of e^A are the exponentials of the
eigenvalues of A.
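As a hedged, purely computational illustration (assuming SciPy; not part of the paper's argument), the sketch below computes e^A for a diagonalizable matrix through its eigendecomposition and compares the result with a general-purpose matrix exponential. Note that NumPy factors A as V Λ V^{-1}, with the change of basis written on the left rather than the right as in the text.

```python
# Sketch: for diagonalizable A = V @ diag(evals) @ inv(V), the exponential is
# V @ diag(exp(evals)) @ inv(V); compare against scipy.linalg.expm.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])                    # diagonalizable, eigenvalues 4 and -1
evals, V = np.linalg.eig(A)
expA = V @ np.diag(np.exp(evals)) @ np.linalg.inv(V)
print(np.allclose(expA, expm(A)))             # True
```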
We now define the logarithm for matrices. Recall for complex numbers, we define log z by
the Taylor series centered at z = 1:
$$\log z = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{(z-1)^n}{n}.$$
By the root test, its radius of convergence is 1. Let
$$\log A = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{(A-I)^n}{n}$$
for a matrix A, whenever the series converges. By comparison with the scalar series, the series
above converges absolutely for all A such that ∥A − I∥ < 1.
Note that the series may still converge for some A outside the aforementioned ball; for example,
when A − I is nilpotent, the series for log A terminates and hence converges.
By a similar argument to the exponential case, log A is continuous on the ball ∥A − I∥ < 1, as
each summand is bounded in norm by ∥A − I∥^n / n with ∥A − I∥ < 1. Since such a series
converges (uniformly on closed sub-balls), the series for log A is continuous on the aforementioned ball.
We now prove that exp and log are local inverses of one another.
Lemma 3.3. (1) For z ∈ C such that |z − 1| < 1, e^{log z} = z.
(2) For u ∈ C such that |u| < log 2, we have |e^u − 1| < 1 and log(e^u) = u.
Proof. We have exp(log z) = z for all positive reals z, specifically z ∈ (0, 2). Since 1 ∈
(0, 2) is an accumulation point, the Identity theorem from complex analysis applies, and
exp(log z) = z for all z such that |z − 1| < 1.
Likewise, if |u| < log 2, then
$$|e^u - 1| = \left|u + \frac{u^2}{2!} + \frac{u^3}{3!} + \cdots\right| \le |u| + \frac{|u|^2}{2!} + \frac{|u|^3}{3!} + \cdots = e^{|u|} - 1 < 1.$$
Hence, log(exp(u)) is well-defined for |u| < log 2. We have log(exp(u)) = u for all u ∈
(− log 2, log 2). Since 0 ∈ (− log 2, log 2) is an accumulation point, log(exp(u)) = u for all
|u| < log 2. ■
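The matrix analogue of Lemma 3.3, used later through Proposition 3.4, can also be observed numerically. The following hedged sketch (assuming SciPy, whose logm returns the principal matrix logarithm) verifies that exp and log invert each other for a small matrix.

```python
# Hedged check: for ||X|| < log 2, log(exp(X)) = X and exp(log(A)) = A.
import numpy as np
from scipy.linalg import expm, logm

X = 0.1 * np.array([[0.0, 1.0],
                    [-1.0, 0.0]])        # small matrix, well inside ||X|| < log 2
A = expm(X)                              # then ||A - I|| < 1
print(np.allclose(logm(A), X))           # True
print(np.allclose(expm(logm(A)), A))     # True
```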
Proof. By multiplying the power series of e^{X/m} and e^{Y/m}, we see that all except three terms
will be asymptotically bounded by O(1/m²). Specifically,
$$e^{X/m} e^{Y/m} = I + \frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right).$$
As m → ∞, e^{X/m} e^{Y/m} gets sufficiently close to I, hence falling into the domain of the
logarithm. Also, ∥X/m + Y/m + O(1/m²)∥ < 1/2 if m is sufficiently large. Thus we get (using Claim 3.6)
$$\log(e^{X/m} e^{Y/m}) = \log\!\left(I + \frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right)\right) = \frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right) + O\!\left(\left\|\frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right)\right\|^2\right) = \frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right).$$
Exponentiating the logarithm and letting m tend to ∞, we get
$$e^{X/m} e^{Y/m} = \exp\!\left(\frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right)\right)
\implies \lim_{m\to\infty} \left(e^{X/m} e^{Y/m}\right)^m = \lim_{m\to\infty} \exp\!\left(X + Y + O\!\left(\frac{1}{m}\right)\right) = \exp(X + Y),$$
which is what was desired. ■
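The convergence just proved can be seen numerically; the hedged sketch below (assuming SciPy; illustrative only) shows the error ∥(e^{X/m}e^{Y/m})^m − e^{X+Y}∥ shrinking roughly like 1/m even though e^X e^Y ≠ e^{X+Y} here.

```python
# Hedged check of the Lie product formula: (e^{X/m} e^{Y/m})^m -> e^{X+Y}.
import numpy as np
from scipy.linalg import expm

X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])   # XY != YX, so e^X e^Y != e^{X+Y}
for m in (1, 10, 100, 1000):
    approx = np.linalg.matrix_power(expm(X / m) @ expm(Y / m), m)
    print(m, np.linalg.norm(approx - expm(X + Y)))   # error decays roughly as 1/m
```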
We now consider the differentiation and integration of matrix-valued functions. The derivative
of a function A : R → M_n(C) is defined entrywise:
$$\left(\frac{dA}{dt}\right)_{ij} = \frac{dA_{ij}}{dt}.$$
The linearity and product rule follow the usual proofs from the scalar case.
Theorem 3.8. For X ∈ M_n(C), e^{tX} is a smooth function of t, and
$$\frac{d}{dt}\, e^{tX} = X e^{tX} = e^{tX} X.$$
In particular,
$$\left.\frac{d}{dt}\, e^{tX}\right|_{t=0} = X.$$
■
The integral of a matrix-valued function is defined similarly as the derivative, by taking
the integral of each entry. This will be used in the Baker-Campbell-Hausdorff formula.
To conclude this section, we define and prove a result on one-parameter subgroups, which
will be useful later.
Definition 3.9. A continuous function A : R → GL_n(C) is called a one-parameter sub-
group of GL_n(C) if
• A(0) = I
• A(t + s) = A(t)A(s) for all s, t ∈ R
Theorem 3.10 (Characterisation of One-Parameter Subgroups). If A is a one-parameter
subgroup of GLn (C), then there exists a unique X ∈ Mn (C) such that
A(t) = etX
for all t ∈ R.
Lemma 3.11. Fix ϵ < log 2. Let B_{ϵ/2} be the open ball of radius ϵ/2 centered at the origin,
and let U = exp(B_{ϵ/2}). Then for every B ∈ U, there exists a unique C ∈ U such that
C² = B, and it is given by C = exp(½ log B).
Proof. Since ∥B − I∥ < 1, log B is defined, and by Proposition 3.4 it lies in B_{ϵ/2}, so that
C = exp(½ log B) ∈ U, and by Proposition 3.4, C² = exp(2 · ½ log B) = exp(log B) = B.
In order to establish uniqueness, suppose C′ ∈ U also satisfies (C′)² = B. Let Y = log C′,
so that by Proposition 3.4, Y ∈ B_{ϵ/2}, which in turn implies 2Y ∈ B_ϵ. We also have
exp Y = C′ and hence
$$\exp(2Y) = (C')^2 = B = \exp(\log B).$$
Note that log B ∈ B_{ϵ/2} ⊂ B_ϵ. By point (2) of Proposition 3.4, exp is injective on B_ϵ, and
since exp(2Y) = exp(log B), we get 2Y = log B. Hence
$$C' = \exp(Y) = \exp\!\left(\tfrac{1}{2}\log B\right) = C. \qquad \blacksquare$$
Proof of Theorem 3.10. The fact that X is unique, if it exists, is clear, as necessarily
$$X = \left.\frac{d}{dt}\, A(t)\right|_{t=0}.$$
In order to prove existence, let B_{ϵ/2} and U be as described in the previous lemma. Note that
U is an open neighbourhood of I, as it is the preimage of B_{ϵ/2} under the (continuous) function
log. Hence, by the continuity of A, there exists t_0 > 0 such that A(t) ∈ U for all t ∈ R with
|t| ≤ t_0. Let
$$X = \frac{1}{t_0}\log(A(t_0)),$$
so that t_0 X = log(A(t_0)). This means that t_0 X ∈ B_{ϵ/2}, and thus
$$e^{t_0 X} = A(t_0).$$
Now, by the definition of t_0, A(t_0/2) ∈ U, with A(t_0/2)² = A(t_0) by the axioms of a one-
parameter subgroup. By Lemma 3.11, A(t_0) has a unique square root in U, given by exp(t_0X/2).
Thus, we have
$$A\!\left(\frac{t_0}{2}\right) = e^{\frac{t_0 X}{2}}.$$
By applying this argument inductively, we get, for all positive integers k,
$$A\!\left(\frac{t_0}{2^k}\right) = e^{\frac{t_0 X}{2^k}}.$$
Also, for all integers n,
$$A\!\left(\frac{n t_0}{2^k}\right) = A\!\left(\frac{t_0}{2^k}\right)^{\!n} = e^{\frac{n t_0 X}{2^k}}.$$
Hence A(t) = exp(tX) for all t = (n/2^k) t_0, with n ∈ Z and k ∈ N. Since all such t are dense
in R, and A along with exp are continuous, we can conclude that A(t) = exp(tX) for all
t ∈ R. ■
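As a quick, hedged illustration of Theorem 3.10 (assuming SciPy; the exact statement uses the true derivative, approximated here by a finite difference), the generator of a one-parameter subgroup A(t) = e^{tX} can be recovered from the derivative of A at t = 0.

```python
# Hedged sketch: recover the generator X of a one-parameter subgroup from dA/dt at 0.
import numpy as np
from scipy.linalg import expm

X = np.array([[0.0, -2.0],
              [2.0,  0.0]])
A = lambda t: expm(t * X)                     # A(0) = I and A(s + t) = A(s) A(t)
h = 1e-6
X_approx = (A(h) - A(0.0)) / h                # finite-difference derivative at 0
print(np.allclose(X_approx, X, atol=1e-4))    # True up to O(h) error
```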
4. Directional Derivatives
The total derivative of a function f : U → C^n (where U is an open subset of some C^m) at
a point x ∈ U is the unique complex linear transformation J_x : C^m → C^n such that
$$\lim_{y \to x} \frac{\|f(y) - f(x) - J_x(y - x)\|}{\|y - x\|} = 0.$$
The above should hold regardless of how y approaches x. The directional derivative of f
with respect to a vector v ∈ C^m is defined by the limit
$$\nabla_v f(x) = \lim_{h \to 0} \frac{f(x + hv) - f(x)}{h}$$
for h ∈ R. If we set y(t) = x + tv (t ∈ R), then, whenever f is differentiable at x, we have
$$\nabla_v f(x) = \left.\frac{d}{dt}\, f(y(t))\right|_{t=0} = J_x v.$$
We use the directional derivative in the proofs for a couple of crucial theorems, including
the Baker-Campbell-Hausdorff Formula.
5. Lie Algebra
Definition 5.1. A Lie algebra g is a real (or complex) vector space together with a bilinear,
skew-symmetric map [·, ·] : g × g → g that satisfies the Jacobi identity, meaning
$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0$$
for all X, Y, Z ∈ g.
A subalgebra h ⊆ g is a subspace such that [h_1, h_2] ∈ h for all h_1, h_2 ∈ h.
For this paper, we only consider real Lie algebras.
One example of a Lie algebra is R3 with the cross product. The validity of the axioms can
be verified, but won’t be relevant for this paper.
The more important example of a Lie algebra is that of Mn (C) with an operation known
as the commutator:
[X, Y ] = XY − Y X.
The reason for the name commutator is that [X, Y] equals 0 if and only if XY = YX, i.e. X
and Y commute.
We shall denote the above Lie algebra as gln (C), whose naming shall become clear in due
course.
Note that gln (C) can be interpreted as either a real or complex Lie algebra, which are
distinct. In this paper, we’ll consider the real Lie algebra.
Suppose sln (C) (whose naming will also be explained) is the set of n × n complex matrices
with trace zero. Then it can be checked that sln (C) along with the commutator as a Lie
bracket is a real Lie algebra.
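The check comes down to the standard fact that every commutator is trace-free: by the cyclic property of the trace,
$$\operatorname{tr}([X, Y]) = \operatorname{tr}(XY) - \operatorname{tr}(YX) = 0,$$
so sl_n(C) is closed under the bracket, while the remaining axioms are inherited from gl_n(C).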
Definition 5.2. If g and h are real (or complex) Lie algebras, then a linear map ϕ : g → h
is called a Lie algebra homomorphism if [ϕ(X), ϕ(Y )] = ϕ([X, Y ]) for all X, Y ∈ g. If
the linear map is also invertible, then ϕ is a Lie algebra isomorphism.
Definition 5.3. If g1 and g2 are real (or complex) Lie algebras, then define the direct sum
Lie algebra g of g1 and g2 to be the vector direct sum of g1 and g2 , along with the bracket
given by:
(5.1) [(X1 , X2 ), (Y1 , Y2 )] = ([X1 , Y1 ], [X2 , Y2 ]),
for X1 , Y1 ∈ g1 and X2 , Y2 ∈ g2 . We denote the direct sum as g1 ⊕ g2 .
It can be shown that 5.1 adheres to the Lie algebra axioms.
Definition 5.4. If g is a Lie algebra, and g1 and g2 are Lie subalgebra of g, then g decom-
poses as the Lie algebra direct sum of g1 and g2 if g is the vector space direct sum of
g1 and g2 , and [X1 , X2 ] = 0 for all X1 ∈ g1 and X2 ∈ g2 .
Claim 5.5. If g decomposes as the Lie algebra direct sum of g_1 and g_2, then the direct sum
g_1 ⊕ g_2 is isomorphic to g.
Proof. Since g = g_1 ⊕ g_2 as vector spaces, every A ∈ g can be written uniquely as A = X_1 + Y_1,
where X_1 ∈ g_1 and Y_1 ∈ g_2, and likewise B = X_2 + Y_2. Since elements of g_1 commute with
elements of g_2,
(5.2) $$[A, B] = [X_1 + Y_1, X_2 + Y_2] = [X_1, X_2] + [X_1, Y_2] + [Y_1, X_2] + [Y_1, Y_2] = [X_1, X_2] + [Y_1, Y_2]$$
for all A, B ∈ g. Equation (5.2) is essentially identical to (5.1), and hence the map
(X, Y) ↦ X + Y provides the required Lie algebra isomorphism from g_1 ⊕ g_2 to g. ■
We now explain the connection between matrix Lie groups and Lie algebras.
Definition 5.6. Let G be a matrix Lie group. The Lie algebra of G, denoted by g, is the set
of all complex matrices (not necessarily invertible) X such that e^{tX} ∈ G for all t ∈ R.
Another way of formulating the above definition is that X ∈ g exactly when the entire
one-parameter subgroup t ↦ e^{tX} lies in G.
This is the explanation behind the notation gl_n(C) and sl_n(C). Note that in physics, the
definition demands e^{itX} to be in G, rather than e^{tX}, causing the formulations of some Lie
algebras to be off by a factor of i.
It is possible to define the Lie algebra for a general Lie group (as the tangent space of G
at the identity), though that won’t be relevant here.
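As a hedged, concrete illustration of Definition 5.6 (a numerical sketch assuming SciPy, not part of the paper's exposition): a skew-symmetric matrix X satisfies e^{tX} ∈ SO(3) for every real t, so X belongs to the Lie algebra of SO(3).

```python
# Hedged sketch: if X^T = -X, then e^{tX} is orthogonal with determinant 1, so
# the whole one-parameter subgroup t -> e^{tX} lies in SO(3).
import numpy as np
from scipy.linalg import expm

X = np.array([[ 0.0, -1.0,  2.0],
              [ 1.0,  0.0,  0.5],
              [-2.0, -0.5,  0.0]])           # skew-symmetric: X.T == -X
for t in (-1.3, 0.2, 4.0):
    A = expm(t * X)
    print(np.allclose(A.T @ A, np.eye(3)), np.isclose(np.linalg.det(A), 1.0))
```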
Proposition 5.7. Let G be a matrix Lie group, with Lie algebra g. For all X, Y ∈ g, A ∈ G,
and s ∈ R, we have
(1) A−1 XA ∈ g
(2) sX ∈ g
(3) X + Y ∈ g
(4) XY − Y X ∈ g.
Thus, g is a real Lie algebra in the way defined earlier.
Proof. (1) Recalling point (4) from Proposition 3.2,
$$e^{t(A^{-1}XA)} = A^{-1} e^{tX} A \in G$$
for all t ∈ R, as all three of A^{-1}, e^{tX}, and A are in G. Hence A^{-1}XA ∈ g.
Now (e^{tX/n} e^{tY/n})^n ∈ G for all n, and since G is closed in GL_n(C), the limit e^{t(X+Y)} is
either in G or is not invertible. Due to point (2) in Proposition 3.2, we know e^{t(X+Y)}
is invertible, and hence e^{t(X+Y)} ∈ G.
(4) By the product rule and Theorem 3.8,
$$\left.\frac{d}{dt}\, e^{tX} Y e^{-tX}\right|_{t=0} = (XY)e^0 + (e^0 Y)(-X) = XY - YX.$$
Now
$$XY - YX = \lim_{h \to 0} \frac{e^{hX} Y e^{-hX} - Y}{h}$$
for h ∈ R. By point (1) of this proposition, e^{hX} Y e^{-hX} ∈ g. Since we have already
proven that g is a real vector space, the difference quotient (e^{hX} Y e^{-hX} − Y)/h is in g for
all h ≠ 0. Since g, being a finite-dimensional real subspace of M_n(C), is a closed set, the
limit XY − YX must be in g.
■
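For a concrete instance of point (4), take for granted the standard fact that the Lie algebra of O_n(R) consists of the skew-symmetric matrices; the commutator of two skew-symmetric matrices is again skew-symmetric by a direct computation:
$$(XY - YX)^{\mathsf{T}} = Y^{\mathsf{T}} X^{\mathsf{T}} - X^{\mathsf{T}} Y^{\mathsf{T}} = YX - XY = -(XY - YX).$$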
We now prove a converse to our main result (Theorem 1.1), though this result is more
generally applicable.
Theorem 5.8. Let G and H be matrix Lie groups, with corresponding Lie algebras g and h
respectively. If Φ : G → H is a matrix Lie group homomorphism, then there exists a Lie
algebra homomorphism ϕ : g → h such that
$$\Phi(e^X) = e^{\phi(X)}$$
for all X ∈ g. Additionally, for all X, Y ∈ g and A ∈ G,
(1) ϕ(A^{-1}XA) = Φ(A)^{-1} ϕ(X) Φ(A)
(2) ϕ([X, Y]) = [ϕ(X), ϕ(Y)]
(3) ϕ(X) = \left.\frac{d}{dt}\, \Phi(e^{tX})\right|_{t=0}.
Differentiating the above result at t = 0 proves ϕ(X + Y ) = ϕ(X) + ϕ(Y ). We now prove
the remaining properties (1), (2), and (3) of ϕ.
(1) If A ∈ G, then
$$e^{t\phi(A^{-1}XA)} = e^{\phi(tA^{-1}XA)} = \Phi\!\left(e^{tA^{-1}XA}\right) = \Phi(A^{-1})\,\Phi(e^{tX})\,\Phi(A) = \Phi(A)^{-1}\, e^{t\phi(X)}\, \Phi(A).$$
Differentiating both sides at t = 0 gives (1).
(2) As in Proposition 5.7, we have (using the fact that a linear transformation commutes
with the derivative, along with point (1))
$$\phi([X, Y]) = \phi\!\left(\left.\frac{d}{dt}\, e^{tX} Y e^{-tX}\right|_{t=0}\right) = \left.\frac{d}{dt}\, \phi\!\left(e^{tX} Y e^{-tX}\right)\right|_{t=0} = \left.\frac{d}{dt}\, \Phi(e^{tX})\, \phi(Y)\, \Phi(e^{tX})^{-1}\right|_{t=0} = \left.\frac{d}{dt}\, e^{t\phi(X)} \phi(Y) e^{-t\phi(X)}\right|_{t=0} = [\phi(X), \phi(Y)].$$
(3) Since
$$\Phi(e^{tX}) = e^{\phi(tX)} = e^{t\phi(X)}$$
and
$$\left.\frac{d}{dt}\, e^{t\phi(X)}\right|_{t=0} = \phi(X),$$
we get (3).
■
This theorem shows that a Lie group homomorphism gives rise to a unique Lie algebra
homomorphism. The converse is, in general, not true; however, we later use the Baker-
Campbell-Hausdorff formula to prove that the converse holds if we assume G is simply
connected.
Next, we define some important maps in Lie theory and for the proof of the BCH formula.
Definition 5.9 (The Adjoint Map). Let G be a matrix Lie group and g its Lie algebra. For
A ∈ G, define the adjoint map Ad_A : g → g by
$$\mathrm{Ad}_A(X) = A X A^{-1}.$$
In what follows, let GL(g) be the group of invertible linear operators on g, with gl(g) as its
associated Lie algebra. Equivalently, GL(g) can be identified with GL_m(R), and gl(g) with
M_m(R), where m is the dimension of g as a real vector space.
Proposition 5.10. (1) For all A ∈ G, AdA ∈ GL(g).
(2) The map Ad : G → GL(g) is a Lie group homomorphism.
(3) The map AdA satisfies AdA ([X, Y ]) = [AdA (X), AdA (Y )] for all X, Y ∈ g.
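Point (3) is a purely algebraic identity and can be sanity-checked numerically; the hedged sketch below (assuming NumPy; illustrative only) verifies that conjugation preserves the commutator.

```python
# Hedged check of Proposition 5.10(3): Ad_A([X, Y]) = [Ad_A(X), Ad_A(Y)].
import numpy as np

rng = np.random.default_rng(2)
A = np.eye(3) + 0.3 * rng.standard_normal((3, 3))     # invertible (with high probability)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

Ad = lambda A, Z: A @ Z @ np.linalg.inv(A)
bracket = lambda P, Q: P @ Q - Q @ P
print(np.allclose(Ad(A, bracket(X, Y)), bracket(Ad(A, X), Ad(A, Y))))   # True
```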
We will now interpret e^{ad_X} as a linear operator on g. In fact e^{ad_X} ∈ GL(g), since the
exponential of any linear operator is invertible.
We now show that exp and log are local inverses in the context of Lie theory. If X ∈ g,
then exp(X) is in G by definition. Hence we can consider and analyse the map exp : g → G.
Lemma 5.12. Let G be a matrix Lie group with Lie algebra g. Suppose Bn (n ∈ N) are
elements of G such that Bn → I. Let Yn = log Bn , which is defined for sufficiently large n.
If Yn is non-zero for all n where it’s defined, and Yn /∥Yn ∥ → Y , then Y ∈ g.
Proof. For all t ∈ R, t Y_n/∥Y_n∥ → tY. Since B_n → I, we must also have Y_n → 0 and hence
∥Y_n∥ → 0. Hence, for a fixed t,
$$\left\{\frac{t}{\|Y_n\|}\right\} \|Y_n\| \to 0,$$
where {·} is the fractional part of a real number. This is because the expression inside the
{·} above is bounded in the interval [0, 1). We now have
$$\frac{t}{\|Y_n\|}\|Y_n\| - \left\{\frac{t}{\|Y_n\|}\right\}\|Y_n\| \to t
\;\implies\; \left(\frac{t}{\|Y_n\|} - \left\{\frac{t}{\|Y_n\|}\right\}\right)\|Y_n\| \to t
\;\implies\; \left\lfloor\frac{t}{\|Y_n\|}\right\rfloor\|Y_n\| \to t,$$
where ⌊·⌋ is the floor (or greatest integer) function, which always takes integer values. Thus we
define the integers a_n(t) as
$$a_n(t) = \left\lfloor\frac{t}{\|Y_n\|}\right\rfloor$$
for sufficiently large n (since Y_n is defined for sufficiently large n). Thus,
$$e^{a_n(t) Y_n} = \exp\!\left(a_n(t)\|Y_n\|\,\frac{Y_n}{\|Y_n\|}\right) \to e^{tY}.$$
On the other hand,
$$e^{a_n(t) Y_n} = \left(e^{Y_n}\right)^{a_n(t)} = (B_n)^{a_n(t)} \in G$$
for all t ∈ R, as the a_n(t) are integers. Since G is closed and e^{tY} is invertible, it must be the
case that e^{tY} ∈ G for all t ∈ R, which implies Y ∈ g. ■
Theorem 5.13. For 0 < ϵ < log 2, let Uϵ = {X ∈ Mn (C)|∥X∥ < ϵ}, and let Vϵ = exp(Uϵ ).
Let G be a matrix Lie group with Lie algebra g. Then there exists ϵ ∈ (0, log 2) such that for
any A ∈ Vϵ , we have A ∈ G if and only if log A ∈ g.
Note that if log A ∈ g and A is sufficiently close to I, then e^{log A} = A ∈ G (by the definition of
g). So it suffices to prove only one direction of the above theorem, namely that A ∈ V_ϵ ∩ G
implies log A ∈ g.
Proof. We currently view g as a real vector subspace of M_n(C) (which can itself be viewed as
the real vector space R^{2n^2}). We can define a symmetric inner product on R^{2n^2}, like the one
we defined in order to describe O_n and SO_n. Let S be the orthogonal complement of g with
respect to this inner product. Thus every n × n matrix Z can be written as Z = X + Y,
where X ∈ g and Y ∈ S.
Corollary 5.14. If G is a connected matrix Lie group, then for every A ∈ G, there exists
X1 , X2 , . . . , Xn ∈ g such that
A = eX1 eX2 . . . eXn .
Lemma 5.15. Suppose a ≤ b are real numbers, and f : [a, b]^k → GL_n(C) is continuous. Then for all
ϵ > 0 there exists δ > 0 such that for all s, t ∈ [a, b]^k with ∥s − t∥ < δ,
$$\|f(s) f(t)^{-1} - I\| < \epsilon.$$
Proof. Observe that
(5.6) ∥f (s)f (t)−1 − I∥ = ∥(f (s) − f (t))f (t)−1 ∥ ≤ ∥f (s) − f (t)∥∥f (t)−1 ∥,
where we used the sub-multiplicativity of ∥ · ∥. Since [a, b]^k is compact, ∥f(t)^{-1}∥ is bounded
above by a constant, say by C > 0. Another consequence of [a, b]^k being compact is that
f is uniformly continuous, which implies that for any ϵ > 0 there exists δ > 0 such that for
all s, t ∈ [a, b]^k with ∥s − t∥ < δ, ∥f(s) − f(t)∥ < ϵ/C. Multiplying this inequality by
∥f(t)^{-1}∥ ≤ C, we get the desired result. ■
Proof of Corollary 5.14. Let V_ϵ be as defined in Theorem 5.13. For any A ∈ G, let f :
[0, 1] → G be a continuous path such that f(0) = I and f(1) = A. By Lemma 5.15 (whose
proof equally bounds ∥f(t)^{-1}f(s) − I∥ = ∥f(t)^{-1}(f(s) − f(t))∥), there exists δ > 0 such that
if |s − t| < δ, then f(t)^{-1}f(s) ∈ V_ϵ. We partition [0, 1] into m pieces of equal length, with
1/m < δ. Thus, for i = 1, 2, . . . , m, we have f((i−1)/m)^{-1} f(i/m) ∈ V_ϵ, which by Theorem 5.13
implies
$$f\!\left(\frac{i-1}{m}\right)^{-1} f\!\left(\frac{i}{m}\right) = e^{X_i}$$
for some X_1, X_2, . . . , X_m ∈ g. Hence
$$A = f(0)^{-1} f(1) = \left[f(0)^{-1} f\!\left(\tfrac{1}{m}\right)\right]\left[f\!\left(\tfrac{1}{m}\right)^{-1} f\!\left(\tfrac{2}{m}\right)\right] \cdots \left[f\!\left(\tfrac{m-1}{m}\right)^{-1} f(1)\right] = e^{X_1} e^{X_2} \cdots e^{X_m}. \qquad \blacksquare$$
6. The Baker-Campbell-Hausdorff Formula
Recall from the introduction the function g(A) = log A/(1 − A^{-1}), which is well-defined provided
∥A − I∥ < 1.
We shall now state and prove the Baker-Campbell-Hausdorff formula, which is the main
tool used to prove Theorem 1.1.
Theorem 6.1 (BCH Formula). For all X, Y ∈ M_n(C) with ∥X∥ and ∥Y∥ sufficiently small,
we must have
$$\log(e^X e^Y) = X + \int_0^1 g\!\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y)\, dt.$$
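Although the proof below is analytic, the formula can be probed numerically: the following hedged sketch (assuming SciPy; illustrative only) compares log(e^X e^Y) with the first few nested-bracket terms that the integral expands into for small X and Y.

```python
# Hedged numerical check of the BCH expansion for small X, Y:
# log(e^X e^Y) ~ X + Y + (1/2)[X,Y] + (1/12)[X,[X,Y]] - (1/12)[Y,[X,Y]].
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(3)
X = 0.05 * rng.standard_normal((3, 3))
Y = 0.05 * rng.standard_normal((3, 3))
br = lambda P, Q: P @ Q - Q @ P

lhs = logm(expm(X) @ expm(Y))
rhs = X + Y + 0.5 * br(X, Y) + br(X, br(X, Y)) / 12.0 - br(Y, br(X, Y)) / 12.0
print(np.linalg.norm(lhs - rhs))     # small: the discrepancy is of fourth order
```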
Consider the derivative
$$\Delta(X, Y) = \left.\frac{d}{dt}\, e^{X + tY}\right|_{t=0},$$
where X, Y ∈ M_n(C) and t ∈ R.
Also consider the directional derivative of exp : C^{n^2} → C^{n^2} in the direction of a complex
n × n matrix Y, regarded as an n²-dimensional vector.
We have:
(6.2) $$\nabla_Y \exp(X) = \lim_{t \to 0} \frac{e^{X + tY} - e^X}{t} = \Delta(X, Y).$$
Therefore, we must also have ∆(X, Y) = J_X Y, where J_X is a linear operator on C^{n^2}. Hence
$$\lim_{n\to\infty} \frac{1}{n}\sum_{m=0}^{n-1}\left(e^{-Z/n}\right)^m = \lim_{n\to\infty} \frac{I - e^{-Z}}{n\left(\frac{1}{n}Z - \frac{1}{2!}\,\frac{1}{n^2}Z^2 + \cdots\right)} = \frac{I - e^{-Z}}{Z}$$
for all operators Z with det(Z) ≠ 0 and none of Z's eigenvalues equal to 0 (the second condition
is needed because I − e^{-Z/n} fails to be invertible when Z has a 0 eigenvalue). Actually, the first
condition alone is enough, as det(Z) = 0 if and only if Z has a 0 eigenvalue.
Since the set of all operators with non-zero determinant is dense in M_n(C), we see that
(6.3) is true for all operators (when both sides are viewed as power series). ■
Proof of Theorem 6.2. For every positive integer n, we have
$$e^{X + tY} = \left[\exp\!\left(\frac{X}{n} + \frac{tY}{n}\right)\right]^n.$$
∆(X, Y) is the derivative of the above expression at t = 0. By the product rule, we have:
$$\Delta(X, Y) = \sum_{m=0}^{n-1} \left(e^{X/n}\right)^{n-m-1} \left[\left.\frac{d}{dt}\exp\!\left(\frac{X}{n} + t\,\frac{Y}{n}\right)\right|_{t=0}\right]\left(e^{X/n}\right)^m$$
$$= e^{(n-1)X/n} \sum_{m=0}^{n-1} \left(e^{X/n}\right)^{-m} \Delta\!\left(\frac{X}{n}, \frac{Y}{n}\right)\left(e^{X/n}\right)^m$$
$$= e^{(n-1)X/n}\,\frac{1}{n} \sum_{m=0}^{n-1} \mathrm{Ad}_{(e^{X/n})^{-m}}\!\left(\Delta\!\left(\frac{X}{n}, Y\right)\right)$$
$$= e^{(n-1)X/n}\,\frac{1}{n} \sum_{m=0}^{n-1} \left[\exp\!\left(\frac{-\mathrm{ad}_X}{n}\right)\right]^m \left(\Delta\!\left(\frac{X}{n}, Y\right)\right).$$
Note that we also used the linearity of ∆ with respect to Y above.
If we let n tend to ∞ in the final expression, we see that e^{(n−1)X/n} tends to e^X, and ∆(X/n, Y)
tends to ∆(0, Y) = Y. Lastly,
$$\lim_{n\to\infty} \frac{1}{n}\sum_{m=0}^{n-1}\left[\exp\!\left(\frac{-\mathrm{ad}_X}{n}\right)\right]^m = \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}.$$
Note that ad_X^n(Y) is the nested bracket expression [X, [X, . . . , [X, Y]] . . .], where there are n X's. Thus
$$\frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}(Y)$$
is a series of nested brackets, as shown in (6.2).
If X is sufficiently small, then ad_X(Y) = XY − YX is small for every Y of bounded norm, so
ad_X is close to the zero operator on M_n(C), and thus
(6.4) $$\frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} = I - \frac{\mathrm{ad}_X}{2!} + \frac{\mathrm{ad}_X^2}{3!} - \cdots$$
approaches I as X approaches 0. Specifically, for sufficiently small X, (6.4) will be so close to
I that its determinant cannot be zero (since det I = 1), and hence it will be invertible.
Proof of Theorem 6.1. For sufficiently small X, Y ∈ M_n(C), let
(6.5) $$Z(t) = \log(e^X e^{tY}),$$
so that Z(t) is defined for t ∈ [0, 1]. By the generalization of Theorem 6.2, we have
$$e^{-Z(t)}\,\frac{d}{dt}\, e^{Z(t)} = \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}}\left(\frac{dZ}{dt}\right).$$
Since e^{Z(t)} = e^X e^{tY}, we must also have
$$e^{-Z(t)}\,\frac{d}{dt}\, e^{Z(t)} = \left(e^X e^{tY}\right)^{-1} e^X e^{tY}\, Y = Y.$$
Hence,
$$\frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}}\left(\frac{dZ}{dt}\right) = Y.$$
If X and Y are sufficiently close to 0, then Z(t) will also be sufficiently close to 0 for all
0 ≤ t ≤ 1. Hence, by the arguments above,
$$\frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}}$$
is invertible for sufficiently small X and Y.
Hence,
$$\frac{dZ}{dt} = \left(\frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}}\right)^{-1}(Y).$$
Since e^{Z(t)} = e^X e^{tY}, we may use the properties of Ad and ad to conclude
$$\mathrm{Ad}_{e^{Z(t)}} = \mathrm{Ad}_{e^X}\,\mathrm{Ad}_{e^{tY}}
\implies e^{\mathrm{ad}_{Z(t)}} = e^{\mathrm{ad}_X}\, e^{t\,\mathrm{ad}_Y}
\implies \mathrm{ad}_{Z(t)} = \log\!\left(e^{\mathrm{ad}_X}\, e^{t\,\mathrm{ad}_Y}\right).$$
The last implication follows as X, Y, and Z(t) are small, and hence log is defined at
e^{\mathrm{ad}_{Z(t)}} and at e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}.
This implies that
$$\frac{dZ}{dt} = \left(\frac{I - \left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)^{-1}}{\log\!\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)}\right)^{-1}(Y).$$
Recalling the definition of g, the operator on the right-hand side is exactly g(e^{ad_X} e^{t ad_Y}), so
that dZ/dt = g(e^{ad_X} e^{t ad_Y})(Y). Integrating from t = 0 to t = 1 and using Z(0) = X and
Z(1) = log(e^X e^Y) now gives the claimed formula. ■
We conclude this section by proving the series form of the BCH formula, which involves a
series of nested commutators. By using the Taylor series expansion of g at z = 1, it can be
shown that
$$g(A) = 1 + \frac{1}{2}(A - I) - \frac{1}{6}(A - I)^2 + \frac{1}{12}(A - I)^3 + \cdots.$$
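Carrying out the integration term by term with this expansion yields the familiar leading terms of the series (a standard computation, stated here for reference rather than derived in full):
$$\log(e^X e^Y) = X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}[X, [X, Y]] - \tfrac{1}{12}[Y, [X, Y]] + \cdots.$$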
where we've used the fact that ϕ([X, Y]) = [ϕ(X), ϕ(Y)]. We now get
$$\Phi'(AB) = e^{\phi(\log(e^X e^Y))} = e^{\log\left(e^{\phi(X)} e^{\phi(Y)}\right)} = e^{\phi(X)} e^{\phi(Y)} = \Phi'(A)\,\Phi'(B),$$
which proves the result. Note that exp ∘ log is the identity in this case, as e^{ϕ(X)} e^{ϕ(Y)} is
sufficiently close to I. ■
We now wish to extend Φ′ into a global homomorphism over G, which will be possible
thanks to the group’s simple-connectedness.
Theorem 7.3. Suppose G and H are matrix Lie groups with G simply connected. If (Φ′, U)
is a local homomorphism of G into H, then there exists a unique Lie group homomorphism
Φ : G → H such that Φ|_U = Φ′ (i.e. Φ agrees with Φ′ on U).
Proof. Let A ∈ G, and define a continuous path f : [0, 1] → G such that f(0) = I and
f(1) = A. We say that a partition 0 = t_0 < t_1 < · · · < t_n = 1 is a good partition if for
each s, t ∈ [t_i, t_{i+1}], for any 0 ≤ i ≤ n − 1,
$$f(s)\, f(t)^{-1} \in U.$$
By Lemma 5.15, a good partition exists (by making the mesh of the partition less than the
δ given in Lemma 5.15). We define
(7.1) $$\Phi(A) = \Phi'\!\left(f(t_n) f(t_{n-1})^{-1}\right) \cdots \Phi'\!\left(f(t_2) f(t_1)^{-1}\right)\Phi'\!\left(f(t_1) f(t_0)^{-1}\right).$$
We now prove that Φ(A) is independent of the partition or path chosen.
If t_0, . . . , t_n is a good partition, then inserting any s into some interval [t_i, t_{i+1}] yields a
partition that is still good, as the mesh would either stay the same or decrease. The effect the
addition of s has on (7.1) is to replace the factor Φ′(f(t_{i+1}) f(t_i)^{-1}) with
Φ′ (f (ti+1 )f (s)−1 )Φ′ (f (s)f (ti )−1 ).
Since Φ′ is a local homomorphism, and f (ti+1 )f (s)−1 , f (s)f (ti )−1 , and f (ti+1 )f (ti )−1 are all
in U , we have
Φ′ (f (ti+1 )f (ti )−1 ) = Φ′ (f (ti+1 )f (s)−1 )Φ′ (f (s)f (ti )−1 ),
hence the value of Φ(A) is unchanged. Hence, Φ(A) will remain the same under any finite
number of insertions into its defining partition. Now, any two good partitions have a common
refinement, namely their union. Since Φ(A) is unchanged in passing from each partition to their union,
we conclude that both partitions yield the same Φ(A), and hence Φ(A) is partition independent.
We now prove Φ(A) is path independent. Let f_0(t) and f_1(t) be continuous paths in G
joining I to A. Since G is simply connected, there exists a continuous function F : [0, 1] × [0, 1] → G
such that for all s, t ∈ [0, 1],
(1) F (0, t) = f0 (t) and F (1, t) = f1 (t)
(2) F (s, 0) = I and F (s, 1) = A.
As in Lemma 5.15, there exists N ∈ N such that for all s, s′, t, t′ ∈ [0, 1] with |s − s′|, |t − t′| <
2/N,
$$F(s, t)\, F(s', t')^{-1} \in U.$$
Let g_{j,k}(t) (with j = 0, 1, . . . , N − 1 and k = 0, 1, . . . , N) be continuous paths from [0, 1]
to G that connect I to A. They are defined as:
$$g_{j,k}(t) = \begin{cases} F\!\left(\frac{j+1}{N},\, t\right) & 0 \le t \le \frac{k-1}{N} \\[2pt] F\!\left(\frac{j+k}{N} - t,\, t\right) & \frac{k-1}{N} \le t \le \frac{k}{N} \\[2pt] F\!\left(\frac{j}{N},\, t\right) & \frac{k}{N} \le t \le 1 \end{cases}$$
when k ≠ 0. If k = 0, then define g_{j,0}(t) = F(j/N, t) for all t ∈ [0, 1]. In particular, we have
g_{0,0} = f_0.
We now deform f_0 to g_{0,1}, then g_{0,1} to g_{0,2}, and so on until we reach g_{0,N}, which we then deform
to g_{1,0}, and continue in this lexicographic manner until we reach g_{N−1,N}, which we deform
into f_1. The claim is that Φ(A) is the same along all such deformations. Note that the paths
g_{j,k}(t) and g_{j,k+1}(t), for 0 ≤ j ≤ N − 1 and 0 ≤ k ≤ N − 1, are identical on [0, 1] except on the interval
$$\left(\frac{k-1}{N},\ \frac{k+1}{N}\right).$$
We have proven above that Φ(A) is partition independent, and we are free to choose any
good partition. Let that partition be
$$0,\ \frac{1}{N},\ \cdots,\ \frac{k-1}{N},\ \frac{k+1}{N},\ \frac{k+2}{N},\ \cdots,\ 1,$$
which is good by the definition of N. By (7.1), Φ(A) depends only on the values of the path at
the partition points. Since g_{j,k}(t) and g_{j,k+1}(t) agree at the above partition points, both paths
yield the same Φ(A).
Similarly, the pairs g_{j,N} and g_{j+1,0} (for 0 ≤ j ≤ N − 2), along with the pair g_{N−1,N} and
f_1, are identical over all of [0, 1] except on
$$\left(\frac{N-1}{N},\ 1\right).$$
Choosing the partition
$$0,\ \frac{1}{N},\ \cdots,\ \frac{N-1}{N},\ 1$$
will suffice, as g_{j,N} and g_{j+1,0} agree at all of the partition points. Hence both paths yield the
same Φ(A). All of this together implies that deforming f_0 to f_1 via this process preserves
Φ(A), and hence Φ is path independent.
We observe that Φ is continuous, as Φ′ is continuous and two sufficiently close points
A and B admit paths f_A and f_B from I to A and B respectively whose maximum distance
from one another is arbitrarily small.
If f_A(t) and f_B(t) are paths connecting I to A and B respectively, define f_{AB}(t) as
$$f_{AB}(t) = \begin{cases} f_B(2t) & 0 \le t \le \tfrac{1}{2} \\ f_A(2t - 1)\,B & \tfrac{1}{2} \le t \le 1. \end{cases}$$
If t_0, . . . , t_n is a good partition for f_A and s_0, . . . , s_m is a good partition for f_B, then
$$\frac{s_0}{2},\ \cdots,\ \frac{s_m}{2},\ \frac{1 + t_0}{2},\ \cdots,\ \frac{1 + t_n}{2}$$
is a good partition for f_{AB}, noting that s_m/2 = 1/2 = (1 + t_0)/2. Then
$$\Phi(AB) = \Phi'\!\left(f_{AB}\!\left(\tfrac{1+t_n}{2}\right) f_{AB}\!\left(\tfrac{1+t_{n-1}}{2}\right)^{-1}\right) \cdots \Phi'\!\left(f_{AB}\!\left(\tfrac{1+t_1}{2}\right) f_{AB}\!\left(\tfrac{1+t_0}{2}\right)^{-1}\right) \Phi'\!\left(f_{AB}\!\left(\tfrac{1+t_0}{2}\right) f_{AB}\!\left(\tfrac{s_m}{2}\right)^{-1}\right)$$
$$\cdot\, \Phi'\!\left(f_{AB}\!\left(\tfrac{s_m}{2}\right) f_{AB}\!\left(\tfrac{s_{m-1}}{2}\right)^{-1}\right) \cdots \Phi'\!\left(f_{AB}\!\left(\tfrac{s_2}{2}\right) f_{AB}\!\left(\tfrac{s_1}{2}\right)^{-1}\right) \Phi'\!\left(f_{AB}\!\left(\tfrac{s_1}{2}\right) f_{AB}\!\left(\tfrac{s_0}{2}\right)^{-1}\right) = \Phi(A)\,\Phi(B),$$
since s_m/2 = (1 + t_0)/2, and thus
$$f_{AB}\!\left(\frac{1 + t_0}{2}\right) f_{AB}\!\left(\frac{s_m}{2}\right)^{-1} = I,$$
while f_{AB}((1+t_i)/2) f_{AB}((1+t_{i-1})/2)^{-1} = f_A(t_i) f_A(t_{i-1})^{-1} and f_{AB}(s_i/2) f_{AB}(s_{i-1}/2)^{-1} = f_B(s_i) f_B(s_{i-1})^{-1}, so the two blocks of factors compute Φ(A) and Φ(B) respectively.
In the computations above we used the fact that Φ′ is a local homomorphism. Finally, one
checks that Φ(A) = Φ′(A) for all A ∈ U. ■
Proof of Theorem 1.1. For existence, let Φ′ be the local homomorphism of Theorem 7.2, and
Φ the global homomorphism of Theorem 7.3. For X ∈ g, the matrix e^{X/m} will be in U for
sufficiently large m, and
$$\Phi(e^{X/m}) = \Phi'(e^{X/m}) = e^{\phi(X)/m}.$$
Since Φ is a homomorphism,
$$\Phi(e^X) = \Phi(e^{X/m})^m = \left(e^{\phi(X)/m}\right)^m = e^{\phi(X)},$$
as desired.
For uniqueness, suppose Φ_1 and Φ_2 are two such homomorphisms and let A ∈ G. Since G is
simply connected, it is in particular connected, so by Corollary 5.14 we may write
A = e^{X_1} · · · e^{X_n} for some X_i ∈ g. Hence, for i = 1, 2,
$$\Phi_i(A) = \Phi_i(e^{X_1}) \cdots \Phi_i(e^{X_n}) = e^{\phi(X_1)} \cdots e^{\phi(X_n)},$$
so Φ_1(A) = Φ_2(A). Hence Φ is unique, and our main theorem has been proven. ■
References
[1] Lars V. Ahlfors. Complex Analysis. 1979.
[2] Alen Alexanderian. Matrix Lie groups and their Lie algebras.
[3] Robert Gilmore. Lie Groups, Lie Algebras, and Some of Their Applications. Courier Corporation, 2006.
[4] Brian C. Hall. Lie Groups, Lie Algebras, and Representations. Springer, 2013.
[5] J. A. Oteo. The Baker–Campbell–Hausdorff formula and nested commutator identities. Journal of Mathematical Physics, 32(2):419–424, 1991.