The injectivity radii of the unitary groups

February 22, 2023

This post was prompted by a user question raised in this issue on Manifolds.jl. Ronny Bergmann implemented these results in Manifolds.jl in this PR.

Introduction

Suppose you hold up an object and begin rotating it at a fixed speed. After some amount of time, you stop rotating it. If I know the starting and final pose of the object and the amount of time that has passed, under what conditions can I also tell you the rotational velocity (both direction and speed) of the spinning object?

There's really just one condition: that the rotational distance (i.e. angle) between the initial and final positions does not exceed some maximum value. For example, if you rotate 180° in any direction, it's impossible for me to know whether you rotated the object clockwise or counterclockwise. Worse, if you rotate a full 360°, I couldn't know whether you didn't move the object at all or performed a whole rotation or a million rotations. The maximum distance within which one can still infer the rotational velocity from the initial and final orientations is called the injectivity radius[1].

The question motivating this post is, what is the injectivity radius for the rotations in not just 2 dimensions and 3 dimensions but any dimension? And more generally, what is it for the unitary group and its most common subgroups? Since these groups are featured in many (all?) of the introductory texts on Lie groups and manifolds, and since the injectivity radius is a basic property introduced in differential geometry textbooks, I was surprised that I could not find a single reference giving these radii for these groups.

In this post I'll work out these radii. Marvelously, we don't need any differential geometry or group theory to do this, just linear algebra! Nevertheless, this post assumes familiarity with these topics and, for the sake of space, will not try to define all common terms or notation.[2]

The injectivity radius

Consider a point $p$ on some manifold $\mathcal{M}$ with tangent vectors $X, Y \in T_p \mathcal{M}$. Assume a Riemannian metric $g_p$ defining an inner product $g_p\colon (X, Y) \mapsto \left\langle X, Y \right\rangle_g$, which induces a norm $\left\lVert X \right\rVert_g$.

Let's denote the exponential map $\exp_p\colon X \mapsto q$ for $q \in \mathcal{M}$ and the logarithmic map $\log_p\colon q \mapsto Y$. The injectivity radius at $p$ is defined as the norm of the smallest $X$ for which $X \ne Y = \log_p (\exp_p X)$, or in other words, the smallest $X$ for which the logarithmic map is no longer the inverse of the exponential map. We further define the global injectivity radius of $\mathcal{M}$ as the infimum of the injectivity radii at all points on the manifold.

Notationally, we define this global injectivity radius as

$$
\operatorname{inj}^-_{\mathcal{M}} = \inf_{(p, X) \in T \mathcal{M}}\{\left\lVert X \right\rVert_g \mid \log_p(\exp_p(X)) \ne X\}.
$$

We'll also consider the related supremum

$$
\operatorname{inj}^+_{\mathcal{M}} = \sup_{(p, X) \in T \mathcal{M}}\{\left\lVert X \right\rVert_g \mid \log_p(\exp_p(X)) = X\}.
$$

These two quantities form the lower and upper bound radii, respectively, of two geodesic balls within which the exponential map is invertible.

The unitary group(s)

The unitary group $\mathrm{U}(n, \mathbb{F})$ over some number system $\mathbb{F}$ is the group of all $n \times n$ matrices $p \in \mathbb{F}^{n \times n}$ for which $p^\mathrm{H} p = I_n$, where ${\cdot}^\mathrm{H}$ denotes the matrix adjoint. $\mathbb{F}$ could be the real numbers $\mathbb{R}$, complex numbers $\mathbb{C}$, or quaternions $\mathbb{H}$.

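We will also deal with the following subgroups: the orthogonal group $\mathrm{O}(n) = \mathrm{U}(n, \mathbb{R})$, the special orthogonal group $\mathrm{SO}(n)$ (orthogonal matrices with determinant $+1$), and the special unitary group $\mathrm{SU}(n)$ (complex unitary matrices with determinant $+1$).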

We will focus on the real and complex fields, but the unitary quaternionic case immediately follows from the complex one.

Relevant geometric properties

The unitary group is a compact group and, when equipped with the Frobenius inner product $g\colon (X, Y) \mapsto \left\langle X, Y \right\rangle_\mathrm{F}$, becomes a Riemannian manifold.[3] The Riemannian exponential $\exp_p$ and logarithm $\log_p$ are related to the Lie group exponential $\operatorname{Exp}$ and logarithm $\operatorname{Log}$, which for these matrix groups are just the matrix exponential and logarithm.

$$
\begin{aligned}
\exp_p(X) &= p\operatorname{Exp}(p^{\mathrm{H}}X)\\
\log_p(q) &= p\operatorname{Log}(p^{\mathrm{H}}q)
\end{aligned}
$$

Thus, to find the injectivity radius at any point $p$, we only need to work out when the matrix exponential is inverted by the principal matrix logarithm.
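To make this concrete, here is a minimal sketch for $\mathrm{SO}(2)$ using only the Python standard library. The helper names `so2_exp`, `so2_log`, and `round_trip` are hypothetical and chosen for illustration. The closed forms used are the rotation matrix for the exponential and `atan2` for the principal logarithm.

```python
import math

def so2_exp(theta):
    """Matrix exponential of the skew-symmetric matrix [[0, -theta], [theta, 0]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def so2_log(q):
    """Principal logarithm of a 2x2 rotation matrix, returned as its angle."""
    # atan2 returns the angle on the principal branch [-pi, pi].
    return math.atan2(q[1][0], q[0][0])

def round_trip(theta):
    """Apply the exponential, then the principal logarithm."""
    return so2_log(so2_exp(theta))

inside = round_trip(3.0)   # |theta| < pi: the original angle is recovered
outside = round_trip(3.5)  # |theta| > pi: the angle 3.5 - 2*pi comes back instead
```

Once the angle exceeds $\pi$, the logarithm returns a different tangent vector that reaches the same rotation, so the exponential is no longer inverted.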

In the following then, $p$ is always the identity matrix and will not be mentioned, while $X$ is always an element of the Lie algebra, that is, the tangent space at the identity matrix.

The orthogonal group $\mathrm{O}(n)$ consists of two submanifolds: $\mathrm{SO}(n)$, whose elements have determinant $+1$, and a second component whose elements have determinant $-1$. These submanifolds are disconnected from each other, so no geodesic can join two points from different submanifolds.

The injectivity radius of $\mathrm{O}(n)$ is therefore the same as that of $\mathrm{SO}(n)$.

Relevant linear algebraic properties

All unitary matrices have a determinant of unit modulus, $|\det(q)| = 1$. The inverse of any unitary matrix $q$ is just its adjoint, $q^{-1} = q^\mathrm{H}$.

The logarithm of any unitary matrix is a skew-Hermitian matrix $X = -X^\mathrm{H}$.[4] Unitary and skew-Hermitian matrices are normal matrices, which means they are always diagonalizable by a unitary matrix of eigenvectors. Let $q = V S V^\mathrm{H}$ be the eigendecomposition of $q$ and $X = U \Lambda U^\mathrm{H}$ be the eigendecomposition of $X$.

The unitary/skew-Hermitian condition then implies

$$
\begin{aligned}
X &= -X^\mathrm{H}\\
U \Lambda U^\mathrm{H} &= -U \Lambda^\mathrm{H} U^\mathrm{H}\\
\Lambda &= -\Lambda^\mathrm{H}\\
\operatorname{diag}(\Lambda) = \lambda &= -\lambda^*\\
\lambda + \lambda^* = 2\operatorname{Re}(\lambda) &= 0\\
\lambda_i &= \mathrm{i} \theta_i, \quad \theta_i \in \mathbb{R}, \; i \in 1\ldots n.
\end{aligned}
$$

The eigenvalues of the unitary matrices are points on $\mathrm{U}(1, \mathbb{F})$, and the eigenvalues of the skew-Hermitian matrices are pure imaginary numbers $\lambda = \mathrm{i}\theta$.[5]

The norm of $X$ under the Frobenius metric is

$$
\begin{aligned}
\left\lVert X \right\rVert_g &= \left\lVert X \right\rVert_\mathrm{F} = \sqrt{\left\langle U \Lambda U^\mathrm{H}, U \Lambda U^\mathrm{H} \right\rangle_\mathrm{F}} \\
&= \left\lVert \lambda \right\rVert_\mathrm{F} = \sqrt{\sum_{i=1}^n |\lambda_i|^2} = \sqrt{\sum_{i=1}^n \theta_i^2}\\
&= \left\lVert \theta \right\rVert.
\end{aligned}
$$
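As a quick check of this identity, the sketch below builds a $3 \times 3$ real skew-symmetric matrix from a unit axis and an angle and compares its Frobenius norm with the norm of its eigen-angles. It relies on the standard fact that such a matrix has eigenvalues $0$ and $\pm\mathrm{i}\,\mathtt{angle}$; the code does not compute them. The helper names `hat` and `frobenius_norm` are hypothetical.

```python
import math

def hat(axis, angle):
    """Skew-symmetric matrix angle * [k]_x for a unit axis k = (kx, ky, kz)."""
    kx, ky, kz = axis
    return [
        [0.0, -angle * kz, angle * ky],
        [angle * kz, 0.0, -angle * kx],
        [-angle * ky, angle * kx, 0.0],
    ]

def frobenius_norm(X):
    """Square root of the sum of squared absolute values of all entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in X for x in row))

angle = 1.2
axis = (1 / 3, 2 / 3, 2 / 3)  # a unit vector: 1/9 + 4/9 + 4/9 = 1
X = hat(axis, angle)

# Assumed eigen-angles theta_i of X (eigenvalues are i * theta_i).
thetas = [angle, -angle, 0.0]

norm_from_entries = frobenius_norm(X)
norm_from_thetas = math.sqrt(sum(t * t for t in thetas))
```

Both values equal $\sqrt{2}$ times the rotation angle, which is why a factor of $\sqrt{2}$ appears in the radii of the rotation groups under the unscaled Frobenius metric.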

The special unitary group has the additional constraint that the determinant is $+1$. At the level of the Lie algebra, this constraint means that $X$ is traceless, so $\sum_{i=1}^n \theta_i = 0$. Given $n-1$ values of $\theta_i$, the $n$th value is thus fixed.

Special orthogonal matrices are real, and real matrices have the extra property that if they have a complex eigenvalue $\lambda_i$ then they also have the eigenvalue $\lambda_i^*$, i.e. the complex eigenvalues come in conjugate pairs. The same holds for their real skew-symmetric logarithms $X$, so the sum of the elements in each pair is 0, and they don't contribute to the sum $\sum_{i=1}^n \theta_i$. When $n$ is odd, at least one eigenvalue of $X$ is 0.

Using the eigendecomposition, the matrix exponential and logarithm can be computed by applying the corresponding function to the eigenvalues. Then

$$
\begin{aligned}
\operatorname{Exp}(X) &= U \operatorname{Exp}(\Lambda) U^\mathrm{H}\\
\operatorname{Log}(q) &= V \operatorname{Log}(S) V^\mathrm{H}
\end{aligned}
$$

To find the injectivity radii, we only need to find for which values of $\theta_i$ the map $\theta_i \mapsto \exp(\mathrm{i}\theta_i)$ is invertible.

The injectivity radii

We are finally ready to work out the injectivity radii. For $\mathrm{U}(n, \mathbb{F})$, the eigenvalues of $X$ are $\lambda_i = \mathrm{i} \theta_i$ for $\theta_i \in \mathbb{R}$.

The matrix exponential is invertible when $\exp(\mathrm{i} \theta_i)$ is invertible. But this is just rotation in the complex plane by the angle $\theta_i$, which is invertible for $\theta_i \in (-\pi, \pi]$. The injectivity radii are then computed from the constraint $|\theta_i| = \pi$ for at least one $i \in 1\ldots n$.

For $\mathrm{U}(n, \mathbb{F})$ and $\mathbb{F} \not\equiv \mathbb{R}$, the largest value of $\left\lVert \theta \right\rVert$ achievable subject to the constraints occurs when $|\theta_i| = \pi$ for all $i$. Then $\left\lVert \theta \right\rVert = \pi \sqrt{n}$. The smallest value of $\left\lVert \theta \right\rVert$ occurs when $|\theta_i| = \begin{cases}\pi, & i = k\\ 0, & i \ne k\end{cases}$ for any $k \in 1\ldots n$. So

$$
\begin{aligned}
\operatorname{inj}^-_{\mathrm{U}(n, \mathbb{F})} &= \pi, \qquad \mathbb{F} \not\equiv \mathbb{R}\\
\operatorname{inj}^+_{\mathrm{U}(n, \mathbb{F})} &= \pi \sqrt{n}.
\end{aligned}
$$
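The two extremes for the complex unitary group can be illustrated by scaling a fixed direction of angles until its largest entry reaches $\pi$. This is a minimal sketch; the helper `max_radius_along` is a hypothetical name, and it only encodes the box constraint $|\theta_i| \le \pi$ used for $\mathrm{U}(n, \mathbb{C})$, not the extra constraints of the subgroups.

```python
import math

def max_radius_along(direction):
    """Norm of t * direction at the scale t where the largest |t * d_i| equals pi."""
    norm = math.sqrt(sum(d * d for d in direction))
    largest = max(abs(d) for d in direction)
    return math.pi * norm / largest

n = 4
single_axis = [1.0] + [0.0] * (n - 1)  # only one nonzero angle
all_axes = [1.0] * n                   # all angles equal

r_min = max_radius_along(single_axis)  # pi
r_max = max_radius_along(all_axes)     # pi * sqrt(n)
```

Every other direction gives a value between these two, which matches the lower and upper radii above.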

For $\mathrm{SU}(n)$, we have the additional constraint that $\sum_{i=1}^n \theta_i = 0$. For even $n$, $\left\lVert \theta \right\rVert$ is maximized subject to these constraints when $n/2$ entries in $\theta$ are $+\pi$ and the other $n/2$ entries are $-\pi$. On the other hand, for odd $n$, it is maximized when $(n-1)/2$ entries are $+\pi$, $(n-1)/2$ entries are $-\pi$, and the remaining entry is $0$. $\left\lVert \theta \right\rVert$ is minimized subject to these constraints when there is a single nonzero pair $\theta_i = -\theta_j = \pi$ for $j \ne i$. As a result,

$$
\begin{aligned}
\operatorname{inj}^-_{\mathrm{SU}(n)} &= \pi\sqrt{2}\\
\operatorname{inj}^+_{\mathrm{SU}(n)} &= \pi \sqrt{2 \lfloor n/2 \rfloor}.
\end{aligned}
$$

The constraints on $\theta$ required for $\mathrm{O}(n)$ and $\mathrm{SO}(n)$ are also satisfied by the lower and upper bounds of the norms considered for $\mathrm{SU}(n)$, so

$$
\begin{aligned}
\operatorname{inj}^-_{\mathrm{O}(n)} = \operatorname{inj}^-_{\mathrm{SO}(n)} &= \pi\sqrt{2}\\
\operatorname{inj}^+_{\mathrm{O}(n)} = \operatorname{inj}^+_{\mathrm{SO}(n)} &= \pi \sqrt{2 \lfloor n/2 \rfloor}.
\end{aligned}
$$

It's always a good idea to check that the results make sense for familiar cases. For the 2D and 3D rotations $\mathrm{SO}(2)$ and $\mathrm{SO}(3)$, respectively, we have

$$
\begin{aligned}
\operatorname{inj}^-_{\mathrm{SO}(2)} = \operatorname{inj}^+_{\mathrm{SO}(2)} &= \pi\sqrt{2}\\
\operatorname{inj}^-_{\mathrm{SO}(3)} = \operatorname{inj}^+_{\mathrm{SO}(3)} &= \pi\sqrt{2}
\end{aligned}
$$
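The closed-form radii from this section can be collected into one small lookup, sketched below. The function name `inj_radii` and its string labels are hypothetical and are not the Manifolds.jl API. The sketch assumes the unscaled Frobenius metric and $n \ge 2$ for the subgroups.

```python
import math

def inj_radii(group, n):
    """Return (inj_minus, inj_plus) under the unscaled Frobenius metric.

    `group` is "U" (complex or quaternionic unitary), "SU", "O", or "SO".
    For "SU", "O" and "SO" this sketch assumes n >= 2.
    """
    if group == "U":
        return math.pi, math.pi * math.sqrt(n)
    if group in ("SU", "O", "SO"):
        # n // 2 is floor(n / 2), the number of conjugate pairs of angles.
        return math.pi * math.sqrt(2), math.pi * math.sqrt(2 * (n // 2))
    raise ValueError(f"unknown group label: {group!r}")

so2 = inj_radii("SO", 2)
so3 = inj_radii("SO", 3)
u4 = inj_radii("U", 4)
```

For $n = 2$ and $n = 3$ the floor term is 1, so both radii coincide at $\pi\sqrt{2}$, in line with the check above.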

These injectivity radii correspond to a rotation of 180°.[3] Coming back to our motivating example, if we start from any pose and rotate an object more than 180° in any direction, we can no longer uniquely determine the initial rotational velocity.

$\mathrm{U}(1, \mathbb{C})$, the complex unit circle, is isomorphic to $\mathrm{SO}(2)$, so it may seem like a contradiction that its injectivity radius is $\pi$. But this difference is again caused by the choice of metric: for the same rotation angle, the Frobenius inner product on the Lie algebra of $\mathrm{SO}(2)$ is twice the one on the Lie algebra of $\mathrm{U}(1, \mathbb{C})$, so the norms differ by a factor of $\sqrt{2}$.[3]
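The factor relating the two norms can be checked directly. The sketch below writes the same rotation angle as a tangent vector in each representation and compares Frobenius norms; the variable names are illustrative only.

```python
import math

theta = 0.75

# Tangent vector of SO(2) at the identity: the skew-symmetric matrix [[0, -theta], [theta, 0]].
X_so2 = [[0.0, -theta], [theta, 0.0]]
norm_so2 = math.sqrt(sum(x * x for row in X_so2 for x in row))

# Tangent vector of U(1, C) at the identity: the imaginary number i * theta.
X_u1 = 1j * theta
norm_u1 = abs(X_u1)

ratio = norm_so2 / norm_u1  # sqrt(2), independent of theta
```

A rotation by $\pi$ therefore has norm $\pi$ in $\mathrm{U}(1, \mathbb{C})$ and norm $\pi\sqrt{2}$ in $\mathrm{SO}(2)$.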

Analogously, $\mathrm{U}(1, \mathbb{H})$ is the group of unit quaternions (also the compact symplectic group), which are an alternative way to represent 3D rotations, and is isomorphic to $\mathrm{SU}(2)$. The injectivity radii again only differ by the factor of $\sqrt{2}$ due to the scaling convention of the metric.

Conclusion

This has been one of the rare moments where we got to dabble with group theory and manifolds without needing too much geometry. I hope it was enjoyable!

[1] The injectivity radius is so called because it is the radius of a geodesic ball within which the exponential map is injective (one-to-one).
[2] It's really hard to write anything about manifolds or groups without writing a whole introductory text to differential geometry or group theory.
[3] Because of the skew-Hermitian nature of the elements of the Lie algebra, some texts use the scaled Frobenius metric $g(X, Y) = \frac{1}{2}\left\langle X, Y \right\rangle_\mathrm{F}$. In some cases, this allows the norm of a tangent vector to be interpreted as the angle of the rotation. To get the injectivity radii for this metric, one would just divide ours by $\sqrt{2}$.
[4] This comes from differentiating the constraint $q^\mathrm{H}q=I_n$. Then we have $\mathrm{d}{(q^\mathrm{H}q)} = (\mathrm{d}{q})^\mathrm{H}q + q^\mathrm{H} \mathrm{d}{q} = \mathrm{d}(I_n) = 0$. Letting $X = \mathrm{d}{q}$, then $X^\mathrm{H}q = -q^\mathrm{H} X = -(X^\mathrm{H}q)^\mathrm{H}$. When $q$ is the identity matrix, $X = -X^\mathrm{H}$.
[5] For the complex unitary group, $\mathrm{i}$ is the usual imaginary unit, while for quaternions, it would be a pure unit quaternion.

If you have questions or suggestions, feel free to open an issue.
If you found this useful, please share and follow @sethaxen.