## Square roots of matrices and operators

Everyone knows the definition of the square root of a non-negative number: if $x \geq 0$, then the square root of $x$ is the non-negative number $y$ for which $y^2 = x$, and it is denoted $y = \sqrt{x}$. We could also call the number $-\sqrt{x}$ a square root of $x$, but by convention the non-negative solution of the equation $y^2 = x$ is called “the” square root.

So, what if we have an algebraic system that is not the set of non-negative real numbers, but where a multiplication operation “$\diamond$” is defined? If $a$ is an element of such a set, then a square root of $a$ is naturally an element $b$ for which $b \diamond b = a$. The set of complex numbers is one such set, but are there other examples?

Consider the set of $2 \times 2$ matrices, the elements of which are denoted

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where $a$, $b$, $c$ and $d$ are real or complex numbers. So, could we define the square root of such a matrix in the same way as it is done with the real numbers, using ordinary matrix multiplication as the operation “$\diamond$”? Non-commutativity shouldn’t be a problem here, because any algebraic object commutes with itself.

Let’s use the identity matrix as an example:

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

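What’s its square root? One quickly finds at least four matrices that can be regarded as square roots of $I$, for example the diagonal matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$

and it is not at all obvious which one of these should be called “the square root of $I$”. In fact there are infinitely many, since every reflection matrix also squares to $I$. Because of this, the concept of square root is not often used when working with general matrices.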
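A quick numerical check makes the ambiguity concrete. The following is a minimal sketch using plain Python lists for $2 \times 2$ matrices; the helper names `matmul2`, `is_square_root_of_identity` and `reflection` are made up for this illustration. It tests the four diagonal sign choices and a family of reflection matrices, all of which square to $I$.

```python
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def is_square_root_of_identity(m, tol=1e-12):
    """Return True if m times m equals the identity within a tolerance."""
    sq = matmul2(m, m)
    return all(abs(sq[i][j] - IDENTITY[i][j]) < tol for i in range(2) for j in range(2))

# The four diagonal matrices with entries +1 or -1 (an illustrative choice).
diagonal_roots = [[[s, 0.0], [0.0, t]] for s in (1.0, -1.0) for t in (1.0, -1.0)]

def reflection(theta):
    """Reflection across the line at angle theta/2; squares to I for every theta."""
    return [[math.cos(theta), math.sin(theta)], [math.sin(theta), -math.cos(theta)]]

if __name__ == "__main__":
    for m in diagonal_roots:
        print(m, is_square_root_of_identity(m))
    for theta in (0.3, 1.0, 2.5):
        print(theta, is_square_root_of_identity(reflection(theta)))
```

Because `theta` can be any real number, the reflection family alone already gives a continuum of candidates, which is why no single matrix stands out as the natural choice.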

Sometimes, a square root can also be found for operators that act on a set of functions, like the second derivative operator $\frac{d^2}{dx^2}$, for which the operators $\frac{d}{dx}$ and $-\frac{d}{dx}$ are obvious “square root candidates”. The multiplication of operators doesn’t have to be commutative, so they are a bit like matrices, but they are usually not representable by finite-dimensional matrices unless the domain (the set of functions that the operator acts on) is artificially made very limited.
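To illustrate the remark about limited domains: if the operator is only allowed to act on polynomials of degree at most 3, then $\frac{d}{dx}$ becomes a $4 \times 4$ matrix acting on the coefficient vector, and squaring that matrix gives the matrix of $\frac{d^2}{dx^2}$. The sketch below assumes the basis $1, x, x^2, x^3$; the function names are hypothetical and chosen only for this example.

```python
def derivative_matrix(n):
    """Matrix of d/dx on polynomials c0 + c1*x + ... + c_{n-1}*x^(n-1)."""
    d = [[0] * n for _ in range(n)]
    for k in range(1, n):
        d[k - 1][k] = k  # d/dx of x^k is k*x^(k-1)
    return d

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(m, coeffs):
    """Apply matrix m to a coefficient vector."""
    return [sum(m[i][j] * coeffs[j] for j in range(len(coeffs))) for i in range(len(m))]

D = derivative_matrix(4)                 # matrix of d/dx
D2 = matmul(D, D)                        # matrix of d^2/dx^2
NEG_D = [[-x for x in row] for row in D]  # matrix of -d/dx

if __name__ == "__main__":
    p = [5, 1, -2, 4]  # 5 + x - 2x^2 + 4x^3
    print(apply(D2, p))              # coefficients of the second derivative of p
    print(matmul(NEG_D, NEG_D) == D2)  # -d/dx is a square root as well
```

Both `D` and `NEG_D` square to the same matrix `D2`, mirroring the two operator candidates $\pm\frac{d}{dx}$.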

An example of the operator square root problem that actually occurs in the analysis of a real physical system is the Klein-Gordon equation, which was the first attempt to form a relativistic version of the time-dependent Schroedinger equation (TDSE) of quantum mechanics. The TDSE of a free particle (a particle that is not acted on by any forces) is:

$$i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}$$

where $\Psi$ is a function of both time $t$ and a space coordinate $x$ (for a simple one-dimensional situation, that is). This can be written more compactly as

$$i\hbar \frac{\partial \Psi}{\partial t} = \hat{H} \Psi$$

where the free-particle Hamiltonian operator

$$\hat{H} = \frac{\hat{p}^2}{2m} = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2}$$

is obtained from the classical expression of kinetic energy, $E_k = p^2 /2m$, by replacing the classical one-dimensional momentum $p$ with the corresponding quantum momentum operator

$$\hat{p} = -i\hbar \frac{\partial}{\partial x}$$
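As a short worked step, here is how the second derivative arises when the standard one-dimensional momentum operator $\hat{p} = -i\hbar\,\partial/\partial x$ is applied twice (a sketch of the substitution, with the free-particle Hamiltonian taken to be $\hat{p}^2/2m$):

$$\hat{p}^2 \Psi = \left(-i\hbar \frac{\partial}{\partial x}\right)\left(-i\hbar \frac{\partial \Psi}{\partial x}\right) = (-i\hbar)^2 \frac{\partial^2 \Psi}{\partial x^2} = -\hbar^2 \frac{\partial^2 \Psi}{\partial x^2}$$

so that $\hat{H}\Psi = \frac{\hat{p}^2}{2m}\Psi = -\frac{\hbar^2}{2m}\frac{\partial^2 \Psi}{\partial x^2}$. In this sense $\hat{p}$ is itself a square root of the operator $-\hbar^2\,\partial^2/\partial x^2$.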

The problem with the ordinary time-dependent Schroedinger equation is that the time and space coordinates are treated differently in it: it is first order with respect to time and second order with respect to position. This is not acceptable if we want to satisfy the principle of special relativity, where time and space coordinates must enter on an equal footing.

A naive attempt to form a relativistic Schroedinger equation is to take the formula of the energy of a moving body in special relativity

$$E = \sqrt{p^2 c^2 + m^2 c^4}$$

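and replace the momentum $p$ with the quantum momentum operator $\hat{p}$. Then we would have

$$i\hbar \frac{\partial \Psi}{\partial t} = \sqrt{-\hbar^2 c^2 \frac{\partial^2}{\partial x^2} + m^2 c^4}\; \Psi$$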

but the problem with this is how to define the square root of something that contains a derivative operator. One way to do this would be to use the Taylor series expansion of the square root:

$$\sqrt{1 + x} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \dots$$

equating the $x$ here with the derivative operator. This is not acceptable, however, because an equation that contains derivatives of all positive integer orders up to infinity can be seen as equivalent to a difference equation like

which is non-local, and in a relativistic equation of motion this would mean that there can be instantaneous (faster than the speed of light) interactions between objects. (If you want, try to write the difference equation above as an infinite-order differential equation by using Taylor expansion.)
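The link between infinite-order derivatives and non-locality can be checked directly through Taylor's theorem, $f(x + a) = \sum_{n} \frac{a^n}{n!} f^{(n)}(x)$: a sum over all derivatives at the point $x$ reproduces the value of the function at the distant point $x + a$. The sketch below verifies this exactly for a polynomial, where the series terminates; the helper names and the sample polynomial are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def evaluate(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def differentiate(coeffs):
    """Coefficient list of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def shifted_value_from_derivatives(coeffs, x, a):
    """Sum a^n/n! * f^(n)(x) over all n; the sum is finite for a polynomial."""
    total = Fraction(0)
    current = list(coeffs)
    n = 0
    while current:  # stops once every remaining derivative is identically zero
        total += Fraction(a) ** n / factorial(n) * evaluate(current, x)
        current = differentiate(current)
        n += 1
    return total

if __name__ == "__main__":
    f = [Fraction(1), Fraction(-3), Fraction(0), Fraction(2)]  # 1 - 3x + 2x^3
    x, a = Fraction(1, 2), Fraction(5)
    # The two printed values agree: local derivatives at x encode f at x + a.
    print(shifted_value_from_derivatives(f, x, a), evaluate(f, x + a))
```

Exact rational arithmetic via `fractions.Fraction` is used so that the comparison is not blurred by floating-point rounding.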

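Because of the non-locality problem, the conventional Klein-Gordon equation is instead formed by constructing the squared energy operator $\hat{H}^2$ from the relativistic energy formula, $E^2 = p^2 c^2 + m^2 c^4$, and equating $\hat{H}^2 \Psi$ with $-\hbar^2 \frac{\partial^2 \Psi}{\partial t^2}$. The result is

$$-\hbar^2 \frac{\partial^2 \Psi}{\partial t^2} = -\hbar^2 c^2 \frac{\partial^2 \Psi}{\partial x^2} + m^2 c^4 \Psi$$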

This, however, causes other problems, because the equation admits solutions in which the particle has a negative energy, which is hard to interpret physically. These problems are solved in quantum field theory (QFT), where the number of particles present in the system can be uncertain, similarly to how the position of a single quantum particle described by the Schroedinger equation can have some expectation value and a nonzero standard deviation.
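The origin of the negative energies can be seen by inserting a plane wave into the Klein-Gordon equation, taken here in its standard one-dimensional free-particle form. The trial function $\Psi = e^{i(px - Et)/\hbar}$ is an assumption of this sketch rather than something stated above:

$$-\hbar^2 \frac{\partial^2 \Psi}{\partial t^2} = -\hbar^2 c^2 \frac{\partial^2 \Psi}{\partial x^2} + m^2 c^4 \Psi \quad\Longrightarrow\quad E^2\, \Psi = \left(p^2 c^2 + m^2 c^4\right) \Psi$$

so that

$$E = \pm\sqrt{p^2 c^2 + m^2 c^4}$$

Squaring the energy operator removes the troublesome operator square root, but the price is that both signs of the root now appear as solutions.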