Numerical solution of PDE:s, Part 8: Complex Ginzburg-Landau Equation

In the previous numerical solution posts, I described linear equations like diffusion equation and the Schrödinger equation, and how they can be solved by (implicit or explicit) finite differencing. The idea of the implicit methods was to convert the equation into a linear system of equations, from which the function values on a discrete mesh could be calculated.

Saying that these equations were linear means that they can be written as


where the linear differential operator, containing space and time derivatives, is acting on the function and producing “something” (usually zero but in the case of source terms/inhomogeneity something nonzero).

As a first example of a nonlinear PDE, let’s consider the complex Ginzburg-Landau equation (CGLE), which reads:


Here the \alpha and \beta are real parameters and i is the imaginary unit. Applying an implicit differencing on this may seem to result in a system of equations


but this is not a linear system because of the |A_{i}^{j+1}|^2, so we cannot solve the problem in this way by using linear algebra.

The trick to solve this is to linearize the system, by evaluating the |A|^2 at timestep j and the rest of the quantities at timestep j+1, producing the system


which is now a linear system w.r.t. to the variables evaluated at timestep j+1 (the matrix for solving “A^{j+1}“:s has diagonal elements that depend of “A^j“:s). A more sophisticated method would do several iterations to approximate the values of A(x,t) between the timesteps j and j+1.

An R code that solves the equation for a domain x\in [0,100], t\in [0,150], using discrete steps \Delta x = 0.66, \Delta t = 0.33 , initial state A(x,0) = 0.1e^{2ix} and values \alpha=3 and \beta = -2, is shown here.

library(graphics) #load the graphics library needed for plotting

lx <- 100 #length of the computational domain
lt <- 150 #length of the simulation time interval
nx <- 150 #number of discrete lattice points
nt <- 450 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

a <- 3
b <- -2

kappa1 = dt*(1+1i*a)/dx/dx #an element needed for the matrices
kappa2 = dt*(1+1i*b)

psi = as.complex(c(1:nx)) #array for the function A values
sol = as.complex(c(1:nx))

for(j in c(1:nx)) {
psi[j] = 0.1*exp(2i*j*dx)
sol[j] = psi[j]

xaxis <- c(1:nx)*dx #the x values corresponding to the discrete lattice points

IPxaxis <- c(1:(4*nx))*dx/4
IPtaxis <- c(1:(4*nt))*dt/4

sol_plot = matrix(nrow=nt,ncol=nx)

A = matrix(nrow=nx,ncol=nx) #matrix for forward time evolution
IP = matrix(nrow = 4*nt, ncol=4*nx)

for (m in c(1:nt)) { #main time stepping loop

for(j in c(1:nx)) {
for(k in c(1:nx)) {
if(j==k) {
A[j,k] = 1 + 2*kappa1 + kappa2*abs(sol[j])*abs(sol[j]) – dt #diagonal elements
if((j==k+1) || (j==k-1)) {
A[j,k] = -kappa1 #off-diagonal elements

for(l in c(1:nx)) {
psi[l] = sol[l]
sol <- solve(A,psi) #solve the system of equations

for (l in c(1:nx)) {
sol_plot[m,l] <- Re(sol[l])

jpeg(file = paste(“plot_”,m,”.jpg”,sep=””))
plot(xaxis,Im(sol),xlab = “position (x)”,ylab=”Im[A(x,t)]”,ylim=c(-4,4),pch=’.’)
title(paste(“Im[A(x,t)] at t = “,round(m*dt,digits=2)))


for(l in c(1:(nt-1))) {
for(m in c(1:(nx-1))) { #make a bitmap with 4 times more pixels, using linear interpolation
IP[4*l-3,4*m-3] = sol_plot[l,m]
IP[4*l-2,4*m-3] = sol_plot[l,m]+0.25*(sol_plot[l+1,m]-sol_plot[l,m])
IP[4*l-1,4*m-3] = sol_plot[l,m]+0.5*(sol_plot[l+1,m]-sol_plot[l,m])
IP[4*l,4*m-3] = sol_plot[l,m]+0.75*(sol_plot[l+1,m]-sol_plot[l,m])

for(l in c(1:(4*nt))) {
for(m in c(1:(nx-1))) {
IP[l,4*m-2] = IP[l,4*m-3]+0.25*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m-1] = IP[l,4*m-3]+0.5*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m] = IP[l,4*m-3]+0.75*(IP[l,4*m+1]-IP[l,4*m-3])

jpeg(file = “2dplot.jpg”)
image(IPtaxis,IPxaxis,IP,xlab = “t-axis”,ylab=”x-axis”,zlim=c(-2,2))

Plotting the real part of the resulting function A(x,t) at several values of t, we see that the solution initially doesn’t do much anything, but at some point a “phase turbulence” sets in starting from the ends of the x-domain and after that the function evolves in a very random way, without following any clear pattern (unlike the spreading mass/temperature distributions, traveling waves or scattering wavepackets in the case of the common linear PDE:s).

An animation of the solution is shown below.

This kind of chaos is typical of nonlinear systems, be them point mass systems with nonlinear forces between mass points or field systems with nonlinear field equations such as the CGLE here. Note that the solution of this equation is a bit too heavy of a calculation to do just for the purpose of creating random numbers, so for that end other methods such as Perlin’s noise are used.

The 2D color plot of the real part of the solution, plotted in the xt-plane, looks like this:


More plots of the solutions for different values of parameters can be found in this article.

It should be noted that, Wolfram Mathematica’s “NDSolve” function can’t usually solve nonlinear PDE:s correctly, despite usually working properly in the case of linear PDE:s. Some other commercial math programs such as Comsol Multiphysics may work better when solving nonlinear problems, at least to my experience.

So, here was the basic idea of how nonlinear PDE:s are solved by linearization, and what kind of things are possible in the behavior of their solutions. In the next PDE post I will show how to solve the thin-film equation, about which I actually wrote my master’s thesis in 2013, and which doesn’t usually behave chaotically unlike the CGLE (but can be made to do so by adding suitable terms).

The problematic radial momentum operator

In quantum mechanics, position coordinates x,y,z of a particle are replaced with position operators \large \hat{x},\hat{y},\hat{z} and the components of the momentum vector \mathbf{p} are replaced with the operators \large \hat{p_x},\hat{p_y},\hat{p_x}. The most important property of these operators is the commutation relation between a coordinate and the corresponding momentum component:

\large [\hat{x},\hat{p_x}] = i\hbar

In the position representation, the momentum operator has the form of a differential operator

\large \hat{p_x} = -i\hbar \frac{\partial}{\partial x}

and the \hat{x} operator is just a multiplication of the wavefunction with the variable x.

Can these results be extended to coordinate systems that are not Cartesian? Consider a plane polar coordinate system where the position is given as the pair \large (r,\theta), with

\large x = r\cos \theta
\large y = r\sin \theta

The “momentum” corresponding to the variable \large \theta is obviously the angular momentum operator

\large \hat{L}_z = -i\hbar \frac{\partial}{\partial \theta}.

But what about the radial momentum operator? It would have to be

\large \hat{p}_r = -i\hbar \frac{\partial}{\partial r}

in the position representation, but is it an acceptable quantum operator?

Actually, it is not. An operator corresponding to a measurable quantity has to be Hermitian, and it’s easy to see that if we have a wavefunction similar to a hydrogen atom ground state

\large \psi (r) = Ae^{-ar}

where a is a real constant, the state is an eigenstate of \hat{p}_r with an imaginary eigenvalue \large ia\hbar. Therefore it’s not possible to use the radial momentum operator as an observable.

Another fun way to see this is to consider a “radial translation operator”, which is generated by the radial momentum:

\large \hat{T}_r (\Delta r) = \exp \left(\frac{i\Delta r \hat{p}_r}{\hbar}\right)

and operate with it on a rotation symmetric function \psi (r). The result is

\large \hat{T}_r (\Delta r)\psi (r) = \psi (r + \Delta r)

However, because a radius coordinate always has to be a nonnegative number, we have irreversibly lost all information about the values of the function \large \psi (r) for \large r<\Delta r here, which means that this radial translation operator does not have an inverse, and therefore can’t be unitary as would be expected if \large \hat{p}_r were Hermitian!

Numerical solution of PDE:s, Part 7: 2D Schrödinger equation

Haven’t been posting for a while, but here’s something new… Earlier I showed how to solve the 1D Schrödinger equation numerically in different situations. Now I’m going to show how to calculate the evolution of a 2D wavepacket in a potential energy field that has been constructed to mimic the classical “two-slit experiment” which shows how the mechanics of low-mass particles like electrons can exhibit interference similar to the mechanics of classical waves (sound, light, water surface, and so on).

A 2D Schrödinger equation for a single particle in a time-independent background potential V(x,y) is


Where the particle mass has been set to 1 and the Planck’s constant to 2\pi.

To solve this numerically, we need the Crank-Nicolson method, as was the case when solving the 1D problem. More specifically, the linear system to be solved is




where the wavefunction now has two position indices and one time index, and the potential energy has only two position indices.

To form a model of the two-slit experiment, we choose a domain 0 < x < 6; 0 < y < 6 and make a potential energy function defined by

IF (x < 2.2 OR x > 3.8 OR (x > 2.7 AND x < 3.3)) THEN IF (3.7 < y < 4) THEN V(x,y) = 30

IF (x < 0.5 OR x > 5.5 OR y < 0.5 OR y > 5.5) THEN V(x,y) = 30

Otherwise V(x,y) = 0.

which corresponds to having hard walls surrounding the domain and a barrier with two holes around the line y = 3.85

For an initial condition, we choose a Gaussian wavepacket that has a nonzero expectation value of the momentum in y-direction:


An R-Code that solves this problem for a time interval 0 < t < 1 is

library(graphics) #load the graphics library needed for plotting

lx <- 6.0 #length of the computational domain (both x and y)
lt <- 1.0 #length of the simulation time interval
nx <- 47 #number of discrete lattice points in x and y directions
nt <- 60 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell, same in x and y directions
dt <- lt/nt #length of timestep

V = matrix(nrow=nx,ncol=nx) #potential energies at discrete xy points

for(j in c(1:nx)) { #construct the potential field with a double-slit barrier
for(k in c(1:nx)) {
V[j,k] = 0+0i
if((j*dx<2.2)||(j*dx>3.8)||((j*dx>2.7) && (j*dx<3.3))) {
if((k*dx>3.7) && (k*dx<4.0)) {
V[j,k] = 30+0i #No significant density is going to go through these barriers
if((j*dx>5.5) || (j*dx<0.5) || (k*dx>5.5) || (k*dx<0.5)) {
V[j,k] = 30+0i

kappa1 = (1i)*dt/(2*dx*dx) #an element needed for the matrices
kappa2 <- c(1:(nx*nx)) #another element

for(j in c(1:nx)) {
for(k in c(1:nx)) {
kappa2[(j-1)*nx+k] <- kappa1*2*dx*dx*V[j,k]

psi = c(1:(nx*nx)) #array for the wave function values

for(j in c(1:nx)) {
for(k in c(1:nx)) {
psi[(j-1)*nx+k] = exp(-2*(k*dx-1.7)*(k*dx-1.7)-2*(j*dx-3)*(j*dx-3)+(2i)*k*dx) #Gaussian initial wavefunction
if((j*dx > 5.5)||(j*dx < 0.5)||(k*dx > 5.5)||(k*dx < 0.5)) {
psi[(j-1)*nx+k] = as.complex(0)

xaxis <- c(1:nx)*dx #the x values corresponding to the discrete lattice points
yaxis <- c(1:nx)*dx #the y values

A = matrix(nrow=nx*nx,ncol=nx*nx) #matrix for forward time evolution
B = matrix(nrow=nx*nx,ncol=nx*nx) #matrix for backward time evolution
P = matrix(nrow=nx,ncol=nx) #matrix for the solution after time stepping
IP = matrix(nrow=4*nx,ncol=4*nx) #matrix for the higher resolution image obtained by interpolation

for(j in c(1:(nx*nx))) { #Set the values for time evolution matrix elements
for(k in c(1:(nx*nx))) {
if(j==k) {
A[j,k] = 1 + 4*kappa1 + kappa2[j]
B[j,k] = 1 – 4*kappa1 – kappa2[j]
if((k==j+1) || (k==j-1)) {
A[j,k] = -kappa1
B[j,k] = kappa1
if((k==j+nx)||(k==j-nx)) {
A[j,k] = -kappa1
B[j,k] = kappa1

for(k in c(1:nt)) { #main time stepping loop

sol <- solve(A,B%*%psi) #solve the system of equations

for(l in c(1:(nx*nx))) {
psi[l] <- sol[l]

for(l in c(1:nx)) {
for(m in c(1:nx)) {
P[l,m] = abs(psi[(l-1)*nx + m])*abs(psi[(l-1)*nx + m]) #square of the absolute value of wave function
if(abs(V[l,m]) > 5) P[l,m] = 2
for(l in c(1:(nx-1))) {
for(m in c(1:(nx-1))) { #make a bitmap with 4 times more pixels, using linear interpolation
IP[4*l-3,4*m-3] = P[l,m]
IP[4*l-2,4*m-3] = P[l,m]+0.25*(P[l+1,m]-P[l,m])
IP[4*l-1,4*m-3] = P[l,m]+0.5*(P[l+1,m]-P[l,m])
IP[4*l,4*m-3] = P[l,m]+0.75*(P[l+1,m]-P[l,m])

for(l in c(1:(4*nx))) {
for(m in c(1:(nx-1))) {
IP[l,4*m-2] = IP[l,4*m-3]+0.25*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m-1] = IP[l,4*m-3]+0.5*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m] = IP[l,4*m-3]+0.75*(IP[l,4*m+1]-IP[l,4*m-3])

jpeg(file = paste(“plot_abs_”,k,”.jpg”,sep=””)) #save the image
image(IP, zlim = c(0,0.15))


The code produces a sequence of image files, where the probability density is plotted with colors, as an output. Some representative images from this sequence (converted to grayscale) is shown below:

A video of the time evolution is shown below:

The treshold for maximum white color has been chosen to be quite low, to make the small amount of probability density that crosses the barrier visible.

The discrete grid of points has been made quite coarse here to keep the computation time reasonable, and the resolution has been increased artificially by using linear interpolation between the discrete points.

So, now we’ve seen how to solve the motion of 2D wavepackets moving around obstacles. In the next numerical methods post, I’ll go through the numerical solution of a nonlinear PDE.



Numerical solution of PDE:s, Part 3: 2D diffusion problem

In the earlier posts related to PDE numerical integration, I showed how to discretize 1-dimensional diffusion or heat conduction equations either by explicit or implicit methods. The 1d model can work well in some situations where the symmetry of a physical system makes the concentration or temperature field practically depends on only one cartesian coordinate and is independent of the position along two other orthogonal coordinate axes.

The 2-dimensional version of the diffusion/heat equation is


assuming that the diffusion is isotropic, i.e. its rate does not depend on direction.

In a discretized description, the function C(x,y,t) would be replaced by a three-index object


assuming that one of the corners of the domain is at the origin. Practically the same can also be done with two indices, as in


where N_x is the number of discrete points in x-direction.

Using the three-index version of the 2D diffusion equation and doing an explicit discretization, we get this kind of a discrete difference equation:


which can be simplified a bit if we have \Delta x = \Delta y .

Below, I have written a sample R-Code program that calculates the evolution of a Gaussian concentration distribution by diffusion, in a domain where the x-interval is 0 < x < 6 and y-interval is 0 < y < 6. The boundary condition is that nothing diffuses through the boundaries of the domain, i.e. the two-dimensional integral of C(x,y,t) over the area [0,6]\times [0,6] does not depend on t.

library(graphics) #load the graphics library needed for plotting

lx <- 6.0 #length of the computational domain in x-direction
ly <- 6.0 #length of the computational domain in y-direction
lt <- 6 #length of the simulation time interval
nx <- 30 #number of discrete lattice points in x-direction
ny <- 30 #number of discrete lattice points in y-direction
nt <- 600 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell in x-direction
dy <- ly/ny #length of one discrete lattice cell in y-direction
dt <- lt/nt #length of timestep

D <- 1.0 #diffusion constant (assumed isotropic)

Conc2d = matrix(nrow=ny,ncol=nx)
DConc2d <- matrix(nrow=ny, ncol=nx) #a vector for the changes in concentration during a timestep
xaxis <- c(0:(nx-1))*dx #the x values corresponding to the discrete lattice points
yaxis <- c(0:(ny-1))*dy #the y values corresponding to the discrete lattice points

kappax <- D*dt/(dx*dx) #a parameter needed in the discretization
kappay <- D*dt/(dy*dy) #a parameter needed in the discretization

for (i in c(1:ny)) {
for (j in c(1:nx)) {
Conc2d[i,j] = exp(-(i*dy-3)*(i*dy-3)-(j*dx-3)*(j*dx-3)) #2D Gaussian initial concentration distribution
DConc2d[i,j] <- 0 #all initial values in DConc vector zeroed

for (j in c(1:nt)) { #main time stepping loop

for(k in c(1:nx))
Conc2d[1,k] <- Conc2d[k,2] #fluxes through the boundaries of the domain are forced to stay zero
Conc2d[nx,k] <- Conc2d[nx-1,k]

for(k in c(1:ny)) {
Conc2d[k,1] <- Conc2d[k,2]
Conc2d[k,nx] <- Conc2d[k,nx-1]
for (k in c(2:(ny-1))) {
for (l in c(2:(nx-1))) {
DConc2d[k,l] <- kappax*(Conc2d[k,l-1]-2*Conc2d[k,l]+Conc2d[k,l+1]) + kappay*(Conc2d[k-1,l]-2*Conc2d[k,l]+Conc2d[k+1,l]) #time stepping

for (k in c(2:(ny-1))) {
for (l in c(2:(nx-1))) {
Conc2d[k,l] <- Conc2d[k,l]+DConc2d[k,l] #add the changes to the vector Conc
k <- 0
l <- 0

if(j %% 3 == 1) { #make plots of C(x,y) on every third timestep
jpeg(file = paste("plot_",j,".jpg",sep=""))
title(paste("C(x,y) at t =",j*dt))

To plot the distribution C(x,y) with a color map instead of a 3D surface, you can change the line




The two kinds of graphs are shown below for three different values of t.




Figure 1. Surface plots of the time development of a Gaussian mass or temperature distribution spreading by diffusion.




Figure 2. Time development of a Gaussian mass or temperature distribution spreading by diffusion, plotted with a red-orange-yellow color map.

To obtain an implicit differencing scheme, we need to write the discretized equation as a matrix-vector equation where C_{i;j;k} is obtained from C_{i;j;k+1} by backward time stepping. In the square matrix, there is one row (column) for every lattice point of the discrete coordinate system. Therefore, if there are N points in x-direction and N points in y-direction, then the matrix is an N^2 \times N^2 – matrix. From this it’s quite obvious that the computation time increases very quickly when the spatial resolution is increased.

Initially, one may think that the equation corresponding to diffusion in a really coarse 3 \times 3 grid is the following one:


Where I’ve used the denotation k_x = \frac{D\Delta t}{(\Delta x)^2} and k_y = \frac{D\Delta t}{(\Delta x)^2} The problem with this is that there are unnecessary terms -k_x in here, creating a cyclic boundary condition (mass that is diffusing through the boundary described by line x = L reappears from the boundary on the other side, x = 0. The correct algorithm for assigning the nonzero elements A_{ij} of the matrix A is

1. A_{ij} = 1 + 2k_x + 2k_y , when i = j
2. A_{ij} = -k_x , when j=i-1 AND i\neq 1 (modulo N_x)
3. A_{ij} = -k_x , when j=i+1 AND i\neq 0 (modulo N_x)
4. A_{ij} = -k_y , when j=i+N_x OR j=i-N_x

when you want to get a boundary condition which ensures that anything that diffuses through the boundaries is lost forever. Then the matrix-vector equation in the case of 3 \times 3 lattice is


Unlike the linear system in the 1D diffusion time stepping, this is not a tridiagonal problem and consequently is slower to solve. An R-Code that produces a series of images of the diffusion process for a Gaussian concentration distribution in a 15 \times 15 discrete lattice is given below.

library(graphics) #load the graphics library needed for plotting

lx <- 6.0 #length of the computational domain in x-direction
ly <- 6.0 #length of the computational domain in y-direction
lt <- 6 #length of the simulation time interval
nx <- 15 #number of discrete lattice points in x-direction
ny <- 15 #number of discrete lattice points in y-direction
nt <- 180 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell in x-direction
dy <- ly/ny #length of one discrete lattice cell in y-direction
dt <- lt/nt #length of timestep

D <- 1.0 #diffusion constant

C = c(1:(nx*ny))
Cu = c(1:(nx*ny))
Conc2d = matrix(nrow=ny,ncol=nx)
xaxis <- c(0:(nx-1))*dx #the x values corresponding to the discrete lattice points
yaxis <- c(0:(ny-1))*dy #the y values corresponding to the discrete lattice points

kappax <- D*dt/(dx*dx) #a parameter needed in the discretization
kappay <- D*dt/(dy*dy) #a parameter needed in the discretization

A = matrix(nrow=(nx*ny),ncol=(nx*ny))

for(i in c(1:(nx*ny))) {
for(j in c(1:(nx*ny))) {
A[i,j] <- 0
if(i==j) A[i,j] <- 1+2*kappax+2*kappay
if(j==i+1 && (i%%nx != 0)) A[i,j] <- -kappax
if(j==i-1 && (i%%nx != 1)) A[i,j] <- -kappax
if(j==i+nx) A[i,j] <- -kappay
if(j==i-nx) A[i,j] <- -kappay

for (i in c(1:ny)) {
for (j in c(1:nx)) {
Conc2d[i,j] <- exp(-2*(i*dy-3)*(i*dy-3)-2*(j*dx-3)*(j*dx-3)) #2D Gaussian initial concentration distribution

for (j in c(1:nt)) { #main time stepping loop

for(k in c(1:ny)) {
for(l in c(1:nx)) {
C[(k-1)*nx+l] <- Conc2d[k,l]

Cu <- solve(A,C)

for(k in c(1:ny)) {
for(l in c(1:nx)) {
Conc2d[k,l] <- Cu[(k-1)*nx+l]

k <- 0
l <- 0

if(j %% 3 == 1) { #make plots of C(x,y) on every third timestep
jpeg(file = paste("plot_",j,".jpg",sep=""))
title(paste("C(x) at t =",j*dt))

Note that the function C(x,y) approaches a constant zero function as t increases, try modifying the code yourself to make a boundary condition that doesn’t let anything diffuse through the boundaries!


Why does quantum mechanics need an infinite dimensional vector space?

When undergraduate physics students are trying to learn quantum mechanics, one of the most difficult obstacles is that abstract infinite dimensional vector spaces are needed, which makes it impossible for a person to mentally “see” the geometry in their minds in the way a three-dimensional space can be imagined. The fact that the coordinates of the points in that vector space are complex numbers certainly doesn’t make it any easier. When I was studying that myself in the early undergraduate studies, it took me over a year to get a hold of the concept that any self-adjoint operator acting on vectors in the quantum state space has a set of eigenvectors that form a basis of that space.

Sometimes, however, a quantum mechanics problem can be handled by approximating the state space with a finite-dimensional vector space. In fact, several physics Nobel prizes have been awarded for the study of two-state quantum systems. One example of a situation, where a quantum system is effectively a two-state system, is something where the ground state (n=0) and the first excited state (n=1) have energies that are close to each other, and the energy gap between the first and second (n = 2) excited states is unusually wide. Then, if the system is at low temperature, the Boltzmann factors of the lowest-lying two states,




are much larger numbers that the Boltzmann factors of any other states, and it is unlikely that a measurement of total energy will give a result other than one of the two lowest energy eigenvalues.

How, then, does one know that a quantum state space actually has to be infinite-dimensional? I once saw, in a mathematical physics textbook, a practice problem where this had to be proved, and I was proud for almost immediately finding the correct answer.

A cornerstone of nonrelativistic quantum mechanics is the position-momentum commutation relation


Now, if the quantum state space were finite-dimensional with dimension N, then these operators \hat{x} and \hat{p} would be N\times N square matrices, and the constant i\hbar on the RHS would be a diagonal matrix where all the diagonal elements have value i\hbar

Actually, this equation can’t hold in the finite-dimensional case. To see this, consider the calculation rules for taking the trace of a matrix sum or a matrix product:



Now, if you take the trace of both sides of the commutation relation equation, and use these rules, you will get the result


Which clearly isn’t possible. So, we have done a false assumption somewhere in the derivation of the contradictory equation above, and the false assumption must be the finite-dimensionality of the space where x and p are defined. An infinite dimensional operator doesn’t always have a trace at all, and this is the case with the operator xp-px.

An interested reader may try to form finite-dimensional matrix representations for the raising and lowering operators of a quantum harmonic oscillator, use those those to form matrices for operators \hat{x} and \hat{p}, and see what happens when you calculate their commutator.

Numerical solution of PDE:s, Part 2: Implicit method

In the previous blog post, I showed how to solve the diffusion equation


using the explicit method, where the equation is converted to a discrete one


which can be simplified by using the notation


The problem with this approach is that if we want to have a high resolution, i.e. \Delta x is very small, the timestep \Delta t has to be made even much smaller to keep the procedure numerically stable. The stability of this kind of calculations can be investigated with Von Neumann stability analysis, and there you will find out that the instability acts by amplifying the short-wavelength Fourier components of the numerical discretization error.

The specific stability condition for the explicit solution of diffusion equation is


In the better method to solve the diffusion equation, the implicit method, we will not solve the numbers f_{i;j+1} in a straightforward way from the numbers f_{i;j}. Instead, we use backward time stepping to write the f_{i;j} as a function of the numbers f_{i-1;j+1}, f_{i;j+1} and f_{i+1;j+1}, as in the equation below:


which represents a linear system of equations. More specifically, it is a tridiagonal system, and in matrix-vector form it reads


for the case of a relatively small mesh n_x = 7. So, now we have to solve this tridiagonal system on each timestep, and this take more computation time than the explicit method but has the advantage of making the calculation stable even when \Delta t is not necessarily made much smaller than \Delta x.

A code for solving this kind of a linear system can be found from the book “Numerical Recipes for C” or from the corresponding book written for FORTRAN.

Now, let’s use the implicit method to solve a diffusion problem where the x-domain is

x \in [0,6],

and the step sizes are

\Delta x = 0.01 and \Delta t = 0.05 .

The initial concentration profile is chosen to be


(don’t be confused by the fact that the function is now called C instead of f) and we use a boundary condition that forces the value of C(x,t) at the left endpoint to be C(0,t)=1 and that at the right endpoint to be C(6,t) = 0. This means that the left boundary is an infinite source of concentration (or heat in the case of conduction problem) and the right boundary is an infinite “sink”. With physical intuition, it is easy to deduce that this kind of system evolves toward a steady state where the value of C(x) decreases linearly from 1 to 0 on the interval [0,6]

A C++ code for doing this calculation is shown below.

// This program calculates the time development of a diffusion or heat conduction
// system with implicit finite differencing. The system described is the development of a concentration or temperature field
// between boundary points where it is constrained to stay at different constant values.
// Teemu Isojärvi, Feb 2017


using namespace std;

#define LX 6. // Length of spatial domain
#define NX 600 // Number of lattice points
#define LT 3. // Length of time interval
#define NT 60 // Number of timesteps

#define D 1. // Diffusion coefficient

int main(void)

double dx = (double)LX/(double)NX; // Lattice spacing
double dt = (double)LT/(double)NT; // Length of timestep

double c[NX]; // Concentration values at lattice points

double x; // Auxiliary position variable

double kappa = D*dt/(dx*dx); // Auxiliary variable for representing the linear system

double g[NX];
double b;
double q;

double u[NX]; // Vector for storing the solution on each timestep

for(int m = 0; m<NX; m++)
x = (double)m*dx;
c[m]=exp(-3*x*x); // Gaussian initial concentration

for(int n = 0; n<NT; n++)

q = 2*kappa + 1;

for(int m = 1; m < NX; m++) { // First loop for solving the tridiagonal system g[m]=-kappa/q; q = (2*kappa + 1) + kappa*g[m]; u[m]=(c[m]+kappa*u[m-1])/q; } for(int m=(NX-2); m>=0; m--) u[m] -= g[m+1]*u[m+1]; // Second loop

for(int m=0; m<NX; m++) c[m] = u[m]; // Updating the concentration or temperature field


for(int m = 0; m<NX; m++)
x = (double)m*dx;
cout << x << " " << c[m] << "\n"; // Output with the results at time t = LT

return 0;

Running this program three times, with domain lenght parameters L=0.5, L=1.0 and L=3.0 and keeping the time step constant, we get data points that can be plotted in the same coordinate system with a graphing program, like below:


Figure 1. Time evolution of a concentration field C(x,t) in a system where the concentration is forced to stay at constant values at the endpoints of the domain.

The simulation seems to proceed as expected, approaching a linearly decreasing function C(x)

An equivalent code for FORTRAN is in the next box:

! Calculates the time development of a concentration distribution C(x,t) with implicit
! finite-differencing of a diffusion equation. The boundary condition is that the left boundary is an infinite source
! of solute/heat and the value of C(x) at x=0 stays at constant value 1. The value of C at the right boundary stays zero.
! Therefore, the function C(x) evolves towards a linearly decreasing function.
! Teemu Isojärvi, Feb 2017


real :: DX, DT, LX, LT, D, KAPPA,B,Q ! Real variables for discretization and the diffusion constant
integer :: NT,NX ! Number of time and position steps

REAL :: C(600) ! Concentration field as an array of data points, dimension same as value of NX
REAL :: G(600)
REAL :: U(600)

REAL :: X ! Auxiliary variable

INTEGER :: M ! Integer looping variables

LX = 6. ! Length of x-interval
LT = 3.0 ! Length of t-interval
NX = 600 ! Number of lattice points
NT = 300 ! Number of time steps
DX = LX/NX ! Distance between neighboring lattice points
DT = LT/NT ! Length of time step
D = 1. ! Diffusion/heat conduction coefficient

do M = 1, NX ! Initial values of concentration
X = M*DX
C(M) = EXP(-3*X*X) ! Gaussian initial concentration distribution
end do

do N = 1, NT ! Time stepping loop


U(1)=(C(1) + 2 * KAPPA)/(2 * KAPPA + 1)
Q = 2 * KAPPA + 1

do M = 2, NX ! First loop for solving the tridiagonal system
Q = (2 * KAPPA + 1) + KAPPA * G(M)
U(M) = (C(M) + KAPPA * U(M-1))/Q
end do

do M = (NX-1), 1
U(M) = U(M) - G(M+1) * U(M+1) ! Second loop
end do

do M = 1, NX-1
C(M) = U(M) ! Updating the concentration or temperature field
end do

end do

do M = 1, NX
X = M*DX
print *,X,C(M) ! Print the x and concentration values at data points
end do


To test the implicit method with R-Code, let’s solve a problem where the length of the x-domain is 20, the time interval has length 3 and the initial distribution is


to ensure that we can set the boundary conditions C(0,t)=C(L,t)=0 (the Gaussian distribution doesn’t have time to spread all the way to the boundaries in a time interval of 3 units). The code for this calculation is shown below:

library(graphics) #load the graphics library needed for plotting

lx <- 20.0 #length of the computational domain
lt <- 3. #length of the simulation time interval
nx <- 4000 #number of discrete lattice points
nt <- 300 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

D <- 1.0 #diffusion constant

Conc <- c(1:nx) #define a vector to put the x- and t-dependent concentration values in

xaxis <- c(1:nx) #the x values corresponding to the discrete lattice points

kappa <- D*dt/(dx*dx) #a parameter needed in the discretization

offdiagonal <- rep(-kappa, times = nx-1)
ondiagonal <- rep(1+2*kappa, times = nx)

for (i in c(1:nx)) {
Conc[i] <- exp(-2*(i*dx-10)*(i*dx-10)) #a Gaussian initial concentration field

xaxis[i] <- i*dx #calculate the x coordinates of lattice points

for (j in c(1:nt)) { #main time stepping loop

sol <- Solve.tridiag(offdiagonal, ondiagonal, offdiagonal, Conc)

for (k in c(1:nx)) {
Conc[k] <- sol[k]

if(j %% 3 == 1) { #make plots of C(x) on every third timestep
jpeg(file = paste("plot_",j,".jpg",sep=""))
plot(xaxis,Conc,xlab="position (x)", ylab="concentration",ylim=c(0,2))
title(paste("C(x) at t =",j*dt))

An animation of the results can be viewed in this link. If you test the code yourself, remember to install the limSolve package first, by writing the command


in the R console. If you don’t want to load the video, here are the plots for 3 different values of t:


When solving PDE:s other than the ordinary diffusion equation, the implicit method is often even more necessary that it was in the examples here. For example, when solving the time-development of a quantum wave packet from the time-dependent Schroedinger equation, the solution function doesn’t stay normalized if one tries to do a simple explicit time stepping. The solution of Schroedinger equations with implicit method will be shown in the next post. The TDSE is a complex diffusion equation of the form


where the equation has been simplified by setting the Planck constant to value 2\pi and the particle mass to m=1.

The 1D diffusion and Schroedinger equations are simple in the sense that the linear system to be solved on each timestep is tridiagonal, with makes the solution considerably faster than is the case with a general linear system. Systems that are not tridiagonal will appear when one wants to solve equations with more than one space coordinate (i.e. x and y instead of just x), or when the equation contains higher than 2nd order derivatives with respect to the position coordinate (this can be demonstrated with the thin-film equation, which is fourth-order with respect to x).


Linear algebra for balancing chemical reaction equations

The balancing of chemical reaction equations is familiar for many, either from high school chemistry or introductory university chemistry. A simple example of such a problem is the reaction equation for the combustion of methane in oxygen:

aCH_4 + bO_2 \rightarrow cCO_2 + dH_2 O,

for which we are supposed to find coefficients a,b,c and d that make the reaction possible in terms of conservation of the number of atoms of any element that is present. In this case, it is rather easy to see that the number of CO_2 molecules must be equal to the number of CH_4 molecules, and that the number of water molecules must be twice that. Therefore the possible equations are

CH_4 + 2O_2 \rightarrow CO_2 + 2H_2 O , or

2CH_4 + 4O_2 \rightarrow 2CO_2 + 4H_2 O, or

3CH_4 + 6O_2 \rightarrow 3CO_2 + 6H_2 O ,

or any other “multiple” of the first equation (obtained by multiplying all the coefficients in the equation with the same number).

Usually this kind of balancing is done mainly by intuition and trying to guess the right coefficients, but this can be difficult if the reactants and product contain many different elements, each of which are present in more than one kind of molecule. One example of a more complicated (but still quite simple) reaction is the reaction of chlorine dioxide (ClO_2) with alkaline water to form chlorate (ClO_3^-), chlorite (ClO_2^-) and water:

aClO_2 + bOH^- \rightarrow cClO_2^- + dClO_3^- + eH_2 O .

Now, an easily applicable way to solve a set of acceptable coefficients a,b,c,d and e is to explicitly write the balancing condition equations for each element that appears in the equation. These balancing conditions are:

a = c + d                                 (balancing of chlorine)
2a + b = 2c + 3d + e             (balancing of oxygen)
b = 2e                                   (balancing of hydrogen)
b = c + d                                (balancing of electric charge)

This can be solved with many kinds of mathematical software, either free (there’s lots of free math programs especially for Linux) or commercial (like Wolfram Mathematica).

But there’s one another thing. Because the number of unknowns in the system of equations above is larger than the number of equations by one, we need to decide the value of one of the coefficients a, \dots , b arbitrarily. So let’s say that a = 1. Now the set of equations is:

a = 1                                   (arbitrary choice of one coeffiecient)
a = c + d                            (balancing of chlorine)
2a + b = 2c + 3d + e          (balancing of oxygen)
b = 2e                              (balancing of hydrogen)
b = c + d                           (balancing of electric charge)

Using the column vector


the system of equations can be written in matrix form


This can be solved in Mathematica by writing the command

Solve[{a == 1, a == c + d, 2a + b == 2*c + 3*d + e, b == 2e, b == c + d}]

and the result is

a = 2,

b = 2,

c = 1,

d = 1,

e = 1,
which leads to the reaction equation

2ClO_2 + 2OH^- \rightarrow ClO_2^- + ClO_3^- + H_2 O

which doesn’t even need to be multiplied with any constant to make all the coefficients integers (usually that is necessary).
IMAGE SOURCE: The feature image of an Erlenmeyer flask on this blog post is a public domain image from Wikipedia, address .

Square roots of matrices and operators

Everyone knows the definition of the square root of a non-negative number: if x > 0, then the square root of x is a nonnegative number y for which we have y^2 = x, and it is denoted y = \sqrt{x}. We could also call the number -\sqrt{x} a square root of x, but by convention the positive solution of the equation y^2 = x is called “The” square root.

So, what if we have an algebraic system that is not the set of positive real numbers, but where there exists a defined multiplication operation “\diamond “? If a is an element of such a set, then a square root of a would obviously be an element b for which b\diamond b = a (\diamond is here the multiplication operation). The set of complex numbers is obviously such a set, but is there any other example?

Consider the set of 2 \times 2 matrices, the elements of which are denoted


where a,b,c and d are real or complex numbers. So, could we define the square root of this kind of matrices in the same way as it is done with the real numbers, using the normal matrix multiplication as the operation “\diamond “? Non-commutativity shouldn’t be a problem here, because any algebraic object commutes with itself.

Let’s use the identity matrix as an example:


What’s its square root? Really quickly one finds out that there are as many as four matrices that can be seen as the square root of I:


and it is not at all obvious which one of these should be called “The Square Root of I“. Because of this, the concept of square root is not often used when working with matrices.

Sometimes, a square root can also be found for operators that act on a set of functions, like the second derivative operator \frac{d^2}{dx^2}, for which the operators \frac{d}{dx} and -\frac{d}{dx} are obvious “square root candidates”. The multiplication of operators doesn’t have to be commutative, so they are a bit like matrices, but they are usually not representable by finite-dimensional matrices unless the domain (the set of functions that the operator acts on) is artificially made very limited.

An example of the operator square root problem that actually occurs in the analysis of a real physical system is the Klein-Gordon equation, which was the first attempt to form a relativistic version of the time-dependent Schroedinger equation of quantum mechanics. The time dependent Schroedinger equation of a free particle (particle that is not acted on by any forces) is:


where \Psi is a function of both time t and a space coordinate x (for a simple one-dimensional situation, that is). This can be written shortly as


where the free-particle Hamiltonian operator


is obtained from the classical expression of kinetic energy: E_k = p^2 /2m by replacing the classical one-dimensional momentum p with the corresponding quantum momentum operator


The problem with the normal TDSE is that the time and space coordinates are treated differently in it, because it is first order with respect to time and second order with respect to position. This is not acceptable if we want to satisfy the principle of special relativity.

A naive attempt to form a relativistic Schroedinger equation is to take the formula of the energy of a moving body in special relativity


and replace the momentum p with the quantum momentum operator \hat{p}. Then we would have


but the problem with this is how to define the square root of something that contains a derivative operator. One way to do this would be to use the Taylor series expansion of the square root:


equating the x here with the derivative operator, but this is not acceptable, because an equation that has derivatives of all positive integral orders up to infinity can be seen as equivalent to a difference equation like


which is non-local, and in an equation of motion like the Klein-Gordon equation this would mean that there can be immediate (faster than speed of light) interactions between objects. (if you want, try to write the difference equation above as an infinite-order differential equation by using Taylor expansion).

Because of the non-locality problem, the actual conventional Klein-Gordon equation is formed by making the square energy operator \hat{H}^2 using the formula of relativistic momentum, and equating \hat{H}^2 \Psi  with -\hbar^2 \frac{\partial^2 \Psi}{\partial t^2} . The result is


This, however, causes other problems because the equation here allows particles to have a negative value of kinetic energy, which is hard to interpret physically. These problems are solved in quantum field theory (QFT), where the number of particles that are present in the system can be uncertain, similarly to how the position of a single quantum particle described by the Schroedinger equation can have some expectation value and nonzero standard deviation.