Numerical solution of PDE:s, Part 2: Implicit method

In the previous blog post, I showed how to solve the diffusion equation


using the explicit method, where the equation is converted to a discrete one


which can be simplified by using the notation


The problem with this approach is that if we want to have a high resolution, i.e. \Delta x is very small, the timestep \Delta t has to be made even much smaller to keep the procedure numerically stable. The stability of this kind of calculations can be investigated with Von Neumann stability analysis, and there you will find out that the instability acts by amplifying the short-wavelength Fourier components of the numerical discretization error.

The specific stability condition for the explicit solution of diffusion equation is


In the better method to solve the diffusion equation, the implicit method, we will not solve the numbers f_{i;j+1} in a straightforward way from the numbers f_{i;j}. Instead, we use backward time stepping to write the f_{i;j} as a function of the numbers f_{i-1;j+1}, f_{i;j+1} and f_{i+1;j+1}, as in the equation below:


which represents a linear system of equations. More specifically, it is a tridiagonal system, and in matrix-vector form it reads


for the case of a relatively small mesh n_x = 7. So, now we have to solve this tridiagonal system on each timestep, and this take more computation time than the explicit method but has the advantage of making the calculation stable even when \Delta t is not necessarily made much smaller than \Delta x.

A code for solving this kind of a linear system can be found from the book “Numerical Recipes for C” or from the corresponding book written for FORTRAN.

Now, let’s use the implicit method to solve a diffusion problem where the x-domain is

x \in [0,6],

and the step sizes are

\Delta x = 0.01 and \Delta t = 0.05 .

The initial concentration profile is chosen to be


(don’t be confused by the fact that the function is now called C instead of f) and we use a boundary condition that forces the value of C(x,t) at the left endpoint to be C(0,t)=1 and that at the right endpoint to be C(6,t) = 0. This means that the left boundary is an infinite source of concentration (or heat in the case of conduction problem) and the right boundary is an infinite “sink”. With physical intuition, it is easy to deduce that this kind of system evolves toward a steady state where the value of C(x) decreases linearly from 1 to 0 on the interval [0,6]

A C++ code for doing this calculation is shown below.

// This program calculates the time development of a diffusion or heat conduction
// system with implicit finite differencing. The system described is the development of a concentration or temperature field
// between boundary points where it is constrained to stay at different constant values.
// Teemu Isojärvi, Feb 2017


using namespace std;

#define LX 6. // Length of spatial domain
#define NX 600 // Number of lattice points
#define LT 3. // Length of time interval
#define NT 60 // Number of timesteps

#define D 1. // Diffusion coefficient

int main(void)

double dx = (double)LX/(double)NX; // Lattice spacing
double dt = (double)LT/(double)NT; // Length of timestep

double c[NX]; // Concentration values at lattice points

double x; // Auxiliary position variable

double kappa = D*dt/(dx*dx); // Auxiliary variable for representing the linear system

double g[NX];
double b;
double q;

double u[NX]; // Vector for storing the solution on each timestep

for(int m = 0; m<NX; m++)
x = (double)m*dx;
c[m]=exp(-3*x*x); // Gaussian initial concentration

for(int n = 0; n<NT; n++)

q = 2*kappa + 1;

for(int m = 1; m < NX; m++) { // First loop for solving the tridiagonal system g[m]=-kappa/q; q = (2*kappa + 1) + kappa*g[m]; u[m]=(c[m]+kappa*u[m-1])/q; } for(int m=(NX-2); m>=0; m--) u[m] -= g[m+1]*u[m+1]; // Second loop

for(int m=0; m<NX; m++) c[m] = u[m]; // Updating the concentration or temperature field


for(int m = 0; m<NX; m++)
x = (double)m*dx;
cout << x << " " << c[m] << "\n"; // Output with the results at time t = LT

return 0;

Running this program three times, with domain lenght parameters L=0.5, L=1.0 and L=3.0 and keeping the time step constant, we get data points that can be plotted in the same coordinate system with a graphing program, like below:


Figure 1. Time evolution of a concentration field C(x,t) in a system where the concentration is forced to stay at constant values at the endpoints of the domain.

The simulation seems to proceed as expected, approaching a linearly decreasing function C(x)

An equivalent code for FORTRAN is in the next box:

! Calculates the time development of a concentration distribution C(x,t) with implicit
! finite-differencing of a diffusion equation. The boundary condition is that the left boundary is an infinite source
! of solute/heat and the value of C(x) at x=0 stays at constant value 1. The value of C at the right boundary stays zero.
! Therefore, the function C(x) evolves towards a linearly decreasing function.
! Teemu Isojärvi, Feb 2017


real :: DX, DT, LX, LT, D, KAPPA,B,Q ! Real variables for discretization and the diffusion constant
integer :: NT,NX ! Number of time and position steps

REAL :: C(600) ! Concentration field as an array of data points, dimension same as value of NX
REAL :: G(600)
REAL :: U(600)

REAL :: X ! Auxiliary variable

INTEGER :: M ! Integer looping variables

LX = 6. ! Length of x-interval
LT = 3.0 ! Length of t-interval
NX = 600 ! Number of lattice points
NT = 300 ! Number of time steps
DX = LX/NX ! Distance between neighboring lattice points
DT = LT/NT ! Length of time step
D = 1. ! Diffusion/heat conduction coefficient

do M = 1, NX ! Initial values of concentration
X = M*DX
C(M) = EXP(-3*X*X) ! Gaussian initial concentration distribution
end do

do N = 1, NT ! Time stepping loop


U(1)=(C(1) + 2 * KAPPA)/(2 * KAPPA + 1)
Q = 2 * KAPPA + 1

do M = 2, NX ! First loop for solving the tridiagonal system
Q = (2 * KAPPA + 1) + KAPPA * G(M)
U(M) = (C(M) + KAPPA * U(M-1))/Q
end do

do M = (NX-1), 1
U(M) = U(M) - G(M+1) * U(M+1) ! Second loop
end do

do M = 1, NX-1
C(M) = U(M) ! Updating the concentration or temperature field
end do

end do

do M = 1, NX
X = M*DX
print *,X,C(M) ! Print the x and concentration values at data points
end do


To test the implicit method with R-Code, let’s solve a problem where the length of the x-domain is 20, the time interval has length 3 and the initial distribution is


to ensure that we can set the boundary conditions C(0,t)=C(L,t)=0 (the Gaussian distribution doesn’t have time to spread all the way to the boundaries in a time interval of 3 units). The code for this calculation is shown below:

library(graphics) #load the graphics library needed for plotting

lx <- 20.0 #length of the computational domain
lt <- 3. #length of the simulation time interval
nx <- 4000 #number of discrete lattice points
nt <- 300 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

D <- 1.0 #diffusion constant

Conc <- c(1:nx) #define a vector to put the x- and t-dependent concentration values in

xaxis <- c(1:nx) #the x values corresponding to the discrete lattice points

kappa <- D*dt/(dx*dx) #a parameter needed in the discretization

offdiagonal <- rep(-kappa, times = nx-1)
ondiagonal <- rep(1+2*kappa, times = nx)

for (i in c(1:nx)) {
Conc[i] <- exp(-2*(i*dx-10)*(i*dx-10)) #a Gaussian initial concentration field

xaxis[i] <- i*dx #calculate the x coordinates of lattice points

for (j in c(1:nt)) { #main time stepping loop

sol <- Solve.tridiag(offdiagonal, ondiagonal, offdiagonal, Conc)

for (k in c(1:nx)) {
Conc[k] <- sol[k]

if(j %% 3 == 1) { #make plots of C(x) on every third timestep
jpeg(file = paste("plot_",j,".jpg",sep=""))
plot(xaxis,Conc,xlab="position (x)", ylab="concentration",ylim=c(0,2))
title(paste("C(x) at t =",j*dt))

An animation of the results can be viewed in this link. If you test the code yourself, remember to install the limSolve package first, by writing the command


in the R console. If you don’t want to load the video, here are the plots for 3 different values of t:


When solving PDE:s other than the ordinary diffusion equation, the implicit method is often even more necessary that it was in the examples here. For example, when solving the time-development of a quantum wave packet from the time-dependent Schroedinger equation, the solution function doesn’t stay normalized if one tries to do a simple explicit time stepping. The solution of Schroedinger equations with implicit method will be shown in the next post. The TDSE is a complex diffusion equation of the form


where the equation has been simplified by setting the Planck constant to value 2\pi and the particle mass to m=1.

The 1D diffusion and Schroedinger equations are simple in the sense that the linear system to be solved on each timestep is tridiagonal, with makes the solution considerably faster than is the case with a general linear system. Systems that are not tridiagonal will appear when one wants to solve equations with more than one space coordinate (i.e. x and y instead of just x), or when the equation contains higher than 2nd order derivatives with respect to the position coordinate (this can be demonstrated with the thin-film equation, which is fourth-order with respect to x).


Square roots of matrices and operators

Everyone knows the definition of the square root of a non-negative number: if x > 0, then the square root of x is a nonnegative number y for which we have y^2 = x, and it is denoted y = \sqrt{x}. We could also call the number -\sqrt{x} a square root of x, but by convention the positive solution of the equation y^2 = x is called “The” square root.

So, what if we have an algebraic system that is not the set of positive real numbers, but where there exists a defined multiplication operation “\diamond “? If a is an element of such a set, then a square root of a would obviously be an element b for which b\diamond b = a (\diamond is here the multiplication operation). The set of complex numbers is obviously such a set, but is there any other example?

Consider the set of 2 \times 2 matrices, the elements of which are denoted


where a,b,c and d are real or complex numbers. So, could we define the square root of this kind of matrices in the same way as it is done with the real numbers, using the normal matrix multiplication as the operation “\diamond “? Non-commutativity shouldn’t be a problem here, because any algebraic object commutes with itself.

Let’s use the identity matrix as an example:


What’s its square root? Really quickly one finds out that there are as many as four matrices that can be seen as the square root of I:


and it is not at all obvious which one of these should be called “The Square Root of I“. Because of this, the concept of square root is not often used when working with matrices.

Sometimes, a square root can also be found for operators that act on a set of functions, like the second derivative operator \frac{d^2}{dx^2}, for which the operators \frac{d}{dx} and -\frac{d}{dx} are obvious “square root candidates”. The multiplication of operators doesn’t have to be commutative, so they are a bit like matrices, but they are usually not representable by finite-dimensional matrices unless the domain (the set of functions that the operator acts on) is artificially made very limited.

An example of the operator square root problem that actually occurs in the analysis of a real physical system is the Klein-Gordon equation, which was the first attempt to form a relativistic version of the time-dependent Schroedinger equation of quantum mechanics. The time dependent Schroedinger equation of a free particle (particle that is not acted on by any forces) is:


where \Psi is a function of both time t and a space coordinate x (for a simple one-dimensional situation, that is). This can be written shortly as


where the free-particle Hamiltonian operator


is obtained from the classical expression of kinetic energy: E_k = p^2 /2m by replacing the classical one-dimensional momentum p with the corresponding quantum momentum operator


The problem with the normal TDSE is that the time and space coordinates are treated differently in it, because it is first order with respect to time and second order with respect to position. This is not acceptable if we want to satisfy the principle of special relativity.

A naive attempt to form a relativistic Schroedinger equation is to take the formula of the energy of a moving body in special relativity


and replace the momentum p with the quantum momentum operator \hat{p}. Then we would have


but the problem with this is how to define the square root of something that contains a derivative operator. One way to do this would be to use the Taylor series expansion of the square root:


equating the x here with the derivative operator, but this is not acceptable, because an equation that has derivatives of all positive integral orders up to infinity can be seen as equivalent to a difference equation like


which is non-local, and in an equation of motion like the Klein-Gordon equation this would mean that there can be immediate (faster than speed of light) interactions between objects. (if you want, try to write the difference equation above as an infinite-order differential equation by using Taylor expansion).

Because of the non-locality problem, the actual conventional Klein-Gordon equation is formed by making the square energy operator \hat{H}^2 using the formula of relativistic momentum, and equating \hat{H}^2 \Psi  with -\hbar^2 \frac{\partial^2 \Psi}{\partial t^2} . The result is


This, however, causes other problems because the equation here allows particles to have a negative value of kinetic energy, which is hard to interpret physically. These problems are solved in quantum field theory (QFT), where the number of particles that are present in the system can be uncertain, similarly to how the position of a single quantum particle described by the Schroedinger equation can have some expectation value and nonzero standard deviation.