## Numerical solution of PDE:s, Part 10: The thin-film equation

Earlier, I showed how to solve the 1D and 2D versions of the complex Ginzburg-Landau equation, which is an example of a nonlinear partial differential equation, and which had to be linearized for solution with implicit differencing, meaning that the matrix in the linear system was different on each timestep.

Another nonlinear PDE is the so-called thin film equation, which in 2D form reads

Here the function $h(x,y,t)$ describes the local thickness of a film of viscous liquid located on top of a solid surface described by the xy-plane. The parameter $\gamma$ is the surface tension of the liquid-gas interface and $\mu$ is the viscosity of the liquid.

An unusual thing about this equation is that it’s fourth order in the spatial coordinates, while most equations in physics are second order DE:s. Some equations of continuum mechanics describing elastic bending of cylinders and plates are also fourth order, so this is not the only example.

In many cases, a corresponding equation with only one spatial coordinate is enough for describing thin film physics, and then the graph of the solution can be though of as depicting an intersection of the film at a single value of y-coordinate.

When discretizing this equation, we must note that the factor $h^3$ has to be treated explicitly to get a linear system of equations, just like what had to be done with the $|A|^2$ in the CGLE. Also, we will set $\gamma/(3\mu ) = 1$ to make the equation dimensionless. One correct way to discretize this equation leads to the system

where the two index object $\alpha_{j}^{i}$ is

Note that now the linear system is not tridiagonal, but heptadiagonal due to the higher order derivatives. Solution of the equation with this differencing scheme and a Gaussian initial condition $h(x,0)$ is done with the following R language code.

library(graphics) #load the graphics library needed for plotting

lx <- 10 #length of the computational domain
lt <- 10 #length of the simulation time interval
nx <- 150 #number of discrete lattice points
nt <- 150 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

psi = c(1:nx) #array for the function h values
sol = c(1:nx)

kappa = dt/(4*dx*dx*dx*dx)

for(j in c(1:nx)) {
psi[j] = exp(-(j*dx-5)*(j*dx-5))
sol[j] = psi[j]
}

xaxis <- c(1:nx)*dx #the x values corresponding to the discrete lattice points
alpha <- c(1:nx)

A = matrix(nrow=nx,ncol=nx) #matrix for forward time evolution

for (m in c(1:nt)) { #main time stepping loop

for(j in c(1:nx)) {
alpha[j] = kappa*psi[j]*psi[j]*psi[j]
}

for(j in c(1:nx)) {
for(k in c(1:nx)) {
A[j,k]=0
if(j==k) {
if(j!=nx && j!=1) {
A[j,k] = 1 + 2*alpha[j+1] + 2*alpha[j-1] #diagonal elements
}
if(j==1) {
A[j,k] = 1 + 2*alpha[j+1]
}
if(j==nx) {
A[j,k] = 1 + 2*alpha[j-1]
}
}

if(j==k+1 && j!=1) {
A[j,k] = -alpha[j-1] #off-diagonal elements
}

if(j==k-1 && j!=nx) {
A[j,k] = -alpha[j+1]
}

if(j==k+2 && j<nx) {
A[j,k] = -2*alpha[j+1]
}

if(j==k-2 && j>1) {
A[j,k] = -2*alpha[j-1]
}

if(j==k+3 && j<nx) {
A[j,k] = alpha[j+1]
}

if(j==k-3 && j>1) {
A[j,k] = alpha[j-1]
}

}
}

for(l in c(1:nx)) {
psi[l] = sol[l]
}
sol <- solve(A,psi) #solve the system of equations

jpeg(file = paste("plot_",m,".jpg",sep=""))
plot(xaxis,sol,xlab = "position (x)",ylab="h(x,t)",ylim=c(0,1),pch='.')
title(paste("h(x,t) at t = ",round(m*dt,digits=2)))
lines(xaxis,sol)
dev.off()

}

dev.off()

An animation of the solution looks like this.

The solutions of this equation have the property that the graph settles into the shape of a downward opening parabola when time proceeds. A problem with this is that the contact angle, in which the liquid surface approaches the solid surface (x-axis) at the final equilibrium, can be anything between 0 and 90 degrees depending on the relative width and height of the initial Gaussian. In a real liquid-solid system, the final contact angle depends on the surface tensions of both the liquid-solid and the liquid-gas interfaces, as described by the Young equation.

To create an equation that can model equilibrium contact angles appropriately, we add a disjoining pressure term $\Pi (h)$ in the equation, as here:

One form of the $\Pi$-term that works is

where the $h_*$ is a precursor film thickness.

The idea behind the precursor film is that even when we have a liquid drop or puddle surrounded by apparently dry solid surface, there is actually a very thin adsorbed layer of liquid molecules of that dry area (the molecules get there by evaporating from the liquid surface and reattaching on the solid). So, in a simulation where we include the disjoining pressure, we need to use an initial condition that is a Gaussian with an added constant equal to the precursor thickness:

$h(x,0) = \exp \left[-b(x-x_0 )^2 \right] + h_*$

In here we will set the values n = 5 and m = 2 in the disjoining pressure term, and set the precursor film thickness into the value 0.01. The disjoining pressure term can be treated explicitly at the same time as we use implicit differencing for the rest of the equation – we subtract, on the RHS of the discretized equation a term $D_{j}^i$,

defined by

where

and $\Pi_{j}^i$ is the disjoining pressure evaluated at the discrete points. Note the use of a central finite difference instead of a one-sided difference when calculating the derivatives – doing otherwise is likely to make the simulation crash. The term $D_{j}^{i}$ only affects the right-hand side vector of the linear system $\mathbf{Ax} = \mathbf{b}$ that we solve on each timestep. A code that solves the new equation for the pre-factor value $B=0.1$ is shown next.

library(graphics) #load the graphics library needed for plotting

lx <- 10 #length of the computational domain
lt <- 5 #length of the simulation time interval
nx <- 150 #number of discrete lattice points
nt <- 5000 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

prec = 0.01 #precursor film thickness

psi = c(1:nx) #array for the function h values
sol = c(1:nx)

disjoin1 = c(1:nx)
disjoin2 = c(1:nx)
disjoin3 = c(1:nx)

kappa = dt/(4*dx*dx*dx*dx)

for(j in c(1:nx)) {
psi[j] = exp(-(j*dx-5)*(j*dx-5))+prec
sol[j] = psi[j]
}

xaxis <- c(1:nx)*dx #the x values corresponding to the discrete lattice points
alpha <- c(1:nx)

A = matrix(nrow=nx,ncol=nx) #matrix for forward time evolution

for (m in c(1:nt)) { #main time stepping loop

disjoin1[1] = 0
disjoin1[nx] = 0
for(j in c(1:nx)) {
disjoin1[j] = 0.1*((prec/psi[j])^5 - (prec/psi[j])^2)
}

for(j in c(2:(nx-2))) {
disjoin2[j] = psi[j]*psi[j]*psi[j]*(disjoin1[j+1] - disjoin1[j-1])/(2*dx)
}

for(j in c(2:(nx-2))) {
disjoin3[j] = (disjoin2[j+1]-disjoin2[j-1])/(2*dx)
}

disjoin3[1] = 0
disjoin3[2] = 0
disjoin3[nx-1] = 0
disjoin3[nx] = 0

for(j in c(1:nx)) {
alpha[j] = kappa*psi[j]*psi[j]*psi[j]
}

for(j in c(1:nx)) {
for(k in c(1:nx)) {
A[j,k]=0
if(j==k) {
if(j!=nx && j!=1) {
A[j,k] = 1 + 2*alpha[j+1] + 2*alpha[j-1] #diagonal elements
}
if(j==1) {
A[j,k] = 1 + 2*alpha[j+1]
}
if(j==nx) {
A[j,k] = 1 + 2*alpha[j-1]
}
}

if(j==k+1 && j!=1) {
A[j,k] = -alpha[j-1] #off-diagonal elements
}

if(j==k-1 && j!=nx) {
A[j,k] = -alpha[j+1]
}

if(j==k+2 && j<nx) {
A[j,k] = -2*alpha[j+1]
}

if(j==k-2 && j>1) {
A[j,k] = -2*alpha[j-1]
}

if(j==k+3 && j<nx) {
A[j,k] = alpha[j+1]
}

if(j==k-3 && j>1) {
A[j,k] = alpha[j-1]
}

}
}

for(l in c(1:nx)) {
psi[l] = sol[l]
}
sol <- solve(A,psi-disjoin3) #solve the system of equations
for(l in c((nx-20):nx)) { # remove the risk of boundary effects on the right end of the domain
psi[l]=prec
sol[l]=prec
}

if(m%%10 == 0) {
jpeg(file = paste("plot_",m,".jpg",sep=""))
plot(xaxis,sol,xlab = "position (x)",ylab="h(x,t)",ylim=c(0,1),pch='.')
title(paste("h(x,t) at t = ",round(m*dt,digits=2)))
lines(xaxis,sol)
dev.off()
}

}

In the next animation, the solution curves for pre-factor values $B=0.1$ (red curve) and $B=0.01$ (black curve) are shown in the same graph.

In the animation, it is apparent that a larger value of $B$ leads to a larger contact angle at equilibrium. Actually, it can be shown that the contact angle $\theta$ depends on the values of the parameters as

More information about the thin film equation and its solutions can be found on the Wiki page here, and on the NJIT department of applied mathematics homepage. When the effects of gravitation or surface tension gradients are added in the TFE, many kinds of interesting pattern formation effects can happen just like in our previous example of a nonlinear PDE, the Ginzburg-Landau equation. If the liquid film consists of a mixture of many, possibly volatile liquids, complicated multiphysics problems involving fluid dynamics, evaporation, heat transfer and chemical kinetics, all at the same time, are obtained.

## Numerical solution of PDE:s, Part 9: 2D Ginzburg-Landau equation

In an earlier post, I described the 1-dimensional Ginzburg-Landau equation and showed how it can be linearized and solved with a implicit differencing scheme. The most interesting feature of the solutions was the appearance of seemingly random oscillations. A similar solution method is possible for the 2d version of the equation:

where again $\alpha$ and $\beta$ are real valued constants.

An R language code for solving this with parameter values $\alpha = 0$, $\beta = 1.5$ and an initial state $A(x,0)$ which is a mixture of 2d plane waves is shown below.

library(graphics) #load the graphics library needed for plotting

lx <- 80.0 #length of the computational domain in x-direction
ly <- 80.0 #length of the computational domain in y-direction
lt <- 60.0 #length of the simulation time interval
nx <- 50 #number of discrete lattice points in x-direction
ny <- 50 #number of discrete lattice points in y-direction
nt <- 240 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell in x-direction
dy <- ly/ny #length of one discrete lattice cell in y-direction
dt <- lt/nt #length of timestep

a <- 0
b <- 1.5

kappa1 = dt*(1+1i*a)/dx/dx
kappa2 = dt*(1+1i*b)

C = c(1:(nx*ny))
Cu = c(1:(nx*ny))
A2d = matrix(nrow=ny,ncol=nx)
xaxis <- c(0:(nx-1))*dx #the x values corresponding to the discrete lattice points
yaxis <- c(0:(ny-1))*dy #the y values corresponding to the discrete lattice points

A = matrix(nrow=(nx*ny),ncol=(nx*ny))
IP = matrix(nrow=4*nx,ncol=4*nx)

for (i in c(1:ny)) {
for (j in c(1:nx)) {
A2d[i,j] <- 0.01*exp(1i*5.21*i+1i*10.331*j)+0.01*exp(1i*15.71*i+1i*17.831*j)
}
}

for (k in c(1:nt)) { #main time stepping loop

for(i in c(1:ny)) {
for(j in c(1:nx)) {
C[(i-1)*nx+j] <- A2d[i,j]

}
}

for(i in c(1:(nx*ny))) {
for(j in c(1:(nx*ny))) {
A[i,j] <- 0
if(i==j && j!=1 && j!=nx && i!=1 && i!=ny) A[i,j] <- 1+2*kappa1+kappa2*abs(C[j])*abs(C[j]) - dt
if(i==j && (j==1 || j==nx) && i!=1 && i!=ny) A[i,j] <- 1+2*kappa1+kappa2*abs(C[j])*abs(C[j]) - dt
if(i==j && j!=1 && j!=nx && (i==1 || i==ny)) A[i,j] <- 1+2*kappa1+kappa2*abs(C[j])*abs(C[j]) - dt
if(i==j && (j==1 || j==nx) && (i==1 || i==ny)) A[i,j] <- 1+2*kappa1+kappa2*abs(C[j])*abs(C[j]) - dt
if(j==i+1 && (i%%nx != 0)) A[i,j] <- -kappa1
if(j==i-1 && (i%%nx != 1)) A[i,j] <- -kappa1
if(j==i+nx) A[i,j] <- -kappa1
if(j==i-nx) A[i,j] <- -kappa1
}
}

Cu <- solve(A,C)

for(i in c(1:ny)) {
for(j in c(1:nx)) {
if(i==1) Cu[(i-1)*nx+j]=Cu[i*nx+j]
if(i==ny) Cu[(i-1)*nx+j]=Cu[(i-2)*nx+j]
if(j==1) Cu[(i-1)*nx+j]=Cu[(i-1)*nx+j+1]
if(j==nx) Cu[(i-1)*nx+j]=Cu[(i-1)*nx+j-1]
}
}

for(i in c(1:ny)) {
for(j in c(1:nx)) {
A2d[i,j] <- Cu[(i-1)*nx+j]
}
}

for(l in c(1:(nx-1))) {
for(m in c(1:(nx-1))) { #make a bitmap with 4 times more pixels, using linear interpolation
IP[4*l-3,4*m-3] = A2d[l,m]
IP[4*l-2,4*m-3] = A2d[l,m]+0.25*(A2d[l+1,m]-A2d[l,m])
IP[4*l-1,4*m-3] = A2d[l,m]+0.5*(A2d[l+1,m]-A2d[l,m])
IP[4*l,4*m-3] = A2d[l,m]+0.75*(A2d[l+1,m]-A2d[l,m])
}
}

for(l in c(1:(4*nx))) {
for(m in c(1:(nx-1))) {
IP[l,4*m-2] = IP[l,4*m-3]+0.25*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m-1] = IP[l,4*m-3]+0.5*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m] = IP[l,4*m-3]+0.75*(IP[l,4*m+1]-IP[l,4*m-3])
}
}

#make plots of C(x,y) on every third timestep
jpeg(file = paste("plot_",k,".jpg",sep=""))
image(Re(IP),zlim=c(-3,3))
title(paste("Real part of solution A(x,y,t)",k*dt))
dev.off()

}

The code produces 2d plots of the real part of the solution on each timestep, and in the video shown below they have been combined into an animation.

In the animation we see the appearance of spiral patterns typical for these values of parameters $\alpha,\beta$. Other values of the parameters
produce different kinds of patterns, as is described in this link.

## Numerical solution of PDE:s, Part 8: Complex Ginzburg-Landau Equation

In the previous numerical solution posts, I described linear equations like diffusion equation and the Schrödinger equation, and how they can be solved by (implicit or explicit) finite differencing. The idea of the implicit methods was to convert the equation into a linear system of equations, from which the function values on a discrete mesh could be calculated.

Saying that these equations were linear means that they can be written as

where the linear differential operator, containing space and time derivatives, is acting on the function and producing “something” (usually zero but in the case of source terms/inhomogeneity something nonzero).

As a first example of a nonlinear PDE, let’s consider the complex Ginzburg-Landau equation (CGLE), which reads:

Here the $\alpha$ and $\beta$ are real parameters and $i$ is the imaginary unit. Applying an implicit differencing on this may seem to result in a system of equations

but this is not a linear system because of the $|A_{i}^{j+1}|^2$, so we cannot solve the problem in this way by using linear algebra.

The trick to solve this is to linearize the system, by evaluating the $|A|^2$ at timestep $j$ and the rest of the quantities at timestep $j+1$, producing the system

which is now a linear system w.r.t. to the variables evaluated at timestep $j+1$ (the matrix for solving “$A^{j+1}$“:s has diagonal elements that depend of “$A^j$“:s). A more sophisticated method would do several iterations to approximate the values of $A(x,t)$ between the timesteps $j$ and $j+1$.

An R code that solves the equation for a domain $x\in [0,100]$, $t\in [0,150]$, using discrete steps $\Delta x = 0.66$, $\Delta t = 0.33$ , initial state $A(x,0) = 0.1e^{2ix}$ and values $\alpha=3$ and $\beta = -2$, is shown here.

library(graphics) #load the graphics library needed for plotting

lx <- 100 #length of the computational domain
lt <- 150 #length of the simulation time interval
nx <- 150 #number of discrete lattice points
nt <- 450 #number of timesteps
dx <- lx/nx #length of one discrete lattice cell
dt <- lt/nt #length of timestep

a <- 3
b <- -2

kappa1 = dt*(1+1i*a)/dx/dx #an element needed for the matrices
kappa2 = dt*(1+1i*b)

psi = as.complex(c(1:nx)) #array for the function A values
sol = as.complex(c(1:nx))

for(j in c(1:nx)) {
psi[j] = 0.1*exp(2i*j*dx)
sol[j] = psi[j]
}

xaxis <- c(1:nx)*dx #the x values corresponding to the discrete lattice points

IPxaxis <- c(1:(4*nx))*dx/4
IPtaxis <- c(1:(4*nt))*dt/4

sol_plot = matrix(nrow=nt,ncol=nx)

A = matrix(nrow=nx,ncol=nx) #matrix for forward time evolution
IP = matrix(nrow = 4*nt, ncol=4*nx)

for (m in c(1:nt)) { #main time stepping loop

for(j in c(1:nx)) {
for(k in c(1:nx)) {
A[j,k]=0
if(j==k) {
A[j,k] = 1 + 2*kappa1 + kappa2*abs(sol[j])*abs(sol[j]) – dt #diagonal elements
}
if((j==k+1) || (j==k-1)) {
A[j,k] = -kappa1 #off-diagonal elements
}
}
}

for(l in c(1:nx)) {
psi[l] = sol[l]
}
sol <- solve(A,psi) #solve the system of equations

for (l in c(1:nx)) {
sol_plot[m,l] <- Re(sol[l])
}

jpeg(file = paste(“plot_”,m,”.jpg”,sep=””))
plot(xaxis,Im(sol),xlab = “position (x)”,ylab=”Im[A(x,t)]”,ylim=c(-4,4),pch=’.’)
title(paste(“Im[A(x,t)] at t = “,round(m*dt,digits=2)))
lines(xaxis,Im(sol))
dev.off()

}

for(l in c(1:(nt-1))) {
for(m in c(1:(nx-1))) { #make a bitmap with 4 times more pixels, using linear interpolation
IP[4*l-3,4*m-3] = sol_plot[l,m]
IP[4*l-2,4*m-3] = sol_plot[l,m]+0.25*(sol_plot[l+1,m]-sol_plot[l,m])
IP[4*l-1,4*m-3] = sol_plot[l,m]+0.5*(sol_plot[l+1,m]-sol_plot[l,m])
IP[4*l,4*m-3] = sol_plot[l,m]+0.75*(sol_plot[l+1,m]-sol_plot[l,m])
}
}

for(l in c(1:(4*nt))) {
for(m in c(1:(nx-1))) {
IP[l,4*m-2] = IP[l,4*m-3]+0.25*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m-1] = IP[l,4*m-3]+0.5*(IP[l,4*m+1]-IP[l,4*m-3])
IP[l,4*m] = IP[l,4*m-3]+0.75*(IP[l,4*m+1]-IP[l,4*m-3])
}
}

jpeg(file = “2dplot.jpg”)
image(IPtaxis,IPxaxis,IP,xlab = “t-axis”,ylab=”x-axis”,zlim=c(-2,2))
dev.off()

Plotting the real part of the resulting function $A(x,t)$ at several values of $t$, we see that the solution initially doesn’t do much anything, but at some point a “phase turbulence” sets in starting from the ends of the x-domain and after that the function evolves in a very random way, without following any clear pattern (unlike the spreading mass/temperature distributions, traveling waves or scattering wavepackets in the case of the common linear PDE:s).

An animation of the solution is shown below.

This kind of chaos is typical of nonlinear systems, be them point mass systems with nonlinear forces between mass points or field systems with nonlinear field equations such as the CGLE here. Note that the solution of this equation is a bit too heavy of a calculation to do just for the purpose of creating random numbers, so for that end other methods such as Perlin’s noise are used.

The 2D color plot of the real part of the solution, plotted in the xt-plane, looks like this:

More plots of the solutions for different values of parameters can be found in this article.

It should be noted that, Wolfram Mathematica’s “NDSolve” function can’t usually solve nonlinear PDE:s correctly, despite usually working properly in the case of linear PDE:s. Some other commercial math programs such as Comsol Multiphysics may work better when solving nonlinear problems, at least to my experience.

So, here was the basic idea of how nonlinear PDE:s are solved by linearization, and what kind of things are possible in the behavior of their solutions. In the next PDE post I will show how to solve the thin-film equation, about which I actually wrote my master’s thesis in 2013, and which doesn’t usually behave chaotically unlike the CGLE (but can be made to do so by adding suitable terms).

## A Random Bubbles Picture with ImageJ

In the last post, I described how to create an image with random B&W patterns by using random noise and bandpass filtering. This time I will show an even simpler trick with ImageJ, the creation of an image with random bubbles.

Start as before by opening ImageJ and making a new image.

Fill the originally black image with specified random noise of standard deviation 3.0 (Process > Noise > Add Specified Noise ).

Next, choose Process > Filters > Minimum and set the “radius” to 20 pixels. This will replace the brightness value of every pixel in the original image with the smallest brightness value that can be found in its circular neighborhood of that radius.

Next do an edge detection, from the menu Process > Find Edges. The resulting image looks like this.

Finally, modify the brightness of the image and add contrast (both done in Image > Adjust > Brightness/Contrast).

Now we have a quite good bubbles image, but it has a rather poor resolution, as can be seen by zooming in:

The solution to this is to use the Inkscape vector graphics program to “trace” the bitmap and then export it in a PNG file of a larger resolution. Choose Path > Trace Bitmap from the Inkscape menu and use the “Brightness Cutoff” to do this.

If this is done by producing a 2-color vector image (not by multiple brightness steps), the result and the zoom look like this:

So, no more pixelization or blurring. The Inkscape program is easy to use and shouldn’t cause trouble. Another way to do the same thing is to use some online image vectorization service.

More image processing stuff coming soon, have fun!

## Creating Bitmaps with Random Patterns

This time I’m going to write about image processing and computer graphics. Many of you may have seen procedurally generated textures – bitmaps that look like something from the real world such as rusted steel, marble or stone, but are not photos, being completely computer generated. It may initially seem strange that it’s possible to produce patterns with such natural-looking randomness with a mathematical algorithm. The trick is to use random noise (similar to static on an old television screen) and apply cleverly chosen filters on that.

Let me show how to produce an image with random black spots/stripes on a white background, similar to the pattern of a zebra or some breeds of dog. First, we need a free image processing program called ImageJ, downloadable for Windows or Linux from this address.

First, we will choose New > Image from the File menu, and set the resolution of a new blank image to 1000 x 1000 pixels.

The result is an all-black bitmap as you see below.

Next, we will add gray random noise with a specified standard deviation of pixel brightness. Choose Noise > Add Specified Noise from the Process menu, and set the standard deviation to value 3.00.

The resulting image looks like that below, where you see the familiar “TV static” appearance.

Next we will do a so-called bandpass filtering on the image. The idea here is that the program first converts the image to a Fourier transform, which is a function that tells how much of different “wavelengths” are present in the image. An image that consists of large features (with a characteristic diameter of a large number of pixels) has a high amount of large-wavelength Fourier components, while an image that contains small features has a high amount of short-wavelength components. After converting the image to the Fourier transform the bandpass filtering algorithm removes all components with wavelengths that don’t lie on a desired interval and then does an inverse Fourier transform to produce a filtered image.

Choose FFT > Bandpass Filter from the Process menu and set the lower limit to 30 pixels and the upper limit to 100 pixels. This will remove all features with characteristic lengths that are not in that interval. The result looks like this:

This resulting image is pretty much the same as Perlin’s fractal noise, which is very important in computer generated graphics.

Next, we have to do a “color quantization” that converts the image to a maximum contrast one, having only all-black or all-white pixels. To do this,  choose Adjust > Brightness/Contrast from the Image menu.

Dragging the Contrast bar to maximum value, we obtain a two-color image that has random spots/stripes as wanted. If the contrast adjustment doesn’t seem to work, try changing the image type to “RGB Color” from the Image menu first.

Spots of different sizes can be created by changing the upper and lower limits in the bandpass filtering stage. If an image with a larger resolution is desired, the image can be converted to a vector image with a program such as Inkscape, and printed back into a .PNG bitmap file of a higher resolution. This will produce a larger image without blurring or pixelization.

More complicated filtering and coloring procedures can be done to produce images like this one here, but I will write more about it later.

## Interplay of Theory Development with Physical Reality

What’s the relation of theories of physics to the actual physical reality they are describing? A naive answer would be something like “Theories are formed by making enough experiments and deducing a common law from the results. As more experiments are made, the theory with its laws becomes a more correct description of reality”. This, however is a simplistic way to describe the situation, as no amount of supporting special cases counts as an actual logical proof for a general statement, and usually there’s a need to already have at least some degree of a “pre-theory” before one can even make any experiments.

Suppose someone wants to study the behavior of matter at different temperatures. The things measured could involve thermal expansion of a metal rod, pressure changes upon heating a gas in a rigid container, or something similar. Actually, to even be able to measure a temperature, one has to make several assumptions about the physical laws related to it. When we immerse one end of a thermometer in a kettleful of hot water, we assume that only the temperature of the thermometer changes significantly, and the hot water stays at a very close to constant temperature as the thermometer and water reach thermal equilibrium. So there we already need to have some kind of a common-sense notion of heat capacity, whatever we decide to call that thing at that point of constructing a theory of thermodynamics. In particular, we assume that if an object has either a very small mass or small volume, it takes only a small amount of “thermal energy” to change its temperature.

Also, the concept of thermal equilibrium itself requires some assumptions that are not obvious in principle. The Zeroth Law of Thermodynamics says that if Object 1 and Object 2 are in thermal equilibrium with each other, and Object 2 is in equilibrium with Object 3, then the Objects 1 and 3 are also in mutual thermal equilibrium. A kind of an equivalence relation, but this time in physics instead of pure mathematics (another example of an equivalence relation in physics is found when forming a set of all scalar-vector potential combinations resulting in the same electric and magnetic fields). We need to do this assumption before we can measure temperatures based on the thermal expansion of some material, as the equilibration needs to be able to be transmitted forward through the material that we are trying to expand by heating it. So, to measure temperature we need to have some kind of an intuitive notion of thermal equilibrium, and also assume that the length of a metal rod or column of mercury or ethanol is a single-valued function of temperature in the range of conditions we are doing the experiments in. This may seem like a chicken-and-egg problem where we need to define temperature to study thermal expansion, and also know a law of thermal expansion to be able to define temperature. The same situation is apparent in the SI definition of the unit of length as an exact fraction of the distance travelled by light in a second, despite the fact that distance measurements were done by people for quite a long time before someone even got the idea of the constancy of speed of light. This circularity problem doesn’t cause as much trouble as one may think, as you can try different kind of theories combining assumptions about temperature and thermal expansion and make a lot of tests on every theory, trying to find a contradiction between the theory and experimental fact, and trust that in the case of wrong assumptions it will become apparent that something fishy is going on. Or of course if there’s a logical contradiction in the theory with itself then it must be wrong no matter what the experiments tell.

Some things that are usually assumed about thermal equilibration include that it proceeds monotonously towards its conclusion, and doesn’t do strange things like the thermometer reading oscillating around some value before setting to equilibrium (in other words, the thermal impact given to the thermometer by the hot liquid doesn’t cause an effect similar to how a mechanical impact affects a physical pendulum in absence of supercritical damping). This assumption is not always true in thermodynamics, as oscillating chemical reactions like the BZ reaction show (first academic papers about oscillating reactions were actually initially rejected from publication because they were so much against the common sense notions about chemical kinetics and thermodynamics at the time).

The temperature/thermal equilibrium example was a rather simple one, but a lot more difficult problem are the differences of the theory of relativity and quantum mechanics to the classical Newtonian mechanics, where one finds out a necessity to accept physical laws that are very difficult to form a mental image about and that are against conventional common sense (however, it must be admitted that even the classical Newton’s first law is not intuitively obvious on the surface of the Earth where there are frictional forces damping almost every form of motion).

Philosophical theories about the nature of successive scientific theories include the Karl Popper’s theory where the most important feature of a scientific theory is thought to be its falsifiability, which means that the theory must be “vulnerable” to experimental evidence in the sense that the theory can’t be somehow cleverly bent to fit any possible experimental results (in the same way how illogical features in a religious belief system can be justified by saying “Inscrutable are the Lord’s ways”). Other philosophers who have investigated the subject include Thomas Kuhn and Paul Feyerabend. In Kuhn’s theory in particular, the very concept of “objective physical reality” is questioned, and it’s said that to be able to talk about physical reality one always has to do that in reference to some particular theoretical framework.

Actually I recently got (for free) a book related to this subject, C. Dilworth’s “Scientific Progress – A Study Concerning the Nature of the Relation Between Successive Scientific Theories”, and it seems to contain a lot of interesting material but I haven’t managed to read very far through it. I hope the links in this blog post help someone who’s trying to find info about this subject.

## The problematic radial momentum operator

In quantum mechanics, position coordinates x,y,z of a particle are replaced with position operators $\large \hat{x},\hat{y},\hat{z}$ and the components of the momentum vector $\mathbf{p}$ are replaced with the operators $\large \hat{p_x},\hat{p_y},\hat{p_x}$. The most important property of these operators is the commutation relation between a coordinate and the corresponding momentum component:

$\large [\hat{x},\hat{p_x}] = i\hbar$

In the position representation, the momentum operator has the form of a differential operator

$\large \hat{p_x} = -i\hbar \frac{\partial}{\partial x}$

and the $\hat{x}$ operator is just a multiplication of the wavefunction with the variable x.

Can these results be extended to coordinate systems that are not Cartesian? Consider a plane polar coordinate system where the position is given as the pair $\large (r,\theta)$, with

$\large x = r\cos \theta$
$\large y = r\sin \theta$

The “momentum” corresponding to the variable $\large \theta$ is obviously the angular momentum operator

$\large \hat{L}_z = -i\hbar \frac{\partial}{\partial \theta}$.

But what about the radial momentum operator? It would have to be

$\large \hat{p}_r = -i\hbar \frac{\partial}{\partial r}$

in the position representation, but is it an acceptable quantum operator?

Actually, it is not. An operator corresponding to a measurable quantity has to be Hermitian, and it’s easy to see that if we have a wavefunction similar to a hydrogen atom ground state

$\large \psi (r) = Ae^{-ar}$

where a is a real constant, the state is an eigenstate of $\hat{p}_r$ with an imaginary eigenvalue $\large ia\hbar$. Therefore it’s not possible to use the radial momentum operator as an observable.

Another fun way to see this is to consider a “radial translation operator”, which is generated by the radial momentum:

$\large \hat{T}_r (\Delta r) = \exp \left(\frac{i\Delta r \hat{p}_r}{\hbar}\right)$

and operate with it on a rotation symmetric function $\psi (r)$. The result is

$\large \hat{T}_r (\Delta r)\psi (r) = \psi (r + \Delta r)$

However, because a radius coordinate always has to be a nonnegative number, we have irreversibly lost all information about the values of the function $\large \psi (r)$ for $\large r<\Delta r$ here, which means that this radial translation operator does not have an inverse, and therefore can’t be unitary as would be expected if $\large \hat{p}_r$ were Hermitian!