Detailed Balance

The Markov Chains that we have been studying have stationary distributions that contain much information about the behavior of the chain. The stationary distribution of a chain is a probability distribution that solves the balance equations. For some chains it is easy to identify a distribution that solves the balance equations. But for other chains, the solution can be complicated or tedious. Let’s see if we can find a simple way to solve the balance equations.

Recall our earlier image of what is being balanced in the equations. Imagine a large number of independent replications of the chain. For example, suppose a large number of particles are moving among the states according to the transition probabilities of the chain, all moving at instants 1, 2, 3, … independently of each other.

Suppose the chain is in steady state. As we said earlier, if you think of π(k) as the proportion of particles leaving state k at any instant, then the balance equations

$$\pi(j) = \sum_{k \in S} \pi(k)P(k, j)$$

say that the proportion of particles leaving state j is the same as the proportion of particles entering it. Hence the chain is balanced.

Notice that the left hand side is just the proportion of particles leaving j; there is no information about where the particles are going.

Now suppose there is detailed balance, given by

$$\pi(i)P(i, j) = \pi(j)P(j, i) \quad \text{for all states } i \ne j$$

These are called the detailed balance equations. They say that for every pair of states i and j, the proportion of particles that leave i and move directly to j is the same as the proportion that leave j and move directly to i. In the case i=j the equations carry no information and hence are left out.

That turns out to be a stronger condition than balance.

Detailed Balance Implies Balance

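Suppose there is a probability distribution π that solves the detailed balance equations. Then π also solves the balance equations.

$$
\begin{align*}
\sum_{k \in S} \pi(k)P(k, j) &= \sum_{k \in S} \pi(j)P(j, k) && \text{(detailed balance)} \\
&= \pi(j) \sum_{k \in S} P(j, k) \\
&= \pi(j) \cdot 1 && \text{(sum of the $j$th row of the transition matrix)} \\
&= \pi(j)
\end{align*}
$$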

What we learn from this is that if we can find a solution to the detailed balance equations, we will also have solved the balance equations.
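As a numerical illustration of this implication, here is a minimal sketch using NumPy. The three-state transition matrix, the distribution `pi`, and the two helper function names are hypothetical, chosen only for this example; they are not part of the text's library.

```python
import numpy as np

def satisfies_detailed_balance(pi, P, tol=1e-12):
    """Check pi(i) P(i, j) == pi(j) P(j, i) for every pair of states."""
    flow = pi[:, None] * P          # flow[i, j] = pi(i) P(i, j)
    return np.allclose(flow, flow.T, atol=tol)

def satisfies_balance(pi, P, tol=1e-12):
    """Check pi(j) == sum over k of pi(k) P(k, j) for every state j."""
    return np.allclose(pi @ P, pi, atol=tol)

# Hypothetical three-state chain, used only for illustration
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
pi = np.array([0.25, 0.5, 0.25])

print(satisfies_detailed_balance(pi, P))
print(satisfies_balance(pi, P))
```

Here `flow[i, j]` plays the role of the proportion of particles moving from i to j in one step, so detailed balance is the statement that this matrix is symmetric.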

This is helpful for two reasons:

  • The detailed balance equations are simple.
  • There are lots of them; indeed if there are s states then there are $\binom{s}{2}$ detailed balance equations in s unknowns. This gives us lots of ways to try to solve them.

Of course all the $\binom{s}{2}$ equations need not be consistent, in which case there will not be a solution to the detailed balance equations. In such situations we’ll have to slog our way through solving the balance equations directly. But here is an example that shows that if the detailed balance equations do have a solution, we have an easy way of arriving at the stationary distribution of the chain.

Ehrenfest Chain

We have returned to this example because it is one where solving the balance equations involves some labor. We will show that for this chain and others like it, the detailed balance equations can easily be solved, giving us a quick route to the stationary distribution.

The state space is the integers 0 through N. Recall how the transitions work: at each step, the chain either goes up by 1, stays the same, or goes down by 1. Such chains are called birth and death chains and are used to model many different random quantities such as gamblers’ fortunes or population sizes. In our example, we are modeling the size of the population of gas particles in a container.

For such chains, most of the transition probabilities are 0 because in one step the chain can only stay in place or move to one of the two neighboring states. So most of the detailed balance equations are trivially true. For the ones that involve positive transition probabilities, the states i and j have to be separated by 1 (remember that the detailed balance equations specify i ≠ j). And in that case both P(i,j) and P(j,i) are positive, as the chain is irreducible.

This allows us to solve the detailed balance equations, for example by starting at the lowest state and moving up. Remember the transition rules:

  • At each step, select one of the N particles at random and place it into one of the two containers at random; the chain counts the number of particles in Container 1.

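The detailed balance equations are easy to solve sequentially:

$$
\begin{align*}
\pi(0)\frac{1}{2} = \pi(1)\frac{1}{2N} &\implies \pi(1) = N\pi(0) = \binom{N}{1}\pi(0) \\
\pi(1)\frac{N-1}{2N} = \pi(2)\frac{2}{2N} &\implies \pi(2) = \frac{N-1}{2}\pi(1) = \frac{N(N-1)}{2}\pi(0) = \binom{N}{2}\pi(0) \\
\pi(2)\frac{N-2}{2N} = \pi(3)\frac{3}{2N} &\implies \pi(3) = \frac{N-2}{3}\pi(2) = \frac{N(N-1)(N-2)}{3 \cdot 2}\pi(0) = \binom{N}{3}\pi(0)
\end{align*}
$$

and so on, so that for $1 \le k \le N$, $\pi(k) = \binom{N}{k}\pi(0)$ by a far easier induction than the one needed to solve the balance equations. The sum of the terms in the solution is

$$\pi(0)\Big(1 + \sum_{k=1}^{N} \binom{N}{k}\Big) = \pi(0)\sum_{k=0}^{N} \binom{N}{k} = \pi(0)2^N$$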

by the binomial theorem. Since the terms must sum to 1, $\pi(0) = 2^{-N}$ and the stationary distribution is binomial $(N, 1/2)$.
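The sequential solution can be checked numerically. The sketch below builds the Ehrenfest transition matrix from the transition rule stated above, solves the detailed balance equations upward from state 0 using the general step π(k+1) = π(k) P(k, k+1) / P(k+1, k), and compares the result with the binomial (N, 1/2) probabilities. The function names and the choice N = 10 are assumptions made for illustration, and the code uses plain NumPy rather than the text's library.

```python
import numpy as np
from math import comb

def ehrenfest_matrix(N):
    """Transition matrix of the Ehrenfest chain on states 0, 1, ..., N."""
    P = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        P[k, k] = 0.5                        # selected particle is put back where it was
        if k < N:
            P[k, k + 1] = (N - k) / (2 * N)  # a particle moves into Container 1
        if k > 0:
            P[k, k - 1] = k / (2 * N)        # a particle moves out of Container 1
    return P

def solve_detailed_balance(P):
    """Solve pi(k) P(k, k+1) = pi(k+1) P(k+1, k) upward from state 0, then normalize."""
    n = P.shape[0]
    weights = np.ones(n)                     # weights[k] is pi(k) / pi(0)
    for k in range(n - 1):
        weights[k + 1] = weights[k] * P[k, k + 1] / P[k + 1, k]
    return weights / weights.sum()

N = 10  # illustrative value
P = ehrenfest_matrix(N)
pi = solve_detailed_balance(P)
binomial = np.array([comb(N, k) for k in range(N + 1)]) / 2 ** N

print(np.allclose(pi, binomial))   # detailed balance solution vs binomial (N, 1/2)
print(np.allclose(pi @ P, pi))     # the solution also satisfies the balance equations
```

The upward recursion only works because every state's neighbors are linked by positive transition probabilities in both directions, which is the birth and death structure described above.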

At this point it is worth remembering that for any numerical value of N you can just use steady_state to find the stationary distribution, relying on Python to do all the work for you. This has some clear advantages but also some disadvantages:

  • Python will not be able to handle the computation when N is very large.
  • You will either not see that the distribution is just binomial or will see it and not know why.

These are reasons why, even in the age of powerful personal computers, it is still important to find good ways of solving problems using math.

Sticky Random Walk on a Circle

Suppose a chain has states 0, 1, 2, 3, 4 arranged in sequence clockwise on a circle. Suppose that at each step the chain stays in place with probability s, moves to its counterclockwise neighbor with probability p, and to its clockwise neighbor with probability r. Here s, p, and r are strictly positive and sum to 1.

It is clear that the behavior of the chain is symmetric in the five states, and therefore in the long run it is expected to spend the same proportion of time in each state. The stationary distribution is uniform on the states. You can also check this by solving the balance equations.

Let’s see whether the detailed balance equations are satisfied. Unlike the Ehrenfest chain above, this chain can “loop back around.” So it’s not clear that the detailed balance equations are consistent.

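The detailed balance equations are:

$$
\begin{align*}
\pi(0)r = \pi(1)p &\implies \pi(1) = \frac{r}{p}\pi(0) \\
\pi(1)r = \pi(2)p &\implies \pi(2) = \frac{r^2}{p^2}\pi(0) \\
\pi(2)r = \pi(3)p &\implies \pi(3) = \frac{r^3}{p^3}\pi(0) \\
\pi(3)r = \pi(4)p &\implies \pi(4) = \frac{r^4}{p^4}\pi(0)
\end{align*}
$$

So far so good, but now for the moment of truth:

$$\pi(4)r = \pi(0)p \implies \pi(4) = \frac{p}{r}\pi(0)$$

For this system of equations to be consistent and have a positive solution, the two expressions for π(4) must be equal, which is equivalent to

$$\frac{r^4}{p^4} = \frac{p}{r}, \quad \text{that is,} \quad r^5 = p^5$$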

This can only happen if r=p, and in that case the detailed balance equations say that all the entries of π are equal, which we already knew.

To summarize:

  • The stationary distribution of the chain is uniform on all the states. The uniform distribution satisfies the balance equations.
  • When r=p, the detailed balance equations have a positive solution which is the stationary distribution.
  • When r ≠ p, the detailed balance equations have no solution that is a probability distribution.
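The summary can be checked numerically with a short sketch in plain NumPy. The function names are hypothetical, and the second parameter set (r = p = 0.45) is an assumption added here only to show a case where detailed balance holds.

```python
import numpy as np

def circle_matrix(s, r, p, n=5):
    """Transition matrix of the sticky walk on n states arranged clockwise on a circle."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = s
        P[i, (i + 1) % n] = r   # clockwise neighbor
        P[i, (i - 1) % n] = p   # counterclockwise neighbor
    return P

uniform = np.full(5, 1 / 5)

def check_uniform(s, r, p):
    """Return (balance holds, detailed balance holds) for the uniform distribution."""
    P = circle_matrix(s, r, p)
    flow = uniform[:, None] * P             # flow[i, j] = pi(i) P(i, j)
    balance = np.allclose(uniform @ P, uniform)
    detailed = np.allclose(flow, flow.T)
    return balance, detailed

for s, r, p in [(0.1, 0.6, 0.3), (0.1, 0.45, 0.45)]:
    balance, detailed = check_uniform(s, r, p)
    print(f"r={r}, p={p}: balance {balance}, detailed balance {detailed}")
```

The uniform distribution passes the balance check in both cases because every column of the transition matrix sums to 1, while the detailed balance check amounts to asking whether the matrix is symmetric.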

Clearly, r=p has a special status. What exactly does that mean for the behavior of this chain? That’s the topic of the next section. For now, here are simulated paths of the chain, for two sets of parameters:

  • circle_walk_1: s=0.1, r=0.6, p=0.3
  • circle_walk_2: s=0.1, r=0.3, p=0.6

The chance of staying in place is the same for both, but the chances of clockwise and counterclockwise moves have been switched. In the plots of the simulated paths, “clockwise” is shown as a move up and “counterclockwise” as a move down.

Look at the paths (simulate some more if you like) and answer the following questions:

  • Which one has more “up” transitions than “down”?
  • If someone showed you the path of one of these two processes but didn’t say which of the two it was, could you identify the process?
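
import numpy as np
from prob140 import MarkovChain  # assumed source of MarkovChain: the prob140 library

states = np.arange(5)  # states 0 through 4, arranged clockwise

s = 0.1  # chance of staying in place
r = 0.6  # chance of moving clockwise
p = 0.3  # chance of moving counterclockwise

def transition_prob(i, j):
    # s, r, and p are read from the enclosing scope each time the function is called
    if i == j:
        return s
    elif j == (i+1) % 5:   # j is the clockwise neighbor of i
        return r
    elif j == (i-1) % 5:   # j is the counterclockwise neighbor of i
        return p
    else:
        return 0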
    
circle_walk_1 = MarkovChain.from_transition_function(states, transition_prob)
circle_walk_1
     0    1    2    3    4
0  0.1  0.6  0.0  0.0  0.3
1  0.3  0.1  0.6  0.0  0.0
2  0.0  0.3  0.1  0.6  0.0
3  0.0  0.0  0.3  0.1  0.6
4  0.6  0.0  0.0  0.3  0.1
circle_walk_1.simulate_path(0, 50, plot_path=True)

[Figure: simulated path of circle_walk_1, 50 steps]

s = 0.1
r = 0.3
p = 0.6

circle_walk_2 = MarkovChain.from_transition_function(states, transition_prob)
circle_walk_2
     0    1    2    3    4
0  0.1  0.3  0.0  0.0  0.6
1  0.6  0.1  0.3  0.0  0.0
2  0.0  0.6  0.1  0.3  0.0
3  0.0  0.0  0.6  0.1  0.3
4  0.3  0.0  0.0  0.6  0.1
circle_walk_2.simulate_path(0, 50, plot_path=True)

[Figure: simulated path of circle_walk_2, 50 steps]
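To explore the first question on longer paths, the up and down moves can be counted directly. The sketch below is one possible approach in plain NumPy rather than the simulate_path method used above. It relies on the fact that in this chain the step taken does not depend on the current state. The function name, the seed, and the path length are assumptions made for illustration.

```python
import numpy as np

def simulate_circle_walk(s, r, p, steps, start=0, n=5, seed=None):
    """Simulate the sticky walk; return the path and the individual moves."""
    rng = np.random.default_rng(seed)
    # Each move is 0 (stay), +1 (clockwise), or -1 (counterclockwise)
    moves = rng.choice([0, 1, -1], size=steps, p=[s, r, p])
    # Accumulate the moves and wrap around the circle
    path = (start + np.concatenate(([0], np.cumsum(moves)))) % n
    return path, moves

# Parameters of circle_walk_1
path, moves = simulate_circle_walk(0.1, 0.6, 0.3, steps=10_000, seed=0)
print("clockwise moves:", np.sum(moves == 1))
print("counterclockwise moves:", np.sum(moves == -1))
```

Swapping r and p in the call gives the corresponding counts for circle_walk_2.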