Making use of Markov chains

A stochastic process describes a system that changes in values over time. Typically in order to predict what state the process will be in in the future, the past states may have to be used as input. A Markov chain is a stochastic process in which the predictions for the future of the process are based solely on its present state. Thus a Markov chain only remembers the current step and not the steps in the past (this is called the memoryless property). Markov chains, and stochastic processes in general, have wide applications in many fields. In this post, we highlight two fun probability problems that can be solved using Markov chains.

Figure 1 – toss a coin until seeing 3 consecutive heads

The first problem is demonstrated by the above diagram – tossing a fair coin until the appearance of k consecutive heads. How many tosses are required on average to accomplish this goal? In Figure 1, it takes 17 tosses to get 3 consecutive heads. The number of tosses is random. Sometimes it takes more tosses and sometimes it takes fewer tosses. In general, what is the expected number of tosses to get k consecutive heads? For 3 consecutive heads, the answer is 14. To get 4 consecutive heads, it takes on average 30 tosses.

How does coin tossing becomes a Markov chain? Let’s take the goal of 4 consecutive heads. As the coin is tossed, we are interested in 5 different states – 0, H, HH, HHH and HHHH. The state 0 refers to the beginning of the experiment or that the current toss is a Tail. The other four states refer to the most recent consecutive heads that have appeared. In each toss of the coin, the process is in one of these 5 states. The next state depends on the current state and does not depend on the past states, making this a Markov chain. The following is the transition probability matrix.

    \mathbf{P} =       \bordermatrix{ & 0 & \text{1=H} & \text{2=HH} & \text{3=HHH} & \text{4=HHHH} \cr        0 & 0.5 & 0.5 & 0  & 0 & 0 \cr        1 & 0.5 & 0 & 0.5 & 0 & 0 \cr        2 & 0.5 & 0 & 0 & 0.5 & 0 \cr        3 & 0.5 & 0 & 0 & 0  & 0.5 \cr        4 & 0 & 0 & 0 & 0  & 1 \cr   } \qquad

The matrix captures the probabilities of how the process transitions from one state to the next. For example, if the current state is H, then the process will go to HH (with probability 0.5) or go to 0 (with probability 0.5). Example 2 in this blog post in a companion blog shows how to use this matrix to compute the expected number of steps it takes to go from state 0 to the state HHHH. The answer is 30.

If the goal is to toss the coin until obtaining 3 consecutive heads, then the states of the process would be 0, H, HH and HHH. Set the the transition probability matrix accordingly and apply the same technique discussed in the companion blog. If the goal is to toss the coin until obtaining k consecutive heads, adjust the states and the matrix accordingly and apply the method.

Though the highlighted problem is about the average number of tosses, a broader problem is about a probability distribution of the number of tosses to get k consecutive heads. For k=4, what is the probability that the goal (4 consecutive heads) can be achieved in 4 tosses of the coin, in 5 tosses, and so on? The method in answering these questions is discussed in this blog post in the same companion blog.

Figure 2 – rolling a die until seeing all 6 faces

The second problem is demonstrated by Figure 2 – rolling a fair die until the appearance of all 6 faces. How many rolls on average are required to achieve this goal? In Figure 2, it takes 11 rolls to get all 6 faces. Example 3 in this blog post shows how to answer this question. The answer is 14.7.

It turns out that Markov chain is not needed to solve this problem since this problem is related to two classic problems in probability – the occupancy problem as well as the coupon collector problem. In fact, the coupon collector problem provides a handy formula to obtain the average number rolls in order to have all faces in a k-sided die appeared once.

As in the first problem, the broader problem is about a probability distribution. What is the probability that all 6 faces appear after 6 rolls, after 7 rolls, and after 8 rolls and so on? This blog post in the same companion blog gives the idea on how to answer such questions.

Introductory discussion on Markov chains is found in the early posts in this companion blog on stochastic processes.

probability

Markov chains

Daniel Ma math

Dan Ma mathematics

\copyright 2017 – Dan Ma