October 2, 2024 14:20

Alright, time to make a blog post and ignore the massive hiatus between this and the last post. I was watching a streamer making a coin toss bet, where they wanted heads to win. As soon as it was tails, he said "okay, best of 3". It went on heads, but then tails again, to which he continued "okay, best of 5". He then got heads twice, which means heads won the best of 5, and he was content.

My question was: If we kept going, potentially forever, is the chance of heads winning approaching 100%? How fast/slow is it approaching if so? I plan to answer these questions in this post.

The idea is: We stop this "experiment" once heads wins at any time. This means that we have a 50% chance of getting heads right away, and we're done. Otherwise, we got tails, and so if we want to win on the best of 3, we need heads (so they're tied), and heads again. If we instead got tails, we need to get heads twice to win the best of 5. However, another way to win the best of 5 is to get tails twice, and then heads three times in a row.

In general, if heads has not won at any point in time, and we want it to win on the best of $2N+1$, we need to have done $2N$ coin tosses where $N$ were heads and $N$ were tails (if we have more tails then one coin toss wouldn't make heads win, and if we have more heads, then we already won before), and where heads did not win at any point during this sequence. For example, if we look at this sequence of coin tosses $$\text{TTHTTHHHTTHHHTHTHTHT}$$ This has 20 coin tosses, where 10 were heads and 10 were tails. However, if we inspect the first 13 tosses, we'll see that there are 8 heads and 7 tails, meaning heads won the best of 13, and so this is not a sequence we're interested in (because we're looking for a sequence where heads wins specifically on the best of 21, in this case).

To quantify "what are the odds of heads having won up to a best of $2N+1$?", we need to first inspect the chances of winning specifically on a best of $2n+1$ for each $0 \le n \le N$ and sum them all up. As established above, to win specifically on the best of $2n+1$ we need a sequence of $2n$ tosses with $n$ heads and $n$ tails and that heads was never in the lead. Let $C_n$ denote the amount of possible sequences of $2n$ tosses that has these properties. We know that the amount of total possible sequences of $2n$ coin tosses is $2^{2n}$. After each of these sequences, we want heads to win, which is a $1/2$ chance of happening. All in all, we get $$\frac{C_n}{2^{2n}} \cdot \frac{1}{2} = \frac{C_n}{2^{2n+1}}$$ where we define $C_0 := 1$ for convenience. $C_1 = 1$ because the only sequence of 2 tosses with this property is $TH$. $C_2 = 2$ because the only two sequences of 4 tosses with this property are $TTHH$ and $THTH$. $C_3 = 5$ and the 5 sequences of 6 tosses with this property are seen below. $$\text{TTTHHH} \quad \text{TTHTHH} \quad \text{TTHHTH} \quad \text{THTTHH} \quad \text{THTHTH}$$ As it turns out, our sequence $C_n$ is exactly the same as the sequence of Catalan numbers, which are also denoted as $C_n$ (so think of this notation as foreshadowing). I won't prove it here because it deserves a blog post of its own, so... stay tuned for that. The point is that the Catalan numbers have a nice closed formula. $$C_n = \frac{1}{n+1}{2n \choose n}$$ The ${2n \choose n}$ part makes sense, since this is the amount of total sequences of $2n$ tosses with $n$ heads and $n$ tails, but the fraction hints that out of all those, only $1/(n+1)$ of them satisfy the property of heads never having been in the lead.

All in all, we get that the total chance of heads winning on the best of $2N+1$ or earlier is $$\sum_{n=0}^N \frac{1}{(n+1)2^{2n+1}}{2n \choose n}$$

Theorem. For all $N$, the partial sums evaluate to $$\sum_{n=0}^N \frac{1}{(n+1)2^{2n+1}}{2n \choose n} = 4^{-(N+1)}\left( 4^{N+1} - {2(N+1) \choose N+1} \right)$$

Proof. By induction, for $N=0$, we get $$\frac{1}{2}{0 \choose 0} = \frac{1}{2} = \frac{1}{4}\left(4 - \frac{2}{1}\right) = 4^{-1}\left(4 - {2 \choose 1}\right)$$ Inductively, we thus wish to prove $$4^{-(N+1)}\left( 4^{N+1} - {2(N+1) \choose N+1} \right) + \frac{1}{(N+2)2^{2N+3}}{2(N+1) \choose N+1} = 4^{-(N+2)}\left( 4^{N+2} - {2(N+2) \choose N+2} \right)$$ We can rewrite the fraction as $$\frac{1}{(N+2)2^{2N+3}} = \frac{1}{2(N+2)4^{N+1}} = 4^{-(N+1)}\frac{1}{2(N+2)}$$ Therefore, the left-hand side becomes $$4^{-(N+1)}\left( 4^{N+1} - {2(N+1) \choose N+1} + \frac{1}{2(N+2)}{2(N+1) \choose N+1} \right)$$ We can rewrite that as $$4^{-(N+1)}\left( 4^{N+1} - {2(N+1) \choose N+1} \left( 1 - \frac{1}{2(N+2)} \right) \right)$$ We want the exponents to be $N+2$, so we write $$4^{-(N+2)}\left( 4^{N+2} - {2(N+1) \choose N+1} \left( 4 - \frac{2}{N+2} \right) \right)$$ Therefore, to get the desired result, we just need to prove that $${2(N+1) \choose N+1} \left( 4 - \frac{2}{N+2} \right) = {2(N+2) \choose N+2}$$ We first see that $$4 - \frac{2}{N+2} = \frac{2(2N + 3)}{N+2} = \frac{2(2N + 3)(N+2)}{(N+2)^2} = \frac{(2N + 3)(2N+4)}{(N+2)^2}$$ and using the fact that ${2k \choose k} = (2k)!/(k!^2)$, we get $$\frac{(2N+2)!}{(N+1)!^2}\frac{(2N + 3)(2N+4)}{(N+2)^2} = \frac{(2N+4)!}{(N+2)!^2} = {2(N+2) \choose N+2}\square$$ Using this theorem, we get that $$\sum_{n=0}^\infty \frac{1}{(n+1)2^{2n+1}}{2n \choose n} = \lim_{N \to \infty} 1 - \frac{1}{4^{N+1}}{2(N+1) \choose N+1}$$ Using Stirling's approximation, $$\lim_{N \to \infty} \frac{1}{4^{N+1}}{2(N+1) \choose N+1} = \lim_{N \to \infty} \frac{1}{4^{N+1}}\frac{4^{N+1}}{\sqrt{\pi (N+1)}} = 0$$ And so, as we expected, the probability of heads eventually winning, as long as we keep the game going farther and farther, is 100%. $$\sum_{n=0}^\infty \frac{1}{(n+1)2^{2n+1}}{2n \choose n} = 1$$ Perhaps this can enlighten as to why this convergence is slow, because we were left with the term $1/\sqrt{\pi (N+1)}$ which goes to 0 quite slowly (though, not logarithmically slowly). Even after 60 coin tosses, there's about a 10% chance of heads never having won at any time. And even after 1000 coin tosses, there's still about a 2.5% chance you have to keep going.