In this post I wish to discuss the mathematics of doubling in some detail.

Let us assume Hero and Villain are playing a “match-to-three”. That is, the first player to win three VP is the overall match winner. Each individual game is worth 1 VP. The actual game being played is irrelevant: it could be Chess, Agricola, Snakes and Ladders, the royal game (with Hero moving the cards and Villain trash-talking at every sub-optimal play), or a well-known poker variant where both players start with three items of clothing. To make things interesting, assume Hero’s probability of winning an individual game is slightly less than 50%. To be specific, I will set Hero’s win rate to 40%. What are the chances of Hero winning the overall match?

This kind of question is most easily solved with dynamic programming. The boundary condition says that if someone already has three wins then the overall winning chances are either 0% or 100%. Next, we can compute the winning chances when the score is 2-2. We can then work our way backwards, eventually arriving at 31.7% winning chances for the Hero at 0-0.
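The backward induction described above can be sketched in a few lines of Python. This is only an illustration: the function name `hero_match_chances` and its parameters are my own, and the only assumption is that every game is won by Hero independently with probability `p`.

```python
from functools import lru_cache

def hero_match_chances(p, target=3):
    """Hero's chance of winning a first-to-`target` match from every score.

    p is Hero's chance of winning a single game worth 1 VP.
    Returns a dict keyed by (hero_points, villain_points).
    """
    @lru_cache(maxsize=None)
    def win(hero, villain):
        # Boundary conditions: the match is already decided.
        if hero >= target:
            return 1.0
        if villain >= target:
            return 0.0
        # Otherwise Hero either wins or loses the next game.
        return p * win(hero + 1, villain) + (1 - p) * win(hero, villain + 1)

    return {(h, v): win(h, v) for h in range(target) for v in range(target)}

table = hero_match_chances(0.4)
print(round(table[(2, 2)], 3))  # 0.4
print(round(table[(0, 0)], 3))  # 0.317
```

The recursion starts from the decided scores and memoises every intermediate score, which is the same order of work as filling in a spreadsheet from the bottom-right corner.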

You will notice that if the match score is X-X then the Villain prefers smaller values of X. Intuitively, a smaller value of X means that it is less likely that Hero will find enough “random noise” to overcome Villain’s advantage in the long run. The following diagram summarises the probability of Hero winning the match at every possible match-score:

I should mention that Bart has already done his homework, and he knows how to compute the probabilities corresponding to each match score. If the winning chances of an individual game are 0.25 then he correctly computes the following:

- 0.049 chance of winning 5-point match, no doubling
- 0.104 chance of winning 3-point match (equivalent to 5-point match with 2VP per game)

I will leave this as an exercise for the reader.
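For readers who would like to check their answer, here is a minimal sketch using the closed-form sum rather than dynamic programming. The function name `first_to_n` is my own. It assumes independent games, each won with probability `p`, and counts the ways of winning the n-th game after losing k games along the way.

```python
from math import comb

def first_to_n(p, n):
    """Chance of winning n games before losing n, each won with probability p."""
    q = 1 - p
    # Win the final game after exactly k losses, for k = 0 .. n-1.
    return sum(comb(n - 1 + k, k) * p**n * q**k for k in range(n))

print(round(first_to_n(0.25, 5), 3))  # 0.049
print(round(first_to_n(0.25, 3), 3))  # 0.104
```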

Now let us give Hero the following handicap: Before each game, Hero can demand the game be played for 1 VP or 2 VP. Moreover, there is no bonus for winning the match with 4 VP instead of 3 VP.

This means, for instance, if Villain has 2 points then Hero will always play for 2 VP. Again, we can use dynamic programming to compute the winning chances. If you use Excel to perform the dynamic programming then you will need the function max(FOO,BAR) somewhere in your calculations. You will notice several things:

- The boundary conditions now include either player having 3 or 4 VP.
- The table only shows winning percentages, but not whether Hero should play for 1 VP or 2 VP.
- The numbers look weird: the winning chances on the main diagonal are no longer strictly increasing, and the winning chances for 1-2 are the same as for 2-2.

The latter is easily explained. Since Hero can choose to play for 2 VP, Villain gets no advantage from being 1-2 instead of 2-2. Slightly more interesting is a match-score of 1-1. If Hero plays for 2 VP, then the next game decides the match. If Hero plays for 1 VP then the worst case scenario is the match score becomes 1-2, in which case Hero can play for 2 VP and the next game decides the match. Therefore, it must be correct to play for 1 VP. It turns out Hero is close to breaking even thanks to his handicap.
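For anyone who prefers code to a spreadsheet, the calculation might look like the sketch below. The names `doubling_table` and `game_value` are my own, and the model assumes Hero alone picks the stake (1 VP or 2 VP) before each game. Ties between the two stakes are broken in favour of 2 VP, which is an arbitrary choice on my part.

```python
from functools import lru_cache

def doubling_table(p, target=3):
    """Map each score (hero, villain) to (Hero's match chances, best stake)."""
    @lru_cache(maxsize=None)
    def win(hero, villain):
        # Overshooting the target (e.g. 4 VP) counts the same as reaching it.
        if hero >= target:
            return 1.0
        if villain >= target:
            return 0.0
        return max(game_value(hero, villain, stake) for stake in (1, 2))

    def game_value(hero, villain, stake):
        # Hero's match chances if the next game is played for `stake` VP.
        return p * win(hero + stake, villain) + (1 - p) * win(hero, villain + stake)

    table = {}
    for h in range(target):
        for v in range(target):
            # Listing 2 first makes max() prefer 2 VP when both stakes tie.
            best = max((2, 1), key=lambda stake: game_value(h, v, stake))
            table[(h, v)] = (win(h, v), best)
    return table

table = doubling_table(0.4)
for (h, v), (chance, stake) in sorted(table.items()):
    print(f"{h}-{v}: {chance:.3f} (play for {stake} VP)")
```

With a 40% win rate this gives 0.496 at 1-1 (playing for 1 VP) and 0.400 at both 1-2 and 2-2, in line with the reasoning above. The same function accepts other win rates and match lengths.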

It is not hard to get an Excel spreadsheet to crunch the numbers for different parameter values. We can, for instance, figure out what happens in a match to 13 points with Hero’s winning chances down to 30% for an individual game.

When we talk about longer matches, it is generally more convenient to think in terms of number of VP remaining instead of number of VP already scored. For instance, 19-18 in a 21-point match is equivalent to 5-4 in a 7-point match and it makes sense to describe both of them as “2-away 3-away”.
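As a tiny illustration of the "away" convention, the conversion is just a subtraction. The helper name `away_score` is hypothetical.

```python
def away_score(hero_points, villain_points, match_length):
    """Convert points already scored into points still needed by each side."""
    return match_length - hero_points, match_length - villain_points

# Both scores from the text collapse to the same state.
print(away_score(19, 18, 21))  # (2, 3)
print(away_score(5, 4, 7))     # (2, 3)
```

Indexing the dynamic programming table by away-scores means one table can serve matches of any length.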

# Redoubles

Now what happens if either side has the right to double the stakes, but there are no redoubles? This is a trivial case: it makes no sense to refuse a double, and if one side declines to double, he can’t prevent the opponent from doubling instead. Since anything that hurts one player's match chances helps the other's, at least one side will always be happy to double. Therefore, it must be correct for one side or the other to double.

But what if redoubles were allowed? This means e.g. if Hero proposes to play for 2 VP then Villain can propose to play for 4 VP before the game starts. Then things may get weird. I will leave the analysis as an exercise for the reader.

# Doubling During the Middle of a Game

Hitherto, we have assumed that doubling could only occur before the start of every game. In this case, each individual game can be thought of as a black box: the only relevant parameter is the winning chances of a single game. Moreover, it never makes sense to pass a double, since the player who passes faces the same “pre-starting position” again while his opponent collects a free point.

But if doubling were allowed during a game then the “structure” of the game tree becomes important. For instance, two different games could have the same winning chances of 25% but a different structure. If Hero doubles judiciously then he can leverage the structure to improve his overall chances of winning the match (of course he can’t leverage the structure to improve his chances of winning a particular game – if Villain were hell-bent on winning the next game at all costs, then he would never pass a double).

To illustrate the concept of structure, assume we are interested in maximising the expected number of VP for a single game instead of a “match-to-N-points”. Consider the following “random-walk-game”: A happy star randomly moves to one of the twelve coloured leaf nodes, with each node occurring with probability 1/12. If the colour is Green (Red) then Hero wins (loses) one VP. If no doubling cube is used then basic math says the expected gain per game is 1/3 = 0.333 VP for both structures depicted in the left and right halves of the diagram below.

Now suppose that Hero has the right to double before the game and Villain can double when the happy star reaches one of the three “intermediate” nodes. On the left diagram, assume that Hero doubles (since there are more greens than reds). Villain should accept and then Hero can expect to win 0.667 VP. But on the right diagram Hero is only winning 0.5 VP if Villain uses correct doubling strategy. This is left as an exercise for the reader.

Therefore, structure is important: without the cube Hero wins the same expected VP in both games, but with the cube Hero prefers the first game. Another lesson is that ownership of the cube (i.e. exclusive right to make the next double) is worth some equity. As a general rule, if all other things are equal then whoever owns the cube prefers game states that require many moves before one side has a decisive advantage.
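To make the effect of structure concrete, here is a small sketch with two made-up trees. Everything in it is an assumption on my part: the `LEFT` and `RIGHT` layouts are hypothetical, chosen only so that each has eight green and four red leaves and reproduces the figures quoted above, and they are not necessarily the layouts in the diagram. The rules modelled are that Hero doubles to 2 VP before the game, Villain may redouble to 4 VP at an intermediate node, and Hero may then take or pass (losing 2 VP).

```python
from fractions import Fraction

# HYPOTHETICAL layouts: each intermediate node is (green leaves, red leaves).
LEFT = [(3, 1), (3, 1), (2, 2)]
RIGHT = [(1, 2), (4, 0), (3, 2)]

def no_cube(nodes):
    """Hero's expected VP when every game is worth exactly 1 VP."""
    total = sum(g + r for g, r in nodes)
    return Fraction(sum(g - r for g, r in nodes), total)

def hero_doubles(nodes):
    """Hero's expected VP if he doubles first and Villain may redouble at a node."""
    total = sum(g + r for g, r in nodes)
    value = Fraction(0)
    for g, r in nodes:
        edge = Fraction(g - r, g + r)            # Hero's expectation per 1 VP staked
        play_on = 2 * edge                       # Villain leaves the cube alone
        redoubled = max(4 * edge, Fraction(-2))  # Hero takes at 4 VP or passes for -2
        node_value = min(play_on, redoubled)     # Villain picks what hurts Hero most
        value += Fraction(g + r, total) * node_value
    return value

print(no_cube(LEFT), no_cube(RIGHT))            # 1/3 1/3
print(hero_doubles(LEFT), hero_doubles(RIGHT))  # 2/3 1/2
```

In the hypothetical right-hand tree, the whole difference comes from the node with one green and two red leaves: Villain redoubles there and Hero is better off taking than passing, so Hero's loss at that node doubles.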

Hopefully this example makes it clear why a simple mathematical analysis breaks down when we consider real games with doubling decisions occurring during the game. Similar considerations obviously apply when computing optimal match strategy rather than expected VP in a single game.

# Summary

In this post I show that optimal use of the doubling cube is a lot more complex than “always double (take) if our winning chances are at least 80% (20%)”. There are three caveats:

- The elephant in the room is that nobody knows how to reliably estimate the winning chances of a specific game state. Not even Spider GM can do this, unless there are very few cards unseen.
- 80% and 20% turn out to be optimal parameters only if we assume the winning chances change continuously rather than suddenly (think Brownian motion instead of quantum leaps!).
- The parameters 80% and 20% also assume we are playing for money (e.g. $1 = 1 VP, aim to maximise expected winnings). They don’t work in match-to-N-points, especially when we reach the pointy end with both sides close to victory.
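For the curious, here is a sketch of the standard continuous-model argument behind the 20% figure. The symbols $t$ (the take point) and $E$ (expected VP) are my own notation. The argument assumes money play, winning chances that drift continuously, and that whoever takes the double owns the cube and redoubles at exactly the right moment.

Suppose Hero is doubled to 2 VP when his winning chances are $t$. If he passes he loses 1 VP. If he takes, he either drifts down to 0 and loses 2 VP, or climbs to $1-t$, where his own redouble puts Villain at Villain's take point and is worth 2 VP to Hero. Since the winning chances form a fair random walk, the chance of reaching $1-t$ before 0 is $t/(1-t)$:

$$
E_{\text{take}} = 2\cdot\frac{t}{1-t} - 2\left(1-\frac{t}{1-t}\right) = \frac{4t}{1-t} - 2
$$

Hero is indifferent between taking and passing when the two options are worth the same:

$$
\frac{4t}{1-t} - 2 = -1 \;\Longrightarrow\; 4t = 1 - t \;\Longrightarrow\; t = \frac{1}{5}
$$

So the take point is 20%, and by symmetry the doubler's threshold is $1 - t$ = 80%.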

If we can deal with the elephant in the room, then this doubling strategy should be a good starting point, coupled with a few “common-sense tweaks” near the end of the match. For instance, you never double when you are 1 point away from winning the match. But one could argue it is precisely the elephant in the room that makes Spider Solitaire such a great game 😊

# Fun Fact

If Hero has the exclusive right to make the next double then it is possible to construct a pathological game tree where one can change some green nodes to red while also changing Hero’s correct doubling decision from No-Double to Double. A well-known Backgammon example is the Jacoby Paradox.

Very interesting! I think I follow. In the case of the colored green and red balls, the villain steals one of the hero’s chances of winning by making a double which hero is best off declining. I assume the 80-20 rule means you should double if you think you have 80% chance of winning. But it doesn’t say at what chance the opposing party should accept as opposed to decline the double.

I provide results based on what I think are called Monte Carlo simulations — I don’t do formulas, I just run many thousands of examples with the parameters (and a random number generator) and assume the answer is the mean.

One tiny thing I’ll mention is I once worked with a guy who played backgammon against the computer (this was roughly 1986). What he would do was wait until he was behind, then double, and the program would keep doubling and he’d double back until the cube was at 64. Then if he was going to lose he’d just exit from the program, but if he won he chalked up 64 points. The program kept cumulative scores, and he played often enough (roughly 512 times?) that he determined that the program’s counter did not wrap around at 32767. Whatever floats your boat, I guess.

I will await with interest your proposal for what sort of game we might play next.
