In this post I wish to discuss the mathematics of doubling in some detail.
Let us assume Hero and Villain are playing a “match-to-three”. That is, first player to win three VP is the overall match winner. Each individual game is worth 1 VP. The actual game being played is irrelevant: it could be Chess, Agricola, Snakes and Ladders, the royal game (with Hero moving the cards and Villain trash-talking at every sub-optimal play), or a well-known poker variant where both players start with three items of clothing. To make things interesting, assume the winning probability of an individual game is slightly less than 50%. To be specific, I will set the win rate to 40%. What are the chances of Hero winning the overall match?
This kind of question is most easily solved with dynamic programming. The boundary condition says that if someone already has three wins then the overall winning chances is either 0% or 100%. Next, we can compute the winning chances when the score is 2-2. We can then work our way backwards, eventually arriving at 31.7% winning chances for the Hero at 0-0.
You will notice that if the match score is X-X then the Villain prefers smaller numbers of X. Intuitively, smaller values of X means that it is less likely that Hero will find enough “random noise” to overcome Villain’s advantage in the long run. The following diagram summarises the probability of Hero winning the match at every possible match-score:
I should mention that Bart has already done his homework, and he knows how to compute the probabilities corresponding to each match score. If the winning chances of an individual game are 0.25 then he correctly computes the following:
0.049 chance of winning 5-point match, no doubling
0.104 chance of winning 3-point match (equivalent to 5-point match with 2VP per game)
I will leave this as an exercise for the reader.
Now let us give Hero the following handicap: Before each game, Hero can demand the game be played for 1 VP or 2 VP. Moreover, there is no bonus for winning the match with 4 VP instead of 3 VP.
This means, for instance, if Villain has 2 points then Hero will always play for 2 VP. Again, we can use dynamic programming to compute the winning chances. If you use Excel to perform the dynamic programming then you will need the function max(FOO,BAR) somewhere in your calculations. You will notice several things:
The boundary conditions now include either player having 3 or 4 VP.
The table only shows winning percentages, but not whether Hero should play for 1 VP or 2 VP.
The numbers look weird: The winning chances on the main digonal is no longer strictly increasing and the winning chances for 1-2 is the same as 1-1.
The latter is easily explained. Since Hero can choose to play for 2 VP, Villain gets no advantage from being 1-2 instead of 2-2. Slightly more interesting is a match-score of 1-1. If Hero plays for 2 VP, then the next game decides the match. If Hero plays for 1 VP then the worst case scenario is the match score becomes 1-2, in which case Hero can play for 2 VP and the next game decides the match. Therefore, it must be correct to play for 1 VP. It turns out Hero is close to breaking even thanks to his handicap.
It is not hard to get an Excel spreadsheet to crunch the numbers for different parameter values. We can, for instance, figure out what happens in a match to 13 points with Hero’s winning chances down to 30% for an individual game.
When we talk about longer matches, it is generally more convenient to think in terms of number of VP remaining instead of number of VP already scored. For instance, 19-18 in a 21-point match is equivalent to 5-4 in a 7-point match and it makes sense to describe both of them as “2-away 3-away”.
Now what happens if either side has the right to double the stakes, but there are no redoubles? This is a trivial case: it makes no sense to refuse a double. If one side declines to double, he can’t prevent the opponent from doing the same. Therefore, it must be correct for either side to double.
But what if redoubles were allowed? This means e.g. if Hero proposes to play for 2 VP then Villain can propose to play for 4 VP before the game starts. Then things may get weird. I will leave the analysis as an exercise for the reader.
Doubling During the Middle of a Game
Hitherto, we have assumed that doubling could only occur before the start of every game. In this case, each individual game can be thought of as a black box – the only relevant parameter is the winning chances of a single game. Moreover, it never makes sense to pass a double since the loser gets the same “pre-starting position” and the winner gets a free point.
But if doubling were allowed during a game then the “structure” of the game tree becomes important. For instance, two different games could have the same winning chances of 25% but a different structure. If Hero doubles judiciously then he can leverage the structure to improve his overall chances of winning the match (of course he can’t leverage the structure to improve his chances of winning a particular game – if Villain were hell-bent on winning the next game at all costs, then he would never pass a double).
To illustrate the concept of structure, assume we are interested in maximising the expected number of VP for a single game instead of a “match-to-N-points”. Consider the following “random-walk-game”: A happy star randomly moves to one of the twelve coloured leaf nodes, with each node occurring with probability 1/12. If the colour is Green (Red) then Hero wins (loses) one VP. If no doubling cube is used then basic math says the expected gain per game is 1/3 = 0.333 VP for both structures depicted in the left and right halves of the diagram below.
Now suppose that Hero has the right to double before the game and Villain can double when the happy star reaches one of the three “intermediate” nodes. On the left diagram, assume that Hero doubles (since there are more greens than reds). Villain should accept and then Hero can expect to win 0.667 VP. But on the right diagram Hero is only winning 0.5 VP if Villain uses correct doubling strategy. This is left as an exercise for the reader.
Therefore, structure is important: without the cube Hero wins the same expected VP in both games, but with the cube Hero prefers the first game. Another lesson is that ownership of the cube (i.e. exclusive right to make the next double) is worth some equity. As a general rule, if all other things are equal then whoever owns the cube prefers game states that require many moves before one side has a decisive advantage.
Hopefully this example should make it clear why a simple mathematical analysis breaks down when we consider real games with doubling decisions occurring during the game. Similar considerations obviously apply when computing optimal match strategy rather than expected VP in a single game.
In this post I show that optimal use of the doubling cube is a lot more complex than “always double (take) if our winning chances are at least 80% (20%)”. There are three caveats:
The elephant in the room is nobody knows how to reliable estimate the winning chances of a specific game state. Not even Spider GM can do this, unless there are very few cards unseen
80% and 20% turn out to be optimal parameters – only if we assume the winning chances change continuously rather than suddenly change (think Brownian motion instead of quantum leaps!).
The parameters 80% and 20% also assume we are playing for money (e.g. $1 = 1 VP, aim to maximise expected winnings). It doesn’t work in match-to-N-points, especially when we reach the pointy end with both sides close to victory.
If we can solve the elephant in the room, then this doubling strategy should be a good starting point – coupled with a few “common-sense tweaks” near the end of the match. For instance, you never double when you are 1 point away from winning the match etc. But one could argue it is precisely the elephant in the room that makes Spider Solitaire such a great game 😊
If Hero has the exclusive right to make the next double then it is possible to construct a pathological game tree where one can change some green nodes to red while also changing Hero’s correct doubling decision from No-Double to Double. A well-known Backgammon example is the Jacoby Paradox.
IM Bart has reminded me via email I haven’t yet explained the joke on 1st of April.
It turns out if you click the last link with words “please click this link” you do not get the video of Never Gonna Give You Up. Instead you reach my Spider Solitaire paper in the UNSW high school Parabola (which doesn’t involve any Rick-Roll at all).
In this post I will discuss why Villain doubled after dealing the cards in round 1.
The position in round 1 is this:
There is only one way to get two turnovers. The best play is “fg,bf,bc,jc,jb” giving the following position (I used MS Paintbrush instead of explicitly undoing moves in the Spider Solitaire program).
Now let us compute the chances of improving our minimum guaranteed turnovers after turning the card in Column 10. For simplicity assume each of the 13 ranks are equally likely. Let us assume each rank is worth one happy star if it yields a turnover, 0.5 happy stars if you need the correct suit to get the extra turnover and 0 happy stars if no turnovers regardless of suit. Multiple turnovers are possible. For instance, 1.5 happy stars means you always get one extra turnover, with a chance of a second turnover if you were allowed to call the suit, etc.
Note that if the next card in Column 10 were an Ace or Four then we don’t get an extra turnover since we counterfeited the turnover in column 9. It turns out the only good ranks are Three, Nine, Jack and Queen. That’s only 4 happy stars out of a possible 13. Actually, we can increase that to 4 and a half, since a Jack of Spades gives us a double turnover in columns 8 and 10 to go with our turnover in column 9. Given that we only start with two turnovers, it’s about an even chance we get bad cards in both column 9 and column 10.
The other piece of bad news is there are no “atomic columns”. Note that we were forced to pollute column 6 to guarantee our two turnovers in the first place. If we assign the capital letters A and B to columns 9 and 10 respectively, then it’s hard to find decent plans corresponding to the rest of the alphabet. This also means that we need to turnover every card in column 9 or 10, not just the first to get a fighting chance.
In a nutshell, we have difficult short-term problems and long-term problems to deal with. This is why Villain doubled. If I were in Hero’s shoes, I would have passed the double.
Once upon a time, there lived a Beaver in the Animal Kingdom.
The Beaver had just beat the highest difficulty level of Spider Solitaire – four suits sans undo. He felt he had played well after a difficult start, but it was hard to judge his overall ability at the game. After all, one wins and zero losses does not a large sample size make. And the fact none of his friends displayed any aptitude for the Royal Game certainly didn’t help. So, the Beaver decided to have a chat with his best friend, the Raccoon, who was known for his extensive knowledge of all things mathematics.
“It’s hard to judge your playing strength after one game,” said the Raccoon. “You need to play a large number of games to prove your victory wasn’t just beginner’s luck.”
“Suppose I played 129 games in a row,” replied the Beaver, plucking a three-digit number at random. “Then we can tally up my wins and losses and then we have a much better understanding of where I’m at.”
“Agreed,” replied the Raccoon. “Right now, the only thing we can agree on is you can play a hell of a lot better than I can.”
The Beaver chuckles, and he soon notices Captain Obvious is eager to join in the conversation.
“The only problem is it will take a long time to churn through 129 games,” says Captain Obvious. “Spider GM probably doesn’t wanna hear this but we all have better things to do in our lives than playing the Royal Game all day.”
“True,” says Raccoon. “Very True.”
Hang on, thinks the Raccoon. 129 happens to be a power of two plus one. This has me thinking – what if we can involve powers of two somehow? Let us say some games can be worth more than others. Suppose that each individual game was worth N victory points, where N was a power of two. A series of 129 games is equivalent to “First to 65 wins”. This should speed things up considerably. But Captain Obvious will gleefully point out Spider Solitaire is a game for one player, not two. Hang on (***thinks for a while***) I think I might have something.
“Okay I have an idea,” says Raccoon.
“What is it?” asks the Beaver and Captain Obvious simultaneously.
“Let us pretend Beaver is the protagonist,” says Raccoon. “Only Beaver can move any cards. I am the Antagonist and I am willing Beaver to lose.”
Using a stick, the Raccoon sketches a hypothetical cube with all powers of 2 between 1 and 32.
“Initially, each game is worth 1 Victory Point. If Beaver thinks he has a good position, then he can double the stakes. I must concede 1 VP or agree to play on for 2 VP. Similarly, if I think Beaver has a poor position then I can double the stakes and Beaver has the same choice of refusing or accepting.”
“Sounds interesting,” says Beaver. “But if my game state were really bad, can’t you just double the stakes after every move? That wouldn’t be very interesting”
“That is correct,” replies the Raccoon. “Therefore, I propose another rule: if either side doubles the stakes and the opponent accepts then the opponent has the exclusive right to make the next double.”
“So that means, if I get a poor position, you double, I accept, then I turn the game around, then I can redouble and play for four VP?”
“Quite correct,” replies the Raccoon.
“Wait a minute,” says Captain Obvious. “If first to 65 wins then is it possible to get more than 65 if the doubling cube is more than 1?”
“Yes,” replies the Raccoon. “It doesn’t matter if you’re above 65 or exactly equal to 65. And before you ask, it’s perfectly legit for someone to double near the end of the match regardless of the game state because the math says he has nothing to lose.”
“Just to touch base,” says the rot13(fzneg nff) as he gleefully pokes the rot13(nff) of Captain Obvious, “does that mean only Beaver can moves cards, but both Beaver and Raccoon participate in cube-decisions.”
“That’s correct,” says Raccoon. “Even though I don’t move any cards, I can still participate in evaluating the winning chances of a given game-state. Win-win for everybody since I get a chance to improve my game as well.
This idea proved quite successful, and soon Raccoon was discussing the implications of the doubling cube with his friends, many of whom were also avid mathematicians. They had independently discovered some interesting theory and concepts such as market losers, the Crawford Rule, Jacoby Paradox, Woolsey’s Law for Doubling and so on. Not surprisingly, much of this theory is well-known to expert Backgammon players today.
For the record, the Beaver managed to win 66-42, although that may have been a function of Raccoon’s limited understanding of the Royal Game (and hence sub-optimal decisions with the cube). At least it was a lot better than the 8-65 drubbing that Raccoon received when they reversed the roles of Protagonist/Antagonist. Initially the Raccoon thought the best equaliser for a mediocre player is to play each game at high stakes and hope to get lucky, even if the game state rot13(fhpxrq) since a long match would allow the antagonist to “grind” his way to victory. But the Beaver thought it was better to be aggressive with even marginal advantages – for instance if an intermediate player starts with six guaranteed turnovers or a “good five” then he should immediately double. Then at least he is fighting from a position of strength. If the protagonist thought his chances without a doubling cube were 50-50 then he is probably better off grinding and should hope to win on skill, not luck.
And the less said about Ninja Monkey’s first Match-to-65 and his infamous random move algorithm the better 😊
Well, that didn’t last long. Hero got off to a bad start after Round 1 of Game 1 and decided to go all-in on a lousy hand, reasoning that if Hero’s chances of winning an individual game were less than 50%, then the chances of winning a long match would be rot13(fuvg) regardless.
I can see where Bart/Bug are coming from, but if the “protagonist” believes his chances of winning an individual game are less than 50% I think a better strategy is to be aggressive with even marginal advantages. For instance, Hero could insta-double if the initial game state allowed six guaranteed turnovers or a “good five”. That way, Hero would at least be fighting from a position of strength. If I had to play a 25-point match against Kit Woolsey or Paul Magriel, I would certainly consider a similar strategy – looking for any excuse to double from a position of strength. I would be less sure about accepting/refusing when my opponent doubles, but at least I would avoid any kamikaze redoubles unless the value of the cube is already enough to give opponent a win.
Bart has had a go at analysing the mathematics of an unbalanced match where Hero has, say, a 25% chance of winning a match to 5 and tries to equalise the match to some extent through judicious use of the cube. I will have more to say about this in a future post.
I don’t have much to say about the actual card-play. Hero had a reasonable position after round 0, managing to turn over every face-down card in column 7, albeit without getting the empty column. It’s hard to make bad decisions when we have very few face-up cards, no spaces, and all face-up cards are arranged in descending sequence. But after dealing a less-than-stellar 10 cards in round 1 the situation was already desperate. Rightly or wrongly, Villain immediately doubled the stakes and before we knew it, the number on the D-cube exceeded the number of points required for victory.
When you have a lousy position, the good news is it’s hard to make a mistake because your options are so limited. But we all know what the bad news is. Indeed, I couldn’t find any serious card-play mistakes by Bart or Bug for the entire game. Maybe Hero could have taken an extra turnover in round 3 at the expense of exposing two more Aces, but that’s always easier to say with hindsight. I think it was one of those games we were destined to lose. I’ve had similar games when playing with Microsoft Spider Solitaire and – dare I say it – I have no reason to believe this version is biased.
In the post-mortem, I will discuss in some detail why Villain doubled after we dealt the initial 10 cards in round 1. The TL;DR version basically says “I know from vast experience this is probably not gonna end well”.
Again, another short round. The game state is rather poor – if we don’t get a space in column 5 then there are very few plans corresponding to any letter of the alphabet after ‘A’. And sure enough, we got a dreaded King of Diamonds. At least we managed to clear every face-down card in Column 2 and our Club suit is almost complete. The only bright spot is it’s hard to make a mistake when options are severely limited. Maybe it would have been wiser to decline the initial double and start afresh. At least we know next time if Villain is willing to stake the entire match on a single game then he probably has a very good reason.
With three Eights being dealt simultaneously, I think it would be good to see a few Nines in the next round …
Actual play (3 April, score = 441): eh, ei (Kd) ba, bh, bf (Qh)
Actual play (trivial): bd (3h) gf, jg
Actual play: ???
Spider GM Comments: I’m assuming this completes the round, unless Bart/Bug can justify moves like “gd” or “hg” etc. Also, a rare lapse from IM Bug who forgot to mention the Jack of Clubs in his song lyrics.