Tower of Hanoi

The Tower of Hanoi is a simple mathematical puzzle. You are given three rods and a number of discs of different sizes. The puzzle starts with all discs stacked on a single rod, largest at the bottom. Your aim is to move the entire stack to a different rod according to the following rules:

  • Only one disc can be moved at a time.
  • No disc can sit atop a smaller disc.

It is not hard to show that with N discs, we can achieve the goal in 2^N – 1 moves. The simplest proof is to observe that with N discs we need to perform three steps: (i) shift the top N-1 discs to an empty rod, (ii) shift the bottom disc to the other empty rod, (iii) shift the N-1 smaller discs back onto the bottom disc. In other words we reduce the problem with N discs to a problem with N-1 discs, so the move count M satisfies M(N) = 2M(N-1) + 1 with M(1) = 1, and mathematical induction establishes the formula 2^N – 1.
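For the programmers out there, here is a minimal Python sketch of that recursion (the rod labels and the move list are my own additions, used only for counting):

```python
def hanoi(n, source, spare, target, moves):
    """Move n discs from source to target using spare, recording each move."""
    if n == 0:
        return
    hanoi(n - 1, source, target, spare, moves)   # (i) park the top n-1 discs on the spare rod
    moves.append((source, target))               # (ii) move the largest disc
    hanoi(n - 1, spare, source, target, moves)   # (iii) put the n-1 discs back on top

moves = []
hanoi(4, "A", "B", "C", moves)
print(len(moves))   # 15, i.e. 2**4 - 1
```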

With similar reasoning one can show that any legal position of discs (i.e. one in which no disc covers a smaller disc) can be reached. The proof is left as an exercise for the reader.

The Tower of Hanoi is an example of shifting a large pile of items with limited resources. If you are not familiar with this puzzle, you will probably be surprised by the fact that only three rods are required no matter how many discs you start with. Avid readers of this blog may have come across terms like “Tower-Of-Hanoi manoeuvres” from previous posts, so if you were unsure what the fuss was all about, then now you know 😊.

In Spider Solitaire we are often confronted with the problem of shifting large sequences of cards with limited resources. A simple example is shown below: a complete suit of Spades is visible, but can we actually clear the suit with only one empty column?

The answer is yes. We can shift the Eight of Diamonds onto the Nine of Diamonds in column six, build the J-0-9 of Spades onto the K-Q in column two, move the 8-7-6-5 of Spades from column five onto the 9 of Spades, swap the 4H and 4S on top of both the Spade Fives, and finally add the Ace of Spades from column three to complete the suit.

Going back to the Hanoi puzzle, with a small number of discs a monkey could probably luck his way into a solution by making random moves, but once you get a decent-sized pile of discs the random-move strategy doesn’t work so well! Also, with random moves it is difficult to prove that, say, 30 moves or fewer is impossible with five discs. Similar considerations apply to Spider Solitaire. Since the above example is relatively simple, a monkey could probably complete a suit of Spades by repeated trial and error, assuming he only makes moves that are “reversible”. But with a more complex problem, the monkey won’t do so well.
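If you want to see just how badly random play scales, here is a rough simulation sketch (the state representation and move generator are my own assumptions): it plays uniformly random legal moves until the puzzle happens to be solved.

```python
import random

def random_hanoi(n_discs, max_moves=10**6):
    """Play uniformly random legal moves; return how many moves it took (or None)."""
    rods = [list(range(n_discs, 0, -1)), [], []]   # rod 0 holds all discs, largest at the bottom
    for move_count in range(1, max_moves + 1):
        legal = [(a, b) for a in range(3) for b in range(3)
                 if a != b and rods[a] and (not rods[b] or rods[a][-1] < rods[b][-1])]
        src, dst = random.choice(legal)
        rods[dst].append(rods[src].pop())
        if len(rods[1]) == n_discs or len(rods[2]) == n_discs:
            return move_count                      # all discs now sit on a different rod
    return None

print(random_hanoi(3))   # optimal is 7 moves; random play typically needs far more
```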

If you want more practice with “Tower-of-Hanoi manoeuvres” I recommend the following exercise: set up the diagram above, ignoring any face-down cards or cards not in sequence (for instance in column two you keep only the K-Q of Spades). Then try to minimise the number of in-suit builds using only reversible moves (you should be able to get pretty close to zero). From this new position pretend you’ve just realised your mistake and try to clear the Spades using only reversible moves. This exercise should give you an idea of why empty columns are so valuable.

Note that all this assumes there is no 1-point penalty per move (a penalty commonly used in many implementations of Spider Solitaire). If there were such a penalty then we would have to think twice about performing an extra 50 moves just for the sake of one more in-suit build. But for now we’ll keep things simple.

A closer look at Choose Your Difficulty

In Microsoft Spider Solitaire a player can choose 1/2/4 suits and a difficulty level. A player can gain Experience Points by winning games of Spider Solitaire, and after gaining enough XP he can level up. After sufficient levelling up, the player might even win some percentage of Microsoft shares or dot com stock options … uhhh just kidding 😉

The XP gained is shown in the table below.

Experience Points Table

The first thing to notice is that not all combinations of “suit count” and difficulty are legit. For instance there is no Grandmaster hand at the 1-suit level, and the minimum difficulty at the 4-suit level is Expert. A random deal presumably means the deck is properly shuffled (in math terms, all 104! orderings occur with equal probability, ignoring the equivalence of cards of the same rank and suit), and the player is explicitly warned that such deals may be unsolvable. Any deal other than random is guaranteed to be solvable with sufficient luck or the use of boop.

Obviously it is difficult to measure how hard a game really is. For instance, at the 1-suit level should beating an Easy or a Medium hand be worth the same as beating a Hard hand? At least increasing the number of suits or the difficulty level increases the XP gained, which is what we expect. So far so good.

However, I noted that the XP gained for a Random deal is equal to the XP gained for the lowest permissible difficulty level at the same number of suits, and this makes little sense.

For the sake of argument let us assume we have 400 hands at the four-suit level. 300 of these are solvable and are arranged in order of increasing difficulty from left to right; the remaining 100 are unsolvable and occupy the right-most 100 slots in random order. An Expert deal would choose randomly from the left-most 100 hands, but a Random deal would choose randomly from the entire 400. Clearly it should be easier to beat an Expert deal than a Random deal, and therefore the latter should be worth more XP than the former.

In practice, the overwhelming majority of games are winnable, even at the four-suit level (although I know many folk will dispute that claim!), so the above example should really have, say, 301 = 100+100+100+1 hands instead of 400. Essentially Random is equivalent to “Random but guaranteed winnable”. Therefore the same reasoning says a Random hand should be worth less than a Grandmaster hand. Perhaps a Random deal should be worth the same as a Master deal, maybe a little more or a little less, but it certainly should be worth more than Expert. Of course, similar considerations apply to the 1-suit and 2-suit levels.

Perhaps some dude who is much, much smarter than I am can write a Ph.D. thesis on the true worth of XP for a given difficulty level and number of suits. Another Doctor of Spider Solitaire anyone??? 😊

My friend is a Doctor of Spider Solitaire :)

It’s official – I have awarded my Scrabble friend a Doctorate of Spider Solitaire. His first actual attempt was:

  • Philosophy -> Peter Thiel -> Forbes 400 -> Bill Gates -> Microsoft Windows -> Windows 3.0 -> Microsoft Solitaire -> Spider (solitaire).

Unfortunately Peter Thiel does not link to Bill Gates in 1 step, and there were a few false leads with some Windows versions (e.g. XP) not having a Microsoft Solitaire link.

For those who prefer visuals – here is a screen dump showing multiple routes from philosophy to Spider (solitaire):

My friend says he likes paths that go through Creed Bratton/The Office (visible on the left if you look closely). I have nothing much to add here 🙂

From Spider Solitaire to Philosophy – and Back Again

And now for something completely different:

Let us try the following experiment. We start with the Wikipedia page on Spider Solitaire and then do the following:

  • Click the first link of the “main text” (ignoring anything in parentheses).
  • Rinse
  • Repeat

From the screenshot below, step 1 says we should click on the word “patience”.


After a few iterations we reach a closed loop of the form Philosophy > Existence > Ontological > Philosophy.

The interesting phenomenon is that the starting point is almost always irrelevant: if you pick a random page then it is heavy odds-on that you reach the same closed loop involving “Philosophy”. Not surprisingly, Wikipedia itself has a page on this phenomenon, and it is estimated (as of February 2016) that 97% of all articles in Wikipedia lead to Philosophy. The remaining articles either lead to “sinks” (pages with no outgoing wikilinks), non-existent pages, or closed loops other than Philosophy. This phenomenon was pointed out to me by someone from Adelaide University on the 4th of March.

Just for the record, here is the chain that starts with Spider Solitaire. I will not discuss this chain in detail – the reader is invited to draw his or her own conclusions:

  • Spider (solitaire)
  • Patience
  • Card games
  • Game
  • Play
  • Intrinsically motivated
  • Desire
  • Emotion
  • Biological
  • Natural science
  • Branch of science
  • Sciences
  • Knowledge
  • Facts
  • Reality
  • Imaginary
  • Object
  • Modern Philosophy
  • Philosophy > Existence > Ontological > Philosophy

Being a self-proclaimed Grand Master of Spider Solitaire, I am more interested in the reverse process. Starting from the Wikipedia page on Philosophy, is it possible for me to follow outgoing links of my choice (not necessarily the first) and eventually land on the Spider Solitaire page? I don’t have a definitive answer. All I know is the random-link algorithm proposed by my good friend Ninja Monkey doesn’t work so well!

If anybody can find a path from Philosophy to Spider Solitaire I will be happy to grant said person the title of Great Grand Master of Spider Solitaire. Challenge accepted anyone?

Who moved my Phone?

“Where is my damn phone?” I yell.

One of these days I’m gonna have to get rid of this bad habit. I’m pretty sure I left it under the tree like three minutes ago … right next to where Ninja Monkey is sitting … OH FOR 70,85,67,75,83 SAKE!!!!!!!!

“This is weird”, says Ninja Monkey.

“Ninja Monkey,” I say sternly. “We need to talk.”

Ninja Monkey shows me my phone. Somehow he has reached level 742 in Jewels Magic. Given his fascination with random move algorithms I’m pleasantly surprised to find he hasn’t made any in-app purchases yet.

“This game is rigged,” says Ninja Monkey.

I suddenly remember that Monkey and I published a paper about a certain Spider Solitaire game being rigged some time ago. Maybe the Ninja Monkey is onto something after all.

“Why is level 742 of Jewels Magic rigged?” I ask.

“I realised random move algorithms ain’t always what they’re cracked up to be,” says Ninja Monkey. “I’m not very good with these abstract strategy games – so I asked my friend Wise Snail for insights.”

“As you know,” says Wise Snail, “being the World’s slowest Spider Solitaire player I like to analyse the current game state to the Nth degree before making a move.”

***Sigh***

Why couldn’t Ninja Monkey at least ask one of my better students for advice?

“<sarcasm> What fascinating insight did you come up with this time? </sarcasm>” I ask.

“I soon realised if I wait for three seconds then the game will highlight 3 or more jewels of the same color,” replies the Wise Snail.

“So your new strategy is just wait for three seconds and then play whatever move the app suggests?”

“I know I’m not the best player, but my strategy has one important advantage: If you’re trying to prove a game is rigged then nobody can accuse you of deliberately playing sub-optimal moves to promote your desired hypothesis, null or otherwise.”

“True,” I respond. “Very true.”

“We start with 26 moves,” says Ninja Monkey. “The goal says we need to collect 50 red, 50 blue and 50 orange jewels. If I use the suggested-move algorithm instead of the random-move algorithm then I always collect plenty of red and orange jewels but very few blue jewels.”

“That is weird,” I reply. “There is no logical reason why one colour should be favoured over another. That’s like you-know … racism or something like that.”

“I ran the following test,” says the Wise Snail. “I played 10 games on level 742, stopping whenever one of the jewel counts reaches zero or I run out of moves. I got the following results:”

Red   Blue   Orange
0     29     8
0     29     0
0     22     7
2     30     0
8     37     0
0     31     9
0     31     12
0     38     7
3     39     0
0     39     8

“So that means the blue number is always largest, and by a country mile,” I say.

“Of course that doesn’t tell us why it behaves that way.”

“But that’s all I need to know,” I reply. “Q.E.D. The game is rigged. Maybe I should write an angry-gram  to the developer of this game.”

“I agree,” says the Snail. Unfortunately he takes a minute just to type the word “Dear” on my phone.

“Let me have a go,” says Monkey. He can literally type at one million words per minute but unfortunately he can only produce gibberish of the highest quality.

Fine. I have to type the angry-gram myself. It takes three minutes, and I finally press Send. Whoosh!

Hmmm … perhaps it’s time for another collaboration with Ninja Monkey and the Wise Snail. For now, they’re back in the good books again. But if I catch them playing with my phone once more without my permission then I might reconsider …

THE END

Rank Imbalances

In this lesson we will examine the issue of rank imbalances. The current diagram shows the state of play after we dealt a second row of cards from the stock.

One thing you may have noticed is that we seem to have a lot of Jacks but not many Tens. To be more precise, we have only one Ten but six Jacks – a delta of five. If we construct the entire “histogram” of card ranks we will also find several other discrepancies between pairs of adjacent ranks, such as seven Fours but not as many Threes or Fives.

Of course all players would know that such imbalances are less than convenient, but how best to deal with them?

One thing to note is that we can’t affect the probability of turning over specific cards. For instance, there are two Jacks remaining and 104-49=55 cards unseen. The probability that the next exposed card is a Jack is always 2/55 no matter how well or badly we play (if we deal from the stock we can always pretend 10 cards appear sequentially instead of simultaneously). But we can mitigate the effects to some extent. For example if two Jacks are buried under a King then the effects of too many Jacks will be attenuated, but if the only Ten was buried under two Kings then that is obviously much worse.
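Here is a tiny simulation to back up that arithmetic. The only inputs are the numbers quoted above (2 Jacks among 55 unseen cards); the second figure treats the next stock deal of 10 cards as a random sample:

```python
import random

unseen = ["J"] * 2 + ["x"] * 53          # 2 Jacks among 55 unseen cards (from the example above)

trials = 100_000
next_is_jack = 0
deal_has_jack = 0
for _ in range(trials):
    random.shuffle(unseen)
    next_is_jack += unseen[0] == "J"      # chance the very next exposed card is a Jack
    deal_has_jack += "J" in unseen[:10]   # chance a 10-card stock deal contains at least one Jack
print(next_is_jack / trials)   # converges to 2/55 ≈ 0.036, no matter how we play
print(deal_has_jack / trials)  # ≈ 0.33
```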

It is beyond the scope of this post to discuss in detail how to deal with rank imbalances, but a general principle is that the more flexibility you have, the better your chances will be – and the best way to retain flexibility is to procrastinate whenever possible.

Consider the following questions:

  • Can we get back our empty column? (a good question to ask whenever at least one column has no face-down cards!)
  • Can we increase the number of in-suit builds?
  • How many guaranteed turnovers do we have? (Note that an empty column usually equates to one more turnover, but not always).
  • What would be your next play?

As an aside, here’s a question for the math geeks among you. Do you think we would be justified in complaining about our bad luck seeing that out of 49 exposed cards there are six Jacks but only one Ten?

One might try to compute the probability that out of 49 cards we will get at least six Jacks and at most one Ten. Computer simulation says the chances are 0.61 per cent.

Not so fast. Note that I specified “Jacks versus Tens” after seeing the current game state, which is clearly unfair. Either we have to guess a pair of adjacent ranks in advance (e.g. Fours vs Threes), or alternatively include all ranks. In the latter case we might ask “what is the probability that out of 49 cards we will get at least six repeats of X and at most one Y for some pair of adjacent ranks X and Y?” Remember that X can be either Y+1 or Y-1. In this case the probability is about 12.5%.
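For the curious, a Monte Carlo estimate along these lines could look roughly like this in Python (my own sketch of the kind of simulation described above; the percentages quoted came from the original computation):

```python
import random
from collections import Counter

# Two standard decks: 8 copies of each of the 13 ranks, 104 cards in total.
deck = [rank for rank in range(13) for _ in range(8)]

def trial():
    counts = Counter(random.sample(deck, 49))        # the 49 exposed cards
    # specific pair chosen in advance: at least six "Jacks" and at most one "Ten"
    specific = counts[10] >= 6 and counts[9] <= 1
    # any adjacent pair (X, Y) with at least six X and at most one Y
    any_adjacent = any(counts[x] >= 6 and counts[y] <= 1
                       for x in range(13) for y in (x - 1, x + 1)
                       if 0 <= y < 13)
    return specific, any_adjacent

N = 100_000
results = [trial() for _ in range(N)]
print(sum(s for s, _ in results) / N)   # roughly 0.6% for the fixed Jack/Ten pair
print(sum(a for _, a in results) / N)   # roughly 12-13% over all adjacent pairs
```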

If you’re really nit-picky you might also ask “why 6X versus 1Y? Why not 7X vs 2Y etc”, but you get the gist.

There is also the issue of selective memory. We might have played eight games and we only remember the one game with way too many Jacks and only a solitary Ten. And by some strange coincidence, 12.5% happens to equal the fraction 1/8.

This is probably too much mathematical detail for the average Joe Bloggs, but the point I wish to make is: don’t complain that the game is rigged unless you really know your statistics better than your alphabet.

Now, going back to the lesson.

Look-Ahead Algorithms

By now you’ve all heard the great news: Monkey has proved it is possible to beat Spider Solitaire at the 4-suit level! All he needs to do is build in-suit at every opportunity and never break a suited connector. The bad news is the win rate is extremely low. Of course if the Monkey wins $3000 from the Eagle thanks to a single victory then we’re not complaining – but can we do even better?

First I wanna differentiate between a random hand and a custom hand. Recall that I discussed a number of YouTube videos where various players show off their awesome skills by beating a hand of 4-suit Spider sans 85,78,68,79, but the unmentioned caveat was that the hand was ridiculously easy.

If I pick a ridiculously easy hand then Monkey wins 6 games in 500, or 1.2 percent of the time. If I pick a random hand then Monkey wins a <sarcasm> whopping </sarcasm> zero games in 500.

So far so bad.

It turned out my simulated annealing/stochastic hill climbing/stochastic gradient descent/something else thingy was a red herring. It worked the first time, but later experimentation with the random number generator showed that “always Level 3” was the best way to go in the long run. But at least it made for a good story lol 😊

Guaranteed Minimum Evaluation Score

Good players (or avid readers of this blog, or both!) will be aware of the concept of minimum guaranteed turnovers. Basically, we can calculate the minimum number of cards we are guaranteed to turn over even if the worst possible cards showed up. Of course we can do something similar for, e.g., the number of suited connectors or some other “function” of the position. For instance, we might assign 10 points for flipping a card, 1 point for a suited connector and 100 points for removing a complete suit. And of course experienced players know that different programs have different scoring systems. The important point is that we can (i) evaluate a given position, and (ii) compute a guaranteed minimum score even if the worst possible cards turn up.
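As a concrete illustration, here is what such an evaluation function might look like in Python. This is a minimal sketch using the example weights above; the position summary (counts of turnovers, suited connectors and completed suits) is deliberately simplified:

```python
FLIP_POINTS = 10            # flipping (turning over) a face-down card
SUITED_POINTS = 1           # each suited connector (in-suit build)
SUIT_REMOVED_POINTS = 100   # removing a complete suit

def evaluate(turnovers, suited_connectors, suits_removed):
    """Score a position summary using the example weights above."""
    return (FLIP_POINTS * turnovers
            + SUITED_POINTS * suited_connectors
            + SUIT_REMOVED_POINTS * suits_removed)

# e.g. 5 guaranteed turnovers and 4 in-suit builds (the worked example below) score 54
print(evaluate(5, 4, 0))   # 54
```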

This is an example of “looking ahead”, since we are considering the consequences of our moves given the available information (i.e. the face-up cards), which is obviously better than playing a move simply because it is an in-suit build.

Here is a simple example: suppose the evaluation function is 10 points for exposing a card, 1 point for any suited connector, and no 1-point penalty per move. Assuming best play in the start position below, what is your Guaranteed Minimum Evaluation Score (GMES) even if you turned over the worst possible cards?

I hope you answered 54. Note that we are lucky in the sense that we can build in-suit four times and off-suit once. Normally it’s the other way around.

A good first move is to build in-suit with the Jack-Ten of Diamonds. The Ten is on one of six columns with equal-fewest cards, and we also have a spare off-suit Jack. But we can make many other moves and still arrive at the same GMES of 54. You could start with the Three-Two of Clubs or even the strange-looking 9h-8d. But obviously the sensible option (for humans at least!) is Jd-0d.

The position you should be visualising is the one below, where a happy star represents any face-up card. We can easily check with a single glance that there are 5 turnovers and 4 in-suit builds, with a total evaluation score of 54.

If we want to train our Ninja Monkey to compute the best GMES we can simply iterate random sequences of moves from a given position. But we must be careful not to expose any new cards (e.g. if we shift the Ten of Diamonds we do not flip the card underneath). Given enough iterations, Ninja Monkey will deduce that, for example, shifting the Jack of Diamonds onto the Club Queen is not a good move since it reduces the GMES from 54 to 53. Even worse would be 0d-9h, which reduces the GMES to 42: we lose one guaranteed turnover and two in-suit builds with one bad move!
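Here is a sketch of how that random-sequence search might be organised. The helper callbacks (legal_moves, apply_move, evaluate) stand in for a real Spider engine and are entirely my own invention; only the control flow follows the description above:

```python
import random

def estimate_gmes(position, legal_moves, apply_move, evaluate, iterations=10_000):
    """Estimate the best Guaranteed Minimum Evaluation Score by playing random
    move sequences. The caller's legal_moves() must only return moves that do
    NOT expose a new face-down card, as required in the text."""
    best = evaluate(position)
    for _ in range(iterations):
        state = position
        while True:
            moves = legal_moves(state)
            if not moves:
                break
            state = apply_move(state, random.choice(moves))
            best = max(best, evaluate(state))
    return best
```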

An example of a “good enough sequence” is: Jc-0d, 9d-8d, 3c-2d, 0d-98d, Jd-098d, Qc-Jc. Note that the first move was unnecessary, but at least we got the correct GMES of 54.

Now Get Off Yer 65,82,83,69 And Start Procrastinating!

A lazy programmer would be satisfied with the above algorithm, but of course it is possible to do even better than that. Note that there is no reason to commit ourselves to every move in the above sequence without examining the newly turned-over cards. After starting with Jc-0d, we can turn over a card and we still have the choice of 9d-8d, 3c-2d, 0d-98d, Jd-098d, Qc-Jc – or the new card may give us an even better option. In effect we are procrastinating. In summary, Ninja Monkey will find a sequence with the best possible GMES and execute it, but terminate the sequence prematurely if he turns over at least one new card.

How do we tell if it’s time to deal a new row of cards? Simples, as the Meerkat would say. If Ninja Monkey cannot increase the GMES then it is time to deal a new row of cards. If the stock is empty then either the game is won or the Monkey concedes defeat.
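Putting the pieces together, the outer loop might look something like this. Every method on the hypothetical game object is my own invention; only the decision rules (cut a planned sequence short as soon as a new card appears, deal when the GMES cannot be improved, concede when the stock is empty) come from the text:

```python
def monkey_plays(game):
    """Control-flow sketch of the 'procrastinating' look-ahead algorithm."""
    while not game.is_won():
        plan, plan_gmes = game.best_sequence_and_gmes()   # e.g. via estimate_gmes above
        if plan_gmes <= game.current_gmes():              # cannot increase the GMES
            if game.stock_is_empty():
                return False                              # Monkey concedes defeat
            game.deal_row()                               # time for a new row of cards
            continue
        for move in plan:
            if game.play(move):                           # True => a new card was exposed
                break                                     # stop early and re-plan
    return True
```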

With the new algorithm, Monkey will beat an easy hand around half the time and a random hand about 5 percent of the time. Statistics for random hands are shown in the graphic below.

Let us say that a Spider hand is ImportNumbPieAsNP-Hard if Ninja Monkey cannot obtain a single victory in 50 tries. By this measure roughly half the hands are ImportNumbPieAsNP-Hard. At the other end of the spectrum there was also one very easy hand with 21 wins in 50 attempts. One implication of all this is that if a Spider Solitaire server gives 40 ImportNumbPieAsNP-Hard games in a row then there’s a pretty good chance it’s biased.
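To put a number on that last claim: if roughly half of all random hands are ImportNumbPieAsNP-Hard, the chance that an honest server deals 40 of them in a row is vanishingly small.

```python
p_hard = 0.5            # roughly half of random hands are "ImportNumbPieAsNP-Hard"
print(p_hard ** 40)     # about 9e-13: 40 such hands in a row is damning evidence
```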

Unfortunately all this comes at a price. Despite Ninja Monkey’s extremely fast metabolism, this takes a very long time to run.

Spider Solitaire September 2019 experiment complete

My Spider Solitaire experiment for September is complete

Earlier I made a promise to a friend that I would play the 4-suit daily challenges on my iPhone. For each game I estimated the probability that a monkey playing random moves would win at the 1-suit level. I said an inversion occurred if, for any two games, the later one was harder than the earlier one (i.e. the monkey’s estimated win rate was lower).

I got a p-value of 0.0571, so the null hypothesis barely stood up. (Nevertheless, I do not regret the experiment: my data for July pretty much forced me to hypothesize the program was biased, because I did get p < 0.05, but only just.)
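For anyone wanting to replicate this kind of test, here is a rough sketch of an inversion-count permutation test (my own reconstruction; the actual test in my paper may differ in the details):

```python
import random

def count_inversions(win_rates):
    """An 'inversion': a later game is harder (lower estimated win rate) than an earlier one."""
    n = len(win_rates)
    return sum(win_rates[j] < win_rates[i]
               for i in range(n) for j in range(i + 1, n))

def permutation_p_value(win_rates, trials=10_000):
    """One-sided p-value: chance a randomly ordered month shows at least as many
    inversions as the observed date order."""
    observed = count_inversions(win_rates)
    shuffled = list(win_rates)
    hits = 0
    for _ in range(trials):
        random.shuffle(shuffled)
        hits += count_inversions(shuffled) >= observed
    return hits / trials

# Example with hypothetical win-rate estimates in date order:
# permutation_p_value([0.92, 0.85, 0.91, 0.60, 0.55])
```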

Due to time constraints, I do not wish to test my iPhone Spider Solitaire any further. I won’t be surprised if the random number generator is rigged, but it’s not worth my time to prove this. (If you are interested, I recommend you test more than one month’s worth of games. Dates are sorted with day as the primary key and month as the secondary key, so for instance March 17 < February 23 even though February occurs before March.)

In the diagram below the downward trend is not obvious, but I suspect there were too many “near-perfect scores” at the beginning and not enough near the end. It is also interesting that the result was very close in the sense that “changing one bad result” after the fact would have been enough to push the p-value below 0.05. The decision to accept/reject the null hypothesis was too close to call until the very last day of this month.

Note: for my Spider Solitaire paper in Parabola, I tested a different Spider server and the downward trend was much more obvious.

lastdiagram

That’s it for now, till next time 🙂

The World’s Worst Math Teacher (another short story)

“Another one of life’s disappointments.”

“What’s wrong?” I ask.

“Marking assignments, the bane of every teacher,” growls Ms. Spider, as she angrily scrawls the word “DREADFUL” on a sheet of paper. “This goose just divided by zero.”

I’ve always enjoyed math, but I am all too aware that it represents a bugaboo for many ordinary folk. Not everybody can have a higher-than-average IQ, and not everybody can play piano and solve a Rubik’s Cube at the same time. I agree we have to Make Math Great Again.

“I s’pose I could improve my presentation skills or learn Statistics 101,” admits Ms. Spider.

“I confess I never studied stats at uni,” I respond. “I had to pick it up all by myself.”

“Learning stats 101 sounds too much like work. Surely there must be a better way.”

“You could make the exams and homework easier,” I suggest.

“We can’t make it too easy,” responds Ms. Spider. “I’m sure the good students wouldn’t mind an extra challenge or two,”

I steal a glance at the goose’s assignment. Yes, the goose is below average, but one of the assignment questions is badly worded. Another question has kilometres as a typo for metres, and I have to suppress a chuckle. I can see why some of Ms. Spider’s students call her the WWMT.

“Actually,” says Ms. Spider, “I was toying with a more radical solution”

“Which is?”

“We could give different exams to different students”

“What a revolutionary idea!” I exclaim. “Nobody has ever thought of this before!”

“From each according to his abilities … “

“From each according to his needs,” we chant in unison.

I am impressed: this Spider is clearly well-educated, not just in mathematics. She knows her clichés and sayings.

“Does that mean,” I ask, “if an awesome student correctly answers 40 assignment questions in a row then he will get a very difficult exam?”

“Exactly.”

“Hang on, what if an awesome student deliberately flunks the assignments …”

“Well … we could give the exam less weight than assignments,” the Spider responds somewhat nervously. “Then there is no advantage to tanking the assignments.”

“That’s Dandy!”

“For this to work,” continues Ms. Spider, “we have to come up with some way of measuring the difficulty of certain questions.”

“I understand,”

I mull over this for a while. We all know that students can be graded according to some chosen system. For instance, a math student can be Outstanding, Exceeds Expectations, Acceptable, Poor, Dreadful or Troll. But how can we grade certain questions?

The Spider writes two math questions on a sheet of paper:

mathquestionz

“Which of these problems is harder?” asks Ms. Spider.

“I think both are equally easy. After all, I participated in the International Mathematical Olympiad many years ago.”

Somehow, I think that was not the answer Ms. Spider expected.

Behind us, a monkey, eagle, mouse, elephant, lion and jackal are enjoying some Texas Holdem. As usual, the monkey has squandered away all his chips early, and the Eagle is schooling the rest of the field, having accumulated more than half the chips in play. The Spider eyes them warily: clearly they should not be privy to our discussion.

“You see,” says Ms. Spider. “Sometimes I find it hard to judge the difficulty of a single question. For instance, I expect problem X to be easier than Y, but for some reason the reverse holds when I mark the assignments.”

I mull over Ms Spider’s words. I am not really in a position to judge, given I have never marked any student assignments.

“I have an idea,” says Ms. Spider. “Let’s draw a table”

pic1

“For simplicity,” says Ms. Spider. “Let’s assume each question is either marked correct or not correct, hence there are no partial marks. I use blank instead of 0 for ease of reading. Sam is an awesome student since she answered most questions correctly. Owen is a Stupid student because he only scored 2 out of 9. Each individual is represented by a single row.”

“Okay.”

“But there is no reason we can’t do the same with columns, if you pardon the double negative. For instance, only six people solved problem 8 but nine solved problem 9. Therefore problem 8 is harder than problem 9 …”

“So even if you don’t understand the questions themselves you can still say things like Debbie is better than Anna”

“Exactly,” replies Ms. Spider.

“With 18 students and 9 problems, you don’t have a lot of data”

It’s a stupid observation, I know – but I am only trying to buy time as I try to digest her ideas.

“Well, the same logic applies if we had 1800 students and 900 problems.”

“I think I understand,” I say. “It’s like some kind of Mechanical Turk. Previous students have tried these questions (and of course you don’t have to pay them to do these exams!), so you can work out which questions are easy or hard.”

“Wasn’t the Mechanical Turk some kind of fake chess-playing machine by Wolfgang von Kempelen? What a disgraceful idea! I would never try to cheat chess players like that”.

Okay, didn’t see that one coming. We need to agree on a definition of Mechanical Turk.

“Do you think your students will eventually find out their exam papers are different?”

“That shouldn’t be an issue,” says Ms. Spider, as she squirms in her seat. “If a poor student finds out, he has no reason to complain. If a good student finds out then deep down in his heart he already knows he is better than the poor student, so the exam result doesn’t matter.”

Somehow I think her logic is very, very, unsatisfactory. But I do know that many of the greatest insights precisely come from those who are willing to suggest ideas that sound utterly outrageous. For instance Rivest, Shamir and Adleman are your average computer scientists, but with a bit of luck they might one day become famous, known to every student of cryptography. So I should cut her some slack.

In fact, I am more than looking forward to the results of her revolutionary teaching methods. After all, I’m not the teacher and I don’t set the exams. I was especially careful not to suggest any drastic ideas of my own. If the radioactive 83,72,73,84 hits the fan and grows to fill the size of the entire house then I am more than happy to watch, knowing my 65,82,83,69 is fully covered.

Bring. It. On.

 

 

The WHAT, HOW and WHY of biased Spider Solitaire servers

In my paper on Spider Solitaire, I presented evidence that Opaque Solitaire(*) was biased: if a player wins too many games, the random number generator will favour difficult hands in future games. However, I didn’t discuss why or how a software designer would do this. These are definitely valid questions, and they will be the topic of this post.

(*) Opaque Solitaire is not the real name of the server.

Steve Brown, the author of Spider Solitaire Winning Strategies, gives good reasons why most Spider Solitaire implementations will not rig deals if the player wins too much.

Steve Brown’s argument boils down to three main points.

  • The WHAT: Claims of bias on internet forums cannot be substantiated with any evidence.
  • The HOW: It is not clear how to rig the random number generator without significant effort.
  • The WHY: Why would a software developer 80,73,83,83 off its users?

 

However, for the Opaque Solitaire program in question I can (partly) refute Steve’s argument.

The WHAT

Obviously I have solid evidence that Opaque Solitaire is biased: my paper is based on statistical hypothesis testing, and I obtained a “magic number” that happened to be statistically significant at the alpha = 0.05 level. If you want more details, you know what to do 😊 The more interesting part of this post concerns the HOW: suppose you wanted to design a Spider program that can adjust the difficulty level according to the player’s skill.

The HOW

Let’s say you compiled a dataset of 100,000 hands (initial game states), and you wish to estimate the difficulty level of each hand. For simplicity assume difficulty is represented as an equity between 0 and 1. For instance, an equity of 0.58 means you are willing to bet 58 cents if a win pays $1; higher equity means the game is easier to win. Getting an expert human player to test 100,000 hands is clearly impractical. One could write a program to automatically assess the equity of each hand, but that runs into trouble too. For instance, the ranks of the first 10 cards showing are a poor indicator of equity, since it is more than possible to overcome a poor start (or conversely a good start can sour).

But why not get the players themselves to estimate the equity for you? Consider the following hypothetical table

pic1

There are 18 players and 9 games. For each game and player, the result is either WIN (1), LOSS (0) or NOT PLAYED (blank), and I have color-coded the results for ease of reading (any subliminal messages or easter eggs you find are purely coincidental!). Most of the games have been played, so only a few data points are missing. For instance we can deduce that Sam is a pretty good player since he won 7 games out of 9, whereas Owen or Fiona is much worse. Similarly we can look at individual columns and say, e.g., Games 3 and 8 are equally easy or hard since each has 6 wins, 11 losses and 1 not played, while Games 6 and 9 are easier since more players beat them. We can therefore decide Game 9 is suitable for Owen, because Owen is a poor player and wants to play easier hands, but we would not assign Game 5 to Isabella since that one is relatively hard. One can think of this table as a Mechanical Turk, except the crowdworkers don’t find the tasks very onerous because for some reason they actually enjoy playing Spider Solitaire 😊

Note that implementing this does not require us to know much about Spider Solitaire strategy. The results of the table speak for themselves. For instance Debbie “dominates” Anna because whenever Anna won a particular hand, so did Debbie. But the reverse is false. Hence we know Debbie is a better player than Anna.
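Here is a tiny sketch of that dominance check. The data below is made up purely for illustration (1 = win, 0 = loss, None = not played):

```python
# Player A "dominates" player B if A won every hand that B won (among hands both
# actually played), and did strictly better on at least one of them.
results = {
    "Debbie": [1, 1, 0, 1, None, 1, 0, 1, 1],
    "Anna":   [1, 0, 0, 1, 0,    1, None, 0, 1],
}

def dominates(a, b, table):
    pairs = [(x, y) for x, y in zip(table[a], table[b])
             if x is not None and y is not None]
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

print(dominates("Debbie", "Anna", results))   # True for this made-up data
print(dominates("Anna", "Debbie", results))   # False
```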

Obviously a small number of data points is not reliable, but it’s not hard to imagine a similar table for a much larger number of players and games. Note that it is not necessary for every player to play every hand for this to work. Anyways, you get the gist. Assuming your Spider Solitaire program is online and you are able to store cookies, you can keep tabs on which players are better than others and which hands are easy or hard. Hence you can assign hands of the “correct difficulty” to different players. There might even be a Ph.D. or two in this 😊

I’m not saying this is the best method to rate the difficulty of individual spider hands, but this is one way to do it.

The WHY

As for the WHY, my best guess is the developer(s) of Opaque Solitaire wish to challenge a good player and not bore him with too many easy hands. Unfortunately statistical testing can only say the data is fishy, and cannot answer why someone would “make the data fishy”, if you will.

I’ve seen forums where players accuse Microsoft Hearts of cheating. Some players claim that MS must compensate for the AI’s poor strategy by collusion. Others say MS does this because the software designers have good intentions but don’t understand the expectations of players. I agree Joe Bloggs probably knows nothing about Statistics and he is probably on tilt after losing three hands in a row. But when Jane Citizen accuses the same program of reneging or playing a card known to be held by someone else then you know you’ve got issues. I haven’t played much Microsoft Hearts but I’m siding with the chumps. For the same reasons, I would not lightly diss anyone who complains about rigged games, Spider Solitaire or otherwise (NOTE: the MS Hearts forums may be out of date and the software may have changed for the better, but you get the gist). Since my paper was successfully published, I believe the onus of proof should rest on software developers: they should make it easy to test if a program is biased or not.

Summary

In summary, I believe most Spider programs are kosher, but not Opaque Solitaire. One word of warning: I do not have conclusive evidence that Opaque Solitaire deliberately rigs the cards, because that’s the nature of statistics – hypothesis testing alone does not equal conclusive proof. If your p-value is less than 0.05 you might simply have obtained “unlucky data”, forcing you to jump to the wrong conclusion; p < 0.05 can, however, be used to justify further testing. The point of my paper was to show that the bias exists and can be quantified using statistics.