Homework 7

Due: Tuesday, March 1, before 11:30pm

For this assignment, you will be writing a program to shuffle cards. You can imagine numbering all of the cards in the deck from 0-51 in some sort of order.

A good shuffle of a deck of cards would make it so that it is equally likely that any given card is at any location in the deck (from first to last). Some studies have shown it takes 7 riffle-shuffles to randomize a normal deck of 52 playing cards (see Bayer and Diaconis, 1992). However, it is difficult for a computer to perform such electronically.

For our programs, we are going to be shuffling a deck of 52 cards and calculating the probability that a given card has a particular value. If the shuffle does truly randomize the cards, then we would expect that the chance of a given card being a particular value would be 1/52 if we analyze this over a large number of shuffles.

You will shuffle a sorted deck of cards N times and keeping a count of the number of times each card appears in the first location. After N times, you can find out the probability that the first card is a value V by taking the count of how many times card V appeared in the first location during your N shuffles and dividing by N.

prob(V) = count(V)/N

We will be scoring the randomness of a shuffle by finding the difference between the calculated probability and the expected probability of 1/52, squaring it, and summing all of them together.

Consider a 3 card deck. If after 30 shuffles, we get the following count of the number of times each card appeared in the first location:

- Card 0: 15
- Card 1: 10
- Card 2: 5

- Card 0: 15/30 = 0.50
- Card 1: 10/30 = 0.33
- Card 2: 5/30 = 0.17

1/3 = 0.33The differences would be

- Card 0: 0.33 - 0.50 = -0.17
- Card 1: 0.33 - 0.33 = 0.00
- Card 2: 0.33 - 0.17 = 0.16

Score = (-0.17)^{2}+ (0)^{2}+ (0.16)^{2}= 0.0545

A perfect shuffle would yield a score of 0, and the closer a score is to 0 the better the shuffle is.

You will need to use at least two arrays. One array will be used to store the shuffled 52 cards, and another array will be used to store the count of the number of times that each individual card appears in the first position of a shuffle in N shuffles of the deck. These count values will be used to calculate statistics about how good a particular shuffling algorithm is.

You will be implementing 3 different shuffles.

In the pairwise swap shuffle, you randomly pick two cards and swap them. Assuming that your choices are truly random, and with enough swaps, this should randomize the cards.

The problem with a pairwise swap is that even with 52 swaps, you still have a 13% chance that a given card was not swapped. One way to try to correct this situation is, instead of picking random pairs, you take each card in turn and randomly select another card to switch it with. In this manner, for the first card, you have a 1/52 chance of it staying the same.

There actually is a minor change we can make to this shuffling algorithm that will improve the random distribution. In our first version, there are N^N possible outcomes, however, there are only N! possible permutations. This means that some permutation will occur more frequently than others.

The solution is to not consider a location after you have shuffled it. For example, for the first card, you consider all 52 locations. For the second card, you only consider the remaining 51 locations, you do not consider the first location anymore. You continue down the locations until you are left swapping the 52nd card with itself.

This is also known as the Fisher-Yates shuffle.

Your program should generate the following data for each set of iterations 52, 520, 5200, 52000, 520000:

- The scoring of non-shuffling
- The scoring of doing 13 pairwise swaps
- The scoring of doing 26 pairwise swaps
- The scoring of doing 52 pairwise swaps
- The scoring of doing a bad linear shuffle
- The scoring of doing a Fisher-Yates shuffle

For reference, the output from my program is below

Doing 52 iterations ------------------------- No Shuffle: 0.980769 Doing pairwise swap 13 times: 0.347633 Doing pairwise swap 26 times: 0.088757 Doing pairwise swap 52 times: 0.036243 Doing bad linear shuffle: 0.016272 Doing linear shuffle: 0.012574 Doing 520 iterations ------------------------- No Shuffle: 0.980769 Doing pairwise swap 13 times: 0.391686 Doing pairwise swap 26 times: 0.130096 Doing pairwise swap 52 times: 0.020229 Doing bad linear shuffle: 0.002641 Doing linear shuffle: 0.002559 Doing 5200 iterations ------------------------- No Shuffle: 0.980769 Doing pairwise swap 13 times: 0.348555 Doing pairwise swap 26 times: 0.125830 Doing pairwise swap 52 times: 0.017830 Doing bad linear shuffle: 0.000563 Doing linear shuffle: 0.000157 Doing 52000 iterations ------------------------- No Shuffle: 0.980769 Doing pairwise swap 13 times: 0.351887 Doing pairwise swap 26 times: 0.128635 Doing pairwise swap 52 times: 0.016465 Doing bad linear shuffle: 0.000393 Doing linear shuffle: 0.000015 Doing 520000 iterations ------------------------- No Shuffle: 0.980769 Doing pairwise swap 13 times: 0.354374 Doing pairwise swap 26 times: 0.125996 Doing pairwise swap 52 times: 0.016171 Doing bad linear shuffle: 0.000379 Doing linear shuffle: 0.000002

Last Modified: February 21, 2005 - Benjamin A. Kuperman