CS35 Lab 10: Scrabble Solver

Coding portion due by 6pm on Sunday, November 24.
Written portion due at start of class, Tuesday, November 26.


Introduction

Your work for this lab consists of three parts:

  1. Complete an implementation of a HashTable, which implements the Dictionary interface.
  2. Use your HashTable to implement a program that will allow a user to check if a word is a valid word in the dictionary. Your program will also allow the user to enter a set of letters and (i) receive all legal anagrams of that word and (ii) receive all legal plays using the set of letters.
  3. You will test the efficiency of your HashTable by varying properties such as the load factor, initial capacity, and hash function and report your results.

As usual, you can get the initial files for this lab by running update35. The new files are listed here, and the files that you will need to edit are highlighted in blue:

Make sure to add your BinaryHeap implementation to the library/directory:
$ cd cs35/labs/10
$ cp ../09/binaryHeap* library/
You may need to patch up your include statements to get it to compile.


Completing the HashTable implementation

You should start by completing and testing the HashTable class. Our HashTable uses an array of vectors to store the data within the hash table, using chaining to resolve collisions. Note that this is not a vector of vectors. table is a standard C++ array, where each index contains an actual vector object (not a pointer). You should be able to draw this out as a memory diagram.

Each vector stores key-value pairs; for keys of type K and values of type V each key-value pair therefore has type Pair<K,V>.

You must implement public and private methods for the HashTable interface. We have provided the initial start in hashTable-inl.h, but feel free to divide the implementation into several files. We have provided you the implementation of insert as well as two hash functions so that your hash table can be utilized on either ints or strings.

You must test each HashTable function you implement using the testHash program. We have provided the data members you should keep track of and the implementation of insert. You should first read and understand these portions. We have also provided two sample hash functions, one for ints and one for strings. You may modify these as you see fit.

Note that we have defined a variable MAX_ALPHA. Load factor is generally noted as the letter alpha, and thus the highest load factor allowed is specified using MAX_ALPHA. Also note the odd declaration for the constructor:

 HashTable(int capacity = 53, double maxLoadFactor = 0.8);
This says that the user can specify two parameters, capacity and maxLoadFactor when constructing a HashTable. If the user omits these arguments, the default values are 53 and 0.8 respectively. This is a general technique for specifying optional parameters in C++.

Lastly, do not forget to use exceptions if a function was not called properly. For example, if there are no items in the table, which functions should be called? If a search process fails, which functions can handle that properly and which should throw an exception?


Scrabble Checker

Scrabble is a popular board game based on a players ability to generate words from a given set of letters (called tiles). While there are more details needed to learn the game, all that you need to know for this assignment is that a player may only play actual words. That is, a legal play is is a word that exists in an English dictionary. Second, an important skill is being able to generate a list of possible plays given a finite set of letters in no particular order.

Your goal for the main program is to help a user learn about possible word plays. (Of course, this tool will NEVER be used to cheat in a real game. Scout's honor :) ). Your program will greet the user and provide a simple interface with four options:

  1. Exit the program loop
  2. Verify a word is in the dictionary. This option will ask the user for a "play" and check if the play is legal, i.e., if it is an actual word
  3. Generate a list of legal anagrams. This option will ask the user for a series of letters; duplicate letters are allowed. You will then return all of the words that use all of the letters in any order.
  4. Generate a list of legal plays. This option will ask the user for a series of letters; you will then return all of the words that use a subset of these letters (in any order).

The two main technical parts to implement are a spell checker and an anagram solver. You will solve both problems using dictionaries. There are not too many other details, we want to encourage creativity for this lab.

To solve both problems, you will need to load a dictionary of known words. You will need to load the words in the file /usr/local/doc/sowpods.txt. You will need to construct dictionaries to solve each problem. The first dictionary will index all known words, and is simply a problem of verifying if a word is in the data structure or not.

For the anagram solver, you will need to store a dictionary of anagrams and the associated list of words that can be created from it. This is slightly more complicated. One tip to consider is that anagrams are not order specific, so it may be best to sort all of the letters first. For example the letters c,a,t form the words cat, act, and tac. If the user had given me the letters in a different order, e.g. t,c,a the answered would be the same. So it is probably best to store your sequence of letters in sorted order and then sort the letters given by the user to search the dictionary.

There are a few ways to generate all legal plays. Perhaps the easiest way is to generate a list of all possible subsets of letters, and then use the anagram solver on each subset of letters. Implement the anagram solver first. Warning: Make sure to avoid printing out the same legal plays multiple times. One way to do this is to build up a dictionary of legal plays as you iterate over subsets. You can print out your dictionary after you finish iterating.

Additional Requirements

Sample Run

$./scrabbleCheat 

Welcome to Scrabble Solver

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3):1

Enter the word you wish to play: quilt
Congratulations, you have found a legal play!

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 1

Enter the word you wish to play: QuILt
Congratulations, you have found a legal play!

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 1

Enter the word you wish to play: blerg
Sorry, that is not a legal play

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 2
Enter the letters you wish to play: tca
Your legal plays are:
	act
	cat

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 2
Enter the letters you wish to play: ACT
Your legal plays are:
	act
	cat

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 2
Enter the letters you wish to play: bl
Sorry, no anagram exists

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 3
Enter the letters you wish to play: table

All possible plays:
	blate
	bleat
	ablet
	table
	belt
	blet
	blat
	abet
	bate
	beat
	beta
	tael
	tale
	teal
	tela
	late
	leat
	able
	albe
	bael
	bale
	blae
	bet
	bel
	alt
	lat
	ate
	eat
	eta
	tae
	tea
	ale
	lea
	elt
	let
	tel
	bat
	tab
	alb
	bal
	lab
	at
	ta
	et
	te
	ab
	ba
	ae
	ea
	be
	al
	la
	el

Menu:

	(0) Exit
	(1) See if a play is legal
	(2) Find all legal anagrams for a set of tiles
	(3) Find all legal plays for a set of tiles

Select your choice (0-3): 0

Thanks for Playing Scrabble Solver!


Vectors

The Standard Template Library (STL) contains many templated implementations of data structures covered in this course, plus many beyond. In this lab, you will use the vector class which is most closely related to the List interface and uses something akin to the ArrayList implementation. You can read about its usage here.



Written Portion

In testHash.cpp or a separate file, explore some of the parameters of your HashTable implementation. As discussed in class, the performance of HashTables depend on several factors (e.g., initial choice of capacity, hash function, load factor, etc.). Pick one or more of these factors as the independent variable in your experimental design. Use reportStats to gather information about several possible dependent variables (i.e., the y-axis) including the number of key-comparisons, the percentage of non-zero buckets in the table, the maximum size of any list, the average size of non-zero buckets, etc. This is by no means an exhaustive set of possible dependent and independent variables.

Once you explore some possible pairings of variables, produce graphs that show a couple interesting relationships. This can be done with one or two graphs. For example, you may have one graph showing the effect of initial capacity (x-axis) on comparisons (y-axis) with a curve for two different hash functions. Write a short paragraph describing how you generated the data (e.g., "inserted 1,000,000 values into a HashTable, varying the maximum load factor each time"). Write another sentence or two summarizing your findings.

This portion can be done with your lab partner. You will hand in your solution in during class on Tuesday, November 27. We suggest modifying testHash.cpp. When you hand in the normal part of the lab, be sure that testHash.cpp shows the original hash table tests. You can encapsulate your experiments in additional functions, not be deleting the original tests.

Extension - Define Subscript Operator

In class, we discussed how dictionaries are like arrays but with generic indexing. The subscript operator in arrays (e.g., C++ standard arrays, Python lists) is very similar to the get operation. In fact, let us overload the subscript operator for CircularArrayLists. First, let us recall how get(int i) is defined:

template <typename T>
T CircularArrayList<T>::get(int i) {
  if (i < 0 || i >= size) {
    throw std::runtime_error("Attempted to get out of bounds.");
  }
  return values[(headPos+i)%capacity];
}
Hopefully that is nothing new! Now let us see how the indexing operator is defined for this class:
template <typename T>
T& CircularArrayList<T>::operator[](int i) {
  if (i < 0 || i >= size) {
    throw std::runtime_error("Attempted to index out of bounds");
  }
  return values[(headPos+i)%capacity];  
}

Notice that this looks almost exactly the same. The main difference is in the return value. get returns a value of type T. Just as we pass-by-value in C++, by default we return-by-value. That means that the value returned is a copy of the original item stored in the CircularArrayList.

The subscript operator is used to get values, and thus the get return-by-value would be suitable. But we also wand to use indexing to modify or set values e.g., list[i] = 7. To do this, we need to have access to the actual piece of memory where the data is stored. So, we return a data type of T&. This tells C++ not to return a copy, but to return the actual chunk of memory where the data is stored so that it can be modified.

Try defining the indexing operator for Dictionaries. Unleash the power of polymorphism!



Submitting your work

Once you are satisfied with your program, hand it in by typing handin35 at the unix prompt.

You may run handin35 as many times as you like, and only the most recent submission will be recorded. This is useful if you realize after handing in some programs that you'd like to make a few more changes to them.