Lab 8: Scrabble Assistant
Due on Sunday, November 20th at 11:59 PM. This is a team lab. You and your assigned lab partner(s) will complete this lab together. Make sure that you are familiar with the Partner Etiquette guidelines. You may discuss the concepts of this lab with other classmates, but you may not share your code with anyone other than course staff and your lab partner(s). Do not look at solutions written by students other than your team. If your team needs help, please post on the Piazza forum or contact the instructor or ninjas. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.
Overview
In this lab, you will write a hash table implementation for a dictionary data structure. (Unit tests for this assignment have been provided.) You will then use that implementation to create a tool which assists in finding solutions to tiled word games (such as Scrabble). You will also provide a written analysis of the performance of your hash table.
Starting Code

Your starting code includes the following files. Bolded files are those which you may need to change.
adts/
: As usual, the ADT interface declarations are stored here. Reference implementations of previously-covered data structures (like lists) are stored in the subdirectoryimpls
. Review thedictionary.h
interface which should be familiar from the previous lab.hashes.h
/hashes.cpp
: Some hash implementations for the types you will use in your table.hashTable.h
/hashTable__definitions.h
: TheHashTable
class. Only the public methods have been provided as part of the declaration; you will need to write your own private fields and members.scrabbleAssistant.h
/scrabbleAssistant.cpp
: TheScrabbleAssistant
class, which bundles up the work of identifying anagrams and similar tasks.main.cpp
: The file where themain
function for the program. This program is already written but relies upon theScrabbleAssistant
class you will write.tests.cpp
: Unit tests for your hash table data structure.manualTests.cpp
: A program which you can use to perform sandbox tests of your code.experiment.cpp
: A program which you will use to experiment with the performance of your hash table.experimentClasses.h
/experimentClasses.cpp
: A module containing classes used while experimenting with the performance ofHashTable
.experimentHashes.h
/experimentHashes.cpp
: A module containing the hash functions which are used in the experimental program.
Part I: HashTable
For the first part of this assignment, you will implement a separate chaining hash table as we discussed in class. (You will not implement a linear probing hash table.) One of the constructors allows the user to specify a maximum load factor; the other should use a sensible default value (e.g. 0.75
). Unit tests have been provided for this lab; once you have completed your HashTable
implementation, the appropriate unit tests should pass. Further, all operations on your HashTable
(other than getKeys
and getItems
) must run in \(O(1)\) average time.
Note that hashing functions have been provided in hashes.h
and hashes.cpp
for keys of type int
and string
. If you want to use any other type of data as the key in your hash table, you’ll need to write an appropriate hashing function. Do not worry about experiment.cpp
for this part of your lab; we will use that file later.
You should consider using an array of statically-allocated vector
objects to represent the buckets of your HashTable
. Statically-allocated vectors are easier to use and less error-prone, especially when they are stored in an array. You could also use the STL class list
(a linked list implementation) or even the provided STLList
class.
Memory Errors and Negative Modulus
In C++, the modulus operator can return a negative number:
cout << -5 % 3 << endl; // prints "-2"!
int h = hash(key); // could be a negative int
int index = h % capacity; // could be a negative number -- bad index!
This could be a problem if we then use that negative number as an array index!
To solve this, we can simply increase the result of modulus by the modulus base if turns out to be negative:
int x = -5 % 3;
if (x < 0) {
x += 3;
}
cout << x << endl; // prints "1"
A helper function to do this for you, positive_modulus
, appears in hashes.h
/hashes.cpp
. Make sure to use it!
Removing From A vector
The vector
class provides a means to remove an element from the end of a list (pop_back
) but doesn’t give a clear interface for removing elements e.g. from an arbitrary index. Vectors provide a general function called erase
which can delete entire contiguous regions:
vector<int> vec = ...;
vec.erase(vec.begin() + start_index, vec.begin() + end_index);
The above function works much like a Python slice: the start index is included but the end index is not. Thus, remove_range_from_vector(1,3)
would remove the second and third elements (indices 1 and 2) from the vector. If you want to delete a single element from a vector, you can do something like this:
vector<int> vec = ...; // whatever defines this vector
int index = ...; // e.g. 4
vec.erase(vec.begin() + index, vec.begin() + index + 1);
Part II: Scrabble Assistant

Once you have a working HashTable
, you can write the Scrabble Assistant. We will design the application in two parts: a ScrabbleAssistant
class (which does all of the interesting work) and a main
function (which provides the interface that allows the end user to interact with that object).
The work of the ScrabbleAssistant
can be broken into the following parts:
- Keeping track of which words are legal.
- Finding the anagrams of a particular word.
- Finding the “power set” of a string.
- Eliminating duplicate strings in a list.
- Finding all of the legal plays given a string containing the player’s tiles.
Legal Words
The spellcheck
method of ScrabbleAssistant
takes a string and returns a boolean indicating whether it is a word; that is, spellcheck("puppy")
should return true
but spellcheck("cloombish")
returns false
. The catch is that this function must run in \(O(1)\) average time!
To accomplish this, we will construct a hash table in ScrabbleAssistant
which tracks legal words. The constructor of ScrabbleAssistant
receives a vector
of all of the legal words; you should add each of those words as a key to the hash table. (The value can be anything – false
, 0
, the word itself – because we will ignore it.) This way, the spellcheck
method only needs to check this dictionary.
Once you have finished this task, the spellcheck
-related tests should pass.
Anagrams
The anagramsOf
method determines the dictionary words which can be created by rearranging a given string of characters. For instance, the word “bat” contains exactly the same letters as the word “tab” and no other English words contain exactly those letters. We also need this function to run in \(O(1)\) average time!
Similar to the above, we can accomplish this by pre-processing our word list in the ScrabbleAssistant
constructor. First, write a helper function to sort the letters in a string (e.g. sortString("bat")
should return "abt"
). (You don’t have to do this yourself; the std::sort
function of the algorithm
library can help. Try std::sort(mystr.begin(), mystr.end())
.) Then, in the constructor of ScrabbleAssistant
, create a dictionary that maps each sorted string (e.g. "abt"
) to a vector
of all words that sort to that string (here, a vector
containing "bat"
and "tab"
). You’ll need to write this constructor code to add the words to the dictionary one at a time. This way, your findAnagrams
function can return the vector
to which the given string is mapped.
Power Set

The anagrams function lets us find all of the words made of exactly a given string of letters, but it doesn’t allow us to find words made of fewer letters. For instance, the string "at"
can be made from letters in "atb"
, but it is not an anagram of them. To find all words, we can write a function to compute the power set of a given string: the strings which can be made by removing any number of letters from it. The power set of "atb"
, for instance, is made of the strings "atb"
, "at"
, "ab"
, "a"
, "tb"
, "t"
, "b"
, and ""
. Computing this power set does take \(O(2^n)\) time; fortunately, most words are small and so this doesn’t cost all that much.
You should write a stringPowerSet
function to compute the power set of a given string. Here’s a recursive algorithm that you can use to write the function:
Eliminating Duplicate Strings
By the time we’re finished looking for all of the words we can make, we might find the same word multiple times. You should therefore write a deduplicate
function that, given a vector<string>
, returns a vector<string>
which contains each word only once. You could approach this in different ways:
- Nested loop over the
vector
, looking for matches. This is \(O(n^2)\) time. - Linear search for adjacent equal strings. This only works if the
vector
is sorted, but takes \(O(n)\) time. (The sort takes \(O(n \log n)\) time.) - Add every
string
in thevector
as a key to a hash table (with a dummy value like0
, as described above for checking valid words). Then, get all keys from the hash table. This takes \(O(n)\) time on average.
You can use any of the above approaches, although the last is the most efficient and least error-prone.
Finding Words
Once you’ve written the anagramsOf
and stringPowerSet
routines, you can combine them to solve the problem of finding words. The user will provide a string containing all of the letters available to make a word. You can find all legal words as follows:
- Find the power set of that string.
- Get all of the anagrams for each string in the power set.
- Put all of these anagrams into a single vector.
- Deduplicate that vector.
- Be happy!
Once you’ve implemented the findWords
method as described above, your ScrabbleAssistant
will be complete! You can run the provided program by providing the name of a dictionary file; if you don’t specify one, /usr/share/dict/words
will be used. Congratulations!
Part III: Experimentation
For the final part of this lab, you will write some functions to perform experiments on your HashTable
to compare its performance with that of the provided STLBST
class (in the adts/impls/
directory). Your objective will be both to show that HashTable
works well when given a good hashing algorithm and to show that it can work poorly when given a bad hashing algorithm.
The Experiment

Most of the code for this experiment has been written in experiment.cpp
. This code will create three different kinds of dictionaries, each of which contains a data type that stores a string
and an int
. One dictionary will be a HashTable
using a “good” hash, another will be a HashTable
using a “bad” hash, and the third will be the binary search tree STLBST
. The experiment will add some number of elements to each dictionary and retrieve all of them; it reports the average amount of time per element spent inserting and retrieving mappings.
There are two different pair-of-int
classes used in this experiment, both of which are defined in experimentClasses.cpp
: the StringIntGoodHash
class and the StringIntBadHash
class. These classes are identical in every way except their name. In experimentHashes.cpp
, however, each is given a different hash
function. You should study the difference between these hash functions to understand what they do.
Your Task
In experiment.cpp
, there is a single function that you need to write. The getExperimentKeys
function is used by the experiment to determine which keys to add to the dictionary. You should write this function to produce keys that exploit the weakness of the bad hashing function; that is, write this function to cause hash collisions for StringIntBadHash
. Note that every key returned by getExperimentKeys
must be distinct; the experiment will not run if your mappings contain duplicate keys.
Once you have written your getExperimentKeys
function, you use the experiment
program to produce a graph much as we did with the mystery function program. This program does not require any arguments, so you can run it like so:
./experiment | gnuplot
This will both save a file called experiment.png
as well as open a window showing you the results of your experiment. If you have successfully exploited the bad hash function, the runtime of the bad hash experiment should be comparatively high.
After you have completed your experiment, create a written summary that contains:
- Your experimental graph
- A description of how your
getExperimentKeys
function exploits the bad hashing algorithm - A brief discussion of why the graph looks like it does (in terms of the big-O complexity of the different dictionaries)
A couple paragraphs should be sufficient. As with previous assignments, this write-up is required to be in PDF form. If you would like to use LaTeX, the WrittenLab.tex
file has been provided for you and you can use \includegraphics{experiment.png}
to add your graph to the file.
Memory
Your code must run without any memory errors; make sure to use valgrind
to find them. Memory leaks, on the other hand, are acceptable in small amounts and will not significantly impact your grade. In particular, “still reachable” memory is acceptable; “lost” memory should not occur in large amounts (e.g. millions of bytes).
Coding Style Requirements
You are required to observe some good coding practices:

-
You should pick meaningful variable names.
// Good int* pixels = new int[size]; // Bad int* p = new int[size];
-
You should use correct and consistent indentation. Lines of code within a block (that is, surrounded by
{
and}
) should be indented four spaces further than the lines surrounding them.// Good if (condition) { cout << "Test" << endl; } // Bad if (condition) { cout << "Test" << endl; }
-
You should use a block whenever possible, even if it’s not necessary. This helps you avoid subtle or messy bugs in the future.
// Good if (condition) { cout << "Something" << endl; } // Bad if (condition) cout << "Something" << endl;
-
Any new methods or fields in your header files should have comments explaining their purpose and behavior. You are permitted to omit documentation for methods that are inherited from other classes; that is, if your class has a
foo
method because its superclass has afoo
method, you don’t need to document that method.// Good public: /** * Saves the image represented by this object to a file. * @param filename The name of the file to save. * @return The number of pixels in the saved image. * @throws runtime_error If the image could not be saved due to an I/O error. */ int save(std::string filename); // Bad public: int save(std::string filename);
Your method/field documentation does not have to be in the format above, but you must describe the method’s behavior, its parameters, its return value, and any exceptions that it may throw. (If you’re indifferent, the above syntax is a good one to know; it’s a de facto standard used by Javadoc, Doxygen, and other tools that automatically process source code comments into other formats like searchable webpages.)
Peer Review
As this is a team assignment, you are required to complete a peer review of your lab partner or partners. You must do this even if everything is fine. You should complete this peer review after you have turned in your assignment. The peer review comes in the form of a simple Google Form and can be accessed here; in most cases, it will likely take less than a minute to complete. If you have trouble accessing this peer review, please make sure that you are logged into Google using your Swarthmore credentials.
Your peer review will be kept private; only course staff will have access to this information. You can also update your peer review after you have submitted it. However, if you do not submit a peer review, you will lose participation grade points. Please don’t forget!
Summary of Requirements
When you are finished, you should have
- A working
HashTable
data structure - A
ScrabbleAssistant
class that powers the provided program - An experimental write-up demonstrating the importance of good hash functions
- No memory errors
- Code which conforms to the style requirements above
- A completed peer review