Your work for this lab consists of three parts:
As usual, you can get the initial files for this lab via Github. The files that you will need to edit are highlighted in blue:
You should start by completing and testing the HashTable class. Our HashTable uses an array of vectors to store the data within the hash table, using chaining to resolve collisions. Note that this is not a vector of vectors. buckets is a standard C++ array, where each index contains an actual vector object (not a pointer). You should be able to draw this out as a memory diagram.
Each vector stores key-value pairs; for keys of type K and values of type V each key-value pair therefore has type Pair<K,V>.
You must implement methods for the HashTable interface. Feel free to divide the implementation into multiple files (or not).
You must test each HashTable method you implement using the testHash program. We have provided the data members you should keep track of, and implementations of hashing methods for strings and integers.
Do not forget to use exceptions if a function was not called properly. For example, if there are no items in the table, which functions should throw exceptions? If a search process fails, which functions can handle that properly and which should throw an exception?
Scrabble is a popular board game based on a players ability to generate words from a given set of letters (called tiles). While there are more details needed to learn the game, all that you need to know for this assignment is that a player may only play actual words. That is, a legal play is is a word that exists in an English dictionary. Second, an important skill is being able to generate a list of possible plays given a finite set of letters in no particular order.
Your goal for the main program is to help a user learn about possible word plays. As in lab 9, we will use command-line options to provide input from the user. The program should be run with:
./scrabbleMain <dictionary-to-use> <action> <letters>
The two main technical parts to implement are a spell checker and an anagram solver. You will solve both problems using dictionaries. There are not too many other details, we want to encourage creativity for this lab.
To solve both problems, you will need to load a dictionary of known words. In the simplest case, this is quite similar to other problems you've seen in this course before, like in the spellChecker. For example, for check, all you need is a dictionary that will index all known words, which is simply a problem of verifying if a word is in the data structure or not.
For the anagram solver, you will need to store a dictionary of letter combinations mapping to the associated list of words that can be created from it. This is slightly more complicated. One tip to consider is that anagrams are not order specific, so it may be best to sort all of the letters first. For example the letters A,E,R form the words ARE, ERA, and EAR. If the user had given the letters in a different order, e.g. R,A,E the answer would be the same. So it is probably best to store your sequence of letters in sorted order and then sort the letters given by the user to search the dictionary. This requires implementing a small sorting function on strings – feel free to use any appropriate algorithm for this.
There are a few ways to generate all legal plays. Perhaps the easiest way is to generate a list of all possible subsets of letters, and then use the anagram solver on each subset of letters. Implement the anagram solver first. Make sure to avoid printing out the same legal plays multiple times. One way to do this is to build up a dictionary of legal plays as you iterate over subsets. You can print out your dictionary after you finish iterating.
$ time ./scrabbleMain /usr/local/doc/sowpods.txt check leader leader is in the dictionary. real 0m1.160s user 0m1.135s sys 0m0.024s $ time ./scrabbleMain /usr/local/doc/sowpods.txt anagrams leader dealer laered leader leared redeal real 0m1.354s user 0m1.322s sys 0m0.032s $ time ./scrabbleMain /usr/local/doc/sowpods.txt plays leader ale lea a arle earl laer lare lear rale real are ear era dele leed lard leer lere reel deer dere dree ered rede reed laree leare leear arede deare eared dealer laered leader leared redeal ad da ae ea de ed al la ar alee eale lar dale deal lade lead dae ee alder arled lader ared dare dear eard rade read dal lad eel lee dee el ard rad ere ree del eld led er re red elder lered real 0m1.334s user 0m1.305s sys 0m0.028s
The Standard Template Library (STL) contains many templated implementations of data structures covered in this course, plus many beyond. That is, you do not, in fact, need to copy-paste a List implementation every time you need one! In this lab, you will use the vector class which is most closely related to the List interface, and you can think of as essentially a built-in ArrayList implementation. You can read about its usage here.
One detail of the implementation that isn't immediately obvious is how to remove an element from a vector. For our data structures, we had a remove method that took an index as a number. Vectors require that you use a different notion of indexing when removing, based on an offset from the beginning of the vector. So to remove an element at a particular index i, from a vector v use:
v.erase(v.begin() + i);
Other than this detail, the provided documentation should be enough to let you use vectors as if they were array lists without too much trouble. There is an example of creating and populating a vector in readFileToList, so you can see the interface in action.
The final task is to measure some aspects of your HashTable implementation. As discussed in class, the performance of HashTables depend on several factors (e.g., initial choice of capacity, hash function, load factor, etc.). Pick one or more of these factors as the independent variable in your experimental design. Use reportStats to gather information about several possible dependent variables (i.e., the y-axis) including the number of key-comparisons, the percentage of non-zero buckets in the table, the maximum size of any list, the average size of non-zero buckets, etc. This is by no means an exhaustive set of possible dependent and independent variables.
Once you explore some possible pairings of variables, produce graphs that show a couple interesting relationships. This can be done with one or two graphs. For example, you may have one graph showing the effect of initial capacity (x-axis) on comparisons (y-axis) with a curve for two different hash functions. If you completely separate tests, show them in two graphs (e.g., one for capacity vs comparisons, one for load factor maximum versus average list length). Write a short paragraph describing how you generated the data (e.g., inserted 1,000,000 values into a HashTable, varying the maximum load factor each time). Write another sentence or two summarizing your findings.
We should be able to get a feel for how you generated your statistics by running:
$ ./scrabbleMain report-statsIn the starter code, this option does nothing. Put the code you use for your experiments here.
As usual, git push by the deadline to submit. Note the extended deadline this week!