CS35 Lab #8:

Hash Tables

Due 11:59pm Wednesday, 9 April 2008

You may work with a partner on this assignment if you wish. Be sure to include both names in your README.

Hash maps with separate chaining
Implement the Map interface using separate chaining as described in section 9.2.5 of the text. Use the code in labs/w10-HashMaps as a starting point. At least one of your constructors should support a floating point load factor and a initial capacity as parameters. Your implementation should double the array size and rehash the keys when the load capacity is exceeded. You implementation should include a method collisions() that returns the total number of collisions observed in the hash table. To test your implementation, build Maps on the words in /usr/share/dict/words using a load factor of 0.5, 0.8, 0.9, and 0.95. Code for running such an experiment is in the file labs/w10-HashMaps/TestCollisions.java. Display the total number of collisions and the average number of collisions per word for each load factor. How do your results compare to the HashTableMap that uses linear probing? Compute a histogram of the chain lengths in each bucket in the final hash table for each load factor. An example histogram is shown below. This histogram is completely made up and may not look at all like the actual distribution.
length     freq (%)
0          20.00
1          40.00
2          30.00
3           5.00
4           2.00
5           1.00
6           1.00
7           0.50
9           0.25
10          0.24
15          0.01
Summarize the results of your experiments in a README file. Is hashing a viable alternative to tree-based searching? Is chaining better or worse than linear probing?
Submitting
Along with your Java source code, you should hand in a README (not README.txt or Readme.txt) file. These files will be imported automatically via handin35.