CS 35

Takehome Midterm Test

A tarball of your solution is due by email to me by midnight Tuesday, 27 March. NO LATE work accepted. Give me the best you have by then. If you can't get your code to run, at least give me code that you hope(d) would run. If your code doesn't work, tell me how it fails. If it does anything extra, tell me what.

On the course web pages, I promised a take-home midterm test. This is it. Please work alone on this. You may use your textbook, code I have provided in my pub directory, and the Sun Java documentation reachable from the course web site without citing them. All other sources should be referenced. Please remember: NO HELP FROM PEOPLE. This is a test.

Your code should be well documented. In a prose README file, give me an overview of how your code works and the estimated run time of the various phases of spell-checking a file under reasonable assumptions (for example, not too many misspelled words). Tell me what your assumptions are. Give plausible arguments to support your estimates.


Your task is to implement a spelling checker. For the dictionaries, use /usr/share/dict/american-english plus your own personal file of words. Given a text file to check, your program should output all misspelled words and the line numbers on which they occur. For each misspelled word, list any words in the dictionaries that can be obtained by adding one character. For this midterm test, I want you to use the Java Collection TreeSet<E> for the big dictionary. You may use any other collections you find useful for other structures. That is, use a Java Collection and its methods instead of writing your own where possible. I do not want you to use regular expressions. You should be able to do the last part of this (finding the one-character additions) by clever use of methods in the String class without regular expressions.
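
To make the "adding one character" requirement concrete, here is a minimal sketch of one way to generate candidates with String.substring and look them up in a TreeSet<String>. It is an illustration under stated assumptions, not a required design: the class name AddOneCharSketch and the method name suggestions are hypothetical, the dictionary is assumed to hold only lowercase words, and the inserted character is assumed to be one of a-z (so apostrophes are ignored).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class; the name is not part of the assignment.
class AddOneCharSketch {

    // Returns, in sorted order, every dictionary word that can be formed by
    // inserting exactly one letter a-z at any position of the given word.
    // Assumption: both the word and the dictionary entries are lowercase.
    static List<String> suggestions(String word, TreeSet<String> dictionary) {
        // A TreeSet keeps results sorted and drops duplicates, which arise
        // when the same candidate is reachable from two insert positions
        // (for example "hello" from "helo").
        TreeSet<String> found = new TreeSet<String>();
        for (int i = 0; i <= word.length(); i++) {
            String head = word.substring(0, i);   // characters before the insert point
            String tail = word.substring(i);      // characters from the insert point on
            for (char c = 'a'; c <= 'z'; c++) {
                String candidate = head + c + tail;
                if (dictionary.contains(candidate)) {
                    found.add(candidate);
                }
            }
        }
        return new ArrayList<String>(found);
    }

    public static void main(String[] args) {
        // Tiny stand-in dictionary; a real run would load the word files.
        TreeSet<String> dictionary =
            new TreeSet<String>(Arrays.asList("cart", "chat", "coat", "hello"));
        System.out.println(suggestions("helo", dictionary)); // prints [hello]
        System.out.println(suggestions("cat", dictionary));  // prints [cart, chat, coat]
    }
}
```

A word of length k yields 26(k+1) candidates, each costing one TreeSet lookup, which may be a useful starting point for the run-time estimates in the README.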

Hints:
This problem is very close to programming project 12.17 on page 449 in Weiss. There he says some parts are close to generating a cross-reference. In case it will help, I have put slightly modified code from Weiss for the cross-reference in /home/cfk/pub/cs35/week8/takehome. Weiss's Tokenizer is much more complicated than what you need for this problem. We do not want case to count in our spell-checker. Something close to the kind of code in /home/cfk/pub/cs35/week6/wordfreq/Readwords.java should suffice for reading the files. Parts of the program Readwords.java in /home/cfk/pub/cs35/week8/takehome/ may be helpful.
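
As a rough illustration of reading words without regular expressions and without letting case count, the sketch below scans each line character by character, lowercases it, and records the line numbers of words missing from the dictionary. It is not the code in Readwords.java. The class name MisspelledLinesSketch and the method name misspelled are hypothetical, a word is assumed to be a maximal run of letters (so apostrophes and digits act as separators), and the dictionary is assumed to have been lowercased when it was loaded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper class; the name is not part of the assignment.
class MisspelledLinesSketch {

    // Maps each word not found in the dictionary to the set of line numbers
    // (starting at 1) on which it occurs. Assumption: a word is a maximal
    // run of letters, and the dictionary holds lowercase words only.
    static Map<String, TreeSet<Integer>> misspelled(Reader in, TreeSet<String> dictionary)
            throws IOException {
        Map<String, TreeSet<Integer>> result = new TreeMap<String, TreeSet<Integer>>();
        BufferedReader reader = new BufferedReader(in);
        int lineNumber = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            String lower = line.toLowerCase();   // case should not count
            int i = 0;
            while (i < lower.length()) {
                // Skip separators, then collect one run of letters.
                while (i < lower.length() && !Character.isLetter(lower.charAt(i))) i++;
                int start = i;
                while (i < lower.length() && Character.isLetter(lower.charAt(i))) i++;
                if (i > start) {
                    String word = lower.substring(start, i);
                    if (!dictionary.contains(word)) {
                        TreeSet<Integer> lines = result.get(word);
                        if (lines == null) {
                            lines = new TreeSet<Integer>();
                            result.put(word, lines);
                        }
                        lines.add(lineNumber);   // a set, so repeats on one line collapse
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        TreeSet<String> dictionary = new TreeSet<String>(
            Arrays.asList("the", "quick", "brown", "fox", "jumps", "over", "dog"));
        String text = "The quik brown fox\njumps ovr the quik dog";
        System.out.println(misspelled(new StringReader(text), dictionary));
        // prints {ovr=[2], quik=[1, 2]}
    }
}
```

Passing a FileReader instead of the StringReader would apply the same scan to a text file.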

Happy computing.