The program handin33 will only submit files in the cs33/lab/10 directory. (You should run update33 first to set up the directory and create any necessary files.)
Your program must follow these guidelines:
The tree has the following properties:
    typedef struct node_t {
      char data;
      struct node_t *fc; /* first child */
      struct node_t *ns; /* next sibling */
    } Node;

In cs33/labs/10/suffixtree.c (which is the code we wrote in class, with some minor tweaks), you will find the above structure defined, as well as the complete definitions of the following functions, which we wrote in class on Tuesday:
    Node *makeSuffixTree();
    Node *makeNode(char letter);
    Node *getChild(Node *node, char letter);
    void insertSuffix(Node *node, char *suffix);

I am also including two functions to read data, which you will use at the end:
    char *readFile(char *filname, int size);
    void safe(char *str, int length);

Comments describing all of the above functions are in the suffixtree.c file.
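To make the fc/ns layout concrete, here is a minimal sketch of how a node with any number of children can be represented with just two pointers: fc points to the first child, and each child's ns points to the next child of the same parent. The helpers newNode, addChild, and countChildren are hypothetical names used only for this illustration; they are not the functions provided in suffixtree.c, which may behave differently.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node_t {
  char data;
  struct node_t *fc; /* first child */
  struct node_t *ns; /* next sibling */
} Node;

/* Hypothetical helper: allocate a node with no children and no siblings. */
Node *newNode(char letter) {
  Node *n = malloc(sizeof(Node));
  assert(n != NULL);
  n->data = letter;
  n->fc = NULL;
  n->ns = NULL;
  return n;
}

/* Hypothetical helper: put a new child at the front of parent's child list. */
Node *addChild(Node *parent, char letter) {
  Node *child = newNode(letter);
  child->ns = parent->fc; /* the old first child becomes our sibling */
  parent->fc = child;
  return child;
}

/* Count direct children: follow fc once, then ns until the list ends. */
int countChildren(const Node *parent) {
  int count = 0;
  for (const Node *c = parent->fc; c != NULL; c = c->ns) {
    count++;
  }
  return count;
}
```

Because addChild pushes onto the front of the list, the most recently added child is the one fc points to. That ordering is an arbitrary choice in this sketch.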
Your job is to write the following 6 functions:
    void insertAllSuffixes(Node *tree, char *word);
    int countLeaves(Node *node);
    Node *findNode(Node *tree, char *search);
    int numAppears(Node *tree, char *search);
    void freeTree(Node *tree); /* OK, I LIED, THIS ISN'T OPTIONAL - IT'S NOT THAT HARD... */
    void printSuffixes(Node *tree); /* OPTIONAL, BUT USEFUL */

The comments in the suffixtree.c file will tell you exactly what each function should do.
Once you've implemented those 6 functions, you can replace your main() function with this one (which is also provided in a comment at the end of the suffixtree.c file):
    int main(int argc, char **argv) {
      Node *tree;
      char *buffer, *search;

      if (argc != 2) {
        fprintf(stderr, "Requires a text file as an argument.\n");
        exit(1);
      }

      buffer = readFile(argv[1], MAXBUFFER);
      tree = makeSuffixTree();
      insertAllSuffixes(tree, buffer);
      free(buffer);

      search = malloc(sizeof(char) * MAXBUFFER);
      do {
        printf("Type a string to search for (an empty line to quit):\n");
        safe(search, MAXBUFFER);
        if (search[0] != '\0') {
          printf("Number of occurrences: %d\n", numAppears(tree, search));
        }
      } while (search[0] != '\0');

      free(search);
      freeTree(tree);
      return 0;
    }

This main() function reads a text file (provided as a command-line argument) into a very large string, then builds a suffix tree of that string. (Note that this string could have spaces, commas, etc., so it's not just a single word like we've seen so far.) After the suffix tree is built, it repeatedly asks you to enter a search string. After each search string, the number of occurrences of that string is reported.
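The counting idea behind the search loop can be sketched as follows: insert every suffix of the text, walk down from the root along the search string, and count the leaves below the node where the walk ends. This is only one possible approach, written under two assumptions: every suffix ends in a '$' leaf, and the text itself contains no '$'. All function names here (newNode, childFor, addSuffix, addAllSuffixes, leafCount, locate, occurrences) are hypothetical stand-ins, not the functions in suffixtree.c, whose comments remain the authority on what your versions must do.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node_t {
  char data;
  struct node_t *fc; /* first child */
  struct node_t *ns; /* next sibling */
} Node;

/* Hypothetical helper: allocate a node with no children and no siblings. */
Node *newNode(char letter) {
  Node *n = malloc(sizeof(Node));
  assert(n != NULL);
  n->data = letter;
  n->fc = NULL;
  n->ns = NULL;
  return n;
}

/* Return the child of node labeled letter, creating it if it is missing. */
Node *childFor(Node *node, char letter) {
  Node *c;
  for (c = node->fc; c != NULL; c = c->ns) {
    if (c->data == letter) return c;
  }
  c = newNode(letter);
  c->ns = node->fc;
  node->fc = c;
  return c;
}

/* Insert one suffix, one character per level, ending in a '$' leaf. */
void addSuffix(Node *root, const char *suffix) {
  Node *cur = root;
  for (; *suffix != '\0'; suffix++) {
    cur = childFor(cur, *suffix);
  }
  childFor(cur, '$');
}

/* Insert every non-empty suffix of text. */
void addAllSuffixes(Node *root, const char *text) {
  for (; *text != '\0'; text++) {
    addSuffix(root, text);
  }
}

/* Count the '$' leaves in the subtree rooted at node. */
int leafCount(const Node *node) {
  if (node == NULL) return 0;
  if (node->fc == NULL) return node->data == '$' ? 1 : 0;
  int total = 0;
  for (const Node *c = node->fc; c != NULL; c = c->ns) {
    total += leafCount(c);
  }
  return total;
}

/* Follow search down from root; return NULL if the path does not exist. */
const Node *locate(const Node *root, const char *search) {
  const Node *cur = root;
  while (cur != NULL && *search != '\0') {
    const Node *c = cur->fc;
    while (c != NULL && c->data != *search) c = c->ns;
    cur = c;
    search++;
  }
  return cur;
}

/* Each leaf below the located node is one suffix that starts with search. */
int occurrences(const Node *root, const char *search) {
  return leafCount(locate(root, search));
}
```

For the text "banana", the suffixes "anana" and "ana" both begin with "ana", so two leaves sit below the a-n-a path and occurrences(root, "ana") is 2.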
Run your program through valgrind and be sure you have no memory leaks. (We will talk about valgrind on Thursday.)
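A leak-free run requires that every node allocated for the tree is eventually freed. As a rough sketch of one way to do that with the fc/ns layout, the function below frees a node's children before the node itself, and saves the ns pointer before calling free so the sibling list is not lost. The names newNode, addChild, and releaseTree are hypothetical, and returning the number of freed nodes is a choice made here only so the result can be checked; the required freeTree returns void.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node_t {
  char data;
  struct node_t *fc; /* first child */
  struct node_t *ns; /* next sibling */
} Node;

/* Hypothetical helper: allocate a node with no children and no siblings. */
Node *newNode(char letter) {
  Node *n = malloc(sizeof(Node));
  assert(n != NULL);
  n->data = letter;
  n->fc = NULL;
  n->ns = NULL;
  return n;
}

/* Hypothetical helper: put a new child at the front of parent's child list. */
Node *addChild(Node *parent, char letter) {
  Node *child = newNode(letter);
  child->ns = parent->fc;
  parent->fc = child;
  return child;
}

/* Free node, its later siblings, and everything below them.
   Returns how many nodes were freed. */
int releaseTree(Node *node) {
  int freed = 0;
  while (node != NULL) {
    Node *next = node->ns;          /* save before node is freed */
    freed += releaseTree(node->fc); /* children first */
    free(node);
    freed++;
    node = next;
  }
  return freed;
}
```

Looping over siblings and recursing only on children keeps the recursion depth equal to the depth of the tree rather than the total number of nodes.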
The idea is simple: in each leaf node you will store the starting position of that suffix. For example, let's use our "banana" example from above. In the leaf node (the '$' node) corresponding to the suffix "nana$", you would store 2, since "nana" is the suffix you'd get by starting at position 2 in "banana" (counting from 0). You will have to add an extra field to the structure to accommodate this. (For non-leaves, this field is meaningless.)
Now that you've added that extension, write this function:
    int *posAppears(Node *tree, char *search);

This function returns an array of all of the starting positions of the search string.
In numAppears, you returned the number of leaf nodes below the search string. Here, you will return a list of the positions stored in the leaf nodes below the search string.
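The extension can be sketched as follows, with several assumptions stated up front. The extra field is called start here, which is a made-up name. Every suffix is assumed to end in a '$' leaf, and the text is assumed to contain no '$'. The assignment does not say how posAppears reports the length of its array, so this hypothetical variant, positionsOf, hands the length back through an extra count parameter. None of these helpers are the functions in suffixtree.c.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node_t {
  char data;
  int start;         /* assumed extra field: suffix start (leaves only) */
  struct node_t *fc; /* first child */
  struct node_t *ns; /* next sibling */
} Node;

Node *newNode(char letter) {
  Node *n = malloc(sizeof(Node));
  assert(n != NULL);
  n->data = letter;
  n->start = -1; /* meaningless unless this becomes a leaf */
  n->fc = NULL;
  n->ns = NULL;
  return n;
}

/* Return the child of node labeled letter, creating it if it is missing. */
Node *childFor(Node *node, char letter) {
  Node *c;
  for (c = node->fc; c != NULL; c = c->ns) {
    if (c->data == letter) return c;
  }
  c = newNode(letter);
  c->ns = node->fc;
  node->fc = c;
  return c;
}

/* Insert every suffix of text, recording its start index in its '$' leaf. */
void addAllSuffixes(Node *root, const char *text) {
  for (int i = 0; text[i] != '\0'; i++) {
    Node *cur = root;
    for (const char *p = text + i; *p != '\0'; p++) {
      cur = childFor(cur, *p);
    }
    childFor(cur, '$')->start = i;
  }
}

/* Follow search down from root; return NULL if the path does not exist. */
const Node *locate(const Node *root, const char *search) {
  const Node *cur = root;
  while (cur != NULL && *search != '\0') {
    const Node *c = cur->fc;
    while (c != NULL && c->data != *search) c = c->ns;
    cur = c;
    search++;
  }
  return cur;
}

/* Count the '$' leaves in the subtree rooted at node. */
int leafCount(const Node *node) {
  if (node == NULL) return 0;
  if (node->fc == NULL) return node->data == '$' ? 1 : 0;
  int total = 0;
  for (const Node *c = node->fc; c != NULL; c = c->ns) total += leafCount(c);
  return total;
}

/* Append the start of every leaf below node to out; return the new length. */
int collect(const Node *node, int *out, int used) {
  if (node == NULL) return used;
  if (node->fc == NULL) {
    if (node->data == '$') out[used++] = node->start;
    return used;
  }
  for (const Node *c = node->fc; c != NULL; c = c->ns) {
    used = collect(c, out, used);
  }
  return used;
}

/* Hypothetical variant of posAppears: returns a malloc'd array that the
   caller must free, and stores its length in *count. */
int *positionsOf(const Node *root, const char *search, int *count) {
  const Node *top = locate(root, search);
  int n = leafCount(top);
  int *out = malloc(sizeof(int) * (n > 0 ? n : 1));
  assert(out != NULL);
  *count = collect(top, out, 0);
  return out;
}
```

For "banana", searching for "nana" yields the single position 2, matching the leaf example above, and searching for "ana" yields positions 1 and 3. The order of the positions depends on the order in which children happen to be stored.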
./suffixtree jabberwocky.txt -i