CS35 Lab 06: Binary Search Trees

Due 11:59pm Sunday, November 5 2017

The two main goals of this lab are to practice implementing and using non-linear data structures and to work on extensive unit testing. To that end, you will produce

  1. A templated binary search tree implementation: linkedBST.
  2. A series of unit tests to verify that your BST implementation works.

As with most assignments for the rest of the semester, you will be working with a partner. Both partners should be present and working on the code together. You will both be responsible for understanding all concepts, so dividing and conquering is not an option. The academic integrity policy applies to the entire pair; you cannot work or share code with anyone outside your partner.

You and your lab partner will share the same git repository, which is named lab06-<user1>-<user2>.git. Please access your lab by cloning the appropriate git repository, e.g.:

$ git clone git@github.swarthmore.edu:CS35-F17/lab06-mgagne1-adanner1.git
You can also get the ssh git url from CS35 github org. For this lab, we also are hosting several test files. Instead of giving everyone identical copies of the somewhat large files, you will create a symbolic link to the folder containing the files.
$ cd ~/cs35/labs/lab06
$ ln -s /usr/local/doc/lab06-data/  ./test_data

Starting Code

Below is an overview of the files required for submitting this lab. Those highlighted in blue will require implementation on your part. Those highlighted in black are complete and should not be modified except for comments.


Binary Search Tree Implementation

Your main work for this lab is to complete and test the LinkedBST implementation. In linkedBST.h and linkedBSTNode.h we have defined templated LinkedBST and linkedBSTNode classes for you. The public methods should be implemented in linkedBST-inl.h and the helper methods are partially implemented in linkedBST-private-inl.h. We have given you a complete implementation of the findInSubtree method to get you started.

Our LinkedBST itself contains just two pieces of data:

Each LinkedBSTNode contains:

Most of the public functions defined by the LinkedBST interface are simple calls to private recursive helper functions, which perform the required operation on some subtree of the tree. For example, the height of a node is equal to one more than the maximum height of its children.

One peculiarity of our implementation is that our helper update functions (insertInSubtree and removeFromSubtree) return a pointer to the root of the subtree they modified. This pointer is usually the same node as the node passed as an argument to the function (current), but can be a new node (for insertInSubtree) or a different previously-existing node (for removeFromSubtree) if their modification changes the root node of the subtree. This peculiarity greatly simplifies the binary search tree implementation because it allows a node's parent to easily maintain a correct tree structure if the modification changes the parent's child node.


Tree Traversals

The LinkedBST class must provide implementations of traversal methods: methods which allow us to see every piece of data in the tree. A traversal is given by a list of key-value pairs; we will represent such a pair as a pair object.

Example BST

The BST interface supports four different kinds of traversal:

The first three forms of traversal are quite similar and are natural to define recursively. The level-order traversal is somewhat different and requires a breadth-first search approach similar to that which you used in the previous lab.

Each tree traversal function returns an STL vector of all key-value pairs in the tree, in the order specified by the tree traversal.

Wikipedia has some nice pictures of these traversals for reference. Note that your program is adding the elements to a list rather than “displaying” them as described on that Wikipedia page.


The vector STL class

The vector<T> class, found in the vector library, is the C++ STL implementation of an array list. It is used somewhat differently from our List<T> interface. Here’s a handy translation guide:

Operation List code vector code
Insert at front myList.insertAtHead(value) no simple equivalent
Insert at back myList.insertAtTail(value) myVector.push_back(value)
Determine size myList.getSize() myVector.size()
Get element by index myList.get(index) myVector[index]
Set element by index no equivalent myVector[index] = value
Remove from back myList.removeTail() myVector.pop_back()

One other difference is that the pop_back method is void; you must retrieve the last element yourself if you want to see that element before it is removed.

The primary reason that we introduce vector for this lab is because vectors can also be copied just like pairs. Consider the following code:

List<int>* listPointer1 = new STLList<int>; // create a pointer to a new List
List<int>* listPointer2 = listPointer1;  // copy the pointer
listPointer1->insertAtTail(4);           // add an element
cout << listPointer2->getSize() << endl; // prints 1; they point to the same List
LinkedList<int> list1;                   // create a statically-allocated list
LinkedList<int> list2 = list1;           // illegal! Lists don't know how to copy themselves!

vector<int> vector1;                     // create a statically-allocated
vector1.push_back(4);
vector1.push_back(6);
vector<int> vector2 = vector1;           // vectors do know how to copy themselves
cout << vector2.size() << endl;          // prints 2 (since both elements were copied)
vector1[0] = 2;                          // assigns 2 to the first position of vector 1
cout << vector2[0] << endl;              // prints 4; vector 2 is a different vector

Some methods of the BST ADT interface use vector to prevent you from having to worry about memory management (e.g. delete) for the values they return.


Testing your BST

In addition to implementing the LinkedBST class and the helper methods, you must write unit tests for them as well. To get you started, some of the tests in tests.cpp have been completed for you. For each remaining test, a comment containing the string “TODO” has been provided. You must write one test for each of these comments that tests the appropriate method or function in a non-trivial way. For the BST methods, you must provide a test that operates on a tree of height greater than one.

TEST(negativeHeight) {
  LinkedBST<int,string> tree;
  CHECK_EQUAL(-1, tree.getHeight());
}

Checking to ensure that the height of an empty tree is indeed -1 is a good thing to do, but if you were required to write a test for getHeight, this test would not suffice; it does not trigger recursion within the getHeightFromSubtree function. You are welcome to include the above test (or tests like it) in your lab submission, but make sure that each of the functions and methods has at least one non-trivial test.

Please remember that these tests are a development tool as well as a lab requirement. You should write these tests as soon as possible and use them to help you find bugs and verify the correctness of your code. You may even wish to write the tests before you write your implementation just so you can make certain that you know what you expect your code to do. If you use manualTests to figure out what’s happening in your code, you might also consider copying that code into tests.cpp once you figure out what you expect from it.

Consult the macro reference for UnitTest++ here for more tips on what commands/macros are available during unit testing.


Memory Management

For this lab, your program is required to run without memory errors or leaks. You should use valgrind as you proceed in your testing to track memory errors. When you have a complete first draft of your implementation:


Commenting and Coding Style Requirements
For this and future labs, graders will assign minor penalties for poor commenting and coding style. Here are some style tips for this lab:
Survey
When you have completed your lab, answer a few short survey questions in the file README.md.

Summary

When you have completed your lab, you should have

Submit

Once you are satisfied with your code, hand it in using git. Remember the add, commit, push development cycle. You can push as many times as you like, but only the most recent submission will be graded. You may want to run git status to confirm all modifications have been pushed.