Data Structures and Algorithms

Test 3 Study Guide

The tests in this course may contain any material covered in lecture up until this point, although much stronger emphasis will be placed on material for which you have received graded feedback. This page contains a collection of exercises that you may wish to complete to prepare yourself.

Inductive Proofs

You should be prepared to give inductive proofs for simple mathematical summations.

Prove by induction that \(\sum\limits_{i=1}^{n} i \cdot 2^i = 2^{n+1}(n-1) + 2\).

We have \(P(n) \equiv \sum\limits_{i=1}^{n} i \cdot 2^i = 2^{n+1}(n-1) + 2\).

The base case, \(P(1)\), is to say \(\sum\limits_{i=1}^{1} i \cdot 2^i = 2^{1+1}(1-1) + 2\). We reduce this as follows:

\begin{align*} \sum_{i=1}^{1} i \cdot 2^i &= 2^{1+1}(1-1) + 2 \\\\ 1 \cdot 2^1 &= 2^{2}(0) + 2 \\\\ 2 &= 0 + 2 \end{align*}

For the inductive case, we show that \(P(k)\) implies \(P(k+1)\) for positive \(k\).

\begin{align*} \sum_{i=1}^{k} i \cdot 2^i &= 2^{k+1}(k-1) + 2 \\ \left(\sum_{i=1}^{k} i \cdot 2^i\right) + (k+1)\cdot 2^{k+1} &= 2^{k+1}(k-1) + 2 + (k+1)\cdot 2^{k+1} \\ \sum_{i=1}^{k+1} i \cdot 2^i &= 2^{k+1}(k-1) + 2^{k+1}(k+1) + 2 \\ \sum_{i=1}^{k+1} i \cdot 2^i &= 2^{k+1}(k-1+k+1) + 2 \\ \sum_{i=1}^{k+1} i \cdot 2^i &= 2^{k+1}(2k) + 2 \\ \sum_{i=1}^{k+1} i \cdot 2^i &= 2^{k+2}(k) + 2 \\ \sum_{i=1}^{k+1} i \cdot 2^i &= 2^{(k+1)+1}((k+1)-1) + 2 \\ \end{align*}

As we have shown both the base case and the inductive case, \(P(n)\) must be true for all integer \(n>1\).
Prove by induction that \(\sum\limits_{i=1}^{n} \frac{3}{4^i} = 1 - (\frac{1}{4})^n\).

We have \(P(n) \equiv \sum\limits_{i=1}^{n} \frac{3}{4^i} = 1 - (\frac{1}{4})^n\).

The base case, \(P(1)\), is to say \(\sum\limits_{i=1}^{1} \frac{3}{4^i} = 1 - (\frac{1}{4})^1\). We reduce this as follows:

\begin{align*} \sum_{i=1}^{1} \frac{3}{4^i} &= 1 - (\frac{1}{4})^1 \\ \frac{3}{4^1} &= 1 - (\frac{1}{4})^1 \\ \frac{3}{4} &= 1 - \frac{1}{4} \\ \frac{3}{4} &= \frac{3}{4} \\ \end{align*}

For the inductive case, we show that \(P(k)\) implies \(P(k+1)\) for positive \(k\).

\begin{align*} \sum_{i=1}^{k} \frac{3}{4^i} &= 1 - (\frac{1}{4})^k \\ \left(\sum_{i=1}^{k} \frac{3}{4^i}\right) + \frac{3}{4^{k+1}} &= 1 - (\frac{1}{4})^k + \frac{3}{4^{k+1}} \\ \sum_{i=1}^{k+1} \frac{3}{4^i} &= 1 - \frac{1^k}{4^k} + \frac{3}{4^{k+1}} \\ \sum_{i=1}^{k+1} \frac{3}{4^i} &= 1 - \frac{4}{4^{k+1}} + \frac{3}{4^{k+1}} \\ \sum_{i=1}^{k+1} \frac{3}{4^i} &= 1 + \frac{3-4}{4^{k+1}} \\ \sum_{i=1}^{k+1} \frac{3}{4^i} &= 1 - \frac{1}{4^{k+1}} \\ \end{align*}

As we have shown both the base case and the inductive case, \(P(n)\) must be true for all integer \(n>1\).

Dictionary ADT

What kind of information does a Dictionary store?

A Dictionary stores keys and their associated values.
Identify four operations provided by the Dictionary ADT. Describe what types of arguments they take and what types they return.
The Dictionary ADT provides
- contains, which takes a key and returns a boolean
- get, which takes a key and returns a value
- insert, which takes a key and a value and returns nothing.
- remove, which takes a key and returns nothing.
Which four Dictionary implementations did we discuss in class?

The implementations that we considered were Binary Search Trees, AVL Trees, Hash Tables, and Linear Dictionaries (using vectors).
Compare and contrast the various Dictionary implementations. What run times do we expect on the key operations for each implementation? When would we prefer one implementation over another.
- Binary Search Trees
  
  Run time: \(O(h)\) where h is the height of the tree, which in the worst case could be \(O(n)\)
  
  Because BSTs do not have any height guarantees, and AVL Trees do, we would typically prefer the AVL Tree over a BST.
- AVL Trees
  
  Run time: \(O(\log n)\)
  
  This implementation is preferred when knowing some information about the ordering of the keys is important. For instance we can easily find the min or max key within this implementation.
- Hash Tables
  
  Run time: \(O(1)\), on average
  
  This implementation is the fastest on average, and is thus preferred in most cases. However, it trades space for time, so there might be situtations where you are not willing to make this tradeoff. It also requires the use of a hash function, which for some domains may be non-trivial to create.
- Linear (using vectors)
  
  Run time: \(O(n)\)
  
  This implementation is preferred when the size of the dictionary is known to be small and this simpler approach will have less overhead.

Trees

Define the following terms with respect to tree data structures:
- Leaf
- Height
- Binary tree
- Binary search tree
- AVL tree
- Complete binary tree
- Max heap
- A leaf is a node in a tree that has no children.
- The height of a tree is the number of levels below the root, or the number of edges in the longest path from the root to a leaf, or one less than the number of levels in the tree.
- A binary tree is a tree for which each node has at most one left child and at most one right child.
- A binary search tree is a binary tree in which, for each node, all keys in the left subtree are less than the node’s key and all keys in the right subtree are greater than the node’s key.
- An AVL tree is a binary search tree in which, for every node, the height of the left subtree and the height of the right subtree differ by no more than one.
- A complete binary tree is a binary tree in which all levels are full except the last, where all nodes are packed to the left and all empty spaces are packed to the right.
- A max heap is a complete binary tree in which every node’s prioiry is greater than or equal to the priorities of its children.

Binary Search Trees

Show the BST that would result from inserting the following keys into an initially empty tree: 12, 8, 5, 17, 1, 9, 22.
Is this the only BST that could contain these keys? If not, give at least one example of another BST that holds these same keys. If so, explain why the resulting BST is unique.

No, this is not the only BST that could hold these keys. If the keys were inserted in sorted order from 1 through 22, then the tree would be a long line of right branches with 1 at the root and 22 as the only leaf.
Is the following tree a BST? If not, why not?

This is not a BST. The key 29 is to the right of the key 30, which violates the BST invariant.
If we tried to find the key 22 in the following tree, where would it have to be?

If the key 22 where inserted into this BST, it would have to be to the right of the key 17.

AVL Trees

Draw a diagram showing the “rotate left” operation on binary trees. When is it impossible to rotate a binary tree to the left?

It is only possible to perform left rotation if the root has a right child; a binary tree with no right child of the root cannot be left-rotated.

Write pseudocode for the AVL rebalancing algorithm.

Method rebalance(node)
  d <- node.rightChild.height - node.leftChild.height
  If d < -1 Then
    If node.leftChild.leftChild.height < node.leftChild.rightChild.height Then
      node.leftChild <- rotateLeft(node.leftChild)
    EndIf
    Return rotateRight(node)
  Else If d > 1 Then
    If node.rightChild.leftChild.height > node.rightChild.rightChild.height Then
      node.rightChild <- rotateRight(node.rightChild)
    EndIf
    Return rotateLeft(node)
  EndIf
  Return node
End Function

Consider the following tree:
- Write the heights of each node for the tree above.
- Consider the addition of the value 1 to the above tree. Which nodes’ heights change?
- What will the tree look like after the value 1 is inserted?
- What will the tree look like after the value 15 is inserted into the original tree?
- What will the tree look like after the value 5 is deleted from the original tree?
- What will the tree look like after the value 9 is deleted from the original tree?
- What will the tree look like after the value 6 is deleted from the original tree?
- Showing the heights above each node:
- Showing the insertion of 1 into the tree and the changed heights as bold.
- Inserting 15 into the original tree. Note that a rebalancing is necessary because 17 was out of balance after the 15 was added.
- Removing 5 from the original tree. No rebalancing is necessary.
- Removing 9 from the original tree.
- Removing 6 from the original tree.

Priority Queues and Heaps

What is the difference between a queue and a priority queue?

A priority queue has a priority type as well as an element type. When elements are enqueued into a priority queue, they are paired with a priority (in contrast to a queue, which simply requires the element). In a queue, elements are dequeued in the order that they are enqueued; in a priority queue, they are dequeued in an order based upon their associated priorities.
Consider the following lists of numbers. Write each as a complete binary tree using the representation we discussed in class.
- [2, 3, 9, 3, 6, 2]
- [11, 4, 4, 3, 4, 6, 9, 10, 2]
- [9, 12, 3, 14, 2, 1, 6, 8, 8, 8, 9, 14, 7]
Is the following statement true or false? If true, give a justification. If false, give a counterexample. “All BSTs are heaps.”

The statement that “all BSTs are heaps” is false. The following BST shows this:

This binary tree is a BST because 3 < 4 (left subtree data are smaller) and 5 > 4 (right subtree data are larger). But it is not a max heap because 4 < 5 and, in a max heap, all parents must have a priority greater than or equal to the priority of their children.

Give pseudocode for the bubbleDown heap algorithm.

Method bubbleDown(index)
  left <- leftChild(index)
  right <- rightChild(index)
  If right >= size Then
    If left < size Then
      If contents[left].priority > contents[index].priority Then
        swap(contents[left], contents[index])
      EndIf
    EndIf
  Else
    If contents[left].priority > contents[right.priority] Then
      biggest <- left
    Else
      biggest <- right
    EndIf
    If contents[biggest].priority > contents[index].priority Then
      swap(contents[biggest], contents[index])
      bubbleDown(biggest)
    EndIf
  EndIf
End Function

Give pseudocode for the heapify heap algorithm to turn these complete binary trees into heaps. Briefly explain why this algorithm is \(O(n)\).
```
Method heapify()
  For index <- contents.size-1 Down To 0 Do
    bubbleDown(index)
  EndFor
End Function
```
This algorithm is \(O(n)\) because, although the loop runs \(O(n)\) times and bubbleDown takes \(O(\log n)\) worst-case time, most bubbleDown operations are small – that is, most nodes are closer to the leaves of the tree rather than the root. As a result, the overall number of actual swaps that occurs is \(O(n)\).