CS35: Test 3 Study Guide

The tests in this course may contain any material covered in lecture up until this point, although much stronger emphasis will be placed on material covered since Test 2. This page contains a collection of exercises that you may wish to complete to prepare for the Test.

You might also want to look at study guides for the previous exams:

Note:

Many of the trees and graphs in this study guide are written in a crude, basic format. Be prepared to write your own trees and/or graphs similarly. You are not required to follow any specific format, but it should be clear to the graders what you intend to convey.

Induction

Prove by induction that for all \(n\geq 1\) we have \(\sum\limits_{i=1}^{n} i \cdot 2^i = 2^{n+1}(n-1) + 2\).

Solution:

   Proof:

   Base case: (n=1).
     In this case, the left-hand side of the inequality we want to show is equal to 1*2^1 = 2.
     On the other hand, when n=1, the right-hand side equals 2^2*(1-1) + 2 = 0+2 = 2.
     Since the LHS and RHS are equal, the equality holds for the base case of n=1.

   Inductive Hypothesis:  Assume the equality holds for all m<n.

   Inductive Step:
     We want to show the equality holds for m=n as well.  Note that:

     \sum_{i=1}^n i* 2^{i} = n*2^n + \sum_{i=1}^{n-1} i*2^i (taking out the largest term in the summation)
        = n*2^n + 2^{(n-1)+1}((n-1)-1) +2                   (by the induction hypothesis)
        = n*2^n + 2^n(n-2) + 2
        = n*2^n + n*2^n - 2*2^n + 2
        = 2*n*2^n - 2*2^n + 2
        = n*2^{n+1} - 2^{n+1} + 2
        = 2^{n+1}(n-1) + 2

      Thus, the claim holds for m=n as well.  By induction the proof holds for all n>=1

Prove by induction that for all \(n\geq 1, 4^n-1 \text{ is divisible by 3}\).

Solution:

   Proof:

   Base case: (n=1).
     In this case, we have 4^n-1 = 4^1-1 = 4- 1 = 3.  Since 3 is obviously divisible
      by 3, the claim holds for the base case.

   Induction Hypothesis:  Assume for all m<n that 4^n-1 is divisible by 3.

   Induction Step:
     We want to show the claim holds for m=n as well.  Note mathematical

     4^n-1 = 4^n-4 + 3
           = 4(4^{n-1} - 1) + 3
           = 4(3*k_{n-1})  + 3       (for some integer k_{n-1}, by the induction hypothesis)
           = 3*(4k_{n-1} +1)         (factoring out 3)
           = 3*k_n                   (for k_n := 4k_{n-1}+1 which is a positive integer)

     Thus, 4^n-1 is divisible by three, proving the claim by induction.

Dictionary ADT

What does a Dictionary store?

Solution:

  A Dictionary stores keys and their associated values.

Which Dictionary operations do we want to run as efficiently as possible?

Solution:

   The operations `contains(key)`, `get(key)`, `insert(key, value)`, and `remove(key)` must run efficiently.

Name all of the implementations of the Dictionary ADT that we discussed in this class.

Solution:

  The implementations that we considered were Binary Search Trees, AVL Trees, Hash Tables,
  and LinearDictionary (using vectors).

Compare and contrast the various Dictionary implementations. What run times do we expect on the key operations for each implementation? When would we prefer one implementation over another.

Solution:

  * Binary Search Trees.  The runtime for these are O(h), where h is the hight of the tree.
  In the worst case, the height of a BST is O(n).  Because BSTs do not have height
  guarantees, but Balanced BSTS like AVL Trees do, we generally prefer the latter.

  * AVL Trees.  The worst-case runtime is O(log n).  This implementation is preferred
  when knowing some information about the ordering of the keys is important e.g. getMaxKey()

  * Hash Tables.  In practice, the runtime for each operation is O(1).

    This implementation is the fastest on average, and is thus preferred in most cases.
    However, it trades space for time, so there might be situations where you are not
    willing to make this tradeoff.  It also requires the use of a hash function, which
     for some domains may be non-trivial to create.

    Additionally, the operations take O(n) time in the worst case, because we could get
     very unlucky and map all items to the same bucket.

  * LinearDictionary (using vectors).  The runtime of each operation is O(n).

    This implementation is preferred when the size of the dictionary is *known* to
    be small.  Then, this simpler approach will have less overhead.

Trees

Define the following terms with respect to tree data structures:
- Node
- Leaf
- Depth
- Height
- Binary tree
- Binary search tree
- AVL tree
- Max heap

Solution:

  * A *node* is an object that holds the *key*, *value*, and pointers to the *left*
   and *right* children of a particular location within a tree data structure.

  * A *leaf* is a node in a tree that has no children.

  * The *depth* of a node is the length of the root-->node path.

  * The *height* of a tree is the maximum depth of any node in the tree.

  * A *binary tree* is a tree for which each node has at most one left child and at
   most one right child.

  * A *binary search tree* is a binary tree in which obeys the BST Property:
   for each node, all keys in the left subtree are less than the node's key and
   all keys in the right subtree are greater than the node's key.

  * An *AVL tree* is a binary _search_ tree in which obeys the AVL Tree Property:
   for every node, the height of the left subtree and the height of the right subtree
    differ by no more than one.

  * A *binary heap* is a complete binary tree in which every node's prioiry is greater
   than or equal to the priorities of its children.

Binary Search Trees

Show the BST that would result from inserting the following keys into an initially empty tree: 12, 8, 5, 17, 1, 9, 22.

Solution:

  Here is the resulting BST:

                  12
              /       \
             8         17
            /  \        \
           5    9        22
          /
         1

Is this the only BST that could contain these keys? If not, give at least one example of another BST that holds these same keys.

Solution:

   No, this is not the only BST that could hold these keys.  If the keys were inserted
    in sorted order from 1 through 22, then the tree would be a long line of right
    branches with 1 at the root and 22 as the only leaf.

  Here is another BST that could have these keys:

               9
            /     \
           5       17
          / \     /  \
         1   8   12   22

Is the following tree a BST? If not, why not?

       15
   /        \
  4          26
 /  \      /    \
1    7    17     40
            \      \
             19     38

Solution:

   This is **not** a BST, because the BST Property is violated at node 40.
   All keys in 40's right subtree should be larger, but it's right child
   has key 38, which is smaller than 40.

If you inserted 44 into the following BST, where would it need to go?
```
      50
    /
  37
 /  \
16   41
```

Solution:

   To maintain the BST property, 44 would need to go to the right of 41.

AVL Trees

Consider the following AVL Tree:

           12
      /          \
     6            17
   /   \         /   \
  4     9       14    19
 / \     \     /  \
2  5      10  13   16

Write the heights of the nodes for each tree

Solution:

|----------------|
| Node  | Height |
|-------|--------|
|   2   |    0   |
|   4   |    1   |
|   5   |    0   |
|   6   |    2   |
|   9   |    1   |
|   10  |    0   |
|   12  |    3   |
|   13  |    0   |
|   14  |    1   |
|   16  |    0   |
|   17  |    2   |
|   19  |    0   |
|----------------|

Consider adding 1 to the AVL Tree. Which nodes' heights change? What does the tree look like after inserting 1?

Solution:

  When inserting 1, the height of nodes 2,4,6, and 12 all increase by 1.  No another
  nodes change height, and all nodes remain balanced.  The AVL Tree now looks like:

             12
        /          \
       6            17
     /   \         /   \
    4     9       14    19
   / \     \     /  \
  2  5      10  13   16
 /
1

Now insert 15. What does the AVL Tree look like post-insertion?

Solution:

  Inserting 15 causes a LR imbalance at 17, which requires (i) a left rotate on 14
  followed by a right rotate on 17.  No other nodes need be rebalanced.

   The AVL Tree now looks like:

             12
        /          \
       6            16
     /   \         /   \
    4     9       14    17
   / \     \     /  \     \
  2  5      10  13   15   19
 /
1

Now, remove 5. What does the AVL Tree look like post-deletion?

Solution:

  Deleting 5 causes a LL imbalance at 4, which requires a right rotate on 4.

   The AVL Tree now looks like:

             12
        /          \
       6            16
     /   \         /   \
    2     9       14    17
   / \     \     /  \     \
  1   4    10   13   15   19

Now, remove 9. What does the AVL Tree look like post-deletion?

Solution:

  The AVL Tree is still balanced after deleting 9.
   The AVL Tree now looks like:

             12
        /          \
       6            16
     /   \         /   \
    2     10      14    17
   / \           /  \     \
  1   4        13   15   19

Now, remove 6. What does the AVL Tree look like post-deletion?

Solution:

  6 has two children, to remove 6 we copy the greatest key in 6's left subchild
  into 6, then delete that node.  After removing, the AVL Tree looks like:

             12
        /          \
       4            16
     /   \         /   \
    2     10      14    17
   /             /  \     \
  1             13   15   19

  This AVL Tree remains balanced, so no rebalancing is necessary.

Hash Tables

What is a hash function? What is a hash collision?

Solution:

A hash function is a function which transforms data (e.g. a dictionary key) into
 an integer. A hash collision occurs when two different inputs map to the same hash value.

What is separate chaining?

Solution:

  Separate chaining is a technique for resolving hash collisions.  Instead of
  storing a single key/value pair in a bucket, separate chaining stores all items whose
  keys map to this bucket.  In lab, we used LinearDictionaries to store the items in
  each bucket.

What is load factor? How is it used in a hash table?

Solution:

 The load factor of a hash table is its size divided by its capacity.  Load factor
  is used to determine when more buckets are allocated in order to keep mappings
  spread out between buckets.

What is the worst case time for inserting an element into a hash table? How does this worst case time relate to the hashing function?

Solution:

 In the worst case, an element insertion takes O(n) time; this occurs if all of the
  mappings have the same hash value.  The likelihood of this occurring is based
   on the quality of the hash function used with the table.

Suppose you have an hash table with three buckets and a maximum load factor of 0.75. Assume that keys and values are integers and that the hash function is hash(x, capacity) is x % capacity. Insert the following key/value pairs: (4,8), (1,5), and (0,6). Draw the hash table after each addition. Assume that expand_capacity doubles the number of buckets.

Solution:

Before inserting:
   ------
   |    |
   ------
   |    |
   ------
   |    |
   ------

After inserting (4,8):

   ------
   |    |
   ------
   |  *-|--> [(4,8)]
   ------
   |    |
   ------

After inserting (1,5):

   ------
   |    |
   ------
   |  *-|--> [(4,8), (1,5)]
   ------
   |    |
   ------

Inserting (0,6) causes the hash table to expand capacity.  After inserting (0,6):

   ------
   |  *-|--> [(0,6)]
   ------
   |  *-|--> [(1,5)]
   ------
   |    |
   ------
   |    |
   ------
   |  *-|--> [(4,8)]
   ------
   |    |
   ------

Priority Queues and Binary Heaps

What is the difference between a queue and a priority queue?

Solution:

 A priority queue has a priority type as well as an element type.  When elements are
 inserted into a priority queue, they are paired with a priority (in contrast to a
 queue, which simply requires the element).  In a queue, elements are dequeued in
 the order that they are enqueued; in a priority queue, they are removed in an order
  based upon their associated priorities.

What does it mean for a binary tree to be complete?

Solution:

  A complete binary tree is a binary tree where all levels are full except possibly
  the last level, which has nodes filled in from left to right.

Consider the following lists of numbers. Write each as a complete binary tree using the representation we discussed in class.
- [2, 3, 9, 3, 6, 2]
- [11, 4, 4, 3, 4, 6, 9, 10, 2]

Solution:

      2
    /   \
   3     9
  / \   /
 3   6 2

        11
      /     \
      4      4
    /  \    /  \
   3    4  6    9
  /  \
  10  2

Is the following statement true or false? If true, give a justification. If false, give a counterexample. "All BSTs are binary heaps."

Solution:

This statement is *false*.  Binary heaps must be complete and must obey the Heap order
Property.  BSTs don't need to be complete.  Additionally, BSTs generally do not
obey the Heap Order Property.

    18
   /  \
  5    22

The tree above is a BST, but does not obey the Heap Order Property, because 22 is
greater than 18.

Graphs

What is a graph?

Solution:

A graph G is a pair of sets.  The first set, V, is a set of vertices.  The second set, E,
is a set of edges.  An edges is a pair of vertices e.g., (i,j).  A graph represents a
binary relationship between vertices.

Edges in a graph can be *weighted* or *unweighted*, and *directed* or *undirected*.

Define the following terms in relation to graphs:
- Weighted
- Directed
- Connected

Solution:

    A graph is *weighted* if its edges have weights or costs.  A social network,
    for instance, is unlikely to be weighted, while a map probably is
    (where the weight might represent distance, difficulty of traveling, etc.).

    A graph is *directed* if the edges represent assymmetric relationships between
    vertices.  If the relationship is symmetric, the graph is *undirected*.

    An undirected graph is *connected* if, for each vertex in the graph, there is
    a path to every other vertex.

What is an Adjacency List?

Solution:

  An adjacency list is a way of implementing a Graph, where vertices are
  stored together with their list of neighbors or edges.  In pseudocode, we assumed
  the vertices were numbered 1,..., n and drew the adjacency list as an array of
  (vertex, List of vertices) pairs.

  In the lab, we implemented an adjancy List as a Dictionary, with string keys
  representing vertices, and the values vector<Edge> representing the edges incident
  to the vertex key.

Consider the following graph:

A ---  B -- C
|  \  /     |
D -- E ---- F

Perform BFS to find a path from A to F. (in lab we called this function shortestLengthPathBFS) What path gets returned?

Solution:

   [A,E,F]