CS35: Final Exam Study Guide

This study guide includes topics covered since Quiz 3; you should also study all concepts from earlier in the course. The earlier study guides are available here: Quiz 1, Quiz 2, and Quiz 3.

You should be able to define or explain the following terms:

The Dictionary abstract data type
Hash tables, hash functions, hash code, a hash table's load factor
Linear chaining and linear probing
graph, vertex, edge
directed, undirected graphs
weighted, unweighted graphs
adjacent, neighbor, outgoing edges, incoming edges, out-degree, in-degree, degree
path, reachable, path length, distance, cost
connected component, strongly-connected component
adjacency-list representation, adjacency-matrix representation
breadth-first search, depth-first search of a graph
shortest path problem
Dijkstra's algorithm
cycle
directed acyclic graph (DAG)
topological sort
trees
spanning tree, minimum spanning tree (MST)
Kruskal's, Prim-Jarnik's Algorithm

You should be familiar with the following C++-related concepts:

copy constructors
references (at least how to recognize them...)
operator[]
vectors and their functions, particularly those related to the List interface
valgrind

Practice problems

Using linear chaining to resolve hash collisions, insert the following five items into a hash table of capacity 5, in the order given (from top to bottom):

Key Hash code

A 1

B 1

C 1

D 0

E 2

Repeat #1 using linear probing to resolve hash collisions, instead of chaining.

Consider the following hash function, like the function in hashtable.inl:

  int hash(int key, int capacity) {
    return key % capacity;
  }

Assuming that the hash table does not resize, complete the following code so that the total running time is asymptotically proportional to n^2:

  void f(int n) {
    HashTable<int,string> ht(n);  // Creates hash table with capacity n.
    for (int i = 0; i < n; ++i) {

      ht.insert(________________, "skittles");
    }
  }

Consider the following graph:
1. Give the adjacency-list representation of the graph.
2. Give an adjacency-matrix representation of the graph.

Play the Dijkstra Adventure Game by running dag in a terminal window. Be sure to play once or twice with the --random option:
```
  $ dag --random
```
(I'm sorry for how tedious the game can be -- the graph is big!)

In this problem we will analyze Dijkstra's algorithm as given in lecture:
```
  Dijkstra(G, s):
      for each vertex v in G:
          d[v] = INFINITY
      d[s] = 0
      PQ = make empty heap
      PQ.insert(s with priority 0)

      while !PQ.isEmpty():
          current = PQ.removeMin()
          for each neighbor v of current:
              if d[v] > d[current] + weight(current, v):
                  d[v] = d[current] + weight(current, v)
                  PQ.insertOrUpdatePriority(v with priority d[v])
```
Assume that Dijkstra's algorithm is executing on a graph with n vertices and m edges, and that PQ.insertOrUpdatePriority has a worst-case running time of O(lg n) for a heap containing n items.
1. In the worst-case, how many items might be stored in the heap PQ?
2. In the pseudocode above, d is a dictionary where each key is some vertex v and the value is the currently-shortest-known distance from s to v. How many key-value pairs are stored in this dictionary? What data structure might you use to store this dictionary, and what is the running time of getting or setting the value for a key?
3. For a single neighbor v of some current vertex, what is the running time of that neighbor's single execution of the inner for-loop:
```
  if d[v] > d[current] + weight(current, v):
      d[v] = d[current] + weight(current, v)
      PQ.insertOrUpdatePriority(v with priority d[v]) 
```
4. How many total times will that inner for-loop execute for the entire graph? In other words, in terms of n and m, how many neighbors (total) are in the graph?
5. The analysis of part (d) is a bit strange at first because we're considering the total number of neighbors in the entire graph rather than just the neighbors of the current vertex (which varies from graph to graph). This allows us to analyze the total cost of all executions of the for-loop (in sum, for all executions of the while-loop), though, rather than just analyze the cost of exploring the current vertex's neighbors. The total cost of all executions of the inner for-loop is then:
```
  O( #num_neighbors +  
     #num_neighbors*cost_of_each_for_loop_execution ) 
```
  (The first term, #num_neighbors, is the total cost of getting the neighbors for every vertex in the graph using an adjacency-list representation. The second term is the product of (c) and (d) above.) Combine your answers to part (c) and (d) to determine the total cost of executing the for-loop.
6. The only remaining costs in this algorithm are the initialization cost (before the while-loop) and the total cost of executing the line
```
  current = PQ.removeMin() 
```
  for all executions of the while-loop. What are these costs, in terms of n and m? What is the total cost of Dijkstra's algorithm?

Consider the following set of courses and their prerequisites:
- PHYS 005: none
- PHYS 007: PHYS 005 and MATH 025
- PHYS 008: PHYS 007 and MATH 033
- PHYS 014: PHYS 008, MATH 027, and MATH 033
- PHYS 050: MATH 027 and MATH 033
- PHYS 111: PHYS 014 and MATH 033
- PHYS 112: PHYS 014, PHYS 050, and MATH 033
- PHYS 113: PHYS 111 and MATH 027
- PHYS 114: PHYS 111 and MATH 033
- MATH 025: none
- MATH 027: none
- MATH 033: MATH 025
Assuming that a physics student can take only one course at a time, use a graph algorithm from this course to find a sequence of courses that satisfies the prerequisites. Show the graph on which you executed the graph algorithm.

In lecture we saw a variant of recursive depth-first search that also determined if the graph contained a cycle. Recursive depth-first search is extremely simple and elegant if the algorithm does not need to track additional information or return any value:
```
  dfs(G, src):                           // This function initializes the
      isVisited = new dictionary         // dictionary and calls the
      for each vertex v in G:            // recursive helper function.
          isVisited[v] = false
      recursiveDfs(G, src, isVisited)

  recursiveDfs(G, src, isVisited):       // This recursive function
      isVisited[src] = true              // searches the graph.
      for each neighbor v of src:
          if !isVisited[v]:
              recursiveDfs(G, v, isVisited)
  
```
1. Execute dfs on the following graph, using s as the source vertex. Draw the stack frame diagram for the program as it executes. (Assume that isVisited refers to a single copy of the dictionary for all frames of the stack.)
2. Modify the pseudocode above to accept a second vertex, dest, as an argument. Return true if there is a path from src->dest in G, and return false otherwise.
3. Modify the pseudocode again to return the length of some src->dest path (not necessarily the shortest path) if there is a path from src to dest in G. If there is no src->dest path, return -1.

Execute Prim's algorithm to find a MST of the following graph using s as the initial vertex. For each step of the algorithm, show which edges and vertices have been selected for the MST up to that step.

Find the smallest weighted, undirected graph G such that G has multiple distinct MSTs.

Key	Hash code
A	1
B	1
C	1
D	0
E	2