CS35 Final Exam Study Guide

The final exam for this course is comprehensive: it will contain material from throughout the semester, albeit with a slight emphasis on more recent material. This study guide includes questions that summarize major themes from this course and focuses on the additional material we’ve covered after Test 3. To prepare for the final exam, please consult the following problems as well as the previous study guides.

Data Structures and Invariants

You should be prepared to discuss the data structures we have created in class, the fields in their implementations, and the invariants we rely upon in order to write algorithms for those data structures.

What is an ADT? What is the difference between an ADT and a data structure?

An ADT is a description of how data can be accessed and manipulated. In other words, an ADT is an interface allowing the caller to store and retrieve data. A data structure is a concrete realization of an ADT; data structures give specific algorithms to implement the interface that the ADT describes.
Give an example of an ADT and its data structures.

A List is an example of an ADT. Lists are defined by a collection of routines that allow a user to store and manipulate data such as insertFirst and removeFirst. Both linked lists and array lists are concrete implementations of the list ADT; each of them give implementations for the interface that the list ADT describes.
What is an invariant? Give an example of an invariant from one of the data structures we discussed in class.

An invariant is a feature of a data structure that should always remain true. For example, the size of an an ArrayList must be greater than or equal to 0 and less than or equal to the capacity.

Hash Tables

What is a hash? What is a hash collision?

A hash is a function which transforms data (e.g. a dictionary key) into uniform data like an integer. A collision occurs when two different inputs map to the same hash value.
We focused on one technique for resolving hash collisions in class called chaining. Briefly explain how chaining works when a collision occurs.

In chaining, each bucket in the hash table holds a chain of all of the keys (and their values) that have been mapped to that same location.
What is load factor? How is it used in a hash table?

The load factor of a hash table is its size divided by its capacity. Load factor is used to determine when more buckets are allocated in order to keep mappings spread out between buckets.
What is the worst case time for inserting an element into a hash table? How does this worst case time relate to the hashing function?

In the worst case, an element insertion takes $O(n)$ time; this occurs if all of the mappings have the same hash. The likelihood of this occurring is based on the quality of the hash function used with the table.
Suppose you have an empty chaining hash table with three buckets and a maximum load factor of 0.75. Assume that keys and values are integers and that the hash function is hash(x, capacity) is x % capacity. We will insert the mappings $4 \mapsto 8$, $1 \mapsto 5$, $0 \mapsto 6$, and $6 \mapsto 2$. Draw the hash table after each addition. Assume that ensure_capacity doubles the number of buckets before rehashing.

Empty hash table:

$latex 6dc2b1ab2db9209a90d09a6e9f5bcb5c$

After $4 \mapsto 8$:

$latex 0b6bef8ae12b31892b618f8806c71b1e$

After $1 \mapsto 5$:

$latex 60402fe427f67fe79297f9a9857ce7cb$

After $0 \mapsto 6$, which expands the hash table:

$latex 03abf0e07b19cc721ef5c85fb92945f5$

After $6 \mapsto 2$:

$latex 3eff7d5e71cf57635917e44e0e9cdfcd$

Graphs

What is a graph?

A graph is a pair of sets. The first set, $V$, is a set of vertices of any type. The second set, $E$, is a set of edges. An edge is a grouping between a source vertex, a target vertex, a weight, and a label.
Define the following terms in relation to graphs:
1. Weighted
  
  A graph is weighted if its edges have weights / have meaningful weights / have non-uniform weights. A social network, for instance, is unlikely to be weighted, while a geographical map probably is (where weight is distance, difficulty of traveling, etc.).
2. Directed
  
  A graph is undirected if, for each edge heading in one direction, there is an identical edge heading in the opposite direction. A graph is directed if it is not undirected.
3. Connected
  
  A graph is connected if, for each vertex in the graph, there is a path to every other vertex in the graph. A graph is weakly connected if its undirected equivalent is connected.
Consider the following graph:

$latex 7b59868474ff06ccf42afb1e5656add7$
1. Which path would a breadth first search find from vertex A to vertex G?
  
  A breadth-first search would find the path [A, D, G].
2. Which path would Dijkstra’s algorithm find from vertex A to vertex G? Why is this path different from the BFS path above?
  
  Dijkstra’s algorithm would find the path [A, C, D, E, G]. This is different from breadth-first search because the sum of the weights of this path is less than the sum of the weights of the path with the fewest edges.
3. Ignoring the weights on this graph, perform a topological sort of it. Give the resulting topological sort as a list of vertices.
  There are many valid sequences that may result from the topological sort. Here are a two possible outcomes:
  
  [A, B, F, C, D, E, G]
  
  [A, C, B, F, D, E, G]