CS41: Midterm Exam Study Guide

The midterm exam covers all material introduced before fall break, up to and including what we have covered so far from the section on Divide and Conquer. You are responsible for all material discussed in lecture or lab, and all material on homework assignments. You are also expected to be familiar with data structures introduced in CS35.


You should be able to define and explain the following terms:

Practice problems:

  1. For each pair of functions, determine the tightest possible asymptotic comparison. e.g. if $f = O(g)$ and $f = \Omega(g)$, then write $f = \Theta(g)$. If no asymptotic comparison can be made, say so.
    • $f_1(n) = 2n^2 \log(n),\quad f_2(n) = n^3 + n^2/\log(n)$
    • $g_1(n) = n^{10},\quad g_2(n) = 2^{n/10}$
    • $h_1(n) = \sqrt{\log(n)}, \quad h_2(n) = \log(\log(n))$
  2. Prove the following statements.
    • If $f(n) \in \Omega(g(n))$ and $g(n) \in \Omega(h(n))$, then $f(n)\in \Omega(h(n))$.
    • If $f(n) \in \Omega(g(n))$, then $g(n) \in O(f(n))$.
  3. Order the following functions in increasing order of their asymptotic running time.
    • $f_1(n) = (\log\log(n))^2$
    • $f_2(n) = (3/2)^n$
    • $f_3(n) = n^2 + \log^{10}(n)$
    • $f_4(n) = n^{1.3}\log^5(n)$
    • $f_5(n) = 2^{\log^2(n)}$
    • $f_6(n) = (\sqrt{2})^{\log(n)}$
  4. In the shortest paths problem, you are given a graph $G = (V,E)$ with nonnegative lengths $\{\ell_e : e \in E\}$ along with a vertex $s \in V$, and you must return for each vertex $v \in V$ the length of the shortest path $s \leadsto v$.

    Dijkstra's algorithm provides an efficient solution to the shortest paths problem. Consider the following pseudocode for Dijkstra's algorithm:

    Dijkstra($G,s,\ell$) {
    S = {s}
    Initialize d[i] = $\infty$ for all vertices i
    d[s] = 0
    while $S \neq V$ {
    pick $v \in V\setminus S$ with the cheapest $\mathsf{cost}(v)$, where
    $\qquad\mathsf{cost}(v) := \min_{u \in S} \left(d[u] + \ell_{(u,v)}\right)$
    $d[v] := \mathsf{cost}(v)$
    add $v$ to $S$
    }
    return d
    }
    1. Show that Dijkstra's algorithm correctly returns the shortest paths. You can assume the graph is connected.
    2. What is the running time of Dijkstra's algorithm? You're free to choose whatever data structure(s) you need to answer this problem, but must provide the most efficient running time you can.
  5. Solve the following recurrence relations, using (i) recursion trees, and (ii) substitution method
    • $M(n) = 5M(n/2) + 3n, \text{for all } n > 2;\qquad M(2) = 5$
    • $A(n) = 3A(n/3) + 3n, \text{for all } n > 1;\qquad A(1) = 4$
    • $T(n) = 4T(n/2) + 4n^4, \text{for all } n >4;\qquad T(4) = 4$
  6. Suppose you are given two lists $A,B$ each of $n$ positive integers. You are allowed to reorder the elements in each list however you wish, after which you get paid $$\sum_{i=1} A[i]\cdot B[i]$$ Design and analyze a greedy algorithm to maximize your payout. Prove your algorithm returns the optimal payout. What is the running time of your algorithm?
  7. Finding a bichromatic edge. Suppose $G = (V,E)$ is a graph, whose vertices are colored red or blue. A bichromatic edge in G is an edge whose endpoints are colored differently.
    1. Design and analyze an algorithm that takes a graph $G = (V,E)$ with vertices colored red or blue, and returns a bichromatic edge, or says that none exists.
    2. Now, suppose that the graph $G$ is a path---$V = \{v_1,\ldots, v_n\}$, and $E = \{(v_i,v_{i+1}), 1 \leq i < n\}$. Further suppose that $v_1$ is red and $v_n$ is blue. Design and analyze an efficient algorithm that finds a bichromatic edge in this special case.
  8. You are helping some security analysts monitor a collection of networked computers, tracking the spread of a virus. There are $n$ computers in the system, call them $C_1$, $C_2$, \ldots, $C_n$. You are given a trace indicating the times at which pairs of computers communicated. A trace consists of $m$ triples $(C_i, C_j, k)$ that indicate that $C_i$ communicated with $C_j$ at time $t_k$. At that time, the virus could have spread between $C_i$ and $C_j$.
    We assume that the trace holds the triples in sorted order by time. For simplicity, assume that each pair of computers communicates at most once over the time of the trace. Also, it is possible to have pairs $(C_s, C_j, k)$ and $(C_t, C_j, k)$: this would indicate that computer $C_j$ opened connections to both $C_s$ and $C_t$ at time $t_k$, allowing the virus to spread any way among the three machines. Design and analyze an efficient algorithm that, given as input a collection of time-sorted trace data and a virus query, answers the question ``if the virus was introduced to $C_i$ at time $x$, could it spread to $C_j$ at time $y$?'' That is, is there a sequence of communications that could have lead to the virus moving from $C_i$ to $C_j$?