CS41: Final Exam Study Guide

This study guide includes topics covered over the entire semester. The final will be cumulative, although weighted towards the second half of the semester. This study guide is intended to be comprehensive, but not guaranteed to be complete.

In addition to using this study guide, you should:

You should be able to define or explain the following terms:

Practice problems

  1. Divide and Conquer. (Kleinberg and Tardos 5.1) You are interested in analyzing some hard-to-obtain data from two separate databases. Each database contains n numberical values---so there are 2n values total---and you may assume that no two values are the same. You'd like to determine the median of this set of 2n values, which we will define here as the nth smallest value.
    However, the only way you can access these values is through queries to the databases. In a single query, you can specify a value k to one of the two databases, and the chosen database will return the kth smallest value that it contains. Since wueries are expensive, you would like to compute the median using as few queries as possible.
    Give an algorithm that finds the median value using at most O(log n) queries.
  2. Dynamic Programming. In the Subset-Sum problem, you are given n items {1, ..., n}. Each item has a nonnegative integer weight wi. You are also given an integer weight threshold W. For a subset of elements S, define w(S) to be the sum of wi, taken over all i in S. Your goal is to output the set S that maximizes w(S), subject to w(S) <= W.
    Design and analyze an algorithm to solve the Subset-Sum problem. Your algorithm should run in O(nW) time.
  3. Intractability. Consider the following decision-version of Subset-Sum, which we'll call SSum. In this version, you're given n items {1,..., n} with item weights wi, and a weight threshold W, and you must output YES iff there exists a subset whose item weights total exactly W; i.e., if there is S such that w(S) = W.
    1. Show that SSum is NP-Complete.
    2. Show how to solve SSum using your solution for Subset-Sum.
    3. Your algorithm for Subset-Sum runs in O(nW) time. Why does this not give you a polynomial-time algorithm for SSum??
  4. Approximation Algorithms. (Kleinberg and Tardos 11.4) Given a set A = {a1, ..., an} and subsets B1, ..., Bm of A, a hitting set is a subset H of A such that for any 1 <= i <= m, H intersects with Bi. In the HittingSet problem, written HS, you're given A and B1, ..., Bm. Furthermore, each element ai has an item weight ai. You must output the hitting set H of minimum cost.

    Give a polynomial time b-approximation algorithm for HS, where b = max1 <= i<= n |Bi|. Hint: Use LP-relaxation.

  5. Approximation Algorithms. (Kleinberg and Tardos 11.10) Suppose you are given an n by n grid graph G. Associated with each node v is a weight w(v), which is a nonnegative integer. You may assume taht tthe weights of all nodes are distinct. Your goal is to choose an independent set S of nodes of the grid so that the sum of the weights of the nodes in S is as large as possible. Consider the following greedy algorithm for this problem:
    1. Start with S equal to the empty set
    2. While some node remains in G
      • Pick the heaviest remaining node v.
      • Add v to S
      • Delete v and all its neighbors from G
    3. Return S

    First, let S be the independent set returned by this algorithm, and let T be some other independent set in G. Show that for each node v in T, either v is in S, or v is adjacent to some v' in S with w(v') >= w(v).

    Show that this algorithm returns an independent set with total weight at least 1/4 of the maximum total weight of an independent set.

  6. Randomized Algorithms. (Kleinberg and Tardos 13.8) Let G = (V,E) be an undirected graph with n nodes and m edges. For a subset of vertices X, let G[X] denote the subgraph induced on X; i.e., the graph whose vertices are X and whose edge set consists of all edges in G with both endpoints in X.
    We are given a natural number k <= n and are interested in finding a set of k nodes that induces a "dense" subgraph of G; we'll phrase this concretely as follows. Give a randomized algorithm that produces, for a given graph G and natural number k, a set X of k vertices with the property that the induced subgraph G[X] has at least [mk(k-1)]/[n(n-1)] edges.

    Your algorithm should have expected polynomial time but always output correct answers. What is the expected running time of your algorithm?

  7. Streaming Algorithms.
    1. Give a randomized streaming algorithm that, given input S = (a1, ..., am), outputs the ith item ai with probability 2-i, and outputs NULL with probability 2-m. Your algorithm should use O(log m + log n) space.
    2. Let p be any distribution on {1,..., m}. (e.g. if X is distrubuted according to p, then Pr[X=i] = pi.) Give a randomized streaming algorithm that, given input S = (a1, ..., am), outputs the ith item ai with probability pi. How much space does your algorithm use?