CS41: Final Exam Study Guide

This study guide includes topics covered over the entire semester. The final will be cumulative, although weighted towards the second half of the semester. This study guide is intended to be comprehensive, but not guaranteed to be complete.

In addition to using this study guide, you should:

Review your lecture notes
Look over assigned readings from the textbook
Review previous homeworks, and
Review lab assignments

You should be able to define or explain the following terms:

Stable Matching a.k.a. Stable Marriage
Asymptotic Notation (big-Oh, big-Omega, big-Theta)
induction
recurrence relations
recursion tree
substitution method (and partial substitution method)
directed, undirected graphs
breadth-first search, depth-first search (BFS/DFS)
bipartite graph
cycle
connected component
topological ordering
spanning tree
Minimum Spanning Tree (MST)
Kruskal's/Prim's Algorithms
Greedy Algorithm
stays-ahead method
exchange argument
Divide and Conquer
Dynamic Programming
memoization
tabulation
computational complexity
intractability
decision problem
optimization problem
complexity classes (P, NP, NP-Complete)
Cook-Levin Theorem
NP-Complete problems (SAT, VertexCover, IndependentSet, SetCover, 3-Coloring, 3SAT, ...)
Travelling Salesman Problem (TSP)
polynomial-time reduction
polynomial-time verifier
Approximation Algorithm
approximation ratio
minimization problem, maximization problem
pricing method
linear programming (LP)
LP-based approximation algorithms
Randomized Algorithms
discrete probability
random variable
expected value
linearity of expectation
Conditional vs Unconditional Lower Bounds
encoding argument
Lower Bound for comparison-based sorting algorithms
Quantum Computation

Practice problems

Divide and Conquer. (Kleinberg and Tardos 5.2) Recall the problem of finding the number of inversions. As in the text, we are given a sequence of n distinct numbers a₁,a₂,...,a_n, and we define an inversion to be a pair i< j such that a_i > a_j. However, one might feel like this measure is too sensitive. Call a pair a significant inversion if i < j and a_i > 2a_j. Give an O(n log(n)) algoirthm to count the number of significent inversions between two orderings.
Dynamic Programming. In the Subset-Sum problem, you are given n items {1, ..., n}. Each item has a nonnegative integer weight w_i. You are also given an integer weight threshold W. For a subset of elements S, define w(S) to be the sum of w_i, taken over all i in S. Your goal is to output the set S that maximizes w(S), subject to w(S) <= W.
Design and analyze an algorithm to solve the Subset-Sum problem. Your algorithm should run in O(nW) time.

Note:To get full credit, you only need to define the table your dynamic program will use, show how to use it to solve Subset-Sum, and show how to compute each table entry recursively, using smaller table entries. For example, if you needed to solve the Steel-Rod Problem, the following solution is sufficient for full credit:
- Let R[0...n] be a one-dimenstional table, such that R[k] is the maximum revenue obtainable from a k foot rod.
- We wish to output R[n].
- R[0] = 0, and for k>0, R[k] = max {P[j] + R[k-j]}, where the maximum is taken over all 1<=j<=n.
Of course, showing any intuition or work will maximize your chances of getting partial credit.
Intractability. Consider the following decision-version of Subset-Sum, which we'll call SSum. In this version, you're given n items {1,..., n} with item weights w_i, and a weight threshold W, and you must output YES iff there exists a subset whose item weights total exactly W; i.e., if there is S such that w(S) = W.
1. Show that SSum is NP-Complete.
2. Show how to solve SSum using your solution for Subset-Sum.
3. Your algorithm for Subset-Sum runs in O(nW) time. Why does this not give you a polynomial-time algorithm for SSum??
Approximation Algorithms. (Kleinberg and Tardos 11.4) Given a set A = {a₁, ..., a_n} and subsets B₁, ..., B_m of A, a hitting set is a subset H of A such that for any 1 <= i <= m, H intersects with B_i. In the HittingSet problem, written HS, you're given A and B₁, ..., B_m. Furthermore, each element a_i has an item weight a_i. You must output the hitting set H of minimum cost.
Give a polynomial time b-approximation algorithm for HS, where b = max_{1 <= i<= n} |B_i|. Hint: Use LP-relaxation.
Approximation Algorithms. (Kleinberg and Tardos 11.10) Suppose you are given an n by n grid graph G. Associated with each node v is a weight w(v), which is a nonnegative integer. You may assume taht tthe weights of all nodes are distinct. Your goal is to choose an independent set S of nodes of the grid so that the sum of the weights of the nodes in S is as large as possible. Consider the following greedy algorithm for this problem:
1. Start with S equal to the empty set
2. While some node remains in G
  - Pick the heaviest remaining node v.
  - Add v to S
  - Delete v and all its neighbors from G
3. Return S
First, let S be the independent set returned by this algorithm, and let T be some other independent set in G. Show that for each node v in T, either v is in S, or v is adjacent to some v' in S with w(v') >= w(v).
Show that this algorithm returns an independent set with total weight at least 1/4 of the maximum total weight of an independent set.
Randomized Algorithms. (Kleinberg and Tardos 13.8) Let G = (V,E) be an undirected graph with n nodes and m edges. For a subset of vertices X, let G[X] denote the subgraph induced on X; i.e., the graph whose vertices are X and whose edge set consists of all edges in G with both endpoints in X.
We are given a natural number k <= n and are interested in finding a set of k nodes that induces a "dense" subgraph of G; we'll phrase this concretely as follows. Give a randomized algorithm that produces, for a given graph G and natural number k, a set X of k vertices with the property that the induced subgraph G[X] has at least [mk(k-1)]/[n(n-1)] edges.
Your algorithm should have expected polynomial time but always output correct answers. What is the expected running time of your algorithm?
Unconditional Lower Bounds You and your friend Eddie have been discussing unconditional lower bounds for sorting algorithms. Eddie was skeptical of encoding arguments, but after some work, you've convinced him that any comparison-based sorting algorithm requires Ω(n log n) comparisons.
Now, Eddie has a new claim. Rather than sorting the entire list, Eddie wants an algorithm that takes an array of n elements and returns an array with the n/2 smallest elements in sorted order first, followed by the other n/2 elements which may or may not be in sorted order. Furthermore, Eddie claims an O(n) time algorithm which achieves this sort.
Now, you are the one who is skeptical. How skeptical should you be? Either give an O(n)-time algorithm that finds and sorts the n/2 smallest elements, or give a lower bound that shows it cannot exist.