CS97 -- Senior Conference, Fall 2010

Data Management Systems



A LaTeX style file and an example .tex file for your final paper are now available.
Guidelines for your project proposal are now available.
Sample project ideas are now available.

Class Information

Room: Science Center 246
Class: Tuesday, Thursday 2:40–3:55pm
Professor: Charlie Garrod
Office: Science Center 255
Phone: 6071
Office hours: Wednesdays 2–3:30pm, or stop by whenever my door is open.

Welcome to CS97. This year's topic is data management systems. Our goal is to understand the fundamental problems of data management and the roles played by modern data management systems. With popular sites like Google and Facebook, GPS-enabled mobile computers in our pockets, and an increasing use of networked devices and sensors in our everyday environment, you are surely aware of the explosion in the amount of data we collect and share. In this course, we will study the question of how this data should be stored and accessed, so that the data remains valid and useful as it is updated and used by many simultaneous users.

Early in this course we will focus on traditional relational database systems, including the relational model, transactions, consistency, concurrency control, indexes, and query processing. As the semester progresses we'll increasingly consider systems with non-traditional requirements, implementations, or features, including stream databases, non-relational systems, and relaxed consistency models.

Your work in this course will consist of regular out-of-class reading and written reactions to the reading, presentation of papers in class, one or more short lab assignments, and a longer-term research project and paper. By the end of the course you will demonstrate the following skills:

There is no required textbook for this course. Many of our readings will come from Database Management Systems by Ramakrishnan and Gehrke or from Readings in Database Systems by Hellerstein and Stonebraker. I will provide online links to, or copies of, all required reading; you will probably need to access these from the on-campus network. A copy of Ramakrishnan and Gehrke is also on Honors reserve at the Cornell Library.


Grades will be weighted as follows:
20%  Paper reaction notes and participation
15%  Paper presentations
10%  Short lab assignment(s)
55%  Course project
      5%  Proposal and lit review
     10%  Checkpoints and rough draft
     10%  Reviewer comments
     10%  Project presentation
     20%  Final paper

Academic Integrity

Strong academic integrity is expected of every student. Plagiarism, cheating, and academic dishonesty will be reported to the College Judiciary Committee and dealt with severely. You may not hand in work done by someone else as your own. You may discuss ideas and problems with others on a general level and such discussions are encouraged, but you must credit any collaborators or resources used in the completion of your assignments and projects. Please see me if there are any questions about what is permissible.


This schedule is subject to -- and will -- change as needed during the semester:

Week 1 (Week 1 readings, Lab 1)
  Aug 31: Intro
  Sep 02: The relational model, Ryan's summary of Codd, Ashley's summary of "What Goes Around Comes Around"

Week 2 (Week 2 readings)
  Sep 07: Database internals, Max and Nick's summary of Anatomy...
  Sep 09: Access methods and indexes, part 1, Ross and Alex's summary of R&G Ch. 8-9 [Drop/Add ends (Sep 10)]

Week 3 (Week 3 readings)
  Sep 14: Access methods and indexes, part 2, Dougal's summary of R trees and R* trees
  Sep 16: Memory management

Week 4 (Week 4 readings)
  Sep 21: Join processing, Emanne's summary of Shapiro
  Sep 23: Query optimization, Justin's summary of Selinger et al.

Week 5 (Week 5 readings)
  Sep 28: Transactions and concurrency control, Ryan's summary of R&G Ch. 16 [Project pre-proposal meetings]
  Sep 30: Optimistic concurrency control, Ross's summary of Kung and Robinson

Week 6 (Week 6 readings)
  Oct 05: Crashes and recovery, Ashley's summary of the recovery process
  Oct 07: Distributed transactions [Full project proposals due]

Fall Break
  Oct 12 and Oct 14: no class

Week 7 (Week 7 readings)
  Oct 19: Replication
  Oct 21: Consensus

Week 8 (Week 8 readings)
  Oct 26: Column-oriented DBMSs, Emanne's summary of C-Store
  Oct 28: H-Store, Max's summary of H-Store

Week 9 (Week 9 readings)
  Nov 02: PNUTS
  Nov 04: Progress presentations [Last day to declare CR/NC or W (Nov 05)]

Week 10 (Week 10 readings)
  Nov 09: Progress presentations, continued
  Nov 11: Bigtable

Week 11 (no readings)
  Nov 16: Streaming DBs
  Nov 18: No class

Week 12
  Nov 23: The CAP theorem [Paper draft due]
  Nov 25

Week 13 (Week 13 readings)
  Nov 30: Staged software systems
  Dec 02: Pig Latin [Peer feedback due]

Week 14 (no readings)
  Dec 07: TBD [Final presentations (Dec 10); Final paper due (Dec 16)]