CS97 - Senior Conference, Fall 2011

Data Management Systems



Class information

Room: Science Center 246
Class: Tuesday, Thursday 2:40 - 3:55pm, SCI 246
Lab: Monday 2 - 3:30 p.m., SCI 252
Professor: Charlie Garrod
Office: Science Center 255
Phone: 6071
Office hours: 10:30 - 11:30 a.m. or by appointment, or you can stop by whenever my door is open.

Welcome to CS97. This year's topic is data management systems. Our goal is to understand the fundamental problems of data management and the roles played by modern data management systems. With popular sites like Google and Facebook, GPS-enabled mobile computers in our pockets, and an increasing use of networked devices and sensors in our everyday environment, you are surely aware of the explosion in the amount of data we collect and share. In this course, we will study the question of how this data should be stored and accessed, so that the data remains valid and useful as it is updated and used by many simultaneous users.

Early in this course we will focus on traditional relational database systems, including the relational model, transactions, consistency, concurrency control, indexes, and query processing. As the semester progresses we'll increasingly consider systems with non-traditional requirements, implementations, or features, including stream databases, non-relational systems, and relaxed consistency models.

Your work in this course will consist of regular out-of-class reading and written reactions to the reading, presentation of papers in class, several lab assignments, and a longer-term research project and paper. By the end of the course you will demonstrate the following skills:

There is no required textbook for this course. Many of our readings will come from Database Management Systems by Ramakrishnan and Gehrke or from Readings in Database Systems by Hellerstein and Stonebraker. I will provide online links or copies of all required reading; you will probably need to be on the campus network to access them. A copy of Ramakrishnan and Gehrke is also on Honors reserve at the Cornell Library.


Grades will be weighted as follows:
20%  Paper reaction notes and participation
15%  Paper presentations
15%  Lab assignment(s)
50%  Course project, divided as follows:
      5%  Proposal and lit review
     10%  Checkpoints and rough draft
     10%  Reviewer comments
     10%  Project presentation
     15%  Final paper

Late policies

This course has distinct late policies for the lab assignments and for the reaction notes and project work:

Academic integrity

Strong academic integrity is expected of every student. Plagiarism, cheating, and academic dishonesty will be reported to the College Judiciary Committee and dealt with severely. You may not hand in work done by someone else as your own. You may discuss ideas and problems with others on a general level and such discussions are encouraged, but you must credit any collaborators or resources used in the completion of your assignments and projects. Please see me if there are any questions about what is permissible.


This schedule will change as needed during the semester:

Week  Date    Topic                                         Readings          Lab    Notes
1     Aug 30  Intro                                         Week 1 readings   Lab 0
      Sep 01  The relational model
2     Sep 06  Physical storage, database internals          Week 2 readings   Lab 1
      Sep 08  Access methods and indexes                                             Drop/Add ends (Sep 09)
3     Sep 13  Join processing                               Week 3 readings   Lab 2
      Sep 15  Memory management
4     Sep 20  Query optimization                            Week 4 readings   Lab 3
      Sep 22  Transactions and concurrency control, part 1
5     Sep 27  Transactions and concurrency control, part 2  Week 5 readings          Project pre-proposal meetings
      Sep 29  Crashes and recovery
6     Oct 04  Distributed transactions                      Week 6 readings
      Oct 06  No class                                                               Full project proposals due
      Oct 11  Fall Break
      Oct 13  Fall Break
7     Oct 18  Consensus                                     Week 7 readings
      Oct 20  Replication
8     Oct 25  Column-oriented DBMSs                         Week 8 readings
      Oct 27  C-Store
9     Nov 01  H-Store                                       Week 9 readings
      Nov 03  Progress presentations                                                 Last day to declare CR/NC or W (Nov 04)
10    Nov 08  Progress presentations, continued             Week 10 readings
      Nov 10  Bigtable
11    Nov 15  PNUTS                                         Week 11 readings
      Nov 17  Streaming DBs
12    Nov 22  No class                                      None                     Paper draft due (Nov 23)
      Nov 24
13    Nov 29  Staged software systems                       Week 13 readings
      Dec 01  Pig Latin
14    Dec 06  Data privacy                                  Week 14 readings         Final paper due (Dec 13)