Welcome to CS97. This year's topic is data management systems. Our goal is to understand the fundamental problems of data management and the roles played by modern data management systems. With popular sites like Google and Facebook, GPS-enabled mobile computers in our pockets, and an increasing use of networked devices and sensors in our everyday environment, you are surely aware of the explosion in the amount of data we collect and share. In this course, we will study the question of how this data should be stored and accessed, so that the data remains valid and useful as it is updated and used by many simultaneous users.
Early in this course we will focus on traditional relational database systems, including the relational model, transactions, consistency, concurrency control, indexes, and query processing. As the semester progresses we'll increasingly consider systems with non-traditional requirements, implementations, or features, including stream databases, non-relational systems, and relaxed consistency models.
Your work in this course will consist of regular out-of-class reading and written reactions to the reading, presentation of papers in class, several lab assignments, and a longer-term research project and paper. By the end of the course you will demonstrate the following skills:
There is no required textbook for this course. Many of our readings will come from Database Management Systems by Ramakrishnan and Gehrke or from Readings in Database Systems by Hellerstein and Stonebraker. I will provide online links to, or copies of, all required reading; you will probably need to be on the campus network to access them. A copy of Ramakrishnan and Gehrke is also on Honors reserve at the Cornell Library.
| 20% | Paper reaction notes and participation |
| 15% | Paper presentations |
| 15% | Lab assignment(s) |
| 50% | Course project |
This course has distinct late policies for the lab assignments and for the reaction notes and project work:
Strong academic integrity is expected of every student. Plagiarism, cheating, and academic dishonesty will be reported to the College Judiciary Committee and dealt with severely. You may not hand in work done by someone else as your own. You may discuss ideas and problems with others on a general level and such discussions are encouraged, but you must credit any collaborators or resources used in the completion of your assignments and projects. Please see me if there are any questions about what is permissible.
This schedule will change as needed during the semester:
| WEEK | DAY | ANNOUNCEMENTS | TOPIC | HOMEWORK & READING |
| 1 | Aug 30 | | Intro | Week 1 readings; Lab 0 |
| | Sep 01 | | The relational model | |
| 2 | Sep 06 | | Physical storage, database internals | Week 2 readings; Lab 1 |
| | Sep 08 | Drop/Add ends (Sep 09) | Access methods and indexes | |
| 3 | Sep 13 | | Join processing | Week 3 readings; Lab 2 |
| | Sep 15 | | Memory management | |
| 4 | Sep 20 | | Query optimization | Week 4 readings; Lab 3 |
| | Sep 22 | | Transactions and concurrency control, part 1 | |
| 5 | Sep 27 | Project pre-proposal meetings | Transactions and concurrency control, part 2 | Week 5 readings |
| | Sep 29 | | Crashes and recovery | |
| 6 | Oct 04 | | Distributed transactions | Week 6 readings |
| | Oct 06 | Full project proposals due | No class | |
| | Oct 11 | | Fall Break | |
| | Oct 13 | | Fall Break | |
| 7 | Oct 18 | | Consensus | Week 7 readings |
| | Oct 20 | | Replication | |
| 8 | Oct 25 | | Column-oriented DBMSs | Week 8 readings |
| | Oct 27 | | C-Store | |
| 9 | Nov 01 | | H-Store | Week 9 readings |
| | Nov 03 | Last day to declare CR/NC or W (Nov 04) | Progress presentations | |
| 10 | Nov 08 | | Progress presentations, continued | Week 10 readings |
| | Nov 10 | | Bigtable | |
| 11 | Nov 15 | | PNUTS | Week 11 readings |
| | Nov 17 | | Streaming DBs | |
| 12 | Nov 22 | Paper draft due (Nov 23) | No class | None |
| | Nov 24 | | Thanksgiving | |
| 13 | Nov 29 | | Staged software systems | Week 13 readings |
| | Dec 01 | | Pig Latin | |
| 14 | Dec 06 | Final paper due (Dec 13) | Data privacy | Week 14 readings |