CS 44: Database Systems, Fall 2007

Professor: Tia Newhall
Semester: Fall 2007
Time: 9:30-10:20 MWF
Location: Science Center 264

Office hours: W 2-4 and by appointment (249 Sci Ctr)


Final Exam: Friday, Dec. 21, 9am-12pm, room 264 Sci Ctr

Course Description

This course is an introduction to the theory, design, and implementation of relational database management systems (DBMS). We will examine DBMSs from two perspectives, that of a relational DBMS implementor (the internal view), and that of a DBMS system administrator or DBMS user (the external view). Knowing both perspectives will give you a better understanding of how a relational DBMS works, and how to most effectively make use of one either as a user or administrator. Specific topics we will cover include data models, data organization, query languages (SQL and others), indexing, relational operator implementation, query optimization, concurrency control, physical database design, transaction management, crash recovery, and security. Time permitting, we will briefly discuss some advanced topics such as object relational DBMS, parallel and distributed databases, or data mining.

A major component of this course is a project that involves implementing and testing components of a relational database management system. By the end of the semester you will have implemented most parts of a simple relational DBMS. The course project will be in C++. You do not need to know C++ to take this class, but you should know C. If you do not know C, or if it has been a while since you've programmed in C, I encourage you to look at some C programming references before the start of the semester.

Prerequisites: CPSC 035 and experience in C or C++ (usually satisfied by completing CPSC 025 or 033). CPSC 025 or CPSC 033 is recommended.

Required Text

Database Management Systems (Third Edition, 2002)
by Raghu Ramakrishnan and Johannes Gehrke.


Schedule

The following schedule is tentative; it will be updated as we go along.
I highly recommend doing the week's reading prior to the first class meeting of that week.
Week | Week of | Announcements | Topic | Reading and assignments
1 | Sep 03 | Wed class in room 252 (Sep 05) | Introduction to Databases and C++ | Chapt. 1; C++ tutorial; Project #1
2 | Sep 10 | add/drop ends (Sep 14) | Data Storage | Chapt. 8.1, 8.2, 9; Project #2
3 | Sep 17 | | ER Model | Chapt. 2; Written HW #1
4 | Sep 24 | | Relational Model | Chapt. 3; Written HW #2
5 | Oct 01 | | Relational Model, Indexing | Chapt. 8 (skip 8.5), 10; Project #3; Written HW #3
6 | Oct 08 | | Indexing | Chapt. 10, 11.1-11.2
  | Oct 15 | Fall Break | |
7 | Oct 22 | | Indexing (cont.) | Chapt. 11
8 | Oct 29 | | Relational Algebra | Chapt. 4.1-4.2; Project #4; Written HW #4
  | Oct 30 | Midterm Tues 7-9pm, 264 Sci Center | |
9 | Nov 05 | | SQL | Chapt. 5.1-5.7
10 | Nov 12 | no class nor office hours Wed (Nov 14) | SQL, External Sorting and Query Evaluation | Chapt. 13, 12.1-12.3
11 | Nov 19 | Thanksgiving break (Nov 23) | Query Evaluation | Chapt. 12.4-12.6; Written HW #5
12 | Nov 26 | | Query Optimization | Chapt. 14; Project #5
13 | Dec 03 | | Transaction Management and Crash Recovery | Chapt. 16, 17
14 | Dec 10 | | Crash Recovery, Advanced Topics | Chapt. 17
  | Dec 21 | Final Exam (9am-12pm), room 264 Sci Center | |


Grading

Grades will be weighted as follows:
20% Midterm Exam
25% Final Exam
50% Homework and project assignments
5% Class Participation

Programming Projects

All programming projects will be done in the C++ programming language. There will be one small programming assignment designed to help you become familiar with C++. This will count for up to 2% of your final grade, and will be done individually. Subsequent projects will count for a significantly larger part of your grade and should be done with a partner.

I will provide a brief overview of the language during the first week of class. However, you are responsible for learning C++ on your own. If you know C and Java, much of the C++ language can easily be learned from a good language reference and/or an on-line C++ tutorial (see my C++ help links). There are many advanced features of the C++ language that you do not need to know to implement the project; we will use just the basic object-oriented features of C++. Most of your effort will involve learning C++'s syntax for defining classes and method functions, and remembering that you, the programmer, are responsible for specifying your program's memory management (allocation, initialization, bounds checks, deallocation, etc.).
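As a rough illustration of the kind of C++ described above, here is a minimal sketch of a class that manages its own memory. The class name IntBuffer and its methods are hypothetical; they are not part of the course project code, and the project may handle errors differently (this sketch assumes exceptions are acceptable for bounds checks).

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical example class, not taken from the course project.
// It owns a heap-allocated array of ints and is responsible for
// allocating, initializing, bounds checking, and deallocating it.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size)
        : size_(size), data_(new int[size]) {       // allocation
        for (std::size_t i = 0; i < size_; ++i) {   // initialization
            data_[i] = 0;
        }
    }

    ~IntBuffer() { delete[] data_; }                // deallocation

    void set(std::size_t i, int value) {
        checkIndex(i);                              // bounds check
        data_[i] = value;
    }

    int get(std::size_t i) const {
        checkIndex(i);                              // bounds check
        return data_[i];
    }

    std::size_t size() const { return size_; }

private:
    // Copying is disabled (declared but never defined) so that two
    // objects can never end up deleting the same array.
    IntBuffer(const IntBuffer&);
    IntBuffer& operator=(const IntBuffer&);

    void checkIndex(std::size_t i) const {
        if (i >= size_) {
            throw std::out_of_range("IntBuffer index out of range");
        }
    }

    std::size_t size_;
    int* data_;
};
```

Unlike Java, nothing here is garbage collected: if the destructor did not call delete[], the array would leak, and without checkIndex an out-of-range index would silently read or write memory the object does not own.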

There are several C++ language references that are available for you to use while you are working in the CS Lab. If you want to buy your own, here are some recommendations (note the second one is free):

About the CS Lab

The CS Lab (room 240) and the overflow lab (room 238) are open 24 hours a day for CS students to work on their course assignments. When the CS Lab is in use by a class, you should work in the overflow lab. If you do not already have one, you can get a MicroProx tag from Jeff Knerr. The MicroProx tag will allow you entry to the CS Lab and the Science Center after building hours (use the door between Martin and Cornell to get into the building after midnight).

Please read through the "Computer Lab Rules" under the "Introductions" section of the CS help pages.

Collaboration Tools

You should work with a partner on programming projects. To set up your development environment so that you and your partner can access your joint project files while still protecting them from others, see the following: Safe File Sharing for Group Projects.

In addition, I recommend using some type of revision control software to help you and your partner coordinate shared access to project files and to give you a way to back up working versions of your code. I'd recommend using CVS. However, RCS is also available on our systems. See the Class Resources section for more information.

Late Work Policy

Written homework assignments are due at the beginning of class on the due date. Late written homework assignments will not be accepted. However, it is still to your advantage to do written assignments even if you do not turn them in; they help to reinforce your understanding of the lecture material and they are often typical of the types of questions you may see on exams.

Programming assignments will be submitted on-line using cs44handin. You can submit the same assignment multiple times up to the due date using cs44handin. Once you have submitted your solution to an assignment, make sure to keep a copy of it that you will not modify after you submit it (this way if something goes wrong with cs44handin I can use the dates of your solution files to determine when you submitted your solution).

You and your partner are allowed to use up to 3 late days this semester for turning in programming assignments. However, you may use at most 2 late days on any single assignment. One late day means the assignment is turned in before the original due time on the next day that class meets. For example, if the original assignment is due Wednesday before 2am, then a submission made after Wednesday at 2am but before Friday at 2am is one day late.

Use late days wisely; once you have used up your 3 days I reserve the right to not accept any further late assignments from you, and if I do accept further late assignments from you, you will receive a significant late penalty.
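To make the late-day counting concrete, here is a small sketch of how the rule above could be computed. It is only an illustration: the function names are hypothetical, it is not how cs44handin works, and it assumes that class meets Monday, Wednesday, and Friday and that a submission arriving exactly at a cutoff still counts for the earlier late day.

```cpp
// Hypothetical helpers illustrating the late-day rule; not part of cs44handin.

// dueDay: day of the week of the original deadline (0 = Sunday ... 6 = Saturday).
// hoursAfterDeadline: hours between the original deadline and the submission.
// Returns the number of late days used (0 if the submission is on time).
// Each late day extends the cutoff to the same time of day on the next
// class meeting day (assumed here to be Monday, Wednesday, or Friday).
int lateDaysUsed(int dueDay, double hoursAfterDeadline) {
    int lateDays = 0;
    int day = dueDay;
    double cutoff = 0.0;  // hours from the original deadline to the current cutoff
    while (hoursAfterDeadline > cutoff) {
        do {  // advance to the next class meeting day
            day = (day + 1) % 7;
            cutoff += 24.0;
        } while (day != 1 && day != 3 && day != 5);
        ++lateDays;
    }
    return lateDays;
}

// Returns true if using `needed` late days on one assignment stays within
// both limits: at most 2 per assignment and at most 3 for the semester.
bool canUseLateDays(int alreadyUsedThisSemester, int needed) {
    const int kMaxPerAssignment = 2;
    const int kMaxPerSemester = 3;
    return needed <= kMaxPerAssignment &&
           alreadyUsedThisSemester + needed <= kMaxPerSemester;
}
```

With a Wednesday deadline (dueDay = 3), a submission 10 hours late uses one late day, and one 50 hours late (after the Friday cutoff but before Monday's) uses two. A Friday deadline gives a longer first window, since the next class meeting is Monday.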

Academic Integrity

Under no circumstances may you hand in work done with (or by) someone else under your own name. Your code should never be shared with anyone; you may not examine or use code belonging to someone else, nor may you let anyone else look at or make a copy of your code. This includes sharing solutions after the due date of the assignment. Failure to abide by these rules constitutes academic dishonesty and will lead to a hearing of the College Judiciary Committee. According to the Faculty Handbook: "Because plagiarism is considered to be so serious a transgression, it is the opinion of the faculty that for the first offense, failure in the course and, as appropriate, suspension for a semester or deprivation of the degree in that year is suitable; for a second offense, the penalty should normally be expulsion."

Discussing ideas and approaches to problems with others on a general level is fine (in fact, we encourage you to discuss general strategies with each other), but you should never read anyone else's code or let anyone else read your code. If you are in doubt about some help that you received, then credit the person(s) from whom you got help by citing them in a comment at the top of the file and discuss the situation with your instructor.

For this class you may discuss the programming assignments and general solution strategies with any class member. You may help each other with understanding Minibase code that we give you, and you may help each other with general C++ language problems, and with general compiling and linking problems. It is discussing the details of your project and homework solutions and coding strategies that you need to avoid. You may not share your own code with anyone but your partner; do not look at anyone else's code and do not let anyone else look at your code.