CS44 Fall 2007

Project 3: Buffer Manager

Due Friday Oct. 26 before 1am (very late Thursday night)

I encourage you to get this done, or close to done, before fall break.

Introduction

For this assignment, you will implement the Buffer Manager layer for the Minibase database system. Your Buffer Manager will implement the Love/Hate page replacement policy, which is described below.

Buffer Manager Interface


The buffer manager interface allows a higher level program to allocate and deallocate pages on disk, to bring a disk page to the buffer pool and pin it, and to unpin a page in the buffer pool.

The methods that you have to implement are described below:

// All BufMgr functions should return OK on success or an error code 
// that you define on an error.  Return error codes via calls to 
// MINBASE_FIRST_ERROR or MINIBASE_CHAIN_ERROR. 
// You should add private data members and functions, and additional
// classes as needed.
class BufMgr {

  public: 

    // You should create enums for internal errors in the buffer manager. 
    // the error strings go in buf.C
    // look at heapfile.h for an example of how to define these
    // (the difference here is that they are defined inside class definition,
    //  which means that inside BufMgr code you can just refer to them
    //  using ERRORNAME, but outside you need to use BufMgr::ERRORNAME)
    enum bufErrCodes  { 

    };

    // The physical buffer pool of pages.
    // This is made public just because it is accessed in 
    // the test driver code. It should be private for real use.
    Page* bufPool;

  public:

    // Allocate "numbuf" pages (frames) for the pool in main memory.
    BufMgr(int numbuf, Replacer *replacer = 0)

    // Should flush all dirty pages in the pool to disk before
    // shutting down and deallocate the buffer pool in main memory
    ~BufMgr()

    // Check if this page is in buffer pool. If it is, increment the pin_count
    // and return a pointer to this page. If the pin_count was 0 before the
    // call, the page was a replacement candidate, but is no longer a candidate
    // If the page is not in the pool, choose a frame (from the set of replacement
    // candidates, based on the love/hate policy) to hold this page, read the page
    // (using the appropriate DB class method) and pin it.  Also, must 
    // write out the old page in chosen frame if it is dirty before reading 
    // new page. (just ignore the emptyPage parameter for this assignment)
    Status pinPage(PageId PageId_in_a_DB, Page*& page,int emptyPage=0);

    // hate should be TRUE if the page is hated, and FALSE otherwise.
    // Should be called with dirty==TRUE if the client has
    // modified the page. If so, this call should set the dirty bit
    // for this frame. Further, if pin_count>0 should decrement it.
    // If pin_count=0 before this call, return error.
    Status unpinPage(PageId globalPageId_in_a_DB, int dirty, int hate);

    // Find a frame in the buffer pool for the first page in the run of howmany pages
    // If a frame exists, call DB object to allocate a run of new pages and
    // and pin it. (This call allows a client of the Buffer Manager
    // to allocate pages on disk.) If buffer is full, i.e., you
    // can't find a frame to store the first page, return an error.
    Status newPage(PageId& firstPageId, Page*& firstpage,int howmany);

    // This method should be called to delete a page that is on disk.
    // This routine must call the DB class to deallocate the page.
    // Only pages with pin_count of zero can be freed
    Status freePage(PageId globalPageId);

    // Used to flush a particular page of the buffer pool to disk
    // Should call the write_page method of the DB class
		// (think about which types of pages need to be written to disk)
    Status flushPage(int pageid);


    // Flush all pages of the buffer pool to disk
    Status flushAllPages();
};

Getting Started

Create a proj3 subdirectory, and copy over the starting point files:
% mkdir cs44/proj3
% cp -r /home/newhall/public/cs44/proj3  cs44/proj3/.

Starting Point Code


Design Overview and Implementation Details

The buffer pool is a collection of frames (page-sized sequence of main memory bytes) that is managed by the Buffer Manager. It should be stored as an array bufPool[numbuf] of Page objects. In addition, you should maintain an array bufDescr[numbuf] of descriptors, one per frame. Each descriptor is a record with the following fields:

page number, pin_count, dirtybit

The pin_count field is an integer, page number is a PageId object, and dirtybit is a boolean. This describes the page that is stored in the corresponding frame. A page is identified by a page number that is generated by the DB class when the page is allocated, and is unique over all pages in the database. The PageId type is defined as an int in minirel.h.

The buffer manager will make calls to the Disk Manager layer to alloc, dealloc, read, and write pages to/from the buffer pool to/from disk. Use the minibase global variable MINIBASE_DB to access the Disk Manger object (for example, MINIBASE_DB->write_page(...) to invoke the write_page method of the disk manager). The disk manager's interface is defined in include/db.h.

A simple hash table should be used to figure out what frame a given disk page occupies. The hash table should be implemented using an array of pointers to lists of <page number, frame number pairs. The array is called the directory and each list of pairs is called a bucket. You are required to implement the hash table data structure rather than using the map class from the C++ STL. Given a page number, you should apply a hash function to find the directory entry pointing to the bucket that contains the frame number for this page, if the page is in the buffer pool. If you search the bucket and don't find a pair containing this page number, the page is not in the pool. If you find such a pair, it will tell you the frame in which the page resides. This is illustrated below:

The hash function must distribute values in the domain of the search field uniformly over the collection of buckets. If we have HTSIZE buckets, numbered 0 through HTSIZE-1, a hash function h of the form

h(value) = (a * value + b) mod HTSIZE

works well in practice. HTSIZE should be chosen to be a prime number. When a page is requested the buffer manager should do the following: Check the buffer pool (by using the hash table) to see if it contains the requested page. If the page is not in the pool, it should be brought in as follows:

The Love/Hate replacement policy

Theoretically, the best candidate page for replacement is the page that will be last requested in the future. Since implementing such policy requires a future predicting oracle, all buffer replacement policies try to approximate it one way or another. The LRU policy, for example, uses the past access pattern as an indication for the future. However, sequential flooding can ruin this scheme and MRU becomes more appropriate in this particular situation. In this assignment you are to implement the love/hate replacement policy. The policy tries to enhance prediction of the future by relying on a hint from the upper levels about the page. The upper level user gives a hint to the buffer manager that the page is loved if it is more likely that the page will be needed in the future, and is hated if it is more likely that the page will not be needed. The policy should maintain an MRU list for the hated pages and an LRU list for the loved pages. If a page is needed for replacement, the buffer manager selects from the list of hated pages first and then from the loved pages if no hated ones exist.

A situation may arise when a page is both loved and hated at the same time. This can happen if the page was pinned by two different users and then was unpinned by the first one as a hated page and by the other as a loved page. In this case, assume that "love conquers hate", meaning that once a page is indicated as loved it should remain loved.

Think carefully about how to design the data structures used to implement the replacement policy, to locate and move a page from one list to another, and to locate free frames. Your solution should not require entire scans of data structures to locate a desired element.


Testing your solution

With the starting point code is sample output from running the test routines in BMTester.C. As a first step to testing that your solution is correct, make sure that your output matches that of correct_output file.

You can edit TestDriver::runAllTests to run individual tests (right now it runs all 6 test functions). In addition, you can modify tests in BMTester.C to further test special case conditions (I will likely test your solution with more than the tests given in this file).

In addition to being correct, your solution should be well designed, robost with good error messages, and well commented. Also, all memory access errors and memory leaks should be removed (use valgrind).


What to Turn In

Create a submit subdirectory of your working directory, and copy into it all the files that I need to compile, run and test your buffer manger solution (.C, .h, and Makefile). In addition, add a README file with the following information:

  1. Your name and your partner's name
  2. The number of late days that you have used on this assignment, and that you have used so far
  3. A brief description of any features or error handling that you have not implemented in your submitted solution. In addition, if you submit an incomplete solution, it would be a good idea to tell me how to run and test your code for the parts that you did complete (you may want to submit a copy of your own BMTester.C file that contains these tests).
  4. Briefly (~1 paragraph) explain how you implemented the Love/Hate replacement policy (the data structure and operations on them) and why you implemented it the way you did (you could list complexities of the operations to support your choices).

Then create a tar file of your submit subdirectory and submit it via cs44handin before the due date.