1. Due Date

Complete lab: Due by 11:59 p.m., Friday, November 20, 2020.

Checkpoint: Wednesday, November 11 by noon.

Your lab partner for Lab 4 is listed here: Lab 4 lab partners

Our guidelines for working with partners: working with partners, etiquette and expectations

2. Overview

In this lab you will implement the SwatDB HeapFile class for storing pages of variable-sized records. This is a part of the File Management layer of SwatDB that implements the abstraction of a heap file (an unordered collection of records). In the previous lab, you implemented a HeapPage for storing one page of records in a heap file. For this lab you will implement an entire heap file. This implementation of a heap file organizes pages of records in a single linked list of HeapPage pages.

The primary goal of the SwatDB lab assignments is to gain an understanding of the details of how a relational DBMS works by implementing and testing parts of a relational DBMS. The SwatDB code base is quite extensive and will require close reading of its documentation (see Section 3).

2.1. Lab Goals

The main goals of the SwatDB Heap File Lab are:

  • Understanding the structure of a DBMS heap file, and details of implementing its interface.

  • Understanding how the file layer interacts with the buffer manager and disk manager layers of a DBMS.

  • Further practice with manipulating low-level data structures in C++, and mapping types onto raw bytes of memory.

  • Developing a thorough testing methodology for a large system.

  • Understanding the role of abstraction in large systems.

  • Practice working with part of a large code base, much of which you have access to only through its interface definition (i.e., .h files and generated interface documentation).

2.2. Starting Point Code

Find your git repo for this lab assignment off the GitHub server for our class: cs44-f20

Here are some detailed instructions on using git for CS44 labs.

Clone your git repo (Lab4-userID1-userID2) containing starting point files into your labs directory:

cd ~/cs44/labs
git clone [the ssh url to your repo]
cd Lab4-userID1-userID2

If all was successful, you should see the following files (highlighted files require modification):

Lab 4 Files

  • Makefile - pre-defined. You may edit this file to add extra targets. Here are a few of the most common commands that you will use:

    make
    make clean   # rm all built files and any DB files created in test code
    make gcov     # build a gcov version of unittests (gcovunit)
  • README.md - some directions about how to compile and run test programs for the HeapFile.

  • heapfile.h - the SwatDB HeapFile class, and related struct definitions. Do not add any new data members to these classes or public methods. You can add private helper methods for good modular code design.

  • heapfilescanner.h - the SwatDB HeapFileScanner class. Defines an object for scanning (or iterating) over all records in a HeapFile. Its main method is getNext. This class may be useful in your testing code.

  • heapfile.cpp - the SwatDB HeapFile class implementation. Most of the code you implement will be in this file. All methods defined in heapfile.h should be implemented here. Make sure to include good function comments in this file too (function comments in .h files alone are not sufficient). For any missing functions, start by copying and pasting function comments from the .h file into here. Then, modify the comments to provide information to a reader of the C++ implementation.

  • unittest.cpp - unit testing code for HeapFile. We are not giving you many complete unit tests with this lab. Instead, use this file as a starting point for adding more complete tests of your implementation. Use the design of the test suites in this file as an example for how to add others.

  • checkpt.cpp - unit testing code for passing the checkpoint.

  • checksaved.cpp - test code to test the persistence of heap files. This is written in the style of the sandbox.cpp program rather than using unittests.

  • sandbox.cpp - another way to test your code. This is a more application code style vs. using the unit-test infrastructure. You can use this to add your own testing code. This is meant to complement the unit tests and to help with implementation, debugging, or designing new tests.

  • runchkpt.sh, runtests.sh - scripts to run checkpoint and test programs. They include calls to the cleanup.sh script to clean up any files that are created and left behind from crashed runs.

2.3. Deliverables

The following will be evaluated for your lab grade:

  • The HeapFile class in heapfile.cpp file. This is the primary file in which you will add code. The class and struct definitions to implement are defined in the heapfile.h file.

  • The class definition in heapfile.h. Only add private helper methods to class definitions to support good modular design of your solution. Do not add public methods or data members to any classes or structs defined in this file.

  • The sandbox.cpp and unittests.cpp are two programs for testing the HeapFile implementation.

    You must add code to unittests.cpp, and you can add to both to fully test and debug your solution. Test code added should be well commented, clearly explaining the specific HeapFile functionality it tests.

  • Your Lab 4 Questionnaire to be completed individually (This will open on the due date and close after 3 days)

2.4. Running Test Code

The unit test programs and the sandbox test programs create SwatDB heap files that are stored on disk. The test code is designed to clean-up these files before it exits. However, if your code crashes these files may stick around, and they will cause subsequent runs to crash.

Your starting point code includes a clean-up script that you can run to remove these files:

./checkpt        # run the checkpoint unittests
./cleanup.sh     # clean-up any state not cleaned up from this program

If the .sh scripts do not run, make sure they are executable, and run chmod to add executable permissions if not:

ls -l *.sh
-rwx------   cleanup.sh   # should see x permission (read,write,execute)
chmod 700 cleanup.sh      # set to rwx if not
ls -l *.sh

We also have run scripts that run test code programs, and clean-up their state afterwards. These may be easier to use to run your test code:

./runchkpt.sh
./runtests.sh

You can also create your own run scripts, using ours as a starting point:

cp runtests.sh runmine.sh
chmod 700 runmine.sh         # set this file to executable
vim runmine.sh               # edit the contents to do what you want

2.5. Checkpoint

Before the checkpoint due date, you should complete the HeapFile functionality necessary to pass all the unit tests in the checkpt program, and in the checksaved program. We recommend that you run and test these using the run scripts that clean up state after crashed runs:

./runchkpt.sh

You can also run each by hand, and use the cleanup.sh script to clean up any files left over on incomplete runs (these files end in .rel or .db):

./checkpt        # run the checkpoint unittests
./cleanup.sh     # clean-up any state not cleaned up from this program
./checksaved     # run the check saved test
./cleanup.sh     # clean-up any state not cleaned up from this program

While we recommend dealing with exceptions as you implement the methods, we do not require that all exceptions are implemented for the checkpoint.

The checkpoint functionality includes:

  • Most HeapFile methods fully implemented, except for deleteRecord.

Much of the functionality of HeapFile should be implemented for the checkpoint. Because deleteRecord is not included, you do not have to handle the case of removing pages from the file, only adding new ones.

3. SwatDB

In this assignment you will implement the HeapFile class part of the File Management layer of SwatDB. Implementing a relation file requires support for creating and deleting a file, and for inserting, deleting, and updating records in the file. These operations may result in requests to the Buffer Manager layer to get, release, allocate or deallocate a Page for the file. The implementation also makes use of the HeapPage class' interface to perform operations on individual pages of records.

For information about SwatDB, including a link to its on-line code documentation, see this page:

SwatDB documentation that will be particularly helpful for this lab includes:

  • The Buffer Manager class defines the interface to the buffer manager layer of SwatDB. Note: when calling buffer manager method functions, be sure to appropriately handle or pass through any exceptions it throws.

  • The File Class defines the base class for all File objects in the system. HeapFile is derived from this class. Look at its documentation to see methods and data members that the HeapFile inherits.

  • The Page Class defines the base class for all pages in the system. Its getData method provides access to the raw page data: the PAGE_SIZE bytes of memory space for a page. No derived class of Page can add additional data members; derived classes can only map structure on top of the raw page data.

  • The Record class is a structure for storing record information. It has a data member of type Data, a simple structure that stores the actual record data.

    Many HeapFile methods are passed pointers to Records as parameters. Look at the test code examples and at the documentation to help understand how to access and set information for records inserted or retrieved from the page.

  • The HeapPage Class is used by HeapFile. You should call methods of this class to perform operations on individual pages of the file.

  • Type and constant definitions in swatdb_types.h. PageNum, PageId, RecordId, SlotId, and other types are defined here.

  • The Exceptions classes defined at the HeapFile, BufferManager and DiskManager layers in exceptions.h that HeapFile methods may need to catch or throw. Check the method function comments in heapfile.h and heapfilescanner.h to see if a particular method needs to throw an exception(s). Exceptions with BufMgr or DiskMgr suffixes are thrown by those layers and are passed through to callers of HeapFile methods.

    Look at the SwatDB documentation for the exception classes, and look at the SwatDB info page for examples of how to throw and catch exceptions.

4. Lab Details

The HeapFile class is defined in the heapfile.h header file included with the lab starting point code. Open this file in an editor to read its contents:

vim heapfile.h

NOTE: the SwatDB implementation of HeapFile is different from the version you are implementing here. Thus, we suggest you avoid using SwatDB (web) documentation and stick to reading the .h files to understand the interfaces you need to implement.

The HeapFileScanner class is defined in the heapfilescanner.h header file. It is also different from the SwatDB version, which is why we are giving you its .h file here. You do not need to implement this class for this lab assignment, but you may want to use it in your test code. You can see its interface:

vim heapfilescanner.h

4.1. Implementation Overview

You will implement a heap file that stores all of its pages of records in a single linked list of HeapPage pages. This is different from the version that we discussed in class, which used two linked lists of pages: one list of full pages and another of pages with free space. In your version, there is a single linked list of pages, some of which may be full and some of which may have space.

The records in the heap file are stored on HeapPage type pages. The next_page and prev_page fields in the HeapPageHeader of each page link pages together into a doubly-linked list of the file’s pages.

You will complete the implementations of the HeapFile class, and implement more tests in unittests.cpp to test both correctness and robust error handling. The testing part is described in more detail in the Testing your code section.

A heap file is organized as a single header page of meta-data about the heapfile and a linked list of HeapPage pages that store records. The heap file’s single header page of metadata includes:

  • head: the PageNum of the first page on the file’s linked list of HeapPages.

  • num_pages: the number of pages in the file

  • num_records: the number of records in the file

as shown in this figure:

Figure [HeapFileFig]: The Heap File has a header page with metadata about the file, including the head of its linked list of pages of records, the number of pages, and the number of records in the file. Each page is a HeapPage storing variable-sized record data.
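
To make the layout concrete, here is a minimal sketch of walking the list from the header's head to find the first page with enough free space. It is an in-memory model only, not the lab solution: PageInfo, FileModel, free_space, INVALID_PAGE_NUM, and the uint32_t underlying type are assumptions made for illustration, and a real implementation would get, pin, and release each page through the BufferManager rather than look it up in a map.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

using PageNum = std::uint32_t;  // assumed underlying type
// Assumed sentinel meaning "no page"; check swatdb_types.h for the real one.
constexpr PageNum INVALID_PAGE_NUM = std::numeric_limits<PageNum>::max();

// Hypothetical stand-in for the link fields and free space of a HeapPage.
struct PageInfo {
  PageNum prev_page;
  PageNum next_page;
  std::uint32_t free_space;  // bytes still available on the page
};

// Hypothetical stand-in for the heap file: a head plus its pages.
struct FileModel {
  PageNum head = INVALID_PAGE_NUM;
  std::map<PageNum, PageInfo> pages;
};

// Follow next_page links from the head until a page with at least
// `needed` free bytes is found, or the end of the list is reached.
PageNum firstPageWithSpace(const FileModel &f, std::uint32_t needed) {
  for (PageNum cur = f.head; cur != INVALID_PAGE_NUM;
       cur = f.pages.at(cur).next_page) {
    if (f.pages.at(cur).free_space >= needed) {
      return cur;
    }
  }
  return INVALID_PAGE_NUM;  // list exhausted: no page has room
}

// Build a three-page file whose list order is 3 -> 1 -> 2.
FileModel makeExample() {
  FileModel f;
  f.head = 3;
  f.pages[3] = {INVALID_PAGE_NUM, 1, 0};    // full page at the head
  f.pages[1] = {3, 2, 40};                  // 40 bytes free
  f.pages[2] = {1, INVALID_PAGE_NUM, 500};  // 500 bytes free
  return f;
}
```

Note that the list order (3, 1, 2) has nothing to do with page numbers, which matches the description above: pages can appear in any order in the list.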

HeapFile is derived from the File base class that defines a generic interface to all file types in the system.

Note that the HeapFile object only exists when SwatDB is running. However, its underlying set of pages persists between runs (i.e. its header page and linked list of pages of HeapPage structured page data are stored as a file on disk, managed by the DiskManager layer).

HeapFile objects are created in SwatDB in response to one of two actions:

  1. When SwatDB starts up, any DB relation files that are stored on disk are opened and a new HeapFile object is created and associated with each one,

  2. During SwatDB runtime, if a new relation file is created, a HeapFile object is created and associated with the new file.

In the test code with this lab, we provide the code needed to create HeapFile object(s) in response to both of these scenarios.

4.2. Heap File Header Page and type re-casting

The header page of the HeapFile should be allocated on a separate Page that stores only the heap file meta data. A struct HeapFileHeader is mapped onto the first bytes of the raw data of the Page. You must not add additional data members to the Page base class; any additional data members would increase the total number of bytes in the derived class, but every Page in the system must have exactly the same number of bytes (the number of bytes declared in the Page class’s data array).

Do not store any records in the header page of the heap file. The first record stored in the file should be stored on a newly allocated HeapPage.
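
The idea of mapping a struct onto raw page bytes can be sketched as follows. This is a simplified illustration, not the SwatDB code: the Page class, the PAGE_SIZE value, the PageNum type, and the exact HeapFileHeader layout shown here are stand-ins, and the real definitions are in the provided .h files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for illustration only; the real definitions live in SwatDB.
constexpr std::size_t PAGE_SIZE = 4096;  // assumed page size
using PageNum = std::uint32_t;           // assumed underlying type

// A Page is nothing more than PAGE_SIZE raw bytes.
class Page {
 public:
  char *getData() { return data; }

 private:
  alignas(8) char data[PAGE_SIZE];
};

// Hypothetical header layout with the three fields described above.
struct HeapFileHeader {
  PageNum head;
  std::uint32_t num_pages;
  std::uint32_t num_records;
};

// The overlay adds no bytes to Page; it only interprets existing ones.
static_assert(sizeof(Page) == PAGE_SIZE, "a Page is exactly PAGE_SIZE");
static_assert(sizeof(HeapFileHeader) <= PAGE_SIZE, "header must fit");

// Interpret the first bytes of the page as a HeapFileHeader.
HeapFileHeader *asHeader(Page *pg) {
  return reinterpret_cast<HeapFileHeader *>(pg->getData());
}

// Write through one overlay, then read the same bytes back through a
// second overlay of the same page. Returns the record count read back.
std::uint32_t demoHeaderOverlay() {
  Page pg;
  HeapFileHeader *hdr = asHeader(&pg);
  hdr->head = 7;
  hdr->num_pages = 1;
  hdr->num_records = 3;

  HeapFileHeader *again = asHeader(&pg);
  assert(again->head == 7);
  return again->num_records;
}
```

Because the header is just a view of the page's bytes, anything written through the pointer is what the buffer manager eventually writes to disk for that page.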

4.3. HeapFile Class

HeapFile inherits the following data fields from the File class:

  • header_id: this is the PageId of the heap file header page. You will allocate a new Page for a HeapFile’s header page, initialize it, and set this field to the new page’s PageId value.

  • buf_mgr: a pointer to the SwatDB BufferManager object that manages the buffer pool. HeapFile methods will need to make calls to allocate/deallocate/get/release file pages using BufferManager methods.

  • file_id: the file’s unique identifier in the system. This is set by the File Manager when the HeapFile object is created.

  • catalog: a pointer to the SwatDB Catalog object that contains information about all files in the system. (you do not need to use this object in your implementation, but it is used in the sandbox test code print_fids function).

  • schema: a pointer to the Schema object describing the relation stored in this file (you do not need to use this in your implementation beyond some error checking).

In heapfile.h is a struct definition that defines the header page fields. The header page stores heap file-specific metadata about the file and its structure. It is mapped onto the underlying Page data bytes.

The heap file (shown in [HeapFileFig]) is organized so that:

  • Page 0 is the file’s header page.

  • Remaining Pages are stored in a doubly linked list of HeapPage pages that store record data. Records can be inserted on any page in the file, and individual HeapPages can be in any order in the linked list (i.e. it is not sorted by PageNum or record field values).

  • Both full pages and pages that have some space are stored in any order in the linked list of pages.

The header page and the record data pages are stored on disk as a file. The DiskManager manages how they are stored on disk.

HeapFile methods will call BufferManager method functions to allocate, deallocate, get, release, and flush pages from the buffer pool to handle operations on the file. NOTE the buffer manager flushPage method should not be called on regular heapfile operations like inserting or removing a record from the file. This method forces a page to be written to disk, which is useful for rare DB operations, but for most operations the higher levels should just let the buffer manager manage the buffer pool, and let its replacement policy decide when a page in the buffer pool gets written out to disk.

HeapFile methods will call HeapPage methods to insert/delete/update records on individual pages of the file, and to set each page’s prev_page and next_page fields to link pages together in the single, doubly linked list of pages of record data.

4.3.1. Type recasting

You will need to use type-recasting in a few ways in this assignment.

  1. The BufferManager methods return Page *, but your code needs to manipulate the pages as specific types. You can re-cast return values as a pointer to the specific type to do this. For example:

    HeapPage *pg;
    ...
    // recast the return type of getPage to (HeapPage *)
    pg = (HeapPage *)buf_mgr->getPage(pg_id);
  2. We do not give you methods that recast the raw page data of the header page to a HeapFileHeader * in a similar way that we did with the HeapPage lab. However, we suggest that you add one as a private helper method function:

    private:
      /**
       * @return address of  header page recast as
       *         a (HeapFileHeader *)
       * TODO: what are the pre and post conditions?
       * TODO: does this throw any exceptions (or pass any through)?
       */
      HeapFileHeader *_getHeaderPage();

    We recommend that you define and implement this as the first method, and use it throughout your code. Look at the HeapPage lab starting point code for examples of similar methods.
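
As one way to think about the TODO questions in that comment, here is a sketch of such a helper built on a mock buffer manager. Everything here is a stand-in: MockBufferManager, MockHeapFile, the simplified PageId, and the releasePage(id, dirty) signature are assumptions for illustration, so check the real BufferManager interface before relying on any of it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

constexpr std::size_t PAGE_SIZE = 4096;  // assumed page size
using PageId = std::uint32_t;            // simplified stand-in for PageId

struct Page {
  alignas(8) char data[PAGE_SIZE];
  char *getData() { return data; }
};

struct HeapFileHeader {  // hypothetical layout
  std::uint32_t head, num_pages, num_records;
};

// Mock buffer manager: hands out Page pointers and counts pins.
class MockBufferManager {
 public:
  Page *getPage(PageId id) { ++pins[id]; return &frames[id]; }
  void releasePage(PageId id, bool /*dirty*/) { --pins[id]; }
  int pinCount(PageId id) { return pins[id]; }

 private:
  std::map<PageId, Page> frames;  // zero-filled on first access
  std::map<PageId, int> pins;
};

class MockHeapFile {
 public:
  MockHeapFile(MockBufferManager *bm, PageId hdr)
      : buf_mgr(bm), header_id(hdr) {}

  // Read-only use of the header: release it as not dirty.
  std::uint32_t getNumRecords() {
    HeapFileHeader *hdr = _getHeaderPage();
    std::uint32_t n = hdr->num_records;
    buf_mgr->releasePage(header_id, false);
    return n;
  }

  // Modifying use of the header: release it as dirty.
  void addOneRecord() {
    HeapFileHeader *hdr = _getHeaderPage();
    hdr->num_records++;
    buf_mgr->releasePage(header_id, true);
  }

 private:
  // Post-condition: the header page is pinned; the caller must release it.
  HeapFileHeader *_getHeaderPage() {
    Page *pg = buf_mgr->getPage(header_id);
    return reinterpret_cast<HeapFileHeader *>(pg->getData());
  }

  MockBufferManager *buf_mgr;
  PageId header_id;
};
```

The design point is the post-condition: a helper that pins the header page hands the responsibility for unpinning it to every caller.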

4.3.2. HeapFile interface methods

These methods are implemented for you (look at their implementation for some hints at how to call Buffer Manager layer methods):

  • constructor: invokes File base class constructor and sets header page to INVALID_PAGE_ID.

  • createHeader: allocates and initializes a new header page for the heap file. This method is only invoked when a new heap file is created in the system vs. when an existing heap file is loaded from disk. You do not need to create a header page in your HeapFile code: the SwatDB FileManager creates a HeapFile object, and it sets the HeapFile’s header page or calls this method to create one.

  • flushHeader: flushes the header page to disk.

The following are HeapFile interface methods you need to implement (you may also want to add some private helper method functions too). NOTE: most of these methods call BufferManager and HeapPage interface method functions to accomplish some of the listed subtasks. Be sure you review the interfaces of those two classes before beginning implementation.

  • insertRecord: inserts a record into the file and returns the RecordId of the inserted record. HeapFile meta data is updated to reflect that the record has been inserted, or an error exception is thrown if the insert fails.

    The passed record is inserted into the first page on the doubly linked list of pages that has enough free space to store it. If there are no existing pages that have enough space to store the record, then a new page is allocated for the file, added to the head of the linked list of pages, and the record is stored on that page.

    Each Page that is accessed to handle an insertRecord (including the header page and all HeapPages) must first be brought into the buffer pool through the BufferManager. To access a Page, it needs to be pinned (through a call to either the getPage or allocatePage BufferManager method), and it should be unpinned (via a call to the releasePage BufferManager method) when the method is done using the page. The dirty bit should be set appropriately for all pages that are modified by an insert. Any page pinned by this method must be unpinned before the method returns (or before it throws an exception). You will need to keep careful track of when a page is still pinned and when it is no longer needed. If an exception occurs between those two points, unpin the page before throwing or rethrowing the exception.

    This method throws exceptions on errors, some of which may come from calls to HeapPage methods or to BufferManager methods that are thrown by the buffer manager or disk manager layer. Look at the function comments for more specific information about these. Any time you access either the HeapPage or BufferManager methods, be sure to check the documentation to make sure you are aware of potential exceptions you may need to handle.

    Here are the main steps of this method (note: calls to BufferManager layer to getPage, allocatePage and releasePage need to be made in implementing some of these steps):

    1. Start the implementation of insertRecord by checking for some error conditions and throwing exceptions when the checks fail.

      1. If the passed record is too big, throw an InsufficientSpaceHeapPage exception. HINT: MAX_RECORD_SIZE is the largest record a HeapPage can store.

      2. The passed record schema must match this file’s schema. This just requires comparing the two Schema pointer values for equality. Throw an InvalidSchemaHeapFile exception if they do not match. Examine the class data members and the Record interface to access the schemas.

    2. Search for a page in the doubly linked list that has space to insert the record. There are two main cases:

      1. If such a page is found, insert the record on the page (call insertRecord on the HeapPage). Think carefully about pinning/ unpinning and handling exceptions.

      2. If the list is exhausted with no suitable candidate, insertRecord will allocate a new HeapPage and add it at the beginning of the linked list. Insert the record into this new HeapPage.

        Remember that a call to the BufferManager allocatePage method just returns a pointer to a Page of the buffer pool (a pointer to a Page-size chunk of memory data). The buffer manager has no idea what type of page the caller wants to map on top of the allocated Page of memory space (it could be an index page, a HeapPage, a header page, and so on). If you want to use the Page of buffer pool space to store particular state (HeapPage data), you need to initialize that Page of memory space to the appropriate values before you start accessing its contents and interpreting their values as having heap page meaning (the Page returned by the buffer manager is just a page of garbage values in memory). See the HeapPage interface for any useful methods for this.

        Again, you will need to carefully think about pinning/unpinning/handling exceptions. Also examine [HeapFileFig] to identify all of the various updates you need to make to insert a new HeapPage at the beginning of a doubly-linked list.

        Note: A HeapFile can only store up to MAX_PAGE_NUM pages of data; throw InsufficientSpaceFilePage exception if adding a new page exceeds this amount.

    3. Update HeapFile header information with results of the insert.

  • getRecord: given a RecordId value and a passed Record to fill, this method constructs a PageId from the passed RecordId and from the file’s FileId value, requests the page from the BufferManager (via a call to the getPage method), and calls the HeapPage getRecord method to copy the record data from the heap page into the passed Record *. The page must be unpinned from the buffer pool (via a call to the releasePage BufferManager method) before this function returns or throws an exception.

    This method throws exceptions on errors, some of which may come from calls to HeapPage methods or to BufferManager methods that are thrown by the buffer manager or disk manager layer. Look at the function comments for more specific information about them.

  • updateRecord: updates an existing record to its new passed value. Note: an updated record must retain its RecordId value. This means that an update cannot move a record from one page of the heap file to another.

    Like other HeapFile methods, any pages of the file that are accessed by the method need to be pinned before they are accessed (via getPage/allocatePage) and should be unpinned (via releasePage) when the method is done using them. All pages pinned by this method must be unpinned before the method returns (or before it throws an exception). This method throws exceptions, some of which may come from calls to HeapPage or BufferManager methods.

  • deleteRecord: deletes a record, given its RecordId, from the file. This method could shrink the heap file by one page.

    Here are some of the big steps deleteRecord must take (you need to refine these):

    1. Get the page corresponding to this record from the buffer manager.

    2. Determine if the RecordId is valid.

    3. Delete the record from the page (using the correct HeapPage method)

    4. Determine if the page containing the deleted record can be removed from the file and remove it from the linked list of pages

    5. Update HeapFile metadata in the header to reflect a successful record delete.

    Like other methods, all pages accessed need to be pinned in the buffer pool and all pages pinned by this method need to be unpinned when the method returns (or throws an exception). It may throw exceptions, which may result from its calls to BufferManager or HeapPage method functions.
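
The pointer updates needed to add a page at the head of the doubly linked list (for insertRecord) and to remove a page from anywhere in it (for deleteRecord) can be sketched on an in-memory model. This is only a model of the link bookkeeping: Links, FileModel, addPageAtHead, removePage, and INVALID_PAGE_NUM are hypothetical names, and in your HeapFile code every page whose fields change must be pinned, marked dirty, and released through the BufferManager.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

using PageNum = std::uint32_t;  // assumed underlying type
// Assumed sentinel meaning "no page"; check swatdb_types.h for the real one.
constexpr PageNum INVALID_PAGE_NUM = std::numeric_limits<PageNum>::max();

// Hypothetical stand-in for the link fields in a HeapPageHeader.
struct Links {
  PageNum prev_page = INVALID_PAGE_NUM;
  PageNum next_page = INVALID_PAGE_NUM;
};

// Hypothetical stand-in for the file header plus its pages.
struct FileModel {
  PageNum head = INVALID_PAGE_NUM;
  std::uint32_t num_pages = 0;
  std::map<PageNum, Links> pages;
};

// Add page `pg` at the head: three places change (the new page, the
// old head if there is one, and the file header).
void addPageAtHead(FileModel &f, PageNum pg) {
  Links l;
  l.next_page = f.head;  // new page points at the old head
  if (f.head != INVALID_PAGE_NUM) {
    f.pages[f.head].prev_page = pg;  // old head points back at new page
  }
  f.pages[pg] = l;
  f.head = pg;
  f.num_pages++;
}

// Remove page `pg`: its neighbors (or the header, if it was the head)
// are re-linked around it.
void removePage(FileModel &f, PageNum pg) {
  Links l = f.pages.at(pg);
  if (l.prev_page != INVALID_PAGE_NUM) {
    f.pages[l.prev_page].next_page = l.next_page;
  } else {
    f.head = l.next_page;  // removing the head page
  }
  if (l.next_page != INVALID_PAGE_NUM) {
    f.pages[l.next_page].prev_page = l.prev_page;
  }
  f.pages.erase(pg);
  f.num_pages--;
}
```

Counting the pages touched in each case (up to three for a removal, plus the header) is a useful way to check that your implementation pins and unpins everything it modifies.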

4.3.3. HeapFile debugging methods

There are several methods already implemented that you can use for debugging purposes only. See the sandbox.cpp file for some examples of their use. You can also call these from gdb or use them in debugging printouts in unittest.cpp.

Search for THIS METHOD IS FOR DEBUGGING ONLY in the comments to find them in the file.

4.4. Exceptions

For information about how to throw and catch different SwatDB exception objects, look at the Exceptions Section of the SwatDB information page. You will need to think carefully about the cause of these exceptions in order to know how to handle them. Use the method function comments @throw as a guide for which exceptions methods may need to throw or to pass through. These comments will also help you think about some of the error conditions your code may need to check for and handle appropriately.

For this lab, the Buffer Manager and Disk Manager layers may throw exceptions in response to calls from HeapFile method functions. Your code needs to determine if it needs to catch the exception or let it pass through.

Remember that if a method throws (or passes through) an exception, it should clean-up any internal state it has partially modified. For this lab, this may include unpinning pages that the method has pinned prior to throwing (or re-throwing) an exception.
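
One common pattern for this clean-up is to catch, unpin, and rethrow. The sketch below shows the pattern with a mock buffer manager and a standard library exception; MockBufferManager, pageInsert, insertOnPage, and the 100-byte limit are made up for illustration, and the real SwatDB exception classes and BufferManager signatures will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>

using PageId = std::uint32_t;  // simplified stand-in for PageId

// Mock buffer manager that only tracks pin counts.
class MockBufferManager {
 public:
  void getPage(PageId id) { ++pins[id]; }
  void releasePage(PageId id) { --pins[id]; }
  int pinCount(PageId id) const {
    auto it = pins.find(id);
    return it == pins.end() ? 0 : it->second;
  }

 private:
  std::map<PageId, int> pins;
};

// Hypothetical page-level operation that fails for oversized input.
void pageInsert(std::size_t size) {
  if (size > 100) {
    throw std::length_error("record too large for page");
  }
}

// Pin the page, attempt the operation, and make sure the page is
// unpinned on both the normal path and the exception path.
void insertOnPage(MockBufferManager &bm, PageId id, std::size_t size) {
  bm.getPage(id);  // pin
  try {
    pageInsert(size);
  } catch (...) {
    bm.releasePage(id);  // unpin before the exception leaves this method
    throw;               // pass the same exception through to the caller
  }
  bm.releasePage(id);  // normal path: unpin when done
}

// Returns true if a failing insert throws and leaves no page pinned.
bool insertFailsCleanly(MockBufferManager &bm, PageId id) {
  try {
    insertOnPage(bm, id, 500);
  } catch (const std::length_error &) {
    return bm.pinCount(id) == 0;
  }
  return false;  // no exception was thrown
}
```

A test in this style (trigger the error, then check that nothing is left pinned) is a good template for the exception tests you add to your unit tests.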

5. Lab Requirements

In addition to completing the implementation of the HeapFile class, and adding many more tests to unittest.cpp, you should also:

  • Declare and use variables of the types defined in swatdb_types.h as opposed to their underlying type definition. Also use constants and enum types defined in this file - they help make the code more readable. For example, if a method returns a FrameId, declare a variable of type FrameId rather than std::uint32_t or int to store its return value:

    FrameId frame_num;
    PageId  pg_id;
    
    //get returns a FrameId, storing it as an int compiles but loses valuable
    // information about the purpose of frame_num
    frame_num = buf_map->get(pg_id);
  • Write good C++ code design, and good modular design in your solution. This includes using defined constants and types.

  • Ensure your code is robust to errors. In particular, be sure to test error handling for exceptions that could be thrown by the buffer manager or disk manager layers, and determine if they need to be caught or passed through.

  • Ensure your code is free of valgrind errors.

  • Make sure your code is well-commented, and there is no line wrapping. (See our C++ Style guide link from the Handy Links section).

  • Your submitted code should have all of our TODO comments removed…​as you implement a TODO, remove it. These are also helpful to find parts of the given code that you need to implement.

6. Testing your code

There are four test files in the starting point code:

  • checkpt.cpp: unit tests for the checkpoint

    # run all of the checkpt test suites
    ./checkpt
    # or you can run individual test suites alone using -s testSuiteName
    ./checkpt -s X    # run just the X test suite
    # to list the test suites names run with -h
    ./checkpt -h

    You can also use the runscripts to run (and cleanup) checkpt and checksaved tests:

    ./runchkpt.sh

    See Section 2.4 for more information about running test code and scripts.

  • checksaved.cpp: checks the persistence of your heapfile operations. Any modifications to the HeapFile, like inserting records, should persist between runs of SwatDB. This test checks that your solution is implemented so that changes to the HeapFile in memory are translated to changes on disk. This is accomplished by your solution calling BufferManager methods correctly.

  • sandbox.cpp: some heapfile test code written in a more programmatic way than the unittest framework.

  • unittest.cpp: the start of a set of unit tests for the heap file. You are required as part of this lab to develop more unit tests that you must add in this file.

You may use any or all of these to start your HeapFile implementation, but you will need to use all of them to verify your program is working correctly. sandbox.cpp, checkpt.cpp, and checksaved.cpp are useful for testing early on.

6.1. Required Unit Tests

The unittest.cpp implements some Heap File test code using the unit tests framework from CS35 (as does checkpt.cpp). You must add additional tests to unittest.cpp by following the code examples in this file to help you structure your code.

unittest.cpp contains a very incomplete set of test functions. Use this file to add tests beyond those tested by the checkpt.cpp. Here are some to consider:

  1. tests for exceptions

  2. tests that consider boundary testing or stress testing each main operation: insert, delete, update

  3. tests that stress test different cases of combinations of operations.

  4. tests that produce file sizes that are larger than the size of memory (the file has more total pages than the buffer pool).
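
For stress tests like these, it helps to generate record contents deterministically so that a test can insert thousands of variable-sized records and later verify each one without storing a copy. The helpers below are one possible sketch; makePayload, payloadMatches, and totalBytes are hypothetical names that are not part of the starting point code, and you would still need to copy each payload into a Record using the Record interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build the payload for record number i. Its length cycles through
// 1..max_len and its fill character depends on i, so neighboring
// records differ in both size and content.
std::string makePayload(std::size_t i, std::size_t max_len) {
  std::size_t len = (i % max_len) + 1;
  char fill = static_cast<char>('a' + (i % 26));
  return std::string(len, fill);
}

// Check that the bytes read back for record i match what was generated.
bool payloadMatches(std::size_t i, std::size_t max_len,
                    const std::string &got) {
  return got == makePayload(i, max_len);
}

// Total payload bytes for the first n records, which gives a rough idea
// of how many pages a test will fill (ignoring per-page overhead).
std::size_t totalBytes(std::size_t n, std::size_t max_len) {
  std::size_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += makePayload(i, max_len).size();
  }
  return total;
}
```

Comparing totalBytes against the buffer pool size times the page size is one way to choose n large enough that the file cannot fit in memory.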

unittest.cpp test program

In unittest.cpp, you can add additional tests to each test suite. We also have two empty test suites into which you can add your test code (you don’t have to use these, but they are here for you to use and as example syntax if you want to add more test suites):

/*
 * An empty SUITE for you to add some heapfile tests.
 */
SUITE(studentTests1){
  //TODO: add tests
}
/*
 * A second empty SUITE to add additional tests.
 */
SUITE(studentTests2){
  //TODO: add tests
}

You are welcome to add additional test suites beyond these two, and it will make your testing easier if you do. Follow the same structure as these and the other test SUITES in this file to do so.

sandbox test program

sandbox.cpp is a more programmatic way of designing test code. It includes an example of calling some of the HeapFile debugging functions that are already implemented for you in the starting point of heapfile.cpp. Read the code in this file to understand what it is doing, and add your own tests that follow this model to test a sequence of operations on your heap file.

You may also want to look at checksaved.cpp, which is a test program written in the same style as sandbox.cpp. It has some helper functions for inserting and checking records that you may want to copy into sandbox.cpp and use, or use as an example of similar helper functions you may want to add to sandbox.cpp.

cleaning up corrupted files

SwatDB maintains several files to allow for persistent storage of data. While that is not a central feature of this lab, a consequence of a program crash is that temporary files do not get properly cleaned up, since the DBMS did not shut down cleanly. This can cause problems when you try to rerun the program, so you will need to clean up these files. Use either of the two options below:

./cleanup.sh

make clean     # make clean also runs cleanup.sh; then just rebuild
make

Also see Section 2.4 for how to run using the runscripts, which call the cleanup script for you.

7. Tips and Hints

The following are some tips to help you implement the HeapFile:

  • Run checkpt and unittest often to see which tests you are passing and which fail. This will help you find missing functionality in your code, and some missing cases. These tests are incomplete, and part of this assignment is designing and writing more unit tests to test your implementation more thoroughly.

  • Implement and test incrementally. Use the checkpt.cpp tests as a guide for what functionality to implement and test first. We suggest this implementation order (as you add post-checkpoint functionality, you should also add more tests to unittest.cpp to completely test functionality):

    1. insertRecord: just inserting some records to fill up just one Heap page of records.

      • need to allocate a new page for the first insert

      • call insertRecord on the HeapPage to insert the record on the page.

      • update the header page information to reflect a successful insert.

      • make sure all file pages are unpinned after the call to insertRecord

    2. getRecord: implement all functionality, or at least everything except possibly complete exception/error handling.

    3. updateRecord: implement all functionality, or at least everything except possibly some error handling.

      • call updateRecord on HeapPage to update.

      • make sure all file pages are unpinned after the call to updateRecord.

    4. insertRecord: add support for inserting enough records that the file spans more than one heap page of records.

      • This requires traversing the linked list of pages to find a page with enough free space to insert the record.

      • This may require adding in a new page to the linked list of file pages (a new page should be added to the front of the list), and updating appropriate next and prev fields in pages.

      • ensure that header page information is updated correctly.

      • ensure that all pages modified are unpinned before this method returns.

    5. Run the checksaved program to ensure that file changes are persistent.

    6. deleteRecord:

      • first try a delete that does not result in removing a page from the file

      • check that file header information is correct after a delete that succeeds

      • check that none of the pages accessed in deleteRecord are pinned in the buffer pool after the call.

      • next, try a delete that removes the last record on a page. The page should be removed from the linked list of pages, and deallocatePage should be called to remove it from the underlying file.
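
The linked-list bookkeeping in steps 4 and 6 can be sketched on a small in-memory model: adding a new page at the front of the list, traversing the list to find a page with enough free space, and unlinking a page that has become empty. ToyPage, ToyFile, kNoPage, and the method names are all hypothetical; the real implementation works on HeapPage objects obtained through the buffer manager and must also update the header page and unpin every page it touches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kNoPage = -1;  // hypothetical "no page" marker for next/prev/head

// Hypothetical in-memory stand-in for the list fields of a heap page.
struct ToyPage {
  int prev = kNoPage;
  int next = kNoPage;
  std::size_t free_bytes = 0;
};

// Hypothetical stand-in for a heap file: the vector index acts as the page id.
struct ToyFile {
  std::vector<ToyPage> pages;
  int head = kNoPage;

  // Creates a page and links it at the front of the list; returns its id.
  int addPageToFront(std::size_t free_bytes) {
    int id = static_cast<int>(pages.size());
    ToyPage page;
    page.next = head;  // new page points at the old front
    page.free_bytes = free_bytes;
    pages.push_back(page);
    if (head != kNoPage) pages[head].prev = id;  // old front points back
    head = id;
    return id;
  }

  // Walks the list from the front and returns the first page with at least
  // `needed` free bytes, or kNoPage if no page has room.
  int findPageWithSpace(std::size_t needed) const {
    for (int id = head; id != kNoPage; id = pages[id].next) {
      if (pages[id].free_bytes >= needed) return id;
    }
    return kNoPage;
  }

  // Unlinks page `id` from the list, fixing its neighbors' next and prev.
  void unlinkPage(int id) {
    ToyPage &page = pages[id];
    if (page.prev != kNoPage) pages[page.prev].next = page.next;
    else head = page.next;  // removing the front page moves the head
    if (page.next != kNoPage) pages[page.next].prev = page.prev;
    page.prev = kNoPage;
    page.next = kNoPage;
  }
};
```

Note the three unlink cases that a deleteRecord test should cover: removing the front page, a middle page, and the only remaining page.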

  • Refer to Wednesday in-lab code examples for C++, gdb, gcov, and valgrind reminders.

  • Remember that the & operator returns the address of its argument (this is the C-style operator, not the & used to specify reference parameter types in C++). Its argument must be an lvalue (a storage location, such as the name of a variable or an array bucket). For example, to get the address of bucket 3 in an array of ints:

    int array[20];

    you would use the & operator like this (in this example I’m assigning its value to an int * variable):

    int *ptr;
    ptr = &(array[3]);

  • Refer to the recasting part of Section 4 for some example type recasting code. Also look at the private method functions in the starting point for the HeapPage lab as another example.
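
The last two tips can be combined in one small sketch: taking the address of an array bucket with &, and recasting the start of a raw byte buffer to a struct pointer. ToyHeader, addressOfBucket, and headerOf are hypothetical names for illustration; the layout is not SwatDB's, and the sketch assumes the byte buffer is suitably aligned for the struct.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical header layout, for illustration only (not SwatDB's).
struct ToyHeader {
  std::uint32_t next_page;
  std::uint32_t free_bytes;
};

// Returns the address of bucket i of `array`, using the & operator.
int *addressOfBucket(int *array, int i) {
  return &(array[i]);
}

// Interprets the first bytes of a raw page buffer as a ToyHeader.
// Assumes page_bytes is suitably aligned for ToyHeader.
ToyHeader *headerOf(char *page_bytes) {
  return reinterpret_cast<ToyHeader *>(page_bytes);
}
```

Writing through the returned pointer changes the underlying bytes, so after `headerOf(page)->free_bytes = 48;` a second call to `headerOf(page)` sees the same value.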

8. Submitting your lab

Review the lab deliverables to ensure you have completed all of your work. Before the due date, push your solution from one of your local repos to the GitHub remote repo.

From your local repo (in your ~/cs44/labs/Lab4-userID1-userID2 subdirectory)

make clean
git add *.h *.cpp   # only add .h and .cpp files:  DO NOT DO git add *
git commit -m "my correct and well commented solution for grading"
git push

Verify that the results appear (e.g., by viewing the repository on cs44-f20). You will receive deductions for submitting code that does not run or repos with merge conflicts. Also note that the time stamp of your final submission is used to verify late days, so please do not update your repo until after the late period has ended.

If that doesn’t work, take a look at the "Troubleshooting" section of the Using git for CS44 labs and the Using git pages. At this point, you should submit the required Lab 4 Questionnaire (each lab partner must do this).

9. Handy References