CS44 Lab 5: Building a B+-Tree Index

Complete Solution: Due by 11:59 p.m. Tuesday November 22, 2016

Checkpoint 1: Due in Lab week 10
creating a B+Tree, printing the B+Tree, and inserting records into a B+Tree without splitting.

Checkpoint 2: Due in Lab week 11
insert with support for splitting leaf nodes. Inserting records in different orders into a B+Tree. B+Tree File save and reload (constructor for an existing B+Tree file on disk).

This leaves splitting internal nodes, scans, and developing more test code and more rigorous tests for the final week.

Quick Links

This assignment is to be done with your assigned Lab partner. You may not work with other groups, and the share of workload must be even between both partners. Failing to do either is a violation of the department Academic Integrity policy. Please read over Expectations for Working with Partners on CS Lab work

Your lab5 and beyond CS44 lab partner


Introduction

A key distinction in DBMS data structures design vs. data structures you have designed and used in CS35, is that DBMS data structures are designed to be stored in files on secondary storage (usually disk). In addition to discussing the effect of data structure design, we must also discuss each file type's particular organization and how that organization lends itself to efficient evaluation of typical operations: scan, equality search, range search, insertion, and deletion.

A DBMS index is a structure to access a relation quickly, and potentially to provide additional access methods to tuples in a relation. For this assignment, you will implement a B+ Tree in index that will store data entries in the form < key, rid > (Alternative 2 for data entries in terms of index values). The actual tuples are stored in a file separate from the index; entries in the index file "point to" the location of the tuple using a RecordId.

You will implement a B+ tree, including the full search and insert algorithms as discussed in class. Insert must be capable of dealing with overflows (at any level of the B+Tree) by splitting pages. Your impelementation will always split when full, and should not impelement re-distribution.

Goals of this assignment:

There are number of design choices that you need to make, and you probably need to reserve a big chunk of time for testing and debugging. So, start working on this assignment early; you are unlikely to finish this project if you start is just a week before the deadline.

WiscDB and Getting Started

The goal of the WiscDB projects is to allow students to learn about the internals of a data processing engine. In this first assignment, you built a buffer manager, on top of an I/O Layer that I provide to understand the management of main memory in a DBMS. In part two, you dove into the I/O Layer to implement a heap page.

In this last part, you will construct a B+ tree class to index relations on disk. In the previous two labs, we simplified the topic of records by not directly dealing with relations. In this lab, records will be more complex. This will require you to navigate a wide-range of concepts including efficiency, relations, indexes, and the use of low-level memory. As such, the depth of design choices and testing strategy will increase significantly. Furthermore, you will re-use many of the constructs from previous labs in your implementation, In particular, you will use (modified) page representations to map nodes on to disk and you will use a buffer manager to manage the interface between your B+ tree and the actual data on disk.

The code base is quite extensive and will require much reading of API documentation and thorough testing. Note: For this lab, I have attempted to directly link all class names to their API documentation. This should make understanding the interface for a class more streamlined for you. You are responsible for making significant progress early on the in assignment as waiting until the last few days will not be manageable.

Lab 5 Starting point
First, find your Lab5-partner1-partner2 git repo off the GitHub server for our class: CS44-f16

Next, clone your Lab 5 git repo into your cs44/labs subdirectory, cd into your repo, and run make setup:

cd
cd cs44/labs
git clone [the ssh url to your repo]
cd Lab5-partner1-partner2
make setup
Here are some instructions on Using Git page (follow the instructions for repos on Swarthmore's GitHub Enterprise server).

If all was successful, you should see the following files and symlinks (files highlighted in blue require modification):

If this didn't work, or for more detailed instructions on git see: the Using Git page (follow the instructions for repos on Swarthmore's GitHub Enterprise server).

The starting point files in your repo should be (files highlighted in blue require modification):

As in previous labs, run make setup to create symlinks to lib, exceptions, and includes for files needed to build your solution.

  • include/ - contains header files for the Page, File, BufferManager and related classes. You can open these directly to understand how to use and manage the interface to the I/O layer in your solution. While reading the header files may be helpful, the on-line documentation might be an easier to read WiscDB.
  • lib/ - necessary object files. This directory can be ignored. It provides libraries for the buffer manager and I/O layer that need to be linked into your executable.
  • exceptions/ - defines the list of possible exceptions for WiscDB. You will need to reference these exceptions to both handle possible errors that can be thrown to you or that you must throw. Again, it is probably easier to refer to the online documentation.


    I/O Layer: Modification to Files and Pages

    To help get you started, I have provided you with an implementation of few new classes: PageFile, RawFile, and FileScanner. The PageFile and RawFile classes are derived from the File class. These classes implement a file interface in two different ways. The PageFile class implements the file interface for the File class as was done in the previous two WisdDB labs - a file consists of pages formatted using the Page class (i.e., a double-linked list of heap pages). We use the PageFile class to store all the relations.

    The RawFile class provides an abstraction where there is minimal internal structuring of the data. This allows us to implement our own file organization on top of the RawFile without having our data corrupted in the process. Specifically, the RawFile treats pages as completely unformatted chunks of memory (e.g., 8KB). There is no slot directory or page pointers built in. This allows us to use the page as essentially one large chunk of memory (like the data array from the previous lab).

    You will use the RawFile class to store the B+ index file, where every page in the file is a node from the B+ tree whose type is mapped onto a raw page of bytes (like the data field in the HeapPage).


    The FileScanner class is used to scan records in a file. You will use this class for the base relation, and not for the index file. The file main.cpp file contains code which shows how to use this class. The public member functions of this class are described below.

    • FileScanner(const std::string & relationName, BufMgr *bufMgr)
      The constructor takes the relation name and buffer manager instance as parameters. The methods described below are then used to scan the relation.

    • ~FileScanner()
      Shuts down the scan and unpins any pinned pages.

    • void scanNext(RecordId& outRid)
      Returns (via the outRid parameter) the RecordId of the next record from the relation being scanned. It throws EndOfFileException when the end of relation is reached.

    • std::string getRecord()
      Returns the record identified by rid. The rid is obtained by a preceding scanNext() call.

    • void markDirty()
      Marks the current page being scanned as dirty, in case the page was being modified. You probably won't need to use this in this assignment.


    A B+ Tree Index

    Your assignment is to implement a B+ tree index. To simplify the task, we will make the following assumptions:

    • All records in a file are strings, and have the same length (so for a given attribute its offset in the record is always the same). The length is defined in btree.h as STRINGSIZE.
    • The B+ tree only needs to support single-attribute indexing.
    • The data type for the index attribute will be limited to strings (C-strings to be precise). While this seems limiting, we can always map a structure into c-strings if, for example, we wanted to instead index ints.
    • String keys will undergo pre-fix compression: they are limited to the first 10 characters.
    • All keys are unique. That is, you will not need to handle inserting two keys with the same value.
    As you create test "relations" (just files containing string key values), make sure you are creating them to meet these assumptions. You can create tests that have longer key values an then create the B+Tree key as the key string value by copying in the first 10 chars of the string argument, using strncpy:
    strncpy(dest, src, 10);
    

    The index will be built directly on top of the I/O Layer (the RawFile and the Page classes). An index will need to store its data in a file on disk, and the file will need a name (so that the DB class can identify it). The convention for naming an index file is specified below. To create a disk image of the index file, you simply use the RawFile constructor with the name of the index file. Since a RawFile is an unstructured array of bytes, you will need to map a structure on top of the pages that you get from the I/O Layer to represent the nodes of the B+ tree. Furthermore, where the File class abstracted the creation a header page, you will need to directly allocate a page and format it to be the header page (i.e., store meta-data) for your index.

    We'll start you off with an interface for a class, BTreeIndex. You will need to implement the methods of this interface as described below. You may add new private methods to this class if needed (like helper functions for good modular design), but do not modify the public interfaces that are described here:

    • Constructor
      The constructor first checks if the specified index file exists - if it does, simply load the file and the header page which stores the relevant meta-data. You should verify that the parameters match the meta-data in the loaded index (e.g., file name, attribute offset, etc.). If they do not, throw an BadIndexInfoException. If the file does not exist, create it and construct the index as defined below. To specify this, it is best to examine the role of each parameter:
      • relationName - The name of the relation on which to build the index. If creating a new index, the constructor should scan this relation (using FileScanner) and insert entries for all the tuples in this relation into the index. Insert each entry one-by-one (i.e., don't worry about bulk-loading. Although that would be one fun extension!).
      • outIndexName - A return value for the name of the file for the index. An index file name is constructed by concatenating the relational name with the offset of the attribute over which the index is built e.g., "relName.attrOffset. If you don't have experience with stringstreams, use the following code:
            std::stringstream ss;
            ss << relationName << '.' << attrByteOffset;
            std::string indexName = ss.str(); // indexName is the name of the index file 
        
      • bufMgrIn - A pointer to the buffer manager. You will need to store this pointer so the index can load pages into the buffer pool from disc.
      • attrByteOffset - The offset, in bytes, of the attribute for which the index is being built for. This defines how to extract the values necessary from the structured tuples.

    • Destructor
      Performs any cleanup that may be necessary, including clearing up any state variables, unpinning any B+ tree pages that are pinned, and flushing the index file (by calling bufMgr->flushFile()). Note that this method does not delete the index file itself! It is useful to recall that deletion will call the destructor of the File class causing the index file to be closed.

    • insertEntry
      Inserts a new data entry (i.e., pair). The parameter key is of type const char*. You should follow the algorithm presented in class for spliting overflowing leaves and internal nodes. Do not implement redistribution when a node is full.

    • startScan
      This method is used to begin a "filtered" scan of the index. For example, if the method is called using arguments ("a",GT,"d",LTE), the scan should seek all entries greater than "a" and less than or equal to "d" (a range query). Recall that the operators are an enumerated type to make it easier to define which operator to use. lowValue and highValue are pointers that need to be interpreted as mentioned above. Recall that you should only use a fixed prefix of strings (defined to be 10 in the code) for key comparisons.

      This method does not return any values; you will need to use scanNext to obtain the results one-by-one. You should first check to see if there is already a scan in progress - if so, first end that scan before initializing a new one. You should throw a BadScanrangeException if the user provides a low value greater than the high value and throw a BadOpcodesException if the lowOpParm or highOpParm are not legal (e.g., the lowOpParm must be either GT or GTE). The function should point to the first matching record when it is complete; if no such record exists (i.e., no keys match the search criteria) throw an NoSuchKeyFoundException.


    • scanNext
      After initializing a scan using startScan, this method returns the rid of the next matching record for the search through the outRid parameter. You should be maintaining global information about the current search using existing data members for the BTreeIndex. If there is nothing to return, you should throw an IndexScanCompletedException. Be sure to keep any leaf page that is currently being scanned pinned between searches - logically, the page isn't free until the page is fully scanned or the scan is complete. Each leaf page should have a next sibling pointer to help continue the scan between leaves.

    • endScan
      Terminates the current scan, unpinning any pages related to the scan. You should throw a ScanNotInitializedException if there is no current scan in process for either this method or scanNext.

    Tips and Additional Details

    Here is some additional information to help complete the lab:

    • implement BTreeIndex::printTree early and use it to help you with debugging. It is for your own debugging, so what you print and how you print it out is up to you. You may add additional print tree functions to the btree public interface if this one alone is not enough (you can also make use of the fileScanner methods to produce debug output externally). Additional print tree methods for debugging should be the only change you make to the btree public interface---your btree solution must work with my test code that will use the btree public interface as given.

    • start by implementing an empty B+ Tree, and then try to insert one record. An empty B+ Tree will have 2 pages: a header page, one empty intex page for the root node. One the first record insert, the B+Tree will have 4 pages: the header page, the root node index page, and and two pages for the first two leaf nodes. The first record insert will add the first entry to the root index (a key value of the record being inserted, and TWO ptr (pageId) values to the two leaf pages). The first record will be inserted into the right leaf page, the left will be empty. Note: these are the only two leaf pages whose number of records can ever be lower than d (less than 50% full).

    • Examine the test code in main.cpp that creates and inserts elements into a BTreeIndex. This will help you determine how to write your own test code, but also give you an example of how the BTreeIndex and FileScanner interfaces are used. To create a BTreeIndex, you need a file of records. Look at example code in main.cpp that does this.

    • Uncomment out DEBUG in btree.h to trigger more splitting this will define fewer total records per page, which is useful both for printing out the B+Tree when testing early functionality, but it will also let you stress test and debug your splitting code without having to insert a massive number of records. You should, of course, also test on the regular size, but this is useful for debugging.

    • Using raw Page objects for nodes - Before starting, think closely about how you can use a Page from a RawFile to store node information. See the slides from lab on using Page objects to store LeafNode data Every different type you need to represent in a B+Tree data structure will have to be mapped into a raw file Page.

    • Setting level value for NonLeafNode - You can use the level value in several different ways. What is most important is that you will need some signal to know whether the next node is going to be a leaf or non-leaf (a consequence of mapping our own structure is that we cannot have C++ check types for us). I suggest using level to store how far a node is from the leaf level. You could also use it to signify whether the next level down is a leaf or not. (e.g., level==0 implies the child Page is a NonLeafNode while level==1 implies the child is LeafNode.

    • Buffer Manager - One of the more important aspects to your implementation is the use of the buffer manager. Any time you allocate/read/write a page from disk, you will be using the buffer manager to do so. Be sure you carefully consider when pages no longer need to be used - most pages should be unpinned almost immediately after use, but some pages will be pinned for long durations.

    • Special case for first insert - Dealing with the first insert in a B+ tree can be tricky. In particular, you will be creating a leaf node that technically violates the requirement of 50% occupancy. Also, you may create a leaf node that never is inserted into. This is okay for your initial creation (it should not occur again later though). For example, you could use the following strategy: upon the first insert, create a root node with one key k, two empty leaves that get assigned to P[0] for all keys less than k and P[1] for keys greater than or equal to k. By definition the insert of k* will occur in the leaf pointed from P[1]. Note that it is possible that the leaf P[0] may never be used again (we never insert a key less than k)! That is okay for this lab, every other node will comply with the properties of a B+ tree.

    • Study btree.h - There are quite a large number of member variables. You should be sure to study the header file before beginning your design since the data member serve as useful global variables for your meta and search data.

    • No redistribution on inserts - Do not try to implement (non-splitting) redistribution for insert, it's not fun. You should resolve all full occupancy insertions using splitting.

    • Linking leaf pages - As opposed to the model in class and the book that uses a doubly linked list of leaf pages, you only store a forward pointer on leaf pages i.e., a single-linked list.

    • Exceptions - In practice, there is the expectation that exceptions thrown from other DBMS components may interrupt the B+ tree and cause an inconsistent state. For example, the buffer manager may through an exception if there are no free frames during your attempt to insert information. This could be a huge source for bugs, but you do not need to concern yourself with these potential bugs. Only exceptions within your own B+ tree implementation should be considered carefully. In other words, you can assume the buffer manager will have plenty of space assuming you handling pinning/unpinning properly.

    • Calculating attribute offset - For calculating the attribute offset to send to the BTreeIndex you may want to use the offsetof library. For instance, if we are storing the following structure as a record in the original relation:
        struct RECORD {
          int i;
          double d;
          char s[64];
        };
      
      And, we are building the index over the double d, then the argument for attrByteOffset is offsetof(RECORD, d). There are examples of this in main.cpp that you can use when constructing new tests.
    • My C and C++ programming links: here

    • Week 9 Monday lab has links to other resources.

    Testing and Design

    A large part of your time on this lab will be spent designing and testing your program. When designing your solution, be sure to consider the major hurdles here are two types of nodes - leaf and non-leaf nodes - and each has its own structure. You will need to design several short helper methods for common operations.

    You should develop a testing strategy that parallels your design. The provided test is not useful for incremental development. You should devise smaller tests. Ideas include:

    • Write a print method that does a traversal of the tree, printing out information stored in each node.
    • Create smaller/simple relations that specifically invoke events like splitting.
    • Use the DEBUG flag during development in btree.h. When uncommented, nodes will store only 8 keys. This means splits happen after just a few inserts (as opposed to hundreds of inserts).
    • Write a short test that creates a B+ tree, closes it, and then attempts to read it back from disk to verify you are saving header information properly. There is a provided test at the end of main.cpp that is a good starting point.
    • Search for TODOs in main.cpp for tips on the provided test as well as on ideas for creating new tests.
    • Be sure to also test for expected errors (e.g., giving a bad search range).
    • The current test inserts keys from smallest to largest in order. That is a poor assumption to make. Be sure to try more "random" orderings.


    Submitting your lab
    Before the Due Date, push your solution to github from one of your local repos to the GitHub remote repo.

    From your local repo (in your ~you/cs44/labs/Lab05-partner1-partner2 subdirectory)

    make clean
    git add *
    git commit -m "our correct, robust, and well commented solution for grading"
    git push
    

    If that doesn't work, take a look at the "Troubleshooting" section of the Using git page. Also, be sure to complete the README.md file, add and push it.