CS35 Lab 7: Sort Detective

Code due by 11:59 p.m., Tuesday, March 18; Written part due by noon, Wednesday, March 19

This will again be a partnered lab. For this lab, you will work allowed to work with one other individual. As usual, run update35 to get the initial files for this lab.

This lab consists of four five parts:

  1. A short problem set to get experience working through merge sort and quick sort
  2. Complete a templated implementation of quickSort and its helper function, partition
  3. Test your templated quickSort using ints, floats, and various comparators
  4. A writeup documenting your solution to the Sort Detective problem
  5. (Optional extra credit) Design a sorting algorithm to optimize the number of comparisons.
You will hand in two parts for this lab:

Problem Set

Consider the following array:

  1. Use merge sort to sort the provided array. Be sure to show your work. For example, show the substeps of merge sort using recursion trees. Below are sample recursion trees for the mergeSort step (i.e., divide) and merge step (i.e., conquer).

    Example solution: Divide step
    Example solution: Merge step

  2. For the original array, run quick sort, showing your work. There are many ways to show your work but it should be clear a) what subset of the array is being partitioned and b) what the resulting pivot is (shown in yellow below). You should use the standard implementation where the right-most item is the pivot. Below is an example solution. Note that the first step is simply the first partition while each subsequent step is the quick sort recursive call followed by the partition. Also, the array in this example is not actually copied and split for the recursive call, we have done that for illustration.

    Example solution: quick sort

  3. Describe (in pseudocode) a divide-and-conquer algorithm for finding the median element of an array. Your solution must be faster than simply sorting the array and finding the element in the middle. Additionally, your solution must be completely in pseudocode (you can use defined methods from class without redefining e.g., partition, merge, swap). HINT: your solution should look very similar to one of merge sort or quick sort. Which of those two algorithms can be shortened to find the median?
Completing quickSort and partition

The templated quick sort algorithm is to be defined in quickSort.h and quickSort-inl.h. We have provided the declaration for you, so no modifications are needed in quickSort.h. The algorithm is broken into four components:

To complete this work, you mainly only need to code in the locations marked with TODO. Once you are done, you can test your work using the provided testQuickSort program which sorts an array of ints for you. Specifically, implement the following:


Test quickSort

As you complete your quickSort, you should test your implementation using the code provided in testQuickSort. We have provided one test sorting an array of ints. You will need to study the implementation using the files testQuickSort.cpp and helper functions in randArray.h/-inl.h. Then, add two more tests to testQuickSort:

Once you have completed the test, note how this is an example of polymorphism. We have written a sorting method that can work on any data type as long as it has a defined method for comparing two objects. While it is beyond the scope of the assignment, you should think about how you can compare two Point objects and note that this would allow you to sort an array of Point objects. For example, we could define the following comparator that returns true if a has a smaller x value OR a smaller yvalue if the x value's are equal:

bool lessPoints(Point a, Point b){
  if(a.getX() == b.getX()){
    return a.getY() < b.getY();
  } else if(a.getX() < b.getX()){
    return true;
  } else {
    return false;
  }
}


Sort Detective
This portion of the lab is based on David Levine's Sort Detective lab developed at St. Bonaventure University

In the crime-ridden streets of Swarthmore, PA, agent Michael Scarn receives a knock on his door. In walks Prof. Soni with a stack of papers.

Soni: "There's been a horrible crime. The worst kind a CS professor can ever witness. A student has...submitted an ugly program. No meaningful variable names, no comments. They have goto statements. Look at the graphical interface! All the buttons are labeled as letters in the greek alphabet! It's so obscure, we don't even know who wrote it. *insert sobbing*"

Scarn: "Well, what do you expect. You should never let an undergrad program in Java."

Soni: "Enough chit chat. They are all sorting algorithms, but they are unlabeled. Find me the correct mapping of sort algorithm to label, and we'll give you a handsome reward of A on this lab."

Scarn: "Professors are so cheap..."
End Scene

For this portion of the lab you will identify sorting algorithms based on their run time characteristics. Specifically, you have been given a program that allows a user to utilize any of five sorting algorithms to sort an array. The algorithms are:

Unfortunately, the designer of the program did not label the algorithms; instead they are simply identified with Greek letters. You must apply your understanding of the general properties of the algorithms (and in some cases of the code used to implement them) to determine the proper labeling of the buttons. We have seen two of these algorithms in detail this week. The others were most likely seen in CS 21, but you can familiarize yourself with them by referring to the course textbook or looking at their wikipedia entries. There are several parameters which will help you:

The secondary objective of this lab is for you to gain experience writing a concise, but complete analysis of a system.

Instructions

  1. In your cs35/labs/07/ directory there is a folder sortDetective. Within there, you will run a script:
    $ cd
    $ cd cs35/labs/07/sortDetective
    $ ./runDetective.sh
    
    Before proceeding, explore the interface and your options.

  2. Devise a plan which will enable you to match the particular algorithms to the button names. E.g., insertion sort to Alpha. Your strategy should employ each available parameter (e.g., how does having a reverse sorted array effect each sorting algorithm? When an algorithm is O(n^2), are we referring to comparisons? movements? both?) Hint: It may make sense to try to divide the sorts into initial groups based on run-time performance (O(n^2) vs O(n log n)) and then work on each group separately.

  3. Execute your strategy. Be sure to keep notes to help you with your writeup.For example, "I was able to identify button Alpha because it's run time was..." It is important that your strategy is thorough.

  4. Produce a writeup describing your results and experimental methodology. You should employ general scientific writing skills to produce this paper. Your paper should include a description of your strategy with justifications. It should also include a summary table showing your mapping from button name to algorithm so it is easy for me to identify your conclusions. Your results section should justify your choices (i.e., what features of your experiments helped you identify the pairing).

A note on writing

Part of scientific writing is the ability to precise with your language. You should not submit a 20 page thesis. Your writeup shouldn't be more than 2 pages (3 if using generious spacing).

In addition to being concise, your writeup will also be evaluated based on the completeness of the report, the clarity (and grammar) of the writing, and general presentation. Getting a perfect match does not ensure a high grade.

Extra Credit

You have seen several sorting algorithms, both in this lab and in CS21. For extra credit, you may take any sorting algorithm you wish and try to extend/optimize it to minimize the number of comparisons it uses. You will be judged on the actual number of comparisons your algorithm uses -- not on the big-Oh runtime. In order to count the number of comparisons, the templated comparison function now takes the two items to compare AND a third count int, which will get incremented each time the compare function gets called. Your sorting algorithm should take the following inputs:

You won't need to worry about count, except to pass it into compare each time you call it. See fooSort.h/fooSort-inl.h for how this works.

You should call your sorting algorithm FOOSort and put it in a file fooSort.h/fooSort-inl.h, where FOO is your username (it's ok to leave the number off the end.) For example, my swarthmore user ID is jbrody1, so my sorting algorithm should be called

jbrodySort(...)

and live in jbrodySort.h and jbrodySort-inl.h. If your file and/or algorithm is named incorrectly no extra credit will be given. See testExtraCredit.cpp for a sample of how I might test this.

I will test your sorting algorithm by running it on arrays of three different sizes. For each size, I will generate an array of items (i) in order, (ii) in reverse order, and (iii) in random order (I'll do (iii) twice). During this test, I will use different types and different compare functions.

One final note about this part of the lab: do NOT assume I will test using ints. Your sorting algorithm should work for any type T and any comparison operator CMP. You're free to optimize your algorithm however you want, but you it must remain generic and reusable.

Submit

Run handin35 to submit your quick sort implementation and test. Print your solution to sort detective (no more than 3 pages) and hand in with your problem set by noon on Wednesday.