CS21 Lab 9: Sorting

Due Saturday, November 20, before midnight


Please read through the entire lab before starting!

This is a one week lab. Since the design process should be similar to the previous lab, lab 08, you will be doing the design and implementation for lab 9 in a single week as well. If you found the open-ended nature of the lab 08 challenging, you will want to start this lab early.

Goals

  • Write sorting programs, selection sort and bubble sort

  • Learn to filter and reduce dataset sizes before sorting

  • Continue practicing top-down design (TDD)

Earthquake Data

For this lab, we will continue to work with the same dataset we used in the previous lab, lab 08 Searching. As a reminder, here are some details about the data set, it is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. The dataset is a modified version from the CORGIS educational dataset archive, and it contains 8394 entries. We have defined an earthquake class to represent one earthquake record.

Using sorting to investigate earthquake data

You will implement a program called sorting.py, which will allow the user to search through this dataset and learn interesting information. To start, copy your Lab 08 earthquakes.py file into a new sorting.py that you will use for Lab 09:

cp ~/cs21/labs/08/earthquakes.py ~/cs21/labs/09/sorting.py

More details about copying files are here: UsingUnix.

You will adding three new sorting related options, options 4, 5, and 6. Depending on the user’s selection, beyond your lab 08, you will prompt them for additional information and then do one of:

  1. Rank the Earthquakes by Magnitude, print the top N largest.

  2. Sort Earthquakes within a location.

  3. Sort Earthquakes by the Most Recent.

Similar to Lab 08, your porgram should prompt the user with a menu of seven choices:

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice?

and then prompt them for additional information and execute one of the menu choices. You implemented the first three choices in Lab 08. Your program should also:

  1. Find the largest earthquakes by magnitude, using selection sort.

  2. Find the largest earthquakes in a location, using bubble sort.

  3. Find the most recent earthquakes.

Find the largest earthquakes by magnitude

Note: This feature was an optional extra challenge in lab 08. If you implemented this feature as part of lab 8, please be sure that your version for lab 9 conforms to the specification below (most notably, that it uses selection sort).

For option 4 (find the largest earthquakes by magnitude), you must:

  • prompt the user for the number of earthquakes they want displayed

  • use selection sort to sort the earthquakes by decreasing order of magnitude

  • print out the N earthquakes with the highest magnitude

For example, here the user asks for the eight largest earthquakes (by magnitude):

$ python3 sorting.py

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 4
How many (N)? 8

Largest 8 earthquakes by magnitude:
us100068jg: magnitude 7.7 @ 28km SSW of Agrihan, Northern Mariana Islands (18.5439, 145.541) on 2016-07-29 17:18:26
us10006exl: magnitude 7.4 @ South Georgia Island region (-55.2793, -31.874) on 2016-08-19 03:32:22
us10006d5h: magnitude 7.2 @ 109km E of Ile Hunter, New Caledonia (-22.4953, 173.1114) on 2016-08-11 21:26:35
us10006gbf: magnitude 6.8 @ 25km W of Chauk, Burma (20.9192, 94.5789) on 2016-08-24 06:34:55
us10006fik: magnitude 6.4 @ South Georgia Island region (-55.2775, -31.7546) on 2016-08-20 23:45:23
us10006a2k: magnitude 6.3 @ 70km ENE of Iwo Jima, Japan (24.9477, 142.0074) on 2016-08-04 12:24:33
us10006a1d: magnitude 6.2 @ 53km NW of Abra Pampa, Argentina (-22.3942, -66.0814) on 2016-08-04 10:15:12
us10006d6d: magnitude 6.2 @ South of the Fiji Islands (-25.1394, -177.3386) on 2016-08-11 23:29:33

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 7

Find the largest earthquakes within a location

For option 5 (find the largest earthquakes within a location), you must:

  • prompt the user for a location string

  • prompt the user for the number of earthquakes they want displayed.

  • use linear search to find all the records for which the user-entered string appears in the full location.

  • use bubble sort to sort these earthquakes in descending order of magnitude.

Here are some examples:

$ python3 sorting.py

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 5
Location? Pennsylvania
How many (N)? 3
Largest 3 earthquakes in Pennsylvania:
ld60050103: magnitude 1.37 @ 8km E of Wellsboro, Pennsylvania (41.7358333, -77.196) on 2016-08-22 02:15:45

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 5
Location? New Jersey
How many (N)? 3
Largest 3 earthquakes in New Jersey:
ld60122961: magnitude 0.98 @ 1km N of Wanaque, New Jersey (41.0555, -74.2913333) on 2016-08-08 22:01:44
ld60122426: magnitude 0.81 @ 0km NNE of Brookdale, New Jersey (40.8413333, -74.1771667) on 2016-07-30 20:34:44
ld60049603: magnitude 0.54 @ 0km W of Butler, New Jersey (41.0103333, -74.3435) on 2016-08-08 22:33:00

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 7

Find the most recent earthquakes

For option 6 (find the most recent earthquakes), you must:

  • prompt user for the number of earthquakes, get the input N

  • sort the earthquakes by date/time. To help ease the implementation, the earthquake library provides a compare_dates(quake1, quake2) function, which takes as input two earthquake objects and returns True if quake 1 is more recent than quake 2, otherwise, it returns False.

  • print the N most recent earthquakes

  • You can use either selection sort or bubble sort for this feature

This dataset is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. So your sorting should go from the latest one on 8/25/16, 23:58 towards the earliest one on 7/27/16, 0:19.

Here is an example:

$ python3 sorting.py

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit

Choice? 6
How many (N)? 5
Most Recent 5 earthquakes:
ci37672360: magnitude 0.89 @ 14km NE of Yucaipa, CA (34.1191667, -116.9336667) on 2016-08-25 23:58:01
ci37672328: magnitude 1.55 @ 6km NNW of Chatsworth, CA (34.308, -118.6353333) on 2016-08-25 23:36:11
nc72685251: magnitude 1.06 @ 6km WNW of The Geysers, California (38.8050003, -122.8215027) on 2016-08-25 23:30:05
ak13879193: magnitude 1.4 @ 5km ESE of Big Lake, Alaska (61.4984;-149.8627) on 2016-08-25 23:19:18
nc72685246: magnitude 2.42 @ 22km ENE of Soledad, California (36.5154991, -121.0998306) on 2016-08-25 23:19:01

Optional Extra Challenge (Sort States)

For option 8 (Sort States by total numbers of Magnitude M+ Earthquakes), you must:

  • Prompt the user for an integer M from 1 to 7 (e.g., 5).

  • Print the States orders by total numbers of Magnitude M+ Earthquakes

  • Print the numbers of Magnitude M+ Earthquakes for each State in the list

  • If the integer M is not valid, print a suitable error message.

  • If no records match the State and M+, let the user know that you couldn’t find any.

Here is an example:

$ python3 sorting.py

Please select one of the following choices:
(1) Find by ID
(2) Find by location
(3) Map state
(4) Find the largest earthquakes by magnitude
(5) Find the largest earthquakes within a location
(6) Find the most recent earthquakes
(7) Quit
(8) Sort States by total numbers of Magnitude M+ Earthquakes

Choice? 8
Please enter an integer M (from 1 to 7) for Magnitude M+ Earthquakes. 3
Printing States sorted by total numbers of Magnitude 3+ Earthquakes:
States, Numbers of Earthquakes
MA, 103
PA, 92
CA, 88
... ...

Provided Library

The earthquakes library has some helpful tools for managing earthquake records in your program.

Earthquake Class

The library provides an Earthquake class, which encapsulates all the information you’ll need to keep track of one record. You can create an instance object of this class by calling the Earthquake() constructor with each field of information from one row of the earthquakes.txt file passed in as separate parameters. The number fields (magnitude, latitude, and longitude) are floats, and all other fields are strings.

You can call the following methods on a Earthquake object to retrieve information about it:

  • get_id(): Get the ID field. (string)

  • get_magnitude(): Get the magnitude field. (float)

  • get_location_full(): Get the full location field. (string)

  • get_latitude(): Get the latitude field. (float)

  • get_longitude(): Get the longitude field. (float)

  • get_location(): Get the short location field. (string)

  • get_date(): Get the date/time field. (string)

For example:

>>> from earthquakes_lab import *
>>> record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
>>> print(record.get_magnitude())
3.5
>>> print(record.get_location_full())
Parrish Hall, Swarthmore, Pennsylvania
>>> print(record.get_location())
PA
>>> print(record)
id12345: magnitude 3.5 @ Parrish Hall, Swarthmore, Pennsylvania (39.90523, -75.35425) on 2021-11-13 23:59:59

Comparing Dates

For option 6, you need to sort earthquake objects by date/time. The library provides a compare_dates(quake1,quake2) function that you can call on two earthquake objects that will return a Boolean, where True means that quake1 happened more recently than quake2, otherwise, it returns False.

Validating Floats

For option 3, you’ll prompt the user to enter a minimum magnitude, which needs to be a float. The library provides a is_float function that you can call on a string that will return a Boolean, where True means the string can safely be converted with float:

>>> from earthquakes_lab import *
>>> is_float("3.1")
True
>>> is_float("3.1.5")
False
>>> is_float("big")
False

Requirements, Hints, and Tips

  • You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. Start with the main menu and get a skeleton of your code working first. Then add each option from the menu one at a time.

  • When reading the input data file, process the file once and store all the records as a list of Earthquake objects. Reading the file once will make it faster to process later. Creating a list of Earthquake objects will avoid some list of list headaches.

  • To print an instance object of the Earthquake class, you can call print() directly on the object:

    record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
    print(record)

    The class knows how to format itself into a string, so you can avoid writing nasty formatting strings.

  • Validate menu selections. If the user doesn’t enter a valid number (1-7), let the user know it wasn’t a valid selection and prompt them again for a new menu item.

  • When reading file input, split() only generates a list of strings. Some of the strings will need to be converted to floats.

Submitting

Answer the Questionnaire

Please edit the Questions-09.txt file in your cs21/labs/09 directory and answer the questions in that file.

Once you’re done with that, run handin21 again.

Turning in your labs…​

Remember to run handin21 to turn in your lab files! You may run handin21 as many times as you want. Each time it will turn in any new work. We recommend running handin21 after you complete each program or after you complete significant work on any one program.

Logging out

When you’re done working in the lab, you should log out of the computer you’re using. First quit any applications you are running, like the browser and the terminal. Then click on the logout icon (logout icon or other logout icon) and choose "log out".

If you plan to leave the lab for just a few minutes, you do not need to log out. It is, however, a good idea to lock your machine while you are gone. You can lock your screen by clicking on the lock xlock icon. PLEASE do not leave a session locked for a long period of time. Power may go out, someone might reboot the machine, etc. You don’t want to lose any work!