1. CS21 Lab 9: Sorting

Due Sunday April 13, before midnight


Please read through the entire lab before starting!

This is a one week lab. Since the design process should be similar to the previous lab, lab 08, you will be doing the design and implementation for lab 9 in a single week as well. If you found the open-ended nature of the lab 08 challenging, you will want to start this lab early.

1.1. Goals

  • Work with lists of objects

  • Write and modify sorting functions to sort lists of objects by different criteria

  • Learn to filter and reduce dataset sizes before sorting

  • Continue practicing top-down design (TDD)

1.2. Earthquake Data

For this lab, we will continue to work with the same dataset we used in the previous lab, lab 08 Searching. As a reminder, here are some details about the data set, it is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. The dataset is a modified version from the CORGIS educational dataset archive, and it contains 8394 entries. We have defined an earthquake class to represent one earthquake record.

2. Using sorting to investigate earthquake data

You will implement a program called sortquake.py, which will allow the user to search and sort earthquake data in an intuitive way. To start, copy your Lab 08 earthquake.py file into a new sortquake.py that you will use for Lab 09:

cp ~/cs21/labs/08/earthquake.py ~/cs21/labs/09/sortquake.py

More details about copying files are here: UsingUnix.

You will be adding three new sorting related options, options 4, 5, and 6. Depending on the user’s selection, beyond your lab 08, you will prompt them for additional information and then do one of:

  1. Find earthquakes within a magnitude range and sort by largest

  2. Find the largest earthquakes within a location

  3. Find the most recent earthquakes within a location

Similar to Lab 08, your program should prompt the user with a menu of seven choices:

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice?

and then prompt them for additional information and execute one of the menu choices. You implemented the first three choices in Lab 08. Your program should also:

  1. Find quakes by magnitude and sort by largest.

  2. Find quakes by magnitude and sort by largest in a location.

  3. Find the most recent earthquakes at a location.

3. Find quakes by magnitude and sort by largest

For option 4 (Find quakes by magnitude and sort by largest), you must:

  • prompt the user for a mininum and maximum magnitude to display

  • find and return a list of all earthquakes that are within the provided range

  • prompt the user for the number \(N\) of earthquakes they want displayed

  • sort the earthquakes by decreasing order of magnitude

  • print out the \(N\) earthquakes with the highest magnitude. If there are fewer than \(N\) earthquakes, print all the earthquakes that match the query.

For example, here the user asks for the ten largest earthquakes between magnitude 5 and 6:

$ python3 sortquake.py

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice? 4
Enter minimum magnitude: 5
Enter maximum magnitude: 6
How many earthquake do you want to report? 10

Showing 10 results:
  1. ind0058: magnitude 6.0 @ 132km N of Nebe, Indonesia (-7.2805, 122.4255) on 2016-08-23 15:39:44
  2. jap0029: magnitude 6.0 @ 167km ENE of Miyako, Japan (40.3564, 143.6799) on 2016-08-20 05:01:26
  3. jap0030: magnitude 6.0 @ 170km ENE of Miyako, Japan (40.2933, 143.7539) on 2016-08-20 11:58:04
  4. jap0042: magnitude 5.9 @ Izu Islands, Japan region (30.6151, 137.8462) on 2016-08-25 13:04:43
  5. rus0019: magnitude 5.9 @ 52km S of Aleksandrovsk-Sakhalinskiy, Russia (50.4274, 142.258) on 2016-08-14 07:15:14
  6. sou0004: magnitude 5.9 @ 54km NNW of Visokoi Island, South Georgia and the South Sandwich Islands (-56.255, -27.5494) on 2016-07-31 07:33:20
  7. sou0043: magnitude 5.9 @ Southern East Pacific Rise (-55.9195, -123.197) on 2016-08-18 14:09:43
  8. ind0062: magnitude 5.8 @ 176km WSW of Sungaipenuh, Indonesia (-2.9567, 100.0549) on 2016-08-24 09:48:44
  9. ind0063: magnitude 5.8 @ 140km N of Palue, Indonesia (-7.2081, 122.5412) on 2016-08-23 15:40:46
 10. sou0058: magnitude 5.8 @ South Georgia Island region (-55.2088, -32.0731) on 2016-08-19 13:33:40


Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Here the user asks for the 100 largest earthquakes between magnitude 5.5 and 5.6, but only 8 results are found:

$ python3 sortquake.py

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice? 4
Enter minimum magnitude: 5.5
Enter maximum magnitude: 5.6
How many earthquake do you want to report? 100

Fewer than 100 total results. Showing all results instead.

Showing 8 results:
  1. ak02185: magnitude 5.6 @ 45km S of Semisopochnoi Island, Alaska (51.5396, 179.5501) on 2016-08-14 12:28:55
  2. chi0049: magnitude 5.6 @ 155km SSE of San Pedro de Atacama, Chile (-24.286, -67.8647) on 2016-07-27 02:43:45
  3. ind0007: magnitude 5.6 @ 7km SE of Labuhankananga, Indonesia (-8.194, 117.8145) on 2016-07-31 19:40:01
  4. jap0037: magnitude 5.6 @ Izu Islands, Japan region (29.8965, 139.1312) on 2016-08-22 05:33:08
  5. sou0059: magnitude 5.6 @ 300km ESE of Grytviken, South Georgia and the South Sandwich Islands (-55.2031, -32.1207) on 2016-08-19 16:37:16
  6. ita0006: magnitude 5.5 @ 4km NE of Norcia, Italy (42.8223, 13.1257) on 2016-08-23 22:33:30
  7. per0009: magnitude 5.5 @ 39km N of Lluta, Peru (-15.6569, -72.0174) on 2016-08-14 22:59:00
  8. sou0009: magnitude 5.5 @ 62km ENE of Bristol Island, South Sandwich Islands (-58.724, -25.6076) on 2016-08-02 03:32:29

4. Find the largest earthquakes within a location

For option 5 (find the largest earthquakes within a location), you must:

  • prompt the user for a location string

  • use linear search to find all the records for which the user-entered string appears in the full location and return the results

  • sort the returned results in decreasing order of magnitude

  • print all the matching results in sorted order

Here are some examples:

$ python3 sortquake.py

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice? 5
Enter location: fairb

Showing 9 results:
  1. ak01506: magnitude 1.2 @ 22km NNE of Fairbanks, Alaska (65.0074, -147.4484) on 2016-08-15 19:09:43
  2. ak01299: magnitude 1.1 @ 7km SSW of Fairbanks, Alaska (64.7769, -147.7978) on 2016-08-12 14:37:38
  3. ak02147: magnitude 1.0 @ 2km NNE of Fairbanks, Alaska (64.8631, -147.7033) on 2016-08-25 22:46:36
  4. ak02148: magnitude 0.9 @ 2km NNE of Fairbanks, Alaska (64.8631, -147.7033) on 2016-08-25 22:48:32
  5. ak01504: magnitude 0.8 @ 2km S of Fairbanks, Alaska (64.8132, -147.7095) on 2016-08-15 17:15:16
  6. ak00781: magnitude 0.6 @ 9km SSW of Fairbanks, Alaska (64.7554, -147.7827) on 2016-08-06 00:17:27
  7. ak00335: magnitude 0.5 @ 18km SSW of Fairbanks, Alaska (64.6853, -147.8893) on 2016-08-01 07:30:05
  8. ak00900: magnitude 0.4 @ 16km S of Fairbanks, Alaska (64.6888, -147.7854) on 2016-08-08 08:20:51
  9. ak00161: magnitude 0.2 @ 30km SSW of Fairbanks, Alaska (64.5785, -147.928) on 2016-07-29 04:10:08

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice? 5
Enter location: Jackson, Wyo

Showing 5 results:
  1. wy00003: magnitude 1.82 @ 26km E of Jackson, Wyoming (43.5161667, -110.4413333) on 2016-08-03 15:46:04
  2. wy00004: magnitude 1.8 @ 30km ENE of Jackson, Wyoming (43.5345, -110.3905) on 2016-08-06 08:43:20
  3. wy00006: magnitude 0.71 @ 22km NE of Jackson, Wyoming (43.603, -110.5425) on 2016-08-18 14:20:39
  4. wy00009: magnitude 0.71 @ 20km E of Jackson, Wyoming (43.4723333, -110.5076667) on 2016-08-22 20:19:02
  5. wy00005: magnitude 0.4 @ 45km E of Jackson, Wyoming (43.4936667, -110.201) on 2016-08-17 01:35:09

Note, with a careful design, you can use the same sorting function for both options 4 and 5.

5. Find the most recent earthquakes

For option 6 (find the most recent earthquakes), you must:

  • prompt the user for a location string

  • prompt user for the number, \(N\), of earthquakes to print

  • find and return a list of all earthquakes that match the given location string

  • sort the returned list of earthquakes by date/time. To help ease the implementation, the earthquake library provides a is_more_recent(quake1, quake2) function, which takes as input two earthquake objects and returns True if quake 1 is more recent than quake 2. Otherwise, it returns False.

  • print the \(N\) most recent earthquakes. If there are fewer than \(N\) earthquakes, print all the earthquakes that match the location.

You may use either a modified bubble sort or a modified selection sort to sort the earthquakes by date/time.

This dataset is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. So your sorting should go from the latest one on 8/25/16, 23:58 towards the earliest one on 7/27/16, 0:19.

Here is an example:

$ python3 sortquake.py


Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Find quakes by magnitude and sort by largest
(5) Find quakes by location and sort by largest
(6) Find quakes by location and sort by most recent
(7) Quit

Choice? 6
Enter location: cali
How many earthquake do you want to report? 10

Showing 10 results:
  1. ca03275: magnitude 1.06 @ 6km WNW of The Geysers, California (38.8050003, -122.8215027) on 2016-08-25 23:30:05
  2. ca03274: magnitude 2.42 @ 22km ENE of Soledad, California (36.5154991, -121.0998306) on 2016-08-25 23:19:01
  3. ca03273: magnitude 0.64 @ 8km W of Cobb, California (38.8279991, -122.8221664) on 2016-08-25 22:18:09
  4. ca03272: magnitude 0.86 @ 13km SE of Mammoth Lakes, California (37.5541649, -118.8744965) on 2016-08-25 22:08:36
  5. ca03271: magnitude 0.94 @ 7km NW of The Geysers, California (38.8138351, -122.8233337) on 2016-08-25 21:53:21
  6. ca03270: magnitude 0.47 @ 3km W of Cobb, California (38.8168335, -122.7613297) on 2016-08-25 19:26:05
  7. ca03269: magnitude 0.9 @ 10km WNW of Cobb, California (38.8723335, -122.8314972) on 2016-08-25 19:16:34
  8. ca03268: magnitude 1.56 @ 14km NNE of Shandon, California (35.7635002, -120.2919998) on 2016-08-25 18:17:59
  9. ca03267: magnitude 0.56 @ 2km NNW of The Geysers, California (38.7970009, -122.7653351) on 2016-08-25 16:59:57
 10. ca03266: magnitude 1.0 @ 5km NW of The Geysers, California (38.8056679, -122.8083344) on 2016-08-25 16:15:11

6. Provided Library

The earthquakes library has some helpful tools for managing earthquake records in your program.

6.1. Earthquake Class

The library provides an Earthquake class, which encapsulates all the information you’ll need to keep track of one record. You can create an instance object of this class by calling the Earthquake() constructor with each field of information from one row of the earthquakes.txt file passed in as separate parameters. The number fields (magnitude, latitude, and longitude) are floats, and all other fields are strings.

You can call the following methods on a Earthquake object to retrieve information about it:

  • get_id(): Get the ID field. (string)

  • get_magnitude(): Get the magnitude field. (float)

  • get_location_full(): Get the full location field. (string)

  • get_date(): Get the date/time field. (string)

For example:

>>> from earthquakes_lab import *
>>> record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
>>> print(record.get_magnitude())
3.5
>>> print(record.get_location_full())
Parrish Hall, Swarthmore, Pennsylvania
>>> print(record)
id12345: magnitude 3.5 @ Parrish Hall, Swarthmore, Pennsylvania (39.90523, -75.35425) on 2021-11-13 23:59:59

6.2. Comparing Dates

For option 6, you need to sort earthquake objects by date/time. The library provides a is_more_recent(quake1,quake2) function that you can call on two earthquake objects that will return a Boolean, where True means that quake1 happened more recently than quake2, otherwise, it returns False.

6.3. Validating Floats

For option 3, you’ll prompt the user to enter a minimum magnitude, which needs to be a float. The library provides a is_float function that you can call on a string that will return a Boolean, where True means the string can safely be converted with float:

>>> from earthquakes_lab import *
>>> is_float("3.1")
True
>>> is_float("3.1.5")
False
>>> is_float("big")
False

7. Hints, and Tips

  • You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. Start with the main menu and get a skeleton of your code working first. Then add each option from the menu one at a time.

  • When reading the input data file, process the file once and store all the records as a list of Earthquake objects. Reading the file once will make it faster to process later. Creating a list of Earthquake objects will avoid some list of list headaches.

  • To print an instance object of the Earthquake class, you can call print() directly on the object:

    record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
    print(record)

    The class knows how to format itself into a string, so you can avoid writing nasty formatting strings.

  • Validate menu selections. If the user doesn’t enter a valid number (1-7), let the user know it wasn’t a valid selection and prompt them again for a new menu item.

  • When reading file input, split() only generates a list of strings. Some of the strings will need to be converted to floats.

  • You can use either of the in-class sorting algorithms (selection sort or bubble sort) to sort the earthquakes by magnitude or date/time. You will need to make modifications to the inclass code to:

    1. Compare objects instead of numbers

    2. Compare by magnitude or date/time

    3. Sort in decreasing order (for magnitude) or by most recent (for date/time)

      It is recommended that you write two different sorting functions, one for magnitude and one for date/time.

  • Do not sort the original list of earthquakes. Only sort the list of earthquakes that you return to the user after a query. Sorting the original list will break the binary search feature of searching by location.

8. Requirements

The code you submit for labs is expected to follow good style practices, and to meet one of the course standards, you’ll need to demonstrate good style on six or more of the lab assignments across the semester. To meet the good style expectations, you should:

  • Write a comment that describes the program as a whole using a comment at the top of the file.

  • All programs should have a main() function.

  • Use descriptive variable names.

  • Write code with readable line length, avoiding writing any lines of code that exceed 80 columns.

  • Add comments to explain complex code segments (as applicable).

  • Write a comment for each function (except main) that describes its purpose, parameters and return value.

In addition, you’ll need to demonstrate good top-down design practices on two or more lab assignments across the semester. To meet the top-down design expectations, you should:

  • Have a main() function that serves as an outline for your program.

  • Your main() function should call other functions that implement subtasks.

  • The functions that are called by main() should be well-named and have good comments explaining their purpose, parameters (including their types), and return value (including its type).

Your program should meet the following requirements:

  1. Your program must read in the earthquake data from the source file, using your code from last week, display matching results after every query, and return the user to the main menu after every operation until the user quits.

  2. Search by id must continue to use the binary search from lab 08. Do not sort the original data by magnitude or time.

  3. When the user selects find quakes by magnitude, your program must prompt the users for a minimum and maximum magnitude, where the maximum magnitude is greater than the minimum. Your must compute, sort, and display matching results as described in Section 3.

  4. When the user selects find the largest earthquakes within a location, your program must meet the requirements of Section 4. You can use the same sorting function as in option 4, but you will need to modify the search function to find all earthquakes that match the location string.

  5. Find the most recent earthquakes, your program must meet the requirements of Section 5. You must use a different sorting function than the one used in option 4, because you must sort by time and not magnitude. You can use the is_more_recent function to help you sort the earthquakes by date/time. The most recent earthquake should be at the beginning of the list and the oldest earthquake should be at the end of the list.

  6. Your program must demonstrate good top down design principles and include at least four helper functions in addition to main.

Answer the Questionnaire

After each lab, please complete the short Google Forms questionnaire. Please select the right lab number (Lab 09) from the dropdown menu on the first question.

Once you’re done with that, you should run handin21 again.

Submitting lab assignments

Remember to run handin21 to turn in your lab files! You may run handin21 as many times as you want. Each time it will turn in any new work. We recommend running handin21 after you complete each program or after you complete significant work on any one program.

Logging out

When you’re done working in the lab, you should log out of the computer you’re using.

First quit any applications you are running, including your vscode editor, the browser and the terminal. Then click on the logout icon (logout icon or other logout icon) and choose "log out".

If you plan to leave the lab for just a few minutes, you do not need to log out. It is, however, a good idea to lock your machine while you are gone. You can lock your screen by clicking on the lock xlock icon. PLEASE do not leave a session locked for a long period of time. Power may go out, someone might reboot the machine, etc. You don’t want to lose any work!