1. CS21 Lab 9: Sorting
Due Sunday April 13, before midnight
Please read through the entire lab before starting!
This is a one week lab. Since the design process should be similar to the previous lab, lab 08, you will be doing the design and implementation for lab 9 in a single week as well. If you found the open-ended nature of the lab 08 challenging, you will want to start this lab early.
1.1. Goals
-
Work with lists of objects
-
Write and modify sorting functions to sort lists of objects by different criteria
-
Learn to filter and reduce dataset sizes before sorting
-
Continue practicing top-down design (TDD)
1.2. Earthquake Data
For this lab, we will continue to work with the same dataset we used in the previous lab, lab 08 Searching. As a reminder, here are some details about the data set, it is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. The dataset is a modified version from the CORGIS educational dataset archive, and it contains 8394 entries. We have defined an earthquake class to represent one earthquake record.
2. Using sorting to investigate earthquake data
You will implement a program called sortquake.py
, which will
allow the user to search and sort earthquake data in an intuitive way. To start, copy your Lab 08 earthquake.py
file into
a new sortquake.py
that you will use for Lab 09:
cp ~/cs21/labs/08/earthquake.py ~/cs21/labs/09/sortquake.py
More details about copying files are here: UsingUnix.
You will be adding three new sorting related options, options 4, 5, and 6. Depending on the user’s selection, beyond your lab 08, you will prompt them for additional information and then do one of:
-
Find earthquakes within a magnitude range and sort by largest
-
Find the largest earthquakes within a location
-
Find the most recent earthquakes within a location
Similar to Lab 08, your program should prompt the user with a menu of seven choices:
Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice?
and then prompt them for additional information and execute one of the menu choices. You implemented the first three choices in Lab 08. Your program should also:
-
Find quakes by magnitude and sort by largest.
-
Find quakes by magnitude and sort by largest in a location.
-
Find the most recent earthquakes at a location.
3. Find quakes by magnitude and sort by largest
For option 4 (Find quakes by magnitude and sort by largest), you must:
-
prompt the user for a mininum and maximum magnitude to display
-
find and return a list of all earthquakes that are within the provided range
-
prompt the user for the number \(N\) of earthquakes they want displayed
-
sort the earthquakes by decreasing order of magnitude
-
print out the \(N\) earthquakes with the highest magnitude. If there are fewer than \(N\) earthquakes, print all the earthquakes that match the query.
For example, here the user asks for the ten largest earthquakes between magnitude 5 and 6:
$ python3 sortquake.py Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice? 4 Enter minimum magnitude: 5 Enter maximum magnitude: 6 How many earthquake do you want to report? 10 Showing 10 results: 1. ind0058: magnitude 6.0 @ 132km N of Nebe, Indonesia (-7.2805, 122.4255) on 2016-08-23 15:39:44 2. jap0029: magnitude 6.0 @ 167km ENE of Miyako, Japan (40.3564, 143.6799) on 2016-08-20 05:01:26 3. jap0030: magnitude 6.0 @ 170km ENE of Miyako, Japan (40.2933, 143.7539) on 2016-08-20 11:58:04 4. jap0042: magnitude 5.9 @ Izu Islands, Japan region (30.6151, 137.8462) on 2016-08-25 13:04:43 5. rus0019: magnitude 5.9 @ 52km S of Aleksandrovsk-Sakhalinskiy, Russia (50.4274, 142.258) on 2016-08-14 07:15:14 6. sou0004: magnitude 5.9 @ 54km NNW of Visokoi Island, South Georgia and the South Sandwich Islands (-56.255, -27.5494) on 2016-07-31 07:33:20 7. sou0043: magnitude 5.9 @ Southern East Pacific Rise (-55.9195, -123.197) on 2016-08-18 14:09:43 8. ind0062: magnitude 5.8 @ 176km WSW of Sungaipenuh, Indonesia (-2.9567, 100.0549) on 2016-08-24 09:48:44 9. ind0063: magnitude 5.8 @ 140km N of Palue, Indonesia (-7.2081, 122.5412) on 2016-08-23 15:40:46 10. sou0058: magnitude 5.8 @ South Georgia Island region (-55.2088, -32.0731) on 2016-08-19 13:33:40 Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit
Here the user asks for the 100 largest earthquakes between magnitude 5.5 and 5.6, but only 8 results are found:
$ python3 sortquake.py Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice? 4 Enter minimum magnitude: 5.5 Enter maximum magnitude: 5.6 How many earthquake do you want to report? 100 Fewer than 100 total results. Showing all results instead. Showing 8 results: 1. ak02185: magnitude 5.6 @ 45km S of Semisopochnoi Island, Alaska (51.5396, 179.5501) on 2016-08-14 12:28:55 2. chi0049: magnitude 5.6 @ 155km SSE of San Pedro de Atacama, Chile (-24.286, -67.8647) on 2016-07-27 02:43:45 3. ind0007: magnitude 5.6 @ 7km SE of Labuhankananga, Indonesia (-8.194, 117.8145) on 2016-07-31 19:40:01 4. jap0037: magnitude 5.6 @ Izu Islands, Japan region (29.8965, 139.1312) on 2016-08-22 05:33:08 5. sou0059: magnitude 5.6 @ 300km ESE of Grytviken, South Georgia and the South Sandwich Islands (-55.2031, -32.1207) on 2016-08-19 16:37:16 6. ita0006: magnitude 5.5 @ 4km NE of Norcia, Italy (42.8223, 13.1257) on 2016-08-23 22:33:30 7. per0009: magnitude 5.5 @ 39km N of Lluta, Peru (-15.6569, -72.0174) on 2016-08-14 22:59:00 8. sou0009: magnitude 5.5 @ 62km ENE of Bristol Island, South Sandwich Islands (-58.724, -25.6076) on 2016-08-02 03:32:29
4. Find the largest earthquakes within a location
For option 5 (find the largest earthquakes within a location), you must:
-
prompt the user for a location string
-
use linear search to find all the records for which the user-entered string appears in the full location and return the results
-
sort the returned results in decreasing order of magnitude
-
print all the matching results in sorted order
Here are some examples:
$ python3 sortquake.py Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice? 5 Enter location: fairb Showing 9 results: 1. ak01506: magnitude 1.2 @ 22km NNE of Fairbanks, Alaska (65.0074, -147.4484) on 2016-08-15 19:09:43 2. ak01299: magnitude 1.1 @ 7km SSW of Fairbanks, Alaska (64.7769, -147.7978) on 2016-08-12 14:37:38 3. ak02147: magnitude 1.0 @ 2km NNE of Fairbanks, Alaska (64.8631, -147.7033) on 2016-08-25 22:46:36 4. ak02148: magnitude 0.9 @ 2km NNE of Fairbanks, Alaska (64.8631, -147.7033) on 2016-08-25 22:48:32 5. ak01504: magnitude 0.8 @ 2km S of Fairbanks, Alaska (64.8132, -147.7095) on 2016-08-15 17:15:16 6. ak00781: magnitude 0.6 @ 9km SSW of Fairbanks, Alaska (64.7554, -147.7827) on 2016-08-06 00:17:27 7. ak00335: magnitude 0.5 @ 18km SSW of Fairbanks, Alaska (64.6853, -147.8893) on 2016-08-01 07:30:05 8. ak00900: magnitude 0.4 @ 16km S of Fairbanks, Alaska (64.6888, -147.7854) on 2016-08-08 08:20:51 9. ak00161: magnitude 0.2 @ 30km SSW of Fairbanks, Alaska (64.5785, -147.928) on 2016-07-29 04:10:08 Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice? 5 Enter location: Jackson, Wyo Showing 5 results: 1. wy00003: magnitude 1.82 @ 26km E of Jackson, Wyoming (43.5161667, -110.4413333) on 2016-08-03 15:46:04 2. wy00004: magnitude 1.8 @ 30km ENE of Jackson, Wyoming (43.5345, -110.3905) on 2016-08-06 08:43:20 3. wy00006: magnitude 0.71 @ 22km NE of Jackson, Wyoming (43.603, -110.5425) on 2016-08-18 14:20:39 4. wy00009: magnitude 0.71 @ 20km E of Jackson, Wyoming (43.4723333, -110.5076667) on 2016-08-22 20:19:02 5. wy00005: magnitude 0.4 @ 45km E of Jackson, Wyoming (43.4936667, -110.201) on 2016-08-17 01:35:09
Note, with a careful design, you can use the same sorting function for both options 4 and 5.
5. Find the most recent earthquakes
For option 6 (find the most recent earthquakes), you must:
-
prompt the user for a location string
-
prompt user for the number, \(N\), of earthquakes to print
-
find and return a list of all earthquakes that match the given location string
-
sort the returned list of earthquakes by date/time. To help ease the implementation, the earthquake library provides a
is_more_recent(quake1, quake2)
function, which takes as input two earthquake objects and returnsTrue
if quake 1 is more recent than quake 2. Otherwise, it returnsFalse
. -
print the \(N\) most recent earthquakes. If there are fewer than \(N\) earthquakes, print all the earthquakes that match the location.
You may use either a modified bubble sort or a modified selection sort to sort the earthquakes by date/time.
This dataset is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. So your sorting should go from the latest one on 8/25/16, 23:58 towards the earliest one on 7/27/16, 0:19. |
Here is an example:
$ python3 sortquake.py Please select one of the following choices: (1) Find by location (2) Find by ID (3) Find by magnitude (4) Find quakes by magnitude and sort by largest (5) Find quakes by location and sort by largest (6) Find quakes by location and sort by most recent (7) Quit Choice? 6 Enter location: cali How many earthquake do you want to report? 10 Showing 10 results: 1. ca03275: magnitude 1.06 @ 6km WNW of The Geysers, California (38.8050003, -122.8215027) on 2016-08-25 23:30:05 2. ca03274: magnitude 2.42 @ 22km ENE of Soledad, California (36.5154991, -121.0998306) on 2016-08-25 23:19:01 3. ca03273: magnitude 0.64 @ 8km W of Cobb, California (38.8279991, -122.8221664) on 2016-08-25 22:18:09 4. ca03272: magnitude 0.86 @ 13km SE of Mammoth Lakes, California (37.5541649, -118.8744965) on 2016-08-25 22:08:36 5. ca03271: magnitude 0.94 @ 7km NW of The Geysers, California (38.8138351, -122.8233337) on 2016-08-25 21:53:21 6. ca03270: magnitude 0.47 @ 3km W of Cobb, California (38.8168335, -122.7613297) on 2016-08-25 19:26:05 7. ca03269: magnitude 0.9 @ 10km WNW of Cobb, California (38.8723335, -122.8314972) on 2016-08-25 19:16:34 8. ca03268: magnitude 1.56 @ 14km NNE of Shandon, California (35.7635002, -120.2919998) on 2016-08-25 18:17:59 9. ca03267: magnitude 0.56 @ 2km NNW of The Geysers, California (38.7970009, -122.7653351) on 2016-08-25 16:59:57 10. ca03266: magnitude 1.0 @ 5km NW of The Geysers, California (38.8056679, -122.8083344) on 2016-08-25 16:15:11
6. Provided Library
The earthquakes
library has some helpful tools for managing earthquake
records in your program.
6.1. Earthquake Class
The library provides an Earthquake
class, which encapsulates all the
information you’ll need to keep track of one record. You can create an
instance object of this class by calling the Earthquake()
constructor with
each field of information from one row of the earthquakes.txt
file passed in
as separate parameters. The number fields (magnitude, latitude, and longitude)
are floats, and all other fields are strings.
You can call the following methods on a Earthquake
object to retrieve information about it:
-
get_id()
: Get the ID field. (string) -
get_magnitude()
: Get the magnitude field. (float) -
get_location_full()
: Get the full location field. (string) -
get_date()
: Get the date/time field. (string)
For example:
>>> from earthquakes_lab import * >>> record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59") >>> print(record.get_magnitude()) 3.5 >>> print(record.get_location_full()) Parrish Hall, Swarthmore, Pennsylvania >>> print(record) id12345: magnitude 3.5 @ Parrish Hall, Swarthmore, Pennsylvania (39.90523, -75.35425) on 2021-11-13 23:59:59
6.2. Comparing Dates
For option 6, you need to sort earthquake objects by date/time. The
library provides a is_more_recent(quake1,quake2)
function that you
can call on two earthquake objects that will return a Boolean, where
True
means that quake1
happened more recently than quake2
,
otherwise, it returns False
.
6.3. Validating Floats
For option 3, you’ll prompt the user to enter a minimum magnitude, which needs
to be a float. The library provides a is_float
function that you can call on
a string that will return a Boolean, where True
means the string can safely
be converted with float
:
>>> from earthquakes_lab import * >>> is_float("3.1") True >>> is_float("3.1.5") False >>> is_float("big") False
7. Hints, and Tips
-
You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. Start with the main menu and get a skeleton of your code working first. Then add each option from the menu one at a time.
-
When reading the input data file, process the file once and store all the records as a list of
Earthquake
objects. Reading the file once will make it faster to process later. Creating a list ofEarthquake
objects will avoid some list of list headaches. -
To print an instance object of the
Earthquake
class, you can callprint()
directly on the object:record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59") print(record)
The class knows how to format itself into a string, so you can avoid writing nasty formatting strings.
-
Validate menu selections. If the user doesn’t enter a valid number (1-7), let the user know it wasn’t a valid selection and prompt them again for a new menu item.
-
When reading file input,
split()
only generates a list of strings. Some of the strings will need to be converted to floats. -
You can use either of the in-class sorting algorithms (selection sort or bubble sort) to sort the earthquakes by magnitude or date/time. You will need to make modifications to the inclass code to:
-
Compare objects instead of numbers
-
Compare by magnitude or date/time
-
Sort in decreasing order (for magnitude) or by most recent (for date/time)
It is recommended that you write two different sorting functions, one for magnitude and one for date/time.
-
-
Do not sort the original list of earthquakes. Only sort the list of earthquakes that you return to the user after a query. Sorting the original list will break the binary search feature of searching by location.
8. Requirements
The code you submit for labs is expected to follow good style practices, and to meet one of the course standards, you’ll need to demonstrate good style on six or more of the lab assignments across the semester. To meet the good style expectations, you should:
In addition, you’ll need to demonstrate good top-down design practices on two or more lab assignments across the semester. To meet the top-down design expectations, you should:
|
Your program should meet the following requirements:
-
Your program must read in the earthquake data from the source file, using your code from last week, display matching results after every query, and return the user to the main menu after every operation until the user quits.
-
Search by id must continue to use the binary search from lab 08. Do not sort the original data by magnitude or time.
-
When the user selects find quakes by magnitude, your program must prompt the users for a minimum and maximum magnitude, where the maximum magnitude is greater than the minimum. Your must compute, sort, and display matching results as described in Section 3.
-
When the user selects find the largest earthquakes within a location, your program must meet the requirements of Section 4. You can use the same sorting function as in option 4, but you will need to modify the search function to find all earthquakes that match the location string.
-
Find the most recent earthquakes, your program must meet the requirements of Section 5. You must use a different sorting function than the one used in option 4, because you must sort by time and not magnitude. You can use the
is_more_recent
function to help you sort the earthquakes by date/time. The most recent earthquake should be at the beginning of the list and the oldest earthquake should be at the end of the list. -
Your program must demonstrate good top down design principles and include at least four helper functions in addition to
main
.
Answer the Questionnaire
After each lab, please complete the short Google Forms questionnaire. Please select the right lab number (Lab 09) from the dropdown menu on the first question.
Once you’re done with that, you should run handin21
again.
Submitting lab assignments
Remember to run handin21
to turn in your lab files! You may run handin21
as many times as you want. Each time it will turn in any new work. We
recommend running handin21
after you complete each program or after you
complete significant work on any one program.
Logging out
When you’re done working in the lab, you should log out of the computer you’re using.
First quit any applications you are running, including your vscode editor, the browser and the
terminal. Then click on the logout icon ( or
) and choose "log out".
If you plan to leave the lab for just a few minutes, you do not need to log
out. It is, however, a good idea to lock your machine while you are gone. You
can lock your screen by clicking on the lock icon.
PLEASE do not leave a session locked for a long period of time. Power may go
out, someone might reboot the machine, etc. You don’t want to lose any work!