Week 14: Dictionaries
Data Structures
As programs get larger and more complex, they will need to handle more and more data. Programs need to access data in a way that is both easy to use and reasonably fast. A data structure is a collection of data values, along with the functions that can be applied to the data.
Storing data in a way that makes access and updates efficient is a topic that could take multiple semesters to fully understand. It is the primary topic of CS35.
An abstract data type (ADT) is a description of what data is being stored and what functions can be applied to the data, without describing/defining the underlying implementation. An ADT specifies the interface to a data structure without defining the implementation.
Example: maintaining inventory
Imagine you run an exotic donut shop and make and sell donuts of all sorts of flavors. It would be nice to write a program to help you manage your inventory of how many of each kind of donut you have for sale. An example list may look as follows:
Peppermint,40
Lemon,20
Maple Bacon,10
Peanut Butter and Jelly,12
...
As you make and sell more donuts, you’ll need to change the inventory numbers for the donuts you keep track of. When you decide to stop making a donut, you’ll need to delete the donut from your list. You’ll also need to search through the donuts to find flavors and perhaps create reports of the donuts you have.
The Dictionary ADT
A dictionary is a data structure that maintains (key, value) pairs
and supports the following operations:
- search(key): return the value associated with that key.
- insert(key, value): add (key, value) to the dictionary.
- delete(key): remove the (key, value) pair with the given key.
We’ll also assume that keys must be unique, and that when you try to
insert(key,value) for a key already in your dictionary, it should
update the value.
For example, to add 20 Lemon donuts to your inventory, you might
want to execute donuts.insert("Lemon", 20).
Exercise:
On pen and paper, sketch out an implementation of the Dictionary ADT
that uses Python lists to store the data. How would you search for
the key/value pair? How would the delete operation work?
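Once you have tried the exercise on paper, you can compare your answer with the following sketch. It is only one possible approach, not the course's reference solution: the class name ListDictionary and the choice to store each pair as a two-element list are assumptions made for illustration, as is returning None when a key is missing.

```python
# Hypothetical list-backed dictionary: every entry is a [key, value] list.
class ListDictionary:
    def __init__(self):
        self.pairs = []  # list of [key, value] lists

    def search(self, key):
        """Return the value stored for key, or None if key is absent."""
        for k, v in self.pairs:
            if k == key:
                return v
        return None

    def insert(self, key, value):
        """Add (key, value); if key already exists, update its value."""
        for pair in self.pairs:
            if pair[0] == key:
                pair[1] = value
                return
        self.pairs.append([key, value])

    def delete(self, key):
        """Remove the pair with the given key, if it is present."""
        for i in range(len(self.pairs)):
            if self.pairs[i][0] == key:
                self.pairs.pop(i)
                return


donuts = ListDictionary()
donuts.insert("Lemon", 20)
donuts.insert("Maple Bacon", 10)
donuts.insert("Lemon", 18)           # key exists, so the value is updated
donuts.delete("Maple Bacon")
print(donuts.search("Lemon"))        # 18
print(donuts.search("Maple Bacon"))  # None
```

Notice that every operation may have to scan the whole list before it finds the key, so each one slows down as the inventory grows.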
Python dictionaries
The Dictionary ADT is common enough that Python provides a built-in dictionary data type for you (although the interface is somewhat different from the one described above).
Creating a dictionary
To create a dictionary in Python, use curly braces, { and }.
bagels = {} # create an empty dictionary
donuts = {"Lemon":20, "Maple Bacon": 10} # initialize a dictionary
Searching a dictionary
Accessing key/value pairs in a Python dictionary is similar to list indexing. The difference is that instead of indexing with an integer, you "index" with the key:
donuts["Blueberry"]
print("%d Lemon donuts left." % donuts["Lemon"])
print("%d Maple Bacon donuts left." % donuts["Maple Bacon"])
What happens if a key is not in the dictionary? Python raises a KeyError. You can check whether a key is in the dictionary using the in operator, e.g.
if "Pickle" in donuts:
    ...
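To make the missing-key case concrete, here is a small sketch. It assumes the two-flavor donuts dictionary from above; the variable names count and pickle_count are made up for illustration. It shows three ways of coping with a key that may be absent: catching the KeyError, testing with in first, and using the dictionary's get method with a default value.

```python
donuts = {"Lemon": 20, "Maple Bacon": 10}

# Indexing with a key that is not present raises a KeyError.
try:
    count = donuts["Blueberry"]
except KeyError:
    count = 0  # treat a missing flavor as "none in stock"
print(count)   # 0

# Checking with `in` first avoids the exception entirely.
if "Pickle" in donuts:
    print("%d Pickle donuts left." % donuts["Pickle"])
else:
    print("No Pickle donuts in the inventory.")

# get() returns the second argument when the key is absent.
pickle_count = donuts.get("Pickle", 0)
print(pickle_count)  # 0
```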
Adding key/value pairs
To insert a key/value pair into the dictionary, also use indexing syntax:
donuts["Pickle"] = 2
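The same syntax also covers the ADT's rule that inserting an existing key updates its value. The short sketch below assumes the two-flavor inventory used earlier and shows both cases side by side.

```python
donuts = {"Lemon": 20, "Maple Bacon": 10}

donuts["Pickle"] = 2    # "Pickle" is a new key, so a new pair is added
donuts["Lemon"] = 25    # "Lemon" already exists, so its value is replaced
donuts["Lemon"] -= 3    # sell three Lemon donuts: read, subtract, store back

print(len(donuts))      # 3
print(donuts["Lemon"])  # 22
```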
Removing key/value pairs
Use the del command to remove a key/value pair from the dictionary.
del donuts["Pickle"] # pickle flavored donuts were a big mistake
Iterating over items in a dictionary
Sometimes you might want to access all key/value pairs in the dictionary. In Python you can iterate over all keys in a way similar to how you iterate over items in a list. Then you can get the value using the key.
for flavor in donuts:
    print("(flavor/count) = (%s,%d)" % (flavor, donuts[flavor]))
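As a variation on looping over keys, Python dictionaries also have an items() method that hands you each key and its value together. The sketch below assumes the two-flavor inventory from earlier and also uses a key loop to produce a simple report total; the variable name total is made up for illustration.

```python
donuts = {"Lemon": 20, "Maple Bacon": 10}

# items() yields (key, value) pairs, so no second lookup is needed.
for flavor, count in donuts.items():
    print("(flavor/count) = (%s,%d)" % (flavor, count))

# Looping over the keys and indexing works just as well.
total = 0
for flavor in donuts:
    total += donuts[flavor]
print("Total donuts:", total)  # Total donuts: 30
```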
Dictionary Application
Dictionaries are fantastic for applications that benefit from repeatedly looking up the value associated with a key. For example, suppose we gave you a large corpus of text, like a book, and we asked you to count the number of times certain words (or perhaps even all the words!) appeared. Could you do it without a dictionary? Yes, but what would it look like (and what would the performance be like)?
Let’s try it in wordcount.py with the text of Pride and Prejudice, which is
freely available via Project Gutenberg (https://www.gutenberg.org/ebooks/1342).
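Before opening the file, it may help to see the counting idea on a single sentence. The sketch below is not the course's wordcount.py: the function name count_words, the punctuation it strips, and the choice to lowercase every word are all assumptions made for illustration, and the opening sentence of the novel stands in for the full text.

```python
def count_words(text):
    """Return a dictionary mapping each word in text to how often it occurs."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")  # drop punctuation at the edges
        if word == "":
            continue
        if word in counts:
            counts[word] += 1   # seen before: update the value
        else:
            counts[word] = 1    # first occurrence: insert the key
    return counts


sample = ("It is a truth universally acknowledged, that a single man in "
          "possession of a good fortune, must be in want of a wife.")
counts = count_words(sample)
print(counts["a"])    # 4
print(counts["in"])   # 2
```

Each word costs one dictionary lookup and one update, no matter how many distinct words have been seen so far. A version built on plain lists would instead have to search the list of words for every word in the book.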