Basic Data Structures


Topics:

  • Lists
  • Tuples
  • Dictionaries

Introduction:

In order to process many data, we have to keep track of many data. Just like books are stored on shelves in libraries, we have to store our data in an orderly fashion such that it can be accessed and manipulated systematically. Instead of shelves, we have many structures to choose from. This morning we will introduce the two you will use the most: lists and dictionaries. The list is a simple list of items, while dictionaries are more like look-up tables with words (called "keys") and their corresponding definitions (called "values").


Lists


A list provides a way of storing an ordered series of values in a structure referenced by a single variable.

Let's make a directory for S2.1 and create a text file called 'lists.py' to write our python script.

#!/usr/bin/env python
 
#initialize a couple of lists
li1 = ['wabbit', 'season']
li2 = ['duck', 'season']
li3 = ['...Fire']
 
#print lists
print li1
print li2
print li3
print
 
#access single item in list
print li1[0], li2[0], li3[0]
print li1[:]
print

$ ./lists.py
['wabbit', 'season']
['duck', 'season']
['...Fire']

wabbit duck ...Fire
['wabbit', 'season']

Here, we have defined three variables, li1, li2, and li3, formatted as lists because we enclosed a series of values separated by commas in a pair of square brackets.

Lists have lots of really useful features. One is that they are ordered, which means the order of items in a list does not change (this is not true for dictionaries, as we will see later). This means you can access individual items in a list or entire sections by indexing or slicing. You can also manipulate your list using built-in methods (more on what this means later this week). For example, we can add to the list by using the append() method.

Adding to a List


#add single item to list
print 'using append'
li1.append(li2[0])
print li1
li1.append(li2[1])
print li1
print
 
#combine two lists by extension
print 'using extend'
li1.extend(li2)
print li2
print li1
print
 
#combine two list by concatenation
print 'concatenating'
li4 = li1 + li3
print li1
print li3
print li4
print
 
#print out slices of list
print 'slicing the list'
print li4
print li4[2:]
print li4[:4]
print li4[1:3]
print li4[-1:]
print li4[-2:]
print li4[:-1]
print li4[1:4:2]
print li4[::-1]
 
print
 

$ ./lists.py
using append
['wabbit', 'season', 'duck']
['wabbit', 'season', 'duck', 'season']

using extend
['duck', 'season']
['wabbit', 'season', 'duck', 'season', 'duck', 'season']

concatenating
['wabbit', 'season', 'duck', 'season', 'duck', 'season']
['...Fire']
['wabbit', 'season', 'duck', 'season', 'duck', 'season', '...Fire']

slicing the list
['wabbit', 'season', 'duck', 'season', 'duck', 'season', '...Fire']
['duck', 'season', 'duck', 'season', '...Fire']
['wabbit', 'season', 'duck', 'season']
['season', 'duck']
['...Fire']
['season', '...Fire']
['wabbit', 'season', 'duck', 'season', 'duck', 'season']
['season', 'season']
['...Fire', 'season', 'duck', 'season', 'duck', 'season', 'wabbit']

This script illustrates several of the most common ways to grow lists:

1) The list method append(), which adds a single item to the end of a list

2) The list method extend(), which adds a whole list to the end of the list you ask to extend itself

3) The list concatenation operator, which stitches two things together to make a new whole, without changing either original list.

The insert() method is another way to add to a list. This method takes two arguments (in order): an index to insert at, and the object to insert. You can also insert by slicing, something like this: li[2:2] = [var_to_insert]. The first one is somewhat clearer, so it might be preferred unless you have very particular reasons for doing the other one.

#using insert
print 'using insert'
print li1
li1.insert(1,'inserted!')
print li1
print
 
#using slicing to insert
print 'using slicing to insert'
print li2
li2[1:1] = ['inserted2!']
print li2
print
 


$ ./lists.py
using insert
['wabbit', 'season', 'duck', 'season', 'duck', 'season']
['wabbit', 'inserted!', 'season', 'duck', 'season', 'duck', 'season']

using slicing to insert
['duck', 'season']
['duck', 'inserted2!', 'season']


And if you didn't get what these lists were talking about, here's a clip: Duck season, wabbit season

Shrinking a list


Let's comment out everything from before and begin with a new list.

#Initialize a new list
li= ['Why','do','superheroes','wear','spandex?']
print li
print
 
#remove an item from the list
print 'using del'
del li[3]
print li
print
 
#remove the last item from the list
word=li.pop()
print 'using pop'
print li
print word
print
 
#remove the last item from a list by slicing
print 'using slicing to remove last item from list'
another_li=li[:-1]
print another_li
print
 



$./lists.py

['Why', 'do', 'superheroes', 'wear', 'spandex?']

using del
['Why', 'do', 'superheroes', 'spandex?']

using pop
['Why', 'do', 'superheroes']
spandex?

using slicing to remove last item from list
['Why', 'do']


Here, we are removing things from lists in two ways:

1) The built-in function del removes a particular item from the list.

2) The list method pop() removes the last item from the list and returns the variable.

The last way, removal by slicing, is essentially redefining the list to be everything except the last item of the original list.

Changing lists in place


In addition to adding things to lists and taking them away again, we can also change lists in place.

#and now for some numbers
#create list of zeros
noLi = 4*[0]
# This is equivalent to:
#  noLi = [0] + [0] + [0] + [0]
# Once you understand how the addition operator works, the multiplication operator
#  works in an analogous way.
#  3*4 is the same as adding 3 to itself 4 times, so
#  [3] * 4 is equivalent to adding the list [3] to itself 4 times.
print noLi
print
 
#modify items in list
mice_brain = 10
rat_brain = 20
human_brain = 500
noLi[2] = mice_brain
noLi[1] = rat_brain
noLi[3] = human_brain
print noLi
print
 
#sort list
print 'sorted list!'
noLi.sort()
print noLi
 
#reverse order
print 'reverse the list'
noLi.reverse()
print noLi
print
 
 
# sorting into a new list
print 'sorted() vs .sort()'
another_noLi = sorted(noLi)
print noLi
print another_noLi
print
 
#sort string list
print li
li.sort()
print li
li.reverse()
print li
print
# Why is it not sorting in alphabetical order?
 

$ ./lists.py
[0, 0, 0, 0]

[0, 20, 10, 500]

sorted list!
[0, 10, 20, 500]
reverse the list
[500, 20, 10, 0]

sorted() vs .sort()
[500, 20, 10, 0]
[0, 10, 20, 500]

['Why', 'do', 'superheroes']
['Why', 'do', 'superheroes']
['superheroes', 'do', 'Why']

Now, we've modified the list in a couple of important ways:

1) Overwritten items in the list using slices.

2) Sorted the list using the method sort() and the function sorted(). sort() works on the list in place, while sorted() returns a new list.

3) Reversed the order of the list using the method reverse().

These are, by the way, all demonstrations of the mutability of lists: unlike when you change a string or a number, you don't have to perform an operation and store the result because these operations change the list in place.

Characterizing lists

#figure out how long a list is
print li
print len(li)
print noLi
print len(noLi)
print 'Max =', max(noLi)
print 'Min =', min(noLi)
print
 
#iterate over list
for x in li: print x
print
 
#find where something is stored
idx = li.index('Why')
print idx
 

$ ./lists.py
['superheroes', 'do', 'Why']
3
[500, 20, 10, 0]
4
Max = 500
Min = 0

superheroes
do
Why

2

Here, we have started to characterize our lists.

1) The built-in functions len(), max() and min() tell us how many items are in the list and the maximum and minimum values in the list.

2) The list method index() tells us where an item is in the list.

3) We can iterate over each item in the list and print it using the syntax for x in [ ]:

Tuples


Tuples were used in yesterday's exercise and now that we are familiar with lists, things will make more sense. A tuple is essentially a list that you can not change. You can index, slice them and add them together to make new tuples but not use sort(), reverse(), delete or remove items from them. If you ever have a tuple that you want to change, you have to turn it into a list.

Let's make a new file called 'tuples.py'.
#!/usr/bin/env python
SNP = ('chrII', '378445')
print type(SNP)
 
for i in SNP: print i
 
#Let's see if python will let us change the SNP tuple:
SNP[0] = 'chrV'
print SNP
 
#What if we first coerce the tuple to a list?
SNP = list(SNP)
print type(SNP)
SNP[0] = 'chrV'
print SNP
 
At first,
<type 'tuple'>
chrII
378445
Traceback (most recent call last):
File "./tuples.py", line 137, in <module>
SNP[0] = 'chrV'
TypeError: 'tuple' object does not support item assignment

After commenting out
#SNP[0] = 'chrV'
#print SNP
we get
<type 'tuple'>
chrII
378445
<type 'list'>
['chrV', '378445']

Dictionaries


You can imagine a dictionary as just that -- a dictionary. To retrieve information out of it, you look up a word, called a key, and you find information associated with that word, called the key's value.

To create a dictionary, you write each key-value pair as key:value, divide the pairs with commas, and surround the entire structure with curly braces.

Create a file: emacs dictionaries.py &

#!/usr/bin/env python
 
#initialize a couple of dictionaries
names = {'aisha':'ellahi', 'courtney':'french', 'melinda':'yang'}
drinks = {'james':'milk', 'fernando':'coffee', 'jeff':'soda'}
wines = {'red':'cabernet','white':'pinot grigio',\
'sparkling':'blanc de noirs', 'sticky':'muscato'}
 
#print dictionaries
print names
print drinks
print wines
print
 
#find a particular value
print names['courtney']
print drinks['fernando']
print wines['sparkling']
print



$ ./dictionaries.py
{'aisha': 'ellahi', 'melinda': 'yang', 'courtney': 'french'}
{'james': 'milk', 'fernando': 'coffee', 'jeff': 'soda'}
{'white': 'pinot grigio', 'sparkling': 'blanc de noirs', 'red': 'cabernet', 'sticky': 'muscato'}

french
coffee
blanc de noirs


Just like lists, dictionaries are special and have their own useful features. They are of the form {key:value}.

Adding to a dictionary

#add single item to dictionary
print drinks
drinks['courtney'] = 'diet coke'
print drinks
print
 
#combine two dictionaries
print names
print drinks
names.update(drinks)
print names
print drinks
print

$ ./dictionaries.py
{'james': 'milk', 'fernando': 'coffee', 'jeff': 'soda'}
{'james': 'milk', 'courtney': 'diet coke', 'fernando': 'coffee', 'jeff': 'soda'}

{'aisha': 'ellahi', 'melinda': 'yang', 'courtney': 'french'}
{'james': 'milk', 'courtney': 'diet coke', 'fernando': 'coffee', 'jeff': 'soda'}
{'jeff': 'soda', 'melinda': 'yang', 'aisha': 'ellahi', 'courtney': 'diet coke', 'fernando': 'coffee', 'james': 'milk'}
{'james': 'milk', 'courtney': 'diet coke', 'fernando': 'coffee', 'jeff': 'soda'}

We are adding to the dictionaries in two ways:

1) Adding a single new item to the dictionary by defining a new key:value pair

2) Merging two dictionaries together using the update() method. How is this different from the list method append()? What happened to the value for the key 'courtney' in the names dictionary?

Shrinking a dictionary


#remove a particular item from a dictionary
print names
del names['courtney']
print names
print
 


$ ./dictionaries.py
{'jeff': 'soda', 'melinda': 'yang', 'aisha': 'ellahi', 'courtney': 'diet coke', 'fernando': 'coffee', 'james': 'milk'}
{'jeff': 'soda', 'melinda': 'yang', 'aisha': 'ellahi', 'fernando': 'coffee', 'james': 'milk'}

We have now deleted a key:value pair from the dictionary. What happens when you try the list method pop()? What about sort()? And sorted()?

Changing dictionaries in place


#modify items in dictionary
print names
names['fernando'] = 'racimo'
print names
print
 

$ ./dictionaries.py
{'jeff': 'soda', 'melinda': 'yang', 'aisha': 'ellahi', 'fernando': 'coffee', 'james': 'milk'}
{'jeff': 'soda', 'melinda': 'yang', 'aisha': 'ellahi', 'fernando': 'racimo', 'james': 'milk'}

We have changed a values in place by modifying the key:value pair.

Characterizing dictionaries
#identify components of the list
keys = wines.keys()
values = wines.values()
print keys
print values
print
 
#iterate over keys
for x in keys:
    print 'The category is', x, 'and the varietal is', wines[x]
print
 
#sort keys and iterate over keys
keys.sort()
for x in keys:
    print 'The category is', x, 'and the varietal is', wines[x]
print
 
#find if something is stored
if 'red' in wines: print wines['red']
print

$ ./dictionaries.py

['white', 'sparkling', 'red', 'sticky']
['pinot grigio', 'blanc de noirs', 'cabernet', 'muscato']

The category is white and the varietal is pinot grigio
The category is sparkling and the varietal is blanc de noirs
The category is red and the varietal is cabernet
The category is sticky and the varietal is muscato

The category is red and the varietal is cabernet
The category is sparkling and the varietal is blanc de noirs
The category is sticky and the varietal is muscato
The category is white and the varietal is pinot grigio

cabernet

Here, we have started to characterize our dictionaries.

1) The dictionary methods keys() and values() return lists containing keys or values. These lists can be stored and acted on as lists.

2) We can iterate over the keys and print the values using the syntax for x in [ ]:
NOTE: The variable that is changing is the KEY not the value.

3) We can quickly find out if a particular key already exists. Note that each key must be unique, but multiple keys can have the same value.

Summary So Far...


Lists are:

1) ordered collections of arbitrary variables.
2) accessible by slicing.
3) can be grown or shrunk in place.
4) mutable (can be changed in place).
5) defined with list = [X,Y]

Dictionaries are:

1) unordered collection of arbitrary variables.
2) accessible by keys.
3) can be grown or shrunk in place.
4) mutable.
5) defined with dict = {X:Y}

List methods include: append(), extend(), insert(), pop(), sort(), reverse(), index()
Dictionary methods include: update(), keys(), values(), pop()--but pop() works a bit differently compared to how it's used in lists!
Built in functions include: sorted, len, max, min, type

You will use dictionaries and lists almost exclusively in your coding. However, there is a remaining data structure that you should know about to make your life a little easier: Sets. Sets are unordered and unique bags of variables. We'll be talking about these in Session 3.1, so stay tuned!

Questions?


Exercises


1) Get comfortable with new data structures (adapted from Learning Python)
Create a new file. Type in the commands below. Execute the script. Add comments to your script describing what is happening in each line.

L = [1,2,3] + [4,5,6]
print L, L[:], L[:0], L[-2], L[-2:]
L.reverse()
print L
L.sort()
print L
idx = L.index(4)
print idx

print {'a':1, 'b':2}['b']
D = {'x':1, 'y':2, 'z':3}
D['w'] = 0
print D['x'] + D['w']
print D.keys(), D.values(), D.has_key('z')

2) Getting to know your TAs (adapted from Learning Python)
Introduce yourselves to your TAs and find out their names. Create a new file. Define a new list containing the names of the TAs and a second list with the names of your teachers. Add appropriate print statements to the file to answer the following:
a. What happens when you try to index out of bounds (eg. TAs[10])?
b. What about slicing out of bounds (eg.TAs[-100:100], TAs[30:50], TAs[2:10])?
c. What happens when you try to extract a sequence in reverse--with the lower bound greater than the higher bound (e.g. TAs[3:1])? What happens when you try assigning TAs[3:1] = ['?']?
d. Add your own name to the middle of the list of TAs by indexing, then add your neighbor's name to the list of teachers using the insert() method that we discussed in class. See what happens if you use indexing to add the string 'AHHH' to the list of teachers using the index [3:5] or [4:2]. (HINT: The syntax for the insert() method is "listname.insert(index, objectname)")

3) Getting to know your neighbors (adapted from Learning Python)
Introduce yourself to four of your neighbors and find out what lab they work in. Create a new file. Define a new dictionary containing the names of your neighbors as keys and their labs as values. Add appropriate print statements to the file to answer the following:
a. What happens if you try to index a non-existent key (eg print D['Terry'])?
b. What happens if you try to assign to a non-existent key (eg D['Terry'] = '?')?
c. How does this compare to out-of-bound assignments for lists?

4) Pulling it all together
Your boss has asked you to do a small bioinformatics project on LeuT (pdb code 2Q6H), which is the neurotransmitter responsible for transporting antidepressants. To help you out, I am providing a script that will read in a file and save the protein sequence to a list called protSeq. I have commented the code, but there are many things in here you haven't learned yet.
HINT: Protein Databank (PDB) structure files are stored at http://www.pdb.org/. Use the pdb code to find the structure. The structure file can be downloaded from Download Files >> PDB File (text). The structure file must be in the same directory as the script for the script to run properly.

So, copy the code below into a file, and then you can access the list of all amino acids stored as the list variable protSeq. You can start by printing this variable and proving to yourself that the list contains the information you think it does. Then add to the code to answer the following questions:

a. How many total amino acids are in the protein?
b. Print out total count of each amino acid in alphabetical order.

#!/usr/bin/env python
 
#initialize list to store sequence
protSeq = []
#open pdb file
f1 = open('2Q6H.pdb', 'r')
#loop over lines in file
for next in f1:
    #identify lines that contain sequences
    if next[:6] == 'SEQRES':
        #strip away white space and
        #convert line into list
        line = next.strip().split()
        #delete descriptor information
        #at beginning of each line
        del line[:4]
        #loop over amino acids in line
        for aa in line:
            #add to sequence list
            protSeq.append(aa)
#close file
f1.close()

EXTRA CREDIT: Look up the documentation for set(). We'll be learning more about using set() this afternoon but see if you can rewrite part (b) using the new command based on your readings.


Solutions


Alternate solutions are in an iPython Notebook here.

1) Get comfortable with new data structures

##EXERCISE 1##
 
L = [1,2,3] + [4,5,6]
##L is new list of these six elements, [1,2,3,4,5,6]
 
print L, L[:], L[:0], L[-2], L[-2:]
##print list, slice of list with everything from beginning to end,
##slice with everything up to but not including the first element,
##the second to last element, slice of the list from the second to
##last element up to the end of the list
##should be: [1, 2, 3, 4, 5, 6] [1, 2, 3, 4, 5, 6] [] 5 [5, 6]
 
L.reverse()
##reverse the order of the list
 
print L
##print out the list, should be [6,5,4,3,2,1]
 
L.sort()
##sort list into numerical order
 
print L
##print out list, should be [1,2,3,4,5,6]
 
idx = L.index(4)
##assign to the variable idx the position of the number 4 in the list
 
print idx
##print the variable idx, which is 3
 
 
print {'a':1, 'b':2}['b']
##print the value for the key b in the dictionary, which is 2
 
D = {'x':1, 'y':2, 'z':3}
##assign the variable D a dictionary
 
D['w'] = 0
##add a new key:value pair to D
 
print D['x'] + D['w']
##print the sum of the values for 'x' and 'w' in the dictionary, should be 1
 
print D.keys(), D.values(), D.has_key('z')
##print list of keys, list of values, and whether D has the key 'z' (T/F)
##should be ['y', 'x', 'z', 'w'] [2, 1, 3, 0] True
 
Output of ex1.py:
[1, 2, 3, 4, 5, 6] [1, 2, 3, 4, 5, 6] [] 5 [5, 6]
[6, 5, 4, 3, 2, 1]
[1, 2, 3, 4, 5, 6]
3
2
1
['y', 'x', 'z', 'w'] [2, 1, 3, 0] True

2) Getting to know your TAs

##Exercise 2##
##initiate two lists, one of the TAs and another of the teachers
TAs = ["Christopher", "Gavin","Avi","David","Jeff"]
teachers = ["Aisha","Mel","James","Fernando","Courtney"]
 
##A##
##print TAs[10]
##The above line, uncommented, should give you the following error. You cannot call an element of the list at a position in the list that doesn't exist.
'''
Traceback (most recent call last):
  File "ex2.py", line 5, in <module>
    print TAs[10]
IndexError: list index out of range
'''
 
##B##
print TAs[-100:100]
##Everything in the TAs list is printed, because the slice includes everything from position -100 to 100, which includes positions 0 to 4, which are present in the list
print TAs[30:50]
##An empty list is given because you are asking for a slice of the list TAs at position 30 to 50, but at those positions there isn't anything.
print TAs[2:10]
##A list is printed from "Avi" to the end, because those are part of the slice, but nothing is included once the TAs list ends.
 
##C##
print TAs[3:1]
##prints an empty list -- there is nothing between that slice
TAs[3:1] = ['?']
print TAs
##'?' is inserted into the list at position 3
 
##D##
##TAs[2:2] = "your name" ##What happens here? Is this what you want?
TAs[2:2] = ['your name']
print TAs
##'your name' is new element in list at position 2, no other elements deleted
teachers.insert(2,"your neighbor's name")
print teachers
##'your neighbor's name' is added to the teacher list at position 3, no other elemends deleted
teachers[3:5] = ['AHHH']
print teachers
##'AHHH' added at position 3, but Ahhhh, where did 'James' and 'Fernando' go? All of teachers[3:5] was replaced with the new element 'AHHH'.
teachers[4:2] = ['AHHH2']
print teachers
##'AHHH2' added at position 4, but nothing taken out, because going backwards means an empty slice--no part of the list is included in that slice.
 
Output of ex2.py is:
['Christopher', 'Gavin', 'Avi', 'David', 'Jeff']
[]
['Avi', 'David', 'Jeff']
[]
['Christopher', 'Gavin', 'Avi', '?', 'David', 'Jeff']
['Christopher', 'Gavin', 'your name', 'Avi', '?', 'David', 'Jeff']
['Aisha', 'Mel', "your neighbor's name", 'James', 'Fernando', 'Courtney']
['Aisha', 'Mel', "your neighbor's name", 'AHHH', 'Courtney']
['Aisha', 'Mel', "your neighbor's name", 'AHHH', 'AHHH2', 'Courtney']

3) Getting to know your neighbors



##Exercise 3##
 
D = {"Aisha":"Rine", "Sara":"Brenner", "James":"Miller", "Fernando":"Slatkin"}
print D
##Dictionary initializing names as keys and labs as values
 
##A##
##print D["Terry"]
##With the above line uncommented, there is a key error (see below), because the key does not exist.
'''
Traceback (most recent call last):
  File "ex3.py", line 8, in <module>
    print D["Terry"]
KeyError: 'Terry'
'''
 
##B##
D["Terry"] = '?'
print D
##This is a way of adding key:value pairs to dictionaries--Terry is now added.
 
##C##
##This is different from out of bound assignments because no error is returned. Instead, the dictionary is updated.
 
Output of ex3.py is:

{'Aisha': 'Rine', 'Sara': 'Brenner', 'Fernando': 'Slatkin', 'James': 'Miller'}

{'Aisha': 'Rine', 'Sara': 'Brenner', 'Terry': '?', 'Fernando': 'Slatkin', 'James': 'Miller'}


4) Pulling it all Together
#!/usr/bin/env python
###NEW COMMENTS HAVE THREE HASHTAGS!!!
 
#initialize list to store sequence
protSeq = []
#open pdb file
f1 = open('2Q6H.pdb', 'r')  ###Download the pdb file for 2Q6H
#loop over lines in file
for next in f1:
    #identify lines that contain sequences
    if next[:6] == 'SEQRES':
        #strip away white space and
        #convert line into list
        line = next.strip().split()
        #delete descriptor information
        #at beginning of each line
        del line[:4]
        #loop over amino acids in line
        for aa in line:
            #add to sequence list
            protSeq.append(aa)
#close file
f1.close()
 
###My added lines below
##print protSeq
##A list of amino acids are printed, indicating the amino acid chain for 2Q6H
 
##A##
print "The total number of amino acids in the protein is:", len(protSeq)
##Get the number of elements in the list (which is the number of amino acids). The answer is 519.
print
 
 
##B##
D = {} ##initialize empty dictionary D
for AA in protSeq: D[AA] = 0 ##initialize a key value pair where the value will be the count for every amino acid (because all keys are unique, looping over a previous amino acid does not mean an amino acid will appear more than once
for AA in protSeq: D[AA] += 1 ##reloop over file, adding a count every time you loop over that amino acid
for AA in sorted(D.keys()): print AA, '=', D[AA] ##loop over sorted keys, printing the amino acid and the count for that amino acid
 
The output of ex4.py is:
The total number of amino acids in the protein is: 519

ALA = 54
ARG = 21
ASN = 14
ASP = 12
GLN = 6
GLU = 24
GLY = 45
HIS = 6
ILE = 54
LEU = 61
LYS = 19
MET = 13
PHE = 50
PRO = 25
SER = 18
THR = 27
TRP = 16
TYR = 17
VAL = 37

EXTRA CREDIT
setAA = set(protSeq)
print setAA ##Sets cannot have duplicates, so this creates a non-redundant list of all the amino acids in the list protSeq
 
setD = {}
for uniqAA in setAA: setD[uniqAA] = 0 ##Go through unique set and initialize each key value pair.
for AA in protSeq: setD[AA] += 1 ##Do same as in part b
for AA in sorted(setD.keys()): print AA, '=', setD[AA] ##Do same as in part b
###Why is this slightly more efficient than what we did in part b?###
In the changed step, we are looping over a shorter list, so presumably it is faster.