Python Strings and Lists -- Quick Reference

To see the full documentation for both the str and list classes:

$ python3
>>> help(str)
>>> help(list)

Sequences

A string is a sequence of characters between single or double quotes. A list is a sequence of items, where each item can be anything (an integer, a float, a string, etc.).

Both strings and lists have lengths: a string's length is the number of characters in the string; a list's length is the number of items in the list.

>>> S = "hello"
>>> L = ["a","b","zebra"]
>>> len(S)
5
>>> len(L)
3

Each character in a string, and each item in a list, has a position, also called an index. In Python, positions start at 0, so the "h" in the above string is at position 0, and the "o" is at position 4 (note: one less than the length of the string). In the list above, "zebra" is at position 2.

You can access individual characters in a string, or items in a list, using square-bracket indexing:

>>> print(L[2])
zebra
>>> print(S[2])
l
>>> print(S[3])
l
>>> print(S[4])
o
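As a small sketch of the indexing rule above: the last valid position is always len(...) - 1, and asking for a position past the end raises an IndexError. The helper name last_item is hypothetical, made up for this illustration.

```python
def last_item(seq):
    """Return the item at the final position of a non-empty string or list."""
    return seq[len(seq) - 1]   # positions run from 0 to len(seq) - 1


S = "hello"
L = ["a", "b", "zebra"]
print(last_item(S))   # o
print(last_item(L))   # zebra

# Position len(S) does not exist, so indexing it raises IndexError.
try:
    S[len(S)]
except IndexError:
    print("index out of range")
```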

Any sequence in Python can be used in a for loop. For strings, we can loop over either the characters in the string or the indices (0 to len(S)-1). For lists, we can loop over either the items in the list or the indices. Here are a few examples:

>>> S = "hello"
>>> for ch in S:
...   print(ch)
...
h
e
l
l
o
>>> for i in range(len(S)):
...   print(i,S[i])
...
0 h
1 e
2 l
3 l
4 o
>>>
>>> L = ["a","b","zebra"]
>>> for i in range(len(L)):
...   print(L[i])
...
a
b
zebra
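The two looping styles suit different jobs. In the sketch below (both function names are hypothetical), looping over characters is enough to count matches, while looping over indices is needed when the positions themselves matter.

```python
def count_char(text, target):
    """Count how many characters in text are equal to target."""
    count = 0
    for ch in text:                  # loop over the characters
        if ch == target:
            count = count + 1
    return count


def positions_of(text, target):
    """Return a list of every index at which target appears in text."""
    found = []
    for i in range(len(text)):       # loop over the indices
        if text[i] == target:
            found = found + [i]      # grow the list by one item
    return found


print(count_char("hello", "l"))      # 2
print(positions_of("hello", "l"))    # [2, 3]
```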

mutable vs immutable

Strings are immutable, which means you cannot change individual characters in a string. For example, this doesn't work:

>>> name = "jeff"
>>> name[0] = "J"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

However, you can create new strings from parts of existing strings. Slicing (using [:] to pick out part of the original string) and concatenation (using the + operator) are useful for doing this:

>>> name = "jeff"
>>> name = "J" + name[1:]
>>> name
'Jeff'
>>> print(name)
Jeff
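The same slice-and-concatenate idea can be sketched as a general helper for "changing" the character at any position. The name replace_at is hypothetical; the function builds and returns a new string and leaves the original untouched.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced."""
    # everything before index + the new character + everything after index
    return text[:index] + new_char + text[index + 1:]


name = "jeff"
fixed = replace_at(name, 0, "J")
print(fixed)   # Jeff
print(name)    # jeff  (the original string is unchanged)
```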

Unlike strings, lists are mutable, which means that their contents can be modified without having to create a new list.

>>> L = ["a","b","zebra"]
>>> L[2] = "c"       # assign to position 2 in the list
>>> print(L)
['a', 'b', 'c']
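One consequence of mutability, sketched here with a hypothetical swap function: a function can modify a list it is given, and the caller sees the change without anything being returned.

```python
def swap(items, i, j):
    """Exchange the items at positions i and j, modifying the list in place."""
    temp = items[i]
    items[i] = items[j]
    items[j] = temp


L = ["a", "b", "zebra"]
swap(L, 0, 2)
print(L)   # ['zebra', 'b', 'a']
```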

accumulating strings and lists

We can use the accumulator pattern to grow/create both a string and a list:

>>> S = ""
>>> for i in range(5):
...    S = S + str(i)
...    print(S)
...
0
01
012
0123
01234

In the above code, str(i) converts the current value of i to a string. This is then added to the current value of S, and assigned back to the variable S.

You can do the same thing with lists, but there's a better way (see the append() method below). Still, it is possible to add lists together:

>>> L = []
>>> for i in range(5):
...    L = L + [str(i)]
...    print(L)
...
['0']
['0', '1']
['0', '1', '2']
['0', '1', '2', '3']
['0', '1', '2', '3', '4']

This time we again convert i to a string, then add it as a one-item list ([str(i)]) to the existing list, then assign the result back to the original variable. It's clunky, but it works (again, see append() below for a better way).
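Two more uses of the accumulator pattern, as a rough sketch (the names reverse_string and squares are made up for this example). Adding each new character to the front of the accumulator, rather than the end, builds the string in reverse.

```python
def reverse_string(text):
    """Build a reversed copy of text one character at a time."""
    result = ""
    for ch in text:
        result = ch + result     # new character goes on the front
    return result


print(reverse_string("hello"))   # olleh

# The same pattern with a list: accumulate the first five squares.
squares = []
for i in range(5):
    squares = squares + [i * i]
print(squares)                   # [0, 1, 4, 9, 16]
```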

comparing strings

Each character in a string has a numeric value. Python 3 strings are Unicode, and for the characters used here (+,-./0123....ABCD...XYZ...abcd...xyz...) those values match the ASCII encoding scheme.
When you compare strings, they are compared character by character using these numeric values. It is not important to know the ASCII values, but you should remember that the values for 'a' to 'z', 'A' to 'Z' and '0' to '9' are each contiguous, and that every uppercase letter has a smaller value than every lowercase letter. This means comparing two lowercase strings works as you would expect:

>>> "apple" < "banana"
True
>>> "zebra" > "pony"
True

and comparing two uppercase strings also 'works':

>>> "Rich" > "Jeff"
True
>>> "Jeff" > "Sara"
False
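Mixing cases is where comparisons can surprise you: every uppercase letter has a smaller numeric value than every lowercase letter, so "Zebra" sorts before "apple". One common workaround, sketched below with a hypothetical helper name, is to compare lowercase copies of both strings.

```python
# 'Z' has a smaller numeric value than 'a', so this is True.
print("Zebra" < "apple")                 # True


def comes_before(a, b):
    """Compare two strings alphabetically, ignoring upper/lower case."""
    return a.lower() < b.lower()


print(comes_before("Zebra", "apple"))    # False
print(comes_before("apple", "Zebra"))    # True
```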

You can also test if a substring is present in another string using the in (membership) operator:

>>> "a" in "computer science"
False
>>> "the" in "well hello there"
True
>>> "LL" in "hello"
False

The in operator works for lists, too:

>>> positives = ["y","Y","yes","Yes","YES"]
>>> "no" in positives
False
>>> "yes" in positives
True
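Note one difference between the two uses of in: on a string it looks for a substring, while on a list it looks for a whole item. The sketch below wraps the list test in a hypothetical is_yes function.

```python
def is_yes(answer):
    """Return True if answer is one of the accepted 'yes' spellings."""
    positives = ["y", "Y", "yes", "Yes", "YES"]
    return answer in positives


print(is_yes("Yes"))      # True
print(is_yes("no"))       # False

# Substring test on a string vs. whole-item test on a list:
print("ye" in "yes")      # True  ("ye" is a substring of "yes")
print(is_yes("ye"))       # False ("ye" is not an item in the list)
```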

If you are curious, you can convert a single character to its ASCII value using the ord() function (and convert back using chr()).
The ASCII values alone are not that useful, but comparing them and doing arithmetic with them often is.

>>> ch = "C"
>>> ord(ch)
67
>>> difference = ord(ch) - ord("A")
>>> print(difference)
2
>>> ord(ch) + difference
69
>>> chr(ord(ch) + difference)
'E'
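The ord()/chr() arithmetic above can be turned into a small letter-shifting function. This is only a sketch: the name shift_upper is hypothetical, and it assumes its argument is a single uppercase letter from A to Z. The % 26 wraps around so that shifting past "Z" starts again at "A".

```python
def shift_upper(ch, amount):
    """Shift an uppercase letter forward by amount, wrapping after Z."""
    offset = ord(ch) - ord("A")            # 0 for A, 1 for B, ..., 25 for Z
    new_offset = (offset + amount) % 26    # wrap around the alphabet
    return chr(ord("A") + new_offset)


print(shift_upper("C", 2))   # E
print(shift_upper("Z", 1))   # A
```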

common string methods

Strings are objects in Python, and thus have methods that we can invoke on them. The str class has many methods for creating new string objects from existing string objects and for testing properties of strings. Keep in mind that strings are immutable, so these methods never change the original string!

Here are a few str methods that may be particularly useful (run help(str) in the Python interpreter to see the full set):

str method    result
upper()       return copy of str converted to uppercase
lower()       return copy of str converted to lowercase
isalpha()     return True if string is all alphabetic characters
isdigit()     return True if string is all digits
count(sub)    return number of occurrences of sub
index(sub)    return index of first occurrence of sub
strip()       strip off leading and trailing whitespace
split()       split into list of "words" (see below)

>>> S = "we LOVE cs"
>>> S.upper()
'WE LOVE CS'
>>> S.lower()
'we love cs'
>>> S.isalpha()
False
>>> S.isdigit()
False
>>> S.count(" ")
2
>>> S.index("L")
3
>>> S.split()
['we', 'LOVE', 'cs']
>>> S = "   we love cs    "
>>> len(S)
17
>>> S = S.strip()
>>> len(S)
10
>>> print(S)
we love cs
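Two practical points, sketched with hypothetical helper names: because each method returns a new string, calls can be chained, and index() raises a ValueError when the substring is not present, so it is worth guarding when the substring might be missing.

```python
def clean(text):
    """Return a copy of text with outer whitespace removed, in lowercase."""
    return text.strip().lower()       # strip() result feeds into lower()


def find_or_minus_one(text, sub):
    """Return the index of the first sub in text, or -1 if it is absent."""
    try:
        return text.index(sub)
    except ValueError:                # index() raises this when sub is missing
        return -1


S = "   We LOVE cs  "
print(clean(S))                                # we love cs
print(find_or_minus_one("we LOVE cs", "L"))    # 3
print(find_or_minus_one("we LOVE cs", "z"))    # -1
```

The built-in find() string method behaves like find_or_minus_one: it returns -1 instead of raising an error.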

common list methods

Lists are also objects in Python, and thus have methods that we can invoke on them. Unlike the str methods, most of these modify the list itself rather than returning a new one. Here are a few that may be particularly useful (run help(list) in the Python interpreter to see the full set):

list method     result
append(item)    add item to end of list
extend(L1)      add each item of list L1 to end of list
sort()          sort the list in place
reverse()       reverse the list in place
count(item)     return number of occurrences of item in list
index(item)     return index of first occurrence of item
pop(index)      remove and return item at index

>>> L = list("ABCDEFG")
>>> print(L)
['A', 'B', 'C', 'D', 'E', 'F', 'G']
>>> L.append("X")
>>> print(L)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'X']
>>> L.extend(["Y","Z"])
>>> print(L)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'X', 'Y', 'Z']
>>> L.reverse()
>>> print(L)
['Z', 'Y', 'X', 'G', 'F', 'E', 'D', 'C', 'B', 'A']
>>> L.sort()
>>> print(L)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'X', 'Y', 'Z']
>>> L.count("E")
1
>>> L.index("E")
4
>>> L.pop(4)
'E'
>>> print(L)
['A', 'B', 'C', 'D', 'F', 'G', 'X', 'Y', 'Z']
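Here is the append() version of the list accumulator from earlier, along with a common pitfall. This is a sketch with made-up variable names: sort() rearranges the list in place and returns None, so assigning its result to a variable does not give you the sorted list.

```python
# Accumulate with append() instead of L = L + [item].
digits = []
for i in range(5):
    digits.append(str(i))
print(digits)     # ['0', '1', '2', '3', '4']

# sort() changes the list itself and returns None.
letters = ["c", "a", "b"]
result = letters.sort()
print(result)     # None
print(letters)    # ['a', 'b', 'c']
```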

converting between strings and lists

Programs that manipulate strings often need to convert a string to a list, or a list back to a string. Because lists are mutable and strings are not, it is usually easier to make changes to a list and then convert the result back to a string.

To convert any string to a list of individual characters, use the list() function:

>>> S = "ABCDEFG"
>>> L = list(S)
>>> print(L)
['A', 'B', 'C', 'D', 'E', 'F', 'G']

Use the split() string method to split a string into substrings. The substrings are returned as a list. If no argument is given, it will split on whitespace. Using an argument, we can split on any substring:

>>> S = "we love cs"
>>> L = S.split()        # split on whitespace
>>> print(L)
['we', 'love', 'cs']

>>> S = "jeff:rich:sara:david"
>>> L = S.split(":")     # split on colons
>>> print(L)
['jeff', 'rich', 'sara', 'david']

Use the join() string method to convert a list back to a string. Here's a quick example:

>>> print(L)
['jeff', 'rich', 'sara', 'david']
>>> S = "".join(L)
>>> print(S)
jeffrichsaradavid
>>> S = "####".join(L)
>>> print(S)
jeff####rich####sara####david

Whatever string join() is called on ("" or "####") is inserted between adjacent items of the list when making the new string. It does not appear before the first item or after the last one.
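Putting the conversions together, a typical round trip is: convert to a list, modify the list, and join it back into a string. The sketch below shows this at the character level with list() and at the word level with split(); the function name capitalize_words is hypothetical.

```python
# Character level: change one character by going through a list.
chars = list("jeff")
chars[0] = "J"                 # allowed, because lists are mutable
name = "".join(chars)
print(name)                    # Jeff


def capitalize_words(sentence):
    """Return a copy of sentence with the first letter of each word uppercased."""
    words = sentence.split()                  # split on whitespace
    for i in range(len(words)):
        words[i] = words[i][0].upper() + words[i][1:]
    return " ".join(words)                    # rejoin with single spaces


print(capitalize_words("we love cs"))         # We Love Cs
```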

