What happens when you put lists in a tuple in Python?

In this post, I illustrate some unexpected results when mutable containers are nested inside an immutable container.

Python containers can be classified according to whether they are mutable or immutable. An object whose value can be changed after its initialization is called mutable; otherwise, it is called immutable. For example, list, set, and dict are mutable containers; tuple and frozenset are immutable containers.
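As a quick illustration of the distinction, here is a minimal sketch. The names grades_list and grades_tuple are made up for this example and are not part of the gradebook below.

```python
# Hypothetical names, for illustration only.
grades_list = [1, 10]
grades_list[0] = 10        # lists support item assignment
grades_list.append(5)      # ...and can grow in place

grades_tuple = (1, 10)
try:
    grades_tuple[0] = 10   # tuples do not support item assignment
    tuple_error = None
except TypeError as exc:
    tuple_error = exc

print(grades_list)                   # [10, 10, 5]
print(type(tuple_error).__name__)    # TypeError
```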

Often, you want nested containers, e.g., a list of lists. The need for nested containers arises when you have data that require hierarchical organization to facilitate retrieval or computation. Consider the following gradebook:

harry_potter_record = [1, 10]
hermione_granger_record = [10, 9, 9]
all_records = {('Harry', 'Potter'):harry_potter_record, ('Hermione', 'Granger'):hermione_granger_record}
print(all_records)
{('Harry', 'Potter'): [1, 10], ('Hermione', 'Granger'): [10, 9, 9]}

The keys are tuples, storing first name and then last name in that order. The values are the student’s grades in three subjects. A dict with tuple keys and list values makes sense for the structure inherent in a gradebook’s data: each student is identified by a unique key (their first and last names) that should not be changed, and each student has a list of grades that should be modifiable. For example, suppose Harry’s first grade was entered as 1 but should have been 10, and his third grade was left out by mistake. We’d like to correct his grade list:

all_records[('Harry', 'Potter')][0] = 10
all_records[('Harry', 'Potter')] += [5]
print(all_records)
{('Harry', 'Potter'): [10, 10, 5], ('Hermione', 'Granger'): [10, 9, 9]}

We see that changing the value of a mutable container nested in another mutable container (a list inside a dict in our case) causes no problem.

But what if we stored the individual records as a tuple of lists? While there is no clear reason for such a choice, we’re curious about whether we can change the value of a mutable object nested inside an immutable container.

harry_potter_record = [1, 10]
hermione_granger_record = [10, 9, 9]
all_records_tuple = (harry_potter_record, hermione_granger_record)
all_records_tuple[0][0] = 10
print(all_records_tuple)
([10, 10], [10, 9, 9])

So assigning to an item of the nested list succeeds. What about adding a grade with +=?

all_records_tuple[0] += [5]
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-89-466cb732d9ca> in <module>()
----> 1 all_records_tuple[0] += [5]


TypeError: 'tuple' object does not support item assignment

Appending [5] to the first list raises a TypeError, which is what we would expect. However,

print(all_records_tuple)
([10, 10, 5], [10, 9, 9])

Harry’s record was changed anyway. What happened?

The statement all_records_tuple[0] += [5] works as follows: first, [5] is appended in place to the object referenced by all_records_tuple[0], which is harry_potter_record. This is allowed since it is a list. Next, the updated list is assigned back to all_records_tuple[0]. Since all_records_tuple is a tuple, it is immutable, so the assignment fails and raises the TypeError. However, because all_records_tuple[0] still references harry_potter_record, printing all_records_tuple shows that Harry’s record was indeed updated.
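The steps described above can be spelled out by hand. This is only a sketch of roughly what the += statement does, using operator.iadd from the standard library; the names target, result, same_object and assignment_error are made up for this example.

```python
import operator

harry_potter_record = [1, 10]
hermione_granger_record = [10, 9, 9]
all_records_tuple = (harry_potter_record, hermione_granger_record)

# Step 1: fetch the object stored at index 0 (the list itself, not a copy).
target = all_records_tuple[0]

# Step 2: in-place add. For a list this extends it and returns the same object.
result = operator.iadd(target, [5])
same_object = result is harry_potter_record

# Step 3: store the result back into the tuple. This is the step that fails.
try:
    all_records_tuple[0] = result
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

print(same_object)          # True
print(all_records_tuple)    # ([1, 10, 5], [10, 9, 9])
```

By the time step 3 raises, step 2 has already modified the list, which is why the change survives the error.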

All of this has to do with the fact that Python containers don’t actually store the contained objects, but rather references to them. An immutable container will always refer to the same objects it was given at initialization. If these objects are mutable and changed in place, then the change is reflected in the container. If they are changed with an augmented assignment such as +=, the in-place change happens first and the subsequent assignment back to the immutable container raises a TypeError, yet the change is still reflected in the container.
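Here is a small sketch of the reference-sharing idea. It assumes the same gradebook variables as before; the extra grade 7 is invented for illustration. Calling extend() changes the list in place without assigning anything to the tuple, so no error is raised.

```python
harry_potter_record = [10, 10]
hermione_granger_record = [10, 9, 9]
all_records_tuple = (harry_potter_record, hermione_granger_record)

# The tuple slot and the variable refer to the very same list object.
shares_identity = all_records_tuple[0] is harry_potter_record

# extend() mutates the list in place and involves no assignment to the tuple,
# so no TypeError is raised.
all_records_tuple[0].extend([5])

# A change made through the variable is visible through the tuple as well.
harry_potter_record.append(7)   # 7 is a made-up grade

print(shares_identity)      # True
print(all_records_tuple)    # ([10, 10, 5, 7], [10, 9, 9])
```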

This also explains why changing Harry’s first grade from 1 to 10 worked. His record is a list, so it is mutable. Changing the list’s first value from 1 to 10 is equivalent to changing harry_potter_record[0] to refer to the number 10 instead of the number 1. However, harry_potter_record’s identity remains the same, so there was no assignment to the tuple. Using the function id() makes this clear.

harry_potter_record = [1, 10]
hermione_granger_record = [10, 9, 9]
all_records_tuple = (harry_potter_record, hermione_granger_record)

# Note: id(1) and id(10) match the list items here because CPython caches small integers.
print('Before...')
print('id of number 1: {}'.format(id(1)))
print('id of all_records_tuple[0][0]: {}'.format(id(all_records_tuple[0][0])))  # same as id(1)
print('id of all_records_tuple: {}'.format(id(all_records_tuple)))

all_records_tuple[0][0] = 10
print('\nAfter...')
print('id of number 10: {}'.format(id(10)))
print('id of all_records_tuple[0][0]: {}'.format(id(all_records_tuple[0][0])))  # same as id(10)
print('id of all_records_tuple: {}'.format(id(all_records_tuple)))  # did not change from before

print('\nResult...')
print(all_records_tuple)
Before...
id of number 1: 4301280440
id of all_records_tuple[0][0]: 4301280440
id of all_records_tuple: 4396785248

After...
id of number 10: 4301280224
id of all_records_tuple[0][0]: 4301280224
id of all_records_tuple: 4396785248

Result...
([10, 10], [10, 9, 9])

The Python Data Model documentation says:

The value of an immutable container object that contains a reference to a mutable object can change when the latter’s value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.
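A minimal sketch of what this passage describes: the tuple’s value, as seen by ==, changes while the collection of objects it holds stays the same. The names snapshot, ids_before and ids_after are made up for this example.

```python
import copy

harry_potter_record = [1, 10]
all_records_tuple = (harry_potter_record, [10, 9, 9])

snapshot = copy.deepcopy(all_records_tuple)   # independent copy of the current value
ids_before = [id(item) for item in all_records_tuple]

harry_potter_record[0] = 10                   # change the nested list in place

ids_after = [id(item) for item in all_records_tuple]

print(all_records_tuple == snapshot)   # False: the value changed
print(ids_before == ids_after)         # True: still the same objects
```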

The morals of this example are:

* Putting mutable objects inside an immutable container is a bad idea, both conceptually and implementation-wise.
* Thinking about Python containers as holders of references to objects helps explain their occasionally puzzling behavior.
* Understanding the Python Data Model documentation helps you make better programming decisions.