Memory Management in Lists and Tuples
Python has more than one data structure type to save items in an ordered way. This article looks at lists and tuples to create an understanding of their commonalities and the need for two different data structure types. It also looks at how the memory is managed for both of these types. This article is written with reference to CPython implementation.
The commonalities between lists and tuples are:
■ Both are sequence data types.
■ Each item stored in a list can be of any data type.
■ Elements can be accessed by indexing and slicing.
■ Both allow duplicates.
Let’s see how they are different with examples.
Lists
Lists are mutable in nature, and are sortable. A list can be used to save any kind of object. An example is:
The output is:
Slicing
You can access the contents of a list in the following ways:
The output is:
Mutable
We can edit the values in the list as follows:
The output is:
Memory allocation
We have now come to the crux of this article — how memory is managed while storing the items in the list. We will first see how much memory is currently allocated, and later see
how the size changes each time new items are allocated. Let’s check the memory allocated currently:
The output is:
Here is a common function to see how much memory is allocated before and after values are appended:
Please closely observe the size and memory address of the list before and post update.
Note: In the scenario given below, the memory address didn’t change, but this will not always be the case.
The output is:
As you can see, the size of the list first expanded from 96 to 128, but didn’t change for the next couple of items and stayed there for some time. Then the size expanded to 192. The reason is that in CPython the memory is preallocated in chunks beforehand. This is to avoid making frequent heavy system calls. And if you see, the allocation is not static but mild and linear.
The reallocation happens to extend the current memory needed. So when you have a huge array in need and the realloc does not have so much space, it will create new memory and copy; this will be a very expensive operation. To avoid this, we can preallocate the required memory. The code snippet of C implementation of list is given below. The pictorial representation is given in Figure 1.
Empty list
When an empty list is created, it will always point to a different address. It will also hold preallocated memory as well.
The output is:
Removal and insertion
When we perform removal, the allocated memory will shrink without changing the address of the variable. While performing insert, the allocated memory will expand and the address might get changed as well.
The output is:
Sort
The address of the list doesn’t get changed before and after the sort operation.
The output is given below.
Note: Even after sort, the address remains the same.
Tuples
Let’s observe how tuples are defined, and how they differ in the allocation of memory compared to lists. Tuples are:
■ Immutable in nature
■ Consume less memory as compared to lists
Definition
Here’s a quick example of how a tuple is defined:
The output is:
Slicing
Accessing tuples is the same as lists:
The output is:
Changing the single value
As tuples are immutable in nature, we cannot change their value. You can find the error that comes up while trying to change the value of the tuple as follows:
The output is:
Replacing a tuple with a new tuple
We can overwrite the existing tuple to get a new tuple; the address will also be overwritten:
Changing the list inside tuple
We know that the tuple can hold any value. We have tried to save a list inside tuple. Let’s try editing its value. Can we edit? Will it change the list? Let’s find out:
It has clearly thrown an error, so it should not have updated the values as well:
The output is:
But if you see carefully, the values are appended. This is an edge case where Python behaves strangely. So, putting mutable items in tuples is not a good idea.
Memory allocation
Check the memory allocated – a tuple uses only required memory. It is not overallocated as it is not resizable:
The output is:
Reuse memory
To reduce memory fragmentation and speed up allocations, Python reuses old tuples. If a tuple is no longer needed and has less than 20 items, instead of deleting it permanently, Python moves it to a free list and uses it later.
Note: Though it’s not certain always, the same memory will be used.
The output is:
Empty tuple
When two empty tuples are created, they will point to the same address space. Empty tuples act as singletons, that is, there is always only one tuple with a length of zero. When creating an empty tuple, Python points to the already preallocated one in such a way that any empty tuple has the same address in the memory. This is possible because tuples are immutable, and sometimes this saves a lot of memory:
The output is:
Removal and insertion
We cannot update the existing tuple, but we can create new tuple with it; it will be copied into a new address:
The output is:
Sort
As tuples are immutable, we cannot implicitly sort them. But we can make use of the sort function to do so.
Note: It will return as list.
The output is:
Named tuple
The named tuple and normal tuple use exactly the same amount of memory because the field names are stored in the class. So we can either use tuple or named tuple. However, named tuple will increase the readability of the program. That’s a bonus!
The output is:
The performance is:
Note: As tuples are immutable they have only limited features.
To sum up, we should use lists when the collection needs to be changed constantly. We should use tuples when:
■ Working with arguments
■ Returning two or more items from a function
■ Iterating over a dictionary’s key-value pairs
■ Using string formatting
Lists are complex to implement, while tuples save memory and time (a list uses 3000+ lines of code while tuple needs only 1000+ lines of C code).