Assignment statements in Python are more interesting than you might think
In this article, we will take a deep look at three kinds of assignment statements in Python and discuss what’s going on under the hood.
What we find may surprise you.
What happens when the right hand side is a simple expression?
The first case is the easiest, so let us start with that.
In simple terms, this creates a string in memory and assigns a name to it. If you are using CPython, we can even check the memory address explicitly by using the built-in function id().
That big number denotes where the data lives in memory. It will be very useful for us throughout this discussion.
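As a minimal sketch of the step described above (the variable name greeting is an assumption, chosen only for illustration), binding a name to a string and inspecting its identity could look like this:

```python
# Bind a name to a string object.
greeting = "Hello World"

# In CPython, id() returns the memory address of the object as an integer.
address = id(greeting)
print(address)       # a large integer that differs between runs
print(hex(address))  # the same value written in hexadecimal
```

The exact number is not meaningful in itself. What matters is whether two names report the same id or different ids.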
What happens if we create another string with the same value?
Does it reuse the previous “Hello World” stored in memory or does it create an independent copy? Let’s check this by querying the id function again.
This outputs a different id, so this must be an independent copy. We conclude that:
Assignment statements where the right hand side is a simple expression create independent copies every time.
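Here is a small sketch of that observation, assuming CPython. The names first and second are illustrative. The strings are built at runtime on purpose: when the same literal appears twice in one script file, the CPython compiler may merge the two constants, which would hide the effect.

```python
# Build two equal strings at runtime so the compiler cannot merge them.
first = " ".join(["Hello", "World"])
second = " ".join(["Hello", "World"])

print(first == second)          # True: the values are equal
print(first is second)          # False in CPython: two separate objects
print(id(first) == id(second))  # False in CPython
```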
For everyday programming, this is the rule to remember, but there are some weird exceptions to it. Here’s an example.
In this case, two consecutive assignment statements did not create independent copies. Why?
It gets interesting now.
To optimize memory use, Python treats a special set of objects differently. The string in the example above belongs to this privileged set and behaves differently. The exact set depends on the implementation, such as CPython, PyPy, Jython or IronPython. For CPython, the special rule applies to:
- Strings without whitespace and shorter than 20 characters (the exact criteria vary between CPython versions) and
- Integers from -5 to 256.
These objects are reused, or interned, instead of being created again. The rationale behind doing this is as follows:
- Since programmers use these objects frequently, interning existing objects saves memory.
- Since immutable objects like tuples and strings cannot be modified, there is no risk in interning the same object.
However, Python does not do this for all immutable objects because the feature has a runtime cost. To intern an object, Python must first search for an existing equal object, and searching takes time. This is why the special treatment only applies to small integers and strings, because finding them is not that costly.
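The reuse described above can be observed with a short sketch. The results marked "in CPython" are implementation details and may differ on other interpreters. The values 100 and 1000000 are arbitrary picks inside and outside the small-integer range. sys.intern() is the standard library function for interning a string explicitly.

```python
import sys

# Small integers: CPython keeps a cache of commonly used values.
small_a = int("100")
small_b = int("100")
print(small_a is small_b)  # True in CPython (cached small integer)

# Larger integers are created fresh each time.
big_a = int("1000000")
big_b = int("1000000")
print(big_a is big_b)      # False in CPython

# Strings built at runtime are not interned automatically...
s1 = " ".join(["Hello", "World"])
s2 = " ".join(["Hello", "World"])
print(s1 is s2)            # False in CPython

# ...but sys.intern() hands back one shared object per distinct value.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)            # True
```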
What happens when the right hand side is an existing Python variable?
Let’s move on to the second type of assignment statement where the right hand side is an existing Python variable.
In this case, nothing is created in memory. After the assignment, both variables refer to the already existing object. It’s basically like giving the object an additional nickname or alias. Let’s confirm this by using the id function.
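A minimal sketch of aliasing, with illustrative names original and alias. A list is used here so that the shared identity also shows up when the object is modified:

```python
original = [1, 2, 3]
alias = original  # no new object is created; this is just a second name

print(alias is original)          # True
print(id(alias) == id(original))  # True

# A change made through one name is visible through the other.
alias.append(4)
print(original)                   # [1, 2, 3, 4]
```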
The natural question at this stage is: what if, instead of just giving the existing object an alias, we wanted to create an independent copy?
For mutable objects, this is possible. You can either use the copy module of Python (which works on all objects) or you may use copy methods specific to the class. For a list, you have several possibilities for creating copies, all of which have different runtimes.
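As a sketch, here are four common ways to copy a list. This selection is an assumption and may not match the exact set the article originally listed. Each one produces a new list object with equal contents:

```python
import copy

numbers = [1, 2, 3]

copies = {
    "copy.copy(numbers)": copy.copy(numbers),
    "numbers.copy()": numbers.copy(),
    "list(numbers)": list(numbers),
    "numbers[:]": numbers[:],
}

for label, duplicate in copies.items():
    # Each duplicate is equal in value but is a distinct object.
    print(label, duplicate == numbers, duplicate is numbers)
```

Every line of output should end with True False: equal contents, different object.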
How can you copy an immutable object? Well…you can’t! At least not in a straightforward way. If you try to use the copy module or the slicing notation, you will get back the same object and not an independent copy. Here’s proof.
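A short sketch of this behavior, assuming CPython for the slicing and constructor results (the names point and text are illustrative):

```python
import copy

point = (1, 2, 3)
print(copy.copy(point) is point)  # True: the copy module returns the same tuple
print(point[:] is point)          # True in CPython
print(tuple(point) is point)      # True in CPython

text = "Hello World"
print(copy.copy(text) is text)    # True
print(text[:] is text)            # True in CPython
```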
More importantly, there is no reason for explicitly copying an immutable object anyway. We will see why in a moment when we discuss the third kind of assignment statement.
What happens when the right hand side is an operation?
In this case, what happens depends on the result of the operation. We will discuss two simple cases:
- adding an element to an immutable object (like a tuple) and
- adding an element to a mutable object (like a list).
Let’s start with the case of the tuple.
When you add a new element to a tuple using +=, this creates a new object in memory. The immutability of tuples is key to understanding this. Since tuples are immutable, any operation that leads to a changed tuple results in an independent copy.
This is the reason why you don’t need to explicitly copy immutable objects: it happens automatically under the hood. Here’s an example.
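A minimal sketch with illustrative names: a second reference to the original tuple is kept so that both objects can be compared after the augmented assignment.

```python
numbers = (1, 2)
before = numbers          # keep a second reference to the original tuple

numbers += (3,)           # builds a new tuple and rebinds the name

print(numbers)            # (1, 2, 3)
print(before)             # (1, 2)
print(numbers is before)  # False
```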
The situation is quite different, and much more confusing, for mutable objects. Let’s try the same example, but now for lists.
Mutable objects can be modified in place. Some operations modify the list in place and some operations don’t. In this case, the += statement calls __iadd__ and modifies the existing object in place.
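A sketch of the in-place case, using an augmented assignment on a list (the names numbers and alias are illustrative):

```python
numbers = [1, 2]
alias = numbers          # second name for the same list

numbers += [3]           # list.__iadd__ extends the existing list in place

print(numbers is alias)  # True
print(alias)             # [1, 2, 3]
```

Because no new object is created, every other name that refers to the list sees the change.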
To make things doubly confusing, we would have completely different results if we used a slightly different notation.
Woah! What’s going on? What changed?
It turns out that when we change the third line, Python internally calls a different function, __add__, instead of __iadd__. This function returns a new list instead of modifying the existing list in place.
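For comparison, here is a sketch of the same steps written with the plain + operator followed by an assignment (illustrative names again):

```python
numbers = [1, 2]
alias = numbers          # second name for the same list

numbers = numbers + [3]  # list.__add__ builds a new list; the name is rebound

print(numbers)           # [1, 2, 3]
print(alias)             # [1, 2]
print(numbers is alias)  # False
```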
To prevent this confusion, it is always better to create a true copy of the list if you wish to prevent modification to the original.
Let’s remember the list copy methods from before, such as the copy module and the slicing notation. This is what we should use.
There’s one last gotcha that can happen when copying lists.
Suppose we have a list that has a nested list inside it. We copy this list and then modify the nested list in the copy. Unfortunately, this will modify the original list again!
Why did that happen? Didn’t we just copy the original list?
The truth is: we actually don’t have a completely independent copy in this case. The copy function we used generates a shallow copy. To see what it does, let’s look at the ids of all the elements in the original list and the ids of all the elements in the copied list.
We see that the ids of the original list and the copied list are indeed different, indicating that the second list is a copy. But the elements contained in the copied list have the same ids as the elements in the original list. So the elements have not been copied!
This is the property of shallow copy. It creates a new copy of the object but reuses the attributes and elements of the old copy. Thus, when you modify the elements of the new copy, you are modifying the elements of the old copy too.
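The following sketch reproduces the gotcha. It uses copy.copy() as the shallow copy, which is an assumption about the function the article had in mind. The names original and shallow are illustrative.

```python
import copy

original = [1, 2, [3, 4]]
shallow = copy.copy(original)

print(shallow is original)        # False: the outer list is a new object
print(shallow[2] is original[2])  # True: the nested list is shared

# Modifying the nested list through the copy also changes the original.
shallow[2].append(5)
print(original)                   # [1, 2, [3, 4, 5]]

# The elements of both lists have identical ids.
print([id(x) for x in original] == [id(x) for x in shallow])  # True
```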
To solve this problem, we need to copy an object along with all its attributes and elements. This can be achieved with copy.deepcopy().
Deep copy is quite a time-intensive operation and can take 10 times longer to complete compared to a shallow copy. But in some situations, it is unavoidable.
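A sketch of the same nested-list scenario with copy.deepcopy() from the standard library (the names original and deep are illustrative):

```python
import copy

original = [1, 2, [3, 4]]
deep = copy.deepcopy(original)

# The nested list was copied too, so this change stays local to the copy.
deep[2].append(5)

print(deep)                    # [1, 2, [3, 4, 5]]
print(original)                # [1, 2, [3, 4]]
print(deep[2] is original[2])  # False
```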
This brings me to the end of this discussion. To summarize, we have talked about the different scenarios which can arise in an assignment statement in Python. We found that:
- When the right hand side is a simple expression, a new copy is created every time. There are some exceptions to this rule, which depend on the implementation.
- When the right hand side is an existing Python variable, an alias is created for the existing object.
- When the right hand side is an operation, then the outcome depends on the operation. In a simple case involving a tuple, we saw that an independent copy was created. In the same case with lists, we saw that the list was modified in place in one case (when we used +=) and a new copy was generated in another case (when we used +).
- Mutable objects can be copied but immutable objects cannot be copied in a straightforward way. There is also no need to copy immutable objects.
- To copy a mutable object along with all its attributes and elements, we need to use deep copy.
That’s it for today. Thanks for reading so far. As always, I love reading your comments and discussing further. So don’t hesitate to respond in the comment section.
It looks like you are confusing list comprehension with looping constructs in Python.
A list comprehension produces -- a list! It does not lend itself to a single assignment in an existing list. (Although you can torture the syntax to do that...)
While it isn't exactly clear what you are trying to do from your code, I think it is more similar to looping over the list (flow control) vs producing a list (list comprehension)
Loop over the list like this:
That is a reasonable way to do this, and is what you would do in C, Pascal, etc. But you can also just test the list for the one value and change it:
Or, if you don't know the index:
or, if you have a list of lists and want to change each first element of each sublist:
etc, etc, etc.
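As a rough sketch of the in-place approaches described above, using made-up data (the lists and the replacement values are assumptions chosen only for illustration):

```python
values = [3, 7, 7, 9]

# 1. Loop with an index and replace matching elements in place.
for i, value in enumerate(values):
    if value == 7:
        values[i] = 0
print(values)  # [3, 0, 0, 9]

# 2. Replace a single value when its index is not known in advance.
names = ["a", "b", "c"]
if "b" in names:
    names[names.index("b")] = "B"
print(names)   # ['a', 'B', 'c']

# 3. Change the first element of every sublist in a list of lists.
rows = [[1, 2], [3, 4], [5, 6]]
for row in rows:
    row[0] = 0
print(rows)    # [[0, 2], [0, 4], [0, 6]]
```

All three modify the existing list objects rather than building new lists, which is the key difference from a list comprehension.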
If you want to apply something to each element of a list, then you can look at using a list comprehension, or map, or one of the many other tools in the Python kit.
Personally, I usually tend to use list comprehensions more for list creation:
Which is more natural than:
But to modify that same list of lists, which is more understandable?
Here is a great tutorial on this.