4. Basic Programming (Data Structures)

This is a continuation of the previous chapter, mainly looking at data structures such as lists.

If you are already familiar with the subject, you may still want to focus on the Practice sections and Exercises. However, if you have not used the Online Python Tutor described in the section “Python’s Data Model”, you should definitely try it; it is very useful even for those who are familiar with Python.

4.1. List

4.1.1. Sequence

A sequence is a data structure in which data items are arranged in order and each element is accessible by an index number. The most important sequence type is the list:

x = [10, 20, 30, 40, 50]      # Square brackets are used to define a list
len(x)                        # 5 (length)
x[0]                          # 10 (Note that the first index is 0)
x[-1]                         # 50 (last element)
x[-3]                         # 30 (third element from the end)
20 in x                       # True (True if included as an element)
99 not in x                   # True
[10, 20, 30] == [10, 20, 30]  # True (True if all the elements are equal)

A run of consecutive elements can be accessed by giving a range of indices. This is called a slice:

x[2:4]                        # [30, 40] (Note that x[4] is not included)
x[2:]                         # [30, 40, 50]
x[:2]                         # [10, 20]

You can use the concatenation and repetition operators:

[10, 20, 30] + [40, 50]  # [10, 20, 30, 40, 50]
[10, 20] * 3             # [10, 20, 10, 20, 10, 20]
Why doesn’t the slice include x[4] when it is written as x[2:4]? Isn’t that confusing? In the first place, why do the indices start from 0? Why not start from 1 like vectors and matrices in mathematics?
Each programming language has its own conventions, so you’ll just have to get used to Python as it is.

If we are following the mathematical notation of vectors and matrices, it is better to start from 1 and include the last index; this is the way Fortran and MATLAB do it because they value the similarity to mathematical notation.

In contrast, if we want to represent the location of data in computer memory as the starting position of a list plus an index, it is more natural to set the index of the first element to 0. Not only Python, but also C, C++, Java, and many other languages use this style.

There are many different views on how to handle the last index. The “don’t include” camp often cites the following reasons:

  • When writing x[n:n+k], it is easy to understand that there are k elements.

  • It is convenient to write x[:N] and x[N:] to separate the elements before and after the N-th element.

  • It is consistent with the scheme in which the end of a list is recognized by a special element (called a sentinel), as in a C string, whose number of characters is not known in advance. In this case, the sentinel should not be included in the wanted data:

    for (i = 0; str[i] != '\0'; i++) {
        putchar(str[i]);
    }
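
The first two reasons in the list above can be checked directly in Python. The following is a minimal sketch; the values chosen for n, k, and N are arbitrary:

```python
x = [10, 20, 30, 40, 50]

n, k = 1, 3
chunk = x[n:n + k]          # k elements starting at index n
print(chunk, len(chunk))    # [20, 30, 40] 3

N = 2
head, tail = x[:N], x[N:]   # split with no overlap and no gap
print(head, tail)           # [10, 20] [30, 40, 50]
print(head + tail == x)     # True
```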
    

4.1.2. Various Ways to Create a List

The range function can be used to create a sequence of numbers:

range(5)              # range-type object (This itself is not a list)
list(range(5))        # [0, 1, 2, 3, 4]  (From 0 to less than 5)
list(range(2, 6))     # [2, 3, 4, 5]     (From 2 to less than 6)
list(range(2, 10, 3)) # [2, 5, 8]        (From 2 to less than 10, with step 3)

List comprehension notation is useful for creating a list in which each element is derived from an element of an existing sequence:

x = [1, 2, 15, 30]
[2 * a for a in x]                # [2, 4, 30, 60] (2 * a for each element a in x)
[2 * a for a in x if a % 2 == 0]  # [4, 60]
[2 ** n for n in range(5)]        # [1, 2, 4, 8, 16]
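
It may help to see that a list comprehension is shorthand for an ordinary loop. The sketch below builds the same list in both ways; the names doubled_evens and result are only for illustration, and the append method it uses is explained in Section 4.1.3:

```python
x = [1, 2, 15, 30]

# Comprehension form
doubled_evens = [2 * a for a in x if a % 2 == 0]

# Equivalent explicit loop
result = []
for a in x:
    if a % 2 == 0:
        result.append(2 * a)

print(doubled_evens)            # [4, 60]
print(result == doubled_evens)  # True
```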
If you pass three arguments to range, isn’t it hard to tell at first glance which is the first, which is the last, and which is the increment?
I agree. This may be where explanatory variables come into play.
step = 3
for x in range(2, 10, step):
    print(x)

4.1.3. Mutating a List

A list is mutable:

x = [10, 20, 30, 40, 50]
x[2] = 12345             # Change the element 30 to 12345; x: [10, 20, 12345, 40, 50]
x[2:4] = [-20, -30]      # Replace a slice;                x: [10, 20, -20, -30, 50]

The basic operations for expanding and contracting a list are adding elements to the end and removing elements from the end:

x = []
x.append(10)             # x: [10]
x.append(20)             # x: [10, 20]
item = x.pop()           # x: [10], item: 20

You can also insert elements at arbitrary positions and delete elements at arbitrary positions, but this is less efficient than adding or deleting trailing elements:

x = [0, 10, 20, 30]
x.insert(2, 12345)     # x: [0, 10, 12345, 20, 30]
item = x.pop(1)        # x: [0, 12345, 20, 30], item: 10
del x[2]               # x: [0, 12345, 30]  (When you don't need the return value)
x.remove(12345)        # x: [0, 30] (Delete the first element having the value)
del x[:]               # x: [] (Delete the slice)
x.clear()              # x: [] (Same effect as del x[:])
I thought Python variables didn’t need to be defined in advance, so why do I need to write x = [] before calling the append method?
Because append is a method of list objects, x must already refer to a list before x.append can be called. Writing x = [] creates an empty list and assigns it to x.

So, for example, if you write the following, of course x = [] is not necessary:

x = [10]
x.append(20)
item = x.pop()

4.1.4. Practice

Now, let’s get back to hello_pygame.py. So far, the player character has only been moved around as a still image, but now we will animate it so that it looks like it is walking.

The word “animation” may sound difficult, but what we are going to do is simple: just a flick book.

In the same folder as p1_walk01.png, there are also images of the player with slightly different postures. If we repeat p1_walk04.png to p1_walk07.png, it will look as if the player is walking.

So, let’s make a list of these images by reading them and display them repeatedly in order. So far, we have only dealt with integer lists, but you can put anything in a list, not just integers. Here, we will create a list of type Surface.

hello_pygame.py (ver 15.0)
 1  import pygame
 2
 3
 4  def init_screen():
 5      pygame.init()
 6      width, height = 600, 400
 7      screen = pygame.display.set_mode((width, height))
 8      return screen
 9
10
11  def create_text():
12      font_size = 50
13      font_file = None
14      antialias = True
15      font = pygame.font.Font(font_file, font_size)
16      text_image = font.render("hello, pygame", antialias, pygame.Color("green"))
17      return text_image
18
19
20  def create_player():
21      player_images = [
22          pygame.image.load("../../assets/player/p1_walk04.png").convert(),
23          pygame.image.load("../../assets/player/p1_walk05.png").convert(),
24          pygame.image.load("../../assets/player/p1_walk06.png").convert(),
25          pygame.image.load("../../assets/player/p1_walk07.png").convert()
26      ]
27      return player_images
28
29
30  def draw(screen, player_image, text_image, mouse_pos):
31      screen.fill(pygame.Color("black"))
32      screen.blit(player_image, mouse_pos)
33      mouse_x, mouse_y = mouse_pos
34      text_offset_x = 100
35      screen.blit(text_image, (mouse_x + text_offset_x, mouse_y))
36      pygame.display.update()
37
38
39  def main():
40      screen = init_screen()
41      text_image = create_text()
42      player_images = create_player()
43      frame_index = 0
44
45      while True:
46          should_quit = False
47          for event in pygame.event.get():
48              if event.type == pygame.QUIT:
49                  should_quit = True
50              elif event.type == pygame.KEYDOWN:
51                  if event.key == pygame.K_ESCAPE:
52                      should_quit = True
53                  elif event.key == pygame.K_b:
54                      pass
55          if should_quit:
56              break
57          mouse_pos = pygame.mouse.get_pos()
58
59          frame_index += 1
60          animation_index = frame_index % len(player_images)
61          draw(screen, player_images[animation_index], text_image, mouse_pos)
62
63      pygame.quit()
64
65
66  if __name__ == "__main__":
67      main()

Lines 20-27

We define a function create_player that creates and returns a list of Surfaces that have player images loaded.

This way of building the list is not very elegant. We will improve it later.

Line 42

The function create_player is called, and the returned list is assigned to the player_images variable.

The player_images variable in the function create_player and that in the function main are local variables of the respective functions, so we need to pass them through the return value.

Line 43

Since the player image to be displayed changes as time progresses, we prepare a variable to hold the current time.

The image displayed on a display or captured by a video camera at each time is called a frame. In this example, we count frame_index, which is the number of the frame shown on the display, and we treat it as the time.

Line 59

The frame count is incremented by adding 1 to frame_index on every pass through the loop.

Lines 60-61

len(player_images) gives the number of elements in player_images. The remainder operator % is used to find the remainder by dividing the frame index by the number of elements. Since there are four elements now, the result will be one of 0, 1, 2, or 3.

By using this as an index to specify the elements of player_images, the Surfaces in this list will be displayed repeatedly in order.


When you run the program, you will probably find that the legs move so fast that the player does not look like it is walking. Let’s try to adjust the speed.

The problem with the current program is that we have no idea how fast it is flipping through the images. If your computer is fast, the frame will go faster, and if it is slow, it will go slower. First of all, let’s keep the frame update rate constant.

hello_pygame.py (ver 16.0)
39  def main():
40      screen = init_screen()
41      text_image = create_text()
42      player_images = create_player()
43      clock = pygame.time.Clock()
44      frame_index = 0
45
46      while True:
47          frames_per_second = 60
48          clock.tick(frames_per_second)
49
50          should_quit = False
51          for event in pygame.event.get():
52              if event.type == pygame.QUIT:
53                  should_quit = True
54              elif event.type == pygame.KEYDOWN:
55                  if event.key == pygame.K_ESCAPE:
56                      should_quit = True
57                  elif event.key == pygame.K_b:
58                      pass
59          if should_quit:
60              break
61          mouse_pos = pygame.mouse.get_pos()
62
63          frame_index += 1
64          animation_period = 6
65          animation_index = (frame_index // animation_period) % len(player_images)
66          draw(screen, player_images[animation_index], text_image, mouse_pos)
67
68      pygame.quit()
69
70
71  if __name__ == "__main__":
72      main()

Line 43

A Clock object created by calling pygame.time.Clock() is assigned to the variable clock. This will be used for time management.

Lines 47-48

At the beginning of the while loop (actually, it could be at any fixed point in the loop), we call the tick method of clock. The frame rate is specified as the argument. The unit is frames/s (fps), that is, how many frames are displayed per second. In this example, 60 frames/s is specified, which is a common refresh rate for computer displays.

The Clock object remembers the last time the tick method was called. When tick is called in a loop, it calculates the time elapsed since the previous call and waits for an appropriate time. Since the specified value is 60 frames/s, it waits so that at least 1/60 s (about 16.67 ms) elapses between successive calls.

Once you run the program with these changes, you may find that the flapping is still too fast. We are playing back a four-frame animation, which makes up one cycle of foot movement, at 60 frames/s, so the feet go through 15 cycles/s. That’s still pretty fast, isn’t it?

Lines 64-65

So let’s make it a little slower. We use the integer division operator // to divide frame_index by a suitable number before taking the remainder. Here I chose 6 by trial and error. You can change it to another value if you like.
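
The effect of the integer division can be checked without pygame. In the sketch below, num_images is an assumed stand-in for len(player_images); each image index is held for 6 consecutive frames, so one walking cycle takes 24 frames:

```python
frames_per_second = 60
animation_period = 6
num_images = 4  # stand-in for len(player_images)

indices = [(frame_index // animation_period) % num_images
           for frame_index in range(30)]
print(indices)
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
#  3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0]

cycles_per_second = frames_per_second / (animation_period * num_images)
print(cycles_per_second)  # 2.5
```

With these values, the feet go through 2.5 cycles/s instead of 15.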
If I specify 60 as the argument for clock.tick, will the frame rate be exactly 60 frames/s?
No. It only ensures that the loop does not run faster than 60 frames/s. If one pass through the loop takes longer than 1/60 s, the frame rate will be lower.

The flapping of the legs seems to be faster after adding clock.tick(frames_per_second) than before. Am I doing something wrong?
No. This phenomenon can happen. It is called aliasing.

Try a little thought experiment: Take pictures of a circular motion that rotates once every 1.0 second with a camera that releases the shutter every 0.9 seconds. If you play the pictures back as a movie at the same speed as you shot them, it should look like a slow rotation in the opposite direction. In a more extreme situation, if the shutter is released every 1.0 second, the motion will appear to stand still.

Many people have seen similar phenomena, such as a car wheel or a helicopter rotor that appears to rotate slowly in a movie.

When we sample a fast motion at a rate that is too low to capture it properly, we observe frequency components that are different from those of the original motion. If you want a more precise explanation, please look up keywords such as sampling theorem and aliasing.
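
The thought experiment can be reproduced numerically. The following sketch computes the angle (in turns) seen in each successive picture; the variable names are assumptions made for illustration, and Fraction is used only to avoid floating-point rounding:

```python
from fractions import Fraction

rotation_period = Fraction(1)        # one turn every 1.0 s
shutter_interval = Fraction(9, 10)   # one picture every 0.9 s

# Angle in turns (0 <= angle < 1) captured in each successive picture
angles = [(k * shutter_interval / rotation_period) % 1 for k in range(5)]
print([float(a) for a in angles])    # [0.0, 0.9, 0.8, 0.7, 0.6]
```

The angle decreases by 0.1 turn per picture, so the played-back motion appears to rotate slowly backwards.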

4.2. List-like Data Structures

4.2.1. Tuple

A tuple is an immutable sequence. It is not possible to change its length or its elements. Operations that do not modify it can be performed in the same way as on a list:

>>> tup = (10, 20, 30)
>>> tup[1] = 123
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
What’s the point of having an “immutable list” that can do only part of what a list can do? Why don’t we just make everything a list?
It’s a fair question, but tuples have several reasons to exist.

The most obvious, but most boring, reason is that they are fast and use little memory, since their internal structure is optimized assuming that it does not have to respond to changes.

What is more important is that when we read the program, we can be sure that it will never change because it is a tuple. For example, suppose you see a function f with a list x passed as an argument:

x = [10, 20, 30]
f(x)

Whether or not the contents of x change as a result of this call cannot be known without a detailed understanding of the behavior of the function f. In contrast, if x is a tuple, even if you don’t know what f does, you can at least read the rest of the code assuming that x does not change. This makes it much easier to read and understand the program. This is similar to the advice for C and C++ to put const on things that do not change whenever possible.

Another benefit is that a tuple can be used as a key for a dictionary, which is explained in Section 4.2.3.

Hmm? I succeeded in “mutating” a tuple when I did the following. Why?

>>> tup = (10, 20, 30)
>>> tup = (40, 50, 60)
>>> tup
(40, 50, 60)
It is different from a mutation.

It is just creating a new tuple (40, 50, 60) and assigning it to the variable tup. The original tuple (10, 20, 30) has not been mutated.

For a better understanding of this situation, please refer to Section 4.4, “Python’s Data Model”.

4.2.2. String

A string is also an immutable sequence. There are various notations:

"hello, world"
'hello, world'              # Single quotes are also allowed
"what's new"
'1" is equal to 25.4 mm'
'1h 23\' 45"'               # \ escapes the special role of the following character
"hello\nworld"

There are many methods specific to the string type. The following are just a few examples:

"Computer Seminar I".split()                 # ['Computer', 'Seminar', 'I']
", ".join(["abc", "def", "ghi"])             # 'abc, def, ghi'
"2021-10-01".replace("-", "/")               # '2021/10/01'
"number {} and string {}".format(10, "abc")  # 'number 10 and string abc'

The format method of the string type can be used to specify various formats. The following are just a few examples:

>>> "{1} {2}, {0}".format(2021, "October", 1)  # Selection of arguments
>>> "{:5}".format(123)                         # Space-padded if less than 5 chars
>>> "{:05}".format(123)                        # Zero-padded if less than 5 chars
>>> "{:.4f}".format(3.141592)                  # Rounded to 4 decimal places
>>> "{:10.4f}".format(3.141592)                # Rounded + space-filled
>>> "{0} {0:5} {0:05}".format(123)             # Arguments selection + padding
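
For reference, the following sketch collects the same six format calls in a list (the name examples is only for illustration) and prints each result with repr so that the padding spaces are visible:

```python
examples = [
    "{1} {2}, {0}".format(2021, "October", 1),  # 'October 1, 2021'
    "{:5}".format(123),                         # '  123'
    "{:05}".format(123),                        # '00123'
    "{:.4f}".format(3.141592),                  # '3.1416'
    "{:10.4f}".format(3.141592),                # '    3.1416'
    "{0} {0:5} {0:05}".format(123),             # '123   123 00123'
]
for s in examples:
    print(repr(s))
```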
Do I have to memorize this huge number of methods?
No. Hardly anyone has all the methods memorized.

Rather, it is important to be able to look them up in manuals, books, etc. whenever you want to use them (i.e., to know where to look and how to read them).

The official documentation for string methods can be found at the following page.

If you find it hard to read, try to find a document that fits your needs. There are many “cheat sheets” available that are easy to read.

The formatting string of the format method is similar to that of C’s printf but slightly different, which is annoying.
If you are familiar with C, the % operator may be easier to understand than the format method.
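
As a rough comparison, the sketch below produces the same string in three ways. The f-string form is not covered in this chapter and is shown only as an additional option available in Python 3.6 and later:

```python
n, pi = 123, 3.141592

a = "%05d and %.4f" % (n, pi)         # printf-style % operator
b = "{:05} and {:.4f}".format(n, pi)  # format method
c = f"{n:05} and {pi:.4f}"            # f-string (Python 3.6+)

print(a)            # 00123 and 3.1416
print(a == b == c)  # True
```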

4.2.3. Dictionary

A dictionary is a data structure that is similar to a sequence but is not one. It can use values other than integers as keys, which play the role of indices:

score = {"Alice": 80, "Bob": 80, "Charlie": 80}  # Definition
score["Alice"]       # 80  (Value associated with the key "Alice")
score["Alice"] = 90  # Dictionary is mutable
score["Dave"] = 95   # Add a new key and value
del score["Charlie"] # Delete a key and its value
"Alice" in score     # True (Test whether the key exists)
list(score)          # ['Alice', 'Bob', 'Dave'] (A list of keys is obtained)
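
As mentioned in Section 4.2.1, a tuple can be used as a dictionary key, while a list cannot. The following is a small sketch; the tile dictionary and its contents are hypothetical and serve only as an illustration:

```python
# Hypothetical example: grid coordinates (tuples) as keys
tile = {(0, 0): "grass", (0, 1): "water"}
tile[(1, 0)] = "sand"
print(tile[(0, 1)])            # water

list_key_rejected = False
try:
    tile[[1, 1]] = "rock"      # a list is mutable, so it cannot be a key
except TypeError:
    list_key_rejected = True
print(list_key_rejected)       # True

# Keys and values can be visited together with the items method
score = {"Alice": 90, "Bob": 80}
for name, value in score.items():
    print(name, value)
```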

4.2.4. Practice

Let’s try to make the list of player images a bit smarter by using string processing. Four file names can be generated by using the format method of the string, since the only difference is the sequential numbers.

hello_pygame.py (ver 17.0)
20  def create_player():
21      file_path = "../../assets/player/p1_walk{:02}.png"
22      player_images = []
23      for k in range(4, 8):
24          player_images.append(pygame.image.load(file_path.format(k)).convert())
25      return player_images

At first glance it may look complicated, but all it does is change k to 4, 5, 6, and 7 in turn and generate each file name by inserting k, as a two-digit zero-padded number, at the {:02} part of file_path. Each loaded image is then appended to the list.

The same thing can be written a little more concisely using the list comprehension notation.

hello_pygame.py (ver 18.0)
20  def create_player():
21      file_path = "../../assets/player/p1_walk{:02}.png"
22      player_images = [pygame.image.load(file_path.format(k)).convert()
23                       for k in range(4, 8)]
24      return player_images
I’ve been wondering about this since ver 5.0 of hello_pygame.py. In C, I learned that we should always check for errors when opening a file. Isn’t error checking necessary when using pygame.image.load?
If you just want to exit abnormally on an error, you don’t need to do anything, because that will happen automatically. If you want to resort to another procedure when an error occurs, you can use an exception handling mechanism.

Here are the basics. First, check what kind of error you are getting. As shown in the previous Q&A, if there is no file, pygame.image.load will cause:

FileNotFoundError: No such file or directory.

to be displayed. The first word FileNotFoundError indicates the type of error that occurred, so keep that in mind.

If you want to do something else when FileNotFoundError occurs, you can write something like the following:

def create_player():
    try:
        file_path = "../../assets/player/p1_walk{:02}.png"
        player_images = [pygame.image.load(file_path.format(k)).convert()
                         for k in range(4, 8)]
        return player_images
    except FileNotFoundError:
        player = pygame.Surface((64, 64))
        player.fill(pygame.Color("red"))
        return [player]

As long as the file exists, it should work as before. If the file does not exist (for example, try writing “jpg” instead of “png” in the above code), you will see a 64x64-pixel red rectangle instead of the player image.

In the try clause, the normal process is written. If no error occurs, the except clause is skipped. If an error occurs during the execution of the try clause, the rest of the try clause is skipped. If the error is of the type specified after the except keyword (in this case FileNotFoundError), the body of the except clause is executed. If the error is not of that type, it is not handled here, and unless it is caught somewhere else, the program terminates abnormally.

An object that represents the kind of error such as FileNotFoundError is called an exception in Python, and catching the exception with a try statement and handling it differently is called exception handling.

The convenience of this scheme is that exceptions can be handled across function calls. For example, in the current example, the FileNotFoundError exception is raised inside the pygame.image.load function. This means that the caller, create_player, can catch it without having to specifically check the return value of pygame.image.load. If you want, you can leave the create_player as it is, and let the main function, which is one level higher than create_player, catch it:

def main():
    screen = init_screen()
    text_image = create_text()
    try:
        player_images = create_player()
    except FileNotFoundError:
        player = pygame.Surface((64, 64))
        player.fill(pygame.Color("red"))
        player_images = [player]
    clock = pygame.time.Clock()
    frame_index = 0

If you want to raise an exception yourself, use the raise statement. See the official documentation for details.
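
As a minimal sketch of the raise statement, the hypothetical helper check_frame_range below raises a ValueError, which passes through frame_numbers and is caught one level higher. Neither function is part of hello_pygame.py:

```python
def check_frame_range(first, last):
    # Hypothetical helper: reject an empty range of frame numbers
    if first >= last:
        raise ValueError("first must be less than last")
    return list(range(first, last))


def frame_numbers():
    # No try statement here: the exception passes through to the caller
    return check_frame_range(8, 4)    # wrong order on purpose


try:
    numbers = frame_numbers()
except ValueError as e:
    numbers = [0]                     # fallback value
    print("fallback used:", e)        # fallback used: first must be less than last
```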

4.3. Exercises

Problem 4-1

Try interactive execution to perform the following list operations. Before executing, predict the results (at ???), and check that your predicted result matches the actual result. If your prediction is wrong, think about the reason and try to understand it correctly.

>>> x = [10, 20, 30, 40, 50]
>>> x[3] = 99
>>> x
???
>>> x = [10, 20, 30, 40, 50]
>>> x[1:4]
???
>>> x = [10, 20, 30, 40, 50]
>>> x[:] = []
>>> x
???
>>> x = [10, 20, 30, 40, 50]
>>> x.append(99)
>>> x
???
>>> list(range(2, 7))
???
>>> list(range(7, 2, -1))
???
>>> "/".join([c for c in ("red", "green", "blue") if len(c) >= 4])
???

4.4. Python’s Data Model

4.4.1. Variable is not a Box

When you start working with mutable data structures such as lists, you may encounter some seemingly puzzling phenomena. For example:

>>> x = [10, 20, 30]
>>> y = x
>>> y[0] = 123
>>> y
[123, 20, 30]
>>> x
[123, 20, 30]

When we changed the list y, x also changed. Can you guess what happened?

The key is y = x in the second line. Many of you may have imagined that the contents of list x, [10, 20, 30], would be copied and assigned to y. However, this is not what actually happens.

In order to understand this point, we need to refresh our image of what we have been calling “variables”. When we say “variable x”, most people think of a box with the name “x” written on it, with data stored inside. However, variables in Python are not like that. The data itself does not carry a name. Instead, there is a “baggage tag” with the name written on it, kept in a separate place, and a string runs from the baggage tag to the data.

_images/box_or_tag.png

The variable is this baggage tag, and the assignment is to connect the baggage tag to the data by a string.

If you think of it this way, you can understand the behavior of the previous example correctly. When x = [10, 20, 30], the list [10, 20, 30] is created, and a baggage tag labeled “x” is made, and the two are tied together with a string. Next, y = x will create a new baggage tag labeled y, which will be tied to the same [10, 20, 30], the destination of the string from x. Executing y[0] = 123 will follow the string from the tag y to the data in the list and change the first element to 123.
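
If you actually want a second, independent list, you have to create one explicitly. The sketch below uses the copy method of the list type for this (list(x) and x[:] have the same effect); the variable z is introduced only for illustration:

```python
x = [10, 20, 30]
y = x            # y is tied to the same list as x
z = x.copy()     # z is tied to a new list with the same elements

y[0] = 123
print(x)         # [123, 20, 30]  (changed through y)
print(z)         # [10, 20, 30]   (the copy is unaffected)
```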

4.4.2. Online Python Tutor

Here is a useful website to help you get the right picture.

Click “Start visualizing your code now” to go to the page where you can enter your code. Make sure you select [default] in the drop-down list below the input fields. Enter this example code and click “Visualize Execution”:

x = [10, 20, 30]
y = x
y[0] = 123

You will see something like the following. Click the “Next” button, and the program will run line by line, and the status of the “tags and data” at each point will be shown on the right. (The following figure also works)

4.4.3. The Same Rule Applies to Other Types of Data

The relationship between variables and data described above is not limited to lists. Integers, real numbers, and pygame’s Surface and Color objects all work in the same way.

“What? That’s ridiculous. Integers and real numbers must be different, right?” I think many of you think so. Let’s try the same thing, but with integers instead of lists:

>>> x = 10
>>> y = x
>>> y = 123
>>> y
123
>>> x
10

This time, rewriting y does not change x. “I knew you were wrong. In this case, there is a box called x and a box called y, and inside the boxes are data such as 10 and 123, right?”, you might think.

But this is not the case. In the first line, x = 10, the integer 10 is created, a baggage tag x is made, and the two are tied together with a string. The second line, y = x, creates a tag y, which is tied to the 10 to which x is connected. At this point, the tags x and y are sharing the same integer 10.

The important part is the following: y = 123 creates an integer 123, and the string extending from y is detached from 10 and tied to 123. In no way is 10 overwritten by 123.

_images/box_or_tag_reassign.png

Because of this mechanism, the result for integers is the same as in the “box model”. That’s why we’ve been able to talk about it without being aware of the “baggage tag model”.

In Online Python Tutor, the default setting is to omit the visualization of integers and real numbers in the “baggage tag model”. To cancel this omission, select “render all objects on the heap” instead of “inline primitives” in the drop-down list below the code entry field. By inputting:

x = 10
y = x
y = 123

and clicking on Visualize Code, you will see that the string will be detached as explained earlier. You can also try the previous code example with a list [10, 20, 30]. You can see that the elements of the list in series are also “baggage tags”, and that the strings to the integer data are extended.

In the end, the difference in behavior between our examples for lists and integers comes down to the fact that assignment to list elements (i.e., x[0] = ...) changes the “baggage tags” stored in the list (which is allowed because lists are mutable), while assignment to a variable (i.e., x = ... ) changes the “baggage tag” held by the variable itself. In practice, everything works in the “baggage tag model”, but it is consistent with the “box model” as long as we think of immutable data structures.

Of course, “a string from A is connected to B” is not an accurate term. A more correct expression would be “A holds a reference to B”, “A refers to B”, “A points to B”, etc.

I think I’ve finally figured out what “Python variables have no type” means. We can connect variables to any type of data because they are just baggage tags, and we can reconnect them to other types of data. Is that correct?
Yes, but it is better to avoid using the same variable for different types of data, because it is confusing.
What exactly is the mechanism of this “reference”? It’s not like there’s an actual string extending through the computer, is there?
It holds a value, called a memory address, that represents the location where the referenced piece of data is.

Inside a computer, when a piece of data is written to or read from a specific location in memory, the location is specified using a numerical value called an address. When x = 10 is executed, the number 10 will be written somewhere in memory, and the variable x will hold that address.

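To find out where the data referred to by a variable is located, you can use the function id. (In CPython, the standard Python implementation, the value returned by id is the memory address of the object.)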

>>> x = 10
>>> id(x)
2528242330192
>>> y = x
>>> id(y)
2528242330192
>>> y = 123
>>> id(y)
2528242522288

The specific number returned by id is not very meaningful. Just note that the first id(y) is the same as id(x), and the second id(y) has a different value.

By references, you mean pointers in C, right? I’ve heard that Python doesn’t have pointers and therefore it’s easy. Is my understanding wrong?
You are right in that references and pointers are roughly the same. It’s not that Python doesn’t have pointers; it’s just that all variables are pointers, so there’s no need to say “This is a pointer”.

A major difference between C pointers and Python references is that the operations that can be performed on references are strictly limited. For example, in C you can add integers to or subtract them from a pointer to perform address calculations, but you cannot do this in Python. This restriction makes Python references safer to use than C pointers.

4.4.4. is vs. == (is not vs. !=)

With this understanding, we can see that when we say two pieces of data are “equal” to each other, it can mean two different things.

What matters in most cases is whether they have the same value or not (equality). The operators == and != are used to determine this. For example, we have:

>>> x = [10, 20, 30]
>>> y = [10, 20, 30]
>>> x == y
True

However, in this case, the data entities pointed to by x and y are different. For example, if we execute y[0] = 99 immediately after this, x[0] will not change. This is because they are not identical.

To determine whether or not the entities are the same (identity), the operators is and is not are used. In the previous example, x and y are equal, but they are not identical:

>>> x is y
False
>>> x is not y
True

However, they become identical if y points to the same thing as x, as follows:

>>> y = x
>>> x is y
True

If you are familiar with the C language, it may be easier to think of is and is not as the equivalent of pointer comparison.

4.4.5. Practice

Let’s display the string “hello, pygame” only while the left mouse button is pressed.

hello_pygame.py (ver 19.0)
27  def draw(screen, player_image, text_image, mouse_pos):
28      screen.fill(pygame.Color("black"))
29      screen.blit(player_image, mouse_pos)
30      if text_image is not None:
31          mouse_x, mouse_y = mouse_pos
32          text_offset_x = 100
33          screen.blit(text_image, (mouse_x + text_offset_x, mouse_y))
34      pygame.display.update()

44      while True:
45          frames_per_second = 60
46          clock.tick(frames_per_second)
47
48          should_quit = False
49          for event in pygame.event.get():
50              if event.type == pygame.QUIT:
51                  should_quit = True
52              elif event.type == pygame.KEYDOWN:
53                  if event.key == pygame.K_ESCAPE:
54                      should_quit = True
55                  elif event.key == pygame.K_b:
56                      pass
57          if should_quit:
58              break
59          mouse_pos = pygame.mouse.get_pos()
60          buttons_pressed = pygame.mouse.get_pressed()
61          if buttons_pressed[0]:
62              text_image_shown = text_image
63          else:
64              text_image_shown = None
65
66          frame_index += 1
67          animation_period = 6
68          animation_index = (frame_index // animation_period) % len(player_images)
69          draw(screen, player_images[animation_index], text_image_shown, mouse_pos)

Lines 27-34

We modify the draw function so that if the text_image argument is None, it is not displayed.

We use is not instead of != to compare with None. None is a unique object that is never identical to any other data, so the question here is one of identity, not equality.
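
One way to see why identity is the safer test is to consider an object whose == comparison gives a misleading answer. The class AlwaysEqual below is a hypothetical, deliberately odd example and is not something pygame defines:

```python
class AlwaysEqual:
    # Hypothetical class whose == claims equality with anything
    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(obj == None)   # True  (decided by obj's own __eq__ method)
print(obj is None)   # False (obj is not the None object)
```

Because is cannot be redefined by a class, a test written with is None or is not None always means what it says.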

Lines 60-64, and 69

If the left mouse button is pressed, the Surface type object pointed to by text_image is assigned to a variable text_image_shown; otherwise None is assigned to this.

The function pygame.mouse.get_pressed is similar to pygame.key.get_pressed in that it returns a sequence of boolean values, one per mouse button, indicating whether that button is currently pressed. The left button has index 0.

4.4.6. Relationship between Function Calls and Variables

It is important to understand correctly how arguments, return values, and local variables behave when functions are called. Try running the following code in Online Python Tutor with “render all objects” specified:

def func(x, a):
    x[0] = 123
    a = 456
    y = [99, 88, 77]
    return y

def main():
    xm = [10, 20, 30]
    am = 1
    ym = func(xm, am)
    return

main()

There are a few important things to understand.

  • When a function is called, a frame (a different term from the video frame) is created for each function, and arguments and local variables are placed in the frame. These are “baggage tags”. Only the “baggage tags” are placed in the frame. The actual data are never placed in the frame.
  • Arguments and return values are passed in the same way as in assignment statements. In other words, the data pointed to by the source variable are also pointed to and shared by the destination variable.
  • During the execution of a function, the frame of that function is retained. It is retained even while calling other functions from it (for example, the main frame exists while calling func from main). However, when you return from a function, the frame of that function is deleted and the variables in it are also deleted.
  • Even if a variable is deleted, it does not mean that the data it points to are deleted.

Be careful with the last item. For example, notice the behavior before and after the return from func in the above example. The list pointed to by the variable y [99, 88, 77] was created in the function func, and when you return from func, the “baggage tag” y disappears, but not the list [99, 88, 77] pointed to by it. This list will now be pointed to by the variable ym, and can be used in the main function (although in this example it will not be used in any way because main exits soon).
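If you want to confirm these points without the visualizer, the following sketch uses the same func as above, but calls it at the top level (instead of from main) so that the results can be printed afterwards.

```python
def func(x, a):
    x[0] = 123        # modifies the list that the caller also points to
    a = 456           # only re-attaches the local tag a; the caller is unaffected
    y = [99, 88, 77]
    return y          # the list outlives the local tag y


xm = [10, 20, 30]
am = 1
ym = func(xm, am)
print(xm)  # [123, 20, 30]
print(am)  # 1
print(ym)  # [99, 88, 77]
```

The list was shared with func and therefore changed, the integer tag was only re-attached inside func, and the list created in func is still usable through ym.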

Then, once data are created, will they remain forever? No, they will be automatically deleted when they are no longer referenced by any of the remaining variables. For example, the integer 456 pointed to by variable a will be deleted on return from func.

The invocation of deletion of data is not limited to the return from a function. For example, if two lines such as:

x = 100
x = 200

are executed, the integer 100 created in the first line will be deleted when the second line is executed, since it is no longer referenced from anywhere.
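This deletion can be observed with a weak reference from the standard weakref module, which watches an object without keeping it alive. The sketch below is an illustration under two assumptions. First, it uses a hypothetical class Data instead of an integer, because integers cannot be weakly referenced and CPython keeps small integers cached internally. Second, it relies on CPython deleting an object as soon as its last reference disappears; other Python implementations may delete it later.

```python
import weakref


class Data:
    """Hypothetical stand-in for some piece of data."""


x = Data()
watcher = weakref.ref(x)              # observes the object without keeping it alive
alive_before = watcher() is not None
x = Data()                            # the tag x is moved to a new object
alive_after = watcher() is not None
print(alive_before, alive_after)      # on CPython: True False
```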

It seems to be very different from the C language.
It’s very different. If you are familiar with C, you can compare them for a better understanding.

If you go to the Online Python Tutor page at https://pythontutor.com/ and open C Tutor under Related services, you can visualize the execution of your C code in the same way.

If we write something similar to the above Python code in C, it will look like this. Try visualizing it with C Tutor:

int *func(int *x, int a) {
    x[0] = 123;
    a = 456;
    int y[3] = {99, 88, 77};
    return y;
}

int main() {
    int xm[3] = {10, 20, 30};
    int am = 1;
    int *ym = func(xm, am);
    return 0;
}

What you will notice right away is that both integers and arrays are placed in the “frame” on the left side of the visualization screen. The arrow does not extend to the right side of the screen. Reassigning a value directly rewrites the contents of the variables placed in the frame on the left side. In other words, in C, these variables are working in the “box model”.

During the execution of the function func, a 3-element array y is kept in the frame of func, and its starting address is returned to main by return y;. At that point the frame of func is destroyed, so ym in main no longer points to valid data (depending on the version of the compiler you have chosen, it may be NULL or garbage). Accessing it afterwards, for example with ym[0] = 999; in main, is undefined behavior and will typically cause a runtime error.

This is why in C we are taught not to return local arrays defined in functions.

The reason why it is safe to return a list in Python is that a list (or any piece of data) is allocated in the “right side region”, and the only thing that is destroyed together with the function frame is the baggage tag that points to the data.

If you want to do something similar to the Python behavior in C, you can do something like this:

#include <stdlib.h>

int *func(int *x, int a) {
    x[0] = 123;
    a = 456;
    int *y = (int *)malloc(sizeof(int) * 3);
    y[0] = 99;
    y[1] = 88;
    y[2] = 77;
    return y;
}

int main() {
    int *xm = (int *)malloc(sizeof(int) * 3);
    xm[0] = 10;
    xm[1] = 20;
    xm[2] = 30;
    int am = 1;
    int *ym = func(xm, am);
    return 0;
}

The malloc (Memory ALLOCation) function may be new to some of you. Without going into details, in this example, it does the job of allocating an array of three integers in the “right side region”. In this way, even after the frame of func is destroyed, the array data pointed to by y survive and can be read and written through the ym variable in the main function.

The problem is that even if these array data are not referenced anymore, they will not be freed automatically like in Python (although this is not a problem in this example because main will exit immediately afterwards). When memory is allocated using the malloc function in C, it is the programmer’s responsibility to free it by calling the function free, which is sometimes difficult and can lead to bugs if not written carefully.

Comparing the behavior of Python and C, it seems more convenient and logical to automatically allocate and free data as in Python. Why does C behave in such an inconvenient way? What about other languages?
The reason why C behaves this way is because it is speed-oriented.

Allocating memory space is a time-consuming heavy process. Detecting data that are no longer referenced and releasing them (called garbage collection) is an even heavier process. Doing this for all the data slows down the execution of the program. For this reason, in C, the policy is to leave everything to the programmer.

C++ is basically the same as C, but it uses a mechanism called smart pointers that allows automatic memory management similar to that of Python when necessary.

In Java, primitive data types such as int and float are the same as in C, but arrays and user-defined types are automatically managed in the same way as in Python. You may wonder what you should do if you want to let Java automatically manage integers and real numbers. Java employs an acrobatic mechanism of providing separate types Integer and Float, which work differently from int and float but work in the same way as Python.

On the right side of the Online Python Tutor screen, where the data are listed, it says “Objects”. Does that mean that everything that appears here is an object?
Yes, it is.

In fact, in Python, all data (and even functions and modules) are treated as objects. So asking “Is this an object or not?” is almost pointless. The only important thing here is that you should be able to distinguish between objects and the variables that point to them (i.e., the baggage tags).
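As a small illustration (the function greet is hypothetical), the following sketch checks that numbers, strings, lists, None, a function, a module, and even a type are all objects, and that a function can receive a second “baggage tag” just like any other data.

```python
import math


def greet():
    return "hello"


things = [42, 3.14, "abc", [1, 2], None, greet, math, int]
all_objects = all(isinstance(thing, object) for thing in things)
print(all_objects)          # True

say = greet                 # a second tag attached to the same function object
print(say is greet, say())  # True hello
```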

4.4.7. Master The Use of Lists

The elements of a list can be of different types. Since they are just a list of “baggage tags”, each tag can point to an entity of any type:

>>> x = [10, 20, 3.14, "abc"]
>>> x
[10, 20, 3.14, 'abc']

Thus, a list can also be an element of a list. This makes it possible to construct quite complex data structures using only lists:

>>> x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
>>> x
[10, 20, [99, 88, ['a', 'b', 'c']], 40, 50]
>>> x[2]
[99, 88, ['a', 'b', 'c']]
>>> x[2][2]
['a', 'b', 'c']
>>> x[2][2][2]
'c'

Can you imagine what the data structure would look like? Try visualizing it in the Online Python Tutor.

The assignment statement y = x only makes y point to the same thing as x; it does not copy the data at all. So how can we make a copy? One way is to use the copy method of the list type:

x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
y1 = x
y2 = x.copy()

Exactly the same thing as x.copy() can be done by specifying a “slice from the first element to the last element” x[:]. Add the following line to check:

y3 = x[:]

Note that in both ways, only the first depth level of the list is copied. In other words, we copy only the row of “baggage tags” stored in the list that x points to. Nothing beyond that first level of tags is copied, so the data at the “second depth level” and beyond are shared. This is called a shallow copy.
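The consequence of sharing can be checked with a short sketch: replacing a first-level element of the copy leaves x alone, while changing the inner list through the copy is visible from x as well.

```python
x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
y2 = x.copy()

y2[0] = -1        # replaces a first-level tag in y2 only
y2[2][0] = -99    # changes the inner list, which x and y2 share

print(x)   # [10, 20, [-99, 88, ['a', 'b', 'c']], 40, 50]
print(y2)  # [-1, 20, [-99, 88, ['a', 'b', 'c']], 40, 50]
print(y2 is x, y2[2] is x[2])  # False True
```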

In some cases, you may want to recreate a completely independent data set by copying all the data at and beyond the second depth level. This is called a deep copy. You can do it manually as follows, but it is a bit tedious:

x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
y4 = x.copy()
y4[2] = x[2].copy()
y4[2][2] = x[2][2].copy()

There is a function called deepcopy in Python’s standard copy module. This will do the job for you in a single step. Check out the Online Python Tutor to see that the result is the same as if you had done it manually:

import copy
x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
y5 = copy.deepcopy(x)
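If you prefer a check in code, the following sketch confirms that the deep copy has equal contents but shares no inner list with x, so a change made deep inside y5 does not reach x.

```python
import copy

x = [10, 20, [99, 88, ["a", "b", "c"]], 40, 50]
y5 = copy.deepcopy(x)

same_values = (y5 == x)
shares_inner = (y5[2] is x[2]) or (y5[2][2] is x[2][2])
y5[2][2][0] = "z"   # modify the innermost list of the copy only
print(same_values, shares_inner, x[2][2][0])  # True False a
```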

4.5. Exercise

Problem 4-2

Use interactive execution to perform the following list operations. Before executing, predict the result (at ???), and check that it matches the actual result. If it is not what you expected, check with the Online Python Tutor to make sure you understand it correctly.

>>> x = [10, 20, 30]
>>> y = x
>>> x.append(40)
>>> y
???
>>> x = [10, 20, 30]
>>> y = x
>>> x = [10, 20, 30, 40]
>>> y
???
>>> x = [10, 20, 30]
>>> y = x
>>> x[:] = [10, 20, 30, 40]
>>> y
???
>>> x = [10, 20, 30]
>>> x[1] = [99, 98, 97]
>>> x
???
>>> def func(x):
...     x[0] = 99
...
>>> x = [10, 20, 30]
>>> func(x)
>>> x
???
>>> def func(x):
...     x = [99, 98, 97]
...
>>> x = [10, 20, 30]
>>> func(x)
>>> x
???

4.6. Writing docstrings

This concludes our review of Python basics with the example of the hello_pygame.py development. Finally, let’s take a look at docstrings.

A docstring is an explanatory comment that can be placed at the beginning of a file, at the beginning of a function, etc. You may have noticed that when you mouse-hover over a function name in VSCode, the description pops up. It will be convenient if your own functions have such popups. It helps people who will read the code later (including yourself a few months later) even if they don’t use VSCode.

Here’s a simple example for the functions init_screen and draw.

def init_screen():
    """Initialize pygame screen.

    Library pygame is initialized and screen size is set to 600x400.

    Returns
    -------
    Surface
        Screen surface
    """
    pygame.init()
    width, height = 600, 400
    screen = pygame.display.set_mode((width, height))
    return screen


def draw(screen, player_image, text_image, mouse_pos):
    """Draw images onto screen.

    player_image is drawn at mouse_pos on screen, and text_image,
    unless it is None, is drawn to the right of player_image.

    Parameters
    ----------
    screen : Surface
        Screen onto which images are drawn.
    player_image : Surface
        Player image to be blit'ed.
    text_image : Surface or None
        Text image to be blit'ed.  If None is passed, nothing is drawn.
    mouse_pos : tuple[int, int]
        Position where player_image is blit'ed.
    """
    screen.fill(pygame.Color("black"))
    screen.blit(player_image, mouse_pos)
    if text_image is not None:
        mouse_x, mouse_y = mouse_pos
        text_offset_x = 100
        screen.blit(text_image, (mouse_x + text_offset_x, mouse_y))
    pygame.display.update()

In the case of a function definition, the docstring is written between two """ marks immediately after the def line. To be precise, """…""" is not a comment, but a string that can contain line breaks. A statement consisting only of a string has no effect on execution, so it can effectively be treated as a comment. Note, however, that the indentation level must be consistent.
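Because a docstring is an ordinary string, Python keeps it attached to the function, where tools such as editors and the built-in help function can find it. A minimal sketch, using a hypothetical function add_offset that is not part of hello_pygame.py:

```python
def add_offset(position, offset_x):
    """Return position shifted to the right by offset_x."""
    x, y = position
    return (x + offset_x, y)


# The docstring is stored in the __doc__ attribute of the function object.
print(add_offset.__doc__)
```

Calling help(add_offset) in interactive execution displays the same text.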

There are several styles of writing docstrings, such as Google style and NumPy style, but they all have one thing in common: write a brief one-line summary first, and then write a detailed explanation after a single blank line. The detailed description should include the behavior, argument types and their descriptions, return type and its description, etc. in the respective styles. The above examples are in NumPy style.
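For comparison, here is a sketch of what a Google style docstring looks like. The function text_position is hypothetical and was made up for this illustration; it only mirrors the position calculation done inside draw.

```python
def text_position(mouse_pos, text_offset_x):
    """Compute where the text image is drawn.

    The text is placed to the right of the mouse position.

    Args:
        mouse_pos (tuple[int, int]): Current mouse position.
        text_offset_x (int): Horizontal distance from the mouse position.

    Returns:
        tuple[int, int]: Position of the text image.
    """
    mouse_x, mouse_y = mouse_pos
    return (mouse_x + text_offset_x, mouse_y)
```

The one-line summary and the blank line after it are the same as before; only the layout of the argument and return value sections differs.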

Don’t you write docstrings in the sample code in this textbook?
No, because it would make the samples too long and the same docstring would appear many times. Please refer to the main text for the explanation of each function.
I always try to put comments in every line of code, not just at the beginning of files and functions.
Unfortunately, that cannot be recommended.

Rather, it is more useful to choose the names of variables and functions so that the reader can naturally understand what each line is doing without comments.

After that, it will be easier to read the program if you add comments only where it is still difficult to understand, where you need to explain why it was necessary to write it that way, or where you need to explain the purpose of the process.
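As an illustration of this advice, the sketch below writes the animation index calculation from hello_pygame.py in two ways. Both function names are hypothetical. The first version would need a comment on every name to be understood, while the second needs only one comment, which explains why rather than what.

```python
# Hard to follow without comments:
def f(i, p, n):
    return (i // p) % n


# Readable on its own:
def animation_index(frame_index, animation_period, n_images):
    # Integer division keeps each image on screen for animation_period frames.
    return (frame_index // animation_period) % n_images


print(f(13, 6, 4), animation_index(13, 6, 4))  # 2 2
```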

4.7. Exercise

Problem 4-3

In the exercises of the previous chapter, we modified the program so that the displayed text changes while a key is pressed. In this chapter, the text is displayed only while the mouse button is pressed, so let’s make the text change its content and color each time the left mouse button is pressed.

(Hint: You may want to consider creating a Surface list text_images and displaying it just like we do for player_images. Use the MOUSEBUTTONDOWN event to detect button presses)

Problem 4-4

Modify the program so that the footstep animation stops when the player is not moving.

(Hint: Check out the pygame.mouse.get_rel function. It will give you the relative movement of the mouse. You may want to consider calculating the index to player_images based on that amount)

Problem 4-5

Modify the program to change the speed of the footsteps depending on how fast the player is moving.

(Hint: You can get the cumulative walking distance by accumulating the distance moved per frame, which is obtained from pygame.mouse.get_rel. You can use math.sqrt to calculate the distance per frame)

Problem 4-6

An assert statement has the form assert EXPRESSION. It does nothing if EXPRESSION evaluates to True, but raises an error if it evaluates to False. It is useful for checking that the program is behaving as expected. Modify the following function set_to_zero_vector so that the assert statements below it pass:

def set_to_zero_vector(vec, n_elements):
    vec = [0] * n_elements

x = []
set_to_zero_vector(x, 3)
assert x == [0, 0, 0]

y = [10, 20]
set_to_zero_vector(y, 5)
assert y == [0, 0, 0, 0, 0]