In this post, I present some interesting features of Python that were not within the scope of the previous posts.

Strings

String formatting

While Python offers multiple ways of formatting strings (i.e., combining predefined text and variables), f-strings are particularly elegant:

constants = {
    "pi": 3.14159265358979323846,
    "sqrt(2)": 1.41421356237309504880,
    "Euler's number": 2.71828182845904523536
}

for name, value in constants.items():
    print(f"{name} = {value}")
pi = 3.141592653589793
sqrt(2) = 1.4142135623730951
Euler's number = 2.718281828459045

The float variables can be rounded to a given precision:

for name, value in constants.items():
    print(f"{name} = {value:.3f}")
pi = 3.142
sqrt(2) = 1.414
Euler's number = 2.718

Values can also be formatted to occupy a minimum fixed width:

for name, value in constants.items():
    print(f"{name:10} = {value:.3f}")
pi         = 3.142
sqrt(2)    = 1.414
Euler's number = 2.718

Note that the string “Euler’s number” exceeds the minimum length of 10, and is hence represented as is.
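When labels vary in length, the width can be computed instead of hard-coded. The following is a minimal sketch, assuming we want every label padded to the length of the longest one; the names width and lines are introduced here only for illustration:

```python
constants = {
    "pi": 3.14159265358979323846,
    "sqrt(2)": 1.41421356237309504880,
    "Euler's number": 2.71828182845904523536,
}

# Length of the longest label, so that no label overflows its column.
width = max(len(name) for name in constants)

# Replacement fields can be nested: {width} is evaluated first and is
# then used as the minimum width of the name field.
lines = [f"{name:{width}} = {value:.3f}" for name, value in constants.items()]

for line in lines:
    print(line)
```

With this approach, every equals sign should land in the same column, whatever labels the dictionary contains.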

Implicit string concatenation

Strings placed next to each other are automatically concatenated:

assert "foo" "bar" == "foo" + "bar"

This is useful to cleanly produce long strings while respecting a certain maximum line length:

message = "Hello, " \
          "World!"
print(message)
Hello, World!
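The same result can be obtained without backslashes by wrapping the literals in parentheses. The sketch below also shows a pitfall of implicit concatenation; the colors list is a made-up example:

```python
# Inside parentheses, adjacent string literals are joined into one
# string, so no backslash is needed to continue the line.
message = (
    "Hello, "
    "World!"
)
print(message)  # Hello, World!

# Pitfall: a missing comma silently merges two list items into one.
colors = ["red", "green" "blue"]
print(len(colors))  # 2
```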

Lists

enumerate with an offset

The enumerate function wraps an iterable in a lazy iterator that yields (index, item) tuples. It takes an optional second parameter, start, which sets the index to start counting from:

x = ["a", "b", "c"]

for idx, item in enumerate(x, 10):
    print(f"{idx}: {item}")
10: a
11: b
12: c

zip and itertools.zip_longest

The zip function combines two or more iterables into a lazy iterator that yields tuples holding the next item from each. It is particularly useful for handling related lists that have the same length:

numbers = [1, 2, 3]
squared = [x**2 for x in numbers]

for number, square in zip(numbers, squared):
    print(f"The square of {number} is {square}.")
The square of 1 is 1.
The square of 2 is 4.
The square of 3 is 9.

However, when the two iterables have different lengths, zip will only emit as many elements as the shortest of them:

xs = list(range(4))
ys = list(range(5))

for x, y in zip(xs, ys):
    print(x, y)
0 0
1 1
2 2
3 3

When we do not wish this truncation to happen, itertools.zip_longest might be what we need:

from itertools import zip_longest

xs = list(range(4))
ys = list(range(5))

for x, y in zip_longest(xs, ys):
    print(x, y)
0 0
1 1
2 2
3 3
None 4

Sorting by complex criteria

The list.sort method orders a list’s elements in ascending order. It will work as long as the items have defined the < comparison operator, as is the case for floats, integers and strings. However, in some cases that operator might not be implemented, or might not be making the comparison that we care about. The key argument is helpful in those cases:

class Animal:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    def __repr__(self):
        return f"Animal({self.name}, {self.weight})"

animals = [
    Animal("whale", 100000),
    Animal("sea lion", 200),
    Animal("lion", 200),
    Animal("possum", 2.5)
]

# sort by weight
animals.sort(key = lambda x: x.weight)
print(animals)
[Animal(possum, 2.5), Animal(sea lion, 200), Animal(lion, 200), Animal(whale, 100000)]
# sort by name
animals.sort(key = lambda x: x.name)
print(animals)
[Animal(lion, 200), Animal(possum, 2.5), Animal(sea lion, 200), Animal(whale, 100000)]

As shown, key takes a function which will receive an item, and output a comparable value. If we want to order first by weight, then by name, we just need to combine both in a tuple:

# sort by weight, then by name
animals.sort(key = lambda x: (x.weight, x.name))
print(animals)
[Animal(possum, 2.5), Animal(lion, 200), Animal(sea lion, 200), Animal(whale, 100000)]
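Tuple keys also allow mixing sort directions. The following sketch, which redefines the Animal class to stay self-contained, uses operator.attrgetter as an alternative to the lambda and negates the weight to put the heaviest animals first; the variable names are illustrative:

```python
from operator import attrgetter

class Animal:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    def __repr__(self):
        return f"Animal({self.name}, {self.weight})"

animals = [
    Animal("whale", 100000),
    Animal("sea lion", 200),
    Animal("lion", 200),
    Animal("possum", 2.5),
]

# attrgetter with two names builds the same (weight, name) tuple
# as the lambda; sorted returns a new list and leaves animals as is.
by_weight_then_name = sorted(animals, key=attrgetter("weight", "name"))

# Heaviest first, ties broken alphabetically: negate the numeric part.
heaviest_first = sorted(animals, key=lambda a: (-a.weight, a.name))
print([a.name for a in heaviest_first])
# ['whale', 'lion', 'sea lion', 'possum']
```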

The walrus operator

The walrus operator (:=) allows assigning variables in the middle of expressions:

def is_divisor(x, y):
    """
    Check if y is a divisor of x.

    Parameters:
    - x (int): The dividend.
    - y (int): The potential divisor.

    Returns:
    tuple: A tuple containing a boolean indicating whether y is a divisor of x,
           and the remainder when x is divided by y. If y is a divisor, the
           boolean is True, and the remainder is 0; otherwise, the boolean is
           False, and the remainder is the result of x % y.
    """
    if remainder := x % y:
        return False, remainder
    else:
        return True, 0

print(is_divisor(10, 5))
(True, 0)
print(is_divisor(10, 3))
(False, 1)

The walrus operator appears in the first line of the body of is_divisor. It makes two things happen at once. First, it assigns the value of x % y to the remainder variable. Second, the if clause tests that value (falsy if the remainder is 0; truthy if it's any other number). This keeps the code compact, since the remainder is computed, named and tested in a single place.
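Another place where the operator tends to be useful is the condition of a comprehension. The sketch below uses made-up input lines and a made-up pattern to show the idea:

```python
import re

# Hypothetical input: some lines contain an id, some do not.
lines = ["id=12", "no match here", "id=7"]
pattern = re.compile(r"id=(\d+)")

# The match object is bound to m and tested in the same expression,
# so pattern.search runs only once per line.
ids = [int(m.group(1)) for line in lines if (m := pattern.search(line))]
print(ids)  # [12, 7]
```

Without the walrus operator, the search would have to run twice per line, or the comprehension would need to become an explicit loop.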

Floats and integers

Underscores as visual separators

We can use underscores _ as visual separators between any pair of digits in integers, floats or complex numbers:

assert 10_000_000 == 10000000
assert 1_100.3 == 1100.3

I find this particularly useful when dealing with large numbers.

Limitations of floating point arithmetic

Most decimal fractions cannot be represented exactly as binary floating-point numbers. Instead, computers store an approximation. This is not evident when simply asking Python to display a number, since it prints the shortest decimal string that maps back to the same float:

print(0.1)
0.1

However, the approximation shows up if we ask Python for more significant digits:

format(0.1, '.20g')
'0.10000000000000000555'

While the error of this approximation is smaller than \(2^{-53}\), it is enough to cause errors:

assert .1 + .2 == .3
AssertionError

Luckily, we can get around it with a little extra work:

import math

assert math.isclose(.1 + .2, .3)

assert round(.1 + .2, ndigits=1) == round(.3, ndigits=1)
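When exact decimal arithmetic matters, the standard library offers alternatives to floats. The following is a small sketch using the decimal and fractions modules; it is one possible approach, not the only one:

```python
from decimal import Decimal
from fractions import Fraction

# Decimal stores base-10 digits exactly when built from strings.
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# Fraction stores an exact ratio of two integers.
assert Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)

# Building a Decimal from a float exposes the binary approximation
# that the float actually stores.
exact = Decimal(0.1)
print(exact)
```

Both types are slower than floats, so they are usually reserved for cases such as money, where small errors are not acceptable.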

Generators

TODO

Typing hints

In this series, we have seen multiple examples in which the type of a variable is specified. For instance:

  • Variables:
    x: dict[int, int] = {0: 0, 1: 1}
    
  • Function arguments and return values:
    def pretty_print(x: str, prefix: str | None = None) -> None:
      prefix = f"{prefix}: " if prefix else ""
      print(f"{prefix}{x.title()}.")
    

Note that typing hints are a relatively recent addition to Python. Typing hints written for recent versions of Python might produce parsing errors on older versions.

The stdlib’s typing module gives many options to control type hints. (Widely used packages bring their own typing hints, like numpy.) Below I explore some interesting features.

Overloading functions

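The decorator @typing.overload allows declaring several signatures for one function, so that a type checker knows which return type corresponds to which argument type. The decorated definitions exist only for the type checker; the last, undecorated definition holds the actual implementation and must handle every case itself: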

from typing import overload
 
@overload
def square(x: int) -> int:
    ...
 
@overload
def square(x: list[int]) -> list[int]:
    ...
 
def square(x: list[int] | int) -> list[int] | int:
    if isinstance(x, list):
        return [square(_x) for _x in x]
    return x ** 2
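Note that @typing.overload does not dispatch anything at runtime, which is why the implementation relies on isinstance. If runtime dispatch on the argument type is what we want, a different standard library tool, functools.singledispatch, can do it. The sketch below is an alternative technique, not an equivalent of the overloads:

```python
from functools import singledispatch

@singledispatch
def square(x):
    # Fallback for types without a registered implementation.
    raise TypeError(f"Unsupported type: {type(x).__name__}")

@square.register
def _(x: int) -> int:
    # Chosen when the argument is an int.
    return x ** 2

@square.register
def _(x: list) -> list:
    # Chosen when the argument is a list; each item is dispatched again.
    return [square(item) for item in x]

print(square(3))          # 9
print(square([1, 2, 3]))  # [1, 4, 9]
```

Here the annotation on the first parameter selects the implementation, so it has to be a plain class such as list rather than a parametrized type such as list[int].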

Type checking

Python is a dynamically typed language. Hence, typing hints are just that: hints. However, we can run mypy on our entire codebase to check that types are used correctly.
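To illustrate that hints are not enforced at runtime, consider the following sketch. The double function is a made-up example; a static type checker such as mypy is expected to flag the call, but the interpreter runs it without complaint:

```python
def double(x: int) -> int:
    return x * 2

# The annotation says int, but nothing stops a str at runtime:
# the call succeeds and repeats the string instead.
result = double("ab")
print(result)  # abab

# The hints are merely stored as metadata on the function.
print(double.__annotations__)
```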

Bitwise operations

For integer arithmetic

Some people are really concerned by performance. Their concern is such that they are willing to sacrifice code readability for minor gains in performance. Such people might get satisfaction from replacing arithmetic operations involving integers by bitwise operations. Since those act directly on the bit representation of the integer, they can be more efficient. Although compilers perform some optimizations of their own, there is some, admittedly old, evidence that bitwise operations are faster. I describe below some common optimizations.

Dividing and multiplying by powers of two

The >> and the << operators shift the bit representation to the right and to the left, respectively. This can be used to quickly divide or multiply integers by powers of two:

x = 0b101 # 5

# shift to the right by 1
#   0b101 -> 0b10
# equivalent to 5 // 2**1
5 >> 1 # 2

# shift to the left by 4
#   0b101 -> 0b1010000
# equivalent to 5 * 2**4
5 << 4 # 80
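The equivalence can be checked over a range of values, including negative ones. The sketch below is a quick sanity check rather than a proof; the range limits are arbitrary:

```python
checked = 0
for x in range(-16, 17):
    for n in range(5):
        # A right shift matches floor division by 2**n.
        assert x >> n == x // 2**n
        # A left shift matches multiplication by 2**n.
        assert x << n == x * 2**n
        checked += 1

# For negative numbers the shift rounds towards minus infinity, like //,
# and unlike truncating the result of a float division.
print(-5 >> 1, -5 // 2, int(-5 / 2))  # -3 -3 -2
```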

Check if a number is odd

The & operator is the bitwise AND operator. When we use & between any integer and 1, we are effectively checking if the last bit is a 1 (odd) or a 0 (even):

# 0b1110 & 0b0001 = 0b0000 = 0
assert not 14 & 1

# 0b1111 & 0b0001 = 0b0001 = 1
assert 15 & 1
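The check can be wrapped in a small helper and compared against the usual modulo test. The is_odd function below is a hypothetical name used only for this sketch:

```python
def is_odd(n: int) -> bool:
    # The last bit is 1 for odd numbers, 0 for even ones.
    return bool(n & 1)

# The bitwise test agrees with the modulo test, also for negative integers.
agrees = all(is_odd(n) == (n % 2 == 1) for n in range(-10, 11))
print(agrees)  # True
```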

Iterate a list from the end

The ~ operator is the complement operator, which swaps 1s for 0s and vice versa. Let’s see it in action:

# flipping every bit of 1 gives -2 in two's complement
assert ~1 == -2
assert ~-2 == 1

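Since the leading bit represents the sign, flipping every bit has the effect of turning \(x\) into \(-x - 1\). Hence, index ~i points to the element at distance i from the end of a sequence. This is useful when we need to simultaneously iterate from the front and from the back of a list: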

def is_palindrome(word: str) -> bool:
    return all([word[i] == word[~i] for i in range(len(word) // 2)])

assert is_palindrome("kayak")
assert not is_palindrome("dog")
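To make the index arithmetic explicit, the following sketch lists the pairs of positions that get compared for one example word; pairs is an illustrative name:

```python
word = "kayak"

# For each i in the first half, ~i is the mirrored index from the end.
pairs = [(i, ~i, word[i], word[~i]) for i in range(len(word) // 2)]
print(pairs)  # [(0, -1, 'k', 'k'), (1, -2, 'a', 'a')]
```

The middle character of an odd-length word is never compared, since it mirrors itself.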

Exceptions

Handling exceptions with try: ... except: ... is common in Python code. But there are some additional nuances:

y = list()
x = 1
try:
    x + 1
    y.append(1)
    {}[1]
# we can handle multiple, specific exceptions
except TypeError:
    print(f"Can't sum an integer and a {type(x)}.")
except AttributeError:
    print(f"Can't append to {type(y)}.")
# we can still add a catch-all exception
except:
    # we can throw our own exception
    raise Exception("Something went wrong.")
# behavior if no error is raised
else:
    print("All good.")
# a block that will be run no matter what,
# usually good for clean up
finally:
    print("Thanks anyway.")
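Two more nuances can be sketched briefly: a single except clause can handle several exception types, and raise ... from keeps the original error attached to the new one. The lookup function below is a made-up example:

```python
def lookup(table: dict, key):
    try:
        value = table[key]
    # One clause can handle several exception types;
    # "as" binds the exception instance to a name.
    except (KeyError, TypeError) as err:
        # "from err" stores the original exception in __cause__.
        raise ValueError(f"Cannot look up {key!r}") from err
    else:
        return value

try:
    lookup({}, 1)
except ValueError as err:
    cause = err.__cause__

print(type(cause).__name__)  # KeyError
```

Chaining exceptions this way preserves the original traceback, which tends to make debugging easier.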

Context managers

A context manager is a programming construct that makes it easy to allocate and release resources. It is useful for handling file operations, network connections or database transactions, where it is important to release the resource once we are done with it. Context managers are used through the with statement. The context manager class needs two methods: __enter__, to set up the resource, and __exit__, to clean up and release the resource.
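The two methods can be sketched with a toy class. Resource is a hypothetical class that only records what happens to it; a real one would open and close a file, a connection or a transaction:

```python
class Resource:
    """Hypothetical resource that logs its own life cycle."""

    def __init__(self):
        self.log = []

    def __enter__(self):
        # Set up the resource; the return value is bound by "as".
        self.log.append("acquired")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the block ends, even if it raised an exception.
        self.log.append("released")
        return False  # do not suppress the exception

resource = Resource()
try:
    with resource as r:
        r.log.append("used")
        raise RuntimeError("failure inside the block")
except RuntimeError:
    pass

print(resource.log)  # ['acquired', 'used', 'released']
```

Even though the block raised an exception, the resource was released before the exception reached the except clause.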

References