List comprehensions Readability Performance

A concise way to create new lists from other sequences, for example applying some operations to each element (like ‘map’ in other languages), or selecting elements that satisfy a certain condition (like ‘filter’ in other languages).

list_comp = [x * 2 for x in range(10)]
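
The 'filter' case mentioned above can be sketched by adding an if clause to the comprehension. The variable name below is made up for this illustration:

```python
# Keep only the even numbers and square them (filter + map in one expression).
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
```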

Not Pythonic

result_list = []
for el in range(10000000):
    result_list.append(el)

Pythonic way

result_list = [el for el in range(10000000)]

It is more readable and also improves performance:

# Not Pythonic
0.655946016312 seconds
# Pythonic
0.255126953125 seconds
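
The exact numbers depend on the machine and the Python version. A rough way to reproduce such a comparison, assuming the standard timeit module and a smaller input so that it finishes quickly, might look like this (the helper names with_loop and with_comprehension are hypothetical):

```python
import timeit

N = 100000  # smaller than the 10000000 used above, to keep the run short


def with_loop():
    result_list = []
    for el in range(N):
        result_list.append(el)
    return result_list


def with_comprehension():
    return [el for el in range(N)]


# timeit.timeit calls the function `number` times and returns the total seconds.
loop_time = timeit.timeit(with_loop, number=20)
comp_time = timeit.timeit(with_comprehension, number=20)
print("loop:", loop_time, "seconds")
print("comprehension:", comp_time, "seconds")
```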

References:

[1] PEP 202 - List Comprehensions

Dict comprehensions Readability Performance

Dict comprehensions are an easy and elegant way to construct a dictionary, analogous to list comprehensions.

dict_compr = {k: k**2 for k in range(4)}

Not Pythonic

d = {}
for k in range(10000):
    d[k] = k**2

Pythonic way

dict_compr = {k: k**2 for k in range(10000)}

It is more readable and also improves performance:

# Not Pythonic
0.00253295898438 seconds
# Pythonic
0.00185489654541 seconds

References:

[1] PEP 274 - Dict Comprehensions

Decorators Readability

A decorator is the name used for a software design pattern. Decorators dynamically alter the functionality of a function, method, or class without having to directly use subclasses or change the source code of the function being decorated. [1]

import time

def time_dec(func):
    def wrap(*arg, **kwargs):
        init_time = time.time()
        res = func(*arg, **kwargs)
        print(func.__name__, time.time()-init_time)
        return res
    return wrap

@time_dec
def foo(n):
    out = 0
    for i in range(n):
        out = out + i
    return out

@time_dec
def bar(n):
    return sum(range(n))

foo(10000000)
bar(10000000)


"""
OUTPUT:
foo 0.555375337600708
bar 0.183899402618408
"""

References:

[1] https://wiki.python.org/moin/PythonDecorators

[2] PEP 318 -- Decorators for Functions and Methods

Magic methods Readability

Magic methods are special methods that are invoked by special syntax [1]. For example:

class Rectangle():
    def __init__(self, height, width):
        self.height = height
        self.width = width

    def __eq__(self, rect):
        return (self.height * self.width) == (rect.height * rect.width)

    def __lt__(self, rect):
        return (self.height * self.width) < (rect.height * rect.width)

    def __gt__(self, rect):
        return (self.height * self.width) > (rect.height * rect.width)

r1 = Rectangle(3,6)
r2 = Rectangle(3,5)

print(r1 > r2)  # True
print(r1 < r2)  # False
print(r1 == r2)  # False

This code can be shortened with the functools.total_ordering decorator, which derives the remaining comparison methods from __eq__ and one ordering method.

A complete, well-documented list of magic methods can be found in [1].


References:

[1] docs.python.org - Special Method Names

Finally Block Readability

finally is an optional clause intended for clean-up actions. It is always executed before leaving the try statement, whether or not an exception occurred. For example:

def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("Division by zero not allowed!")
    else:
        print("The result is", result)
    finally:
        print("Goodbye :)")

divide(6, 2)
# The result is 3.0
# Goodbye :)

divide(6, 0)
# Division by zero not allowed!
# Goodbye :)

References:

[1] docs.python.org - Defining clean up actions

With statement Readability

The with statement allows you to execute two related operations as a pair, with a block of code in between [1]. Opening a file is the typical example: the file is opened by the with statement and closed when the block ends, even if an exception is raised:

with open("El_Quijote.txt", 'w') as f:
    f.write("Sancho Panza")

There are more examples, such as threading.Lock for threads, and you can define your own context managers with the __enter__ and __exit__ methods of a class or with the contextlib library [2]. For example:

class File(object):
    def __init__(self, name, mode='r'):
        self.file = open(name, mode)
    def __enter__(self):
        return self.file
    def __exit__(self, type, value, traceback):
        self.file.close()

with File("my_file.txt", 'w') as f:
    f.write("hi")
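
The contextlib route mentioned above could look roughly like the following sketch. It assumes the standard contextlib.contextmanager decorator; the function name opened and the temporary path are hypothetical and only serve the illustration:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opened(name, mode="r"):
    # Code before the yield plays the role of __enter__.
    f = open(name, mode)
    try:
        yield f
    finally:
        # Code after the yield plays the role of __exit__; it runs even
        # if the with block raises.
        f.close()


# A temporary path is used only so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "my_file.txt")
with opened(path, "w") as f:
    f.write("hi")

print(f.closed)  # True
```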

References:

[1] The with statement by example

[2] contextlib - Utilities for with-statements contexts

[3] PEP 343 -- The "with" Statement

enumerate Readability

The built-in enumerate returns an enumerate object from any iterable. Each call to its next method returns a tuple containing a count (from 0 by default) and the next value of the iterable. It is useful when you need the index as well as the item. For example:

seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print(list(enumerate(seasons)))
# [(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
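
In practice enumerate is mostly used directly in a for loop. A small sketch, where the numbered list is a made-up name and start=1 is the optional starting count:

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']

# Unpack the (count, value) pairs in the loop; start=1 shifts the count.
numbered = []
for number, season in enumerate(seasons, start=1):
    numbered.append("{}. {}".format(number, season))

print(numbered)  # ['1. Spring', '2. Summer', '3. Fall', '4. Winter']
```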

References:

[1] docs.python.org - enumerate built-in

[2] PEP 279 - The enumerate() built-in function

Generators Performance

Generator functions allow you to define a function that can be used as an iterator. For example:

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

You can also define an iterator class with the __iter__ and __next__ magic methods.
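
A class-based version of the fibonacci generator could be sketched as follows. The class name Fibonacci is hypothetical, and the method names assume Python 3 (in Python 2 the method is called next):

```python
from itertools import islice


class Fibonacci:
    """Iterator class equivalent to the fibonacci() generator function."""

    def __init__(self):
        self.a, self.b = 0, 1

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        value = self.a
        self.a, self.b = self.b, self.a + self.b
        return value


# islice takes the first 8 values of the otherwise endless sequence.
first = list(islice(Fibonacci(), 8))
print(first)  # [0, 1, 1, 2, 3, 5, 8, 13]
```

The generator function needs far less code for the same behaviour, which is usually the reason to prefer it.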

Generator expressions are a concise way of making generators.

The following example also shows that a generator can improve performance, because it doesn't build the full list in memory:

def my_xrange(n):
    # In real code, use range in Python 3 or xrange in Python 2,
    # but this is a bit more intuitive for beginner Python programmers
    i = 0
    while i < n:
        yield i
        i += 1


def my_range(n):
    # In real code, use list(range(n)) or a list comprehension,
    # but this is a bit more intuitive for beginner Python programmers
    result = []
    i = 0
    while i < n:
        result.append(i)
        i += 1
    return result

for i in my_xrange(10000000):
    pass
# It took 0.514423131943 sec

for i in my_range(10000000):
    pass
# It took 0.966587781906 sec

To fully understand generators, you should understand the keyword yield


References:

[1] Iterators and Generators

[2] Python Wiki - Generators

[3] Jeff Knupp - Yield and Generators

Yield Performance

The keyword yield allows you to write generators. It is used like a return statement, but when you call the function it just returns a generator object; the body does not run until a value is requested.

Try to understand the following example:

def fib():
    """ Fibonacci generator """
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b

f = fib()
print(f)  # <generator object fib at 0x7f72dce25870>

print(next(f)) # 1
print(next(f)) # 1
print(next(f)) # 2
for i in f:
    print(i)

# 3
# 5
# 8
# "Never" ends

References:

[1] PEP 255 - Simple generators

[2] Jeff Knupp - Yield and Generators

Generator Expressions Performance

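Generator expressions are a concise way to create generators. The syntax is the same as for list comprehensions, with parentheses instead of square brackets, but they are more memory efficient because they don't store the whole list in memory.

# The extra parentheses are optional when the expression is the only argument
total = sum(x * 2 for x in range(10))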
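
The memory difference can be observed with sys.getsizeof. This is only a rough sketch: getsizeof reports the size of the container object itself, and the variable names are made up for the illustration:

```python
import sys

n = 100000
as_list = [x * 2 for x in range(n)]  # all n results exist in memory at once
as_gen = (x * 2 for x in range(n))   # results are produced one at a time

# The list grows with n; the generator object stays small.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True

# Both give the same total; note that summing consumes the generator.
print(sum(as_gen) == sum(as_list))  # True
```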

References:

[1] PEP 289 - Generator Expressions

collections.defaultdict Performance Readability

defaultdict is a subclass of the built-in dict class, and it works the same as a dictionary. Its strength is that you can define a default value that is used when a key is not found, instead of raising KeyError.

For example, with int as default_factory

from collections import defaultdict

s = "Honorificabilitudinitatibus"
d = defaultdict(int)
for char in s:
    d[char] += 1 # Doesn't raise KeyError

print(d)
# defaultdict(<class 'int'>, {'H': 1, 'o': 2, 'n': 2, 'r': 1, 'i': 7, 'f': 1, 'c': 1, 'a': 2, 'b': 2, 'l': 1, 't': 3, 'u': 2, 'd': 1, 's': 1})

With list as default_factory:

from collections import defaultdict

d = defaultdict(list)

d["a"].append(1)
d["a"].append(2)
d["b"].append(5)
print(d["c"]) # []
print(d) # defaultdict(<class 'list'>, {'a': [1, 2], 'b': [5], 'c': []})

Other factories such as set or dict also work, and you can supply your own zero-argument function that returns a constant, for example a lambda or one built with itertools.repeat [1].
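
A constant default could be sketched with a lambda as the default_factory. The keys and the "<missing>" placeholder are hypothetical values chosen for the illustration:

```python
from collections import defaultdict

# Any zero-argument callable can serve as the default_factory.
d = defaultdict(lambda: "<missing>")
d.update(name="John", action="ran")

text = "{} {} to {}".format(d["name"], d["action"], d["object"])
print(text)  # John ran to <missing>
```

Note that looking up d["object"] also stores the default under that key.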

defaultdict can improve readability and maintainability, and sometimes performance, although checking for a key with the in keyword is already really fast.


References:

[1] docs.python.org - defaultdict objects

collections.namedtuple Readability

namedtuple is a factory function for creating tuple subclasses whose fields are accessible by attribute lookup as well as being indexable and iterable [1]. Named tuples are useful for storing the results of some functions. For example, the next function returns a point, but the calling code is not really readable:

def get_point():
    # ...
    x = 2.0
    y = -3.5
    return x, y

point = get_point()
print(point[0], point[1]) # not readable

You can also unpack the point with x, y = get_point(), but you get 2 variables, and with multiple points it could be confusing. The alternative is to use namedtuple:

from collections import namedtuple

# Create the class once, at module level, instead of on every call
Point = namedtuple("Point", "x y")

def get_point():
    # ...
    x = 2.0
    y = -3.5
    return Point(x, y)

point = get_point()
print(point.x, point.y) # better

It is more readable, easier to use and more maintainable


References:

[1] docs.python.org - namedtuple

collections.deque Performance Readability

deque (double-ended queue) is a generalization of queues and stacks. It supports thread-safe, memory-efficient appends and pops from either end, with approximately O(1) performance in either direction.

One example of the use of deque methods is shown below:

from collections import deque

d = deque([1,2,3,4])
d.append(d.pop())
print(d) # deque([1, 2, 3, 4])
d.appendleft(d.popleft())
print(d) # deque([1, 2, 3, 4])
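
Two common uses can be sketched as follows: a FIFO queue, and a bounded buffer built with the optional maxlen argument. The variable names are made up for the illustration:

```python
from collections import deque

# As a FIFO queue: append on the right, pop from the left.
queue = deque()
queue.append("a")
queue.append("b")
queue.append("c")
first = queue.popleft()
print(first, list(queue))  # a ['b', 'c']

# With maxlen, old items fall off the opposite end automatically.
last_three = deque(maxlen=3)
for i in range(6):
    last_three.append(i)
print(last_three)  # deque([3, 4, 5], maxlen=3)
```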

References:

[1] docs.python.org - deque objects

collections.Counter Readability

Counter is a dict subclass for counting hashable objects. Among others, it supports the methods elements(), most_common([n]) and subtract([iterable-or-mapping]).

from collections import Counter

s = "Honorificabilitudinitatibus"

c = Counter(s) # Counter from iterable
print(c['i']) # 7
print(list(c.elements())) # ['H', 'o', 'o', 'n', 'n', 'r', 'i', 'i', 'i', 'i', 'i', 'i', 'i', 'f', 'c', 'a', 'a', 'b', 'b', 'l', 't', 't', 't', 'u', 'u', 'd', 's']
print(c.most_common(4)) # [('i', 7), ('t', 3), ('o', 2), ('n', 2)]

print(c) # Counter({'i': 7, 't': 3, 'o': 2, ...
c.subtract(['i'])
print(c) # Counter({'i': 6, 't': 3, 'o': 2, ...

References:

[1] docs.python.org - Counter

@classmethod Readability

A class method is a method that belongs to the class rather than to its instances. It receives the class as its first argument and doesn't need an instance of the class to be called. @classmethod is the function decorator that creates a class method.

class A(object):
    def foo(self):
        return "I am a method of A's instance :)"

    @classmethod
    def class_foo(cls):
        return "I am a class_method, I know about the class :)", cls

print(A.class_foo()) # ('I am a class_method, I know about the class :)', <class '__main__.A'>)
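
A frequent use of class methods is an alternative constructor. The following is a sketch with hypothetical class and method names, not part of the example above:

```python
class Rectangle(object):
    def __init__(self, height, width):
        self.height = height
        self.width = width

    @classmethod
    def square(cls, side):
        # cls is the class the method was called on, so subclasses
        # get an instance of their own type.
        return cls(side, side)


class ColoredRectangle(Rectangle):
    pass


s = Rectangle.square(4)
print(s.height, s.width)                          # 4 4
print(type(ColoredRectangle.square(2)).__name__)  # ColoredRectangle
```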

References:

[1] docs.python.org - classmethod

@staticmethod Readability

A static method doesn't receive an implicit first argument, therefore that method "doesn't have" any reference to the class from inside. @staticmethod is a function decorator for a static-method.

class A(object):
    def foo(self):
        return "I am a method of A's instance :)"

    @staticmethod
    def static_foo():
        return "I am a static_method, I don't know about the class :)"


print(A.static_foo()) # I am a static_method, I don't know about the class :)

References:

[1] docs.python.org - staticmethod

zip Readability Performance

This built-in returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables [1].

names = ["John", "Alexander", "Bob"]
marks = [5.5, 7, 10]
for name, mark in zip(names, marks):
    print(name, mark)

# John 5.5
# Alexander 7
# Bob 10
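
Two details are worth a quick sketch: zip stops at the shortest input, and its pairs can be fed straight to dict(). The extra name "Alice" is added here only for the illustration:

```python
names = ["John", "Alexander", "Bob", "Alice"]
marks = [5.5, 7, 10]

# zip stops at the shortest input, so "Alice" is silently dropped.
pairs = list(zip(names, marks))
print(pairs)  # [('John', 5.5), ('Alexander', 7), ('Bob', 10)]

# Pairs of (key, value) can be passed straight to dict().
marks_by_name = dict(zip(names, marks))
print(marks_by_name["Bob"])  # 10
```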

References:

[1] docs.python.org - zip

itertools Readability Performance

zip_longest

This function is like zip: it generates an iterator of tuples from the argument iterables. If any iterable is shorter than the longest one, the missing positions in the last tuples are filled with a given default value (fillvalue), or None. In Python 2 the function is called izip_longest.

Better explained with one example:

from itertools import zip_longest

names = ["John", "Alexander", "Bob", "Alice"]
marks = [5.5, 7, 10]
for name, mark in zip_longest(names, marks, fillvalue="absent"):
    print(name, mark)

# John 5.5
# Alexander 7
# Bob 10
# Alice absent

starmap

starmap is like map (it computes a function using arguments obtained from an iterable), but the difference is that each item of the iterable is a tuple of arguments that is unpacked into the function call.

map(func, iter1, iter2) is similar to starmap(func, zip(iter1, iter2))

from itertools import starmap
items = [(1, 7), (2, 8), (3, 9)]
res = starmap(lambda x, y: x*y, items)
print(list(res)) # res is an iterator
# [7, 16, 27]

tee

This is a useful function that returns n independent iterators from a single iterable. Important: once tee has been called, the original iterator shouldn't be used anywhere else, because it could get advanced without the tee objects being informed.

from itertools import tee

it = range(10)

it1, it2, it3 = tee(it, 3)

print(next(it1)) # 0
print(next(it2)) # 0
print(next(it2)) # 1
print(next(it3)) # 0

groupby

It is useful for grouping the consecutive items of an iterable by the value of a key function. For example:

from itertools import groupby

# Show the words with the same initial letter together
words = ["dog", "cat", "house", "car", "function", "class", "foo"]

# words MUST be sorted first, because groupby only groups consecutive items
words_sorted = sorted(words)

for key, group in groupby(words_sorted, lambda x: x[0]):
    print(list(group)) # group is an iterator

# ['car', 'cat', 'class']
# ['dog']
# ['foo', 'function']
# ['house']

References:

[1] docs.python.org - izip

[2] docs.python.org - starmap

[3] docs.python.org - tee

[4] docs.python.org - groupby

@functools.total_ordering Readability

Given a class defining __eq__ and at least one of the ordering methods (__lt__, __le__, __gt__ or __ge__), this class decorator supplies the rest.

from functools import total_ordering

@total_ordering
class Rectangle():
    def __init__(self, height, width):
        self.height = height
        self.width = width

    def __eq__(self, rect):
        return (self.height * self.width) == (rect.height * rect.width)

    def __lt__(self, rect):
        return (self.height * self.width) < (rect.height * rect.width)

r1 = Rectangle(3,6)
r2 = Rectangle(3,5)

print(r1 > r2) # True
print(r1 < r2) # False
print(r1 >= r2) # True
print(r1 <= r2) # False
print(r1 == r2) # False

References:

[1] docs.python.org - @total_ordering

map Readability

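Applies a function to every item of an iterable and returns the results [1]. In Python 2 the result is a list; in Python 3 it is a lazy iterator, so wrap it in list() when you need a list.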

transformed_data = map(my_function, iterable_data)

Not Pythonic

In this case we can see how to compute the squares of a list of numbers:

numbers = range(5)
square = []
for i in numbers:
    square.append(i**2)

Pythonic

A better way of doing that is with the use of map. In this case it applies a lambda function:

numbers = range(5)
print(list(map(lambda x: x**2, numbers)))

But the more Pythonic way in this case is a list comprehension.
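
A sketch of that list comprehension, using the same numbers as the map example (the name squares is made up for the illustration):

```python
numbers = range(5)

squares = [i ** 2 for i in numbers]
print(squares)  # [0, 1, 4, 9, 16]

# Same values as the map version.
print(squares == list(map(lambda x: x ** 2, numbers)))  # True
```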


References:

[1] docs.python.org - map

filter Readability

Constructs a sequence from those elements of an iterable for which the function returns true [1]. In Python 2 the result is a list; in Python 3 it is a lazy iterator.

filter(boolean_function, items)

Not Pythonic

numbers = range(20)

result = []
for n in numbers:
    if n % 2:
        result.append(n)

print(result)

Pythonic

numbers = range(20)

result = list(filter(lambda x: x % 2, numbers))

References:

[1] docs.python.org - filter

reduce Readability

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. [1] In Python 3, reduce must be imported from the functools module.

reduce(func_for_operation, items)

Not Pythonic

numbers = range(20)

result = 0
for n in numbers:
    result += n

print(result)

Pythonic

from functools import reduce  # reduce is a built-in in Python 2

numbers = range(20)

result = reduce(lambda x, y: x + y, numbers)

print(result)
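
For plain addition the built-in sum is simpler than reduce, so reduce tends to pay off for other operations. A sketch assuming Python 3 and the standard operator module; the product example is not from the text above:

```python
from functools import reduce
import operator

numbers = range(1, 6)

# operator.mul replaces a hand-written lambda x, y: x * y.
product = reduce(operator.mul, numbers)
print(product)  # 120

# The optional third argument is the starting value; it is also what
# reduce returns for an empty iterable.
print(reduce(operator.mul, [], 1))  # 1

# For addition, sum does the same job as the reduce example.
print(sum(range(20)))  # 190
```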

References:

[1] docs.python.org - reduce

lambda Readability

lambda is a Python keyword that lets you construct anonymous functions in one line.

The simplest example is shown below:

add_one = lambda x: x + 1
print(add_one(2)) # 3

But it can be applied to map, filter and reduce. Or other functions:

numbers = range(-10, 10)

print(sorted(numbers, key=lambda x: x**2))
# [0, -1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6, -7, 7, -8, 8, -9, 9, -10]

References:

[1] docs.python.org - lambda

dictionaries Readability Performance

Dictionaries in Python are similar to hash maps in other languages (for example HashMap in Java). This kind of structure allows you to store items indexed by a key. Dictionaries are a basic structure in Python, and they can be very powerful if you know how to manage them.

Get a value.

There are multiple ways, but one idiomatic practice in Python is EAFP (it's Easier to Ask for Forgiveness than Permission). Jeff Knupp explains it in one of his articles [1]. Here are several options:

  • Catching the exception (EAFP):

    try:
        my_value = my_dict['k']
    except KeyError:
        print("Key not found")

  • Look Before You Leap (LBYL), the opposite of the EAFP principle:

    if 'k' in my_dict:
        my_value = my_dict['k']

  • dict.get(key, default): tries to get the value and, if the key does not exist, returns a default value:

    my_dict.get('k', 0)

  • collections.defaultdict
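
The four options above can be compared side by side in a small sketch. The dictionary contents and the variable names are hypothetical:

```python
from collections import defaultdict

my_dict = {'a': 1}

# EAFP: try the lookup and handle the failure.
try:
    eafp_value = my_dict['k']
except KeyError:
    eafp_value = 0

# LBYL: check first, then look up.
lbyl_value = my_dict['k'] if 'k' in my_dict else 0

# dict.get with a default; the key is not added to the dictionary.
get_value = my_dict.get('k', 0)

# defaultdict: the missing key is created with the factory's value.
dd = defaultdict(int, my_dict)
dd_value = dd['k']

print(eafp_value, lbyl_value, get_value, dd_value)  # 0 0 0 0
print('k' in my_dict, 'k' in dd)                    # False True
```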

Create a dictionary.

The default way is d = dict() or d = {}, but it can also be accomplished with dict comprehensions, as explained in another section:

dict_compr = {k: k**2 for k in range(4)}

Iterate over keys and values:

In Python 3, dict.items() returns a view of the (key, value) pairs, so you can iterate over it directly without copying the dictionary. (In Python 2, items() returned a list of pairs and iteritems() was the iterator-based alternative.)

for k, v in d.items():
    print(k, v)

References:

[1] Jeff Knupp - Write Cleaner Python: Use Exceptions

* and + for sequences Performance

In order to concatenate multiple copies of the same sequence, the most Pythonic approach is with the use of *:

l = [1, 2, 3]
print(l * 3)
# [1, 2, 3, 1, 2, 3, 1, 2, 3]
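
One caveat, sketched below with made-up variable names: * copies references rather than the objects they point to, which matters when the sequence contains mutable items such as inner lists.

```python
# The three rows are the same inner list, repeated by reference.
shared = [[0] * 2] * 3
shared[0][0] = 9
print(shared)  # [[9, 0], [9, 0], [9, 0]]

# A list comprehension builds an independent inner list per row.
independent = [[0] * 2 for _ in range(3)]
independent[0][0] = 9
print(independent)  # [[9, 0], [0, 0], [0, 0]]
```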

To concatenate two lists, the best approach is to add them with +:

l_1 = [1, 2, 3]
l_2 = [4, 5, 6]
print(l_1 + l_2)
# [1, 2, 3, 4, 5, 6]

References:

[1] "Fluent Python" - Using + and * with sequences

Pythonic operations over strings Performance

str.format()

Sometimes, creating a new string requires combining several values.

Not Pythonic

my_str = "%s is %i years old" % ("Peter", 50)
# Peter is 50 years old
data = {'name': 'Peter', 'age': 50}
my_str = data['name'] + " is " + str(data['age']) + " years old"
# Peter is 50 years old

Pythonic

The best way is to use the .format() method. It has fewer disadvantages and is more powerful:

my_str = "{} is {} years old".format("Peter", 50)
# Peter is 50 years old

my_str = "{0} is {1} years old. {1} years old!".format("Peter", 50)
# Peter is 50 years old. 50 years old!

data = {'name': 'Peter', 'age': 50}
my_str = "{name} is {age} years old.".format(**data)
# Peter is 50 years old.

my_str = "{p[name]} is {p[age]} years old.".format(p=data)
# Peter is 50 years old.

str.join()

When you have to concatenate multiple strings from a finite sequence, the easiest way could be a for loop, but there is a better one.

Not Pythonic

out_str = ""
for num in range(10):
    out_str += '{},'.format(num)
# 0,1,2,3,4,5,6,7,8,9,

Pythonic

Using join is simpler, avoids the trailing separator, and performs better [2]

out_str = ','.join(str(n) for n in range(10))
# 0,1,2,3,4,5,6,7,8,9

References:

[1] Using % and .format() for great good!

[2] Efficient string concatenation