By Vincent Driessen
on Monday, June 15, 2015

I was reading Igor Kalnitsky's blog post on why Python's map() is mad, and wanted to provide a different perspective. In fact, I would call the design of Python's map() beautiful instead.

First off, what does map(f, xs) represent mathematically? It should invoke f(x) for every x in xs. Functions, of course, can take many arguments; single-argument functions are just the simplest case. So what would it be reasonable to assume map(f, xs, ys) does? In the blog post, Igor suggests the behaviour should be to chain xs and ys, but chances are they represent completely different things, so chaining them would lead to a heterogeneous collection of items. Mathematically, you would expect the function calls made to be f(x1, y1), f(x2, y2), ...

Note that this is different from zip()'ing the function arguments. A function f that takes two arguments is not the same as a function f that takes one argument and expects it to be a tuple.

Compare:

def f(x, y):
    return x * y

map(f, ['a', 'b', 'c'], [1, 2, 3])    # ['a', 'bb', 'ccc']

to

def f(pair):
    x, y = pair
    return x * y

map(f, zip(['a', 'b', 'c'], [1, 2, 3]))  # ['a', 'bb', 'ccc']
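
Under Python 3, both calls return lazy iterators rather than lists, so a quick check has to wrap them in list(). A minimal sketch, using the hypothetical names f_two_args and f_pair only to keep the two variants apart:

```python
def f_two_args(x, y):
    # Receives one item from each iterable as separate arguments.
    return x * y


def f_pair(pair):
    # Receives a single tuple and unpacks it itself.
    x, y = pair
    return x * y


letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

spread = list(map(f_two_args, letters, numbers))
zipped = list(map(f_pair, zip(letters, numbers)))

print(spread)  # ['a', 'bb', 'ccc']
print(zipped)  # ['a', 'bb', 'ccc']
```

The results are the same, but the function signatures are not interchangeable: passing f_pair to the first call or f_two_args to the second raises a TypeError.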

The confusion around the items appearing to be zipped is caused by the implicit behaviour in Python 2 when the first argument is None. I think it's handled as a special case, which is unfortunate. A more consistent behaviour would have been to treat None exactly like an explicit identity function, which rejects two iterables:

Python 2.7.9 (default, Dec 19 2014, 06:00:59)
>>> map(lambda x: x, ['a'], [1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: <lambda>() takes exactly 1 argument (2 given)
>>> map(None, ['a'], [1])
[('a', 1)]

The TypeError would have been the sane thing to do, since the identity function should only ever take one argument.

Therefore, my advice would be to never use the implicit None as the first argument. It no longer works under Python 3 anyway: iterating over map(None, ...) raises a TypeError, because None is not callable.
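
To see what this looks like in practice, here is a small sketch of the Python 3 behaviour next to the explicit alternative, zip(). The variable names are only illustrative:

```python
# Python 3: constructing the map object does not call anything yet.
lazy = map(None, ['a'], [1])

error_message = None
try:
    list(lazy)  # the first step attempts to call None('a', 1)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The explicit way to pair items up, which works in Python 2 and 3 alike.
pairs = list(zip(['a'], [1]))
print(pairs)  # [('a', 1)]
```

Note that the error only shows up once the iterator is consumed, not at the map() call itself.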

To zip() or to zip_longest()?

The fact that the behaviour changed in Python 3 is unfortunate, but I think it changed for the better. The problem with zip_longest()-like default semantics is that they only ever work with finite iterables. If even one of the given iterables is infinite, the map will be infinite too. Now, perhaps this is what you want, but in that case you should probably be explicit about it anyway. I think using zip()-like semantics as the default makes perfect sense. It enables the following usage in Python 3:

>>> from itertools import count
>>> 
>>> def f(x, y):
...     return x * y
... 
>>> for x in map(f, ['a', 'b', 'c'], count(1)):
...     print(x)
...
a
bb
ccc

Compare this to Python 2's map behaviour, which pads the shorter iterable with None and builds the whole list eagerly, so it fails before the loop prints anything:

>>> for x in map(f, ['a', 'b', 'c'], count(1)):
...     print(x)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in f
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

This happens because it tries to invoke f(None, 4) on the fourth call, which happens to fail. If that call did not fail, map() would never return, because the list it builds would keep growing without bound.

But what if you actually want zip_longest()-like behaviour? Well, you can either make all arguments be infinite iterables, or you can explicitly wrap your arguments in a zip_longest() wrapper, and pass that to starmap(), which will take an iterable of tuples and spread it over the arguments to f(), just like map:

>>> from itertools import count, islice, starmap, zip_longest
>>>
>>> result = starmap(f, zip_longest(['a', 'b', 'c'], count(1), fillvalue='?'))
>>> for x in islice(result, 7):
...     print(x)
...
a
bb
ccc
????
?????
??????
???????

As a bonus, you can pass in a fillvalue this way, instead of being stuck with the assumption of None (which could happen to be a valid value within the iterable).
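
A short sketch of why a custom fillvalue matters when None is real data. The sentinel name MISSING and the sample lists are hypothetical, chosen only for illustration:

```python
from itertools import zip_longest

MISSING = object()  # hypothetical sentinel, distinct from every real value

readings = [3, None, 5]        # here None is genuine data
labels = ['a', 'b', 'c', 'd']  # one item longer than readings

padded = list(zip_longest(readings, labels, fillvalue=MISSING))

# Only the final pair was padded; the None in the middle is untouched.
was_padded = [reading is MISSING for reading, _ in padded]
print(was_padded)  # [False, False, False, True]
```

With the default fillvalue of None, the second and fourth pairs would be indistinguishable from each other on the readings side.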

Personally, though, for this count(1) example I'd prefer the following, more readable version, which avoids the zip_longest() and starmap() calls:

from itertools import chain, count, repeat
map(f, chain(['a', 'b', 'c'], repeat('?')), count(1))

Note how you can make the map result infinite simply by making all iterables infinite. Consuming the iterables until the first one is exhausted (zip()-like semantics) is therefore the sanest default behaviour, and the most beautiful of the options.
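
As a quick sanity check, the two approaches can be compared side by side. This is only a sketch; f is the same multiplication function used throughout, and the variable names are illustrative:

```python
from itertools import chain, count, islice, repeat, starmap, zip_longest


def f(x, y):
    return x * y


letters = ['a', 'b', 'c']

# Variant 1: pad explicitly with zip_longest(), then spread with starmap().
via_zip_longest = starmap(f, zip_longest(letters, count(1), fillvalue='?'))

# Variant 2: make the short iterable infinite and let map() do the rest.
via_padding = map(f, chain(letters, repeat('?')), count(1))

# Both results are infinite, so only a finite slice can be compared.
first = list(islice(via_zip_longest, 7))
second = list(islice(via_padding, 7))
print(first == second)  # True
```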

Python 3 gets it right

I'm glad that Python 3 changed map() to be sane in every way:

  • It returns a lazy iterator instead of directly producing a list.
  • It disallows the ambiguous None first argument.
  • It consumes the iterables until the first one is exhausted.
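
These properties can be sketched as a pure-Python generator. The name map_sketch is hypothetical, and this is an illustration of the semantics under the assumptions listed above, not how the built-in is actually implemented:

```python
from itertools import count


def map_sketch(func, *iterables):
    """Hypothetical stand-in for Python 3's map(), for illustration only."""
    if not iterables:
        # Raised on first iteration, since this function is a generator.
        raise TypeError("map_sketch() needs at least one iterable")
    iterators = [iter(iterable) for iterable in iterables]
    while True:
        args = []
        for iterator in iterators:
            try:
                args.append(next(iterator))
            except StopIteration:
                # The first exhausted iterable ends the whole mapping.
                return
        # Lazy: each result is computed only when the caller asks for it.
        yield func(*args)


result = map_sketch(lambda x, y: x * y, ['a', 'b', 'c'], count(1))
print(list(result))  # ['a', 'bb', 'ccc']
```

Because the generator stops at the shortest input, pairing a finite list with the infinite count(1) terminates cleanly.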

Bonus: your own zip()

Did you know you could express zip() with map()? It's easy now that you know the exact semantics:

def zip(*iterables):
    return map(lambda *args: tuple(args), *iterables)
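
A small check of this idea under Python 3, written as a sketch. The function is renamed to the hypothetical my_zip so that it does not shadow the built-in it is compared against, and the lambda returns args directly, since args is already a tuple:

```python
def my_zip(*iterables):
    # args is already a tuple, so the lambda simply hands it back.
    return map(lambda *args: args, *iterables)


letters = ['a', 'b', 'c']
numbers = [1, 2, 3, 4]

homemade = list(my_zip(letters, numbers))
builtin = list(zip(letters, numbers))

print(homemade)             # [('a', 1), ('b', 2), ('c', 3)]
print(homemade == builtin)  # True
```

One difference remains: zip() called with no arguments returns an empty iterator, whereas map() requires at least one iterable and raises a TypeError without one.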

If you want to get in touch, I'm @nvie on Twitter.