I was reading Igor Kalnitsky's blog post on why Python's map() is mad, and
wanted to provide a different perspective. In fact, I would call the design of
Python's map() beautiful instead.
First off, what does map(f, xs) represent mathematically in the first place?
It should invoke function f(x) for every x in xs. Functions, of course, can
take many arguments; single-argument functions are just the simplest case.
So what would be reasonable to assume map(f, xs, ys) would do? In the blog
post, Igor suggests the behaviour should be to chain xs and ys, but chances
are they represent completely different things, so chaining them would lead to
a heterogeneous collection of items. Mathematically, you would expect the
function calls made to be f(x1, y1), f(x2, y2), ...
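To make the difference concrete, here is a small Python 3 sketch that records which calls are actually made. The recording function f and the sample lists are my own illustration, not something from Igor's post:

```python
from itertools import chain

def f(*args):
    # Return the arguments of each call so we can inspect them afterwards.
    return args

xs = ['a', 'b', 'c']
ys = [1, 2, 3]

# What map() does: one call per position, with one argument per iterable.
parallel = list(map(f, xs, ys))

# What chaining would do: six single-argument calls over a mixed sequence.
chained = list(map(f, chain(xs, ys)))

print(parallel)  # [('a', 1), ('b', 2), ('c', 3)]
print(chained)   # [('a',), ('b',), ('c',), (1,), (2,), (3,)]
```

If chaining is what you want, spelling out chain() yourself keeps that intent visible at the call site.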
Note that this is different from zip()'ing the function arguments. A function
f with 2 arguments is different from a function f with 1 argument, expecting a
tuple.
Compare:
    def f(x, y):
        return x * y

    map(f, ['a', 'b', 'c'], [1, 2, 3])  # ['a', 'bb', 'ccc']
to
    def f(pair):
        x, y = pair
        return x * y

    map(f, zip(['a', 'b', 'c'], [1, 2, 3]))  # ['a', 'bb', 'ccc']
The confusion around the items appearing to be zipped is caused by the implicit
behaviour in Python 2 when the first argument is None. I think it's handled
as a special case, which is unfortunate. A more consistent behaviour would have
been to treat None exactly like an explicit identity function. Compare the two:
    Python 2.7.9 (default, Dec 19 2014, 06:00:59)
    >>> map(lambda x: x, ['a'], [1])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: <lambda>() takes exactly 1 argument (2 given)
    >>> map(None, ['a'], [1])
    [('a', 1)]
The TypeError would have been the sane thing to do, since the identity function should only ever take one argument.
Therefore, my advice would be to never use the implicit None as the first
argument. It no longer works under Python 3 anyway: None is not callable, so
consuming the result raises a TypeError.
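For reference, here is a minimal Python 3 sketch of what happens to the None form and what to write instead. The variable names are mine. Note that the map() call itself succeeds because map() is lazy; the error only appears once the result is consumed:

```python
# The explicit replacement for Python 2's map(None, xs, ys) is zip().
pairs = list(zip(['a'], [1]))
print(pairs)  # [('a', 1)]

# Creating the map object does not call anything yet, so no error here.
lazy = map(None, ['a'], [1])

# Consuming it tries to call None, which raises a TypeError.
try:
    list(lazy)
    failed = False
except TypeError:
    failed = True

print(failed)  # True
```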
To zip() or to zip_longest()?
The fact that the behaviour changed in Python 3 is unfortunate, but I think it
changed for the better. The problem with zip_longest()-like default semantics
is that it only works well with finite iterables. If even one of the given
iterables is infinite, the map will be infinite too. Now, perhaps this is what
you want, but in that case you should probably be explicit about it anyway. I
think using zip()-like semantics as the default makes perfect sense. It
enables the following usage in Python 3:
    >>> from itertools import count
    >>>
    >>> def f(x, y):
    ...     return x * y
    ...
    >>> for x in map(f, ['a', 'b', 'c'], count(1)):
    ...     print(x)
    ...
    a
    bb
    ccc
Compare this to Python 2's map behaviour, which would do:
    >>> for x in map(f, ['a', 'b', 'c'], count(1)):
    ...     print(x)
    ...
    a
    bb
    ccc
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in f
    TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
This happens because it tries to invoke f(None, 4) on the fourth call, which
happens to fail. If that call did not fail, it would keep producing results
forever.
But what if you actually want zip_longest()-like behaviour? Well, you can
either make all arguments be infinite iterables, or you can explicitly wrap
your arguments in a zip_longest() wrapper, and pass that to starmap(), which
will take an iterable of tuples and spread it over the arguments to f(), just
like map:
    >>> from itertools import count, islice, starmap, zip_longest
    >>>
    >>> result = starmap(f, zip_longest(['a', 'b', 'c'], count(1), fillvalue='?'))
    >>> for x in islice(result, 7):
    ...     print(x)
    ...
    a
    bb
    ccc
    ????
    ?????
    ??????
    ???????
As a bonus, you can pass in a fillvalue this way, instead of being stuck with
the assumption of None (which could happen to be a valid value within the
iterable).
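When None really is a valid value in your data, one common trick is to use a unique sentinel object as the fillvalue. This is only a sketch; the MISSING name and the sample data are hypothetical:

```python
from itertools import zip_longest

# A unique object that cannot be confused with any real value, including None.
MISSING = object()

xs = [1, None, 3]  # None is a legitimate value here
ys = ['x']

rows = list(zip_longest(xs, ys, fillvalue=MISSING))

# Identity checks tell padding apart from real None values.
padded = [y is MISSING for _, y in rows]
print(padded)  # [False, True, True]
```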
However, personally, in this case, I'd prefer the following, more readable
version that avoids the zip_longest() and starmap() calls:
    from itertools import chain, count, repeat
    map(f, chain(['a', 'b', 'c'], repeat('?')), count(1))
Note how you can thus make the map result infinite by simply making all
iterables infinite. Consuming iterables until the first one is exhausted
(zip()-like) is therefore the sanest default behaviour, and the most beautiful
of the options.
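As a quick sanity check, the following Python 3 sketch compares the padded map() version with the starmap()/zip_longest() version on the first seven results. It reuses the f from the earlier examples; the variable names are mine:

```python
from itertools import chain, count, islice, repeat, starmap, zip_longest

def f(x, y):
    return x * y

letters = ['a', 'b', 'c']

# Pad the finite iterable by hand, then rely on map()'s zip()-like semantics.
padded = map(f, chain(letters, repeat('?')), count(1))

# The explicit zip_longest() + starmap() spelling of the same idea.
explicit = starmap(f, zip_longest(letters, count(1), fillvalue='?'))

first_padded = list(islice(padded, 7))
first_explicit = list(islice(explicit, 7))

print(first_padded == first_explicit)  # True
```

The two spellings only agree here because count(1) never ends. With chain() and repeat() you choose which argument gets padded, whereas zip_longest() pads whichever iterable runs out first.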
Python 3 gets it right
I'm glad that Python 3 changed map() to be sane in every way:
- It is a lazy iterator and does not directly produce a list.
- It disallows the ambiguous None first argument.
- It consumes the iterables until the first one is exhausted.
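The laziness in the first point is easy to observe. In this small sketch (the square function and the calls list are just for illustration), nothing is computed until a value is requested:

```python
calls = []

def square(x):
    # Record each call so we can see when the work actually happens.
    calls.append(x)
    return x * x

lazy = map(square, [1, 2, 3])
calls_before = list(calls)       # nothing has been called yet

first = next(lazy)               # computes exactly one value
calls_after_first = list(calls)

rest = list(lazy)                # list() materialises the remainder

print(calls_before, first, calls_after_first, rest)  # [] 1 [1] [4, 9]
```

If you need the Python 2 style list, wrapping the call in list() gives you one.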
Bonus: your own zip()
Did you know you could express zip() with map()? It's easy, now that you know
the exact semantics:
    def zip(*iterables):
        return map(lambda *args: tuple(args), *iterables)
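To check that this matches the built-in, here is a short Python 3 comparison. I've renamed the function to my_zip (a hypothetical name) so it does not shadow the real zip():

```python
def my_zip(*iterables):
    # Same definition as above, under a name that keeps the built-in visible.
    return map(lambda *args: tuple(args), *iterables)

# Unequal lengths: both stop as soon as the shortest iterable is exhausted.
ours = list(my_zip('abc', [1, 2, 3, 4]))
theirs = list(zip('abc', [1, 2, 3, 4]))

print(ours)            # [('a', 1), ('b', 2), ('c', 3)]
print(ours == theirs)  # True
```

One caveat: called with no arguments at all, the built-in zip() returns an empty iterator, while this version raises a TypeError because map() requires at least one iterable.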