<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>nvie.com &#187; Python</title>
	<atom:link href="http://nvie.com/archives/category/python/feed" rel="self" type="application/rss+xml" />
	<link>http://nvie.com</link>
	<description>Anything that interests me.</description>
	<lastBuildDate>Tue, 24 Aug 2010 11:21:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Unexpected side effects in Python classes</title>
		<link>http://nvie.com/archives/470</link>
		<comments>http://nvie.com/archives/470#comments</comments>
		<pubDate>Wed, 03 Mar 2010 01:22:09 +0000</pubDate>
		<dc:creator>Vincent Driessen</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://nvie.com/?p=470</guid>
		<description><![CDATA[Today, I lost several hours while debugging a language implementation detail in Python that I did not know of and that really feels counterintuitive and dangerous to me. I was writing unit tests for a Python class that I was implementing, when one of the tests that had repeatedly been passing suddenly failed. Moreover, the [...]]]></description>
			<content:encoded><![CDATA[<p>Today, I lost several hours while debugging a language implementation detail in Python that I did not know of and that really feels counterintuitive and dangerous to me.</p>
<p>I was writing unit tests for a Python class that I was implementing, when one of the tests that had repeatedly been passing suddenly failed. Moreover, the failing test case covered a completely unrelated piece of functionality. This simply could not be broken!</p>
<p>After at least an hour of scrutinizing the code, I was able to distill the real problem, which I think is summarized here in the most compact way:<br />
<script src="http://gist.github.com/321150.js"></script><br />
Creating a simple <code>Foo</code> instance twice exposes the ugly side effect: the second <code>Foo</code> instance has an already initialized <code>x</code> instance variable when the constructor is entered! Yuck! And it does not stop there:</p>
<script src="http://gist.github.com/321155.js"></script>
<p>Apparently, the <code>x</code> &#8220;instance variable&#8221; is a shared object, much like a global or class variable.</p>
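<p>For readers who cannot load the embedded gists, here is a minimal sketch of the kind of class described above. It is an assumption about the shape of the original <code>Foo</code>, not a copy of it; the parameter name <code>ident</code> is hypothetical and stands in for the <code>id</code> used in the post.</p>

```python
class Foo:
    x = {}  # runs once, when the class body is executed

    def __init__(self, ident):
        # No instance attribute named x exists yet, so self.x
        # resolves to the dictionary stored on the class.
        self.x[ident] = ident


a = Foo(1)
b = Foo(2)

print(a.x)         # {1: 1, 2: 2}
print(b.x)         # {1: 1, 2: 2}
print(a.x is b.x)  # True
```

<p>Both instances report both keys, because there is only one dictionary.</p>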
<p>To be even more confusing, this doesn&#8217;t seem to hold for basic data types. For example, change the dictionary to an integer, and the example behaves as expected:</p>
<script src="http://gist.github.com/321157.js"></script>
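<p>A rough sketch of the integer variant, assuming the constructor updates <code>x</code> by assigning through <code>self</code> (the gist itself may differ in detail; <code>ident</code> is again a hypothetical parameter name). An integer cannot be modified in place, so the only way to update it is an assignment.</p>

```python
class Foo:
    x = 0  # class-level default

    def __init__(self, ident):
        # Reads the class value, then assigns the result to the instance.
        self.x = self.x + ident


a = Foo(1)
b = Foo(2)

print(a.x, b.x, Foo.x)  # 1 2 0
```

<p>Each instance ends up with its own value, and the class-level <code>x</code> stays at <code>0</code>.</p>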
<h2>The behaviour demystified</h2>
<p>The real confusion here is that I was thinking that I was creating &#8220;instance variables&#8221;, like you would in C++ or Java. As the <a href="http://docs.python.org/tutorial/classes.html#instance-objects">Python documentation</a> mentions:</p>
<blockquote><p>&#8220;data attributes correspond to [...] to data members in C++. Data attributes need not be declared; like local variables, they spring into existence when they are first assigned to.&#8221;</p></blockquote>
<p>Yes, I knew that, but nonetheless my real-world class is much bigger than <code>Foo</code> and I wanted an explicit overview of which instance variables are in this class. Hence the data member.</p>
<p>However, this is not how the Python interpreter processes Python code. In fact, upon class definition, the statement <code>x = {}</code> is executed within the scope of the newly defined class. To prove this:</p>
<script src="http://gist.github.com/321158.js"></script>
<p>Even without a constructor or instance variable, we can access the data member <code>x</code>. Of course. Now this suddenly seems obvious.</p>
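<p>A small sketch of that observation, assuming a <code>Bar</code> class with nothing but the class-level statement. It uses the built-in <code>vars()</code> to look at where the name is actually stored.</p>

```python
class Bar:
    x = {}  # executed as part of the class body


# The attribute lives on the class itself; no instance is needed.
print(Bar.x)               # {}
print("x" in vars(Bar))    # True

b = Bar()
print(vars(b))             # {} (the instance stores nothing of its own)
print(b.x is Bar.x)        # True
```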
<p>But what about our instance variables? When we create a new instance of <code>Bar</code>, the instance has no <code>x</code> of its own yet. Looking up <code>x</code> on the instance falls back to the class, so the name initially refers to <em>the very same object</em> as the class data member <code>x</code>. The following example proves this:</p>
<script src="http://gist.github.com/321172.js"></script>
<p>This example also demonstrates the subtlety of the accidentally discovered side-effect. Remember how we were changing the dictionary in our initial example? <code>self.x[id] = id</code><br />
The instance data member was pointing to the same object as the class data member. By updating the dictionary, the single dictionary object was changed, causing unwanted side effects in other class instances.</p>
<p>In the gist above, we force <code>x</code> to point to a new dictionary by the assignment <code>self.x = { id:id }</code>. The assignment binds <code>x</code> on the instance to a new object and leaves the class data member untouched. This also explains why the integer example worked: it&#8217;s the same kind of assignment.</p>
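<p>The difference between mutating and rebinding can be sketched as follows. This is an illustration under assumptions, not the gist itself: <code>ident</code>, <code>shared_before</code> and <code>shared_after</code> are hypothetical names that exist only to record the identity checks.</p>

```python
class Bar:
    x = {}

    def __init__(self, ident):
        # Before any assignment, self.x is the class-level dictionary.
        self.shared_before = self.x is Bar.x
        # Assignment creates an instance attribute that shadows Bar.x.
        self.x = {ident: ident}
        self.shared_after = self.x is Bar.x


a = Bar(1)
b = Bar(2)

print(a.shared_before, a.shared_after)  # True False
print(a.x, b.x)                         # {1: 1} {2: 2}
print(Bar.x)                            # {}
```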
<h2>Conclusion</h2>
<p>To summarize, I learned some important lessons today:</p>
<ul>
<li>All this time, I have been creating class data members in all my classes, without knowing this.</li>
<li>I initialized those members to default values, effectively creating useless objects that are never accessed and just claiming memory.</li>
<li>Although it can be explained, a seemingly innocent statement like <code>x = {}</code> can have very ugly side effects. Be warned!</li>
<li>Never underestimate the power of unit tests. It is absolutely worth the investment.</li>
</ul>
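<p>One common way to avoid the shared state is to create mutable defaults inside the constructor. The sketch below is an assumption about how such a fix could look, not the change made to the real-world class; <code>ident</code> is a hypothetical parameter name.</p>

```python
class Foo:
    def __init__(self, ident):
        # A fresh dictionary is created for every instance.
        self.x = {}
        self.x[ident] = ident


a = Foo(1)
b = Foo(2)

print(a.x)             # {1: 1}
print(b.x)             # {2: 2}
print(a.x is b.x)      # False
print(hasattr(Foo, "x"))  # False
```

<p>The constructor still gives an overview of the instance variables, which was the original motivation for declaring them at class level.</p>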
]]></content:encoded>
			<wfw:commentRss>http://nvie.com/archives/470/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
