"## The C3 class resolution algorithm for multiple class inheritance"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n",
"\n",
"If you want to learn more, please read the [original blog](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) post by Guido van Rossum.\n",
"Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`"
"If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n",
"This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):"
"Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time when we call the function without providing an argument for the default parameter, but this is not the case: **Python will create the mutable object (default parameter) the first time the function is defined - not when it is called**, see the following code:\n",
"Chicken or egg? In the history of Python (Python 2.2 to be specific) truth values were implemented via 1 and 0 (similar to the old C), to avoid syntax error in old (but perfectly working) code, `bool` was added as a subclass of `int` in Python 2.3.\n",
"In the first example below, where we call a `lambda` function in a list comprehension, the value `i` is dereferenced every time we call `lambda` within the scope of the list comprehension. Since the list is already constructed when we `for-loop` through the list, it is set to the last value 4."
"## Python's LEGB scope resolution and the keywords `global` and `nonlocal`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is nothing particularly surprising about Python's LEGB scope resolution (Local -> Enclosed -> Global -> Built-in), but it is still useful to take a look at some examples!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `global` vs. `local`\n",
"\n",
"According to the LEGB rule, Python will first look for a variable in the local scope. So if we set the variable `x = 1` in the `local`ly in the function's scope, it won't have an effect on the `global` `x`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = 0\n",
"def in_func():\n",
" x = 1\n",
" print('in_func:', x)\n",
" \n",
"in_func()\n",
"print('global:', x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"in_func: 1\n",
"global: 0\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we want to modify the `global` x via a function, we can simply use the `global` keyword to import the variable into the function's scope:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = 0\n",
"def in_func():\n",
" global x\n",
" x = 1\n",
" print('in_func:', x)\n",
" \n",
"in_func()\n",
"print('global:', x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"in_func: 1\n",
"global: 1\n"
]
}
],
"prompt_number": 34
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `local` vs. `enclosed`\n",
"\n",
"Now, let us take a look at `local` vs. `enclosed`. Here, we set the variable `x = 1` in the `outer` function and set `x = 1` in the enclosed function `inner`. Since `inner` looks in the local scope first, it won't modify `outer`'s `x`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def outer():\n",
" x = 1\n",
" print('outer before:', x)\n",
" def inner():\n",
" x = 2\n",
" print(\"inner:\", x)\n",
" inner()\n",
" print(\"outer after:\", x)\n",
"outer()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"outer before: 1\n",
"inner: 2\n",
"outer after: 1\n"
]
}
],
"prompt_number": 36
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is where the `nonlocal` keyword comes in handy - it allows us to modify the `x` variable in the `enclosed` scope:"
"However, **there are ways** to modify the mutable contents of the tuple without raising the `TypeError`, the solution is the `.extend()` method, or alternatively `.append()` (for lists):"
"**A. Jesse Jiryu Davis** has a nice explanation for this phenomenon (Original source: [http://emptysqua.re/blog/python-increment-is-weird-part-ii/](http://emptysqua.re/blog/python-increment-is-weird-part-ii/))\n",
"If we try to extend the list via `+=` *\"then the statement executes `STORE_SUBSCR`, which calls the C function `PyObject_SetItem`, which checks if the object supports item assignment. In our case the object is a tuple, so `PyObject_SetItem` throws the `TypeError`. Mystery solved.\"*"
"#### One more note about the `immutable` status of tuples. Tuples are famous for being immutable. However, how comes that this code works?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_tup = (1,)\n",
"my_tup += (4,)\n",
"my_tup = my_tup + (5,)\n",
"print(my_tup)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"(1, 4, 5)\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What happens \"behind\" the curtains is that the tuple is not modified, but every time a new object is generated, which will inherit the old \"name tag\":"
"## Public vs. private class methods and name mangling\n",
"\n",
"Who has not stumbled across this quote \"we are all consenting adults here\" in the Python community, yet? Unlike in other languages like C++ (sorry, there are many more, but that's I am most familiar with), we can't really protect class methods from being used outside the class. \n",
"All we can do is to indicate methods as private to make clear that they are better not used outside the class, but it is really up to the class user, since \"we are all consenting adults here\"! \n",
"So, when we want to \"make\" class methods private, we just put a double-underscore in front of it (same with other class members), which invokes some name mangling if we want to acess the private class member outside the class! \n",
"This doesn't prevent the class user to access this class member though, but he has to know the trick and also knows that it his own risk...\n",
"\n",
"Let the following example illustrate what I mean:"
"**The solution** is that we are iterating through the list index by index, and if we remove one of the items in-between, we inevitably mess around with the indexing, look at the following example, and it will become clear:"
"\u001b[0;31mIndexError\u001b[0m: list index out of range"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But suprisingly, it is not raised when we are doing list slicing, which can be a really pain for debugging:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_list = [1, 2, 3, 4, 5]\n",
"print(my_list[5:])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[]\n"
]
}
],
"prompt_number": 16
},
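{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of why no error is raised, assuming the same `my_list` as above: slice bounds that lie outside the list are silently clamped to its length, whereas a single out-of-range index is not. The indices `3:100` and `10` are arbitrary values picked for illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_list = [1, 2, 3, 4, 5]\n",
"\n",
"# slice bounds beyond the end are clamped to len(my_list)\n",
"print(my_list[3:100])   # [4, 5]\n",
"print(my_list[10:])     # []\n",
"\n",
"# a single index is not clamped and raises\n",
"try:\n",
"    my_list[10]\n",
"except IndexError as err:\n",
"    print('IndexError:', err)   # IndexError: list index out of range"
],
"language": "python",
"metadata": {},
"outputs": []
},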
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='unboundlocalerror'></a>\n",
"## Reusing global variable names and `UnboundLocalErrors`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Usually, it is no problem to access global variables in the local scope of a function:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_func():\n",
" print(var)\n",
"\n",
"var = 'global'\n",
"my_func()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"global\n"
]
}
],
"prompt_number": 37
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And is also no problem to use the same variable name in the local scope without affecting the local counterpart: "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_func():\n",
" var = 'locally changed'\n",
"\n",
"var = 'global'\n",
"my_func()\n",
"print(var)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"global\n"
]
}
],
"prompt_number": 38
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But we have to be careful if we use a variable name that occurs in the global scope, and we want to access it in the local function scope if we want to reuse this name:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_func():\n",
" print(var) # want to access global variable\n",
" var = 'locally changed' # but Python thinks we forgot to define the local variable!\n",
" \n",
"var = 'global'\n",
"my_func()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "UnboundLocalError",
"evalue": "local variable 'var' referenced before assignment",
"Let's assume a scenario where we want to duplicate sub`list`s of values stored in another list. If we want to create independent sub`list` object, using the arithmetic multiplication operator could lead to rather unexpected (or undesired) results:"
"But it might be still worthwhile, especially for Python newcomers, to take a look at some of those!\n",
"(Note: the the code was executed in Python 3.4.0 and Python 2.7.5 and copied from interactive shell sessions.)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Unicode...\n",
"####- Python 2: \n",
"We have ASCII `str()` types, separate `unicode()`, but no `byte` type\n",
"####- Python 3: \n",
"Now, we finally have Unicode (utf-8) `str`ings, and 2 byte classes: `byte` and `bytearray`s"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#############\n",
"# Python 2\n",
"#############\n",
"\n",
">>> type(unicode('is like a python3 str()'))\n",
"<type 'unicode'>\n",
"\n",
">>> type(b'byte type does not exist')\n",
"<type 'str'>\n",
"\n",
">>> 'they are really' + b' the same'\n",
"'they are really the same'\n",
"\n",
">>> type(bytearray(b'bytearray oddly does exist though'))\n",
"<type 'bytearray'>\n",
"\n",
"#############\n",
"# Python 3\n",
"#############\n",
"\n",
">>> print('strings are now utf-8 \\u03BCnico\\u0394\u00e9!')\n",
"strings are now utf-8 \u03bcnico\u0394\u00e9!\n",
"\n",
"\n",
">>> type(b' and we have byte types for storing data')\n",
"<class 'bytes'>\n",
"\n",
">>> type(bytearray(b'but also bytearrays for those who prefer them over strings'))\n",
"<class 'bytearray'>\n",
"\n",
">>> 'string' + b'bytes for data'\n",
"Traceback (most recent call last):s\n",
" File \"<stdin>\", line 1, in <module>\n",
"TypeError: Can't convert 'bytes' object to str implicitly"
],
"language": "python",
"metadata": {},
"outputs": []
},
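{
"cell_type": "markdown",
"metadata": {},
"source": [
"To combine the two types in Python 3, one of them has to be converted explicitly. Below is a small sketch, assuming that the bytes hold UTF-8 encoded text (the `'utf-8'` argument is an assumption for this example; use whatever encoding the data was actually written with):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#############\n",
"# Python 3\n",
"#############\n",
"\n",
"# bytes -> str via decode (encoding assumed to be UTF-8)\n",
">>> 'string ' + b'bytes for data'.decode('utf-8')\n",
"'string bytes for data'\n",
"\n",
"# str -> bytes via encode (encoding assumed to be UTF-8)\n",
">>> 'string '.encode('utf-8') + b'bytes for data'\n",
"b'string bytes for data'"
],
"language": "python",
"metadata": {},
"outputs": []
},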
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The print statement\n",
"Very trivial, but this change makes sense, Python 3 now only accepts `print`s with proper parentheses - just like the other function calls ..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
">>> print 'Hello, World!'\n",
"Hello, World!\n",
">>> print('Hello, World!')\n",
"Hello, World!\n",
"\n",
"# Python 3\n",
">>> print('Hello, World!')\n",
"Hello, World!\n",
">>> print 'Hello, World!'\n",
" File \"<stdin>\", line 1\n",
" print 'Hello, World!'\n",
" ^\n",
"SyntaxError: invalid syntax"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And if we want to print the output of 2 consecutive print functions on the same line, you would use a comma in Python 2, and a `end=\"\"` in Python 3:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
">>> print \"line 1\", ; print 'same line'\n",
"line 1 same line\n",
"\n",
"# Python 3\n",
">>> print(\"line 1\", end=\"\") ; print (\" same line\")\n",
"line 1 same line"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Integer division\n",
"This is a pretty dangerous thing if you are porting code, or executing Python 3 code in Python 2 since the change in integer-division behavior can often go unnoticed. \n",
"So, I still tend to use a `float(3/2)` or `3/2.0` instead of a `3/2` in my Python 3 scripts to save the Python 2 guys some trouble ... (PS: and vice versa, you can `from __future__ import division` in your Python 2 scripts)."
"`xrange()` was pretty popular in Python 2.x if you wanted to create an iterable object. The behavior was quite similar to a generator ('lazy evaluation'), but you could iterate over it infinitely. The advantage was that it was generally faster than `range()` (e.g., in a for-loop) - not if you had to iterate over the list multiple times, since the generation happens every time from scratch! \n",
"In Python 3, the `range()` was implemented like the `xrange()` function so that a dedicated `xrange()` function does not exist anymore."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
"> python -m timeit 'for i in range(1000000):' ' pass'\n",
"10 loops, best of 3: 66 msec per loop\n",
"\n",
 python -m timeit">
"> python -m timeit 'for i in xrange(1000000):' ' pass'\n",
"10 loops, best of 3: 27.8 msec per loop\n",
"\n",
"# Python 3\n",
"> python3 -m timeit 'for i in range(1000000):' ' pass'\n",
"10 loops, best of 3: 51.1 msec per loop\n",
"\n",
"> python3 -m timeit 'for i in xrange(1000000):' ' pass'\n",
"Traceback (most recent call last):\n",
" File \"/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/timeit.py\", line 292, in main\n",
" x = t.timeit(number)\n",
" File \"/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/timeit.py\", line 178, in timeit\n",
" timing = self.inner(it, self.timer)\n",
" File \"<timeit-src>\", line 6, in inner\n",
" for i in xrange(1000000):\n",
"NameError: name 'xrange' is not defined"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Raising exceptions\n",
"\n",
"Where Python 2 accepts both notations, the 'old' and the 'new' way, Python 3 chokes (and raises a `SyntaxError` in turn) if we don't enclose the exception argument in parentheses:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
">>> raise IOError, \"file error\"\n",
"Traceback (most recent call last):\n",
" File \"<stdin>\", line 1, in <module>\n",
"IOError: file error\n",
">>> raise IOError(\"file error\")\n",
"Traceback (most recent call last):\n",
" File \"<stdin>\", line 1, in <module>\n",
"IOError: file error\n",
"\n",
" \n",
"# Python 3 \n",
">>> raise IOError, \"file error\"\n",
" File \"<stdin>\", line 1\n",
" raise IOError, \"file error\"\n",
" ^\n",
"SyntaxError: invalid syntax\n",
">>> raise IOError(\"file error\")\n",
"Traceback (most recent call last):\n",
" File \"<stdin>\", line 1, in <module>\n",
"OSError: file error"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Handling exceptions\n",
"\n",
"Also the handling of excecptions has slightly changed in Python 3. Now, we have to use the `as` keyword!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
">>> try:\n",
"... blabla\n",
"... except NameError, err:\n",
"... print err, '--> our error msg'\n",
"... \n",
"name 'blabla' is not defined --> our error msg\n",
"\n",
"# Python 3\n",
">>> try:\n",
"... blabla\n",
"... except NameError as err:\n",
"... print(err, '--> our error msg')\n",
"... \n",
"name 'blabla' is not defined --> our error msg"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The `next()` function and `.next()` method\n",
"\n",
"Where you can use both function and method in Python 2.7.5, the `next()` function is all that remain in Python 3!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Python 2\n",
">>> my_generator = (letter for letter in 'abcdefg')\n",
">>> my_generator.next()\n",
"'a'\n",
">>> next(my_generator)\n",
"'b'\n",
"\n",
"# Python 3\n",
">>> my_generator = (letter for letter in 'abcdefg')\n",
">>> next(my_generator)\n",
"'a'\n",
">>> my_generator.next()\n",
"Traceback (most recent call last):\n",
" File \"<stdin>\", line 1, in <module>\n",
"AttributeError: 'generator' object has no attribute 'next'"