python_reference/not_so_obvious_python_stuff.ipynb
2014-04-17 15:39:55 -04:00


{
"metadata": {
"name": "",
"signature": "sha256:02c6c63beb1de9373d69615a4ba37640a7b01c8f2d088dbfaa84bdaf3452f1c5"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sebastian Raschka \n",
"last updated: 04/15/2014\n",
"\n",
"[Link to this IPython Notebook on GitHub](https://github.com/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb)\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### All code was executed in Python 3.4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A collection of not so obvious Python stuff you should know!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>\n",
"I am really looking forward to your comments and suggestions to improve and extend this little collection! Just send me a quick note \n",
"via Twitter: [@rasbt](https://twitter.com/rasbt) \n",
"or Email: [bluewoodtree@gmail.com](mailto:bluewoodtree@gmail.com)\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sections\n",
"- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
"- [The behavior of `+=` for lists](#pm_in_lists)\n",
"- [`True` and `False` in the datetime module](#datetime_module)\n",
"- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
"- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
"- [Picking `True` values from `and` and `or` expressions](#false_true_expressions)\n",
"- [Don't use mutable objects as default arguments for functions!](#def_mutable_func)\n",
"- [Be aware of the consuming generator](#consuming_generators)\n",
"- [`bool` is a subclass of `int`](#bool_int)\n",
"- [About lambda and closures-in-a-loop pitfall](#lambda_closure)\n",
"- [Python's LEGB scope resolution and the keywords `global` and `nonlocal`](#python_legb)\n",
"- [When mutable contents of immutable tuples aren't so mutable](#immutable_tuple)\n",
"- [List comprehensions are fast, but generators are faster!?](#list_generator)\n",
"- [Public vs. private class methods and name mangling](#private_class)\n",
"- [The consequences of modifying a list when looping through it](#looping_pitfall)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='c3_class_res'></a>\n",
"## The C3 class resolution algorithm for multiple class inheritance"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With multiple inheritance, Python uses the C3 linearization algorithm to determine the order in which parent classes are searched (the method resolution order, MRO). For a class defined as `class C(A, B)`, this means that class `A` is checked before class `B`.\n",
"\n",
"If you want to learn more, please read the [original blog post](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) by Guido van Rossum.\n",
"\n",
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class A(object):\n",
"    def foo(self):\n",
"        print(\"class A\")\n",
"\n",
"class B(object):\n",
"    def foo(self):\n",
"        print(\"class B\")\n",
"\n",
"class C(A, B):\n",
"    pass\n",
"\n",
"C().foo()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"class A\n"
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So what actually happened above was that class `C` was looking in the parent class `A` for the method `.foo()` first (and found it)!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I received an email with a nice suggestion using a more nested example to illustrate Guido van Rossum's point a little bit better:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class A(object):\n",
"    def foo(self):\n",
"        print(\"class A\")\n",
"\n",
"class B(A):\n",
"    pass\n",
"\n",
"class C(A):\n",
"    def foo(self):\n",
"        print(\"class C\")\n",
"\n",
"class D(B,C):\n",
"    pass\n",
"\n",
"D().foo()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"class C\n"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, class `D` searches `B` first. `B` inherits from `A`, but since `C` also inherits from `A` (and has its own `.foo()` method), `A` is only checked after all of its subclasses, so that we come up with the search order: `D, B, C, A`"
]
},
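{
"cell_type": "markdown",
"metadata": {},
"source": [
"To double-check the search order, we can inspect the `__mro__` attribute that every class carries. The following is a minimal sketch that re-declares the classes from the example above so that it is self-contained; the variable name `mro_names` is only illustrative:\n",
"\n",
"```python\n",
"class A(object):\n",
"    def foo(self):\n",
"        return 'class A'\n",
"\n",
"class B(A):\n",
"    pass\n",
"\n",
"class C(A):\n",
"    def foo(self):\n",
"        return 'class C'\n",
"\n",
"class D(B, C):\n",
"    pass\n",
"\n",
"# __mro__ holds the linearized search order as a tuple of classes\n",
"mro_names = [cls.__name__ for cls in D.__mro__]\n",
"print(mro_names)  # ['D', 'B', 'C', 'A', 'object']\n",
"```\n",
"\n",
"Note that `A` appears only after both of its subclasses `B` and `C`, which is why `D().foo()` finds the method of `C`."
]
},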
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='pm_in_lists'></a>\n",
"## The behavior of `+=` for lists"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we use the `+=` operator on a list, we extend the list by modifying the object in place. However, if we use the assignment `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n",
"\n",
"(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"list_a = []\n",
"print('ID of list_a', id(list_a))\n",
"list_a += [1]\n",
"print('ID of list_a after `+= [1]`', id(list_a))\n",
"list_a = list_a + [2]\n",
"print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"ID of list_a 4356439144\n",
"ID of list_a after `+= [1]` 4356439144\n",
"ID of list_a after `list_a = list_a + [2]` 4356446112\n"
]
}
],
"prompt_number": 3
},
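{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference matters as soon as a second name refers to the same list. The sketch below uses an illustrative variable called `alias` to make the effect visible without comparing `id()` values:\n",
"\n",
"```python\n",
"list_a = [1]\n",
"alias = list_a          # second name for the same list object\n",
"\n",
"list_a += [2]           # in place: the alias sees the change\n",
"print(alias)            # [1, 2]\n",
"\n",
"list_a = list_a + [3]   # new object: the alias still refers to the old list\n",
"print(alias)            # [1, 2]\n",
"print(list_a)           # [1, 2, 3]\n",
"print(alias is list_a)  # False\n",
"```"
]
},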
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='datetime_module'></a>\n",
"## `True` and `False` in the datetime module\n",
"\n",
"\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
"unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
"A long discussion on the python-ideas mailing list shows that, while surprising,\n",
"that behavior is desirable\u2014at least in some quarters.\"\n",
"\n",
"(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import datetime\n",
"\n",
"print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
"\n",
"print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
"\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
]
}
],
"prompt_number": 4
},
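{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that this behavior was removed in Python 3.5, where midnight is no longer considered false. If a function takes an optional time value, one way to stay independent of the Python version is to compare against `None` explicitly instead of relying on the truth value. The helper `describe_time` below is a hypothetical example, not part of the `datetime` module:\n",
"\n",
"```python\n",
"import datetime\n",
"\n",
"def describe_time(t=None):\n",
"    # compare with None explicitly instead of writing `if not t`\n",
"    if t is None:\n",
"        return 'no time given'\n",
"    return t.strftime('%H:%M')\n",
"\n",
"print(describe_time())                        # no time given\n",
"print(describe_time(datetime.time(0, 0, 0)))  # 00:00\n",
"```"
]
},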
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='python_small_int'></a>\n",
"## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
"\n",
"This oddity occurs because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n",
"\n",
"\n",
"(*I received a comment that this is in fact a CPython artefact and is not necessarily true in all implementations of Python!*)\n",
"\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = 1\n",
"b = 1\n",
"print('a is b', bool(a is b))\n",
"\n",
"a = 999\n",
"b = 999\n",
"print('a is b', bool(a is b))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"a is b True\n",
"a is b False\n"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Another popular example to illustrate the reuse of objects for small integers is:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print(256 is 257 - 1)\n",
"print(257 is 258 - 1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"True\n",
"False\n"
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### And to illustrate the test for equality (`==`) vs. identity (`is`):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = 'hello world!'\n",
"b = 'hello world!'\n",
"print('a is b,', a is b)\n",
"print('a == b,', a == b)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"a is b, False\n",
"a == b, True\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### And this example shows that even identity (`is`) does not necessarily imply equality (`==`):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = float('nan')\n",
"print('a == a,', a == a)\n",
"print('a is a,', a is a)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"a == a, False\n",
"a is a, True\n"
]
}
],
"prompt_number": 7
},
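{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since a NaN value never compares equal to anything, including itself, a test via `==` cannot be used to detect it. A small sketch of the usual alternative, `math.isnan` from the standard library (the names `values` and `nan_count` are only illustrative):\n",
"\n",
"```python\n",
"import math\n",
"\n",
"a = float('nan')\n",
"print(a != a)         # True\n",
"print(math.isnan(a))  # True\n",
"\n",
"values = [1.0, a, 2.5]\n",
"# count the NaN entries without relying on ==\n",
"nan_count = sum(1 for v in values if math.isnan(v))\n",
"print(nan_count)      # 1\n",
"```"
]
},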
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='shallow_vs_deep'></a>\n",
"## Shallow vs. deep copies if list contains other structures and objects\n",
"\n",
"Modifying a nested (compound) object through the original list also affects shallow copies, which share the nested objects, but not deep copies."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from copy import deepcopy\n",
"\n",
"my_first_list = [[1],[2]]\n",
"my_second_list = [[1],[2]]\n",
"print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
"print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
"\n",
"my_third_list = my_first_list\n",
"print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
"print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
"\n",
"my_shallow_copy = my_first_list[:]\n",
"print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
"print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
"\n",
"my_deep_copy = deepcopy(my_first_list)\n",
"print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
"print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
"\n",
"print('\\nmy_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)\n",
"\n",
"my_first_list[0][0] = 2\n",
"print('after setting \"my_first_list[0][0] = 2\"')\n",
"print('my_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"my_first_list == my_second_list: True\n",
"my_first_list is my_second_list: False\n",
"my_first_list == my_third_list: True\n",
"my_first_list is my_third_list: True\n",
"my_first_list == my_shallow_copy: True\n",
"my_first_list is my_shallow_copy: False\n",
"my_first_list == my_deep_copy: True\n",
"my_first_list is my_deep_copy: False\n",
"\n",
"my_third_list: [[1], [2]]\n",
"my_shallow_copy: [[1], [2]]\n",
"my_deep_copy: [[1], [2]]\n",
"after setting \"my_first_list[0][0] = 2\"\n",
"my_third_list: [[2], [2]]\n",
"my_shallow_copy: [[2], [2]]\n",
"my_deep_copy: [[1], [2]]\n"
]
}
],
"prompt_number": 7
},
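{
"cell_type": "markdown",
"metadata": {},
"source": [
"It may help to separate the two kinds of modification: a shallow copy has its own outer list, but shares the nested lists with the original. The following sketch uses `copy.copy` (which should behave like the `[:]` slice for lists); the variable names are illustrative:\n",
"\n",
"```python\n",
"import copy\n",
"\n",
"original = [[1], [2]]\n",
"shallow = copy.copy(original)\n",
"deep = copy.deepcopy(original)\n",
"\n",
"original.append([3])     # changes only the outer list of the original\n",
"original[0].append(99)   # changes a nested list that the shallow copy shares\n",
"\n",
"print(original)  # [[1, 99], [2], [3]]\n",
"print(shallow)   # [[1, 99], [2]]\n",
"print(deep)      # [[1], [2]]\n",
"```"
]
},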
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='false_true_expressions'></a>\n",
"## Picking `True` values from `and` and `or` expressions\n",
"\n",
"If both operands of an `or` expression are true, Python returns the first one; in an `and` expression, it returns the second one.\n",
"\n",
"Or - as a reader suggested - picture it as  \n",
"`a or b` is equivalent to `a if a else b`  \n",
"`a and b` is equivalent to `b if a else a`  \n",
"\n",
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"result = (2 or 3) * (5 and 7)\n",
"print('2 * 7 =', result)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"2 * 7 = 14\n"
]
}
],
"prompt_number": 9
},
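{
"cell_type": "markdown",
"metadata": {},
"source": [
"A common use of this behavior is the `value or default` idiom, which comes with a caveat: every falsy value is replaced, not only `None`. The function `with_default` below is a hypothetical helper to illustrate this:\n",
"\n",
"```python\n",
"def with_default(value, default=10):\n",
"    # caveat: any falsy value (0, '', [], None) is replaced by the default\n",
"    return value or default\n",
"\n",
"print(with_default(5))     # 5\n",
"print(with_default(None))  # 10\n",
"print(with_default(0))     # 10, although 0 may have been intended\n",
"\n",
"# `and` stops at the first falsy operand and returns it\n",
"print(1 and 0 and 2)       # 0\n",
"```"
]
},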
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='def_mutable_func'></a>\n",
"## Don't use mutable objects as default arguments for functions!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time we call the function without providing an argument for the default parameter, but this is not the case: Python creates the mutable object (the default value) only once, when the function is defined, and not each time it is called. See the following code:\n",
"\n",
"(Original source: [http://docs.python-guide.org/en/latest/writing/gotchas/](http://docs.python-guide.org/en/latest/writing/gotchas/))"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def append_to_list(value, def_list=[]):\n",
"    def_list.append(value)\n",
"    return def_list\n",
"\n",
"my_list = append_to_list(1)\n",
"print(my_list)\n",
"\n",
"my_other_list = append_to_list(2)\n",
"print(my_other_list)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[1]\n",
"[1, 2]\n"
]
}
],
"prompt_number": 1
},
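{
"cell_type": "markdown",
"metadata": {},
"source": [
"A widely used way around this is to use `None` as the default and to create the list inside the function. The sketch below rewrites the function from above under that assumption:\n",
"\n",
"```python\n",
"def append_to_list(value, def_list=None):\n",
"    # create a fresh list on every call unless the caller passes one in\n",
"    if def_list is None:\n",
"        def_list = []\n",
"    def_list.append(value)\n",
"    return def_list\n",
"\n",
"print(append_to_list(1))  # [1]\n",
"print(append_to_list(2))  # [2]\n",
"```"
]
},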
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='consuming_generators'></a>\n",
"\n",
"## Be aware of the consuming generator\n",
"\n",
"Be careful when using `in` checks with generators, since they won't start over from the beginning once a position is \"consumed\"."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"gen = (i for i in range(5))\n",
"print('2 in gen,', 2 in gen)\n",
"print('3 in gen,', 3 in gen)\n",
"print('1 in gen,', 1 in gen) "
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"2 in gen, True\n",
"3 in gen, True\n",
"1 in gen, False\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**We can circumvent this problem by using a simple list, though:**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"l = [i for i in range(5)]\n",
"print('2 in l,', 2 in l)\n",
"print('3 in l,', 3 in l)\n",
"print('1 in l,', 1 in l) "
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"2 in l, True\n",
"3 in l, True\n",
"1 in l, True\n"
]
}
],
"prompt_number": 10
},
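{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what the `in` check actually consumed, we can look at what is left in the generator afterwards. A minimal sketch (the names `found` and `remaining` are illustrative):\n",
"\n",
"```python\n",
"gen = (i for i in range(5))\n",
"\n",
"found = 2 in gen       # consumes 0, 1 and 2 while searching\n",
"remaining = list(gen)  # only the untouched items are left\n",
"\n",
"print(found)      # True\n",
"print(remaining)  # [3, 4]\n",
"```"
]
},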
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='bool_int'></a>\n",
"\n",
"## `bool` is a subclass of `int`\n",
"\n",
"Chicken or egg? Historically (up to Python 2.2, to be specific), truth values were represented by 1 and 0 (similar to old C). To avoid breaking old (but perfectly working) code, `bool` was added as a subclass of `int` in Python 2.3.\n",
"\n",
"Original source: [http://www.peterbe.com/plog/bool-is-int](http://www.peterbe.com/plog/bool-is-int)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print('isinstance(True, int):', isinstance(True, int))\n",
"print('True + True:', True + True)\n",
"print('3*True:', 3*True)\n",
"print('3*True - False:', 3*True - False)\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"isinstance(True, int): True\n",
"True + True: 2\n",
"3*True: 3\n",
"3*True - False: 3\n"
]
}
],
"prompt_number": 16
},
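{
"cell_type": "markdown",
"metadata": {},
"source": [
"One practical consequence is that booleans can be summed or used as indices. The following short sketch uses made-up data (`numbers`, `labels`) to show both:\n",
"\n",
"```python\n",
"numbers = [1, 5, 8, 12, 3]\n",
"\n",
"# each comparison yields True (1) or False (0), so sum() counts the matches\n",
"greater_than_four = sum(n > 4 for n in numbers)\n",
"print(greater_than_four)   # 3\n",
"\n",
"# False and True work as the indices 0 and 1\n",
"labels = ('odd', 'even')\n",
"print(labels[7 % 2 == 0])  # odd\n",
"```"
]
},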
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='lambda_closure'></a>\n",
"\n",
"## About lambda and closures-in-a-loop pitfall\n",
"\n",
"Remember the [\"consuming generators\"](#consuming_generators)? This example is somewhat related, but the result might still be unexpected.  \n",
"\n",
"(Original source: [http://openhome.cc/eGossip/Blog/UnderstandingLambdaClosure3.html](http://openhome.cc/eGossip/Blog/UnderstandingLambdaClosure3.html))\n",
"\n",
"In the first example below, where we create `lambda` functions in a list comprehension, the name `i` is looked up only when a `lambda` is called, not when it is created. Since the list comprehension has already finished by the time we `for-loop` through the list, `i` holds its last value, 4."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_list = [lambda: i for i in range(5)]\n",
"for l in my_list:\n",
"    print(l())"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"4\n",
"4\n",
"4\n",
"4\n",
"4\n"
]
}
],
"prompt_number": 11
},
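{
"cell_type": "markdown",
"metadata": {},
"source": [
"A generator expression seems to behave differently, but only because it is evaluated lazily: each `lambda` is called before the generator advances to the next value of `n`:"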
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_gen = (lambda: n for n in range(5))\n",
"for l in my_gen:\n",
"    print(l())"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"prompt_number": 9
},
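{
"cell_type": "markdown",
"metadata": {},
"source": [
"A common way to capture the current value is to bind it as a default argument of the `lambda`. The sketch below contrasts this with the late-binding version and also shows that the lambdas from a generator expression return the last value as well once the generator has been exhausted first (all variable names are illustrative):\n",
"\n",
"```python\n",
"# late binding: every lambda looks up i when it is called\n",
"late = [f() for f in [lambda: i for i in range(5)]]\n",
"print(late)       # [4, 4, 4, 4, 4]\n",
"\n",
"# a default argument is evaluated when the lambda is created\n",
"bound = [f() for f in [lambda i=i: i for i in range(5)]]\n",
"print(bound)      # [0, 1, 2, 3, 4]\n",
"\n",
"# exhausting the generator before calling the lambdas\n",
"gen_fns = list(lambda: n for n in range(5))\n",
"exhausted = [f() for f in gen_fns]\n",
"print(exhausted)  # [4, 4, 4, 4, 4]\n",
"```"
]
},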
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='python_legb'></a>\n",
"\n",
"## Python's LEGB scope resolution and the keywords `global` and `nonlocal`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is nothing particularly surprising about Python's LEGB scope resolution (Local -> Enclosed -> Global -> Built-in), but it is still useful to take a look at some examples!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `global` vs. `local`\n",
"\n",
"According to the LEGB rule, Python will first look for a variable in the local scope. So if we set the variable `x = 1` locally in the function's scope, it won't have an effect on the global `x`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = 0\n",
"def in_func():\n",
"    x = 1\n",
"    print('in_func:', x)\n",
"\n",
"in_func()\n",
"print('global:', x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"in_func: 1\n",
"global: 0\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we want to modify the global `x` from within a function, we can use the `global` keyword, which makes the name `x` inside the function refer to the global variable:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = 0\n",
"def in_func():\n",
"    global x\n",
"    x = 1\n",
"    print('in_func:', x)\n",
"\n",
"in_func()\n",
"print('global:', x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"in_func: 1\n",
"global: 1\n"
]
}
],
"prompt_number": 34
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `local` vs. `enclosed`\n",
"\n",
"Now, let us take a look at `local` vs. `enclosed`. Here, we set the variable `x = 1` in the `outer` function and set `x = 2` in the enclosed function `inner`. Since the assignment in `inner` creates a new local variable, it won't modify `outer`'s `x`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def outer():\n",
"    x = 1\n",
"    print('outer before:', x)\n",
"    def inner():\n",
"        x = 2\n",
"        print(\"inner:\", x)\n",
"    inner()\n",
"    print(\"outer after:\", x)\n",
"outer()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"outer before: 1\n",
"inner: 2\n",
"outer after: 1\n"
]
}
],
"prompt_number": 36
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is where the `nonlocal` keyword comes in handy - it allows us to modify the `x` variable in the `enclosed` scope:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def outer():\n",
"    x = 1\n",
"    print('outer before:', x)\n",
"    def inner():\n",
"        nonlocal x\n",
"        x = 2\n",
"        print(\"inner:\", x)\n",
"    inner()\n",
"    print(\"outer after:\", x)\n",
"outer()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"outer before: 1\n",
"inner: 2\n",
"outer after: 2\n"
]
}
],
"prompt_number": 35
},
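{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two more situations where the scope rules show up in practice: a closure that keeps state via `nonlocal`, and the `UnboundLocalError` that is raised when a function assigns to a name it also reads from an outer scope. The names `make_counter`, `increment`, `broken` and `error_name` are made up for this sketch:\n",
"\n",
"```python\n",
"def make_counter():\n",
"    count = 0\n",
"    def increment():\n",
"        nonlocal count  # rebind the variable of the enclosing scope\n",
"        count += 1\n",
"        return count\n",
"    return increment\n",
"\n",
"counter = make_counter()\n",
"print(counter())  # 1\n",
"print(counter())  # 2\n",
"\n",
"x = 0\n",
"def broken():\n",
"    # the assignment makes x local to this function,\n",
"    # so reading it before it has a local value fails\n",
"    x += 1\n",
"\n",
"error_name = None\n",
"try:\n",
"    broken()\n",
"except UnboundLocalError as err:\n",
"    error_name = type(err).__name__\n",
"print(error_name)  # UnboundLocalError\n",
"```"
]
},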
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='immutable_tuple'></a>\n",
"## When mutable contents of immutable tuples aren't so mutable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As we all know, tuples are immutable objects in Python, right!? But what happens if they contain mutable objects? \n",
"\n",
"First, let us have a look at the expected behavior: a `TypeError` is raised if we try to modify immutable types in a tuple: "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tup = (1,)\n",
"tup[0] += 1"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "TypeError",
"evalue": "'tuple' object does not support item assignment",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-41-c3bec6c3fe6f>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mtup\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mtup\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
]
}
],
"prompt_number": 41
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### But what if we put a mutable object into the immutable tuple? Well, modification works, but we **also** get a `TypeError` at the same time."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tup = ([],)\n",
"print('tup before: ', tup)\n",
"tup[0] += [1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"tup before: ([],)\n"
]
},
{
"ename": "TypeError",
"evalue": "'tuple' object does not support item assignment",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-42-aebe9a31dbeb>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mtup\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'tup before: '\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtup\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mtup\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
]
}
],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print('tup after: ', tup)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"tup after: ([1],)\n"
]
}
],
"prompt_number": 43
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"However, **there are ways** to modify the mutable contents of the tuple without raising the `TypeError`: the solution is the `.extend()` method, or alternatively `.append()` (for lists):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tup = ([],)\n",
"print('tup before: ', tup)\n",
"tup[0].extend([1])\n",
"print('tup after: ', tup)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"tup before: ([],)\n",
"tup after: ([1],)\n"
]
}
],
"prompt_number": 44
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tup = ([],)\n",
"print('tup before: ', tup)\n",
"tup[0].append(1)\n",
"print('tup after: ', tup)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"tup before: ([],)\n",
"tup after: ([1],)\n"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explanation\n",
"\n",
"**A. Jesse Jiryu Davis** has a nice explanation for this phenomenon (Original source: [http://emptysqua.re/blog/python-increment-is-weird-part-ii/](http://emptysqua.re/blog/python-increment-is-weird-part-ii/))\n",
"\n",
"If we try to extend the list via `+=` *\"then the statement executes `STORE_SUBSCR`, which calls the C function `PyObject_SetItem`, which checks if the object supports item assignment. In our case the object is a tuple, so `PyObject_SetItem` throws the `TypeError`. Mystery solved.\"*"
]
},
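{
"cell_type": "markdown",
"metadata": {},
"source": [
"In other words, `tup[0] += [1]` consists of two steps: an in-place extension of the list, followed by an assignment back into the tuple. The following sketch spells out these two steps by hand; it is only a rough Python-level approximation of what the interpreter does, and the name `item` is illustrative:\n",
"\n",
"```python\n",
"tup = ([],)\n",
"\n",
"item = tup[0]\n",
"item = item.__iadd__([1])  # step 1: the list is extended in place (works)\n",
"\n",
"try:\n",
"    tup[0] = item          # step 2: storing back into the tuple fails\n",
"except TypeError as err:\n",
"    print('TypeError:', err)\n",
"\n",
"print(tup)  # ([1],)\n",
"```\n",
"\n",
"Since the first step has already succeeded when the second one fails, the list is modified even though the statement as a whole raises an error."
]
},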
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### One more note about the `immutable` status of tuples. Tuples are famous for being immutable. So how come this code works?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_tup = (1,)\n",
"my_tup += (4,)\n",
"my_tup = my_tup + (5,)\n",
"print(my_tup)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"(1, 4, 5)\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What happens \"behind the curtains\" is that the tuple is not modified; instead, a new object is created every time, which inherits the old \"name tag\":"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_tup = (1,)\n",
"print(id(my_tup))\n",
"my_tup += (4,)\n",
"print(id(my_tup))\n",
"my_tup = my_tup + (5,)\n",
"print(id(my_tup))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"4337381840\n",
"4357415496\n",
"4357289952\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='list_generator'></a>\n",
"\n",
"## List comprehensions are fast, but generators are faster!?\n",
"\n",
"Not really (or at least not significantly, see the benchmarks below). So what's the reason to prefer one over the other?\n",
"- use lists if you want to use list methods \n",
"- use generators when you are dealing with huge collections to avoid memory issues"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def plainlist(n=100000):\n",
"    my_list = []\n",
"    for i in range(n):\n",
"        if i % 5 == 0:\n",
"            my_list.append(i)\n",
"    return my_list\n",
"\n",
"def listcompr(n=100000):\n",
"    my_list = [i for i in range(n) if i % 5 == 0]\n",
"    return my_list\n",
"\n",
"def generator(n=100000):\n",
"    my_gen = (i for i in range(n) if i % 5 == 0)\n",
"    return my_gen\n",
"\n",
"def generator_yield(n=100000):\n",
"    for i in range(n):\n",
"        if i % 5 == 0:\n",
"            yield i"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### To be fair to the list, let us exhaust the generators:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def test_plainlist(plain_list):\n",
"    for i in plain_list():\n",
"        pass\n",
"\n",
"def test_listcompr(listcompr):\n",
"    for i in listcompr():\n",
"        pass\n",
"\n",
"def test_generator(generator):\n",
"    for i in generator():\n",
"        pass\n",
"\n",
"def test_generator_yield(generator_yield):\n",
"    for i in generator_yield():\n",
"        pass\n",
"\n",
"print('plain_list: ', end = '')\n",
"%timeit test_plainlist(plainlist)\n",
"print('\\nlistcompr: ', end = '')\n",
"%timeit test_listcompr(listcompr)\n",
"print('\\ngenerator: ', end = '')\n",
"%timeit test_generator(generator)\n",
"print('\\ngenerator_yield: ', end = '')\n",
"%timeit test_generator_yield(generator_yield)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"plain_list: 10 loops, best of 3: 22.4 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"listcompr: 10 loops, best of 3: 20.8 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"generator: 10 loops, best of 3: 22 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"generator_yield: 10 loops, best of 3: 21.9 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 13
},
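{
"cell_type": "markdown",
"metadata": {},
"source": [
"While the run times are similar, the memory footprint is not. A rough way to see this is `sys.getsizeof`, keeping in mind that it only measures the container object itself and not the integers it refers to. The names `as_list`, `as_gen`, `list_bytes` and `gen_bytes` are illustrative, and the exact numbers depend on the platform and Python version, so none are shown here:\n",
"\n",
"```python\n",
"import sys\n",
"\n",
"n = 100000\n",
"as_list = [i for i in range(n) if i % 5 == 0]\n",
"as_gen = (i for i in range(n) if i % 5 == 0)\n",
"\n",
"# the list stores a reference per item, the generator only its current state\n",
"list_bytes = sys.getsizeof(as_list)\n",
"gen_bytes = sys.getsizeof(as_gen)\n",
"\n",
"print('list:', list_bytes, 'bytes')\n",
"print('generator:', gen_bytes, 'bytes')\n",
"print(gen_bytes < list_bytes)  # True\n",
"```"
]
},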
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='private_class'></a>\n",
"\n",
"## Public vs. private class methods and name mangling\n",
"\n",
"Who has not stumbled across the quote \"we are all consenting adults here\" in the Python community yet? Unlike in other languages such as C++ (sorry, there are many more, but that's the one I am most familiar with), we can't really protect class methods from being used outside the class.  \n",
"All we can do is to mark methods as private to make clear that they are better not used outside the class, but it is really up to the class user, since \"we are all consenting adults here\"!  \n",
"So, when we want to \"make\" class methods private, we just put a double underscore in front of their names (same with other class members), which invokes some name mangling if we want to access the private class member outside the class!  \n",
"This doesn't prevent class users from accessing this class member, though; they just have to know the trick and also know that it is at their own risk...\n",
"\n",
"Let the following example illustrate what I mean:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class my_class():\n",
"    def public_method(self):\n",
"        print('Hello public world!')\n",
"    def __private_method(self):\n",
"        print('Hello private world!')\n",
"    def call_private_method_in_class(self):\n",
"        self.__private_method()\n",
"\n",
"my_instance = my_class()\n",
"\n",
"my_instance.public_method()\n",
"my_instance._my_class__private_method()\n",
"my_instance.call_private_method_in_class()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Hello public world!\n",
"Hello private world!\n",
"Hello private world!\n"
]
}
],
"prompt_number": 28
},
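{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the mangling itself, we can try the unmangled name from outside the class and then look for the mangled name in `dir()`. This is a minimal sketch with a hypothetical class `MyClass`; the names `instance` and `mangled` are illustrative as well:\n",
"\n",
"```python\n",
"class MyClass:\n",
"    def __private_method(self):\n",
"        return 'Hello private world!'\n",
"\n",
"instance = MyClass()\n",
"\n",
"# outside the class, the unmangled name does not exist\n",
"try:\n",
"    instance.__private_method()\n",
"except AttributeError as err:\n",
"    print('AttributeError:', err)\n",
"\n",
"# the method is stored under _<class name>__<method name>\n",
"mangled = [name for name in dir(instance) if name.endswith('__private_method')]\n",
"print(mangled)  # ['_MyClass__private_method']\n",
"print(instance._MyClass__private_method())  # Hello private world!\n",
"```"
]
},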
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='looping_pitfall'></a>\n",
"## The consequences of modifying a list when looping through it"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It can be really dangerous to modify a list while iterating through it - it is a very common pitfall that can cause unintended behavior!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = [1, 2, 3, 4, 5]\n",
"for i in a:\n",
"    if not i % 2:\n",
"        a.remove(i)\n",
"print(a)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[1, 3, 5]\n"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = [2, 4, 5, 6]\n",
"for i in b:\n",
"    if not i % 2:\n",
"        b.remove(i)\n",
"print(b)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[4, 5]\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"**The explanation** is that we are iterating through the list index by index, and if we remove one of the items in between, we inevitably mess up the indexing. Look at the following example, and it will become clear:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = [2, 4, 5, 6]\n",
"for index, item in enumerate(b):\n",
"    print(index, item)\n",
"    if not item % 2:\n",
"        b.remove(item)\n",
"print(b)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0 2\n",
"1 5\n",
"2 6\n",
"[4, 5]\n"
]
}
],
"prompt_number": 7
}
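,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two common ways to avoid the problem are to build a new list instead of removing items, or to iterate over a copy while modifying the original. The following is a small sketch of both; the names `odd_items` and `c` are illustrative:\n",
"\n",
"```python\n",
"b = [2, 4, 5, 6]\n",
"\n",
"# option 1: build a new list that keeps only the wanted items\n",
"odd_items = [item for item in b if item % 2]\n",
"print(odd_items)  # [5]\n",
"\n",
"# option 2: loop over a shallow copy, so removals don't shift the iteration\n",
"c = [2, 4, 5, 6]\n",
"for item in c[:]:\n",
"    if not item % 2:\n",
"        c.remove(item)\n",
"print(c)  # [5]\n",
"```"
]
}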
],
"metadata": {}
}
]
}