mirror of
https://github.com/rasbt/python_reference.git
synced 2024-11-24 20:41:14 +00:00
1194 lines
34 KiB
Plaintext
{
|
|
"metadata": {
|
|
"name": "",
|
|
"signature": "sha256:29a120258e2d108ed5eace08e071ad866ae379b4f24fde804401ee858a2090fb"
|
|
},
|
|
"nbformat": 3,
|
|
"nbformat_minor": 0,
|
|
"worksheets": [
|
|
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Sebastian Raschka \n",
|
|
"last updated: 04/15/2014\n",
|
|
"\n",
|
|
"[Link to this IPython Notebook on GitHub](https://github.com/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"All code was executed in Python 3.4"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# A collection of not so obvious Python stuff you should know!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Sections\n",
|
|
"- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
|
|
"- [The behavior of += for lists](#pm_in_lists)\n",
|
|
"- [`True` and `False` in the datetime module](#datetime_module)\n",
|
|
"- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
|
|
"- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
|
|
"- [Picking True values from and and or expressions](#false_true_expressions)\n",
|
|
"- [Don't use mutable objects as default arguments for functions!](#def_mutable_func)\n",
|
|
"- [Be aware of the consuming generator](#consuming_generators)\n",
|
|
"- [`bool` is a subclass of `int`](#bool_int)\n",
|
|
"- [About lambda and closures-in-a-loop pitfall](#lambda_closure)\n",
|
|
"- [Python's LEGB scope resolution and the keywords `global` and `nonlocal`](#python_legb)\n",
|
|
"- [When mutable contents of immutable tuples aren't so mutable](#immutable_tuple)\n",
|
|
"- [List comprehensions are fast, but generators are faster!?](#list_generator)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='c3_class_res'></a>\n",
|
|
"## The C3 class resolution algorithm for multiple class inheritance"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n",
|
|
"\n",
|
|
"If you want to learn more, please read the [original blog post](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) by Guido van Rossum.\n",
|
|
"\n",
|
|
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"class A(object):\n",
|
|
"    def foo(self):\n",
|
|
"        print(\"class A\")\n",
|
|
"\n",
|
|
"class B(object):\n",
|
|
"    def foo(self):\n",
|
|
"        print(\"class B\")\n",
|
|
"\n",
|
|
"class C(A, B):\n",
|
|
"    pass\n",
|
|
"\n",
|
|
"C().foo()"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"class A\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 2
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='pm_in_lists'></a>\n",
|
|
"## The behavior of `+=` for lists"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assignment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n",
|
|
"\n",
|
|
"(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"list_a = []\n",
|
|
"print('ID of list_a', id(list_a))\n",
|
|
"list_a += [1]\n",
|
|
"print('ID of list_a after `+= [1]`', id(list_a))\n",
|
|
"list_a = list_a + [2]\n",
|
|
"print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"ID of list_a 4356439144\n",
|
|
"ID of list_a after `+= [1]` 4356439144\n",
|
|
"ID of list_a after `list_a = list_a + [2]` 4356446112\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 3
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='datetime_module'></a>\n",
|
|
"## `True` and `False` in the datetime module\n",
|
|
"\n",
|
|
"\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
|
|
"unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
|
|
"A long discussion on the python-ideas mailing list shows that, while surprising,\n",
|
|
"that behavior is desirable\u2014at least in some quarters.\"\n",
|
|
"\n",
|
|
"(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"import datetime\n",
|
|
"\n",
|
|
"print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
|
|
"\n",
|
|
"print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
|
|
"\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 4
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='python_small_int'></a>\n",
|
|
"## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
|
|
"\n",
|
|
"This oddity occurs because Python tends to store small integers as the same object, but not larger ones! \n",
|
|
"(*I received a comment that this is in fact a CPython artefact and need not be true in all implementations of Python!*)\n",
|
|
"\n",
|
|
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
|
|
"\n",
|
|
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = 1\n",
|
|
"b = 1\n",
|
|
"print('a is b', bool(a is b))\n",
|
|
"\n",
|
|
"a = 999\n",
|
|
"b = 999\n",
|
|
"print('a is b', bool(a is b))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a is b True\n",
|
|
"a is b False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 5
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Another popular example to illustrate the reuse of objects for small integers is:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"print(256 is 257 - 1)\n",
|
|
"print(257 is 258 - 1)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"True\n",
|
|
"False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 2
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### And to illustrate the test for equality (`==`) vs. identity (`is`):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = 'hello world!'\n",
|
|
"b = 'hello world!'\n",
|
|
"print('a is b,', a is b)\n",
|
|
"print('a == b,', a == b)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a is b, False\n",
|
|
"a == b, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 6
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### And this example shows that two names referring to the same object (`is`) do not necessarily compare equal (`==`):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = float('nan')\n",
|
|
"print('a == a,', a == a)\n",
|
|
"print('a is a,', a is a)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a == a, False\n",
|
|
"a is a, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 7
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='shallow_vs_deep'></a>\n",
|
|
"## Shallow vs. deep copies if list contains other structures and objects\n",
|
|
"\n",
|
|
"Modifying a nested object in the original list also affects shallow copies, but not deep copies, because a shallow copy shares the compound objects that the list contains."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"from copy import deepcopy\n",
|
|
"\n",
|
|
"my_first_list = [[1],[2]]\n",
|
|
"my_second_list = [[1],[2]]\n",
|
|
"print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
|
|
"print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
|
|
"\n",
|
|
"my_third_list = my_first_list\n",
|
|
"print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
|
|
"print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
|
|
"\n",
|
|
"my_shallow_copy = my_first_list[:]\n",
|
|
"print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
|
|
"print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
|
|
"\n",
|
|
"my_deep_copy = deepcopy(my_first_list)\n",
|
|
"print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
|
|
"print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
|
|
"\n",
|
|
"print('\\nmy_third_list:', my_third_list)\n",
|
|
"print('my_shallow_copy:', my_shallow_copy)\n",
|
|
"print('my_deep_copy:', my_deep_copy)\n",
|
|
"\n",
|
|
"my_first_list[0][0] = 2\n",
|
|
"print('after setting \"my_first_list[0][0] = 2\"')\n",
|
|
"print('my_third_list:', my_third_list)\n",
|
|
"print('my_shallow_copy:', my_shallow_copy)\n",
|
|
"print('my_deep_copy:', my_deep_copy)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"my_first_list == my_second_list: True\n",
|
|
"my_first_list is my_second_list: False\n",
|
|
"my_first_list == my_third_list: True\n",
|
|
"my_first_list is my_third_list: True\n",
|
|
"my_first_list == my_shallow_copy: True\n",
|
|
"my_first_list is my_shallow_copy: False\n",
|
|
"my_first_list == my_deep_copy: True\n",
|
|
"my_first_list is my_deep_copy: False\n",
|
|
"\n",
|
|
"my_third_list: [[1], [2]]\n",
|
|
"my_shallow_copy: [[1], [2]]\n",
|
|
"my_deep_copy: [[1], [2]]\n",
|
|
"after setting \"my_first_list[0][0] = 2\"\n",
|
|
"my_third_list: [[2], [2]]\n",
|
|
"my_shallow_copy: [[2], [2]]\n",
|
|
"my_deep_copy: [[1], [2]]\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 7
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='false_true_expressions'></a>\n",
|
|
"## Picking `True` values from `and` and `or` expressions\n",
|
|
"\n",
|
|
"If both values in an `or` expression are truthy, Python will select the first one; in an `and` expression, it will select the second one.\n",
|
|
"\n",
|
|
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"result = (2 or 3) * (5 and 7)\n",
|
|
"print('2 * 7 =', result)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 * 7 = 14\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 9
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"And a fun fact"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='def_mutable_func'></a>\n",
|
|
"## Don't use mutable objects as default arguments for functions!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time we call the function without providing an argument for the default parameter, but this is not the case: Python creates the mutable object (default parameter) only once, when the function is defined, and reuses it in every later call. See the following code:\n",
|
|
"\n",
|
|
"(Original source: [http://docs.python-guide.org/en/latest/writing/gotchas/](http://docs.python-guide.org/en/latest/writing/gotchas/))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"def append_to_list(value, def_list=[]):\n",
|
|
"    def_list.append(value)\n",
|
|
"    return def_list\n",
|
|
"\n",
|
|
"my_list = append_to_list(1)\n",
|
|
"print(my_list)\n",
|
|
"\n",
|
|
"my_other_list = append_to_list(2)\n",
|
|
"print(my_other_list)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"[1]\n",
|
|
"[1, 2]\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 1
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='consuming_generators'></a>\n",
|
|
"\n",
|
|
"## Be aware of the consuming generator\n",
|
|
"\n",
|
|
"Be careful with `in` checks on generators: a generator does not restart from the beginning, so items that were already \"consumed\" by an earlier check cannot be found again."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"gen = (i for i in range(5))\n",
|
|
"print('2 in gen,', 2 in gen)\n",
|
|
"print('3 in gen,', 3 in gen)\n",
|
|
"print('1 in gen,', 1 in gen) "
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 in gen, True\n",
|
|
"3 in gen, True\n",
|
|
"1 in gen, False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 9
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"**We can circumvent this problem by using a simple list, though:**"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"l = [i for i in range(5)]\n",
|
|
"print('2 in l,', 2 in l)\n",
|
|
"print('3 in l,', 3 in l)\n",
|
|
"print('1 in l,', 1 in l) "
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 in l, True\n",
|
|
"3 in l, True\n",
|
|
"1 in l, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 10
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='bool_int'></a>\n",
|
|
"\n",
|
|
"## `bool` is a subclass of `int`\n",
|
|
"\n",
|
|
"Chicken or egg? In the history of Python (Python 2.2 to be specific), truth values were implemented via 1 and 0 (similar to the old C). To avoid breaking old (but perfectly working) code, `bool` was added as a subclass of `int` in Python 2.3.\n",
|
|
"\n",
|
|
"Original source: [http://www.peterbe.com/plog/bool-is-int](http://www.peterbe.com/plog/bool-is-int)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"print('isinstance(True, int):', isinstance(True, int))\n",
|
|
"print('True + True:', True + True)\n",
|
|
"print('3*True:', 3*True)\n",
|
|
"print('3*True - False:', 3*True - False)\n"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"isinstance(True, int): True\n",
|
|
"True + True: 2\n",
|
|
"3*True: 3\n",
|
|
"3*True - False: 3\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 16
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='lambda_closure'></a>\n",
|
|
"\n",
|
|
"## About lambda and closures-in-a-loop pitfall\n",
|
|
"\n",
|
|
"The following example illustrates how every `lambda` ends up using the last value of the loop variable, because `i` is looked up only when the `lambda` is called:\n",
|
|
"\n",
|
|
"(Original source: [http://openhome.cc/eGossip/Blog/UnderstandingLambdaClosure3.html](http://openhome.cc/eGossip/Blog/UnderstandingLambdaClosure3.html))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_list = [lambda: i for i in range(5)]\n",
|
|
"for l in my_list:\n",
|
|
"    print(l())"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"4\n",
|
|
"4\n",
|
|
"4\n",
|
|
"4\n",
|
|
"4\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 24
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"**Here, a generator can save you some pain, since each `lambda` is called before the loop variable moves on:**"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_gen = (lambda: n for n in range(5))\n",
|
|
"for l in my_gen:\n",
|
|
"    print(l())"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"0\n",
|
|
"1\n",
|
|
"2\n",
|
|
"3\n",
|
|
"4\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 25
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='python_legb'></a>\n",
|
|
"\n",
|
|
"## Python's LEGB scope resolution and the keywords `global` and `nonlocal`"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"There is nothing particularly surprising about Python's LEGB scope resolution (Local -> Enclosed -> Global -> Built-in), but it is still useful to take a look at some examples!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### `global` vs. `local`\n",
|
|
"\n",
|
|
"According to the LEGB rule, Python will first look for a variable in the local scope. So if we set the variable `x = 1` locally in the function's scope, it won't have an effect on the `global` `x`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"x = 0\n",
|
|
"def in_func():\n",
|
|
"    x = 1\n",
|
|
"    print('in_func:', x)\n",
|
|
" \n",
|
|
"in_func()\n",
|
|
"print('global:', x)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"in_func: 1\n",
|
|
"global: 0\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 33
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"If we want to modify the `global` x via a function, we can simply use the `global` keyword to import the variable into the function's scope:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"x = 0\n",
|
|
"def in_func():\n",
|
|
"    global x\n",
|
|
"    x = 1\n",
|
|
"    print('in_func:', x)\n",
|
|
" \n",
|
|
"in_func()\n",
|
|
"print('global:', x)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"in_func: 1\n",
|
|
"global: 1\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 34
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### `local` vs. `enclosed`\n",
|
|
"\n",
|
|
"Now, let us take a look at `local` vs. `enclosed`. Here, we set the variable `x = 1` in the `outer` function and set `x = 2` in the enclosed function `inner`. Since `inner` looks in the local scope first, it won't modify `outer`'s `x`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"def outer():\n",
|
|
"    x = 1\n",
|
|
"    print('outer before:', x)\n",
|
|
"    def inner():\n",
|
|
"        x = 2\n",
|
|
"        print(\"inner:\", x)\n",
|
|
"    inner()\n",
|
|
"    print(\"outer after:\", x)\n",
|
|
"outer()"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"outer before: 1\n",
|
|
"inner: 2\n",
|
|
"outer after: 1\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 36
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Here is where the `nonlocal` keyword comes in handy - it allows us to modify the `x` variable in the `enclosed` scope:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"def outer():\n",
|
|
"    x = 1\n",
|
|
"    print('outer before:', x)\n",
|
|
"    def inner():\n",
|
|
"        nonlocal x\n",
|
|
"        x = 2\n",
|
|
"        print(\"inner:\", x)\n",
|
|
"    inner()\n",
|
|
"    print(\"outer after:\", x)\n",
|
|
"outer()"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"outer before: 1\n",
|
|
"inner: 2\n",
|
|
"outer after: 2\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 35
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='immutable_tuple'></a>\n",
|
|
"## When mutable contents of immutable tuples aren't so mutable"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"As we all know, tuples are immutable objects in Python, right!? But what happens if they contain mutable objects? \n",
|
|
"\n",
|
|
"First, let us have a look at the expected behavior: a `TypeError` is raised if we try to modify immutable types in a tuple: "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"tup = (1,)\n",
|
|
"tup[0] += 1"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "TypeError",
|
|
"evalue": "'tuple' object does not support item assignment",
|
|
"output_type": "pyerr",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
|
|
"\u001b[0;32m<ipython-input-41-c3bec6c3fe6f>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mtup\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mtup\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
|
"\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 41
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### But what if we put a mutable object into the immutable tuple? Well, modification works, but we **also** get a `TypeError` at the same time."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"tup = ([],)\n",
|
|
"print('tup before: ', tup)\n",
|
|
"tup[0] += [1]"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"tup before: ([],)\n"
|
|
]
|
|
},
|
|
{
|
|
"ename": "TypeError",
|
|
"evalue": "'tuple' object does not support item assignment",
|
|
"output_type": "pyerr",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
|
|
"\u001b[0;32m<ipython-input-42-aebe9a31dbeb>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mtup\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'tup before: '\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtup\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mtup\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
|
"\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 42
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"print('tup after: ', tup)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"tup after: ([1],)\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 43
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"However, **there are ways** to modify the mutable contents of the tuple without raising the `TypeError`: the solution is the `.extend()` method, or alternatively `.append()` (for lists):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"tup = ([],)\n",
|
|
"print('tup before: ', tup)\n",
|
|
"tup[0].extend([1])\n",
|
|
"print('tup after: ', tup)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"tup before: ([],)\n",
|
|
"tup after: ([1],)\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 44
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"tup = ([],)\n",
|
|
"print('tup before: ', tup)\n",
|
|
"tup[0].append(1)\n",
|
|
"print('tup after: ', tup)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"tup before: ([],)\n",
|
|
"tup after: ([1],)\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 5
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Explanation\n",
|
|
"\n",
|
|
"**A. Jesse Jiryu Davis** has a nice explanation for this phenomenon (Original source: [http://emptysqua.re/blog/python-increment-is-weird-part-ii/](http://emptysqua.re/blog/python-increment-is-weird-part-ii/))\n",
|
|
"\n",
|
|
"If we try to extend the list via `+=` *\"then the statement executes STORE_SUBSCR, which calls the C function PyObject_SetItem, which checks if the object supports item assignment. In our case the object is a tuple, so PyObject_SetItem throws the TypeError. Mystery solved.\"*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### One more note about the `immutable` status of tuples. Tuples are famous for being immutable. So how come this code works?"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_tup = (1,)\n",
|
|
"my_tup += (4,)\n",
|
|
"my_tup = my_tup + (5,)\n",
|
|
"print(my_tup)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"(1, 4, 5)\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 6
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"What happens \"behind the curtains\" is that the tuple is not modified; instead, a new object is created every time, and it inherits the old \"name tag\":"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_tup = (1,)\n",
|
|
"print(id(my_tup))\n",
|
|
"my_tup += (4,)\n",
|
|
"print(id(my_tup))\n",
|
|
"my_tup = my_tup + (5,)\n",
|
|
"print(id(my_tup))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"4337381840\n",
|
|
"4357415496\n",
|
|
"4357289952\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 8
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='list_generator'></a>\n",
|
|
"\n",
|
|
"## List comprehensions are fast, but generators are faster!?\n",
|
|
"\n",
|
|
"Not really (or at least not significantly; see the benchmarks below). So what's the reason to prefer one over the other?\n",
|
|
"- use lists if you want to use list methods \n",
|
|
"- use generators when you are dealing with huge collections to avoid memory issues"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"import timeit\n",
|
|
"\n",
|
|
"def plainlist(n=100000):\n",
|
|
"    my_list = []\n",
|
|
"    for i in range(n):\n",
|
|
"        if i % 5 == 0:\n",
|
|
"            my_list.append(i)\n",
|
|
"    return my_list\n",
|
|
"\n",
|
|
"def listcompr(n=100000):\n",
|
|
"    my_list = [i for i in range(n) if i % 5 == 0]\n",
|
|
"    return my_list\n",
|
|
"\n",
|
|
"def generator(n=100000):\n",
|
|
"    my_gen = (i for i in range(n) if i % 5 == 0)\n",
|
|
"    return my_gen\n",
|
|
"\n",
|
|
"def generator_yield(n=100000):\n",
|
|
"    for i in range(n):\n",
|
|
"        if i % 5 == 0:\n",
|
|
"            yield i"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"prompt_number": 75
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### To be fair to the list, let us exhaust the generators:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"def test_plainlist(plain_list):\n",
|
|
"    for i in plain_list:\n",
|
|
"        print(i)\n",
|
|
"\n",
|
|
"def test_listcompr(listcompr):\n",
|
|
"    for i in listcompr:\n",
|
|
"        print(i)\n",
|
|
"\n",
|
|
"def test_generator(generator):\n",
|
|
"    for i in generator:\n",
|
|
"        print(i)\n",
|
|
"\n",
|
|
"def test_generator_yield(generator_yield):\n",
|
|
"    for i in generator_yield:\n",
|
|
"        print(i)\n",
|
|
"\n",
|
|
"print('plain_list: ', end = '')\n",
|
|
"%timeit test_plainlist\n",
|
|
"print('\\nlistcompr: ', end = '')\n",
|
|
"%timeit test_listcompr\n",
|
|
"print('\\ngenerator: ', end = '')\n",
|
|
"%timeit test_generator\n",
|
|
"print('\\ngenerator_yield: ', end = '')\n",
|
|
"%timeit test_generator_yield"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"plain_list: 10000000 loops, best of 3: 26.2 ns per loop"
|
|
]
|
|
},
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\n",
|
|
"\n",
|
|
"listcompr: 10000000 loops, best of 3: 26.1 ns per loop"
|
|
]
|
|
},
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\n",
|
|
"\n",
|
|
"generator: 10000000 loops, best of 3: 25.9 ns per loop"
|
|
]
|
|
},
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\n",
|
|
"\n",
|
|
"generator_yield: 10000000 loops, best of 3: 26 ns per loop"
|
|
]
|
|
},
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 76
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": []
|
|
}
|
|
],
|
|
"metadata": {}
|
|
}
|
|
]
|
|
} |