mirror of
https://github.com/rasbt/python_reference.git
synced 2024-11-24 04:21:15 +00:00
669 lines
18 KiB
Plaintext
{
|
|
"metadata": {
|
|
"name": "",
|
|
"signature": "sha256:faa74a34746bf250ef2d72e308074083ee5e60789203d70f630f8c67a709e6fe"
|
|
},
|
|
"nbformat": 3,
|
|
"nbformat_minor": 0,
|
|
"worksheets": [
|
|
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Sebastian Raschka \n",
|
|
"last updated: 04/15/2014\n",
|
|
"\n",
|
|
"[Link to this IPython Notebook on GitHub](https://github.com/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"All code was executed in Python 3.4"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# A collection of not so obvious Python stuff you should know!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Sections\n",
|
|
"- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
|
|
"- [The behavior of += for lists](#pm_in_lists)\n",
|
|
"- [`True` and `False` in the datetime module](#datetime_module)\n",
|
|
"- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
|
|
"- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
|
|
"- [Picking True values from and and or expressions](#false_true_expressions)\n",
|
|
"- [Don't use mutable objects as default arguments for functions!](#def_mutable_func)\n",
|
|
"- [Be aware of the consuming generator](#consuming_generator)\n",
|
|
"- [`bool` is a subclass of `int`](#bool_int)\n",
|
|
"- [About lambda and closures-in-a-loop pitfall](#lambda_closure)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='c3_class_res'></a>\n",
|
|
"## The C3 class resolution algorithm for multiple class inheritance"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"With multiple inheritance, Python resolves attribute lookups using the C3 linearization algorithm (the method resolution order, MRO). For `class C(A, B)` in the example below, this means that class A is checked before class B.\n",
|
|
"\n",
"If you want to learn more, please read the [original blog post](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) by Guido van Rossum.\n",
|
|
"\n",
|
|
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"class A(object):\n",
|
|
" def foo(self):\n",
|
|
" print(\"class A\")\n",
|
|
"\n",
|
|
"class B(object):\n",
|
|
" def foo(self):\n",
|
|
" print(\"class B\")\n",
|
|
"\n",
|
|
"class C(A, B):\n",
|
|
" pass\n",
|
|
"\n",
|
|
"C().foo()"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"class A\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 2
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='pm_in_lists'></a>\n",
|
|
"## The behavior of `+=` for lists"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"If we use the `+=` operator on a list, we extend the list by modifying the object in place. However, if we use the assignment `my_list = my_list + ...`, we create a new list object, which the following code demonstrates:\n",
|
|
"\n",
|
|
"(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"list_a = []\n",
|
|
"print('ID of list_a', id(list_a))\n",
|
|
"list_a += [1]\n",
|
|
"print('ID of list_a after `+= [1]`', id(list_a))\n",
|
|
"list_a = list_a + [2]\n",
|
|
"print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"ID of list_a 4356439144\n",
|
|
"ID of list_a after `+= [1]` 4356439144\n",
|
|
"ID of list_a after `list_a = list_a + [2]` 4356446112\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 3
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='datetime_module'></a>\n",
|
|
"## `True` and `False` in the datetime module\n",
|
|
"\n",
|
|
"\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
|
|
"unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
|
|
"A long discussion on the python-ideas mailing list shows that, while surprising,\n",
|
|
"that behavior is desirable\u2014at least in some quarters.\"\n",
|
|
"\n",
|
|
"(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"import datetime\n",
|
|
"\n",
|
|
"print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
|
|
"\n",
|
|
"print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
|
|
"\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 4
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='python_small_int'></a>\n",
|
|
"## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
|
|
"\n",
"This oddity occurs because Python tends to store small integers as the same object, but not larger ones! \n",
"(*I received a comment that this is in fact a CPython artifact and need not be true in all implementations of Python!*)\n",
|
|
"\n",
"So the take-home message is: always use \"==\" for equality, \"is\" for identity!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = 1\n",
|
|
"b = 1\n",
|
|
"print('a is b', bool(a is b))\n",
|
|
"\n",
|
|
"a = 999\n",
|
|
"b = 999\n",
|
|
"print('a is b', bool(a is b))"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a is b True\n",
|
|
"a is b False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 5
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Another popular example to illustrate the reuse of objects for small integers is:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"print(256 is 257 - 1)\n",
|
|
"print(257 is 258 - 1)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"True\n",
|
|
"False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 2
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### And to illustrate the test for equality (`==`) vs. identity (`is`):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = 'hello world!'\n",
|
|
"b = 'hello world!'\n",
|
|
"print('a is b,', a is b)\n",
|
|
"print('a == b,', a == b)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a is b, False\n",
|
|
"a == b, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 6
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"#### And this example shows that identity (`is`) does not necessarily imply equality (`==`):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"a = float('nan')\n",
|
|
"print('a == a,', a == a)\n",
|
|
"print('a is a,', a is a)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"a == a, False\n",
|
|
"a is a, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 7
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='shallow_vs_deep'></a>\n",
|
|
"## Shallow vs. deep copies if list contains other structures and objects\n",
|
|
"\n",
"Modifying a nested object in the original list also affects its shallow copies, but not its deep copies, because a shallow copy only copies the references to the contained objects."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"from copy import deepcopy\n",
|
|
"\n",
|
|
"my_first_list = [[1],[2]]\n",
|
|
"my_second_list = [[1],[2]]\n",
|
|
"print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
|
|
"print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
|
|
"\n",
|
|
"my_third_list = my_first_list\n",
|
|
"print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
|
|
"print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
|
|
"\n",
|
|
"my_shallow_copy = my_first_list[:]\n",
|
|
"print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
|
|
"print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
|
|
"\n",
|
|
"my_deep_copy = deepcopy(my_first_list)\n",
|
|
"print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
|
|
"print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
|
|
"\n",
|
|
"print('\\nmy_third_list:', my_third_list)\n",
|
|
"print('my_shallow_copy:', my_shallow_copy)\n",
|
|
"print('my_deep_copy:', my_deep_copy)\n",
|
|
"\n",
|
|
"my_first_list[0][0] = 2\n",
|
|
"print('after setting \"my_first_list[0][0] = 2\"')\n",
|
|
"print('my_third_list:', my_third_list)\n",
|
|
"print('my_shallow_copy:', my_shallow_copy)\n",
|
|
"print('my_deep_copy:', my_deep_copy)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"my_first_list == my_second_list: True\n",
|
|
"my_first_list is my_second_list: False\n",
|
|
"my_first_list == my_third_list: True\n",
|
|
"my_first_list is my_third_list: True\n",
|
|
"my_first_list == my_shallow_copy: True\n",
|
|
"my_first_list is my_shallow_copy: False\n",
|
|
"my_first_list == my_deep_copy: True\n",
|
|
"my_first_list is my_deep_copy: False\n",
|
|
"\n",
|
|
"my_third_list: [[1], [2]]\n",
|
|
"my_shallow_copy: [[1], [2]]\n",
|
|
"my_deep_copy: [[1], [2]]\n",
|
|
"after setting \"my_first_list[0][0] = 2\"\n",
|
|
"my_third_list: [[2], [2]]\n",
|
|
"my_shallow_copy: [[2], [2]]\n",
|
|
"my_deep_copy: [[1], [2]]\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 7
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='false_true_expressions'></a>\n",
|
|
"## Picking `True` values from `and` and `or` expressions\n",
|
|
"\n",
"If both operands of an `or` expression are truthy, Python returns the first one; in an `and` expression, it returns the second one.\n",
|
|
"\n",
|
|
"(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"result = (2 or 3) * (5 and 7)\n",
|
|
"print('2 * 7 =', result)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 * 7 = 14\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 9
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"And a fun fact"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='def_mutable_func'></a>\n",
|
|
"## Don't use mutable objects as default arguments for functions!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time we call the function without providing an argument for the default parameter, but this is not the case: Python creates the mutable object (default parameter) only once, when the function is defined, and reuses it on every call, as the following code shows:\n",
|
|
"\n",
"(Original source: [http://docs.python-guide.org/en/latest/writing/gotchas/](http://docs.python-guide.org/en/latest/writing/gotchas/))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"def append_to_list(value, def_list=[]):\n",
|
|
" def_list.append(value)\n",
|
|
" return def_list\n",
|
|
"\n",
|
|
"my_list = append_to_list(1)\n",
|
|
"print(my_list)\n",
|
|
"\n",
|
|
"my_other_list = append_to_list(2)\n",
|
|
"print(my_other_list)"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"[1]\n",
|
|
"[1, 2]\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 1
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
"<a name='consuming_generator'></a>\n",
|
|
"\n",
|
|
"## Be aware of the consuming generator\n",
|
|
"\n",
"Be careful when using `in` checks with generators, since they won't start over from the beginning once a position is \"consumed\"."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"gen = (i for i in range(5))\n",
|
|
"print('2 in gen,', 2 in gen)\n",
|
|
"print('3 in gen,', 3 in gen)\n",
|
|
"print('1 in gen,', 1 in gen) "
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 in gen, True\n",
|
|
"3 in gen, True\n",
|
|
"1 in gen, False\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 9
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"**We can circumvent this problem by using a simple list, though:**"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"l = [i for i in range(5)]\n",
|
|
"print('2 in l,', 2 in l)\n",
|
|
"print('3 in l,', 3 in l)\n",
|
|
"print('1 in l,', 1 in l) "
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"2 in l, True\n",
|
|
"3 in l, True\n",
|
|
"1 in l, True\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 10
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='bool_int'></a>\n",
|
|
"\n",
|
|
"## `bool` is a subclass of `int`\n",
|
|
"\n",
"Chicken or egg? In the history of Python (Python 2.2 to be specific), truth values were implemented via 1 and 0 (similar to the old C). To avoid syntax errors in old (but perfectly working) code, `bool` was added as a subclass of `int` in Python 2.3.\n",
|
|
"\n",
"(Original source: [http://www.peterbe.com/plog/bool-is-int](http://www.peterbe.com/plog/bool-is-int))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"print('isinstance(True, int):', isinstance(True, int))\n",
|
|
"print('True + True:', True + True)\n",
|
|
"print('3*True:', 3*True)\n",
|
|
"print('3*True - False:', 3*True - False)\n"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"isinstance(True, int): True\n",
|
|
"True + True: 2\n",
|
|
"3*True: 3\n",
|
|
"3*True - False: 3\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 16
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<br>\n",
|
|
"<br>\n",
|
|
"<a name='lambda_closure'></a>\n",
|
|
"\n",
|
|
"## About lambda and closures-in-a-loop pitfall\n",
|
|
"\n",
"The following example illustrates that every `lambda` looks up `i` only when it is called, so all of them return the last value of the loop variable:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_list = [lambda: i for i in range(5)]\n",
|
|
"for l in my_list:\n",
|
|
" print(l())"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"4\n",
|
|
"4\n",
|
|
"4\n",
|
|
"4\n",
|
|
"4\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 24
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"**Here, a generator can save you some pain, because each `lambda` is called before the loop variable moves on:**"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [
|
|
"my_gen = (lambda: n for n in range(5))\n",
|
|
"for l in my_gen:\n",
|
|
" print(l())"
|
|
],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"output_type": "stream",
|
|
"stream": "stdout",
|
|
"text": [
|
|
"0\n",
|
|
"1\n",
|
|
"2\n",
|
|
"3\n",
|
|
"4\n"
|
|
]
|
|
}
|
|
],
|
|
"prompt_number": 25
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"collapsed": false,
|
|
"input": [],
|
|
"language": "python",
|
|
"metadata": {},
|
|
"outputs": []
|
|
}
|
|
],
|
|
"metadata": {}
|
|
}
|
|
]
|
|
} |