many revisions

rasbt 2014-04-25 13:05:32 -04:00
parent 89b7da504c
commit cf64608760
3 changed files with 408 additions and 217 deletions

View File

@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:90703033799353a31e4e44d4a78b991bbc5f3fceb2709614057dd367c91b2b0f"
"signature": "sha256:d8ba69c66769cf62e5201b70ed7d717913017f6f09492848ce164b50068bd2ba"
},
"nbformat": 3,
"nbformat_minor": 0,
@ -24,14 +24,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### All code was executed in Python 3.4"
"#### All code was executed in Python 3.4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A collection of not so obvious Python stuff you should know!"
"# A collection of not-so-obvious Python stuff you should know!"
]
},
{
@ -59,7 +59,7 @@
"source": [
"# Sections\n",
"- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
"- [The behavior of += for lists](#pm_in_lists)\n",
"- [Using `+=` on lists creates new objects](#pm_in_lists)\n",
"- [`True` and `False` in the datetime module](#datetime_module)\n",
"- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
"- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
@ -109,7 +109,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n",
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \n",
"Assuming that child class C inherits from two parent classes A and B, \"class A should be checked before class B\".\n",
"\n",
"If you want to learn more, please read the [original blog](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) post by Guido van Rossum.\n",
"\n",
@ -150,14 +151,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So what actually happened above was that class `C` was looking in the parent class `A` for the method `.foo()` first (and found it)!"
"So what actually happened above was that class `C` looked in the scope of the parent class `A` for the method `.foo()` first (and found it)!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I received an email with a nice suggestion using a more nested example to illustrate Guido van Rossum's point a little bit better:"
"I received an email containing a suggestion which uses a more nested example to illustrate Guido van Rossum's point a little bit better:"
]
},
{
@ -197,7 +198,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`"
"Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`. "
]
},
{
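A minimal sketch, not taken from the commit, of the search order that the cell above describes. The class bodies are assumptions that mirror the names used in the notebook text; `__mro__` is the standard attribute holding a class's C3 linearization.

```python
# Hypothetical classes mirroring the names used in the notebook text.
class A:
    def foo(self):
        return "A.foo"

class B(A):
    pass

class C(A):
    def foo(self):
        return "C.foo"

class D(B, C):
    pass

# Every class exposes its method resolution order as __mro__.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)   # ['D', 'B', 'C', 'A', 'object']
print(D().foo())   # C.foo, because C comes before A in the MRO
```

Printing `__mro__` is usually quicker than reasoning about the order by hand when a hierarchy gets deeper.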
@ -213,7 +214,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## The behavior of `+=` for lists"
"## Using `+=` on lists creates new objects"
]
},
{
@ -227,21 +228,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n",
"Python `list`s are mutable objects as we all know. So, if we are using the `+=` operator on `list`s, we extend the `list` by directly modifying the object directly. \n",
"\n",
"(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
"However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"list_a = []\n",
"print('ID of list_a', id(list_a))\n",
"list_a += [1]\n",
"print('ID of list_a after `+= [1]`', id(list_a))\n",
"list_a = list_a + [2]\n",
"print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
"a_list = []\n",
"print('ID:', id(a_list))\n",
"\n",
"a_list += [1]\n",
"print('ID (+=):', id(a_list))\n",
"\n",
"a_list = a_list + [2]\n",
"print('ID (list = list + ...):', id(a_list))"
],
"language": "python",
"metadata": {},
@ -250,13 +253,56 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"ID of list_a 4356429080\n",
"ID of list_a after `+= [1]` 4356429080\n",
"ID of list_a after `list_a = list_a + [2]` 4356453584\n"
"ID: 4366496544\n",
"ID (+=): 4366496544\n",
"ID (list = list + ...): 4366495472\n"
]
}
],
"prompt_number": 2
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just for reference, the `.append()` and `.extends()` methods are modifying the `list` object in place, just as expected."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a_list = []\n",
"print('ID:',id(a_list))\n",
"\n",
"a_list.append(1)\n",
"print('ID (append):',id(a_list))\n",
"\n",
"a_list.append(2)\n",
"print('ID (extend):',id(a_list))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"ID: 4366495544\n",
"ID (append): 4366495544\n",
"ID (extend): 4366495544\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
@ -280,10 +326,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
"unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
"A long discussion on the python-ideas mailing list shows that, while surprising,\n",
"that behavior is desirable\u2014at least in some quarters.\"\n",
"\"It often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that, unlike any other time value, midnight (i.e. `datetime.time(0,0,0)`) is False. A long discussion on the python-ideas mailing list shows that, while surprising, that behavior is desirable\u2014at least in some quarters.\" \n",
"\n",
"(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
]
@ -301,9 +344,9 @@
"input": [
"import datetime\n",
"\n",
"print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
"print('\"datetime.time(0,0,0)\" (Midnight) ->', bool(datetime.time(0,0,0)))\n",
"\n",
"print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
"print('\"datetime.time(1,0,0)\" (1 am) ->', bool(datetime.time(1,0,0)))"
],
"language": "python",
"metadata": {},
@ -312,12 +355,12 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
"\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
"\"datetime.time(0,0,0)\" (Midnight) -> False\n",
"\"datetime.time(1,0,0)\" (1 am) -> True\n"
]
}
],
"prompt_number": 4
"prompt_number": 8
},
{
"cell_type": "markdown",
@ -333,7 +376,7 @@
"metadata": {},
"source": [
"\n",
"## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
"## Python reuses objects for small integers - use \"==\" for equality, \"is\" for identity\n",
"\n"
]
},
@ -348,14 +391,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n",
"\n",
"\n",
"(*I received a comment that this is in fact a CPython artefact and must not necessarily be true in all implementations of Python!*)\n",
"\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):"
"This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong))."
]
},
{
@ -367,9 +403,9 @@
"print('a is b', bool(a is b))\n",
"True\n",
"\n",
"a = 999\n",
"b = 999\n",
"print('a is b', bool(a is b))"
"c = 999\n",
"d = 999\n",
"print('c is d', bool(c is d))"
],
"language": "python",
"metadata": {},
@ -379,25 +415,38 @@
"stream": "stdout",
"text": [
"a is b True\n",
"a is b False\n"
"c is d False\n"
]
}
],
"prompt_number": 5
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Another popular example to illustrate the reuse of objects for small integers is:"
"(*I received a comment that this is in fact a CPython artefact and **must not necessarily be true** in all implementations of Python!*)\n",
"\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it by comparing \"boxes\" (C language) with \"name tags\" (Python)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example demonstrates that this applies indeed for integers in the range in -5 to 256:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print(256 is 257 - 1)\n",
"print(257 is 258 - 1)"
"print('256 is 257-1', 256 is 257-1)\n",
"print('257 is 258-1', 257 is 258 - 1)\n",
"print('-5 is -6+1', -5 is -6+1)\n",
"print('-7 is -6-1', -7 is -6-1)"
],
"language": "python",
"metadata": {},
@ -406,12 +455,14 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"True\n",
"False\n"
"256 is 257-1 True\n",
"257 is 258-1 False\n",
"-5 is -6+1 True\n",
"-7 is -6-1 False\n"
]
}
],
"prompt_number": 2
"prompt_number": 11
},
{
"cell_type": "markdown",
@ -447,7 +498,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### And this example shows when `==` does not necessarilu implies that two objects are the same:"
"We would think that identity would always imply equality, but this is not always true, as we can see in the next example:"
]
},
{
@ -455,8 +506,8 @@
"collapsed": false,
"input": [
"a = float('nan')\n",
"print('a == a,', a == a)\n",
"print('a is a,', a is a)"
"print('a is a,', a is a)\n",
"print('a == a,', a == a)"
],
"language": "python",
"metadata": {},
@ -465,12 +516,12 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"a == a, False\n",
"a is a, True\n"
"a is a, True\n",
"a == a, False\n"
]
}
],
"prompt_number": 7
"prompt_number": 12
},
{
"cell_type": "markdown",
@ -500,41 +551,28 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"List modification of the original list does affect shallow copies, but not deep copies if the list contains compound objects."
"**Shallow copy**: \n",
"If we use the assignment operator to assign one list to another list, we just create a new name reference to the original list. If we want to create a new list object, we have to make a copy of the original list. This can be done via `a_list[:]` of `a_list.copy()`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from copy import deepcopy\n",
"list1 = [1,2]\n",
"list2 = list1 # reference\n",
"list3 = list1[:] # shallow copy\n",
"list4 = list1.copy() # shallow copy\n",
"\n",
"my_first_list = [[1],[2]]\n",
"my_second_list = [[1],[2]]\n",
"print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
"print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
"print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\nlist4: {}\\n'\n",
" .format(id(list1), id(list2), id(list3), id(list4)))\n",
"\n",
"my_third_list = my_first_list\n",
"print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
"print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
"list2[0] = 3\n",
"print('list1:', list1)\n",
"\n",
"my_shallow_copy = my_first_list[:]\n",
"print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
"print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
"\n",
"my_deep_copy = deepcopy(my_first_list)\n",
"print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
"print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
"\n",
"print('\\nmy_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)\n",
"\n",
"my_first_list[0][0] = 2\n",
"print('after setting \"my_first_list[0][0] = 2\"')\n",
"print('my_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)"
"list3[0] = 4\n",
"list4[1] = 4\n",
"print('list1:', list1)"
],
"language": "python",
"metadata": {},
@ -543,26 +581,70 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"my_first_list == my_second_list: True\n",
"my_first_list is my_second_list: False\n",
"my_first_list == my_third_list: True\n",
"my_first_list is my_third_list: True\n",
"my_first_list == my_shallow_copy: True\n",
"my_first_list is my_shallow_copy: False\n",
"my_first_list == my_deep_copy: True\n",
"my_first_list is my_deep_copy: False\n",
"IDs:\n",
"list1: 4377955288\n",
"list2: 4377955288\n",
"list3: 4377955432\n",
"list4: 4377954784\n",
"\n",
"my_third_list: [[1], [2]]\n",
"my_shallow_copy: [[1], [2]]\n",
"my_deep_copy: [[1], [2]]\n",
"after setting \"my_first_list[0][0] = 2\"\n",
"my_third_list: [[2], [2]]\n",
"my_shallow_copy: [[2], [2]]\n",
"my_deep_copy: [[1], [2]]\n"
"list1: [3, 2]\n",
"list1: [3, 2]\n"
]
}
],
"prompt_number": 7
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Deep copy** \n",
"As we have seen above, a shallow copy works fine if we want to create a new list with contents of the original list which we want to modify independently. \n",
"\n",
"However, if we are dealing with compound objects (e.g., lists that contain other lists, [read here](https://docs.python.org/2/library/copy.html) for more information) it becomes a little trickier.\n",
"\n",
"In the case of compound objects, a shallow copy would create a new compound object, but it would just insert the references to the contained objects into the new compound object. In contrast, a deep copy would go \"deeper\" and create also new objects \n",
"for the objects found in the original compound object. \n",
"If you follow the code, the concept should become more clear:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from copy import deepcopy\n",
"\n",
"list1 = [[1],[2]]\n",
"list2 = list1.copy() # shallow copy\n",
"list3 = deepcopy(list1) # deep copy\n",
"\n",
"print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\n'\n",
" .format(id(list1), id(list2), id(list3)))\n",
"\n",
"list2[0][0] = 3\n",
"print('list1:', list1)\n",
"\n",
"list3[0][0] = 5\n",
"print('list1:', list1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"IDs:\n",
"list1: 4377956296\n",
"list2: 4377961752\n",
"list3: 4377954928\n",
"\n",
"list1: [[3], [2]]\n",
"list1: [[3], [2]]\n"
]
}
],
"prompt_number": 25
},
{
"cell_type": "markdown",
@ -752,7 +834,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Be aware using `in` checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"."
"Be aware of what is happening when combining \"`in`\" checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"."
]
},
{
@ -783,17 +865,18 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**We can circumvent this problem by using a simple list, though:**"
"Although this defeats the purpose of an generator (in most cases), we can convert a generator into a list to circumvent the problem. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"l = [i for i in range(5)]\n",
"print('2 in l,', 2 in l)\n",
"print('3 in l,', 3 in l)\n",
"print('1 in l,', 1 in l) "
"gen = (i for i in range(5))\n",
"a_list = list(gen)\n",
"print('2 in l,', 2 in a_list)\n",
"print('3 in l,', 3 in a_list)\n",
"print('1 in l,', 1 in a_list) "
],
"language": "python",
"metadata": {},
@ -808,7 +891,7 @@
]
}
],
"prompt_number": 10
"prompt_number": 27
},
{
"cell_type": "markdown",

View File

@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:0827512c142a04f764ebbbcad51defe8005ffc48f52010c4fa1ac24eda4d9c13"
"signature": "sha256:4f74947620f3ebd04a28a448392a201107339760170f1eb74e815a3e8b8267e8"
},
"nbformat": 3,
"nbformat_minor": 0,
@ -1023,6 +1023,14 @@
"# Comprehesions vs. for-loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Comprehensions are not only shorter and prettier than ye goode olde for-loop, \n",
"but they are also up to ~1.2x faster."
]
},
{
"cell_type": "code",
"collapsed": false,
@ -1149,7 +1157,24 @@
],
"language": "python",
"metadata": {},
"outputs": []
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10000 loops, best of 3: 129 \u00b5s per loop\n",
"10000 loops, best of 3: 111 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 25
},
{
"cell_type": "markdown",
@ -1172,7 +1197,7 @@
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 15
"prompt_number": 26
},
{
"cell_type": "code",
@ -1184,7 +1209,7 @@
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 17
"prompt_number": 27
},
{
"cell_type": "code",
@ -1200,8 +1225,8 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"10000 loops, best of 3: 120 \u00b5s per loop\n",
"10000 loops, best of 3: 118 \u00b5s per loop"
"10000 loops, best of 3: 121 \u00b5s per loop\n",
"10000 loops, best of 3: 127 \u00b5s per loop"
]
},
{
@ -1212,7 +1237,7 @@
]
}
],
"prompt_number": 18
"prompt_number": 28
},
{
"cell_type": "code",

View File

@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:90703033799353a31e4e44d4a78b991bbc5f3fceb2709614057dd367c91b2b0f"
"signature": "sha256:d8ba69c66769cf62e5201b70ed7d717913017f6f09492848ce164b50068bd2ba"
},
"nbformat": 3,
"nbformat_minor": 0,
@ -24,14 +24,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### All code was executed in Python 3.4"
"#### All code was executed in Python 3.4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A collection of not so obvious Python stuff you should know!"
"# A collection of not-so-obvious Python stuff you should know!"
]
},
{
@ -59,7 +59,7 @@
"source": [
"# Sections\n",
"- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
"- [The behavior of += for lists](#pm_in_lists)\n",
"- [Using `+=` on lists creates new objects](#pm_in_lists)\n",
"- [`True` and `False` in the datetime module](#datetime_module)\n",
"- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
"- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
@ -109,7 +109,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n",
"If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \n",
"Assuming that child class C inherits from two parent classes A and B, \"class A should be checked before class B\".\n",
"\n",
"If you want to learn more, please read the [original blog](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) post by Guido van Rossum.\n",
"\n",
@ -150,14 +151,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So what actually happened above was that class `C` was looking in the parent class `A` for the method `.foo()` first (and found it)!"
"So what actually happened above was that class `C` looked in the scope of the parent class `A` for the method `.foo()` first (and found it)!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I received an email with a nice suggestion using a more nested example to illustrate Guido van Rossum's point a little bit better:"
"I received an email containing a suggestion which uses a more nested example to illustrate Guido van Rossum's point a little bit better:"
]
},
{
@ -197,7 +198,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`"
"Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`. "
]
},
{
@ -213,7 +214,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## The behavior of `+=` for lists"
"## Using `+=` on lists creates new objects"
]
},
{
@ -227,21 +228,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n",
"Python `list`s are mutable objects as we all know. So, if we are using the `+=` operator on `list`s, we extend the `list` by directly modifying the object directly. \n",
"\n",
"(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
"However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"list_a = []\n",
"print('ID of list_a', id(list_a))\n",
"list_a += [1]\n",
"print('ID of list_a after `+= [1]`', id(list_a))\n",
"list_a = list_a + [2]\n",
"print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
"a_list = []\n",
"print('ID:', id(a_list))\n",
"\n",
"a_list += [1]\n",
"print('ID (+=):', id(a_list))\n",
"\n",
"a_list = a_list + [2]\n",
"print('ID (list = list + ...):', id(a_list))"
],
"language": "python",
"metadata": {},
@ -250,13 +253,56 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"ID of list_a 4356429080\n",
"ID of list_a after `+= [1]` 4356429080\n",
"ID of list_a after `list_a = list_a + [2]` 4356453584\n"
"ID: 4366496544\n",
"ID (+=): 4366496544\n",
"ID (list = list + ...): 4366495472\n"
]
}
],
"prompt_number": 2
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just for reference, the `.append()` and `.extends()` methods are modifying the `list` object in place, just as expected."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a_list = []\n",
"print('ID:',id(a_list))\n",
"\n",
"a_list.append(1)\n",
"print('ID (append):',id(a_list))\n",
"\n",
"a_list.append(2)\n",
"print('ID (extend):',id(a_list))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"ID: 4366495544\n",
"ID (append): 4366495544\n",
"ID (extend): 4366495544\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
@ -280,10 +326,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
"unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
"A long discussion on the python-ideas mailing list shows that, while surprising,\n",
"that behavior is desirable\u2014at least in some quarters.\"\n",
"\"It often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that, unlike any other time value, midnight (i.e. `datetime.time(0,0,0)`) is False. A long discussion on the python-ideas mailing list shows that, while surprising, that behavior is desirable\u2014at least in some quarters.\" \n",
"\n",
"(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
]
@ -301,9 +344,9 @@
"input": [
"import datetime\n",
"\n",
"print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
"print('\"datetime.time(0,0,0)\" (Midnight) ->', bool(datetime.time(0,0,0)))\n",
"\n",
"print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
"print('\"datetime.time(1,0,0)\" (1 am) ->', bool(datetime.time(1,0,0)))"
],
"language": "python",
"metadata": {},
@ -312,12 +355,12 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
"\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
"\"datetime.time(0,0,0)\" (Midnight) -> False\n",
"\"datetime.time(1,0,0)\" (1 am) -> True\n"
]
}
],
"prompt_number": 4
"prompt_number": 8
},
{
"cell_type": "markdown",
@ -333,7 +376,7 @@
"metadata": {},
"source": [
"\n",
"## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
"## Python reuses objects for small integers - use \"==\" for equality, \"is\" for identity\n",
"\n"
]
},
@ -348,14 +391,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n",
"\n",
"\n",
"(*I received a comment that this is in fact a CPython artefact and must not necessarily be true in all implementations of Python!*)\n",
"\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):"
"This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong))."
]
},
{
@ -367,9 +403,9 @@
"print('a is b', bool(a is b))\n",
"True\n",
"\n",
"a = 999\n",
"b = 999\n",
"print('a is b', bool(a is b))"
"c = 999\n",
"d = 999\n",
"print('c is d', bool(c is d))"
],
"language": "python",
"metadata": {},
@ -379,25 +415,38 @@
"stream": "stdout",
"text": [
"a is b True\n",
"a is b False\n"
"c is d False\n"
]
}
],
"prompt_number": 5
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Another popular example to illustrate the reuse of objects for small integers is:"
"(*I received a comment that this is in fact a CPython artefact and **must not necessarily be true** in all implementations of Python!*)\n",
"\n",
"So the take home message is: always use \"==\" for equality, \"is\" for identity!\n",
"\n",
"Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it by comparing \"boxes\" (C language) with \"name tags\" (Python)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example demonstrates that this applies indeed for integers in the range in -5 to 256:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print(256 is 257 - 1)\n",
"print(257 is 258 - 1)"
"print('256 is 257-1', 256 is 257-1)\n",
"print('257 is 258-1', 257 is 258 - 1)\n",
"print('-5 is -6+1', -5 is -6+1)\n",
"print('-7 is -6-1', -7 is -6-1)"
],
"language": "python",
"metadata": {},
@ -406,12 +455,14 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"True\n",
"False\n"
"256 is 257-1 True\n",
"257 is 258-1 False\n",
"-5 is -6+1 True\n",
"-7 is -6-1 False\n"
]
}
],
"prompt_number": 2
"prompt_number": 11
},
{
"cell_type": "markdown",
@ -447,7 +498,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### And this example shows when `==` does not necessarilu implies that two objects are the same:"
"We would think that identity would always imply equality, but this is not always true, as we can see in the next example:"
]
},
{
@ -455,8 +506,8 @@
"collapsed": false,
"input": [
"a = float('nan')\n",
"print('a == a,', a == a)\n",
"print('a is a,', a is a)"
"print('a is a,', a is a)\n",
"print('a == a,', a == a)"
],
"language": "python",
"metadata": {},
@ -465,12 +516,12 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"a == a, False\n",
"a is a, True\n"
"a is a, True\n",
"a == a, False\n"
]
}
],
"prompt_number": 7
"prompt_number": 12
},
{
"cell_type": "markdown",
@ -500,41 +551,28 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"List modification of the original list does affect shallow copies, but not deep copies if the list contains compound objects."
"**Shallow copy**: \n",
"If we use the assignment operator to assign one list to another list, we just create a new name reference to the original list. If we want to create a new list object, we have to make a copy of the original list. This can be done via `a_list[:]` of `a_list.copy()`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from copy import deepcopy\n",
"list1 = [1,2]\n",
"list2 = list1 # reference\n",
"list3 = list1[:] # shallow copy\n",
"list4 = list1.copy() # shallow copy\n",
"\n",
"my_first_list = [[1],[2]]\n",
"my_second_list = [[1],[2]]\n",
"print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
"print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
"print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\nlist4: {}\\n'\n",
" .format(id(list1), id(list2), id(list3), id(list4)))\n",
"\n",
"my_third_list = my_first_list\n",
"print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
"print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
"list2[0] = 3\n",
"print('list1:', list1)\n",
"\n",
"my_shallow_copy = my_first_list[:]\n",
"print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
"print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
"\n",
"my_deep_copy = deepcopy(my_first_list)\n",
"print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
"print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
"\n",
"print('\\nmy_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)\n",
"\n",
"my_first_list[0][0] = 2\n",
"print('after setting \"my_first_list[0][0] = 2\"')\n",
"print('my_third_list:', my_third_list)\n",
"print('my_shallow_copy:', my_shallow_copy)\n",
"print('my_deep_copy:', my_deep_copy)"
"list3[0] = 4\n",
"list4[1] = 4\n",
"print('list1:', list1)"
],
"language": "python",
"metadata": {},
@ -543,26 +581,70 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"my_first_list == my_second_list: True\n",
"my_first_list is my_second_list: False\n",
"my_first_list == my_third_list: True\n",
"my_first_list is my_third_list: True\n",
"my_first_list == my_shallow_copy: True\n",
"my_first_list is my_shallow_copy: False\n",
"my_first_list == my_deep_copy: True\n",
"my_first_list is my_deep_copy: False\n",
"IDs:\n",
"list1: 4377955288\n",
"list2: 4377955288\n",
"list3: 4377955432\n",
"list4: 4377954784\n",
"\n",
"my_third_list: [[1], [2]]\n",
"my_shallow_copy: [[1], [2]]\n",
"my_deep_copy: [[1], [2]]\n",
"after setting \"my_first_list[0][0] = 2\"\n",
"my_third_list: [[2], [2]]\n",
"my_shallow_copy: [[2], [2]]\n",
"my_deep_copy: [[1], [2]]\n"
"list1: [3, 2]\n",
"list1: [3, 2]\n"
]
}
],
"prompt_number": 7
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Deep copy** \n",
"As we have seen above, a shallow copy works fine if we want to create a new list with contents of the original list which we want to modify independently. \n",
"\n",
"However, if we are dealing with compound objects (e.g., lists that contain other lists, [read here](https://docs.python.org/2/library/copy.html) for more information) it becomes a little trickier.\n",
"\n",
"In the case of compound objects, a shallow copy would create a new compound object, but it would just insert the references to the contained objects into the new compound object. In contrast, a deep copy would go \"deeper\" and create also new objects \n",
"for the objects found in the original compound object. \n",
"If you follow the code, the concept should become more clear:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from copy import deepcopy\n",
"\n",
"list1 = [[1],[2]]\n",
"list2 = list1.copy() # shallow copy\n",
"list3 = deepcopy(list1) # deep copy\n",
"\n",
"print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\n'\n",
" .format(id(list1), id(list2), id(list3)))\n",
"\n",
"list2[0][0] = 3\n",
"print('list1:', list1)\n",
"\n",
"list3[0][0] = 5\n",
"print('list1:', list1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"IDs:\n",
"list1: 4377956296\n",
"list2: 4377961752\n",
"list3: 4377954928\n",
"\n",
"list1: [[3], [2]]\n",
"list1: [[3], [2]]\n"
]
}
],
"prompt_number": 25
},
{
"cell_type": "markdown",
@ -752,7 +834,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Be aware using `in` checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"."
"Be aware of what is happening when combining \"`in`\" checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"."
]
},
{
@ -783,17 +865,18 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**We can circumvent this problem by using a simple list, though:**"
"Although this defeats the purpose of an generator (in most cases), we can convert a generator into a list to circumvent the problem. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"l = [i for i in range(5)]\n",
"print('2 in l,', 2 in l)\n",
"print('3 in l,', 3 in l)\n",
"print('1 in l,', 1 in l) "
"gen = (i for i in range(5))\n",
"a_list = list(gen)\n",
"print('2 in l,', 2 in a_list)\n",
"print('3 in l,', 3 in a_list)\n",
"print('1 in l,', 1 in a_list) "
],
"language": "python",
"metadata": {},
@ -808,7 +891,7 @@
]
}
],
"prompt_number": 10
"prompt_number": 27
},
{
"cell_type": "markdown",