diff --git a/.ipynb_checkpoints/not_so_obvious_python_stuff-checkpoint.ipynb b/.ipynb_checkpoints/not_so_obvious_python_stuff-checkpoint.ipynb index bf44f43..508aebf 100644 --- a/.ipynb_checkpoints/not_so_obvious_python_stuff-checkpoint.ipynb +++ b/.ipynb_checkpoints/not_so_obvious_python_stuff-checkpoint.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:90703033799353a31e4e44d4a78b991bbc5f3fceb2709614057dd367c91b2b0f" + "signature": "sha256:d8ba69c66769cf62e5201b70ed7d717913017f6f09492848ce164b50068bd2ba" }, "nbformat": 3, "nbformat_minor": 0, @@ -24,14 +24,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### All code was executed in Python 3.4" + "#### All code was executed in Python 3.4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# A collection of not so obvious Python stuff you should know!" + "# A collection of not-so-obvious Python stuff you should know!" ] }, { @@ -59,7 +59,7 @@ "source": [ "# Sections\n", "- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n", - "- [The behavior of += for lists](#pm_in_lists)\n", + "- [Using `+=` on lists creates new objects](#pm_in_lists)\n", "- [`True` and `False` in the datetime module](#datetime_module)\n", "- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n", "- [Shallow vs. 
deep copies if list contains other structures and objects](#shallow_vs_deep)\n", @@ -109,7 +109,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n", + "If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \n", + "Assuming that child class C inherits from two parent classes A and B, \"class A should be checked before class B\".\n", "\n", "If you want to learn more, please read the [original blog](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) post by Guido van Rossum.\n", "\n", @@ -150,14 +151,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "So what actually happened above was that class `C` was looking in the parent class `A` for the method `.foo()` first (and found it)!" + "So what actually happened above was that class `C` looked in the scope of the parent class `A` for the method `.foo()` first (and found it)!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "I received an email with a nice suggestion using a more nested example to illustrate Guido van Rossum's point a little bit better:" + "I received an email containing a suggestion which uses a more nested example to illustrate Guido van Rossum's point a little bit better:" ] }, { @@ -197,7 +198,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`" + "Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`. 
" ] }, { @@ -213,7 +214,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## The behavior of `+=` for lists" + "## Using `+=` on lists creates new objects" ] }, { @@ -227,21 +228,23 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n", + "Python `list`s are mutable objects as we all know. So, if we are using the `+=` operator on `list`s, we extend the `list` by modifying the object directly. \n", "\n", - "(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))" + "However, if we use the assignment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "list_a = []\n", - "print('ID of list_a', id(list_a))\n", - "list_a += [1]\n", - "print('ID of list_a after `+= [1]`', id(list_a))\n", - "list_a = list_a + [2]\n", - "print('ID of list_a after `list_a = list_a + [2]`', id(list_a))" + "a_list = []\n", + "print('ID:', id(a_list))\n", + "\n", + "a_list += [1]\n", + "print('ID (+=):', id(a_list))\n", + "\n", + "a_list = a_list + [2]\n", + "print('ID (list = list + ...):', id(a_list))" ], "language": "python", "metadata": {}, @@ -250,13 +253,56 @@ "output_type": "stream", "stream": "stdout", "text": [ - "ID of list_a 4356429080\n", - "ID of list_a after `+= [1]` 4356429080\n", - "ID of list_a after `list_a = list_a + [2]` 4356453584\n" + "ID: 4366496544\n", + "ID (+=): 4366496544\n", + "ID (list = list + ...): 4366495472\n" ] } ], - "prompt_number": 2 + "prompt_number": 6 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just for reference, the 
`.append()` and `.extend()` methods modify the `list` object in place, just as expected." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "a_list = []\n", + "print('ID:',id(a_list))\n", + "\n", + "a_list.append(1)\n", + "print('ID (append):',id(a_list))\n", + "\n", + "a_list.extend([2])\n", + "print('ID (extend):',id(a_list))" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "ID: 4366495544\n", + "ID (append): 4366495544\n", + "ID (extend): 4366495544\n" + ] + } + ], + "prompt_number": 7 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [], + "language": "python", + "metadata": {}, + "outputs": [] }, { "cell_type": "markdown", @@ -280,10 +326,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n", - "unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n", - "A long discussion on the python-ideas mailing list shows that, while surprising,\n", - "that behavior is desirable\u2014at least in some quarters.\"\n", + "\"It often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that, unlike any other time value, midnight (i.e. `datetime.time(0,0,0)`) is False. 
A long discussion on the python-ideas mailing list shows that, while surprising, that behavior is desirable\u2014at least in some quarters.\" \n", "\n", "(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))" ] @@ -301,9 +344,9 @@ "input": [ "import datetime\n", "\n", - "print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n", + "print('\"datetime.time(0,0,0)\" (Midnight) ->', bool(datetime.time(0,0,0)))\n", "\n", - "print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))" + "print('\"datetime.time(1,0,0)\" (1 am) ->', bool(datetime.time(1,0,0)))" ], "language": "python", "metadata": {}, @@ -312,12 +355,12 @@ "output_type": "stream", "stream": "stdout", "text": [ - "\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n", - "\"datetime.time(1,0,0)\" (1 am) evaluates to True\n" + "\"datetime.time(0,0,0)\" (Midnight) -> False\n", + "\"datetime.time(1,0,0)\" (1 am) -> True\n" ] } ], - "prompt_number": 4 + "prompt_number": 8 }, { "cell_type": "markdown", @@ -333,7 +376,7 @@ "metadata": {}, "source": [ "\n", - "## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n", + "## Python reuses objects for small integers - use \"==\" for equality, \"is\" for identity\n", "\n" ] }, @@ -348,14 +391,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n", - "\n", - "\n", - "(*I received a comment that this is in fact a CPython artefact and must not necessarily be true in all implementations of Python!*)\n", - "\n", - "So the take home message is: always use \"==\" for equality, \"is\" for identity!\n", - "\n", - "Here is a [nice 
article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):" + "This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong))." ] }, { @@ -367,9 +403,9 @@ "print('a is b', bool(a is b))\n", "True\n", "\n", - "a = 999\n", - "b = 999\n", - "print('a is b', bool(a is b))" + "c = 999\n", + "d = 999\n", + "print('c is d', bool(c is d))" ], "language": "python", "metadata": {}, @@ -379,25 +415,38 @@ "stream": "stdout", "text": [ "a is b True\n", - "a is b False\n" + "c is d False\n" ] } ], - "prompt_number": 5 + "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### Another popular example to illustrate the reuse of objects for small integers is:" + "(*I received a comment that this is in fact a CPython artefact and **must not necessarily be true** in all implementations of Python!*)\n", + "\n", + "So the take home message is: always use \"==\" for equality, \"is\" for identity!\n", + "\n", + "Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it by comparing \"boxes\" (C language) with \"name tags\" (Python)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example demonstrates that this indeed applies to integers in the range -5 to 256:" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "print(256 is 257 - 1)\n", - "print(257 is 258 - 1)" + "print('256 is 257-1', 256 is 257-1)\n", + "print('257 is 258-1', 257 is 258 - 1)\n", + "print('-5 is -6+1', -5 is -6+1)\n", + "print('-7 is -6-1', -7 is -6-1)" ], "language": "python", "metadata": {}, @@ -406,12 +455,14 @@ "output_type": "stream", "stream": "stdout", "text": [ - "True\n", - "False\n" + "256 is 257-1 True\n", + "257 is 258-1 False\n", + "-5 is -6+1 True\n", + "-7 is -6-1 False\n" ] } ], - "prompt_number": 2 + "prompt_number": 11 }, { "cell_type": "markdown", @@ -447,7 +498,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### And this example shows when `==` does not necessarilu implies that two objects are the same:" + "We would think that identity would always imply equality, but this is not always true, as we can see in the next example:" ] }, { @@ -455,8 +506,8 @@ "collapsed": false, "input": [ "a = float('nan')\n", - "print('a == a,', a == a)\n", - "print('a is a,', a is a)" + "print('a is a,', a is a)\n", + "print('a == a,', a == a)" ], "language": "python", "metadata": {}, @@ -465,12 +516,12 @@ "output_type": "stream", "stream": "stdout", "text": [ - "a == a, False\n", - "a is a, True\n" + "a is a, True\n", + "a == a, False\n" ] } ], - "prompt_number": 7 + "prompt_number": 12 }, { "cell_type": "markdown", @@ -500,41 +551,28 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "List modification of the original list does affect shallow copies, but not deep copies if the list contains compound objects." + "**Shallow copy**: \n", + "If we use the assignment operator to assign one list to another list, we just create a new name reference to the original list. If we want to create a new list object, we have to make a copy of the original list. 
This can be done via `a_list[:]` or `a_list.copy()`." ] }, { "cell_type": "code", "collapsed": false, "input": [ - "from copy import deepcopy\n", + "list1 = [1,2]\n", + "list2 = list1 # reference\n", + "list3 = list1[:] # shallow copy\n", + "list4 = list1.copy() # shallow copy\n", "\n", - "my_first_list = [[1],[2]]\n", - "my_second_list = [[1],[2]]\n", - "print('my_first_list == my_second_list:', my_first_list == my_second_list)\n", - "print('my_first_list is my_second_list:', my_first_list is my_second_list)\n", + "print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\nlist4: {}\\n'\n", + " .format(id(list1), id(list2), id(list3), id(list4)))\n", "\n", - "my_third_list = my_first_list\n", - "print('my_first_list == my_third_list:', my_first_list == my_third_list)\n", - "print('my_first_list is my_third_list:', my_first_list is my_third_list)\n", + "list2[0] = 3\n", + "print('list1:', list1)\n", "\n", - "my_shallow_copy = my_first_list[:]\n", - "print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n", - "print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n", - "\n", - "my_deep_copy = deepcopy(my_first_list)\n", - "print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n", - "print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n", - "\n", - "print('\\nmy_third_list:', my_third_list)\n", - "print('my_shallow_copy:', my_shallow_copy)\n", - "print('my_deep_copy:', my_deep_copy)\n", - "\n", - "my_first_list[0][0] = 2\n", - "print('after setting \"my_first_list[0][0] = 2\"')\n", - "print('my_third_list:', my_third_list)\n", - "print('my_shallow_copy:', my_shallow_copy)\n", - "print('my_deep_copy:', my_deep_copy)" + "list3[0] = 4\n", + "list4[1] = 4\n", + "print('list1:', list1)" ], "language": "python", "metadata": {}, @@ -543,26 +581,70 @@ "output_type": "stream", "stream": "stdout", "text": [ - "my_first_list == my_second_list: True\n", - "my_first_list is my_second_list: False\n", - 
"my_first_list == my_third_list: True\n", - "my_first_list is my_third_list: True\n", - "my_first_list == my_shallow_copy: True\n", - "my_first_list is my_shallow_copy: False\n", - "my_first_list == my_deep_copy: True\n", - "my_first_list is my_deep_copy: False\n", + "IDs:\n", + "list1: 4377955288\n", + "list2: 4377955288\n", + "list3: 4377955432\n", + "list4: 4377954784\n", "\n", - "my_third_list: [[1], [2]]\n", - "my_shallow_copy: [[1], [2]]\n", - "my_deep_copy: [[1], [2]]\n", - "after setting \"my_first_list[0][0] = 2\"\n", - "my_third_list: [[2], [2]]\n", - "my_shallow_copy: [[2], [2]]\n", - "my_deep_copy: [[1], [2]]\n" + "list1: [3, 2]\n", + "list1: [3, 2]\n" ] } ], - "prompt_number": 7 + "prompt_number": 23 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Deep copy** \n", + "As we have seen above, a shallow copy works fine if we want to create a new list with contents of the original list which we want to modify independently. \n", + "\n", + "However, if we are dealing with compound objects (e.g., lists that contain other lists, [read here](https://docs.python.org/2/library/copy.html) for more information) it becomes a little trickier.\n", + "\n", + "In the case of compound objects, a shallow copy would create a new compound object, but it would just insert the references to the contained objects into the new compound object. In contrast, a deep copy would go \"deeper\" and create also new objects \n", + "for the objects found in the original compound object. 
\n", + "If you follow the code, the concept should become clearer:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from copy import deepcopy\n", + "\n", + "list1 = [[1],[2]]\n", + "list2 = list1.copy() # shallow copy\n", + "list3 = deepcopy(list1) # deep copy\n", + "\n", + "print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\n'\n", + " .format(id(list1), id(list2), id(list3)))\n", + "\n", + "list2[0][0] = 3\n", + "print('list1:', list1)\n", + "\n", + "list3[0][0] = 5\n", + "print('list1:', list1)" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "IDs:\n", + "list1: 4377956296\n", + "list2: 4377961752\n", + "list3: 4377954928\n", + "\n", + "list1: [[3], [2]]\n", + "list1: [[3], [2]]\n" + ] + } + ], + "prompt_number": 25 }, { "cell_type": "markdown", @@ -752,7 +834,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Be aware using `in` checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"." + "Be aware of what is happening when combining \"`in`\" checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"." ] }, { @@ -783,17 +865,18 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**We can circumvent this problem by using a simple list, though:**" + "Although this defeats the purpose of a generator (in most cases), we can convert a generator into a list to circumvent the problem. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "l = [i for i in range(5)]\n", - "print('2 in l,', 2 in l)\n", - "print('3 in l,', 3 in l)\n", - "print('1 in l,', 1 in l) " + "gen = (i for i in range(5))\n", + "a_list = list(gen)\n", + "print('2 in l,', 2 in a_list)\n", + "print('3 in l,', 3 in a_list)\n", + "print('1 in l,', 1 in a_list) " ], "language": "python", "metadata": {}, @@ -808,7 +891,7 @@ ] } ], - "prompt_number": 10 + "prompt_number": 27 }, { "cell_type": "markdown", diff --git a/benchmarks/timeit_tests.ipynb b/benchmarks/timeit_tests.ipynb index 96bc0e8..06ab26d 100644 --- a/benchmarks/timeit_tests.ipynb +++ b/benchmarks/timeit_tests.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:0827512c142a04f764ebbbcad51defe8005ffc48f52010c4fa1ac24eda4d9c13" + "signature": "sha256:4f74947620f3ebd04a28a448392a201107339760170f1eb74e815a3e8b8267e8" }, "nbformat": 3, "nbformat_minor": 0, @@ -1023,6 +1023,14 @@ "# Comprehesions vs. for-loops" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Comprehensions are not only shorter and prettier than ye goode olde for-loop, \n", + "but they are also up to ~1.2x faster." 
+ ] + }, { "cell_type": "code", "collapsed": false, @@ -1149,7 +1157,24 @@ ], "language": "python", "metadata": {}, - "outputs": [] + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "10000 loops, best of 3: 129 \u00b5s per loop\n", + "10000 loops, best of 3: 111 \u00b5s per loop" + ] + }, + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "\n" + ] + } + ], + "prompt_number": 25 }, { "cell_type": "markdown", @@ -1172,7 +1197,7 @@ "language": "python", "metadata": {}, "outputs": [], - "prompt_number": 15 + "prompt_number": 26 }, { "cell_type": "code", @@ -1184,7 +1209,7 @@ "language": "python", "metadata": {}, "outputs": [], - "prompt_number": 17 + "prompt_number": 27 }, { "cell_type": "code", @@ -1200,8 +1225,8 @@ "output_type": "stream", "stream": "stdout", "text": [ - "10000 loops, best of 3: 120 \u00b5s per loop\n", - "10000 loops, best of 3: 118 \u00b5s per loop" + "10000 loops, best of 3: 121 \u00b5s per loop\n", + "10000 loops, best of 3: 127 \u00b5s per loop" ] }, { @@ -1212,7 +1237,7 @@ ] } ], - "prompt_number": 18 + "prompt_number": 28 }, { "cell_type": "code", diff --git a/not_so_obvious_python_stuff.ipynb b/not_so_obvious_python_stuff.ipynb index bf44f43..508aebf 100644 --- a/not_so_obvious_python_stuff.ipynb +++ b/not_so_obvious_python_stuff.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:90703033799353a31e4e44d4a78b991bbc5f3fceb2709614057dd367c91b2b0f" + "signature": "sha256:d8ba69c66769cf62e5201b70ed7d717913017f6f09492848ce164b50068bd2ba" }, "nbformat": 3, "nbformat_minor": 0, @@ -24,14 +24,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### All code was executed in Python 3.4" + "#### All code was executed in Python 3.4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# A collection of not so obvious Python stuff you should know!" + "# A collection of not-so-obvious Python stuff you should know!" 
] }, { @@ -59,7 +59,7 @@ "source": [ "# Sections\n", "- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n", - "- [The behavior of += for lists](#pm_in_lists)\n", + "- [Using `+=` on lists creates new objects](#pm_in_lists)\n", "- [`True` and `False` in the datetime module](#datetime_module)\n", "- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n", "- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n", @@ -109,7 +109,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \"class A should be checked before class B\".\n", + "If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies: \n", + "Assuming that child class C inherits from two parent classes A and B, \"class A should be checked before class B\".\n", "\n", "If you want to learn more, please read the [original blog](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) post by Guido van Rossum.\n", "\n", @@ -150,14 +151,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "So what actually happened above was that class `C` was looking in the parent class `A` for the method `.foo()` first (and found it)!" + "So what actually happened above was that class `C` looked in the scope of the parent class `A` for the method `.foo()` first (and found it)!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "I received an email with a nice suggestion using a more nested example to illustrate Guido van Rossum's point a little bit better:" + "I received an email containing a suggestion which uses a more nested example to illustrate Guido van Rossum's point a little bit better:" ] }, { @@ -197,7 +198,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`" + "Here, class `D` searches in `B` first, which in turn inherits from `A` (note that class `C` also inherits from `A`, but has its own `.foo()` method) so that we come up with the search order: `D, B, C, A`. " ] }, { @@ -213,7 +214,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## The behavior of `+=` for lists" + "## Using `+=` on lists creates new objects" ] }, { @@ -227,21 +228,23 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If we are using the `+=` operator on lists, we extend the list by modifying the object directly. However, if we use the assigment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:\n", + "Python `list`s are mutable objects as we all know. So, if we are using the `+=` operator on `list`s, we extend the `list` by modifying the object directly. 
\n", "\n", - "(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))" + "However, if we use the assignment via `my_list = my_list + ...`, we create a new list object, which can be demonstrated by the following code:" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "list_a = []\n", - "print('ID of list_a', id(list_a))\n", - "list_a += [1]\n", - "print('ID of list_a after `+= [1]`', id(list_a))\n", - "list_a = list_a + [2]\n", - "print('ID of list_a after `list_a = list_a + [2]`', id(list_a))" + "a_list = []\n", + "print('ID:', id(a_list))\n", + "\n", + "a_list += [1]\n", + "print('ID (+=):', id(a_list))\n", + "\n", + "a_list = a_list + [2]\n", + "print('ID (list = list + ...):', id(a_list))" ], "language": "python", "metadata": {}, @@ -250,13 +253,56 @@ "output_type": "stream", "stream": "stdout", "text": [ - "ID of list_a 4356429080\n", - "ID of list_a after `+= [1]` 4356429080\n", - "ID of list_a after `list_a = list_a + [2]` 4356453584\n" + "ID: 4366496544\n", + "ID (+=): 4366496544\n", + "ID (list = list + ...): 4366495472\n" ] } ], - "prompt_number": 2 + "prompt_number": 6 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just for reference, the `.append()` and `.extend()` methods modify the `list` object in place, just as expected." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "a_list = []\n", + "print('ID:',id(a_list))\n", + "\n", + "a_list.append(1)\n", + "print('ID (append):',id(a_list))\n", + "\n", + "a_list.extend([2])\n", + "print('ID (extend):',id(a_list))" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "ID: 4366495544\n", + "ID (append): 4366495544\n", + "ID (extend): 4366495544\n" + ] + } + ], + "prompt_number": 7 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [], + "language": "python", + "metadata": {}, + "outputs": [] }, { "cell_type": "markdown", @@ -280,10 +326,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n", - "unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n", - "A long discussion on the python-ideas mailing list shows that, while surprising,\n", - "that behavior is desirable\u2014at least in some quarters.\"\n", + "\"It often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that, unlike any other time value, midnight (i.e. `datetime.time(0,0,0)`) is False. 
A long discussion on the python-ideas mailing list shows that, while surprising, that behavior is desirable\u2014at least in some quarters.\" \n", "\n", "(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))" ] @@ -301,9 +344,9 @@ "input": [ "import datetime\n", "\n", - "print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n", + "print('\"datetime.time(0,0,0)\" (Midnight) ->', bool(datetime.time(0,0,0)))\n", "\n", - "print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))" + "print('\"datetime.time(1,0,0)\" (1 am) ->', bool(datetime.time(1,0,0)))" ], "language": "python", "metadata": {}, @@ -312,12 +355,12 @@ "output_type": "stream", "stream": "stdout", "text": [ - "\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n", - "\"datetime.time(1,0,0)\" (1 am) evaluates to True\n" + "\"datetime.time(0,0,0)\" (Midnight) -> False\n", + "\"datetime.time(1,0,0)\" (1 am) -> True\n" ] } ], - "prompt_number": 4 + "prompt_number": 8 }, { "cell_type": "markdown", @@ -333,7 +376,7 @@ "metadata": {}, "source": [ "\n", - "## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n", + "## Python reuses objects for small integers - use \"==\" for equality, \"is\" for identity\n", "\n" ] }, @@ -348,14 +391,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong)).\n", - "\n", - "\n", - "(*I received a comment that this is in fact a CPython artefact and must not necessarily be true in all implementations of Python!*)\n", - "\n", - "So the take home message is: always use \"==\" for equality, \"is\" for identity!\n", - "\n", - "Here is a [nice 
article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it using \"boxes\" (for people with C background) and \"name tags\" (in the case of Python):" + "This oddity occurs, because Python keeps an array of small integer objects (i.e., integers between -5 and 256, [see the doc](https://docs.python.org/2/c-api/int.html#PyInt_FromLong))." ] }, { @@ -367,9 +403,9 @@ "print('a is b', bool(a is b))\n", "True\n", "\n", - "a = 999\n", - "b = 999\n", - "print('a is b', bool(a is b))" + "c = 999\n", + "d = 999\n", + "print('c is d', bool(c is d))" ], "language": "python", "metadata": {}, @@ -379,25 +415,38 @@ "stream": "stdout", "text": [ "a is b True\n", - "a is b False\n" + "c is d False\n" ] } ], - "prompt_number": 5 + "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### Another popular example to illustrate the reuse of objects for small integers is:" + "(*I received a comment that this is in fact a CPython artefact and **must not necessarily be true** in all implementations of Python!*)\n", + "\n", + "So the take home message is: always use \"==\" for equality, \"is\" for identity!\n", + "\n", + "Here is a [nice article](http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) explaining it by comparing \"boxes\" (C language) with \"name tags\" (Python)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example demonstrates that this indeed applies to integers in the range -5 to 256:" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "print(256 is 257 - 1)\n", - "print(257 is 258 - 1)" + "print('256 is 257-1', 256 is 257-1)\n", + "print('257 is 258-1', 257 is 258 - 1)\n", + "print('-5 is -6+1', -5 is -6+1)\n", + "print('-7 is -6-1', -7 is -6-1)" ], "language": "python", "metadata": {}, @@ -406,12 +455,14 @@ "output_type": "stream", "stream": "stdout", "text": [ - "True\n", - "False\n" + "256 is 257-1 True\n", + "257 is 258-1 False\n", + "-5 is -6+1 True\n", + "-7 is -6-1 False\n" ] } ], - "prompt_number": 2 + "prompt_number": 11 }, { "cell_type": "markdown", @@ -447,7 +498,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### And this example shows when `==` does not necessarilu implies that two objects are the same:" + "We would think that identity would always imply equality, but this is not always true, as we can see in the next example:" ] }, { @@ -455,8 +506,8 @@ "collapsed": false, "input": [ "a = float('nan')\n", - "print('a == a,', a == a)\n", - "print('a is a,', a is a)" + "print('a is a,', a is a)\n", + "print('a == a,', a == a)" ], "language": "python", "metadata": {}, @@ -465,12 +516,12 @@ "output_type": "stream", "stream": "stdout", "text": [ - "a == a, False\n", - "a is a, True\n" + "a is a, True\n", + "a == a, False\n" ] } ], - "prompt_number": 7 + "prompt_number": 12 }, { "cell_type": "markdown", @@ -500,41 +551,28 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "List modification of the original list does affect shallow copies, but not deep copies if the list contains compound objects." + "**Shallow copy**: \n", + "If we use the assignment operator to assign one list to another list, we just create a new name reference to the original list. If we want to create a new list object, we have to make a copy of the original list. 
This can be done via `a_list[:]` or `a_list.copy()`." ] }, { "cell_type": "code", "collapsed": false, "input": [ - "from copy import deepcopy\n", + "list1 = [1,2]\n", + "list2 = list1 # reference\n", + "list3 = list1[:] # shallow copy\n", + "list4 = list1.copy() # shallow copy\n", "\n", - "my_first_list = [[1],[2]]\n", - "my_second_list = [[1],[2]]\n", - "print('my_first_list == my_second_list:', my_first_list == my_second_list)\n", - "print('my_first_list is my_second_list:', my_first_list is my_second_list)\n", + "print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\nlist4: {}\\n'\n", + " .format(id(list1), id(list2), id(list3), id(list4)))\n", "\n", - "my_third_list = my_first_list\n", - "print('my_first_list == my_third_list:', my_first_list == my_third_list)\n", - "print('my_first_list is my_third_list:', my_first_list is my_third_list)\n", + "list2[0] = 3\n", + "print('list1:', list1)\n", "\n", - "my_shallow_copy = my_first_list[:]\n", - "print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n", - "print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n", - "\n", - "my_deep_copy = deepcopy(my_first_list)\n", - "print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n", - "print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n", - "\n", - "print('\\nmy_third_list:', my_third_list)\n", - "print('my_shallow_copy:', my_shallow_copy)\n", - "print('my_deep_copy:', my_deep_copy)\n", - "\n", - "my_first_list[0][0] = 2\n", - "print('after setting \"my_first_list[0][0] = 2\"')\n", - "print('my_third_list:', my_third_list)\n", - "print('my_shallow_copy:', my_shallow_copy)\n", - "print('my_deep_copy:', my_deep_copy)" + "list3[0] = 4\n", + "list4[1] = 4\n", + "print('list1:', list1)" ], "language": "python", "metadata": {}, @@ -543,26 +581,70 @@ "output_type": "stream", "stream": "stdout", "text": [ - "my_first_list == my_second_list: True\n", - "my_first_list is my_second_list: False\n", - 
"my_first_list == my_third_list: True\n", - "my_first_list is my_third_list: True\n", - "my_first_list == my_shallow_copy: True\n", - "my_first_list is my_shallow_copy: False\n", - "my_first_list == my_deep_copy: True\n", - "my_first_list is my_deep_copy: False\n", + "IDs:\n", + "list1: 4377955288\n", + "list2: 4377955288\n", + "list3: 4377955432\n", + "list4: 4377954784\n", "\n", - "my_third_list: [[1], [2]]\n", - "my_shallow_copy: [[1], [2]]\n", - "my_deep_copy: [[1], [2]]\n", - "after setting \"my_first_list[0][0] = 2\"\n", - "my_third_list: [[2], [2]]\n", - "my_shallow_copy: [[2], [2]]\n", - "my_deep_copy: [[1], [2]]\n" + "list1: [3, 2]\n", + "list1: [3, 2]\n" ] } ], - "prompt_number": 7 + "prompt_number": 23 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Deep copy** \n", + "As we have seen above, a shallow copy works fine if we want to create a new list with contents of the original list which we want to modify independently. \n", + "\n", + "However, if we are dealing with compound objects (e.g., lists that contain other lists, [read here](https://docs.python.org/2/library/copy.html) for more information) it becomes a little trickier.\n", + "\n", + "In the case of compound objects, a shallow copy would create a new compound object, but it would just insert the references to the contained objects into the new compound object. In contrast, a deep copy would go \"deeper\" and create also new objects \n", + "for the objects found in the original compound object. 
\n", + "If you follow the code, the concept should become clearer:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from copy import deepcopy\n", + "\n", + "list1 = [[1],[2]]\n", + "list2 = list1.copy() # shallow copy\n", + "list3 = deepcopy(list1) # deep copy\n", + "\n", + "print('IDs:\\nlist1: {}\\nlist2: {}\\nlist3: {}\\n'\n", + " .format(id(list1), id(list2), id(list3)))\n", + "\n", + "list2[0][0] = 3\n", + "print('list1:', list1)\n", + "\n", + "list3[0][0] = 5\n", + "print('list1:', list1)" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "IDs:\n", + "list1: 4377956296\n", + "list2: 4377961752\n", + "list3: 4377954928\n", + "\n", + "list1: [[3], [2]]\n", + "list1: [[3], [2]]\n" + ] + } + ], + "prompt_number": 25 }, { "cell_type": "markdown", @@ -752,7 +834,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Be aware using `in` checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"." + "Be aware of what is happening when combining \"`in`\" checks with generators, since they won't evaluate from the beginning once a position is \"consumed\"." ] }, { @@ -783,17 +865,18 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**We can circumvent this problem by using a simple list, though:**" + "Although this defeats the purpose of a generator (in most cases), we can convert a generator into a list to circumvent the problem. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ - "l = [i for i in range(5)]\n", - "print('2 in l,', 2 in l)\n", - "print('3 in l,', 3 in l)\n", - "print('1 in l,', 1 in l) " + "gen = (i for i in range(5))\n", + "a_list = list(gen)\n", + "print('2 in l,', 2 in a_list)\n", + "print('3 in l,', 3 in a_list)\n", + "print('1 in l,', 1 in a_list) " ], "language": "python", "metadata": {}, @@ -808,7 +891,7 @@ ] } ], - "prompt_number": 10 + "prompt_number": 27 }, { "cell_type": "markdown",