python_reference/not_so_obvious_python_stuff.ipynb

{
 "metadata": {
  "name": "",
  "signature": "sha256:faa74a34746bf250ef2d72e308074083ee5e60789203d70f630f8c67a709e6fe"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sebastian Raschka  \n",
      "last updated: 04/15/2014\n",
      "\n",
      "[Link to this IPython Notebook on GitHub](https://github.com/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "All code was executed in Python 3.4"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# A collection of not so obvious Python stuff you should know!"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Sections\n",
      "- [The C3 class resolution algorithm for multiple class inheritance](#c3_class_res)\n",
      "- [The behavior of += for lists](#pm_in_lists)\n",
      "- [`True` and `False` in the datetime module](#datetime_module)\n",
      "- [Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity](#python_small_int)\n",
      "- [Shallow vs. deep copies if list contains other structures and objects](#shallow_vs_deep)\n",
      "- [Picking True values from and and or expressions](#false_true_expressions)\n",
      "- [Don't use mutable objects as default arguments for functions!](#def_mutable_func)\n",
      "- [Be aware of the consuming generator](#consuming_generators)\n",
      "- [`bool` is a subclass of `int`](#bool_int)\n",
      "- [About lambda and closures-in-a-loop pitfall](#lambda_closure)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='c3_class_res'></a>\n",
      "## The C3 class resolution algorithm for multiple class inheritance"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "With multiple inheritance, Python determines the order in which base classes are searched for a method (the method resolution order, MRO) with the C3 linearization algorithm. For a class defined as `class C(A, B)`, this means that \"class A should be checked before class B\".\n",
      "\n",
      "If you want to learn more, please read the [original blog post](http://python-history.blogspot.ru/2010/06/method-resolution-order.html) by Guido van Rossum.\n",
      "\n",
      "(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class A(object):\n",
      "    def foo(self):\n",
      "        print(\"class A\")\n",
      "\n",
      "class B(object):\n",
      "    def foo(self):\n",
      "        print(\"class B\")\n",
      "\n",
      "class C(A, B):\n",
      "    pass\n",
      "\n",
      "C().foo()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "class A\n"
       ]
      }
     ],
     "prompt_number": 2
    },
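    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a small sketch, the order that C3 produces can be inspected through a class's `__mro__` attribute. The classes from the example above are redefined here so that the cell runs on its own; class `D` and the helper `mro_names` are made up for this illustration."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class A(object):\n",
      "    def foo(self):\n",
      "        print('class A')\n",
      "\n",
      "class B(object):\n",
      "    def foo(self):\n",
      "        print('class B')\n",
      "\n",
      "class C(A, B):\n",
      "    pass\n",
      "\n",
      "class D(B, A):  # hypothetical class: same bases, listed in reverse order\n",
      "    pass\n",
      "\n",
      "def mro_names(cls):\n",
      "    # hypothetical helper: class names in method resolution order\n",
      "    return [k.__name__ for k in cls.__mro__]\n",
      "\n",
      "print(mro_names(C))  # ['C', 'A', 'B', 'object']\n",
      "print(mro_names(D))  # ['D', 'B', 'A', 'object']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },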
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='pm_in_lists'></a>\n",
      "## The behavior of `+=` for lists"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "If we use the `+=` operator on a list, we extend the list by modifying the object in place. However, if we use the assignment `my_list = my_list + ...`, we create a new list object and rebind the name to it, which the following code demonstrates via `id()`:\n",
      "\n",
      "(Original source: [http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists](http://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "list_a = []\n",
      "print('ID of list_a', id(list_a))\n",
      "list_a += [1]\n",
      "print('ID of list_a after `+= [1]`', id(list_a))\n",
      "list_a = list_a + [2]\n",
      "print('ID of list_a after `list_a = list_a + [2]`', id(list_a))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "ID of list_a 4356439144\n",
        "ID of list_a after `+= [1]` 4356439144\n",
        "ID of list_a after `list_a = list_a + [2]` 4356446112\n"
       ]
      }
     ],
     "prompt_number": 3
    },
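    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The difference matters as soon as a second name refers to the same list. The following sketch (the name `alias` is made up for this illustration) shows that `+=` is visible through the second name, while `list_a = list_a + ...` is not."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "list_a = [0]\n",
      "alias = list_a           # second name for the same list object\n",
      "\n",
      "list_a += [1]            # in-place extend: the alias sees the change\n",
      "print(alias)             # [0, 1]\n",
      "\n",
      "list_a = list_a + [2]    # builds a new list and rebinds the name list_a\n",
      "print(alias)             # [0, 1]\n",
      "print(list_a)            # [0, 1, 2]\n",
      "print(alias is list_a)   # False"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },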
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='datetime_module'></a>\n",
      "## `True` and `False` in the datetime module\n",
      "\n",
      "\"it often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that,\n",
      "unlike any other time value, midnight (i.e. datetime.time(0,0,0)) is False.\n",
      "A long discussion on the python-ideas mailing list shows that, while surprising,\n",
      "that behavior is desirable\u2014at least in some quarters.\"\n",
      "\n",
      "(Original source: [http://lwn.net/SubscriberLink/590299/bf73fe823974acea/](http://lwn.net/SubscriberLink/590299/bf73fe823974acea/))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import datetime\n",
      "\n",
      "print('\"datetime.time(0,0,0)\" (Midnight) evaluates to', bool(datetime.time(0,0,0)))\n",
      "\n",
      "print('\"datetime.time(1,0,0)\" (1 am) evaluates to', bool(datetime.time(1,0,0)))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\"datetime.time(0,0,0)\" (Midnight) evaluates to False\n",
        "\"datetime.time(1,0,0)\" (1 am) evaluates to True\n"
       ]
      }
     ],
     "prompt_number": 4
    },
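    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A common way to sidestep this surprise is to compare against `None` explicitly instead of relying on the truth value of a time object (from Python 3.5 on, midnight is no longer considered false, so the output above depends on the Python version). The function `describe` below is a hypothetical helper, written only to illustrate the pattern."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import datetime\n",
      "\n",
      "def describe(t=None):\n",
      "    # hypothetical helper: test for None explicitly, not for truthiness\n",
      "    if t is None:\n",
      "        return 'no time given'\n",
      "    return 'time given: ' + t.isoformat()\n",
      "\n",
      "print(describe())                           # no time given\n",
      "print(describe(datetime.time(0, 0, 0)))     # time given: 00:00:00"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },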
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='python_small_int'></a>\n",
      "## Python reuses objects for small integers - always use \"==\" for equality, \"is\" for identity\n",
      "\n",
      "This oddity occurs because CPython caches small integers (-5 through 256) and reuses the same object for them, whereas larger integers usually get a new object each time they are created!  \n",
      "(*I received a comment that this is in fact a CPython artefact and is not necessarily true in all implementations of Python!*)\n",
      "\n",
      "So the take-home message is: always use \"==\" for equality, \"is\" for identity!"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "a = 1\n",
      "b = 1\n",
      "print('a is b', bool(a is b))\n",
      "\n",
      "a = 999\n",
      "b = 999\n",
      "print('a is b', bool(a is b))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "a is b True\n",
        "a is b False\n"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### Another popular example to illustrate the reuse of objects for small integers is:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "print(256 is 257 - 1)\n",
      "print(257 is 258 - 1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "True\n",
        "False\n"
       ]
      }
     ],
     "prompt_number": 2
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### And to illustrate the test for equality (`==`) vs. identity (`is`):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "a = 'hello world!'\n",
      "b = 'hello world!'\n",
      "print('a is b,', a is b)\n",
      "print('a == b,', a == b)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "a is b, False\n",
        "a == b, True\n"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### And this example shows that identity (`is`) does not necessarily imply equality (`==`), since `nan` never compares equal to anything, including itself:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "a = float('nan')\n",
      "print('a == a,', a == a)\n",
      "print('a is a,', a is a)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "a == a, False\n",
        "a is a, True\n"
       ]
      }
     ],
     "prompt_number": 7
    },
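    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One consequence worth knowing: containers such as lists check identity before equality when they compare or search their elements. The sketch below uses the made-up names `nan` and `container` to show how this interacts with `nan`."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "nan = float('nan')\n",
      "container = [nan]\n",
      "\n",
      "print(nan == nan)                   # False\n",
      "print(nan in container)             # True, found via identity\n",
      "print(container == [nan])           # True, the same object is in both lists\n",
      "print(float('nan') in container)    # False, a different nan object"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },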
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='shallow_vs_deep'></a>\n",
      "## Shallow vs. deep copies if list contains other structures and objects\n",
      "\n",
      "If a list contains compound objects, modifying one of those nested objects through the original list also affects shallow copies, because a shallow copy holds references to the same inner objects. A deep copy duplicates the inner objects recursively and is therefore not affected."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from copy import deepcopy\n",
      "\n",
      "my_first_list = [[1],[2]]\n",
      "my_second_list = [[1],[2]]\n",
      "print('my_first_list == my_second_list:', my_first_list == my_second_list)\n",
      "print('my_first_list is my_second_list:', my_first_list is my_second_list)\n",
      "\n",
      "my_third_list = my_first_list\n",
      "print('my_first_list == my_third_list:', my_first_list == my_third_list)\n",
      "print('my_first_list is my_third_list:', my_first_list is my_third_list)\n",
      "\n",
      "my_shallow_copy = my_first_list[:]\n",
      "print('my_first_list == my_shallow_copy:', my_first_list == my_shallow_copy)\n",
      "print('my_first_list is my_shallow_copy:', my_first_list is my_shallow_copy)\n",
      "\n",
      "my_deep_copy = deepcopy(my_first_list)\n",
      "print('my_first_list == my_deep_copy:', my_first_list == my_deep_copy)\n",
      "print('my_first_list is my_deep_copy:', my_first_list is my_deep_copy)\n",
      "\n",
      "print('\\nmy_third_list:', my_third_list)\n",
      "print('my_shallow_copy:', my_shallow_copy)\n",
      "print('my_deep_copy:', my_deep_copy)\n",
      "\n",
      "my_first_list[0][0] = 2\n",
      "print('after setting \"my_first_list[0][0] = 2\"')\n",
      "print('my_third_list:', my_third_list)\n",
      "print('my_shallow_copy:', my_shallow_copy)\n",
      "print('my_deep_copy:', my_deep_copy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "my_first_list == my_second_list: True\n",
        "my_first_list is my_second_list: False\n",
        "my_first_list == my_third_list: True\n",
        "my_first_list is my_third_list: True\n",
        "my_first_list == my_shallow_copy: True\n",
        "my_first_list is my_shallow_copy: False\n",
        "my_first_list == my_deep_copy: True\n",
        "my_first_list is my_deep_copy: False\n",
        "\n",
        "my_third_list: [[1], [2]]\n",
        "my_shallow_copy: [[1], [2]]\n",
        "my_deep_copy: [[1], [2]]\n",
        "after setting \"my_first_list[0][0] = 2\"\n",
        "my_third_list: [[2], [2]]\n",
        "my_shallow_copy: [[2], [2]]\n",
        "my_deep_copy: [[1], [2]]\n"
       ]
      }
     ],
     "prompt_number": 7
    },
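    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To make the distinction a bit more concrete, here is a minimal sketch (with made-up variable names) that separates a change to the outer list from a change to a nested list. Only the nested change shows up in the shallow copy."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import copy\n",
      "\n",
      "original = [[1], [2]]\n",
      "shallow = copy.copy(original)       # new outer list, shared inner lists\n",
      "\n",
      "original.append([3])                # outer change: not visible in the copy\n",
      "original[0].append(99)              # nested change: visible in the copy\n",
      "\n",
      "print(original)                     # [[1, 99], [2], [3]]\n",
      "print(shallow)                      # [[1, 99], [2]]\n",
      "print(shallow[0] is original[0])    # True"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },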
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='false_true_expressions'></a>\n",
      "## Picking `True` values from `and` and `or` expressions\n",
      "\n",
      "If both operands of an `or` expression are truthy, Python returns the first one; in an `and` expression, it returns the second one. Both operators return one of their operands rather than a plain `True` or `False`.\n",
      "\n",
      "(Original source: [http://gistroll.com/rolls/21/horizontal_assessments/new](http://gistroll.com/rolls/21/horizontal_assessments/new))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "result = (2 or 3) * (5 and 7)\n",
      "print('2 * 7 =', result)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2 * 7 = 14\n"
       ]
      }
     ],
     "prompt_number": 9
    },
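    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a short illustration of which operand comes back in each case, the sketch below collects a few expressions in a list (the name `results` is made up for this example). `or` returns the first truthy operand, or the last operand if none is truthy; `and` returns the first falsy operand, or the last operand if all are truthy."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "results = [\n",
      "    2 or 3,            # 2: first operand is truthy\n",
      "    0 or 3,            # 3: first operand is falsy\n",
      "    5 and 7,           # 7: both truthy, last operand is returned\n",
      "    0 and 7,           # 0: first falsy operand is returned\n",
      "    '' or 'fallback',  # 'fallback': the empty string is falsy\n",
      "]\n",
      "print(results)         # [2, 3, 7, 0, 'fallback']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },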
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "And a fun fact"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='def_mutable_func'></a>\n",
      "## Don't use mutable objects as default arguments for functions!"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time we call the function without providing an argument for the default parameter, but this is not the case: Python evaluates the default value only once, when the function is defined, and the same object is then shared by all calls, as the following code shows:\n",
      "\n",
      "(Original source: [http://docs.python-guide.org/en/latest/writing/gotchas/](http://docs.python-guide.org/en/latest/writing/gotchas/))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def append_to_list(value, def_list=[]):\n",
      "    def_list.append(value)\n",
      "    return def_list\n",
      "\n",
      "my_list = append_to_list(1)\n",
      "print(my_list)\n",
      "\n",
      "my_other_list = append_to_list(2)\n",
      "print(my_other_list)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "[1]\n",
        "[1, 2]\n"
       ]
      }
     ],
     "prompt_number": 1
    },
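    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A common workaround is to use `None` as the default and create the list inside the function. The following is a minimal sketch of that idiom, reusing the function name from the example above:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def append_to_list(value, def_list=None):\n",
      "    # a fresh list is created on every call that omits def_list\n",
      "    if def_list is None:\n",
      "        def_list = []\n",
      "    def_list.append(value)\n",
      "    return def_list\n",
      "\n",
      "my_list = append_to_list(1)\n",
      "print(my_list)          # [1]\n",
      "\n",
      "my_other_list = append_to_list(2)\n",
      "print(my_other_list)    # [2]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },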
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='consuming_generators'></a>\n",
      "\n",
      "## Be aware of the consuming generator\n",
      "\n",
      "Be careful with `in` checks on generators: a membership test consumes items until it finds a match, and the generator does not start over from the beginning afterwards, so values that were already \"consumed\" can no longer be found."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "gen = (i for i in range(5))\n",
      "print('2 in gen,', 2 in gen)\n",
      "print('3 in gen,', 3 in gen)\n",
      "print('1 in gen,', 1 in gen) "
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2 in gen, True\n",
        "3 in gen, True\n",
        "1 in gen, False\n"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**We can circumvent this problem by using a simple list, though:**"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "l = [i for i in range(5)]\n",
      "print('2 in l,', 2 in l)\n",
      "print('3 in l,', 3 in l)\n",
      "print('1 in l,', 1 in l) "
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2 in l, True\n",
        "3 in l, True\n",
        "1 in l, True\n"
       ]
      }
     ],
     "prompt_number": 10
    },
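    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To see what a membership test actually leaves behind, the sketch below (with the made-up names `found` and `remaining`) collects the rest of the generator after a single `in` check."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "gen = (i for i in range(5))\n",
      "found = 2 in gen          # consumes 0, 1 and 2\n",
      "remaining = list(gen)     # whatever has not been consumed yet\n",
      "\n",
      "print(found)              # True\n",
      "print(remaining)          # [3, 4]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },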
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='bool_int'></a>\n",
      "\n",
      "## `bool` is a subclass of `int`\n",
      "\n",
      "Chicken or egg? Historically (in Python 2.2, to be specific), truth values were implemented via 1 and 0 (similar to old C). To avoid breaking old (but perfectly working) code, `bool` was added as a subclass of `int` in Python 2.3.\n",
      "\n",
      "Original source: [http://www.peterbe.com/plog/bool-is-int](http://www.peterbe.com/plog/bool-is-int)"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "print('isinstance(True, int):', isinstance(True, int))\n",
      "print('True + True:', True + True)\n",
      "print('3*True:', 3*True)\n",
      "print('3*True - False:', 3*True - False)\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "isinstance(True, int): True\n",
        "True + True: 2\n",
        "3*True: 3\n",
        "3*True - False: 3\n"
       ]
      }
     ],
     "prompt_number": 16
    },
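    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Two practical consequences, sketched with made-up example data: booleans can be summed to count matches, and `True` and `1` are treated as the same dictionary key because they compare equal and hash to the same value."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "values = [3, -1, 0, 8, -5]\n",
      "n_positive = sum(v > 0 for v in values)   # each True counts as 1\n",
      "print(n_positive)                          # 2\n",
      "\n",
      "lookup = {1: 'one'}\n",
      "lookup[True] = 'true'                      # overwrites the value stored under 1\n",
      "print(lookup)                              # {1: 'true'}"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },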
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<br>\n",
      "<br>\n",
      "<a name='lambda_closure'></a>\n",
      "\n",
      "## About lambda and closures-in-a-loop pitfall\n",
      "\n",
      "The following example illustrates that every `lambda` looks up `i` only when it is called. By then the loop has finished, so all of them return the last value of `i`:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "my_list = [lambda: i for i in range(5)]\n",
      "for l in my_list:\n",
      "    print(l())"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "4\n",
        "4\n",
        "4\n",
        "4\n",
        "4\n"
       ]
      }
     ],
     "prompt_number": 24
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**Here, a generator can save you some pain, but only because each `lambda` is called before the generator advances to the next value:**"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "my_gen = (lambda: n for n in range(5))\n",
      "for l in my_gen:\n",
      "    print(l())"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "0\n",
        "1\n",
        "2\n",
        "3\n",
        "4\n"
       ]
      }
     ],
     "prompt_number": 25
    },
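    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The generator only helps as long as it is consumed lazily. As a hedged sketch (the names `late` and `bound` are made up for this illustration), the first line below materializes the generator before calling the lambdas, which brings the pitfall back. The second line shows one common fix: binding the current value as a default argument when each `lambda` is created."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# all lambdas are created first and called afterwards: n is already 4\n",
      "late = [f() for f in list(lambda: n for n in range(5))]\n",
      "print(late)     # [4, 4, 4, 4, 4]\n",
      "\n",
      "# the default argument i=i captures the value at creation time\n",
      "bound = [f() for f in [lambda i=i: i for i in range(5)]]\n",
      "print(bound)    # [0, 1, 2, 3, 4]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },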
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}