python_reference/benchmarks/timeit_tests.ipynb

{
"metadata": {
"name": "",
"signature": "sha256:a4749ce2a9f843d9846081abaa9265690ebabfaf5a0aa18877f65945b1f56805"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Sebastian Raschka](http://sebastianraschka.com)\n",
"last updated: 05/07/2014 \n",
"\n",
"- [Link to this IPython Notebook on GitHub](https://github.com/rasbt/python_reference/blob/master/benchmarks/timeit_tests.ipynb) \n",
"- [Link to the GitHub repository](https://github.com/rasbt/python_reference) \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>\n",
"I am really looking forward to your comments and suggestions to improve and extend this collection! Just send me a quick note \n",
"via Twitter: [&#64;rasbt](https://twitter.com/rasbt) \n",
"or Email: [bluewoodtree@gmail.com](mailto:bluewoodtree@gmail.com)\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python benchmarks via `timeit`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Code was executed in Python 3.4.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"sections\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sections\n",
"- [String operations](#string_operations)\n",
"    - [String formatting: `.format()` vs. binary operator `%s`](#str_format_bin)\n",
"    - [String reversing: `[::-1]` vs. `''.join(reversed())`](#str_reverse)\n",
"    - [String concatenation: `+=` vs. `''.join()`](#string_concat)\n",
"    - [Assembling strings](#string_assembly) \n",
"    - [Testing if a string is an integer](#is_integer)\n",
"    - [Testing if a string is a number](#is_number)\n",
"- [List operations](#list_operations)\n",
"    - [List reversing: `[::-1]` vs. `reverse()` vs. `reversed()`](#list_reverse)\n",
"    - [Creating lists using conditional statements](#create_cond_list)\n",
"- [Dictionary operations](#dict_ops) \n",
"    - [Adding elements to a dictionary](#adding_dict_elements)\n",
"- [Comprehensions vs. for-loops](#comprehensions)\n",
"- [Copying files by searching directory trees](#find_copy)\n",
"- [Returning column vectors slicing through a numpy array](#row_vectors)\n",
"- [Speed of numpy functions vs Python built-ins and std. lib.](#numpy)\n",
"    - [`sum()` vs. `numpy.sum()`](#np_sum)\n",
"    - [`range()` vs. `numpy.arange()`](#np_arange)\n",
"    - [`statistics.mean()` vs. `numpy.mean()`](#np_mean)\n",
"- [Cython vs. regular (C)Python](#cython)\n",
"- [Numba vs. Cython vs. regular (C)Python & NumPy](#numba)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='string_operations'></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# String operations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='str_format_bin'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## String formatting: `.format()` vs. binary operator `%s`\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We expect the string `.format()` method to be slower than the binary `%` operator, because `.format()` delegates the formatting to each object's own `__format__` method, whereas formatting via `%` is hard-coded for known types. But let's see how big the difference really is..."
]
},
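{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dispatch difference described above can be illustrated with a small sketch. The class `Probe` is hypothetical and only serves this example; it is not part of the benchmark. `'{}'.format(obj)` goes through the object's `__format__` method, whereas `'%s' % obj` converts the object via `str()`:\n",
"\n",
"```python\n",
"class Probe:\n",
"    # hypothetical helper class, used only to show which hook is called\n",
"    def __format__(self, spec):\n",
"        return 'via __format__'\n",
"\n",
"    def __str__(self):\n",
"        return 'via __str__'\n",
"\n",
"print('{}'.format(Probe()))  # prints: via __format__\n",
"print('%s' % Probe())        # prints: via __str__\n",
"```\n",
"\n",
"This only shows which method is invoked; it says nothing about how much time each path costs, which is what the measurements are for."
]
},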
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"n = 10000\n",
"\n",
"def test_format(n):\n",
"    return ['{}'.format(i) for i in range(n)]\n",
"\n",
"def test_binaryop(n):\n",
"    return ['%s' %i for i in range(n)]\n",
"\n",
"%timeit test_format(n)\n",
"%timeit test_binaryop(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100 loops, best of 3: 4.01 ms per loop\n",
"1000 loops, best of 3: 1.82 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 131
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['test_format', 'test_binaryop']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
"    for f in funcs:\n",
"        times_n[f].append(min(timeit.Timer('%s(n)' %f, \n",
"                      'from __main__ import %s, n' %f)\n",
"                              .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 132
},
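{
"cell_type": "markdown",
"metadata": {},
"source": [
"The timing loop above keeps the minimum of three repeats, which mirrors the \"best of 3\" that `%timeit` reports. As a minimal sketch of the same measurement outside of IPython (the helper name `best_time_ms` is an assumption made for this example, not something the notebook defines):\n",
"\n",
"```python\n",
"import timeit\n",
"\n",
"def test_binaryop(n):\n",
"    return ['%s' % i for i in range(n)]\n",
"\n",
"def best_time_ms(func, arg, repeat=3, number=1000):\n",
"    # hypothetical helper: best-of-`repeat` time per call, in milliseconds\n",
"    timer = timeit.Timer(lambda: func(arg))\n",
"    totals = timer.repeat(repeat=repeat, number=number)  # seconds per `number` calls\n",
"    return min(totals) / number * 1000\n",
"\n",
"print(best_time_ms(test_binaryop, 100))\n",
"```\n",
"\n",
"With `number=1000`, the total in seconds equals the time per call in milliseconds, which is why the raw totals can be plotted on an axis labeled in milliseconds. Note that the `lambda` adds a small call overhead that the string-based `Timer` setup in the loop avoids."
]
},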
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('test_format', '.format() method'), \n",
" ('test_binaryop', 'binary operator %')] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different string formatting methods')\n",
2014-05-07 22:38:42 +00:00
"max_perf = max( f/b for f,b in zip(times_n['test_format'],\n",
" times_n['test_binaryop']) )\n",
"min_perf = min( f/b for f,b in zip(times_n['test_format'],\n",
" times_n['test_binaryop']) )\n",
" \n",
"ftext = 'The binary op. % is {:.2f}x to {:.2f}x faster than .format()'\\\n",
" .format(min_perf, max_perf) \n",
2014-05-07 07:04:41 +00:00
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
2014-05-07 22:38:42 +00:00
"\n",
"\n",
2014-05-07 07:04:41 +00:00
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "(base64-encoded PNG plot output omitted)",
"text": [
"<matplotlib.figure.Figure at 0x107efc590>"
]
}
],
"prompt_number": 135
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='str_reverse'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## String reversing: `[::-1]` vs. `''.join(reversed())`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def reverse_join(my_str):\n",
"    return ''.join(reversed(my_str))\n",
"\n",
"def reverse_slizing(my_str):\n",
"    return my_str[::-1]\n",
"\n",
"test_str = 'abcdefg'\n",
"\n",
"# Test to show that both work\n",
"a = reverse_join(test_str)\n",
"b = reverse_slizing(test_str)\n",
"assert(a == b and a == 'gfedcba')\n",
"\n",
"%timeit reverse_join(test_str)\n",
"%timeit reverse_slizing(test_str)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000000 loops, best of 3: 1.33 \u00b5s per loop\n",
"1000000 loops, best of 3: 268 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['reverse_join', 'reverse_slizing']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"test_strings = (test_str*n for n in orders_n)\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for st,n in zip(test_strings, orders_n):\n",
"    for f in funcs:\n",
"        times_n[f].append(min(timeit.Timer('%s(st)' %f, \n",
"                      'from __main__ import %s, st' %f)\n",
"                              .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('reverse_join', '\"\".join(reversed(my_str))'), \n",
"          ('reverse_slizing', 'my_str[::-1]')] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
"    plt.plot([n*len(test_str) for n in orders_n], \n",
"             times_n[lb[0]], alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different string reversing methods')\n",
"max_perf = max( j/s for j,s in zip(times_n['reverse_join'],\n",
"                                   times_n['reverse_slizing']) )\n",
"min_perf = min( j/s for j,s in zip(times_n['reverse_join'],\n",
"                                   times_n['reverse_slizing']) )\n",
"\n",
"ftext = 'my_str[::-1] is {:.2f}x to {:.2f}x faster than \"\".join(reversed(my_str))'\\\n",
"        .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "(base64-encoded PNG plot output omitted)",
"text": [
"<matplotlib.figure.Figure at 0x105a6fe90>"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='string_concat'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## String concatenation: `+=` vs. `''.join()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Strings in Python are immutable objects. So, each time we append a character to a string, a new string object has to be created \u201cfrom scratch\u201d in memory. Thus, the answer to the question \u201cWhat is the most efficient way to concatenate strings?\u201d is quite obvious, but the relative performance gains are nonetheless interesting."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def string_add(in_chars):\n",
"    new_str = ''\n",
"    for char in in_chars:\n",
"        new_str += char\n",
"    return new_str\n",
"\n",
"def string_join(in_chars):\n",
"    return ''.join(in_chars)\n",
"\n",
"test_chars = ['a', 'b', 'c', 'd', 'e', 'f']\n",
"\n",
"%timeit string_add(test_chars)\n",
"%timeit string_join(test_chars)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000000 loops, best of 3: 764 ns per loop\n",
"1000000 loops, best of 3: 321 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['string_add', 'string_join']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"test_chars_n = (test_chars*n for n in orders_n)\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for st,n in zip(test_chars_n, orders_n):\n",
"    for f in funcs:\n",
"        times_n[f].append(min(timeit.Timer('%s(st)' %f, \n",
"                      'from __main__ import %s, st' %f)\n",
"                              .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('string_add', 'new_str += char'), \n",
"          ('string_join', '\"\".join(chars)')] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
"    plt.plot([len(test_chars)*n for n in orders_n], \n",
"             times_n[lb[0]], alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different string concatenation methods')\n",
"max_perf = max( a/j for a,j in zip(times_n['string_add'],\n",
"                                   times_n['string_join']) )\n",
"min_perf = min( a/j for a,j in zip(times_n['string_add'],\n",
"                                   times_n['string_join']) )\n",
"\n",
"ftext = '\"\".join(chars) is {:.2f}x to {:.2f}x faster than new_str += char'\\\n",
"        .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "(base64-encoded PNG plot output omitted)",
"text": [
"<matplotlib.figure.Figure at 0x106fa92d0>"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"<a name='string_assembly'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Assembling strings\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, I wanted to compare different methods of string \u201cassembly.\u201d This is different from simple string concatenation, which we have seen in the previous section, since we insert values into a string, e.g., from a variable."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"n = 1000\n",
"\n",
"def plus_operator(n):\n",
"    my_str = 'a'\n",
"    for i in range(n):\n",
"        my_str = my_str + str(1) + str(2)\n",
"    return my_str\n",
"\n",
"def format_method(n):\n",
"    my_str = 'a'\n",
"    for i in range(n):\n",
"        my_str = '{}{}{}'.format(my_str,1,2)\n",
"    return my_str\n",
"\n",
"def binary_operator(n):\n",
"    my_str = 'a'\n",
"    for i in range(n):\n",
"        my_str = '%s%s%s' %(my_str,1,2)\n",
"    return my_str\n",
"\n",
"%timeit plus_operator(n)\n",
"%timeit format_method(n)\n",
"%timeit binary_operator(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000 loops, best of 3: 869 \u00b5s per loop\n",
"1000 loops, best of 3: 686 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 445 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 21
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['plus_operator', 'format_method', 'binary_operator']\n",
"\n",
"orders_n = [10**n for n in range(1, 5)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
"    for f in funcs:\n",
"        times_n[f].append(min(timeit.Timer('%s(n)' %f, \n",
"                      'from __main__ import %s, n' %f)\n",
"                              .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('plus_operator', 'my_str + str(1) + str(2)'), \n",
"          ('format_method', '\"{}{}{}\".format(my_str,1,2)'),\n",
"          ('binary_operator', '\"%s%s%s\" %(my_str,1,2)'),\n",
"         ] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
"    plt.plot(orders_n, times_n[lb[0]], \n",
"             alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different string assembly methods')\n",
"\n",
"max_perf = max( p/b for p,b in zip(times_n['plus_operator'],\n",
"                                   times_n['binary_operator']) )\n",
"min_perf = min( p/b for p,b in zip(times_n['plus_operator'],\n",
"                                   times_n['binary_operator']) )\n",
"\n",
"ftext = '\"%s%s%s\" %(my_str,1,2) is {:.2f}x to '\\\n",
"        '{:.2f}x faster than my_str + str(1) + str(2)'\\\n",
"        .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x105595650>"
]
}
],
"prompt_number": 27
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"<a name='is_integer'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Testing if a string is an integer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def string_is_int(a_str):\n",
"    try:\n",
"        int(a_str)\n",
"        return True\n",
"    except ValueError:\n",
"        return False\n",
"\n",
"an_int = '123'\n",
"no_int = '123abc'\n",
"\n",
"%timeit string_is_int(an_int)\n",
"%timeit string_is_int(no_int)\n",
"%timeit an_int.isdigit()\n",
"%timeit no_int.isdigit()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000000 loops, best of 3: 420 ns per loop\n",
"100000 loops, best of 3: 2.83 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000000 loops, best of 3: 116 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000000 loops, best of 3: 119 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 28
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['string_is_int', 'isdigit']\n",
"t1 = '123'\n",
"t2 = '123abc'\n",
"isdigit_method = []\n",
"string_is_int_method = []\n",
"\n",
"for t in [t1,t2]:\n",
"    string_is_int_method.append(min(timeit.Timer('string_is_int(t)', \n",
"            'from __main__ import string_is_int, t')\n",
"            .repeat(repeat=3, number=1000000)))\n",
"    isdigit_method.append(min(timeit.Timer('t.isdigit()', \n",
"            'from __main__ import t')\n",
"            .repeat(repeat=3, number=1000000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 52
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"N = len(isdigit_method)\n",
"ind = np.arange(N) # the x locations for the groups\n",
"width = 0.25 # the width of the bars\n",
"\n",
" \n",
"fig, ax = plt.subplots()\n",
"plt.bar(ind, \n",
" [i for i in string_is_int_method], \n",
" width,\n",
" alpha=0.5,\n",
" color='g',\n",
" label='string_is_int(a_str)')\n",
"\n",
"plt.bar(ind + width, \n",
" [i for i in isdigit_method], \n",
" width,\n",
" alpha=0.5,\n",
" color='b',\n",
" label='a_str.isdigit()')\n",
" \n",
"ax.set_ylabel('time in microseconds')\n",
"ax.set_title('Time to check if a string is an integer')\n",
"ax.set_xticks(ind + width)\n",
"ax.set_xticklabels(['\"%s\"' %t for t in [t1, t2]])\n",
"plt.xlabel('test strings')\n",
"plt.xlim(-0.1,1.6)\n",
"#plt.ylim(0,15)\n",
"plt.legend(loc='upper left')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x107b9de10>"
]
}
],
"prompt_number": 90
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='is_number'></a>\n",
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Testing if a string is a number"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def string_is_number(a_str):\n",
"    try:\n",
"        float(a_str)\n",
"        return True\n",
"    except ValueError:\n",
"        return False\n",
"\n",
"a_float = '1.234'\n",
"no_float = '123abc'\n",
"\n",
"a_float.replace('.','',1).isdigit()\n",
"no_float.replace('.','',1).isdigit()\n",
"\n",
"%timeit string_is_number(a_float)\n",
"%timeit string_is_number(no_float)\n",
"%timeit a_float.replace('.','',1).isdigit()\n",
"%timeit no_float.replace('.','',1).isdigit()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000000 loops, best of 3: 418 ns per loop\n",
"1000000 loops, best of 3: 1.28 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000000 loops, best of 3: 500 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000000 loops, best of 3: 427 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 91
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = [\"string_is_number\", \"replace('.','',1).isdigit()\"]\n",
"t1 = '1.234'\n",
"t2 = '123abc'\n",
"isdigit_method = []\n",
"string_is_number_method = []\n",
"\n",
"for t in [t1,t2]:\n",
"    string_is_number_method.append(min(timeit.Timer('string_is_number(t)', \n",
"            'from __main__ import string_is_number, t')\n",
"            .repeat(repeat=3, number=1000000)))\n",
"    isdigit_method.append(min(timeit.Timer(\"t.replace('.','',1).isdigit()\", \n",
"            'from __main__ import t')\n",
"            .repeat(repeat=3, number=1000000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 96
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"N = len(isdigit_method)\n",
"ind = np.arange(N)  # the x locations for the groups\n",
"width = 0.25        # the width of the bars\n",
"\n",
"\n",
"fig, ax = plt.subplots()\n",
"\n",
"plt.bar(ind, \n",
"        [i for i in isdigit_method], \n",
"        width,\n",
"        alpha=0.5,\n",
"        color='b',\n",
"        label=\"a_str.replace('.','',1).isdigit()\")\n",
"\n",
"plt.bar(ind + width, \n",
"        [i for i in string_is_number_method], \n",
"        width,\n",
"        alpha=0.5,\n",
"        color='g',\n",
"        label='string_is_number(a_str)')\n",
"\n",
"\n",
"ax.set_ylabel('time in microseconds')\n",
"ax.set_title('Time to check if a string is a number')\n",
"ax.set_xticks(ind + width)\n",
"ax.set_xticklabels(['\"%s\"' %t for t in [t1, t2]])\n",
"plt.xlabel('test strings')\n",
"plt.xlim(-0.1,1.6)\n",
"#plt.ylim(0,15)\n",
"plt.legend(loc='upper left')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x10782cfd0>"
]
}
],
"prompt_number": 98
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"<a name='list_operations'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"# List operations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='list_reverse'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## List reversing - `[::-1]` vs. `reverse()` vs. `reversed()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"import copy\n",
"\n",
"def reverse_func(my_list):\n",
"    # list.reverse() reverses in place and returns None,\n",
"    # so reverse a copy and return that copy\n",
"    new_list = copy.deepcopy(my_list)\n",
"    new_list.reverse()\n",
"    return new_list\n",
"\n",
"def reversed_func(my_list):\n",
"    return list(reversed(my_list))\n",
"\n",
"def reverse_slizing(my_list):\n",
"    return my_list[::-1]\n",
"\n",
"n = 10\n",
"test_list = list([i for i in range(n)])\n",
"\n",
"%timeit reverse_func(test_list)\n",
"%timeit reversed_func(test_list)\n",
"%timeit reverse_slizing(test_list)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 13.8 \u00b5s per loop\n",
"1000000 loops, best of 3: 1.44 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000000 loops, best of 3: 330 ns per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 104
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['reverse_func', 'reversed_func',\n",
"         'reverse_slizing']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
"    test_list = list([i for i in range(n)])\n",
"    for f in funcs:\n",
"        times_n[f].append(min(timeit.Timer('%s(test_list)' %f, \n",
"                    'from __main__ import %s, test_list' %f)\n",
"                    .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 117
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('reverse_func', 'copy.deepcopy(my_list).reverse()'), \n",
"          ('reversed_func', 'list(reversed(my_list))'),\n",
"          ('reverse_slizing', 'my_list[::-1]'),\n",
"          ] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
"    plt.plot(orders_n, times_n[lb[0]], \n",
"             alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different list reversing approaches')\n",
"\n",
"max_perf = max( f/s for f,s in zip(times_n['reverse_func'],\n",
"                                   times_n['reverse_slizing']) )\n",
"min_perf = min( f/s for f,s in zip(times_n['reverse_func'],\n",
"                                   times_n['reverse_slizing']) )\n",
"\n",
"ftext = 'my_list[::-1] is {:.2f}x to '\\\n",
"        '{:.2f}x faster than copy.deepcopy(my_list).reverse()'\\\n",
"        .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x1069c7890>"
]
}
],
"prompt_number": 118
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='create_cond_list'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Creating lists using conditional statements\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"In this test, I attempted to figure out the fastest way to create a new list of elements that meet a certain criterion. For the sake of simplicity, the criterion was to check whether an element is even or odd, and to include it in the list only if it is even. For example, the resulting list for numbers in the range from 1 to 10 would be  \n",
"[2, 4, 6, 8, 10].\n",
"\n",
"Here, I tested three different approaches:  \n",
"1) a simple for loop with an if-statement check (`cond_loop()`)  \n",
"2) a list comprehension (`list_compr()`)  \n",
"3) the built-in filter() function (`filter_func()`)  \n",
"\n",
"Note that the filter() function returns an iterator in Python 3 (rather than a list, as in Python 2), so I had to wrap it in an additional list() function call."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"def cond_loop(n):\n",
"    even_nums = []\n",
"    for i in range(n):\n",
"        if i % 2 == 0:\n",
"            even_nums.append(i)\n",
"    return even_nums\n",
"\n",
"def list_compr(n):\n",
"    even_nums = [i for i in range(n) if i % 2 == 0]\n",
"    return even_nums\n",
"\n",
"def filter_func(n):\n",
"    # keep the even numbers, like the other two functions\n",
"    even_nums = list(filter((lambda x: x % 2 == 0), range(n)))\n",
"    return even_nums\n",
"\n",
"%timeit cond_loop(n)\n",
"%timeit list_compr(n)\n",
"%timeit filter_func(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10 loops, best of 3: 18.1 ms per loop\n",
"100 loops, best of 3: 15.3 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10 loops, best of 3: 22.8 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 119
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['cond_loop', 'list_compr',\n",
" 'filter_func']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" test_list = list([i for i in range(n)])\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(n)' %f, \n",
" 'from __main__ import %s, n' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 120
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
2014-04-14 18:28:42 +00:00
"\n",
2014-05-07 22:38:42 +00:00
"labels = [('cond_loop', 'explicit for-loop'), \n",
" ('list_compr', 'list comprehension'),\n",
" ('filter_func', 'lambda function'),\n",
" ] \n",
2014-04-14 18:28:42 +00:00
"\n",
2014-05-07 22:38:42 +00:00
"matplotlib.rcParams.update({'font.size': 12})\n",
2014-04-14 18:28:42 +00:00
"\n",
2014-05-07 22:38:42 +00:00
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different conditional list creation methods')\n",
2014-04-14 18:28:42 +00:00
"\n",
2014-05-07 22:38:42 +00:00
"max_perf = max( f/c for f,c in zip(times_n['filter_func'],\n",
"                                   times_n['list_compr']) )\n",
"min_perf = min( f/c for f,c in zip(times_n['filter_func'],\n",
"                                   times_n['list_compr']) )\n",
"\n",
"ftext = 'the list comprehension is {:.2f}x to '\\\n",
" '{:.2f}x faster than the lambda function'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x106fe4290>"
]
}
],
"prompt_number": 122
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='dict_ops'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"# Dictionary operations "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='adding_dict_elements'></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Adding elements to a Dictionary\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"All three functions below count how often different elements (values) occur in a list. \n",
"E.g., for the list ['a', 'b', 'a', 'c'], the dictionary would look like this: \n",
"`my_dict = {'a': 2, 'b': 1, 'c': 1}`"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import random\n",
"import timeit\n",
"from collections import defaultdict\n",
"\n",
"\n",
"def add_element_check1(elements):\n",
"    \"\"\"if ele not in dict (v1)\"\"\"\n",
"    d = dict()\n",
"    for e in elements:\n",
"        if e not in d:\n",
"            d[e] = 1\n",
"        else:\n",
"            d[e] += 1\n",
"    return d\n",
"    \n",
"def add_element_check2(elements):\n",
"    \"\"\"if ele not in dict (v2)\"\"\"\n",
"    d = dict()\n",
"    for e in elements:\n",
"        if e not in d:\n",
"            d[e] = 0\n",
"        d[e] += 1  \n",
"    return d\n",
"    \n",
"def add_element_except(elements):\n",
"    \"\"\"try-except\"\"\"\n",
"    d = dict()\n",
"    for e in elements:\n",
"        try:\n",
"            d[e] += 1\n",
"        except KeyError:\n",
"            d[e] = 1\n",
"    return d\n",
"    \n",
"def add_element_defaultdict(elements):\n",
"    \"\"\"defaultdict\"\"\"\n",
"    d = defaultdict(int)\n",
"    for e in elements:\n",
"        d[e] += 1\n",
"    return d\n",
"\n",
"def add_element_get(elements):\n",
"    \"\"\".get() method\"\"\"\n",
" d = dict()\n",
" for e in elements:\n",
"        d[e] = d.get(e, 0) + 1\n",
" return d\n",
"\n",
"\n",
"random.seed(123)\n",
"\n",
"print('Results for 100 integers in range 1-10') \n",
"rand_ints = [random.randrange(1, 10) for i in range(100)]\n",
"%timeit add_element_check1(rand_ints)\n",
"%timeit add_element_check2(rand_ints)\n",
"%timeit add_element_except(rand_ints)\n",
"%timeit add_element_defaultdict(rand_ints)\n",
"%timeit add_element_get(rand_ints)\n",
"\n",
"print('\\nResults for 1000 integers in range 1-5') \n",
"rand_ints = [random.randrange(1, 5) for i in range(1000)]\n",
"%timeit add_element_check1(rand_ints)\n",
"%timeit add_element_check2(rand_ints)\n",
"%timeit add_element_except(rand_ints)\n",
"%timeit add_element_defaultdict(rand_ints)\n",
"%timeit add_element_get(rand_ints)\n",
"\n",
"print('\\nResults for 1000 integers in range 1-1000') \n",
"rand_ints = [random.randrange(1, 1000) for i in range(1000)]\n",
"%timeit add_element_check1(rand_ints)\n",
"%timeit add_element_check2(rand_ints)\n",
"%timeit add_element_except(rand_ints)\n",
"%timeit add_element_defaultdict(rand_ints)\n",
"%timeit add_element_get(rand_ints)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Results for 100 integers in range 1-10\n",
"100000 loops, best of 3: 16.8 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"100000 loops, best of 3: 18.2 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"100000 loops, best of 3: 17.1 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"100000 loops, best of 3: 14.7 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000 loops, best of 3: 20.8 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Results for 1000 integers in range 1-5\n",
"10000 loops, best of 3: 161 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000 loops, best of 3: 166 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000 loops, best of 3: 128 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"10000 loops, best of 3: 114 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 197 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Results for 1000 integers in range 1-1000\n",
"10000 loops, best of 3: 166 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 240 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 354 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 281 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1000 loops, best of 3: 234 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 123
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['add_element_check1', 'add_element_check2',\n",
" 'add_element_except', 'add_element_defaultdict',\n",
" 'add_element_get']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" elements = [random.randrange(1, 100) for i in range(n)]\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(elements)' %f, \n",
" 'from __main__ import %s, elements' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 136
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('add_element_check1', 'if ele not in dict (v1)'), \n",
" ('add_element_check2', 'if ele not in dict (v2)'),\n",
" ('add_element_except', 'try-except'),\n",
" ('add_element_defaultdict', 'defaultdict'),\n",
" ('add_element_get', '.get() method')\n",
" ] \n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,10))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different methods to count elements in a dictionary')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x107ebb4d0>"
]
}
],
"prompt_number": 139
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results show that the `try-except` variant is faster than the `if e not in d` alternatives if the number of unique elements is low (here: 1000 integers in the range 1-5), which makes sense: the `except` block is skipped if an element has already been added as a key to the dictionary. In this case, however, `collections.defaultdict` performs even better.  \n",
"If the number of unique entries is relatively large (here: 1000 integers in the range 1-1000), the `if e not in d` approach (v1) outperforms the alternatives."
]
},
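{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reasoning above depends on how often a key is seen for the first time. As a rough sketch (the helper `count_key_misses` is hypothetical and was not part of the benchmark), the cell below counts these first-time occurrences, i.e., the iterations in which the `try-except` variant would have to take its `except` branch. The two lists are generated like the benchmark lists, but they are not the identical samples."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import random\n",
"\n",
"def count_key_misses(elements):\n",
"    \"\"\"Hypothetical helper: count how many elements occur for the first time.\"\"\"\n",
"    seen = set()\n",
"    misses = 0\n",
"    for e in elements:\n",
"        if e not in seen:\n",
"            # first occurrence: the key would not be in the dictionary yet\n",
"            seen.add(e)\n",
"            misses += 1\n",
"    return misses\n",
"\n",
"random.seed(123)\n",
"few_unique = [random.randrange(1, 5) for i in range(1000)]\n",
"many_unique = [random.randrange(1, 1000) for i in range(1000)]\n",
"\n",
"print('few unique keys: %s misses in %s elements'\n",
"      %(count_key_misses(few_unique), len(few_unique)))\n",
"print('many unique keys: %s misses in %s elements'\n",
"      %(count_key_misses(many_unique), len(many_unique)))"
],
"language": "python",
"metadata": {},
"outputs": []
},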
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"comprehensions\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Comprehensions vs. for-loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Comprehensions are not only shorter and prettier than ye goode olde for-loop, \n",
"but they are also up to ~1.2x faster."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"n = 1000"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 140
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set comprehensions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def set_loop(n):\n",
" a_set = set()\n",
" for i in range(n):\n",
" if i % 3 == 0:\n",
" a_set.add(i)\n",
"    return a_set\n",
"\n",
"def set_compr(n):\n",
" return {i for i in range(n) if i % 3 == 0}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 141
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit set_loop(n)\n",
"%timeit set_compr(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1000 loops, best of 3: 165 \u00b5s per loop\n",
"10000 loops, best of 3: 145 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 142
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## List comprehensions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def list_loop(n):\n",
" a_list = list()\n",
" for i in range(n):\n",
" if i % 3 == 0:\n",
" a_list.append(i)\n",
" return a_list"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 143
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def list_compr(n):\n",
" return [i for i in range(n) if i % 3 == 0]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 144
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit list_loop(n)\n",
"%timeit list_compr(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10000 loops, best of 3: 164 \u00b5s per loop\n",
"10000 loops, best of 3: 143 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 145
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['list_loop', 'list_compr']\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(n)' %f, \n",
" 'from __main__ import %s, n' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 152
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('list_loop', 'explicit for-loop'), \n",
" ('list_compr', 'list comprehension')]\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"#plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of explicit for-loops vs. list comprehensions')\n",
"\n",
"max_perf = max( l/c for l,c in zip(times_n['list_loop'],\n",
" times_n['list_compr']) )\n",
"min_perf = min( l/c for l,c in zip(times_n['list_loop'],\n",
" times_n['list_compr']) )\n",
"\n",
"ftext = 'the list comprehension is {:.2f}x to '\\\n",
" '{:.2f}x faster than the explicit for-loop'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x1055b1910>"
]
}
],
"prompt_number": 153
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dictionary comprehensions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def dict_loop(n):\n",
" a_dict = dict()\n",
" for i in range(n):\n",
" if i % 3 == 0:\n",
" a_dict[i] = i\n",
" return a_dict"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 146
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def dict_compr(n):\n",
" return {i:i for i in range(n) if i % 3 == 0}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 147
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit dict_loop(n)\n",
"%timeit dict_compr(n)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10000 loops, best of 3: 159 \u00b5s per loop\n",
"10000 loops, best of 3: 151 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 148
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"find_copy\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Copying files by searching directory trees"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Executing `Unix`/`Linux` shell commands:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import subprocess\n",
"\n",
"def subprocess_findcopy(path, search_str, dest): \n",
"    query = 'find %s -name \"%s\" -exec cp {} %s \\;' %(path, search_str, dest)\n",
" subprocess.call(query, shell=True)\n",
" return "
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 30
},
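{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible variant, sketched under the assumption that `find` and `cp` are available on the system: passing the command as a list of arguments avoids `shell=True` and the quoting issues that can come with paths containing spaces. The name `subprocess_findcopy_noshell` is hypothetical, and this variant was not part of the timings in this section."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import subprocess\n",
"\n",
"def subprocess_findcopy_noshell(path, search_str, dest):\n",
"    # every list item reaches find as exactly one argument, so no shell quoting is needed;\n",
"    # the lone ';' terminates the -exec expression\n",
"    cmd = ['find', path, '-name', search_str, '-exec', 'cp', '{}', dest, ';']\n",
"    return subprocess.call(cmd)"
],
"language": "python",
"metadata": {},
"outputs": []
},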
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using Python's `os.walk()` to search the directory tree recursively and matching patterns via `fnmatch.filter()`"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import shutil\n",
"import os\n",
"import fnmatch\n",
"\n",
"def walk_findcopy(path, search_str, dest):\n",
" for path, subdirs, files in os.walk(path):\n",
" for name in fnmatch.filter(files, search_str):\n",
" try:\n",
" shutil.copy(os.path.join(path,name), dest)\n",
"            except (shutil.Error, OSError):\n",
" pass\n",
" return"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 33
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"\n",
"def findcopy_timeit(inpath, outpath, search_str):\n",
" \n",
" shutil.rmtree(outpath)\n",
" os.mkdir(outpath)\n",
" print(50*'#')\n",
"    print('subprocess call')\n",
" %timeit subprocess_findcopy(inpath, search_str, outpath)\n",
" print(\"copied %s files\" %len(os.listdir(outpath)))\n",
" shutil.rmtree(outpath)\n",
" os.mkdir(outpath)\n",
" print('\\nos.walk approach')\n",
" %timeit walk_findcopy(inpath, search_str, outpath)\n",
" print(\"copied %s files\" %len(os.listdir(outpath)))\n",
" print(50*'#')\n",
"\n",
"print('small tree')\n",
"inpath = '/Users/sebastian/Desktop/testdir_in'\n",
"outpath = '/Users/sebastian/Desktop/testdir_out'\n",
"search_str = '*.png'\n",
"findcopy_timeit(inpath, outpath, search_str)\n",
"\n",
"print('larger tree')\n",
"inpath = '/Users/sebastian/Dropbox'\n",
"outpath = '/Users/sebastian/Desktop/testdir_out'\n",
"search_str = '*.csv'\n",
"findcopy_timeit(inpath, outpath, search_str)\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"small tree\n",
"##################################################"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"subprocess call\n",
"1 loops, best of 3: 268 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"copied 13 files\n",
"\n",
"os.walk approach\n",
"100 loops, best of 3: 12.2 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"copied 13 files\n",
"##################################################\n",
"larger tree\n",
"##################################################\n",
"subprocess call\n",
"1 loops, best of 3: 623 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"copied 105 files\n",
"\n",
"os.walk approach\n",
"1 loops, best of 3: 417 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"copied 105 files\n",
"##################################################\n"
]
}
],
"prompt_number": 35
},
{
"cell_type": "markdown",
"metadata": {},
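"source": [
"I have to say that I am positively surprised: although the `os.walk` approach was faster on both trees, the shell's `find` scales better than expected. It was about 22x slower on the small tree, but only about 1.5x slower on the larger one."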
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='row_vectors'></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Returning column vectors slicing through a numpy array"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given a numpy matrix, I want to iterate through it and return each column as a 1-column vector. \n",
"E.g., if I want to return the 1st column from matrix A below\n",
"\n",
"<pre>\n",
"A = np.array([ [1,2,3], [4,5,6], [7,8,9] ])\n",
">>> A\n",
"array([[1, 2, 3],\n",
"       [4, 5, 6],\n",
"       [7, 8, 9]])</pre>\n",
"\n",
"I want my result to be:\n",
"<pre>\n",
"array([[1],\n",
"       [4],\n",
"       [7]])</pre>\n",
"\n",
"with `.shape` = `(3,1)`\n",
"\n",
"\n",
"However, by default numpy returns the column as a 1-dimensional array, which is displayed like a row vector:\n",
"\n",
"<pre>\n",
">>> A[:,0]\n",
"array([1, 4, 7])\n",
">>> A[:,0].shape\n",
"(3,)\n",
"</pre>"
]
},
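{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the difference between the shapes concrete, the following sketch uses the matrix `A` from above and prints the shape produced by plain integer indexing and by three of the variants benchmarked below. `np.may_share_memory` is used here only as a rough indicator of whether a result is a view on `A` or a copy:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n",
"\n",
"print(A[:, 0].shape)              # (3,)   an integer index drops the axis\n",
"print(A[:, 0:1].shape)            # (3, 1) a slice keeps the axis\n",
"print(A[:, [0]].shape)            # (3, 1) an index list keeps the axis\n",
"print(A[:, 0, np.newaxis].shape)  # (3, 1) the axis is re-added explicitly\n",
"\n",
"# basic slicing returns a view on A; indexing with a list returns a copy\n",
"print(np.may_share_memory(A, A[:, 0:1]))  # True\n",
"print(np.may_share_memory(A, A[:, [0]]))  # False"
],
"language": "python",
"metadata": {},
"outputs": []
},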
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"# 1st column, e.g., A[:,0,np.newaxis]\n",
"\n",
"def colvec_method1(A):\n",
"    for col in A.T:\n",
"        colvec = col[:,np.newaxis]\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 83
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., A[:,0:1]\n",
"\n",
"def colvec_method2(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = A[:,idx:idx+1]\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 82
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., A[:,0].reshape(-1,1)\n",
"\n",
"def colvec_method3(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = A[:,idx].reshape(-1,1)\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 81
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., np.vstack(A[:,0])\n",
"\n",
"def colvec_method4(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = np.vstack(A[:,idx])\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 79
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., np.row_stack(A[:,0])\n",
"\n",
"def colvec_method5(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = np.row_stack(A[:,idx])\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 77
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., np.column_stack((A[:,0],))\n",
"\n",
"def colvec_method6(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = np.column_stack((A[:,idx],))\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 74
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# 1st column, e.g., A[:,[0]]\n",
"\n",
"def colvec_method7(A):\n",
" for idx in range(A.shape[1]):\n",
" colvec = A[:,[idx]]\n",
" yield colvec"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 89
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def test_method(method, A):\n",
"    for i in method(A): \n",
"        assert i.shape == (A.shape[0],1), \"{}, {}\".format(i.shape, (A.shape[0],1))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 69
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"A = np.random.random((300, 3))\n",
"\n",
"for method in [\n",
" colvec_method1, colvec_method2, \n",
" colvec_method3, colvec_method4, \n",
" colvec_method5, colvec_method6,\n",
" colvec_method7]:\n",
"    print('\\nTest:', method.__name__)\n",
"    %timeit test_method(method, A)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"Test: colvec_method1\n",
"100000 loops, best of 3: 16.6 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method2\n",
"10000 loops, best of 3: 16.1 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method3\n",
"100000 loops, best of 3: 16.2 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method4\n",
"100000 loops, best of 3: 16.4 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method5\n",
"100000 loops, best of 3: 16.2 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method6\n",
"100000 loops, best of 3: 16.8 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Test: colvec_method7\n",
"100000 loops, best of 3: 16.3 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 91
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='numpy'></a>\n",
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Speed of numpy functions vs Python built-ins and std. lib."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"np_sum\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `sum()` vs. `numpy.sum()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from numpy import sum as np_sum\n",
"import timeit\n",
"\n",
"samples = list(range(1000000))\n",
"\n",
"%timeit(sum(samples))\n",
"%timeit(np_sum(samples))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10 loops, best of 3: 18.2 ms per loop\n",
"10 loops, best of 3: 138 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 20
},
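{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the comparison above, `np_sum` receives a Python list, so NumPy has to convert the list into an array before it can add anything up; this conversion likely accounts for much of the gap. The sketch below is an addition and not part of the original measurements, so no timings are claimed for it. It separates the two cases by also summing a pre-built array; `samples_array` and `best_time` are names introduced here for illustration:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"import numpy as np\n",
"\n",
"samples = list(range(1000000))\n",
"# the array is built once, outside of the timed calls\n",
"samples_array = np.arange(1000000, dtype=np.int64)\n",
"\n",
"def best_time(func, repeat=3, number=10):\n",
"    # best-of-repeat time per call, in seconds\n",
"    return min(timeit.repeat(func, repeat=repeat, number=number)) / number\n",
"\n",
"print('sum(list):     %.5f s' % best_time(lambda: sum(samples)))\n",
"print('np.sum(list):  %.5f s' % best_time(lambda: np.sum(samples)))\n",
"print('np.sum(array): %.5f s' % best_time(lambda: np.sum(samples_array)))"
],
"language": "python",
"metadata": {},
"outputs": []
},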
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['sum', 'np_sum']\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" samples = list(range(n))\n",
" times_n['sum'].append(min(timeit.Timer('sum(samples)', \n",
" 'from __main__ import samples')\n",
" .repeat(repeat=3, number=1000)))\n",
" times_n['np_sum'].append(min(timeit.Timer('np_sum(samples)', \n",
" 'from __main__ import np_sum, samples')\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 26
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('sum', 'in-built sum() function'), \n",
" ('np_sum', 'numpy.sum() function')]\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of explicit for-loops vs. list comprehensions')\n",
"\n",
"max_perf = max( n/i for i,n in zip(times_n['sum'],\n",
" times_n['np_sum']) )\n",
"min_perf = min( n/i for i,n in zip(times_n['sum'],\n",
" times_n['np_sum']) )\n",
"\n",
"ftext = 'the in-built sum() is {:.2f}x to '\\\n",
" '{:.2f}x faster than the numpy.sum()'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAIECAYAAACUvmMzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3XdYFMf/B/D3HUXKnYLoAYJyCorSi2CnCGKLJhqjWKLY\n4tcWG6JfY4EYTewFMSY2NMbY8k00xoJKURQlIpKAiiCiWCgWRBCp8/vDHxsWDrjDExA/r+fhedjZ\n2dnP7M4ew+zsnoAxxkAIIYQQQt57wvoOgBBCCCGEKAd17AghhBBCGgnq2BFCCCGENBLUsSOEEEII\naSSoY0cIIYQQ0khQx44QQgghpJGgjh2pE8XFxZgwYQJatGgBoVCI8+fP13dI76XDhw/D1NQUqqqq\nmDBhQr3G4u/vj/bt23PLwcHBUFNTk3t7RfOXl5aWBg8PD4hEIqioqNSqjOpIpVKsWLFC6eW+r9zc\n3DB58mRu2cfHB3369KnHiN4vDak9CYVC7N+/v77DIO8QdewIx8fHB0KhEEKhEGpqapBKpZg6dSqe\nPXv21mX/+uuv+OWXX3D8+HGkp6ejW7duSoj4w1JSUoIJEybA29sbaWlp2LRpU32HBIFAwP3u7e2N\nR48eyb1txfz79u2DUCjfR9LKlSvx5MkTxMXF4fHjx/IHLCeBQMCr24eu4vEIDAzEkSNH5N5eVVUV\ne/fufRehvRcaUntKT0/Hp59+Wt9hkHdItb4DIA2Li4sLDh06hOLiYly9ehWTJ09GWloajh8/Xqvy\nCgsLoa6ujqSkJBgZGaFr165vFV9ZeR+iR48eIS8vD/3794ehoWF9hwMAKP9+cw0NDWhoaMi9raL5\ny0tKSoKTkxNMTU1rtX2Z4uJiqKrSx6CixGKxQvkFAgEa47vw38f2I5FI6jsE8o7RiB3hUVNTg0Qi\nQatWrTB48GDMmjULp06dQkFBAQDgwIEDsLOzg6amJtq2bYt58+bh1atX3PZubm6YNGkSlixZglat\nWsHExATu7u5YunQpUlJSIBQK0a5dOwBAUVERFi5cCGNjYzRp0gSWlpb45ZdfePEIhUIEBgZi1KhR\n0NHRwdixY7lbeOHh4bC2toaWlhZ69+6N9PR0hIWFwc7ODiKRCH369OGNCN29exdDhw6FkZERtLW1\nYWNjg3379vH2V3bLafny5TA0NISenh7GjRuHvLw8Xr6DBw/C0dERmpqaaNGiBQYMGIDs7GxufWBg\nIDp27AhNTU106NABK1euRElJSbXH/vLly3BxcYGWlhaaN2+O0aNHIysrC8Cb25YmJiYA3nS+a7qd\nXd3+Dx06hCZNmuCvv/7i8u/duxdaWlqIj48H8O+ttg0bNnDHa/jw4Xj+/HmV+5R1azUmJgb9+vVD\ns2bNIBaL0aVLF0RHR1fKHx4ejrFjxwIAN2pc1a1moVCI0NBQ7Nq1i5fv8ePH8Pb2hq6uLrS0tODu\n7o6YmBhuu/DwcAiFQpw4cQI9e/aEpqYmdu7cWWV9ynv58iWmTJkCiUQCDQ0NODk54cyZM7w8iYmJ\nGDhwIMRiMcRiMQYPHow7d+5UOj7nzp2DpaUlNDU10bVrV8TFxXF5cnJyMH78eBgaGkJDQwNt2rTB\nvHnzqoyrR48emDJlSqX0Tp06YenSpQCAhIQE9O3bF7q6uhCJRLCwsKjU7hVV8VZsdfuQSqUoKSnB\n+PHjIRQKa7x1HhQUBAsLC2hoaEBfXx/Dhg3j1tV0HlJTUyEUCvHLL7+gb9++0NbWhoWFBSIjI3H/\n/n3069cPIpEIlpaWiIyM5LYraxvHjx+Hs7MzNDU1YW1tjbCwsEp5ZLUfea73goICzJo1C3p6ejAw\nMMDcuXMr5ampHKlUimXLllVbTmRkJHr06IGmTZuiadOmsLOzQ0hICLe+4q1Yea+bs2fPwsXFBdra\n2rC0tMSpU6d4sa9cuRKmpqbQ0NCARCJBv3798Pr162
rPNXlHGCH/b9y4caxPnz68tHXr1jGBQMBy\nc3PZ7t27ma6uLtu3bx+7e/cuO3/+PLOxsWGff/45l9/V1ZWJxWI2depUdvPmTRYfH8+ePXvGfH19\nWdu2bVlGRgZ78uQJY4wxX19fpqenx44cOcKSkpLYypUrmVAoZOfOnePKEwgETE9PjwUFBbGUlBSW\nlJTEdu/ezYRCIXN3d2fR0dHs2rVrrH379qxnz57MxcWFXblyhV2/fp117NiRjRgxgivrn3/+YUFB\nQezvv/9mKSkpLDAwkKmqqrKwsDBe/Do6Omzu3LksMTGRhYSEsObNm7MlS5ZweXbt2sXU1NTYN998\nw9Vxy5YtXL2WLVvGTExM2O+//85SU1PZiRMnWJs2bXhlVPT48WMmFovZ6NGjWXx8PIuMjGQ2NjbM\nxcWFMcZYfn4+++uvv5hAIGB//PEHy8jIYIWFhTLLkmf/kydPZqampiwnJ4clJiYysVjMvv/+e15b\naNq0Kfv4449ZfHw8Cw8PZ+3bt2dDhgzh7cfMzIxb3r17N1NVVeWW4+PjmZaWFhs1ahSLiYlhd+7c\nYYcOHWJRUVGV8hcWFrKgoCAmEAhYRkYGy8jIYDk5OTLrl56ezrp3787GjBnD5SstLWXOzs7M3t6e\nXbx4kf3zzz9sxIgRTFdXlzsvYWFhTCAQsI4dO7Ljx4+z1NRU9vDhQ5n7kEqlbMWKFdzysGHDWNu2\nbVlISAi7desWmzVrFlNXV2e3bt1ijDH26tUr1qZNG+bp6cmuXbvGYmJimLu7OzMzM+POU1m7dXR0\nZOfPn2d///03++ijj5iRkRHLz89njDE2c+ZMZmtry6Kjo1laWhq7dOkS27Fjh8wYGWPsxx9/ZLq6\nuqygoIBLu3LlChMIBCwpKYkxxpi1tTUbPXo0u3nzJrt79y47efIkO378eJVlyuLm5sYmT57MLfv4\n+PA+K6rbR1ZWFlNVVWWbN2/mzm1Vli5dykQiEQsKCmJJSUns+vXr7Ntvv+XW13Qe7t69ywQCATM1\nNWVHjx5lt2/fZkOGDGFGRkbMzc2N/f777+z27dts2LBhrHXr1qyoqIgx9m/baN++Pfvzzz/ZrVu3\n2MSJE5m2tjZ7/PgxL0/59vPgwQO5rjcTExOmq6vLVq1axZKTk9mhQ4eYmpoa27lzJ5dHGeUUFRUx\nXV1dNm/ePJacnMySk5PZ77//zi5cuMCVIRAI2M8//8wYYwpdN7a2tuz06dMsOTmZjR8/njVt2pQ9\nf/6cMcbYr7/+ypo2bcqOHz/O0tLS2PXr19mmTZu4dk3qFnXsCGfcuHHM09OTW05ISGDt2rVj3bp1\nY4y9+VD54YcfeNtEREQwgUDAsrOzGWNvOkbm5uaVyq7YCcjLy2NNmjThdSYYY2zIkCGsd+/e3LJA\nIGCTJk3i5dm9ezcTCAQsLi6OS1uzZg0TCATs2rVrXNqGDRtYixYtqq3zxx9/zPuD5erqyuzs7Hh5\npk6dyh0Dxhhr3bo1mzlzpszy8vLymJaWFjt9+jQvfc+ePUxHR6fKOBYvXsz7Q8MYY3FxcUwgELDz\n588zxv79o3Xx4sUqy5F3/69evWKWlpZs+PDhzM7Ojg0dOpSXf9y4cUwsFvM6VyEhIUwgELA7d+4w\nxmru2I0ZM6bSsSyvYv6ffvqJCQSCKvOXV7GjcfbsWSYQCNjNmze5tIKCAmZoaMi+/vprxti/f6D2\n7dtXY/nlO3ZJSUlMIBCwkydP8vI4ODiwCRMmMMYY27FjB9PS0mJPnz7l1mdkZDBNTU22d+9err4C\ngYCFhoZyeZ4/f85EIhHbtWsXY+xNe/Tx8ZHrGJRtr6mpyQ4fPsylTZ8+nXXv3p1bbtasGQsODpa7\nTFkqHu+KnxU17UNVVZXt2bOn2n3k5uYyDQ0Ntm7dOpnr5TkPZdfIpk2buPVl/xCtX7+eS4uNjWUC\ngYAlJCQwxv5tG2
XngTHGiouLmYmJCdexktV+5L3eTExM2Mcff8zL079/fzZy5EillvPs2TMmEAhY\neHh45QP4/8p37BS5bn777Tcu
"text": [
"<matplotlib.figure.Figure at 0x106783160>"
]
}
],
"prompt_number": 27
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"np_arange\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `range()` vs. `numpy.arange()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from numpy import arange as np_arange\n",
"\n",
"n = 1000000\n",
"\n",
"def loop_range(n):\n",
" for i in range(n):\n",
" pass\n",
" return\n",
"\n",
"def loop_arange(n):\n",
" for i in np_arange(n):\n",
" pass\n",
" return\n",
"\n",
"%timeit(loop_range(n))\n",
"%timeit(loop_arange(n))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10 loops, best of 3: 50.9 ms per loop\n",
"10 loops, best of 3: 183 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 36
},
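{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looping over the result of `numpy.arange` element by element hands each value back to Python as a NumPy scalar object, which is not the use case NumPy arrays are designed for. The following sketch is an addition with hypothetical function names, and no timings are claimed for it. It shows the kind of whole-array operation where `arange` would typically be used instead of an explicit loop:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def double_range(n):\n",
"    # pure Python: one multiplication per loop iteration\n",
"    return [2 * i for i in range(n)]\n",
"\n",
"def double_arange(n):\n",
"    # NumPy: a single vectorized multiplication over the whole array\n",
"    return 2 * np.arange(n)\n",
"\n",
"# both variants produce the same values\n",
"assert double_range(5) == double_arange(5).tolist()"
],
"language": "python",
"metadata": {},
"outputs": []
},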
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['loop_range', 'loop_arange']\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(n)' %f, \n",
" 'from __main__ import %s, n' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 38
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('loop_range', 'in-built range()'), \n",
" ('loop_arange', 'numpy.arange()')]\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of explicit for-loops vs. list comprehensions')\n",
"\n",
"max_perf = max( a/r for r,a in zip(times_n['loop_range'],\n",
" times_n['loop_arange']) )\n",
"min_perf = min( a/r for r,a in zip(times_n['loop_range'],\n",
" times_n['loop_arange']) )\n",
"\n",
"ftext = 'the in-built range() is {:.2f}x to '\\\n",
" '{:.2f}x faster than numpy.arange()'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.14,.75, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAIECAYAAACUvmMzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3XdYFMf/B/D3Hr0KgihFinSQJs0uCPaORkENosRY0kws\n8WssqJFoDPYWKyoaoyb2XjiU2LBGUUEQFQvFAkgXmN8f/Fg5OOAODxHyeT3PPc/t7OyUvdm7udnZ\nXY4xxkAIIYQQQho8QX0XgBBCCCGEyAZ17AghhBBCGgnq2BFCCCGENBLUsSOEEEIIaSSoY0cIIYQQ\n0khQx44QQgghpJGgjh35KIqKijBmzBjo6upCIBDg3Llz9V2kBmnPnj0wNzeHvLw8xowZU69lCQkJ\ngaWlJb8cHh4OBQUFibeXNn55ycnJ8PHxgbq6OuTk5GqVRnVMTU2xYMECmafbUHl5eWHs2LH8clBQ\nELp161aPJWpYPqX2JBAIsHPnzvouBqlD1LEjvKCgIAgEAggEAigoKMDU1BQTJkzA69evPzjtv/76\nC3/88QcOHz6MlJQUtGvXTgYl/m8pLi7GmDFj4O/vj+TkZCxfvry+iwSO4/j3/v7+eP78ucTbVowf\nEREBgUCyr6TQ0FC8fPkSt27dwosXLyQvsIQ4jhOp239dxf2xcuVK7N27V+Lt5eXlsW3btrooWoPw\nKbWnlJQUDB48uL6LQeqQfH0XgHxaOnfujN27d6OoqAhXr17F2LFjkZycjMOHD9cqvcLCQigqKuLB\ngwcwNDRE27ZtP6h8Zen9Fz1//hw5OTno1asX9PX167s4AIDy9zdXVlaGsrKyxNtKG7+8Bw8ewN3d\nHebm5rXavkxRURHk5elrUFoaGhpSxec4Do3xXvgNsf3o6enVdxFIHaMROyJCQUEBenp6MDAwQP/+\n/fHdd9/h+PHjKCgoAADs2rULzs7OUFFRgZmZGSZPnozc3Fx+ey8vL3zxxReYNWsWDAwMYGJiAm9v\nb8yePRsPHz6EQCBAq1atAADv3r3D9OnTYWRkBCUlJdjb2+OPP/4QKY9AIMDKlSsxfPhwaGlpITAw\nkD+FJxQK4eDgAFVVVXTt2hUpKSmIjIyEs7Mz1NXV0a1bN5ERoaSkJPj5+cHQ0BBqampwdHRERESE\nSH5lp5zmz58PfX196OjoYNSoUcjJyRGJ9+eff8LV1RUqKirQ1dVF7969kZGRwa9fuXIlbGxsoKKi\nAisrK4SGhqK4uLjafX/p0iV07twZqqqqaNq0KUaMGIH09HQApactTUxMAJR2vms6nV1d/rt374aS\nkhJiYmL4+Nu2bYOqqiru3LkD4P2ptqVLl/L7a+jQoXjz5k2VeYo7tXrt2jX07NkTTZo0gYaGBjw9\nPXHlypVK8YVCIQIDAwGAHzWu6lSzQCDA2bNnsXnzZpF4L168gL+/P7S1taGqqgpvb29cu3aN304o\nFEIgEODo0aPo2LEjVFRUsGnTpirrU97bt28xbtw46OnpQVlZGe7u7jh16pRInLi4OPTp0wcaGhrQ\n0NBA//79kZiYWGn/nDlzBvb29lBRUUHbtm1x69YtPk5WVhZGjx4NfX19KCsrw9jYGJMnT66yXB06\ndMC4ceMqhdva2mL27NkAgNjYWPTo0QPa2tpQV1eHnZ1dpXYvrYqnYqvLw9TUFMXFxRg9ejQEAkGN\np85Xr14NOzs7KCsro3nz5hgyZAi/rqbP4dGjRxAIBPjjjz/Qo0cPqKmpwc7ODtHR0Xjy5Al69uwJ\ndXV12NvbIzo6mt+urG0cPnwYHh4eUFFRgYODAyIjIyvFEdd+JDneCwoK8N1330FHRwctWrTADz/8\nUClOTemYmppizpw51aYTHR2NDh06QFNTE5qamnB2dsbJkyf59RVPxUp63Jw+fRqdO3eGmpoa7O3t\ncfz4cZGyh4aGwtzcHMrKytDT00PPnj2Rn59f7WdN6g
gj5P+NGjWKdevWTSQsLCyMcRzHsrOz2ZYt\nW5i2tjaLiIhgSUlJ7Ny5c8zR0ZF9/vnnfPwuXbowDQ0NNmHCBHbv3j12584d9vr1azZlyhRmZmbG\nUlNT2cuXLxljjE2ZMoXp6OiwvXv3sgcPHrDQ0FAmEAjYmTNn+PQ4jmM6Ojps9erV7OHDh+zBgwds\ny5YtTCAQMG9vb3blyhV2/fp1ZmlpyTp27Mg6d+7MLl++zG7evMlsbGzYsGHD+LRu377NVq9ezf79\n91/28OFDtnLlSiYvL88iIyNFyq+lpcV++OEHFhcXx06ePMmaNm3KZs2axcfZvHkzU1BQYD///DNf\nx1WrVvH1mjNnDjMxMWH79+9njx49YkePHmXGxsYiaVT04sULpqGhwUaMGMHu3LnDoqOjmaOjI+vc\nuTNjjLG8vDwWExPDOI5jhw4dYqmpqaywsFBsWpLkP3bsWGZubs6ysrJYXFwc09DQYGvXrhVpC5qa\nmmzAgAHszp07TCgUMktLSzZo0CCRfCwsLPjlLVu2MHl5eX75zp07TFVVlQ0fPpxdu3aNJSYmst27\nd7OLFy9Wil9YWMhWr17NOI5jqampLDU1lWVlZYmtX0pKCmvfvj0bOXIkH6+kpIR5eHgwFxcX9s8/\n/7Dbt2+zYcOGMW1tbf5ziYyMZBzHMRsbG3b48GH26NEj9uzZM7F5mJqasgULFvDLQ4YMYWZmZuzk\nyZPs/v377LvvvmOKiors/v37jDHGcnNzmbGxMfP19WXXr19n165dY97e3szCwoL/nMraraurKzt3\n7hz7999/Wd++fZmhoSHLy8tjjDH2zTffMCcnJ3blyhWWnJzMLly4wDZu3Ci2jIwxtn79eqatrc0K\nCgr4sMuXLzOO49iDBw8YY4w5ODiwESNGsHv37rGkpCR27Ngxdvjw4SrTFMfLy4uNHTuWXw4KChL5\nrqguj/T0dCYvL89WrFjBf7ZVmT17NlNXV2erV69mDx48YDdv3mS//PILv76mzyEpKYlxHMfMzc3Z\ngQMHWHx8PBs0aBAzNDRkXl5ebP/+/Sw+Pp4NGTKEtWzZkr17944x9r5tWFpasiNHjrD79++z4OBg\npqamxl68eCESp3z7efr0qUTHm4mJCdPW1maLFi1iCQkJbPfu3UxBQYFt2rSJjyOLdN69e8e0tbXZ\n5MmTWUJCAktISGD79+9n58+f59PgOI7t2LGDMcakOm6cnJzYiRMnWEJCAhs9ejTT1NRkb968YYwx\n9tdffzFNTU12+PBhlpyczG7evMmWL1/Ot2vycVHHjvBGjRrFfH19+eXY2FjWqlUr1q5dO8ZY6ZfK\n77//LrJNVFQU4ziOZWRkMMZKO0bW1taV0q7YCcjJyWFKSkoinQnGGBs0aBDr2rUrv8xxHPviiy9E\n4mzZsoVxHMdu3brFhy1evJhxHMeuX7/Ohy1dupTp6upWW+cBAwaI/GB16dKFOTs7i8SZMGECvw8Y\nY6xly5bsm2++EZteTk4OU1VVZSdOnBAJ37p1K9PS0qqyHDNnzhT5oWGMsVu3bjGO49i5c+cYY+9/\ntP75558q05E0/9zcXGZvb8+GDh3KnJ2dmZ+fn0j8UaNGMQ0NDZHO1cmTJxnHcSwxMZExVnPHbuTI\nkZX2ZXkV42/fvp1xHFdl/PIqdjROnz7NOI5j9+7d48MKCgqYvr4+mzdvHmPs/Q9UREREjemX79g9\nePCAcRzHjh07JhKnTZs2bMyYMYwxxjZu3MhUVVXZq1ev+PWpqalMRUWFbdu2ja8vx3Hs7NmzfJw3\nb94wdXV1tnnzZsZYaXsMCgqSaB+Uba+iosL27NnDh3311Vesffv2/HKTJk1YeHi4xGmKU3F/V/yu\nqCkPeXl5tnXr1mrzyM7OZsrKyiwsLEzsekk+h7JjZPny5fz6sj9ES5Ys4cNu3LjBOI5jsbGxjLH3\nbaPsc2CMsaKiIm
ZiYsJ3rMS1H0mPNxMTEzZgwACROL169WIBAQEyTef169eM4zgmFAor78D/V75j\nJ81xs2/fPj5Oamoq4ziOnTx5
"text": [
"<matplotlib.figure.Figure at 0x1050da390>"
]
}
],
"prompt_number": 40
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"np_mean\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `statistics.mean()` vs. `numpy.mean()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# The statistics module has been added to\n",
"# the standard library in Python 3.4\n",
"\n",
"import timeit\n",
"import statistics as stats\n",
"import numpy as np\n",
"\n",
"def calc_mean(samples):\n",
" return sum(samples)/len(samples)\n",
"\n",
"def np_mean(samples):\n",
" return np.mean(samples)\n",
"\n",
"def np_mean_ary(np_array):\n",
" return np.mean(np_array)\n",
"\n",
"def st_mean(samples):\n",
" return stats.mean(samples)\n",
"\n",
"n = 1000000\n",
"samples = list(range(n))\n",
"samples_array = np.arange(n)\n",
"\n",
"assert(st_mean(samples) == np_mean(samples)\n",
" == calc_mean(samples) == np_mean_ary(samples_array))\n",
"\n",
"%timeit(calc_mean(samples))\n",
"%timeit(np_mean(samples))\n",
"%timeit(np_mean_ary(samples_array))\n",
"%timeit(st_mean(samples))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100 loops, best of 3: 26.2 ms per loop\n",
"1 loops, best of 3: 144 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"100 loops, best of 3: 3.21 ms per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"1 loops, best of 3: 1.12 s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"funcs = ['st_mean', 'np_mean', 'calc_mean', 'np_mean_ary']\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" samples = list(range(n))\n",
" for f in funcs:\n",
" if f == 'np_mean_ary':\n",
" samples = np.arange(n)\n",
" times_n[f].append(min(timeit.Timer('%s(samples)' %f, \n",
" 'from __main__ import %s, samples' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('st_mean', 'statistics.mean()'), \n",
" ('np_mean', 'numpy.mean() on list'),\n",
" ('np_mean_ary', 'numpy.mean() on array'),\n",
" ('calc_mean', 'sum(samples)/len(samples)')\n",
" ]\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], \n",
" alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"plt.title('Performance of different approaches for calculating sample means')\n",
"\n",
"max_perf = max( s/c for s,c in zip(times_n['st_mean'],\n",
" times_n['np_mean_ary']) )\n",
"min_perf = min( s/c for s,c in zip(times_n['st_mean'],\n",
" times_n['np_mean_ary']) )\n",
"\n",
"ftext = 'using numpy.mean() on np.arrays is {:.2f}x to '\\\n",
" '{:.2f}x faster than statistics.mean() on lists'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.14,.15, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAIECAYAAACUvmMzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXlcjen//1/30X46aU+UdocKWSNUlOzMpBmhUXYxGHuT\n1omGMdbiYyamIs2iGcPMd6ytwky0GELSggzJ0oJou35/9Oue7s6pzmkOkev5eJxH3e/7ut7X+762\n8z7XdjOEEAIKhUKhUCgUyjsPr70NoFAoFAqFQqHIBurYUSgUCoVCoXQQqGNHoVAoFAqF0kGgjh2F\nQqFQKBRKB4E6dhQKhUKhUCgdBOrYUSgUCoVCoXQQqGPXAampqcGcOXOgra0NHo+HlJSU9jbpneTw\n4cMwMzODnJwc5syZI3G8oKAgWFhYNHvdnO6kpCRYW1tDQUEBo0aNks1DUGQKj8dDbGxsu6T9+eef\nQ09PDzweDwcOHGgXG1rCy8sLo0ePlqlOcW2nrbRn2b1NODo6Yv78+e1tBuU1Qh27dsLLyws8Hg88\nHg/y8vIwNjaGt7c3njx58p91//zzz/j+++/x+++/48GDBxg6dKgMLH6/qK2txZw5c+Du7o67d+9i\n586dbda1Zs0a/PXXX63q9vb2xsCBA1FQUIBffvnlPz+DLDA3N0dwcHB7m/He89dff2Hz5s3Yv38/\nHjx4gI8//ri9TRKBYRgwDPNa9EqDs7MzZs+eLSJ/8OABpk6dKiuz3lleVzlR3h7k2tuA9xl7e3v8\n9NNPqKmpwaVLlzB//nzcvXsXv//+e5v0VVVVQUFBAbm5uejWrRuGDBnyn+xr0Pc+8s8//+D58+cY\nN24c9PX1/5MuPp8PPp/fom5CCG7duoX169ejW7dubU6LEILa2lrIycmmab8LXwDvQz3Nzc0Fj8fD\nxIkT/5Oe15lXhBC8jvPuZaVTV1dXJnoolLcdOmLXjsjLy0NXVxddu3bF5MmTsXz5cpw4cQKvXr0C\nAPzwww+wsbGBsrIyTExMsGrVKrx48YKN7+joiHnz5sHf3x9du3aFkZERRo4ciYCAAOTn54PH48HU\n1BQAUF1dDR8fHxgYGEBRURFWVlb4/vvvOfbweDyEhYVhxowZUFdXx6xZsxAVFQV5eXkkJSWhd+/e\nUFFRwahRo/DgwQMkJibCxsYGqqqqGD16NP755x9WV0FBAVxdXdGtWzfw+Xz06dMHMTExnPQapgRC\nQkKgr68PLS0teHp64vnz55xwP/74IwYMGABlZWVoa2tj/PjxKC0tZe+HhYWhZ8+eUFZWRo8ePRAa\nGora2toW8/7PP/+Evb09VFRUoKmpiZkzZ6KkpAQAEBUVBSMjIwD1zndL09kvX76Et7c31NXVoamp\nicWLF7Pl10Dj6aSmujt16oTk5GR06tQJtbW1mDVrFmeq7datW5g6dSo0NDSgqamJMWPG4OrVq6zu\nxuXTr18/KCkpIT4+HtXV1QgKCoKpqSmUlZVhbW2Nb7/9VqS8//e//+GTTz6BmpoaDA0NsWnTJk75\n5OXlITg4mB1dvnPnjth8yMjIwLhx46CnpweBQIDBgwfj5MmTnDDGxsbw8/PDvHnz0LlzZ+jo6GD9\n+vWcL25Jw/j7+2Px4sXQ1taGg4MDAOCPP/7AgAEDoKSkBD09PSxZsoTTXiSxsaamBsHBwTAzM4OS\nkhIMDAywbNkyTpiysrJm8wyARHm/b98+9OrVC8rKytDS0oKDgwPu3bsnNm+9vLwwa9Ys1NXVgcfj\noVOnTgDqHZ6vv/4apqamUFRUhLm5ucjIcnN5JY709HSMHTsWnTt3hkAggK2tLdLS0gBI1p7F0VLb\nFTcluGHDBpiYmDSrrzU7vLy8kJCQgOjoaLbONrRdHo+HQ4cOsWFbq/8A8PjxY3z00UdQVVWFvr4+\nvvjiC4mmnENDQ9k6pKuri7Fjx+Lly5cS52VD3+7n5w
ddXV1oaGggICAAhBAEBgaiS5cu0NXVhZ+f\nHyeeJO1HHNL2oUlJSeDxeDh+/DiGDh0KFRUVDBo0CNevX8fff/+NYcOGgc/nw9bWFtevX+fETU9P\nh4uLCwQCAXR1dTF16lROvyKr747s7GyMGTMGGhoaUFVVhaWlpUR1tkNAKO2Cp6cnGT16NEe2detW\nwjAMefbsGYmMjCQaGhokJiaGFBQUkJSUFNKnTx/yySefsOEdHByIQCAg3t7e5Pr16+Tq1avkyZMn\nZPXq1cTExIQUFxeTR48eEUIIWb16NdHS0iJxcXEkNzeXhIaGEh6PR+Lj41l9DMMQLS0tsnv3bpKf\nn09yc3NJZGQk4fF4ZOTIkSQtLY1kZGQQCwsLMnz4cGJvb0/++usvkpWVRXr27EmmTZvG6rpy5QrZ\nvXs3+fvvv0l+fj4JCwsjcnJyJDExkWO/uro6WblyJcnJySGnTp0impqaxN/fnw3z3XffEXl5ebJh\nwwb2GcPDw9nnCgwMJEZGRuTXX38lhYWF5I8//iDdu3fn6GjK/fv3iUAgIDNnziRXr14lqamppE+f\nPsTe3p4QQkhlZSW5ePEiYRiG/Pbbb6S4uJhUVVWJ1fXZZ58RXV1dcuzYMZKTk0NWr15N1NTUiIWF\nBRsmMDCQvW5O94MHDwjDMGTPnj2kuLiYVFZWkgcPHhA9PT2yePFicvXqVXLz5k2ydOlSoqWlRUpK\nSgghhC0fW1tbkpSURAoKCkhJSQnx9PQkffv2JadPnyaFhYXkxx9/JOrq6mT//v2c8tbT0yP79u0j\n+fn5ZPfu3YRhGLZOPHnyhJiYmJA1a9aQ4uJiUlxcTGpra8XmQ1JSEomOjibXrl0jubm5xM/Pjygo\nKJCbN2+yYYyMjIiamhoJDAwkN2/eJAcPHiR8Pp/s3LmzTWGCg4NJbm4uuX79Orl8+TLp1KkTW5eO\nHz9Ounfvzmkvktg4a9YsoqurS2JiYkh+fj65ePEiJ+3W8owQ0mreX7p0icjJyZGDBw+SO3fukCtX\nrpD9+/eToqIisXlbVlZGdu7cSeTk5NhyIISQ8PBwoqysTCIiIsitW7fI3r17iZKSEqeMxeWVOK5e\nvUpUVFTIjBkzSHp6OsnLyyM//fQTuXDhAiFEsvbs6elJnJ2d2evW2q6joyOZP38+x46QkBBibGzM\nXgcGBhJzc3P2ujU7ysrKiL29PXF3d2fzqqHtMgxDDh06JFVZTpo0iQiFQpKUlESys7PJ7Nmzibq6\nukjf3Ziff/6ZqKmpkd9//53cvXuXZGVlkZ07d5LKykqJ89LBwYF07tyZ+Pj4kNzcXPLdd98RhmHI\nmDFjyLp160hubi6Jjo4mDMOQ48ePi5R3S+2nab63pQ9NTEwkDMOQ/v37k8TERHLt2jUydOhQ0qdP\nHzJs2DCSkJBArl+/ToYPH05sbW3ZeNnZ2URVVZUEBQWRnJwccvXqVfLRRx+RHj16kJcvX0qVP619\nd/Tu3ZvMnDmTXL9+nRQUFJDjx4+T33//vdln6khQx66daNoJZmdnE1NTUzJ06FBCSH0D/eabbzhx\nkpOTCcMwpLS0lBBSX7mFQqGI7qad4fPnz4mioiL53//+xwn34YcfklGjRrHXDMOQefPmccJERkYS\nhmHI5cuXWdmWLVsIwzAkIyODlW3fvp1oa2u3+MxTpkzhdCgODg7ExsaGE8bb25vNA0IIMTQ0JEuX\nLhWr7/nz50RFRYWcPHmSI4+Ojibq6urN2uHn50cMDQ1JdXU1K7t8+TJhGIakpKQQQggpKCggDMOQ\nc+fONavn2bNnRElJiezbt48jHzhwoIhj17g8mtPd9IsnMDCQDBkyhBOmrq6OmJmZkR07dhBC/i2f\n1NRUNkx+fj7h8XgkJyeHEzc4OJiT3wzDkOXLl3PC9OrVi3z++efstbm5OQkODm42D1qib9++ZOPG\njey1kZER6zw34O
vrSwwNDaUO07jtEEKIh4cH5wuEEEKOHj1KeDweuXPnjkQ25ubmEoZhyM8//9xs\n+NbyTJK8/+WXX0jnzp1JeXl5
"text": [
"<matplotlib.figure.Figure at 0x1058d4f28>"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"cython\"></a>\n",
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cython vs regular (C)Python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, we implement a simple linear regression via least squares fitting (minimizing the vertical offsets): we fit *n* points $(x_i, y_i)$ with $i=1,2,...,n$ to a linear equation of the form \n",
"$f(x) = a\\cdot x + b$. \n",
"\n",
"The two parameters are calculated as follows:\n",
"\n",
"$a = \\frac{S_{xy}}{\\sigma_{x}^{2}}\\quad$ (slope)\n",
"\n",
"$b = \\bar{y} - a\\bar{x}\\quad$ (y-axis intercept)\n",
"\n",
"where \n",
"\n",
"\n",
"$S_{xy} = \\sum_{i=1}^{n} (x_i - \\bar{x})(y_i - \\bar{y})\\quad$ (covariance)\n",
"\n",
"\n",
"$\\sigma_{x}^{2} = \\sum_{i=1}^{n} (x_i - \\bar{x})^2\\quad$ (variance)\n",
"\n",
"Both sums are written without the usual normalization factor, which cancels in the ratio $S_{xy} / \\sigma_{x}^{2}$.\n",
"\n",
"I have described the approach in more detail in this [IPython notebook](http://sebastianraschka.com/IPython_htmls/cython_least_squares.html)."
]
},
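As a quick sanity check of the formulas above, the following minimal sketch evaluates $a$ and $b$ for four points that lie exactly on $y = 2x + 1$. The helper name `slope_intercept` and the sample data are assumptions chosen for illustration; they are not part of the benchmark.

```python
def slope_intercept(x, y):
    """Return (a, b) of the least-squares line y = a*x + b."""
    n = len(x)
    x_avg = sum(x) / n
    y_avg = sum(y) / n
    # S_xy and sigma_x^2 as defined above (plain sums, no 1/n factor)
    s_xy = sum((x_i - x_avg) * (y_i - y_avg) for x_i, y_i in zip(x, y))
    var_x = sum((x_i - x_avg) ** 2 for x_i in x)
    a = s_xy / var_x
    b = y_avg - a * x_avg
    return a, b

# hypothetical sample data on the line y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
a, b = slope_intercept(x, y)
print(a, b)  # prints: 2.0 1.0
```

Here $S_{xy} = 10$ and $\sigma_x^2 = 5$, so the slope is $10/5 = 2$ and the intercept is $6 - 2 \cdot 2.5 = 1$.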
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"**First, the implementation in Python (CPython)**:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def py_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
"\n",
" x_avg = sum(x)/len(x)\n",
" y_avg = sum(y)/len(y)\n",
" var_x = 0\n",
" cov_xy = 0\n",
" for x_i, y_i in zip(x,y):\n",
" temp = (x_i - x_avg)\n",
" var_x += temp**2\n",
" cov_xy += temp*(y_i - y_avg)\n",
" slope = cov_xy / var_x\n",
" y_interc = y_avg - slope*x_avg\n",
" return (slope, y_interc)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"**And now, adding type definitions and compiling the code via Cython**:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%load_ext cythonmagic"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%cython\n",
"\n",
"def cy_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" cdef double x_avg, y_avg, temp, var_x, cov_xy, slope, y_interc, x_i, y_i\n",
" x_avg = sum(x)/len(x)\n",
" y_avg = sum(y)/len(y)\n",
" var_x = 0\n",
" cov_xy = 0\n",
" for x_i, y_i in zip(x,y):\n",
" temp = (x_i - x_avg)\n",
" var_x += temp**2\n",
" cov_xy += temp*(y_i - y_avg)\n",
" slope = cov_xy / var_x\n",
" y_interc = y_avg - slope*x_avg\n",
" return (slope, y_interc)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"**A small visual proof of concept that our least squares fit works as intended:**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from matplotlib import pyplot as plt\n",
"\n",
"import timeit\n",
"import random\n",
"random.seed(12345)\n",
"\n",
"n = 500\n",
"x = [x_i*random.randrange(8,12)/10 for x_i in range(n)]\n",
"y = [y_i*random.randrange(10,14)/10 for y_i in range(n)]\n",
"\n",
"slope, intercept = cy_lstsqr(x, y)\n",
"\n",
"line_x = [round(min(x)) - 1, round(max(x)) + 1]\n",
"line_y = [slope*x_i + intercept for x_i in line_x]\n",
"\n",
"plt.figure(figsize=(8,8))\n",
"plt.scatter(x,y)\n",
"plt.plot(line_x, line_y, color='red', lw='2')\n",
"\n",
"plt.ylabel('y')\n",
"plt.xlabel('x')\n",
"plt.title('Linear regression via least squares fit')\n",
"\n",
"ftext = 'y = ax + b = {:.3f} + {:.3f}x'\\\n",
" .format(slope, intercept)\n",
"plt.figtext(.15,.8, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAgIAAAH4CAYAAAA4pIUuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xd8jdcfwPHPvZn3ZkkkEpIQYsaMhColVmLP2qMoSukw\na+8WtYqqllbt1qoaLapmVa0aNWtTIoIMIzcy7v3+/gj3Z8SIJG7Ieb9eeVXu8zznfM9zb3O+93nO\nc45GRARFURRFUbIlraUDUBRFURTFclQioCiKoijZmEoEFEVRFCUbU4mAoiiKomRjKhFQFEVRlGxM\nJQKKoiiKko2pREDJ8nbs2EHRokUtHcYrq0SJEvzxxx8vtc7333+fTz/99IWO9fPzY/PmzRkcUfb2\n888/4+vri7OzM4cOHbLIZ0LJujRqHgElq/Dz82POnDnUqFHD0qEoFpQ/f37mzJlD9erVM6X8bdu2\n0b59ey5dupQp5WdF/v7+TJ06lQYNGjy2beTIkZw9e5aFCxdaIDIlK1BXBJQsQ6PRoNFoLB2GmdFo\nzJB9npeIoPJy5b6M+myJCP/99x8BAQEZUp7y+lGJgJLlbdu2DV9fX/Pvfn5+TJ48mdKlS5MjRw5a\ntWpFQkKCefsvv/xCmTJlcHV1pVKlShw5csS8bfz48RQsWBBnZ2eKFy/OqlWrzNvmzZtHpUqV6NOn\nD+7u7owaNeqxWEaOHEmzZs1o3749Li4uzJ8/n5s3b9K5c2fy5MmDj48Pw4YNw2QyAWAymejbty8e\nHh4UKFCAGTNmoNVqzdurVq3K0KFDqVSpEg4ODpw/f55///2X0NBQcubMSdGiRVm+fLm5/nXr1lG8\neHGcnZ3x8fFh8uTJANy4cYP69evj6upKzpw5qVKlykPn6/6l9oSEBHr16oW3tzfe3t707t2bxMRE\n83n28fFhypQpeHp6kidPHubNm5fqe7J06VLKlSv30GtffPEFjRo1AqBjx44MGzYMgJiYGOrXr0+u\nXLlwc3OjQYMGhIeHp1ruo0TE/J65u7vTsmVLYmJizNubN29O7ty5yZEjByEhIRw/fvyJ52rKlCkY\nDAbq1KnDlStXcHJywtnZmatXrz5W75POM8DEiRPN7/X333+PVqvl3LlzQMr7OWfOHPO+8+bNo3Ll\nyubfP/74Y/LmzYuLiwvBwcH8+eef5m1p/WydOXOGkJAQcuTIgYeHB61atXqsHQkJCTg5OWE0Gild\nujSFChUC/v+Z2LBhA+PGjWPp0qU4OTkRGBj4XO+L8poRRcki/Pz8ZPPmzY+9vnXrVvHx8Xlovzfe\neEMiIiIkOjpaihUrJt98842IiBw4cEBy5cole/fuFZPJJPPnzxc/Pz9JTEwUEZHly5dLRESEiIgs\nXbpUHBwc5OrVqyIiMnfuXLG2tpYZM2aI0WiU+Pj4x2IZMWKE2NjYyOrVq0VEJD4+Xho3bizdu3cX\ng8Eg165dk/Lly8usWbNEROTrr7+WgIAACQ8Pl5iYGKlRo4ZotVoxGo0iIhISEiL58uWT48ePi9Fo\nlNjYWPHx8ZF58+aJ0WiUgwcPiru7u5w4cUJERLy8vOTPP/8UEZHY2Fg5cOCAiIgMHDhQunfvLsnJ\nyZKcnGze59HzOmzYMHnzzTfl+vXrcv36dalYsaIMGzbMfJ6tra1lxIgRkpycLOvWrRO9Xi+xsbGP\nnQeDwSBOTk5y+vRp82vBwcGydOlSERHp2LGjudyoqChZuXKlxMfHy+3bt6V58+bSuHHj1D8Ej8Q7\ndepUefPNNyU8PFwSExOlW7du0rp1a/O+c+fOlTt37khiYqL06tVLypQpY972pHO1bdu2hz5PqXnS\nsevXrxdPT085duyYxMXFSevWrUWj0cjZs2dFRKRq1aoyZ86ch+J76623zL8vWrRIoqOjxWg0yuTJ\nk8XLy0sSEhJEJO2frVatWsnYsWNFRCQhIUF27tz5xP
Y8GOOj53jkyJHSvn37p54P5fWmrggor6SP\nPvoILy8vXF1dadCgAYcOHQJg9uzZdOvWjXLlyqHRaHjnnXews7Nj165dADRr1gwvLy8AWrRoQaFC\nhdizZ4+53Dx58tCzZ0+0Wi329vap1l2xYkUaNmwIwM2bN1m/fj1ffPEFOp0ODw8PevXqxZIlSwBY\ntmwZvXr1Ik+ePOTIkYNBgwY9dPlfo9HQsWNHihUrhlarZcOGDeTPn58OHTqg1WopU6YMTZs2Zdmy\nZQDY2tpy7Ngxbt26hYuLi/kbnK2tLREREVy4cAErKysqVaqUauw//PADw4cPx93dHXd3d0aMGPHQ\nvWEbGxuGDx+OlZUVderUwdHRkZMnTz5Wjk6no1GjRvz4448AnD59mpMnT5rPC2Bup5ubG02aNMHe\n3h5HR0cGDx7M9u3bU39jHzFr1iw+/fRT8uTJg42NDSNGjGDFihXmb8UdO3bEwcHBvO2ff/7h9u3b\nTz1X8hy3X5507LJly3j33XcJCAhAr9enetXoadq2bYurqytarZY+ffqQkJDw0PlNy2fL1taWCxcu\nEB4ejq2tLRUrVkxTLPeJuiWV7alEQHkl3e/MIaVTunPnDgAXL15k8uTJuLq6mn8uX75MREQEAAsW\nLCAwMNC87ejRo0RFRZnLevAWxJP4+PiY/33x4kWSkpLInTu3uczu3btz/fp1ACIiIh4q88FjU6vz\n4sWL7Nmz56H4f/jhByIjIwH46aefWLduHX5+flStWpXdu3cD0L9/fwoWLEhYWBj+/v58/vnnqcZ+\n5coV8uXLZ/49b968XLlyxfx7zpw50Wr//2dBr9ebz+2j2rRpY04EfvjhB3Nn/yiDwUC3bt3w8/PD\nxcWFkJAQbt68+Vydz4ULF2jSpIn5XAQEBGBtbU1kZCRGo5GBAwdSsGBBXFxcyJ8/PxqNhhs3bjz1\nXD2PJx376PuZN2/e5y4TYNKkSQQEBJAjRw5cXV25efOmOV5I22drwoQJiAjly5enRIkSzJ07N02x\nKMp91pYOQFEywv1Bhnnz5mXIkCEMHjz4sX0uXrzIe++9x5YtW3jzzTfRaDQEBgY+9g39WfU8uI+v\nry92dnZERUU91IHelzt37odGp6c2Uv3B8vLmzUtISAgbN25Mtf7g4GBWrVqF0Wjkyy+/pEWLFvz3\n3384OjoyadIkJk2axLFjx6hevTrly5enWrVqDx2fJ08eLly4QLFixQD477//yJMnz1Pb/CQ1a9bk\n+vXr/PPPPyxZsoSpU6em2q7Jkydz6tQp9u7dS65cuTh06BBly5ZFRJ55vvPmzcvcuXN58803H9u2\ncOFC1qxZw+bNm8mXLx+xsbG4ubmZ388nnavnGZD6pGNz587Nf//9Z97vwX8DODg4EBcXZ/79wfEH\nO3bsYOLEiWzZsoXixYsDPBTvg+cMnv3Z8vT0ZPbs2QDs3LmTmjVrEhISQoECBZ7ZvgdlpQG6imWo\nKwJKlpKYmMjdu3fNP887cvr+H9OuXbvyzTffsHfvXkSEuLg4fv31V+7cuUNcXBwajQZ3d3dMJhNz\n587l6NGjaYrv0W+xuXPnJiwsjD59+nD79m1MJhNnz541P6PdokULpk2bxpUrV4iNjeXzzz9/7A/v\ng2XWr1+fU6dOsWjRIpKSkkhKSmLfvn38+++/JCUlsXjxYm7evImVlRVOTk5YWVkBKQMkz5w5g4jg\n7OyMlZVVqp1H69at+fTTT7lx4wY3btxg9OjRtG/fPk3n4D4bGxuaN29Ov379iImJITQ09KE23W/X\nnTt30Ol0uLi4EB0dnabL6d27d2fw4MHmDvf69eusWbPGXK6dnR1ubm7ExcU9lPw97Vx5enoSFRXF\nrVu3Uq3zace2aNGCefPmceLECQwGw2NtKVOmDCtXriQ+Pp4zZ84wZ84c8/t9+/ZtrK2tcXd3JzEx\nkdGjRz8xBnj2Z2
v58uVcvnwZgBw5cqDRaFJ9z5/Fy8uLCxcuqNsD2ZhKBJQspW7duuj1evPPqFGj\nnvlY4YPbg4KC+Pbbb/nggw9w
"text": [
"<matplotlib.figure.Figure at 0x10699bb10>"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"**Eventually, the benchmark:**:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"import random\n",
"random.seed(12345)\n",
"\n",
"funcs = ['py_lstsqr', 'cy_lstsqr']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" x = [x_i*random.randrange(8,12)/10 for x_i in range(n)]\n",
" y = [y_i*random.randrange(10,14)/10 for y_i in range(n)]\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(x,y)' %f, \n",
" 'from __main__ import %s, x, y' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 13
},
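The timing call above follows a common `timeit` pattern: each statement is executed `number` times per trial, the trial is repeated `repeat` times, and the minimum is kept because it is the trial least disturbed by other processes. The sketch below shows the same pattern on a stand-in workload. The function `toy_sum` and its input are hypothetical, and the `globals` argument (available since Python 3.5, so newer than the Python 3.4 used for this notebook) is used here instead of the `from __main__ import ...` setup string.

```python
import timeit

def toy_sum(values):
    """Stand-in workload (hypothetical); not one of the benchmarked functions."""
    total = 0.0
    for v in values:
        total += v
    return total

data = [float(i) for i in range(1000)]

# repeat() returns one total time in seconds per trial of `number` calls
timer = timeit.Timer('toy_sum(data)',
                     globals={'toy_sum': toy_sum, 'data': data})
trials = timer.repeat(repeat=3, number=1000)

# with number=1000, the total in seconds equals milliseconds per call
best_ms_per_call = min(trials)
print('best of 3: {:.4f} ms per call'.format(best_ms_per_call))
```

Because each trial runs the statement 1000 times, the total time in seconds is numerically the same as the average time per call in milliseconds, which is why the plots label the axis in milliseconds.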
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('py_lstsqr', 'regular Python (CPython)'), \n",
" ('cy_lstsqr', 'Cython implementation')]\n",
"\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"max_perf = max( py/nu for py,nu in zip(times_n['py_lstsqr'],\n",
" times_n['cy_lstsqr']) )\n",
"min_perf = min( py/nu for py,nu in zip(times_n['py_lstsqr'],\n",
" times_n['cy_lstsqr']) )\n",
"ftext = 'Using Cython is {:.2f}x to '\\\n",
" '{:.2f}x faster than regular (C)Python'\\\n",
" .format(min_perf, max_perf)\n",
"plt.figtext(.15,.8, ftext, fontsize=11, ha='left')\n",
"plt.title('Performance of least square fit implementations in Cython and (C)Python')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAnIAAAIECAYAAACdVcNJAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3XdYFFf3B/DvLB2k4yIg0gVBRWONhaIYY0mssRdsMZpo\nNOaNvkYFVGKLNWLeiL3GjgajWLE3xIqKIkJEBY1dRGnn98f+GFhYYEHYBTyf5/FJ9u7MnTNnyl7u\n3JkRiIjAGGOMMcYqHYm6A2CMMcYYY6XDDTnGGGOMsUqKG3KMMcYYY5UUN+QYY4wxxiopbsgxxhhj\njFVS3JBjjDHGGKukuCGnBpmZmRg6dCgsLCwgkUhw/PhxdYdUKW3btg1OTk7Q1NTE0KFDFU7j7++P\ndu3aqTgyln/bHDt2DBKJBA8fPixxXQ4ODvjll1/KIcqC7O3tERwcrJJlVVQSiQSbNm1Sdxjw8fHB\n119/re4wilVV95mEhARIJBKcPn26yOmICI0aNcK2bdtKVH9WVhbq1KmDffv2fUiYJVJVfw+4IVcI\nf39/SCQSSCQSaGlpwd7eHqNGjcKzZ88+uO4dO3Zg8+bNCA8PR3JyMj799NMyiPjjkpWVhaFDh6JP\nnz64f/8+Fi9erHA6QRAgCIJKY9uwYQMkko/30FK0bVq0aIHk5GRYWVkBAE6ePAmJRIJ//vmn2Pqi\noqIwfvz48g4bgHr2lw+RlJRU6j8G/fz8MGTIkALlycnJ6NGjR1mE90HCwsKwYMGCMqkrJiYGAwcO\nRM2aNaGrqwt7e3t0794dkZGRStcxc+ZMODg4FCivbPtMWdu0aRPev3+Pr776Sq68uJxraGjg559/\nxsSJE+Xmi4yMFH97JRIJLCws0KZNG5w8eVLpmAo7v1TVbfXx/toowcvLC8nJyUhMTMSSJUuwc+dO\nDBo0qNT1paenAwDu3LkDGxsbNG/eHFKpFFpaWh9U38fo4cOHSE1NRYcOHWBlZQVDQ0OF0xER+JnX\nJZednY3s7OxSzato22hpaUEqlRY4iSqzbczNzaGnp1eqWD4WZbmPS6VS6OjolFl9pWViYoJq1ap9\ncD0RERFo3LgxkpOTsXLlSty8eRPh4eFo3rw5vvnmmzKI9OO2aNEiDBs2TK5M2Zz36NEDiYmJOHr0\naIF6L126hOTkZBw+fBh6enro0KEDEhMTSxRb/uOiyv4WEFNo8ODB5OfnJ1cWHBxMGhoa9O7dOyIi\n2rx5M3l6epKuri7Z29vTDz/8QKmpqeL03t7eNGzYMJoyZQpZWVlRjRo1yMfHhwRBEP85ODgQEVF6\nejpNnDiRbGxsSFtbm9zd3WnTpk1yyxcEgZYsWUJ9+/YlY2Nj6t27N61evZo0NTXp6NGjVLduXdLT\n0yNfX1969OgRHTlyhDw9PcnAwID8/PzowYMHYl3x8fHUrVs3sra2Jn19fapXrx6tX79ebnne3t40\nfPhwmj59OtWoUYPMzMxo0KBB9ObNG7np/vzzT/rkk09IV1eXzM3NqUOHDvT8+XPx+yVLlpCrqyvp\n6uqSi4sLBQcHU2ZmZpH5P3PmDLVu3Zr09PTI1NSU+vXrR48fPyYiotWrV8vlUBAEOnbsmNLbsbjt\nduDAAfL29iYzMzMyNjYmb29vOn/+vFwdoaGh5ObmRrq6umRmZkZeXl6UlJRER48eLRDbkCFDCl3P\n4OBgcnR0JB0dHapevTq1b9+e0tLS5HJnY2ND+vr61L59e1q7di0JgiBuy5ztn9f9+/cL5GT48OHk\n5OREenp65OjoSJMnT6b379+L3wcEBJCzszNt2bKFXF1dSVNTk27dukWvX7+msWPHijE0bNiQdu7c\nWej6FLZtcvLy4MEDunfvXoFpfH19C63Tzs6OZs6cKfd56tSp9M0335CxsTFZWlrSsmXLKC0tjUaP\nHk2mpqZkY2NDS5culatHEARavHgxde/enQwMDMjGxo
YWL14sN429vT0FBweLn9PT0ykgIIAcHBxI\nV1eXPDw86I8//ihQ72+//Ua9evUiAwMDsrOzo507d9KzZ8+oT58+ZGhoSI6OjrRjxw65+ZKTk2nw\n4MFUvXp1MjQ0pJYtW9Lx48fF73NydvDgQWrdujXp6+uTu7s77du3T27Zis4nxR3fgwcPLvQYEgSB\nNm7cKE778OFD6t27N5mYmJCenh75+PhQVFRUieIkKn5fzy/n/JP/c3Hno7xSU1NJKpVSx44dFX7/\n4sULMR+fffZZge99fX1p2LBhtGbNmgL5CgoKIiLZPjNt2jQaO3YsmZmZkaWlJY0fP17uHKfs+X3Z\nsmU0YMAAMjQ0pJo1a9KsWbMKXbccyh7bu3fvJldXVzIwMCAfHx+6c+eOXD1btmwhJycn0tXVpRYt\nWtDu3btJEAQ6depUocuOjY0lQRDo3r17YpmyOc/Rq1cvGjp0qPg577kix4MHD0gQBFq+fDmtXr2a\nTExM6O3bt3L1BAUFkYuLS5Hnl5zfgz/++INq1apFRkZG9OWXX1JKSopcXWvWrKE6deqQtrY21axZ\nk6ZMmSK3PUuzL5YnbsgVYvDgwdSuXTu5svnz55MgCPTmzRtavXo1mZqa0oYNG+jevXt0/Phxql+/\nPg0cOFCc3tvbmwwNDWnUqFF08+ZNun79Oj179ox+/PFHcnBwoJSUFPr333+JiOjHH38kc3Nz2r59\nO925c4d++eUXkkgkdPjwYbE+QRDI3NycQkJCKD4+nu7cuUOrV68miURCvr6+dP78eYqOjiYXFxdq\n1aoVeXl50blz5+jy5cvk5uZGvXv3Fuu6du0ahYSE0NWrVyk+Pp5+++03sUGYN34TExP64YcfKDY2\nlg4cOEBmZmY0depUcZpVq1aRlpYWzZw5U1zHpUuXiusVEBBAdnZ2FBYWRgkJCfT3339TrVq15OrI\n79GjR2RoaEj9+/en69ev08mTJ6l+/frk5eVFRERpaWl04cIFEgSB/vrrL0pJSaH09PRCt2Pehpwy\n223Xrl20bds2un37Nt24cYOGDx9OZmZm9PTpUyIiioqKIk1NTVq/fj39888/dO3aNVq5ciUlJSVR\neno6hYSEkCAIlJKSQikpKfTq1SuFse3YsYOMjIwoPDyc7t+/T5cvX6bFixeLP25hYWGkqalJCxcu\npDt37tDKlStJKpWSRCIpUUMuOzubfv75Zzp//jwlJibSnj17yMrKigICAsR5AgICSF9fn3x8fOj8\n+fN0584dev36Nfn4+JCvry+dOnWK7t27R8uXLydtbW25/TKvwrZN3pNzVlYW7dmzhwRBoKioKEpJ\nSZFr+OeXv3FlZ2dHJiYmtHDhQrp79y7NnDmTJBIJtW/fXiybNWsWSSQSunHjhjifIAhkZmZGS5cu\npTt37tDixYtJU1OTdu/eXeiyBg8eTJ6ennTw4EFKSEigLVu2kImJCa1cuVKu3ho1atC6devo7t27\nNHr0aDIwMKDPPvuM1q5dS3fv3qUxY8aQgYGBuA+9ffuW6tSpQz179qSLFy/S3bt3KTg4mHR0dOjm\nzZtElPuD5unpSRERERQXF0dDhgwhIyMjMV+XLl0iQRBo165dcueT4o7vly9fkpeXF/Xp00fcT3OO\nobwNuezsbGratCk1bNiQTp06RdeuXaPevXuTqampuCxl4ixuX1fEx8eHRowYIX5W5nyU365du4pt\njBDJ/nCUSCRyDZI7d+6QRCKh8+fPU1paGk2aNIlsbW3FfOX88WdnZ0empqY0Z84ciouLo61bt5KW\nlpbcPqLs+d3S0pJWrFhB8fHx4nmksGONSPlj28DAgDp06EDR0dF05coVatSoEbVu3VqcJjo6mjQ0\nNGjy5Ml0+/Zt2rlzJ9nb2xebuz/++IOqV69eqpznmD9/Ptnb24ufFTXknj59SoIgUEhICKWlpZGp\nqSmtXbtW/D4rK4
vs7Oxo7ty5RZ5fBg8eTMbGxtSvXz+KiYmhM2fOkIODg9z5Pzw8nDQ0NGj27Nl0\n584d2rJlC5mamsrtZ6XZF8sT
"text": [
"<matplotlib.figure.Figure at 0x10ca5b4d0>"
]
}
],
"prompt_number": 14
},
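The annotation in the plot reports the smallest and largest speed-up across all sample sizes, computed as the element-wise ratio of the two timing lists. A minimal sketch of that calculation follows; the timings below are made-up numbers chosen for easy arithmetic, not measured results.

```python
# hypothetical timings in ms per call for three sample sizes (not measured)
py_times = [0.5, 4.0, 40.0]
cy_times = [0.25, 1.0, 8.0]

# element-wise speed-up of the second implementation over the first
speedups = [py / cy for py, cy in zip(py_times, cy_times)]

min_perf = min(speedups)
max_perf = max(speedups)
print('{:.2f}x to {:.2f}x faster'.format(min_perf, max_perf))
# prints: 2.00x to 5.00x faster
```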
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"numba\"></a>\n",
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Numba vs. Cython vs. regular (C)Python & Numpy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Numba is using the [LLVM compiler infrastructure](http://llvm.org) for compiling Python code to machine code. Its strength is to work with NumPy arrays to speed-up the code. If you want to read more about Numba, please see refer to the original [website and documentation](http://numba.pydata.org/numba-doc/0.13/index.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, we implement a linear regression via least squares fitting (with vertical offsets) by solving to fit *n* points $(x_i, y_i)$ with $i=1,2,...n,$ via linear equation of the form \n",
"$f(x) = a\\cdot x + b$. \n",
"\n",
"\n",
"\n",
"$\\Rightarrow \\pmb a = (\\pmb X^T \\; \\pmb X)^{-1} \\pmb X^T \\; \\pmb y$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I have described the approach in more detail in this [IPython notebook](http://sebastianraschka.com/IPython_htmls/cython_least_squares.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Matrix equation "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to obtain the parameters for the linear regression line for a set of multiple points, we can re-write the problem as matrix equation \n",
"$\\pmb X \\; \\pmb a = \\pmb y$ \n",
"\n",
"$\\Rightarrow\\Bigg[ \\begin{array}{cc}\n",
"x_1 & 1 \\\\\n",
"... & 1 \\\\\n",
"x_n & 1 \\end{array} \\Bigg]$\n",
"$\\bigg[ \\begin{array}{c}\n",
"a \\\\\n",
"b \\end{array} \\bigg]$\n",
"$=\\Bigg[ \\begin{array}{c}\n",
"y_1 \\\\\n",
"... \\\\\n",
"y_n \\end{array} \\Bigg]$ \n",
"\n",
"With a little bit of calculus, we can rearrange the term in order to obtain the parameter vector \n",
"$\\pmb a = [a\\;b]^T$ \n",
"\n",
"We will implement this matrix equation in \n",
"- Python/CPython: `py_mat_lstsqr()` \n",
"- Numba: `numba_mat_lstsqrs()` \n",
"- Cython: `cy_mat_lstsqr()`"
]
},
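For a single predictor, the normal equations $\pmb X^T \pmb X \, \pmb a = \pmb X^T \pmb y$ reduce to a 2×2 system that can be solved in closed form. The following sketch spells this out in plain Python so that each matrix entry is visible. The function name `normal_equations_fit` and the sample data are assumptions for illustration; the benchmarked implementations use NumPy instead.

```python
def normal_equations_fit(x, y):
    """Solve (X^T X) a = X^T y for X = [x, 1]; returns (slope, intercept)."""
    n = len(x)
    # entries of the 2x2 matrix X^T X = [[sxx, sx], [sx, n]]
    sxx = sum(x_i * x_i for x_i in x)
    sx = sum(x)
    # entries of the vector X^T y = [sxy, sy]
    sxy = sum(x_i * y_i for x_i, y_i in zip(x, y))
    sy = sum(y)
    # closed-form inverse of the 2x2 matrix, applied to X^T y
    det = sxx * n - sx * sx
    slope = (n * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, intercept

# hypothetical sample data on the line y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
print(normal_equations_fit(x, y))  # prints: (2.0, 1.0)
```

For this data, $\pmb X^T \pmb X$ has the entries 30, 10, 10, 4 and $\pmb X^T \pmb y$ is $[70, 24]^T$, so the determinant is 20, the slope is $40/20 = 2$, and the intercept is $20/20 = 1$.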
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### \"Classic\" approach "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the more \"classic\" approach that is often found in statistics textbooks, we calculate the following parameters as follows:\n",
"\n",
"$a = \\frac{S_{x,y}}{\\sigma_{x}^{2}}\\quad$ (slope)\n",
"\n",
"$b = \\bar{y} - a\\bar{x}\\quad$ (y-axis intercept)\n",
"\n",
"where \n",
"\n",
"\n",
"$S_{xy} = \\sum_{i=1}^{n} (x_i - \\bar{x})(y_i - \\bar{y})\\quad$ (covariance)\n",
"\n",
"\n",
"$\\sigma{_x}^{2} = \\sum_{i=1}^{n} (x_i - \\bar{x})^2\\quad$ (variance)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will implement this \"classic\" approach in\n",
"- Python/CPython: `py_lstsqr()` \n",
"- Numba: `numba_lstsqrs()` \n",
"- Cython: `cy_lstsqrs()` "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"import scipy.stats\n",
"from numba import jit"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%load_ext cythonmagic"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Matrix equation:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def py_mat_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" X = np.vstack([x, np.ones(len(x))]).T\n",
" return (np.linalg.inv(X.T.dot(X)).dot(X.T)).dot(y)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 59
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"@jit\n",
"def numba_mat_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" X = np.vstack([x, np.ones(len(x))]).T\n",
" return (np.linalg.inv(X.T.dot(X)).dot(X.T)).dot(y)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 60
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%cython\n",
"def cy_mat_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" X = np.vstack([x, np.ones(len(x))]).T\n",
" return (np.linalg.inv(X.T.dot(X)).dot(X.T)).dot(y)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### \"Classic\" approach:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def py_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" x_avg = sum(x)/len(x)\n",
" y_avg = sum(y)/len(y)\n",
" var_x = 0\n",
" cov_xy = 0\n",
" for x_i, y_i in zip(x,y):\n",
" temp = (x_i - x_avg)\n",
" var_x += temp**2\n",
" cov_xy += temp*(y_i - y_avg)\n",
" slope = cov_xy / var_x\n",
" y_interc = y_avg - slope*x_avg\n",
" return (slope, y_interc)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 61
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"@jit\n",
"def numba_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" x_avg = sum(x)/len(x)\n",
" y_avg = sum(y)/len(y)\n",
" var_x = 0\n",
" cov_xy = 0\n",
" for x_i, y_i in zip(x,y):\n",
" temp = (x_i - x_avg)\n",
" var_x += temp**2\n",
" cov_xy += temp*(y_i - y_avg)\n",
" slope = cov_xy / var_x\n",
" y_interc = y_avg - slope*x_avg\n",
" return (slope, y_interc)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 62
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%cython\n",
"def cy_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" cdef double x_avg, y_avg, temp, var_x, cov_xy, slope, y_interc, x_i, y_i\n",
" x_avg = sum(x)/len(x)\n",
" y_avg = sum(y)/len(y)\n",
" var_x = 0\n",
" cov_xy = 0\n",
" for x_i, y_i in zip(x,y):\n",
" temp = (x_i - x_avg)\n",
" var_x += temp**2\n",
" cov_xy += temp*(y_i - y_avg)\n",
" slope = cov_xy / var_x\n",
" y_interc = y_avg - slope*x_avg\n",
" return (slope, y_interc)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 63
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NumPy and SciPy libraries:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def numpy_lstsqr(x, y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" X = np.vstack([x, np.ones(len(x))]).T\n",
" return np.linalg.lstsq(X,y)[0]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 70
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def scipy_lstsqr(x,y):\n",
" \"\"\" Computes the least-squares solution to a linear matrix equation. \"\"\"\n",
" return scipy.stats.linregress(x, y)[0:2]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 71
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Verifying that the different approaches yield the same results"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import random\n",
"random.seed(12345)\n",
"\n",
"n = 500\n",
"x = [x_i*random.randrange(8,12)/10 for x_i in range(n)]\n",
"y = [y_i*random.randrange(10,14)/10 for y_i in range(n)]\n",
"\n",
"np.testing.assert_array_almost_equal(\n",
" py_lstsqr(x, y), py_mat_lstsqr(x, y), decimal=6)\n",
"np.testing.assert_array_almost_equal(\n",
" numpy_lstsqr(x,y), py_lstsqr(x, y), decimal=6)\n",
"np.testing.assert_array_almost_equal(\n",
" scipy_lstsqr(x,y), py_lstsqr(x, y), decimal=6)\n",
"\n",
"print('ok')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"ok\n"
]
}
],
"prompt_number": 80
},
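The checks above compare a few pairs of implementations by hand. One way to extend them to every benchmarked function is a small loop that compares each result against a reference. This is only a sketch in plain Python: `assert_all_agree`, `fit_a`, and `fit_b` are hypothetical names, and the two toy functions merely stand in for the implementations defined earlier.

```python
import math

def assert_all_agree(funcs, x, y, tol=1e-6):
    """Check that every function returns the same (slope, intercept) as the first."""
    ref_slope, ref_interc = funcs[0](x, y)
    for f in funcs[1:]:
        slope, interc = f(x, y)
        assert math.isclose(slope, ref_slope, abs_tol=tol), f.__name__
        assert math.isclose(interc, ref_interc, abs_tol=tol), f.__name__

def fit_a(x, y):
    """Toy stand-in: least squares via centered sums."""
    x_avg, y_avg = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((x_i - x_avg) ** 2 for x_i in x)
    cov_xy = sum((x_i - x_avg) * (y_i - y_avg) for x_i, y_i in zip(x, y))
    slope = cov_xy / var_x
    return slope, y_avg - slope * x_avg

def fit_b(x, y):
    """Toy stand-in: least squares via raw sums."""
    n, sx, sy = len(x), sum(x), sum(y)
    sxx = sum(x_i * x_i for x_i in x)
    sxy = sum(x_i * y_i for x_i, y_i in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

# hypothetical sample data
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
assert_all_agree([fit_a, fit_b], x, y)
print('ok')
```

In the notebook, the list passed to `assert_all_agree` would hold the functions defined in the cells above, with the first entry serving as the reference.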
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visual checking the least square fit"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%pylab inline"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from matplotlib import pyplot as plt\n",
"\n",
"slope, intercept = py_mat_lstsqr(x, y)\n",
"\n",
"line_x = [round(min(x)) - 1, round(max(x)) + 1]\n",
"line_y = [slope*x_i + intercept for x_i in line_x]\n",
"\n",
"plt.figure(figsize=(7,6))\n",
"plt.scatter(x,y)\n",
"plt.plot(line_x, line_y, color='red', lw='2')\n",
"\n",
"plt.ylabel('y')\n",
"plt.xlabel('x')\n",
"plt.title('Linear regression via least squares fit')\n",
"\n",
"ftext = 'y = ax + b = {:.3f} + {:.3f}x'\\\n",
" .format(slope, intercept)\n",
"plt.figtext(.15,.8, ftext, fontsize=11, ha='left')\n",
"\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAdQAAAGQCAYAAAAX7pEWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xl4TNf/wPH3zGSbySabLbGEiCzErpYiKEVbpaVqD6qr\nVrXWtvZqS1v90mpRFKW0iKVaamnsRWyx7xHEFktC9mTm8/sjMT+RRCIZBOf1PHnq3nPuWe7czmfu\nveeeqxERQVEURVGUQtE+6gYoiqIoypNABVRFURRFsQAVUBVFURTFAlRAVRRFURQLUAFVURRFUSxA\nBVRFURRFsQAVUBWCg4N58803H3UznjobNmxAq9Vy4cKFR90Uzpw5g1arZdu2bYUqJyQkhBYtWlio\nVYolpKen07t3b9zd3dFqtWzcuFF9Tg+I1aNugPLghYSEEB0dzdq1a3NMX7ZsGVZW6lB42Bo2bMil\nS5fw8PB41E2hbNmyXLp0CVdX10KVo9Fo0Gg0FmpV/sybN48ePXpgMpkear2PiyVLlrBgwQLCwsKo\nUKECLi4u1KxZM8v+euONNzh16hRhYWGPsKWPP/Ut+hTI60uuWLFiD7E195aamoqNjU2e+W5/GWi1\nD+YiS37bURjW1tYUL178gdaRX1qt1iJtERHUXDH370EezydOnMDT05N69eqZ11lbW1u8HkVd8n0q\n5PUlFxwcTN++fbMtjx07llKlSuHm5kbPnj1JSEjIst3ChQupXr06er0eb29vPv74YxITE83pa9eu\nJTg4GDc3N4oVK0ZwcDDh4eFZytBqtXz//fd06dKFYsWK0bNnzxzbOGrUKCpVqsQff/yBn58ftra2\nnDhxgvj4ePr374+Xlxf29vbUrFmTpUuXZtl279691KtXD71ej5+fH6GhoZQvX55x48bl2Y61a9fS\nsGFDDAYDXl5e9O7dm+vXr5u3O3ToEM8//zwuLi44ODgQEBDAvHnzzOkzZszA398fvV6Pm5sbTZo0\nITo6Gsj5ku/27dtp3LgxBoMBV1dXunbtSkxMTLb9sGLFCvz8/HBwcKBp06acPHkyl08Xfv75Z4oV\nK0ZKSkqW9ePHj6dcuXJAzpd8P/30UwICArC3t6ds2bK888473Lx5M9d6cmOJ4yS3/bhhwwZ69OgB\nZHyGWq2W3r1759qWL774gooVK2JnZ0fx4sVp1aoVycnJ5vTvv//efCy1atWKuXPnZvmMZs+enS0Y\nnT9/Hq1Wy6ZNm8zr+vbti4+PDwaDgYoVK/Lpp5+SmppqTi/M8ZxXH+4UHBzMiBEjOH36NFqtlgoV\nKgBZL82PGjWKWbNmsXHjRvM+nDt3bq77ULkHUZ54PXv2lOeeey7X9ODgYOnbt695uUmTJlKsWDH5\n6KOP5NixY7JmzRpxdXWV4cOHm/P88ssv4uLiIvPmzZPIyEjZtGmTBAUFSffu3c15li5dKosWLZLj\nx4/L4cOH5Y033hBXV1e5du2aOY9GoxE3NzeZMmWKnD59Wk6ePJljG0eOHCkGg0GCg4Nl586dcuLE\nCbl165YEBwdL06ZNZevWrRIZGSnTp08XGxsbWb9+vYiIJCQkSMmSJaVt27Zy4MAB2b59uzRo0EAM\nBoOMGzfunu1Yv369GAwG+eGHH+TkyZMSHh4uTZs2lSZNmpi3q1q1qnTt2lWOHDkikZGRsmrVKlm5\ncqWIiOzatUusrKzk119/lbNnz8qBAwdk5syZcv78eRERCQsLE41GI9HR0SIicvHiRXF0dJSuXbvK\nwYMHZcuWLRIUFCSNGzfOsh/s7e2ldevWsmfPHomIiJBatWpJo0aNcv184+LiRK/Xy++//55lfUBA\ngHz66aciIhIZGSkajUa2bt1qTv/8889ly5YtEhUVJevXrxc/Pz/p2bNnrvWIZD/WLHGc3Gs/pqam\nypQpU0Sj0cjly5fl8uXLcvPmzRzbtmTJEnFycpKVK1
fKuXPnZN++fTJp0iRJSkoSEZFly5aJlZWV\nfPfdd3LixAmZOXOmFC9eXLRarfkz+uWXX8TKyipLuefOnRONRiMbN24UERGTySSffvqp7Ny5U6Ki\nomTFihVSqlQpGTlypHmbgh7PefXhbtevX5eBAweKt7e3XL58Wa5evWr+nFq0aCEiIvHx8dK1a1dp\n2LCheR/mVp5ybyqgPgUKElCrV6+eJc8777wj9evXNy+XK1dOpk2bliXPxo0bRaPRSGxsbI71GI1G\ncXFxkfnz55vXaTQaeeONN/Lsw8iRI0Wr1cq5c+fM68LCwsTOzk7i4uKy5O3Vq5e0a9dORESmT58u\nDg4OWb5kjx49KhqNJltAvbsdTZo0kWHDhmVZFxUVJRqNRiIiIkRExNnZWWbPnp1jm0NDQ8XZ2TnX\nL/i7A+pnn30mZcqUkbS0NHOeiIgI0Wg0snnzZvN+sLKyMn8xioj8/vvvotVqJSUlJcd6RERef/11\neeGFF8zL4eHhotFo5Pjx4yKSc0DNqT+2tra5potkP9YscZzktR9//fVX0Wg092yXiMjEiRPF19c3\ny/69U8OGDaVbt25Z1g0cODDLZ5SfgJpb3ZUqVTIvF/R4zqsPORk5cqT4+PhkWXf359SnTx8JDg7O\nd5lKztQlXyUbjUZDtWrVsqwrVaoUly9fBiAmJoazZ88yYMAAHB0dzX9t2rRBo9GYLz9GRkbSvXt3\nKlWqhLOzM87OzsTFxXH27NksZdetWzdf7SpRogReXl7m5fDwcFJTU/H09MzSjvnz55vbcPjwYQIC\nAnB0dDRvV7ly5RzvG9/djvDwcL777rssZQcGBqLRaDhx4gQAAwcO5I033qBp06aMHj2avXv3mrdv\n2bIlFSpUwNvbm86dO/Pzzz9z7dq1XPt36NAh6tWrl2WAWFBQEM7Ozhw6dMi8rnTp0ri5uZmXS5Uq\nhYhw5cqVXMvu2bMna9as4erVqwDMnTuXZ555hkqVKuW6TWhoKI0bNzbv327dupGWlsalS5dy3eZO\nljpO7nc/5qZTp06kpaVRrlw5evXqxbx584iPjzenHzlyhAYNGmTZpmHDhvddD2RcZn/mmWcoWbIk\njo6OfPLJJ9mO+4Icz3n1QXm0VEBVcnT3gByNRmMeOHH7v5MnTyYiIsL8t3//fk6cOEGVKlUAePHF\nFzl//jw//vgjO3bsYN++fRQvXjzLvSQAe3v7fLXp7nwmkwlnZ+csbYiIiODIkSOsWrXqvvt8d/ki\nwtChQ7OVf+LECVq1agXAZ599xvHjx3nttdc4ePAg9erVY/jw4ebydu3axdKlS/H19WXq1Kn4+Piw\nZ8+eHOvXaDT5GtCT02dze3/kpkWLFri7uzN//nzS0tJYuHBhrverAXbs2MFrr71GcHAwy5YtY+/e\nvUydOhURyfb55cZSx8n97sfclC5dmqNHjzJr1iyKFy/O2LFjqVy5MufPn893GTkNGkpLS8uyvGjR\nIvr160fnzp1ZtWoV+/btY8SIEXke9/k5ni3RB+XBUQH1KWHJRxlKlChBmTJlOHr0KBUqVMj2Z2tr\ny7Vr1zhy5AhDhw6lRYsW5oEX9zqLul916tQhNjaWpKSkbG24/cs/MDCQI0eOZBlMc+zYMWJjY/Ms\nv3bt2hw8eDDHPt75Zejt7c0777zDokWLGD16ND/99JM5TavV0qhRI0aPHs3u3bspVaoUCxYsyLG+\nwMBAtm/fnuULOiIigri4OHPwKSidTkfXrl359ddfWbVqFTdv3uT111/PNf+WLVtwd3dnzJgx1KlT\nBx8fH86dO3dfdVryOLnXfrz9AyO/P0aef/55xo8fz4EDB0hMTGT58uUABAQEsHXr1iz5714uXrw4\nRqMxS/vuDuybNm2iRo0afPjhh9SoUYOKFSsSGRmZZ9vyczzn1YeCsrGxwWg0FqoMRT0289S4desW\nERERWb509Ho9lS
tXzjYK+O7lnIwbN44+ffrg4uJC27Ztsba25siRI6xevZqpU6fi4uKCh4cH06dP\np0KFCly9epXBgwej1+st1qdm
"text": [
"<matplotlib.figure.Figure at 0x10ca76cd0>"
]
}
],
"prompt_number": 45
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Benchmarking:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"import random\n",
"random.seed(12345)\n",
"\n",
"funcs = ['py_mat_lstsqr', 'numba_mat_lstsqr', 'cy_mat_lstsqr', \n",
" 'py_lstsqr', 'numba_lstsqr', 'cy_lstsqr',\n",
" 'numpy_lstsqr', 'scipy_lstsqr']\n",
"\n",
"orders_n = [10**n for n in range(1, 6)]\n",
"times_n = {f:[] for f in funcs}\n",
"\n",
"for n in orders_n:\n",
" x = np.asarray([x_i*np.random.randint(8,12)/10 for x_i in range(n)])\n",
" y = np.asarray([y_i*np.random.randint(10,14)/10 for y_i in range(n)])\n",
" for f in funcs:\n",
" times_n[f].append(min(timeit.Timer('%s(x,y)' %f, \n",
" 'from __main__ import %s, x, y' %f)\n",
" .repeat(repeat=3, number=1000)))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 66
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"labels = [('py_mat_lstsqr', 'matrix equation in reg. (C)Python & NumPy'), \n",
" ('numba_mat_lstsqr', 'matrix equation in Numba'),\n",
" ('cy_mat_lstsqr', 'matrix equation in Cython & NumPy'),\n",
" ('py_lstsqr', '\"classic\" least squares in reg. (C)Python'),\n",
" ('numba_lstsqr', '\"classic\" least squares in Numba'),\n",
" ('cy_lstsqr', '\"classic\" least squares in Cython'),\n",
" ('numpy_lstsqr', 'least squares via np.linalg.lstsq()'),\n",
" ('scipy_lstsqr', 'least_squares via scipy.stats.linregress()'),]\n",
"\n",
"\n",
"matplotlib.rcParams.update({'font.size': 12})\n",
"\n",
"fig = plt.figure(figsize=(10,8))\n",
"for lb in labels:\n",
" plt.plot(orders_n, times_n[lb[0]], alpha=0.5, label=lb[1], marker='o', lw=3)\n",
"plt.xlabel('sample size n')\n",
"plt.ylabel('time per computation in milliseconds [ms]')\n",
"plt.xlim([1,max(orders_n) + max(orders_n) * 10])\n",
"plt.legend(loc=2)\n",
"plt.grid()\n",
"plt.xscale('log')\n",
"plt.yscale('log')\n",
"\n",
"max_perf = max( py/nu for py,nu in zip(times_n['py_lstsqr'],\n",
" times_n['cy_lstsqr']) )\n",
"min_perf = min( py/nu for py,nu in zip(times_n['py_lstsqr'],\n",
" times_n['cy_lstsqr']) )\n",
"\n",
"ftext = 'Using Cython is {:.2f}x to '\\\n",
" '{:.2f}x faster than regular (C)Python'\\\n",
" .format(min_perf, max_perf)\n",
"\n",
"plt.figtext(.14,.15, ftext, fontsize=11, ha='left')\n",
"plt.title('Performance of least square fit implementations')\n",
"plt.show()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"text": [
"<matplotlib.figure.Figure at 0x10a753b90>"
]
}
],
"prompt_number": 69
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}