running cython

This commit is contained in:
rasbt 2014-06-12 09:52:01 -04:00
parent 4ebe90dceb
commit 844a1560a3
2 changed files with 626 additions and 3 deletions

@@ -21,12 +21,10 @@
- A collection of not so obvious Python stuff you should know! [[IPython nb](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/not_so_obvious_python_stuff.ipynb?create=1)]
- Python's scope resolution for variable names and the LEGB rule [[IPython nb](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1)]
- Key differences between Python 2.x and Python 3.x [[IPython nb](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/key_differences_between_python_2_and_3.ipynb?create=1)]
- A thorough guide to SQLite database operations in Python [[Markdown](./sqlite3_howto/README.md)]
- Unit testing in Python - Why we want to make it a habit [[Markdown](./tutorials/unit_testing.md)]
@@ -34,9 +32,11 @@
- Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks [[Markdown](./tutorials/installing_scientific_packages.md)]
- Sorting CSV files using the Python csv module [[IPython nb](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/sorting_csvs.ipynb)]
- Using Cython with and without IPython magic [[IPython nb](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/running_cython.ipynb)]
<br>

@@ -0,0 +1,623 @@
{
"metadata": {
"name": "",
"signature": "sha256:0e619f8592165b10fa0b3558de3c622629bfb7aab1e9544b5bad64478cf35848"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Sebastian Raschka](http://sebastianraschka.com) \n",
"last updated: 06/12/2014"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>\n",
"I would be happy to hear your comments and suggestions. \n",
"Please feel free to drop me a note via\n",
"[twitter](https://twitter.com/rasbt), [email](mailto:bluewoodtree@gmail.com), or [google+](https://plus.google.com/118404394130788869227).\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"sections\"></a>\n",
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Using Cython with and without IPython magic"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"introduction\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Bubblesort in regular (C)Python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we will write a simple implementation of the bubble sort algorithm in regular (C)Python."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def python_bubblesort(a_list):\n",
"    \"\"\" Bubblesort in Python for list objects. \"\"\"\n",
"    length = len(a_list)\n",
"    swapped = 1\n",
"    for i in range(0, length):\n",
"        if swapped:\n",
"            swapped = 0\n",
"            for ele in range(0, length-i-1):\n",
"                if a_list[ele] > a_list[ele + 1]:\n",
"                    temp = a_list[ele + 1]\n",
"                    a_list[ele + 1] = a_list[ele]\n",
"                    a_list[ele] = temp\n",
"                    swapped = 1\n",
"    return a_list"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"python_bubblesort([6,3,1,5,6])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 2,
"text": [
"[1, 3, 5, 6, 6]"
]
}
],
"prompt_number": 2
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Implemented in Cython"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Maybe we can speed things up a little bit via [Cython's C-extensions for Python](http://cython.org). Cython is basically a hybrid between C and Python and can be pictured as compiled Python code with type declarations. \n",
"Since we are working in an IPython notebook here, we can make use of the very convenient *IPython magic*: it will take care of the conversion to C code, the compilation, and finally the loading of the function. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%load_ext cythonmagic"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we will take the initial Python code as is and use Cython for the compilation. Cython is capable of inferring types automatically; however, we can make our code far more efficient by adding static types."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%cython\n",
"def cython_bubblesort_untyped(a_list):\n",
"    \"\"\" Bubblesort in Python for list objects. \"\"\"\n",
"    length = len(a_list)\n",
"    swapped = 1\n",
"    for i in range(0, length):\n",
"        if swapped:\n",
"            swapped = 0\n",
"            for ele in range(0, length-i-1):\n",
"                if a_list[ele] > a_list[ele + 1]:\n",
"                    temp = a_list[ele + 1]\n",
"                    a_list[ele + 1] = a_list[ele]\n",
"                    a_list[ele] = temp\n",
"                    swapped = 1\n",
"    return a_list"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%cython\n",
"import numpy as np\n",
"cimport numpy as np\n",
"cimport cython\n",
"@cython.boundscheck(False)\n",
"@cython.wraparound(False)\n",
"cpdef cython_bubblesort_typed(inp_ary):\n",
"    \"\"\" The Cython implementation of Bubblesort with NumPy memoryview.\"\"\"\n",
"    cdef unsigned long length, i, swapped, ele, temp\n",
"    cdef long[:] np_ary = inp_ary\n",
"    length = np_ary.shape[0]\n",
"    swapped = 1\n",
"    for i in xrange(0, length):\n",
"        if swapped:\n",
"            swapped = 0\n",
"            for ele in xrange(0, length-i-1):\n",
"                if np_ary[ele] > np_ary[ele + 1]:\n",
"                    temp = np_ary[ele + 1]\n",
"                    np_ary[ele + 1] = np_ary[ele]\n",
"                    np_ary[ele] = temp\n",
"                    swapped = 1\n",
"    return inp_ary"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Speed comparison"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below, we will do a quick speed comparison of our three implementations of the bubble sort algorithm by sorting a list (or NumPy array) of 1000 random integers. We have to make copies of the lists/NumPy arrays, since our bubble sort implementations sort in place."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import random\n",
"import numpy as np\n",
"import copy\n",
"\n",
"list_a = [random.randint(0,1000) for num in range(1000)]\n",
"list_b = copy.deepcopy(list_a)\n",
"\n",
"ary_a = np.asarray(list_a)\n",
"ary_b = copy.deepcopy(ary_a)\n",
"ary_c = copy.deepcopy(ary_a)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print('\\n(C)Python on list:')\n",
"%timeit python_bubblesort(list_a)\n",
"\n",
"print('\\n(C)Python on numpy array:')\n",
"%timeit python_bubblesort(ary_a)\n",
"\n",
"print('\\nuntyped Cython on list:')\n",
"%timeit cython_bubblesort_untyped(list_b)\n",
"\n",
"print('\\nuntyped Cython on numpy array:')\n",
"%timeit cython_bubblesort_untyped(ary_b)\n",
"\n",
"print('\\ntyped Cython with memoryview on numpy array:')\n",
"%timeit cython_bubblesort_typed(ary_c)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"(C)Python on list:\n",
"1 loops, best of 3: 332 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"(C)Python on numpy array:\n",
"1 loops, best of 3: 839 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"untyped Cython on list:\n",
"10000 loops, best of 3: 183 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"untyped Cython on numpy array:\n",
"1 loops, best of 3: 666 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"typed Cython with memoryview on numpy array:\n",
"100000 loops, best of 3: 4.05 \u00b5s per loop"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"As we can see from the results above, compiling the unchanged Python code via Cython already makes it run almost twice as fast (Python on list: 332 \u00b5s, untyped Cython on list: 183 \u00b5s). \n",
"Although it takes more \"work\" to adjust the Python code, the \"typed Cython with memoryview on numpy array\" version is, as expected, significantly faster (4.05 \u00b5s)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"cython_bonus\"></a>\n",
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"How to use Cython without the IPython magic"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to top](#sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"IPython's notebook is really great for explanatory analysis and documentation, but what if we want to compile our Python code via Cython without letting IPython's magic do all the work? \n",
"These are the steps you would need."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1. Creating a .pyx file containing the desired code or function"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file cython_bubblesort_nomagic.pyx\n",
"\n",
"cpdef cython_bubblesort_nomagic(inp_ary):\n",
"    \"\"\" The Cython implementation of Bubblesort with NumPy memoryview.\"\"\"\n",
"    cdef unsigned long length, i, swapped, ele, temp\n",
"    cdef long[:] np_ary = inp_ary\n",
"    length = np_ary.shape[0]\n",
"    swapped = 1\n",
"    for i in xrange(0, length):\n",
"        if swapped:\n",
"            swapped = 0\n",
"            for ele in xrange(0, length-i-1):\n",
"                if np_ary[ele] > np_ary[ele + 1]:\n",
"                    temp = np_ary[ele + 1]\n",
"                    np_ary[ele + 1] = np_ary[ele]\n",
"                    np_ary[ele] = temp\n",
"                    swapped = 1\n",
"    return inp_ary"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting cython_bubblesort_nomagic.pyx\n"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2. Creating a simple setup file"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file setup.py\n",
"\n",
"from distutils.core import setup\n",
"from distutils.extension import Extension\n",
"from Cython.Distutils import build_ext\n",
"\n",
"setup(\n",
"    cmdclass = {'build_ext': build_ext},\n",
"    ext_modules = [Extension(\"cython_bubblesort_nomagic\", [\"cython_bubblesort_nomagic.pyx\"])]\n",
")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting setup.py\n"
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3. Building and compiling"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python3 setup.py build_ext --inplace"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"running build_ext\r\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"cythoning cython_bubblesort_nomagic.pyx to cython_bubblesort_nomagic.c\r\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"building 'cython_bubblesort_nomagic' extension\r\n",
"/usr/bin/clang -fno-strict-aliasing -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/sebastian/miniconda3/envs/py34/include -arch x86_64 -I/Users/sebastian/miniconda3/envs/py34/include/python3.4m -c cython_bubblesort_nomagic.c -o build/temp.macosx-10.5-x86_64-3.4/cython_bubblesort_nomagic.o\r\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\u001b[1mcython_bubblesort_nomagic.c:16276:32: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_PyUnicode_FromString' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(char* c_str) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:16427:33: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_PyInt_FromSize_t' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t ival) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14058:26: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_GetBufferAndValidate' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE int __Pyx_GetBufferAndValidate(\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14092:27: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_SafeReleaseBuffer' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE void __Pyx_SafeReleaseBuffer(Py_buffer* info) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14165:1: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__pyx_add_acquisition_count_locked' [-Wunused-function]\u001b[0m\r\n",
"__pyx_add_acquisition_count_locked(__pyx_atomic_int *acquisition_count,\r\n",
"\u001b[0;1;32m^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14175:1: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__pyx_sub_acquisition_count_locked' [-Wunused-function]\u001b[0m\r\n",
"__pyx_sub_acquisition_count_locked(__pyx_atomic_int *acquisition_count,\r\n",
"\u001b[0;1;32m^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14643:26: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_PyBytes_Equals' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE int __Pyx_PyBytes_Equals(PyObject* s1, PyObject* s2...\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14920:32: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_GetItemInt_List_Fast' [-Wunused-function]\u001b[0m\r\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"static CYTHON_INLINE PyObject *__Pyx_GetItemInt_List_Fast(PyObject *o, P...\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:14934:32: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_GetItemInt_Tuple_Fast' [-Wunused-function]\u001b[0m\r\n",
"static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Tuple_Fast(PyObject *o, ...\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:15111:38: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1munused function\r\n",
" '__Pyx_PyInt_From_unsigned_long' [-Wunused-function]\u001b[0m\r\n",
" static CYTHON_INLINE PyObject* __Pyx_PyInt_From_unsigned_long(unsi...\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:15158:36: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1mfunction\r\n",
" '__Pyx_PyInt_As_unsigned_long' is not needed and will not be emitted\r\n",
" [-Wunneeded-internal-declaration]\u001b[0m\r\n",
"static CYTHON_INLINE unsigned long __Pyx_PyInt_As_unsigned_long(PyObject *x) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:15627:27: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1mfunction '__Pyx_PyInt_As_char' is\r\n",
" not needed and will not be emitted [-Wunneeded-internal-declaration]\u001b[0m\r\n",
"static CYTHON_INLINE char __Pyx_PyInt_As_char(PyObject *x) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m\u001b[1mcython_bubblesort_nomagic.c:15727:27: \u001b[0m\u001b[0;1;35mwarning: \u001b[0m\u001b[1mfunction '__Pyx_PyInt_As_long' is\r\n",
" not needed and will not be emitted [-Wunneeded-internal-declaration]\u001b[0m\r\n",
"static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *x) {\r\n",
"\u001b[0;1;32m ^\r\n",
"\u001b[0m"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"13 warnings generated.\r\n",
"/usr/bin/clang -bundle -undefined dynamic_lookup -L/Users/sebastian/miniconda3/envs/py34/lib -arch x86_64 build/temp.macosx-10.5-x86_64-3.4/cython_bubblesort_nomagic.o -L/Users/sebastian/miniconda3/envs/py34/lib -o /Users/sebastian/Desktop/cython_bubblesort_nomagic.so\r\n"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4. Importing and running the code"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import cython_bubblesort_nomagic\n",
"\n",
"cython_bubblesort_nomagic.cython_bubblesort_nomagic(np.array([4,6,2,1,6]))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
"array([1, 2, 4, 6, 6])"
]
}
],
"prompt_number": 20
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"numba\"></a>\n",
"<br>\n",
"<br>"
]
}
],
"metadata": {}
}
]
}
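
A note on the speed comparison in the notebook above: because the bubble sort implementations sort in place, every `%timeit` repetition after the first operates on input that is already sorted. The following is a minimal sketch, not part of the original notebook, of one way to time each run on a fresh unsorted copy using the standard-library `timeit` module. The helper name `time_on_fresh_copies` and the run count are assumptions chosen for illustration, and the pure-Python sort is repeated here only so that the block runs on its own.

```python
import copy
import random
import timeit


def python_bubblesort(a_list):
    """Bubble sort that sorts in place, equivalent to the notebook's version."""
    length = len(a_list)
    swapped = True
    for i in range(length):
        if not swapped:
            break  # no swaps in the previous pass, so the data is sorted
        swapped = False
        for ele in range(length - i - 1):
            if a_list[ele] > a_list[ele + 1]:
                a_list[ele], a_list[ele + 1] = a_list[ele + 1], a_list[ele]
                swapped = True
    return a_list


def time_on_fresh_copies(sort_func, data, runs=5):
    """Hypothetical helper: best wall-clock time in seconds over `runs` runs.

    Each run sorts a fresh copy of `data`; the copy is made outside the
    timed region, so only the sort itself is measured.
    """
    best = float("inf")
    for _ in range(runs):
        fresh = copy.copy(data)  # works for lists and NumPy arrays
        start = timeit.default_timer()
        sort_func(fresh)
        elapsed = timeit.default_timer() - start
        best = min(best, elapsed)
    return best


random.seed(0)
unsorted = [random.randint(0, 1000) for _ in range(1000)]
best = time_on_fresh_copies(python_bubblesort, unsorted)
print("best of 5 runs on unsorted input: %.6f s" % best)
```

The same helper could be passed the Cython functions and a NumPy array in place of the list; timings obtained this way are not directly comparable with the `%timeit` figures recorded in the notebook.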