python_reference/python_patterns/patterns.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Go back](https://github.com/rasbt/python_reference) to the `python_reference` repository."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A random collection of useful Python snippets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I just cleaned my hard drive and found a couple of useful Python snippets that I had some use for in the past. I thought it would be worthwhile to collect them in a IPython notebook for personal reference and share it with people who might find them useful too.  \n",
    "Most of those snippets are hopefully self-explanatory, but I am planning to add more comments and descriptions in future."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Table of Contents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- [Bitstrings from positive and negative elements in a list](#Bitstrings-from-positive-and-negative-elements-in-a-list)\n",
    "- [Command line arguments 1 - sys.argv](#Command-line-arguments-1---sys.argv)\n",
    "- [Data and time basics](#Data-and-time-basics)\n",
    "- [Differences between 2 files](#Differences-between-2-files)\n",
    "- [Differences between successive elements in a list](#Differences-between-successive-elements-in-a-list)\n",
    "- [Doctest example](#Doctest-example)\n",
    "- [English language detection](#English-language-detection)\n",
    "- [File browsing basics](#File-browsing-basics)\n",
    "- [File reading basics](#File-reading-basics)\n",
    "- [Indices of min and max elements from a list](#Indices-of-min-and-max-elements-from-a-list)\n",
    "- [Lambda functions](#Lambda-functions)\n",
    "- [Private functions](#Private-functions)\n",
    "- [Namedtuples](#Namedtuples)\n",
    "- [Normalizing data](#Normalizing-data)\n",
    "- [NumPy essentials](#NumPy-essentials)\n",
    "- [Pickling Python objects to bitstreams](#Pickling-Python-objects-to-bitstreams)\n",
    "- [Python version check](#Python-version-check)\n",
    "- [Runtime within a script](#Runtime-within-a-script)\n",
    "- [Sorting lists of tuples by elements](#Sorting-lists-of-tuples-by-elements)\n",
    "- [Sorting multiple lists relative to each other](#Sorting-multiple-lists-relative-to-each-other)\n",
    "- [Using namedtuples](#Using-namedtuples)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "%load_ext watermark"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sebastian Raschka 26/09/2014 \n",
      "\n",
      "CPython 3.4.1\n",
      "IPython 2.0.0\n"
     ]
    }
   ],
   "source": [
    "%watermark -d -a \"Sebastian Raschka\" -v"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font size=\"1.5em\">[More information](https://github.com/rasbt/watermark) about the `watermark` magic command extension.</font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bitstrings from positive and negative elements in a list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "input values [ 1.   2.   0.3 -1.  -2. ]\n",
      "bitstring [1 1 1 0 0]\n"
     ]
    }
   ],
   "source": [
    "# Generating a bitstring from a Python list or numpy array\n",
    "# where all postive values -> 1\n",
    "# all negative values -> 0\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "def make_bitstring(ary):\n",
    "    return np.where(ary > 0, 1, 0)\n",
    "\n",
    "\n",
    "def faster_bitstring(ary):\n",
    "    return np.where(ary > 0).astype('i1')\n",
    "\n",
    "### Example:\n",
    "\n",
    "ary1 = np.array([1, 2, 0.3, -1, -2])\n",
    "print('input values %s' %ary1)\n",
    "print('bitstring %s' %make_bitstring(ary1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Command line arguments 1 - sys.argv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting cmd_line_args_1_sysarg.py\n"
     ]
    }
   ],
   "source": [
    "%%file cmd_line_args_1_sysarg.py\n",
    "import sys\n",
    "\n",
    "def error(msg):\n",
    "    \"\"\"Prints error message, sends it to stderr, and quites the program.\"\"\"\n",
    "    sys.exit(msg)\n",
    "\n",
    "args = sys.argv[1:]     #  sys.argv[0] is the name of the python script itself\n",
    "\n",
    "try:\n",
    "    arg1 = int(args[0])\n",
    "    arg2 = args[1]\n",
    "    arg3 = args[2]\n",
    "    print(\"Everything okay!\")\n",
    "\n",
    "except ValueError:\n",
    "    error(\"First argument must be integer type!\")\n",
    "\n",
    "except IndexError:\n",
    "    error(\"Requires 3 arguments!\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Everything okay!\n"
     ]
    }
   ],
   "source": [
    "% run cmd_line_args_1_sysarg.py 1 2 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "SystemExit",
     "evalue": "First argument must be integer type!",
     "output_type": "error",
     "traceback": [
      "An exception has occurred, use %tb to see the full traceback.\n",
      "\u001b[0;31mSystemExit\u001b[0m\u001b[0;31m:\u001b[0m First argument must be integer type!\n"
     ]
    }
   ],
   "source": [
    "% run cmd_line_args_1_sysarg.py a 2 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data and time basics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "13:28:05\n",
      "26/09/2014\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "# print time HOURS:MINUTES:SECONDS\n",
    "# e.g., '10:50:58'\n",
    "print(time.strftime(\"%H:%M:%S\"))\n",
    "\n",
    "# print current date DAY:MONTH:YEAR\n",
    "# e.g., '06/03/2014'\n",
    "print(time.strftime(\"%d/%m/%Y\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Differences between 2 files"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Writing id_file1.txt\n"
     ]
    }
   ],
   "source": [
    "%%file id_file1.txt\n",
    "1234\n",
    "2342\n",
    "2341"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Writing id_file2.txt\n"
     ]
    }
   ],
   "source": [
    "%%file id_file2.txt\n",
    "5234\n",
    "3344\n",
    "2341"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5234\n",
      "3344\n",
      "Total differences: 2\n"
     ]
    }
   ],
   "source": [
    "# Print lines that are different between 2 files. Insensitive\n",
    "# to the order of the file contents.\n",
    "\n",
    "id_set1 = set()\n",
    "id_set2 = set()\n",
    "\n",
    "with open('id_file1.txt', 'r') as id_file:\n",
    "    for line in id_file:\n",
    "        id_set1.add(line.strip())\n",
    "\n",
    "with open('id_file2.txt', 'r') as id_file:\n",
    "    for line in id_file:\n",
    "        id_set2.add(line.strip())    \n",
    "\n",
    "diffs = id_set2.difference(id_set1)\n",
    "\n",
    "for d in diffs:\n",
    "    print(d)\n",
    "print(\"Total differences:\",len(diffs))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Differences between successive elements in a list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 1, 2, 3]\n"
     ]
    }
   ],
   "source": [
    "from itertools import islice\n",
    "\n",
    "lst = [1,2,3,5,8]\n",
    "diff = [j - i for i, j in zip(lst, islice(lst, 1, None))]\n",
    "print(diff)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Doctest example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ok\n"
     ]
    }
   ],
   "source": [
    "def subtract(a, b):\n",
    "    \"\"\"\n",
    "    Subtracts second from first number and returns result.\n",
    "    >>> subtract(10, 5)\n",
    "    5\n",
    "    >>> subtract(11, 0.7)\n",
    "    10.3\n",
    "    \"\"\"\n",
    "    return a-b\n",
    "\n",
    "if __name__ == \"__main__\":  # is 'false' if imported\n",
    "    import doctest\n",
    "    doctest.testmod()\n",
    "    print('ok')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "**********************************************************************\n",
      "File \"__main__\", line 4, in __main__.hello_world\n",
      "Failed example:\n",
      "    hello_world()\n",
      "Expected:\n",
      "    'Hello, World'\n",
      "Got:\n",
      "    'hello world'\n",
      "**********************************************************************\n",
      "1 items had failures:\n",
      "   1 of   1 in __main__.hello_world\n",
      "***Test Failed*** 1 failures.\n"
     ]
    }
   ],
   "source": [
    "def hello_world():\n",
    "    \"\"\"\n",
    "    Returns 'Hello, World'\n",
    "    >>> hello_world()\n",
    "    'Hello, World'\n",
    "    \"\"\"\n",
    "    return 'hello world'\n",
    "\n",
    "if __name__ == \"__main__\":  # is 'false' if imported\n",
    "    import doctest\n",
    "    doctest.testmod()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## English language detection"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.2\n"
     ]
    }
   ],
   "source": [
    "import nltk\n",
    "\n",
    "def eng_ratio(text):\n",
    "    ''' Returns the ratio of non-English to English words from a text '''\n",
    "\n",
    "    english_vocab = set(w.lower() for w in nltk.corpus.words.words()) \n",
    "    text_vocab = set(w.lower() for w in text.split() if w.lower().isalpha()) \n",
    "    unusual = text_vocab.difference(english_vocab)\n",
    "    diff = len(unusual)/len(text_vocab)\n",
    "    return diff\n",
    "    \n",
    "text = 'This is a test fahrrad'\n",
    "\n",
    "print(eng_ratio(text))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## File browsing basics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import shutil\n",
    "import glob\n",
    "\n",
    "# working directory\n",
    "c_dir = os.getcwd()                 # show current working directory\n",
    "os.listdir(c_dir)                   # shows all files in the working directory\n",
    "os.chdir('~/Data')                  # change working directory\n",
    "\n",
    "\n",
    "# get all files in a directory\n",
    "glob.glob('/Users/sebastian/Desktop/*')\n",
    "\n",
    "# e.g.,  ['/Users/sebastian/Desktop/untitled folder', '/Users/sebastian/Desktop/Untitled.txt']\n",
    "\n",
    "# walk\n",
    "tree = os.walk(c_dir)      \n",
    "# moves through sub directories and creates a 'generator' object of tuples\n",
    "# ('dir', [file1, file2, ...] [subdirectory1, subdirectory2, ...]), \n",
    "#    (...), ...\n",
    "\n",
    "#check files: returns either True or False\n",
    "os.exists('../rel_path')\n",
    "os.exists('/home/abs_path')\n",
    "os.isfile('./file.txt')\n",
    "os.isdir('./subdir')\n",
    "\n",
    "\n",
    "# file permission (True or False\n",
    "os.access('./some_file', os.F_OK) # File exists? Python 2.7\n",
    "os.access('./some_file', os.R_OK) # Ok to read? Python 2.7\n",
    "os.access('./some_file', os.W_OK) # Ok to write? Python 2.7\n",
    "os.access('./some_file', os.X_OK) # Ok to execute? Python 2.7\n",
    "os.access('./some_file', os.X_OK | os.W_OK) # Ok to execute or write? Python 2.7\n",
    "\n",
    "# join (creates operating system dependent paths)\n",
    "os.path.join('a', 'b', 'c')\n",
    "# 'a/b/c' on Unix/Linux\n",
    "# 'a\\\\b\\\\c' on Windows\n",
    "os.path.normpath('a/b/c') # converts file separators\n",
    "\n",
    "\n",
    "# os.path: direcory and file names\n",
    "os.path.samefile('./some_file', '/home/some_file')  # True if those are the same\n",
    "os.path.dirname('./some_file')  # returns '.' (everythin but last component)\n",
    "os.path.basename('./some_file') # returns 'some_file' (only last component\n",
    "os.path.split('./some_file') # returns (dirname, basename) or ('.', 'some_file)\n",
    "os.path.splitext('./some_file.txt') # returns ('./some_file', '.txt')\n",
    "os.path.splitdrive('./some_file.txt') # returns ('', './some_file.txt')\n",
    "os.path.isabs('./some_file.txt') # returns False (not an absolute path)\n",
    "os.path.abspath('./some_file.txt')\n",
    "\n",
    "\n",
    "# create and delete files and directories\n",
    "os.mkdir('./test')  # create a new direcotory\n",
    "os.rmdir('./test')  # removes an empty direcotory\n",
    "os.removedirs('./test') # removes nested empty directories\n",
    "os.remove('file.txt')   # removes an individual file\n",
    "shutil.rmtree('./test') # removes directory (empty or not empty)\n",
    "\n",
    "os.rename('./dir_before', './renamed') # renames directory if destination doesn't exist\n",
    "shutil.move('./dir_before', './renamed') # renames directory always\n",
    "\n",
    "shutil.copytree('./orig', './copy') # copies a directory recursively\n",
    "shutil.copyfile('file', 'copy')     # copies a file\n",
    "\n",
    " \n",
    "# Getting files of particular type from directory\n",
    "files = [f for f in os.listdir(s_pdb_dir) if f.endswith(\".txt\")]\n",
    "  \n",
    "# Copy and move\n",
    "shutil.copyfile(\"/path/to/file\", \"/path/to/new/file\") \n",
    "shutil.copy(\"/path/to/file\", \"/path/to/directory\")\n",
    "shutil.move(\"/path/to/file\",\"/path/to/directory\")\n",
    "   \n",
    "# Check if file or directory exists\n",
    "os.path.exists(\"file or directory\")\n",
    "os.path.isfile(\"file\")\n",
    "os.path.isdir(\"directory\")\n",
    "    \n",
    "# Working directory and absolute path to files\n",
    "os.getcwd()\n",
    "os.path.abspath(\"file\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## File reading basics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Note: rb opens file in binary mode to avoid issues with Windows systems\n",
    "# where '\\r\\n' is used instead of '\\n' as newline character(s).\n",
    "\n",
    "\n",
    "# A) Reading in Byte chunks\n",
    "reader_a = open(\"file.txt\", \"rb\")\n",
    "chunks = []\n",
    "data = reader_a.read(64)  # reads first 64 bytes\n",
    "while data != \"\":\n",
    "    chunks.append(data)\n",
    "    data = reader_a.read(64)\n",
    "if data:\n",
    "    chunks.append(data)\n",
    "print(len(chunks))\n",
    "reader_a.close()\n",
    "\n",
    "\n",
    "# B) Reading whole file at once into a list of lines\n",
    "with open(\"file.txt\", \"rb\") as reader_b:   # recommended syntax, auto closes\n",
    "    data = reader_b.readlines() # data is assigned a list of lines\n",
    "print(len(data))\n",
    "\n",
    "\n",
    "# C) Reading whole file at once into a string\n",
    "with open(\"file.txt\", \"rb\") as reader_c:\n",
    "    data = reader_c.read() # data is assigned a list of lines\n",
    "print(len(data))\n",
    "\n",
    "\n",
    "# D) Reading line by line into a list\n",
    "data = []\n",
    "with open(\"file.txt\", \"rb\") as reader_d:\n",
    "    for line in reader_d:\n",
    "        data.append(line)\n",
    "print(len(data))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Indices of min and max elements from a list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "min_index: 0 min_value: 1\n",
      "max_index: 4 max_value: 5\n"
     ]
    }
   ],
   "source": [
    "import operator\n",
    "\n",
    "values = [1, 2, 3, 4, 5]\n",
    "\n",
    "min_index, min_value = min(enumerate(values), key=operator.itemgetter(1))\n",
    "max_index, max_value = max(enumerate(values), key=operator.itemgetter(1))\n",
    "\n",
    "print('min_index:', min_index, 'min_value:', min_value)\n",
    "print('max_index:', max_index, 'max_value:', max_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lambda functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Lambda functions are just a short-hand way or writing\n",
    "# short function definitions\n",
    "\n",
    "def square_root1(x):\n",
    "    return x**0.5\n",
    "    \n",
    "square_root2 = lambda x: x**0.5\n",
    "\n",
    "assert(square_root1(9) == square_root2(9))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Private functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "My message: Hello, World\n"
     ]
    }
   ],
   "source": [
    "def create_message(msg_txt):\n",
    "    def _priv_msg(message):     # private, no access from outside\n",
    "        print(\"{}: {}\".format(msg_txt, message))\n",
    "    return _priv_msg            # returns a function\n",
    "\n",
    "new_msg = create_message(\"My message\")\n",
    "# note, new_msg is a function\n",
    "\n",
    "new_msg(\"Hello, World\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Namedtuples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 2 3\n"
     ]
    }
   ],
   "source": [
    "from collections import namedtuple\n",
    "\n",
    "my_namedtuple = namedtuple('field_name', ['x', 'y', 'z', 'bla', 'blub'])\n",
    "p = my_namedtuple(1, 2, 3, 4, 5)\n",
    "print(p.x, p.y, p.z)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalizing data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def normalize(data, min_val=0, max_val=1):\n",
    "    \"\"\"\n",
    "    Normalizes values in a list of data points to a range, e.g.,\n",
    "    between 0.0 and 1.0. \n",
    "    Returns the original object if value is not a integer or float.\n",
    "    \n",
    "    \"\"\"\n",
    "    norm_data = []\n",
    "    data_min = min(data)\n",
    "    data_max = max(data)\n",
    "    for x in data:\n",
    "        numerator = x - data_min\n",
    "        denominator = data_max - data_min\n",
    "        x_norm = (max_val-min_val) * numerator/denominator + min_val\n",
    "        norm_data.append(x_norm)\n",
    "    return norm_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.0, 0.25, 0.5, 0.75, 1.0]"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normalize([1,2,3,4,5])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[-10.0, -5.0, 0.0, 5.0, 10.0]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normalize([1,2,3,4,5], min_val=-10, max_val=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## NumPy essentials"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "ary1 = np.array([1,2,3,4,5])  # must be same type\n",
    "ary2 = np.zeros((3,4))        # 3x4 matrix consisiting of 0s \n",
    "ary3 = np.ones((3,4))         # 3x4 matrix consisiting of 1s \n",
    "ary4 = np.identity(3)         # 3x3 identity matrix\n",
    "ary5 = ary1.copy()               # make a copy of ary1\n",
    "\n",
    "item1 = ary3[0, 0]  # item in row1, column1\n",
    "\n",
    "ary2.shape  # tuple of dimensions. Here: (3,4)\n",
    "ary2.size   # number of elements. Here: 12\n",
    "\n",
    "\n",
    "ary2_t = ary2.transpose()    # transposes matrix\n",
    "\n",
    "ary2.ravel()     # makes an array linear (1-dimensional)\n",
    "                 # by concatenating rows\n",
    "ary2.reshape(2,6) # reshapes array (must have same dimensions)\n",
    "\n",
    "ary3[0:2, 0:3]   # submatrix of first 2 rows and first 3 columns \n",
    "\n",
    "ary3 = ary3[[2,0,1]]    # re-arrange rows\n",
    "\n",
    "\n",
    "# element-wise operations\n",
    "\n",
    "ary1 + ary1\n",
    "ary1 * ary1\n",
    "numpy.dot(ary1, ary1)   # matrix/vector (dot) product\n",
    "\n",
    "numpy.sum(ary1, axis=1)   # sum of a 1D array, column sums of a 2D array\n",
    "numpy.mean(ary1, axis=1)  # mean of a 1D array, column means of a 2D array"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pickling Python objects to bitstreams"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{1: 'some text', 2: 'some text', 3: 'some text', 4: 'some text', 5: 'some text', 6: 'some text', 7: 'some text', 8: 'some text', 9: 'some text'}\n"
     ]
    }
   ],
   "source": [
    "import pickle\n",
    "\n",
    "#### Generate some object\n",
    "my_dict = dict()\n",
    "for i in range(1,10):\n",
    "    my_dict[i] = \"some text\"\n",
    "\n",
    "#### Save object to file\n",
    "pickle_out = open('my_file.pkl', 'wb')\n",
    "pickle.dump(my_dict, pickle_out)\n",
    "pickle_out.close()\n",
    "\n",
    "#### Load object from file\n",
    "my_object_file = open('my_file.pkl', 'rb')\n",
    "my_dict = pickle.load(my_object_file)\n",
    "my_object_file.close()\n",
    "\n",
    "print(my_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Python version check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "executed in Python 3.x\n",
      "H\n",
      "in for-loop:\n",
      "e\n",
      "l\n",
      "l\n",
      "o\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "\n",
    "def give_letter(word):\n",
    "    for letter in word:\n",
    "        yield letter\n",
    "\n",
    "if sys.version_info[0] == 3:\n",
    "    print('executed in Python 3.x')\n",
    "    test = give_letter('Hello')\n",
    "    print(next(test))\n",
    "    print('in for-loop:')\n",
    "    for l in test:\n",
    "        print(l)\n",
    "\n",
    "# if Python 2.x\n",
    "if sys.version_info[0] == 2:\n",
    "    print('executed in Python 2.x')\n",
    "    test = give_letter('Hello')\n",
    "    print(test.next())\n",
    "    print('in for-loop:') \n",
    "    for l in test:\n",
    "        print(l)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Runtime within a script"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time elapsed: 0.49176900000000057 seconds\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "start_time = time.clock()\n",
    "\n",
    "for i in range(10000000):\n",
    "    pass\n",
    "\n",
    "elapsed_time = time.clock() - start_time\n",
    "print(\"Time elapsed: {} seconds\".format(elapsed_time))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time elapsed: 0.3550995970144868 seconds\n"
     ]
    }
   ],
   "source": [
    "import timeit\n",
    "elapsed_time = timeit.timeit('for i in range(10000000): pass', number=1)\n",
    "print(\"Time elapsed: {} seconds\".format(elapsed_time))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sorting lists of tuples by elements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(2, 3, 'a'), (2, 2, 'b'), (3, 2, 'b'), (1, 3, 'c')]\n"
     ]
    }
   ],
   "source": [
    "# Here, we make use of the \"key\" parameter of the in-built \"sorted()\" function \n",
    "# (also available for the \".sort()\" method), which let's us define a function \n",
    "# that is called on every element that is to be sorted. In this case, our \n",
    "# \"key\"-function is a simple lambda function that returns the last item \n",
    "# from every tuple.\n",
    "\n",
    "a_list = [(1,3,'c'), (2,3,'a'), (3,2,'b'), (2,2,'b')]\n",
    "\n",
    "sorted_list = sorted(a_list, key=lambda e: e[::-1])\n",
    "\n",
    "print(sorted_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(2, 3, 'a'), (3, 2, 'b'), (2, 2, 'b'), (1, 3, 'c')]\n"
     ]
    }
   ],
   "source": [
    "# prints [(2, 3, 'a'), (2, 2, 'b'), (3, 2, 'b'), (1, 3, 'c')]\n",
    "\n",
    "# If we are only interesting in sorting the list by the last element\n",
    "# of the tuple and don't care about a \"tie\" situation, we can also use\n",
    "# the index of the tuple item directly instead of reversing the tuple \n",
    "# for efficiency.\n",
    "\n",
    "a_list = [(1,3,'c'), (2,3,'a'), (3,2,'b'), (2,2,'b')]\n",
    "\n",
    "sorted_list = sorted(a_list, key=lambda e: e[-1])\n",
    "\n",
    "print(sorted_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sorting multiple lists relative to each other"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "input values:\n",
      " ['c', 'b', 'a'] [6, 5, 4] ['some-val-associated-with-c', 'another_val-b', 'z_another_third_val-a']\n",
      "\n",
      "\n",
      "sorted output:\n",
      " ['a', 'b', 'c'] [4, 5, 6] ['z_another_third_val-a', 'another_val-b', 'some-val-associated-with-c']\n"
     ]
    }
   ],
   "source": [
    "\"\"\"\n",
    "You have 3 lists that you want to sort \"relative\" to each other,\n",
    "for example, picturing each list as a row in a 3x3 matrix: sort it by columns\n",
    "\n",
    "########################\n",
    "If the input lists are\n",
    "########################\n",
    "\n",
    " list1 = ['c','b','a']\n",
    " list2 = [6,5,4]\n",
    " list3 = ['some-val-associated-with-c','another_val-b','z_another_third_val-a']\n",
    "\n",
    "########################\n",
    "the desired outcome is:\n",
    "########################\n",
    "\n",
    " ['a', 'b', 'c'] \n",
    " [4, 5, 6] \n",
    " ['z_another_third_val-a', 'another_val-b', 'some-val-associated-with-c']\n",
    "\n",
    "########################\n",
    "and NOT:\n",
    "########################\n",
    "\n",
    " ['a', 'b', 'c'] \n",
    " [4, 5, 6] \n",
    " ['another_val-b', 'some-val-associated-with-c', 'z_another_third_val-a']\n",
    "\n",
    "\n",
    "\"\"\"\n",
    "\n",
    "list1 = ['c','b','a']\n",
    "list2 = [6,5,4]\n",
    "list3 = ['some-val-associated-with-c','another_val-b','z_another_third_val-a']\n",
    "\n",
    "print('input values:\\n', list1, list2, list3)\n",
    "\n",
    "list1, list2, list3 = [list(t) for t in zip(*sorted(zip(list1, list2, list3)))]\n",
    "\n",
    "print('\\n\\nsorted output:\\n', list1, list2, list3 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using namedtuples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[back to top](#Table-of-Contents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`namedtuples` are high-performance container datatypes in the [`collection`](https://docs.python.org/2/library/collections.html) module (part of Python's stdlib since 2.6).\n",
    "`namedtuple()` is factory function for creating tuple subclasses with named fields."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "X-coordinate: 1\n"
     ]
    }
   ],
   "source": [
    "from collections import namedtuple\n",
    "\n",
    "Coordinates = namedtuple('Coordinates', ['x', 'y', 'z'])\n",
    "point1 = Coordinates(1, 2, 3)\n",
    "print('X-coordinate: %d' % point1.x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.4.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}