python_reference/algorithms/sequential_selection_algorithms.ipynb

{
 "metadata": {
  "name": "",
  "signature": "sha256:422c66e0088094cc07058647ff0a8c20c5fbb08fad34a18ee957f594d82b1e53"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "*Sebastian Raschka*  \n",
      "last updated: **03/29/2014**  \n",
      "[Link to this IPython Notebook on GitHub](https://github.com/rasbt/algorithms_in_ipython_notebooks)  \n",
      "\n",
      "<br>\n",
      "Executed in Python 3.4.0\n",
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sections\"></a></p>\n",
      "<br>\n",
      "<br>\n",
      "#Sections\n",
      "- <a href=\"#intro\">1. Introduction</a><br>\n",
      "    - <a href=\"#crit_func\">Defining a criterion function for testing</a><br>\n",
      "- <a href=\"#sfs\">2. Sequential Forward Selection (SFS)</a><br>\n",
      "    - <a href=\"#sfs_code\">SFS Code</a><br>\n",
      "    - <a href=\"#example_sfs\">Example SFS</a><br>\n",
      "- <a href=\"#sbs\">3. Sequential Backward Selection (SBS)</a><br>\n",
      "    - <a href=\"#sbs_code\">SBS Code</a><br>\n",
      "    - <a href=\"#example_sbs\">Example SBS</a><br>\n",
      "- <a href=\"#pLmR\">4. \"Plus L take away R\" (+L -R)</a><br>\n",
      "    - <a href=\"#pLmR_code\">+L -R Code</a><br>\n",
      "    - <a href=\"#example_pLmR\">Example +L -R</a><br>\n",
      "        - <a href=\"#example_pLmR1\">Example 1: L > R</a><br>\n",
      "        - <a href=\"#example_pLmR2\">Example 2: R > L</a><br>\n",
      "- <a href=\"#sffs\">5. Sequential Floating Forward Selection (SFFS)</a><br>\n",
      "    - <a href=\"#sffs_code\">SFFS Code</a><br>\n",
      "    - <a href=\"#example_sffs\">Example SFFS</a><br>\n",
      "- <a href=\"#sfbs\">6. Sequential Floating Backward Selection (SFBS)</a><br>\n",
      "    - <a href=\"#sfbs_code\">SFBS Code</a><br>\n",
      "    - <a href=\"#example_sfbs\">Example SFBS</a><br>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#Sequential Selection Algorithms in Python\n",
      "\n",
      "<p><a name=\"intro\"></a></p>\n",
      "## 1. Introduction  \n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "One of the biggest challenges of designing a good classifier for solving a ***Statistical Pattern Classification*** problem is to estimate the underlying parameters to fit the model - given that the forms of the underlying probability distributions are known. The larger the number of parameters becomes, the more difficult it naturally is to estimate those parameters accurately (***Curse of Dimensionality***) from a limited number of training samples.  \n",
      "\n",
      "In order to avoid the ***Curse of Dimensionality***, pattern classification is often accompanied by ***Dimensionality Reduction***, which also has the nice side-effect of increasing the computational performance.\n",
      "Common techniques are projection-based, such as ***Principal Component Analysis (PCA)***, ***Linear Discriminant Analysis (LDA)***, and ***Multiple Discriminant Analysis (MDA)***.  \n",
      "An alternative to the projection-based approach is the so-called ***Feature Selection***, and in this article, we will take a look at some of the established algorithms to tackle this combinatorial search problem. Note that these algorithms are considered \"suboptimal\" in contrast to an ***exhaustive search***, which, however, is often computationally infeasible.\n",
      "\n",
      "Therefore, the goal of the presented ***sequential selection algorithms*** is to reduce the feature space *D = {x_1, x_2, ..., x_n}* to a subset of *k* features (where *k < n*) in order to improve or optimize the **computational performance** of the classifier and to avoid the so-called ***Curse of Dimensionality***.  \n",
      "The goal is to select a \"sufficiently reduced\" subset from the feature space *D* \"without significantly reducing\" the performance of the classifier. In the process of choosing an \"optimal\" feature subset of size *k*, a so-called ***Criterion Function*** is used, which typically assesses the ***recognition rate*** of the classifier.\n",
      "\n",
      "F. Ferri, P. Pudil, M. Hatef, and J. Kittler investigated the performance of different ***Sequential Selection Algorithms*** for ***Feature Selection*** on different scales and reported their results in a nice research article: *\"[Comparative Study of Techniques for Large Scale Feature Selection](http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=02CB16CB1C28EA6CB57E212861CFB180?doi=10.1.1.24.4369&rep=rep1&type=pdf),\" Pattern Recognition in Practice IV, E. Gelsema and L. Kanal, eds., pp. 403-413. Elsevier Science B.V., 1994.*  \n",
      "Choosing an \"appropriate\" algorithm really depends on the problem: its size, the desired recognition rate, and the computational performance. Thus, I want to encourage you to take (at least) a brief look at their paper and the results they obtained from experimenting with problems of different feature space dimensions.\n"
     ]
    },
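    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a rough, back-of-the-envelope illustration of why an exhaustive search quickly becomes infeasible (assuming that exactly *k* out of *d* features are to be selected, and counting one evaluation of the criterion function per candidate subset), an exhaustive search would have to evaluate every possible subset of size *k*:\n",
      "\n",
      "$$\\binom{d}{k} = \\frac{d!}{k! \\, (d-k)!}, \\qquad \\binom{10}{3} = 120, \\qquad \\binom{100}{10} \\approx 1.7 \\times 10^{13}$$\n",
      "\n",
      "A greedy sequential search that adds one feature per round (such as the Sequential Forward Selection described in the next section) only evaluates the remaining candidate features in each round:\n",
      "\n",
      "$$\\sum_{i=0}^{k-1} (d - i), \\qquad 10 + 9 + 8 = 27 \\quad (d = 10, \\; k = 3), \\qquad 100 + 99 + \\dots + 91 = 955 \\quad (d = 100, \\; k = 10)$$\n",
      "\n",
      "The values of *d* and *k* are chosen for illustration only; the price for the much smaller number of evaluations is that the result is not guaranteed to be the best possible subset."
     ]
    },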
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"crit_func\"></a></p>\n",
      "### Defining a criterion function (for testing)\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "In order to evaluate the performance of our selected ***feature subset*** (typically the recognition rate of the classifier), we need to define a ***criterion function***.\n",
      "\n",
      "For the sake of simplicity, and in order to get an intuitive idea if our algorithm returns an \"appropriate\" result, let us define a very simple criterion function here.  \n",
      "The criterion function defined below simply returns the sum of numerical values in a list."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def simple_crit_func(feat_sub):\n",
      "    \"\"\" Returns sum of numerical values of an input list. \"\"\" \n",
      "    return sum(feat_sub)\n",
      "\n",
      "# Example:\n",
      "simple_crit_func([1,2,4])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 1,
       "text": [
        "7"
       ]
      }
     ],
     "prompt_number": 1
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sfs\"></a></p>\n",
      "\n",
      "## 2. Sequential Forward Selection (SFS)\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "The ***Sequential Forward Selection (SFS)*** is one of the simplest and probably fastest *Feature Selection* algorithms.  \n",
      "Let's summarize its mechanics in words:  \n",
      "***SFS*** starts with an empty feature subset and sequentially adds features from the whole input feature space to this subset until the subset reaches a desired (user-specified) size. In every iteration (= inclusion of a new feature), the whole feature space is evaluated (except for the features that are already included in the new subset). The evaluation is done by the so-called ***criterion function***, which identifies the feature that leads to the maximum performance improvement of the feature subset if it is included.  \n",
      "Note that included features are never removed, which is one of the biggest downsides of this algorithm.<br><br>"
     ]
    },
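    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The following small example sketches why never removing an included feature can lead to a suboptimal result. The three features $a, b, c$ and all criterion values are hypothetical and chosen only for illustration; note that the criterion is deliberately not additive here:\n",
      "\n",
      "- $J(\\{a\\}) = 3, \\quad J(\\{b\\}) = 2, \\quad J(\\{c\\}) = 2$\n",
      "- $J(\\{a, b\\}) = 4, \\quad J(\\{a, c\\}) = 4, \\quad J(\\{b, c\\}) = 5$\n",
      "\n",
      "For a desired subset size of 2, SFS first includes $a$ (the highest single value) and can then only reach $J = 4$ by adding $b$ or $c$. The best subset of size 2, $\\{b, c\\}$ with $J = 5$, can no longer be reached because $a$ is never removed."
     ]
    },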
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "<hr>\n",
      "\n",
      "**Input:**  \n",
      "the set of all features,  \n",
      "- $Y = \\{y_1, y_2, ..., y_d\\}$\n",
      "\n",
      "\n",
      "The ***SFS*** algorithm takes the whole feature set as input, e.g., all 10 features if our feature space consists of 10 dimensions (***d = 10***).\n",
      "<br><br>\n",
      "\n",
      "**Output:**   \n",
      "a subset of features,  \n",
      "- $X_k = \\{x_j \\; | \\; j = 1, 2, ..., k; \\; x_j \u2208 Y\\}$,  \n",
      "where $k = (0, 1, 2, ..., d)$  \n",
      "\n",
      "The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space ($k = 5, \\; d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Initialization:**  \n",
      "- $ X_0 = \\emptyset, \\; k = 0 $\n",
      "\n",
      "We initialize the algorithm with an empty set (\"null set\") so that $k = 0$ (where $k$ is the size of the subset).\n",
      "<br><br>\n",
      "\n",
      "**Step 1 (Inclusion):**  \n",
      "<br> \n",
      "- $x^+ = arg \\; max \\; J(X_k + x), \\quad where \\; x \u2208 Y - X_k$  \n",
      "- $X_{k+1} = X_k + x^+$  \n",
      "- $ k = k + 1 $  \n",
      "- $ Go \\; to \\; Step 1 $  \n",
      "\n",
      "We go through the ***feature space*** and look for the feature $x^+$ which maximizes our criterion if we add it to the ***feature subset*** (where $J()$ is the criterion function).  We repeat this process until we reach the ***Termination*** criterion.\n",
      "<br><br>\n",
      "\n",
      "**Termination:**  \n",
      "- stop when *k* equals the number of desired features\n",
      "\n",
      "We add features to the new feature subset $X_k$ until we reach the number of specified features for our final subset. E.g., if our desired number of features is 5 and we start with the \"null set\" (*Initialization*), we would add features to the subset until it contains 5 features.\n",
      "\n",
      "<hr>\n",
      "<hr>"
     ]
    },
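    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To make the steps above concrete, here is a small worked trace. It assumes the toy criterion $J(X) = \\sum_{x \\in X} x$ (the sum of the feature values, as in `simple_crit_func`), a hypothetical feature space $Y = \\{2, 5, 3\\}$, and a desired subset size of 2:\n",
      "\n",
      "- Initialization: $X_0 = \\emptyset, \\; k = 0$\n",
      "- Inclusion ($k = 0$): $J(\\{2\\}) = 2, \\; J(\\{5\\}) = 5, \\; J(\\{3\\}) = 3 \\; \\Rightarrow \\; x^+ = 5, \\; X_1 = \\{5\\}$\n",
      "- Inclusion ($k = 1$): $J(\\{5, 2\\}) = 7, \\; J(\\{5, 3\\}) = 8 \\; \\Rightarrow \\; x^+ = 3, \\; X_2 = \\{5, 3\\}$\n",
      "- Termination: $k = 2$ equals the number of desired features, so the algorithm returns $X_2 = \\{5, 3\\}$"
     ]
    },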
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sfs_code\"></a></p>\n",
      "\n",
      "##SFS Code\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Sebastian Raschka\n",
      "# last updated: 03/29/2014 \n",
      "# Sequential Forward Selection (SFS)\n",
      "\n",
      "\n",
      "def seq_forw_select(features, max_k, criterion_func, print_steps=False):\n",
      "    \"\"\"\n",
      "    Implementation of a Sequential Forward Selection algorithm.\n",
      "    \n",
      "    Keyword Arguments:\n",
      "        features (list): The feature space as a list of features.\n",
      "        max_k: Termination criterion; the size of the returned feature subset.\n",
      "        criterion_func (function): Function that is used to evaluate the\n",
      "            performance of the feature subset.\n",
      "        print_steps (bool): Prints the algorithm procedure if True.\n",
      "    \n",
      "    Returns the selected feature subset, a list of features of length max_k.\n",
      "\n",
      "    \"\"\"\n",
      "    \n",
      "    # Initialization\n",
      "    features = list(features)  # work on a copy so the caller's list is not modified\n",
      "    feat_sub = []\n",
      "    max_k = min(max_k, len(features))  # cannot select more features than available\n",
      "    \n",
      "    # Repeat the inclusion step until the termination condition is met\n",
      "    while len(feat_sub) < max_k:\n",
      "        \n",
      "        # Inclusion step: find the feature that maximizes the criterion\n",
      "        if print_steps:\n",
      "            print('\\nInclusion from feature space', features)\n",
      "        crit_func_max = criterion_func(feat_sub + [features[0]])\n",
      "        best_feat = features[0]\n",
      "        for x in features[1:]:\n",
      "            crit_func_eval = criterion_func(feat_sub + [x])\n",
      "            if crit_func_eval > crit_func_max:\n",
      "                crit_func_max = crit_func_eval\n",
      "                best_feat = x\n",
      "        feat_sub.append(best_feat)\n",
      "        if print_steps:\n",
      "            print('include: {} -> feature subset: {}'.format(best_feat, feat_sub))\n",
      "        features.remove(best_feat)\n",
      "                \n",
      "    return feat_sub"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 2
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_sfs\"></a></p>\n",
      "### Example SFS:\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "In this example, we take a look at the individual steps of the [***SFS***](#sfs) algorithm to select a ***feature subset*** consisting of 3 features out of a ***feature space*** of size 10.  \n",
      "The input feature space consists of 10 integers: 6, 3, 1, 6, 8, 2, 3, 7, 9, 1,  \n",
      "and our criterion is to find a subset of size 3 in this ***feature space*** that maximizes the integer sum in this ***feature subset***."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_seq_forw_select():\n",
      "    ex_features = [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
      "    res_forw = seq_forw_select(features=ex_features, max_k=3,\\\n",
      "                               criterion_func=simple_crit_func, print_steps=True) \n",
      "    return res_forw\n",
      "    \n",
      "# Run example\n",
      "res_forw = example_seq_forw_select()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res_forw)\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Inclusion from feature space [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "include: 9 -> feature subset: [9]\n",
        "\n",
        "Inclusion from feature space [6, 3, 1, 6, 8, 2, 3, 7, 1]\n",
        "include: 8 -> feature subset: [9, 8]\n",
        "\n",
        "Inclusion from feature space [6, 3, 1, 6, 2, 3, 7, 1]\n",
        "include: 7 -> feature subset: [9, 8, 7]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [9, 8, 7]\n"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "###Result:\n",
      "The returned result is definitely what we would expect: the 3 highest values (note that we defined 3 as the number of desired features for our subset) in the input feature list, since our ***criterion*** is to select the numerical values (= features) that yield the maximum mathematical sum in the feature subset."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sbs\"></a></p>\n",
      "## 3. Sequential Backward Selection (SBS)\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "The ***Sequential Backward Selection (SBS)*** algorithm is very similar to the [***SFS***](#sfs), which we have just seen in the section above. The only difference is that we start with the complete feature set instead of the \"null set\" and remove features sequentially until we reach the number of desired features *k*.  \n",
      "Note that features are never added back once they were removed, which (similar to [***SFS***](#sfs)) is one of the biggest downsides of this algorithm.<br><hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "<hr>\n",
      "**Input:**  \n",
      "the set of all features,   \n",
      "- $Y = \\{y_1, y_2, ..., y_d\\}$  \n",
      "\n",
      "The ***SBS*** algorithm takes the whole feature set as input, e.g., all 10 features if our feature space consists of 10 dimensions ($d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Output:**  \n",
      "a subset of features,  \n",
      "- $X_k = \\{x_j \\; | \\; j = 1, 2, ..., k; \\; x_j \u2208 Y\\}, \\quad where \\; k = (0, 1, 2, ..., d)$  \n",
      "\n",
      "The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space ($k = 5, d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Initialization:**  \n",
      "- $X_0 = Y, \\; k = d$\n",
      "\n",
      "We initialize the algorithm with the given feature set so that $k = d$ (where $k$ is the size of the feature subset, which initially equals the size $d$ of the feature set).\n",
      "<br><br>\n",
      "\n",
      "**Step 1 (Exclusion):**  \n",
      "<br> \n",
      "- $x^- = arg \\; max \\; J(X_k - x), \\quad where \\; x \u2208 X_k$  \n",
      "- $X_{k-1} = X_k - x^-$  \n",
      "- $k = k - 1$    \n",
      "- $Go \\; to \\; Step 1$ \n",
      "\n",
      "We go through the ***feature subset*** and look for the feature $x^-$ whose removal from the ***feature subset*** yields the highest value of our criterion for the remaining features, i.e., the feature whose removal hurts the criterion the least (where $J()$ is the criterion function).  We repeat this process until we reach the ***Termination*** criterion.\n",
      "<br><br>\n",
      "\n",
      "**Termination:**  \n",
      "- stop when $k$ equals the number of desired features\n",
      "\n",
      "We remove features from the feature subset $X_k$ until we reach the number of specified features for our final subset. E.g., if our desired number of features is 5 and we start with the entire feature space (*Initialization*), we would remove features from the subset until it contains 5 features.\n",
      "<hr>\n",
      "<hr>"
     ]
    },
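    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a small worked trace of the exclusion step, assume again the toy criterion $J(X) = \\sum_{x \\in X} x$ (as in `simple_crit_func`), a hypothetical feature space $Y = \\{2, 5, 3\\}$, and a desired subset size of 1:\n",
      "\n",
      "- Initialization: $X_3 = Y = \\{2, 5, 3\\}, \\; k = 3$\n",
      "- Exclusion ($k = 3$): $J(\\{5, 3\\}) = 8, \\; J(\\{2, 3\\}) = 5, \\; J(\\{2, 5\\}) = 7 \\; \\Rightarrow \\; x^- = 2, \\; X_2 = \\{5, 3\\}$\n",
      "- Exclusion ($k = 2$): $J(\\{3\\}) = 3, \\; J(\\{5\\}) = 5 \\; \\Rightarrow \\; x^- = 3, \\; X_1 = \\{5\\}$\n",
      "- Termination: $k = 1$ equals the number of desired features, so the algorithm returns $X_1 = \\{5\\}$\n",
      "\n",
      "In each round, the removed feature $x^-$ is the one whose removal leaves the remaining subset with the largest criterion value."
     ]
    },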
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sbs_code\"></a></p>\n",
      "\n",
      "##SBS Code\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Sebastian Raschka\n",
      "# last updated: 03/29/2014 \n",
      "# Sequential Backward Selection (SBS)\n",
      "\n",
      "from copy import deepcopy\n",
      "\n",
      "def seq_backw_select(features, max_k, criterion_func, print_steps=False):\n",
      "    \"\"\"\n",
      "    Implementation of a Sequential Backward Selection algorithm.\n",
      "    \n",
      "    Keyword Arguments:\n",
      "        features (list): The feature space as a list of features.\n",
      "        max_k: Termination criterion; the size of the returned feature subset.\n",
      "        criterion_func (function): Function that is used to evaluate the\n",
      "            performance of the feature subset.\n",
      "        print_steps (bool): Prints the algorithm procedure if True.\n",
      "        \n",
      "    Returns the selected feature subset, a list of features of length max_k.\n",
      "\n",
      "    \"\"\"\n",
      "    # Initialization: start with a copy of the complete feature set\n",
      "    feat_sub = deepcopy(features)\n",
      "\n",
      "    # Repeat the exclusion step until the termination condition is met\n",
      "    while len(feat_sub) > max_k:\n",
      "        \n",
      "        # Exclusion step: find the feature whose removal maximizes the criterion\n",
      "        if print_steps:\n",
      "            print('\\nExclusion from feature subset', feat_sub)\n",
      "        worst_feat = len(feat_sub)-1\n",
      "        worst_feat_val = feat_sub[worst_feat]\n",
      "        crit_func_max = criterion_func(feat_sub[:-1]) \n",
      "\n",
      "        for i in reversed(range(0,len(feat_sub)-1)):\n",
      "            crit_func_eval = criterion_func(feat_sub[:i] + feat_sub[i+1:])\n",
      "            if crit_func_eval > crit_func_max:\n",
      "                worst_feat, crit_func_max = i, crit_func_eval\n",
      "                worst_feat_val = feat_sub[worst_feat]\n",
      "        del feat_sub[worst_feat]\n",
      "        if print_steps:\n",
      "            print('exclude: {} -> feature subset: {}'.format(worst_feat_val, feat_sub))\n",
      "                \n",
      "    return feat_sub"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 4
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_sbs\"></a></p>\n",
      "### Example SBS:\n",
      "\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "\n",
      "Like we did for the [***SFS example***](#example_sfs) above, we take a look at the individual steps of the [***SBS***](#sbs) algorithm to select a ***feature subset*** consisting of 3 features out of a ***feature space*** of size 10.  \n",
      "The input feature space consists of 10 integers: 6, 3, 1, 6, 8, 2, 3, 7, 9, 1,  \n",
      "and our criterion is to find a subset of size 3 in this ***feature space*** that maximizes the integer sum in this ***feature subset***."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_seq_backw_select():\n",
      "    ex_features = [6,3,1,6,8,2,3,7,9,1]\n",
      "    res_backw = seq_backw_select(features=ex_features, max_k=3,\\\n",
      "                                 criterion_func=simple_crit_func, print_steps=True)  \n",
      "    return (res_backw)\n",
      "    \n",
      "# Run example\n",
      "res_backw = example_seq_backw_select()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res_backw)\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "exclude: 1 -> feature subset: [6, 3, 1, 6, 8, 2, 3, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 2, 3, 7, 9]\n",
        "exclude: 1 -> feature subset: [6, 3, 6, 8, 2, 3, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 6, 8, 2, 3, 7, 9]\n",
        "exclude: 2 -> feature subset: [6, 3, 6, 8, 3, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 6, 8, 3, 7, 9]\n",
        "exclude: 3 -> feature subset: [6, 3, 6, 8, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 6, 8, 7, 9]\n",
        "exclude: 3 -> feature subset: [6, 6, 8, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 6, 8, 7, 9]\n",
        "exclude: 6 -> feature subset: [6, 8, 7, 9]\n",
        "\n",
        "Exclusion from feature subset [6, 8, 7, 9]\n",
        "exclude: 6 -> feature subset: [8, 7, 9]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [8, 7, 9]\n"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Result:\n",
      "The returned ***feature subset*** contains the same features as the result of the [***SFS***](#example_sfs) algorithm that we have seen above (only in a different order), which is what we would expect: the 3 highest values (note that we defined 3 as the number of desired features for our subset) in the input feature list, since our ***criterion*** is to select the numerical values (= features) that yield the maximum mathematical sum in the feature subset."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"pLmR\"></a></p>\n",
      "## 4. \"Plus L take away R\" (+L -R)\n",
      "\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "\n",
      "The ***\"Plus L take away R\" (+L -R)*** algorithm is basically a combination of [***SFS***](#sfs) and [***SBS***](#sbs). It adds features to the ***feature subset*** *L* times and afterwards removes features *R* times, repeating both steps until the ***feature subset*** reaches the desired size.\n",
      "\n",
      "**Variant 1: L > R**  \n",
      "If *L > R*, the algorithm starts with an empty ***feature subset*** and adds *L* features to it from the ***feature space***. Then it goes over to step 2, where it removes *R* features from the ***feature subset***, after which it goes back to step 1 to add *L* features again.  \n",
      "Those steps are repeated until the ***feature subset*** reaches the desired size *k*.  \n",
      "<br>\n",
      "**Variant 2: R > L**  \n",
      "Else, if *R > L*, the algorithm starts with the whole ***feature space*** as ***feature subset***. It removes *R* features from it before it adds back *L* features from those features that were just removed.  \n",
      "Those steps are repeated until the ***feature subset*** reaches the desired size *k*.  \n",
      "<br><hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "<hr>\n",
      "**Input:**  \n",
      "the set of all features,  \n",
      "- $Y = \\{y_1, y_2, ..., y_d\\}$  \n",
      "\n",
      "The ***+L -R*** algorithm takes the whole feature set as input, e.g., all 10 features if our feature space consists of 10 dimensions ($d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Output:**  \n",
      "a subset of features,  \n",
      "- $X_k = \\{x_j \\; | \\; j = 1, 2, ..., k; \\; x_j \u2208 Y\\}, \\quad where \\; k = (0, 1, 2, ..., d)$  \n",
      "\n",
      "The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space ($k = 5, \\; d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Initialization:**  \n",
      "- $X_0 = \\emptyset, \\; k = 0$ (if $L > R$)  \n",
      "- $X_0 = Y, \\; k = d$ (if $R > L$)\n",
      "\n",
      "If $L > R$, we initialize the algorithm with an empty set so that $k = 0$; if $R > L$, we initialize it with the given feature set so that $k = d$ (where $k$ is the size of the feature subset and $d$ the size of the feature set).\n",
      "<br><br>\n",
      "\n",
      "**Step 1 (Inclusion):**  \n",
      "<br>\n",
      "- *repeat L-times:*   \n",
      "    - $x^+ = arg \\; max \\; J(X_k + x), \\quad where \\; x \u2208 Y - X_k $    \n",
      "    - $X_{k+1} = X_k + x^+$    \n",
      "    - $k = k + 1$  \n",
      "- $Go \\; to\\; Step 2$  \n",
      "<br> <br> \n",
      "**Step 2 (Exclusion):**  \n",
      "<br> \n",
      "- *repeat R-times:*  \n",
      "    - $x^- = arg \\; max \\; J(X_k - x), \\quad where \\; x \u2208 X_k$  \n",
      "    - $X_{k-1} = X_k - x^-$  \n",
      "    - $k = k - 1$   \n",
      "- $Go \\; to \\; Step 1$  \n",
      "\n",
      "In step 1, we go *L-times* through the ***feature space*** and look for the feature $x^+$ which maximizes our criterion if we add it to the ***feature subset*** (where $J()$ is the criterion function). Then we go over to step 2.  \n",
      "In step 2, we go *R-times* through the ***feature subset*** and look for the feature $x^-$ whose removal from the ***feature subset*** yields the highest value of our criterion for the remaining features (where $J()$ is the criterion function).  Then we go back to step 1.  \n",
      "Note that this order of steps only applies if $L > R$; in the opposite case ($R > L$), we have to start with the ***Exclusion*** on the whole ***feature space***, followed by the inclusion of removed features.\n",
      "<br><br>\n",
      "\n",
      "**Termination:**  \n",
      "- stop when $k$ equals the number of desired features\n",
      "\n",
      "We add and remove features from the feature subset $X_k$ until we reach the number of specified features for our final subset. E.g., if our desired number of features is 5 and we start with the entire feature space (*Initialization*), we would remove features from the subset until it contains 5 features.\n",
      "<hr>\n",
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"pLmR_code\"></a></p>\n",
      "\n",
      "##+L -R Code\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Sebastian Raschka\n",
      "# last updated: 03/29/2014 \n",
      "# \"Plus L take away R\" (+L -R)\n",
      "\n",
      "from copy import deepcopy\n",
      "\n",
      "def plus_L_minus_R(features, max_k, criterion_func, L=3, R=2, print_steps=False):\n",
      "    \"\"\"\n",
      "    Implementation of a \"Plus l take away r\" algorithm.\n",
      "    \n",
      "    Keyword Arguments:\n",
      "        features (list): The feature space as a list of features.\n",
      "        max_k: Termination criterion; the size of the returned feature subset.\n",
      "        criterion_func (function): Function that is used to evaluate the\n",
      "            performance of the feature subset.\n",
      "        L (int): Number of features added per iteration.\n",
      "        R (int): Number of features removed per iteration.\n",
      "        print_steps (bool): Prints the algorithm procedure if True.\n",
      "    \n",
      "    Returns the selected feature subset, a list of features of length max_k.\n",
      "\n",
      "    \"\"\"\n",
      "    assert(L != R), 'L must be != R to avoid an infinite loop'\n",
      "    \n",
      "    ############################\n",
      "    ### +L -R for case L > R ###\n",
      "    ############################\n",
      "    \n",
      "    if L > R:\n",
      "        feat_sub = []\n",
      "        k = 0\n",
      "        \n",
      "        # Initialization\n",
      "        while True:\n",
      "        \n",
      "            # +L (Inclusion)\n",
      "            if print_steps:\n",
      "                print('\\nInclusion from features', features)\n",
      "            for i in range(L):\n",
      "                if len(features) > 0:\n",
      "                    crit_func_max = criterion_func(feat_sub + [features[0]])\n",
      "                    best_feat = features[0]\n",
      "                    if len(features) > 1:\n",
      "                        for x in features[1:]:\n",
      "                            crit_func_eval = criterion_func(feat_sub + [x])\n",
      "                            if crit_func_eval > crit_func_max:\n",
      "                                crit_func_max = crit_func_eval\n",
      "                                best_feat = x\n",
      "                    features.remove(best_feat)\n",
      "                    feat_sub.append(best_feat)\n",
      "                    if print_steps:\n",
      "                        print('include: {} -> feature_subset: {}'.format(best_feat, feat_sub))\n",
      "        \n",
      "            # -R (Exclusion)\n",
      "            if print_steps:\n",
      "                print('\\nExclusion from feature_subset', feat_sub)\n",
      "            for i in range(R):\n",
      "                if len(features) + len(feat_sub) > max_k:\n",
      "                    worst_feat = len(feat_sub)-1\n",
      "                    worst_feat_val = feat_sub[worst_feat]\n",
      "                    crit_func_max = criterion_func(feat_sub[:-1]) \n",
      "\n",
      "                    for j in reversed(range(0,len(feat_sub)-1)):\n",
      "                        crit_func_eval = criterion_func(feat_sub[:j] + feat_sub[j+1:])\n",
      "                        if crit_func_eval > crit_func_max:\n",
      "                            worst_feat, crit_func_max = j, crit_func_eval\n",
      "                            worst_feat_val = feat_sub[worst_feat]\n",
      "                    del feat_sub[worst_feat]\n",
      "                    if print_steps:\n",
      "                        print('exclude: {} -> feature subset: {}'.format(worst_feat_val, feat_sub))\n",
      "                \n",
      "        \n",
      "            # Termination condition\n",
      "            k = len(feat_sub)\n",
      "            if k == max_k:\n",
      "                break\n",
      "                \n",
      "        return feat_sub\n",
      "    \n",
      "    ############################\n",
      "    ### +L -R for case L < R ###\n",
      "    ############################\n",
      "\n",
      "    else:\n",
      "        # Initialization\n",
      "        feat_sub = deepcopy(features)\n",
      "        k = len(feat_sub)\n",
      "        i = 0\n",
      "        count = 0\n",
      "        while True:\n",
      "            count += 1\n",
      "            # Exclusion step\n",
      "            removed_feats = []\n",
      "            if print_steps:\n",
      "                print('\\nExclusion from feature subset', feat_sub)\n",
      "            for i in range(R):\n",
      "                if len(feat_sub) > max_k:\n",
      "                    worst_feat = len(feat_sub)-1\n",
      "                    worst_feat_val = feat_sub[worst_feat]\n",
      "                    crit_func_max = criterion_func(feat_sub[:-1]) \n",
      "\n",
      "                    for i in reversed(range(0,len(feat_sub)-1)):\n",
      "                        crit_func_eval = criterion_func(feat_sub[:i] + feat_sub[i+1:])\n",
      "                        if crit_func_eval > crit_func_max:\n",
      "                            worst_feat, crit_func_max = i, crit_func_eval\n",
      "                            worst_feat_val = feat_sub[worst_feat]\n",
      "                    removed_feats.append(feat_sub.pop(worst_feat))\n",
      "            if print_steps:\n",
      "                print('exclude: {} -> feature subset: {}'.format(removed_feats, feat_sub))\n",
      "            \n",
      "            # +L (Inclusion)\n",
      "            included_feats = []\n",
      "            if len(feat_sub) != max_k:\n",
      "                for i in range(L):\n",
      "                    if len(removed_feats) > 0:\n",
      "                        crit_func_max = criterion_func(feat_sub + [removed_feats[0]])\n",
      "                        best_feat = removed_feats[0]\n",
      "                        if len(removed_feats) > 1:\n",
      "                            for x in removed_feats[1:]:\n",
      "                                crit_func_eval = criterion_func(feat_sub + [x])\n",
      "                                if crit_func_eval > crit_func_max:\n",
      "                                    crit_func_max = crit_func_eval\n",
      "                                    best_feat = x\n",
      "                        removed_feats.remove(best_feat)\n",
      "                        feat_sub.append(best_feat)\n",
      "                        included_feats.append(best_feat)\n",
      "                if print_steps:\n",
      "                    print('\\nInclusion from removed features', removed_feats)\n",
      "                    print('include: {} -> feature_subset: {}'.format(included_feats, feat_sub))\n",
      "                        \n",
      "            # Termination condition\n",
      "            k = len(feat_sub)\n",
      "            if k == max_k:\n",
      "                break\n",
      "            if count >= 30:\n",
      "                break\n",
      "        return feat_sub\n",
      "        "
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_pLmR\"></a></p>\n",
      "### Example +L -R:\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "Like we did for the [***SFS***](#example_sfs) example above, let's have look at the individual steps of the ***+L -R*** algorithmn to select a ***feature subset*** consisting of 3 features out of a ***feature space*** of size 10.  \n",
      "Again, the input feature space consists of the 10 integers: 6, 3, 1, 6, 8, 2, 3, 7, 9, 1,  \n",
      "and our criterion is to find a subset of size 3 in this ***feature space*** that maximizes the integer sum in this ***feature subset***."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_pLmR1\"></a></p>\n",
      "#### Example 1:  L > R\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_plus_L_minus_R():\n",
      "    ex_features = [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
      "    res_plmr = plus_L_minus_R(features=ex_features, max_k=3,\\\n",
      "                               criterion_func=simple_crit_func, L=3, R=2, print_steps=True) \n",
      "     \n",
      "    return (res_plmr)\n",
      " \n",
      "# Run example\n",
      "res = example_plus_L_minus_R()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Inclusion from features [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "include: 9 -> feature_subset: [9]\n",
        "include: 8 -> feature_subset: [9, 8]\n",
        "include: 7 -> feature_subset: [9, 8, 7]\n",
        "\n",
        "Exclusion from feature_subset [9, 8, 7]\n",
        "exclude: 7 -> feature subset: [9, 8]\n",
        "exclude: 8 -> feature subset: [9]\n",
        "\n",
        "Inclusion from features [6, 3, 1, 6, 2, 3, 1]\n",
        "include: 6 -> feature_subset: [9, 6]\n",
        "include: 6 -> feature_subset: [9, 6, 6]\n",
        "include: 3 -> feature_subset: [9, 6, 6, 3]\n",
        "\n",
        "Exclusion from feature_subset [9, 6, 6, 3]\n",
        "exclude: 3 -> feature subset: [9, 6, 6]\n",
        "exclude: 6 -> feature subset: [9, 6]\n",
        "\n",
        "Inclusion from features [1, 2, 3, 1]\n",
        "include: 3 -> feature_subset: [9, 6, 3]\n",
        "include: 2 -> feature_subset: [9, 6, 3, 2]\n",
        "include: 1 -> feature_subset: [9, 6, 3, 2, 1]\n",
        "\n",
        "Exclusion from feature_subset [9, 6, 3, 2, 1]\n",
        "exclude: 1 -> feature subset: [9, 6, 3, 2]\n",
        "exclude: 2 -> feature subset: [9, 6, 3]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [9, 6, 3]\n"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_pLmR2\"></a></p>\n",
      "#### Example 2:  R > L\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_plus_L_minus_R():\n",
      "    ex_features = [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
      "    res_plmr = plus_L_minus_R(features=ex_features, max_k=3,\\\n",
      "                               criterion_func=simple_crit_func, L=2, R=3, print_steps=True) \n",
      "     \n",
      "    return (res_plmr)\n",
      " \n",
      "# Run example\n",
      "res = example_plus_L_minus_R()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "exclude: [1, 1, 2] -> feature subset: [6, 3, 6, 8, 3, 7, 9]\n",
        "\n",
        "Inclusion from removed features [1]\n",
        "include: [2, 1] -> feature_subset: [6, 3, 6, 8, 3, 7, 9, 2, 1]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 6, 8, 3, 7, 9, 2, 1]\n",
        "exclude: [1, 2, 3] -> feature subset: [6, 3, 6, 8, 7, 9]\n",
        "\n",
        "Inclusion from removed features [1]\n",
        "include: [3, 2] -> feature_subset: [6, 3, 6, 8, 7, 9, 3, 2]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 6, 8, 7, 9, 3, 2]\n",
        "exclude: [2, 3, 3] -> feature subset: [6, 6, 8, 7, 9]\n",
        "\n",
        "Inclusion from removed features [2]\n",
        "include: [3, 3] -> feature_subset: [6, 6, 8, 7, 9, 3, 3]\n",
        "\n",
        "Exclusion from feature subset [6, 6, 8, 7, 9, 3, 3]\n",
        "exclude: [3, 3, 6] -> feature subset: [6, 8, 7, 9]\n",
        "\n",
        "Inclusion from removed features [3]\n",
        "include: [6, 3] -> feature_subset: [6, 8, 7, 9, 6, 3]\n",
        "\n",
        "Exclusion from feature subset [6, 8, 7, 9, 6, 3]\n",
        "exclude: [3, 6, 6] -> feature subset: [8, 7, 9]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [8, 7, 9]\n"
       ]
      }
     ],
     "prompt_number": 8
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Result:\n",
      "The returned ***feature subset*** really is suboptimal in this particular example for L > R ([Example 1: L > R](#example_pLmR1)). This is due to the fact that we add multiple features to our ***feature subset*** and we also remove multiple features from it; we never add back any of the removed features to the investigated ***feature space***.  \n",
      "\n",
      "**Modifying the \"Plus L take away R\" algorithm**  \n",
      "This algorithm can be tweaked by adding back *r - 1* features to the ***feature subset*** after each ***Exclusion*** step to be assessed by the ***criterion function*** for inclusion in the next iteration. This is a decision that has to be made by considering the particular application, since it decreases the computational performance of the algorithm, but improves the performance of the resulting ***feature subset*** as a classifier."
     ]
    },
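    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a quick sanity check on the result discussed above, the following sketch compares the subset returned in Example 1 with the best possible subset of size 3. It assumes the plain integer sum as criterion, for which the best subset simply consists of the 3 largest values; the variable names are chosen for illustration only and are not part of the algorithm code."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Illustrative check (assumption: the criterion is the plain integer sum,\n",
      "# so the best subset of size 3 consists of the 3 largest values)\n",
      "ex_features = [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
      "plmr_result = [9, 6, 3]   # subset returned in Example 1 (L=3, R=2)\n",
      "\n",
      "best_subset = sorted(ex_features, reverse=True)[:3]\n",
      "\n",
      "print('+L -R result:', plmr_result, '-> sum', sum(plmr_result))\n",
      "print('best subset: ', best_subset, '-> sum', sum(best_subset))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },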
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sffs\"></a></p>\n",
      "## 5. Sequential Floating Forward Selection (SFFS)\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "The ***Sequential Floating Forward Selection (SFFS)*** algorithm can be considered as extension of the simpler [***SFS***](#sfs) algorithm, which we have seen in the very beginning. In constrast to ***SFS***, the ***SFFS*** algorithm **can** remove features once they were included, so that a larger number of feature subset combinations can be sampled. It is important to emphasize that the removal of included features is **conditional**, which makes it different from the [***+L -R***](#pLmR) algorithm. The ***Conditional Exclusion*** in ***SFFS*** only occurs if the resultin feature subset is assessed as \"better\" by the ***criterion function*** after removal of a particular feature."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "<hr>\n",
      "**Input:**  \n",
      "the set of all features,  \n",
      "- $Y = {y_1, y_2, ..., y_d}$    \n",
      "\n",
      "The ***SFFS*** algorithm takes the whole feature set as input, if our feature space consists of, e.g. 10, if our feature space consists of 10 dimensions ($d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Output:**  \n",
      "a subset of features,  \n",
      "- $X_k = {x_j \\; | \\; j = 1, 2, ..., k; x_j \u2208 Y}, \\quad where \\; k = (0, 1, 2, ..., d)$  \n",
      "\n",
      "The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space ($k = 5, \\; d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Initialization:**  \n",
      "- $X_0 = \\emptyset, \\quad k = 0$\n",
      "\n",
      "We initialize the algorithm with an empty set (\"null set\") so that the $k = 0$ (where $k$ is the size of the subset)\n",
      "<br><br>\n",
      "\n",
      "**Step 1 (Inclusion):**  \n",
      "<br> \n",
      "- $x^+ = arg\\; max \\;J(x_k + x), \\quad  where \\; x \u2208 Y - X_k$   \n",
      "- $X_k+1 = X_k + x^+$   \n",
      "- $k = k + 1$   \n",
      "- $Go \\; to \\; Step 2  $ \n",
      "<br> <br> \n",
      "**Step 2 (Conditional Exclusion):**  \n",
      "<br> \n",
      "- $x^- = arg \\; max \\; J(x_k - x),\\quad where \\; x \u2208 X_k$   \n",
      "- $if\\; J(x_k - x) > J(x_k - x):$      \n",
      "    - $X_k-1 = X_k - x^-$   \n",
      "    - $k = k - 1$    \n",
      "- $Go \\; to \\; Step 1$  \n",
      "\n",
      "In step 1, we include the feature from the ***feature space*** that leads to the best performance increase for our ***feature subset*** (assessed by the ***criterion function***). Then, we go over to step 2  \n",
      "In step 2, we only remove a feature if the resulting subset would gain an increase in performance. We go back to step 1.  \n",
      "Steps 1 and 2 are reapeated until the **Termination** criterion is reached.\n",
      "<br><br>\n",
      "\n",
      "**Termination:**  \n",
      "- stop when $k$ equals the number of desired features\n",
      "\n",
      "<hr>\n",
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sffs_code\"></a></p>\n",
      "\n",
      "##SFFS Code\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Sebastian Raschka\n",
      "# last updated: 03/29/2014 \n",
      "# Sequential Floating Forward Selection (SFFS)\n",
      "\n",
      "def seq_float_forw_select(features, max_k, criterion_func, print_steps=False):\n",
      "    \"\"\"\n",
      "    Implementation of Sequential Floating Forward Selection.\n",
      "    \n",
      "    Keyword Arguments:\n",
      "        features (list): The feature space as a list of features.\n",
      "        max_k: Termination criterion; the size of the returned feature subset.\n",
      "        criterion_func (function): Function that is used to evaluate the\n",
      "            performance of the feature subset.\n",
      "        print_steps (bool): Prints the algorithm procedure if True.\n",
      "    \n",
      "    Returns the selected feature subset, a list of features of length max_k.\n",
      "\n",
      "    \"\"\"\n",
      "\n",
      "    # Initialization\n",
      "    feat_sub = []\n",
      "    k = 0\n",
      "    \n",
      "    while True:\n",
      "       \n",
      "        # Step 1: Inclusion\n",
      "        if print_steps:\n",
      "            print('\\nInclusion from features', features)\n",
      "        if len(features) > 0:\n",
      "            crit_func_max = criterion_func(feat_sub + [features[0]])\n",
      "            best_feat = features[0]\n",
      "            if len(features) > 1:\n",
      "                for x in features[1:]:\n",
      "                    crit_func_eval = criterion_func(feat_sub + [x])\n",
      "                    if crit_func_eval > crit_func_max:\n",
      "                        crit_func_max = crit_func_eval\n",
      "                        best_feat = x\n",
      "            features.remove(best_feat)\n",
      "            feat_sub.append(best_feat)\n",
      "            if print_steps:\n",
      "                print('include: {} -> feature_subset: {}'.format(best_feat, feat_sub))\n",
      "        \n",
      "        # Step 2: Conditional Exclusion\n",
      "            worst_feat_val = None\n",
      "            if len(features) + len(feat_sub) > max_k:\n",
      "                crit_func_max = criterion_func(feat_sub) \n",
      "                for i in reversed(range(0,len(feat_sub))):\n",
      "                    crit_func_eval = criterion_func(feat_sub[:i] + feat_sub[i+1:])\n",
      "                    if crit_func_eval > crit_func_max:\n",
      "                        worst_feat, crit_func_max = i, crit_func_eval\n",
      "                        worst_feat_val = feat_sub[worst_feat]\n",
      "                if worst_feat_val:        \n",
      "                    del feat_sub[worst_feat]\n",
      "            if print_steps:\n",
      "                print('exclude: {} -> feature subset: {}'.format(worst_feat_val, feat_sub))\n",
      "            \n",
      "        \n",
      "        # Termination condition\n",
      "        k = len(feat_sub)\n",
      "        if k == max_k:\n",
      "            break\n",
      "                \n",
      "    return feat_sub"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 9
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_sffs\"></a></p>\n",
      "### Example SFFS:\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "Since the ***Exclusion*** step in the ***Sequential Floating Forward*** algorithm is ***conditional*** - a feature is only removed if the criterion function asses a better performance after removal - we would never exclude any feature using our [`simple_criterion_func()`](#crit_func\"), which returns the integer sum of a feature set. Thus, let us define another simple criterion function that we use for testing our ***SFFS*** algorithm.\n",
      "\n",
      "\n",
      "Just as we did for the previous examples above, let's take a look at the individual steps of the ***SFFS*** algorithmn to select a ***feature subset*** consisting of 3 features out of a ***feature space*** of size 10.  \n",
      "Also here, the input feature space consists of the 10 integers: 6, 3, 1, 6, 8, 2, 3, 7, 9, 1,  \n",
      "and our criterion is to find a subset of size 3 in this ***feature space*** that maximizes the integer sum in this ***feature subset***."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"random_crit_func\"></a></p>\n",
      "#### A simple criterion function with a random parameter\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "The criterion function we define below is also calculates the sum of a subset similar to the [`simple_criterion_func()`](#crit_func\") we used before. However, here we add a random integer ranging from -15 to 15 to the returned sum. Therefore, in some occasions, our criterion function can return a larger sum for a smaller subset - after we removed a feature from the subset after the ***Inclusion*** step - in order to trigger the ***Conditional Exclusion*** step."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from random import randint\n",
      "\n",
      "def simple_rand_crit_func(feat_sub):\n",
      "    \"\"\" \n",
      "    Returns sum of numerical values of an input list plus \n",
      "    a random integer ranging from -15 to 15. \n",
      "    \n",
      "    \"\"\" \n",
      "    return sum(feat_sub) + randint(-15,15)\n",
      "\n",
      "# Example:\n",
      "simple_rand_crit_func([1,2,4])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 10,
       "text": [
        "19"
       ]
      }
     ],
     "prompt_number": 10
    },
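    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Because the criterion function above draws from Python's shared pseudo-random number generator, repeated runs of the examples will usually print different steps and can return different subsets. A minimal sketch of how a run could be made repeatable is shown below; the seed value 123 is an arbitrary assumption (any integer would do), and the variable names are for illustration only."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import random\n",
      "from random import randint\n",
      "\n",
      "# Seeding the shared generator makes the following randint() calls repeatable\n",
      "random.seed(123)   # arbitrary seed, chosen for illustration\n",
      "first = sum([1, 2, 4]) + randint(-15, 15)\n",
      "\n",
      "random.seed(123)   # re-seeding restarts the same pseudo-random sequence\n",
      "second = sum([1, 2, 4]) + randint(-15, 15)\n",
      "\n",
      "# Both evaluations used the same random offset\n",
      "print(first == second)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },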
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_seq_float_forw_select():\n",
      "    ex_features = [6,3,1,6,8,2,3,7,9,1]\n",
      "    res_seq_flforw = seq_float_forw_select(features=ex_features, max_k=3,\\\n",
      "                                 criterion_func=simple_rand_crit_func, print_steps=True)  \n",
      "    return res_seq_flforw\n",
      "    \n",
      "# Run example\n",
      "res_seq_flforw = example_seq_float_forw_select()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res_seq_flforw)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Inclusion from features [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "include: 7 -> feature_subset: [7]\n",
        "exclude: None -> feature subset: [7]\n",
        "\n",
        "Inclusion from features [6, 3, 1, 6, 8, 2, 3, 9, 1]\n",
        "include: 1 -> feature_subset: [7, 1]\n",
        "exclude: 7 -> feature subset: [1]\n",
        "\n",
        "Inclusion from features [6, 3, 6, 8, 2, 3, 9, 1]\n",
        "include: 3 -> feature_subset: [1, 3]\n",
        "exclude: None -> feature subset: [1, 3]\n",
        "\n",
        "Inclusion from features [6, 6, 8, 2, 3, 9, 1]\n",
        "include: 9 -> feature_subset: [1, 3, 9]\n",
        "exclude: None -> feature subset: [1, 3, 9]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [1, 3, 9]\n"
       ]
      }
     ],
     "prompt_number": 11
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sfbs\"></a></p>\n",
      "## 6. Sequential Floating Backward Selection (SFBS)\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "Just as in the [***SFFS***](#sffs) algorithm, we have a conditional step: Here, we start with the whole feature subset and exclude features sequentially. Only if adding one of the previously excluded features back to a new ***feature subset*** improves the performance (assessed by the criterion function), we add it back in the ***Conditional Inclusion*** step."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "<hr>\n",
      "**Input:**  \n",
      "the set of all features,   \n",
      "- $Y = {y_1, y_2, ..., y_d}$  \n",
      "\n",
      "The ***SFBS*** algorithm takes the whole feature set as input, if our feature space consists of, e.g. 10, if our feature space consists of 10 dimensions ($d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Output:**  \n",
      "a subset of features,  \n",
      "- $X_k = {x_j \\; | \\; j = 1, 2, ..., k; x_j \u2208 Y}, \\quad where \\; k = (0, 1, 2, ..., d)$   \n",
      "\n",
      "The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space ($k = 5,\\; d = 10$).\n",
      "<br><br>\n",
      "\n",
      "**Initialization:**  \n",
      "- $X_0 = Y, \\quad k = d$\n",
      "\n",
      "We initialize the algorithm with the given feature set so that the $k = d$ (where $k$ has the size of the feature set $d$)\n",
      "<br><br>\n",
      "\n",
      "**Step 1 (Exclusion):**  \n",
      "<br> \n",
      "- $x^- = arg max J(x_k - x), \\quad where \\; x \u2208 X_k$  \n",
      "- $X_k-1 = X_k - x^-$  \n",
      "- $k = k - 1$    \n",
      "- $Go \\;to \\;Step 2$    \n",
      "\n",
      "<br> <br> \n",
      "**Step 2 (Conditional Inclusion):**  \n",
      "<br> \n",
      "- $x^+ = arg max J(x_k + x), \\quad where \\; x \u2208 Y - X_k$    \n",
      "- $if J(x_k + x) > J(x_k + x):$      \n",
      "     - $X_k+1 = X_k + x^+$    \n",
      "     - $k = k + 1$    \n",
      "- $Go\\; to\\; Step 1$  \n",
      "\n",
      "In step 1, we exclude the feature from the ***feature space*** that yields the best performance increase of the  ***feature subset*** (assessed by the ***criterion function***). Then, we go over to step 2.  \n",
      "In step 2, we only include one of the removed features if the resulting subset would gain an increase in performance. We go back to step 1.   \n",
      "Steps 1 and 2 are reapeated until the **Termination** criterion is reached.\n",
      "<br><br>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"sfbs_code\"></a></p>\n",
      "##SFBS Code\n",
      "\n",
      "[[back to top]](#sections)\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Sebastian Raschka\n",
      "# last updated: 03/29/2014 \n",
      "# Sequential Floating Backward Selection (SFBS)\n",
      "\n",
      "from copy import deepcopy\n",
      "\n",
      "def seq_float_backw_select(features, max_k, criterion_func, print_steps=False):\n",
      "    \"\"\"\n",
      "    Implementation of Sequential Floating Backward Selection.\n",
      "    \n",
      "    Keyword Arguments:\n",
      "        features (list): The feature space as a list of features.\n",
      "        max_k: Termination criterion; the size of the returned feature subset.\n",
      "        criterion_func (function): Function that is used to evaluate the\n",
      "            performance of the feature subset.\n",
      "        print_steps (bool): Prints the algorithm procedure if True.\n",
      "    \n",
      "    Returns the selected feature subset, a list of features of length max_k.\n",
      "\n",
      "    \"\"\"\n",
      "\n",
      "    # Initialization\n",
      "    feat_sub = deepcopy(features)\n",
      "    k = len(feat_sub)\n",
      "    i = 0\n",
      "    excluded_features = []\n",
      "    \n",
      "    while True:\n",
      "        \n",
      "        # Termination condition\n",
      "        k = len(feat_sub)\n",
      "        if k == max_k:\n",
      "            break\n",
      "        \n",
      "        # Step 1: Exclusion\n",
      "        if print_steps:\n",
      "            print('\\nExclusion from feature subset', feat_sub)\n",
      "        worst_feat = len(feat_sub)-1\n",
      "        worst_feat_val = feat_sub[worst_feat]\n",
      "        crit_func_max = criterion_func(feat_sub[:-1]) \n",
      "\n",
      "        for i in reversed(range(0,len(feat_sub)-1)):\n",
      "            crit_func_eval = criterion_func(feat_sub[:i] + feat_sub[i+1:])\n",
      "            if crit_func_eval > crit_func_max:\n",
      "                worst_feat, crit_func_max = i, crit_func_eval\n",
      "                worst_feat_val = feat_sub[worst_feat]\n",
      "        excluded_features.append(feat_sub[worst_feat])\n",
      "        del feat_sub[worst_feat]\n",
      "        if print_steps:\n",
      "            print('exclude: {} -> feature subset: {}'.format(worst_feat_val, feat_sub))\n",
      "       \n",
      "        # Step 2: Conditional Inclusion\n",
      "        if len(excluded_features) > 0 and len(feat_sub) != max_k:\n",
      "            best_feat = None\n",
      "            best_feat_val = None\n",
      "            crit_func_max = criterion_func(feat_sub)\n",
      "            for i in range(len(excluded_features)):\n",
      "                crit_func_eval = criterion_func(feat_sub + [excluded_features[i]])\n",
      "                if crit_func_eval > crit_func_max:\n",
      "                    best_feat, crit_func_max = i, crit_func_eval\n",
      "                    best_feat_val = excluded_features[best_feat]\n",
      "            if best_feat:\n",
      "                feat_sub.append(excluded_features[best_feat])\n",
      "                del excluded_features[best_feat]\n",
      "            if print_steps:\n",
      "                    print('include: {} -> feature subset: {}'.\\\n",
      "                          format(best_feat_val, feat_sub))\n",
      "    \n",
      "    return feat_sub"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 16
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<p><a name=\"example_sfbs\"></a></p>\n",
      "### Example SFBS:\n",
      "\n",
      "[[back to top]](#sections)\n",
      "\n",
      "Note that the ***Inclusion*** step in the ***Sequential Floating Backward Selection*** algorithm is ***conditional*** - a feature is only added back if the criterion function asses a better performance after its inclusion in the ***feature subset***.  \n",
      "Therefore, we have to be a little bit careful about the ***criterion function***: If we used our [`simple_criterion_func()`](#crit_func\"), which returns the integer sum of a subset, we would trigger an infinite loop and never reach the termination criterion - assuming that our feature space consists of positive integers. The reason is that the [`simple_criterion_func()`](#crit_func\") would always return a larger sum if we include a positive integer (our feature) to the ***feature subset***, thus let us use our [second simple criterion function](#random_crit_func), which contains a (pseudo) random integer between -15 and 15 to the returned sum.  \n",
      "In order to reduce the number of iterations, we set the number of desired features for the ***feature subset*** (via the argument `max_k`) to 7."
     ]
    },
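    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The argument above can be checked with a small sketch. It assumes the plain integer sum as criterion $J$ and uses illustrative values for the current subset and the excluded feature (the state after one hypothetical exclusion step); the variable names are not part of the algorithm code. For positive features, the ***Conditional Inclusion*** test $J(X_k + x) > J(X_k)$ holds for every candidate, which is why a feature would be added back after each exclusion."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Illustrative state after one exclusion step (values assumed for this sketch)\n",
      "feat_sub = [6, 3, 1, 6, 8, 3, 7, 9, 1]\n",
      "excluded_features = [2]\n",
      "\n",
      "# Conditional Inclusion test J(X_k + x) > J(X_k), with the plain sum as J\n",
      "for x in excluded_features:\n",
      "    triggers_inclusion = sum(feat_sub + [x]) > sum(feat_sub)\n",
      "    print('feature', x, '-> inclusion triggered:', triggers_inclusion)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },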
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def example_seq_float_backw_select():\n",
      "    ex_features = [6,3,1,6,8,2,3,7,9,1]\n",
      "    res_seq_flbackw = seq_float_backw_select(features=ex_features, max_k=7,\\\n",
      "                                 criterion_func=simple_rand_crit_func, print_steps=True)  \n",
      "    return res_seq_flbackw\n",
      "    \n",
      "# Run example\n",
      "res_seq_flbackw = example_seq_float_backw_select()\n",
      "print('\\nRESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] ->', res_seq_flbackw)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 2, 3, 7, 9, 1]\n",
        "exclude: 2 -> feature subset: [6, 3, 1, 6, 8, 3, 7, 9, 1]\n",
        "include: None -> feature subset: [6, 3, 1, 6, 8, 3, 7, 9, 1]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 3, 7, 9, 1]\n",
        "exclude: 3 -> feature subset: [6, 3, 1, 6, 8, 7, 9, 1]\n",
        "include: 3 -> feature subset: [6, 3, 1, 6, 8, 7, 9, 1, 3]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 7, 9, 1, 3]\n",
        "exclude: 3 -> feature subset: [6, 3, 1, 6, 8, 7, 9, 1]\n",
        "include: None -> feature subset: [6, 3, 1, 6, 8, 7, 9, 1]\n",
        "\n",
        "Exclusion from feature subset [6, 3, 1, 6, 8, 7, 9, 1]\n",
        "exclude: 6 -> feature subset: [3, 1, 6, 8, 7, 9, 1]\n",
        "\n",
        "RESULT: [6, 3, 1, 6, 8, 2, 3, 7, 9, 1] -> [3, 1, 6, 8, 7, 9, 1]\n"
       ]
      }
     ],
     "prompt_number": 18
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## 7. Genetic Algorithm (GA)\n",
      "\n",
      "*to be continued ...*"
     ]
    }
   ],
   "metadata": {}
  }
 ]
}