updated lda section

rasbt 2014-07-02 10:31:44 -04:00
parent e48fa3cd35
commit 79830528ae
3 changed files with 15 additions and 15 deletions

BIN  Images/lda_overview.png (new file, 82 KiB; binary file not shown)
BIN  (binary image file removed, 78 KiB before; name not shown)

@@ -1,7 +1,7 @@
 {
 "metadata": {
 "name": "",
-"signature": "sha256:6e3f719b1daa1babf21267ceb967f34d2d592ae35badaa05a16c0cfd6bc5acdb"
+"signature": "sha256:bd98b97f9830126c7a9d419b19f563a8b78d66b25dba8c380e54754b84335a50"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@@ -118,7 +118,7 @@
 "\n",
 "- [Linear Transformation: Principal Component Analysis (PCA)](#PCA)\n",
 "\n",
-"- [Linear Transformation: Multiple Discrciminant Analysis (MDA)](#MDA)\n",
+"- [Linear Transformation: Linear Discriminant Analysis (LDA)](#LDA)\n",
 "\n",
 "- [Simple Supervised Classification](#Simple-Supervised-Classification)\n",
 "\n",
@@ -1026,7 +1026,7 @@
 "level": 2,
 "metadata": {},
 "source": [
-"Linear Transformation: Multiple Discriminant Analysis (MDA)"
+"Linear Transformation: Linear Discriminant Analysis (LDA)"
 ]
 },
 {
@@ -1040,11 +1040,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The main purposes of a Multiple Discriminant Analysis is to analyze the data to identify patterns to project it onto a subspace that yields a better separation of the classes. Also, the dimensionality of the dataset shall be reduced with minimal loss of information.\n",
+"The main purpose of a Linear Discriminant Analysis (LDA) is to identify patterns in the data and to project it onto a subspace that yields a better separation of the classes. At the same time, the dimensionality of the dataset should be reduced with minimal loss of information.\n",
 "\n",
 "**The approach is very similar to a Principal Component Analysis (PCA), but in addition to finding the component axes that maximize the variance of our data, we are additionally interested in the axes that maximize the separation of our classes (e.g., in a supervised pattern classification problem)**\n",
 "\n",
-"Here, our desired outcome of the multiple discriminant analysis is to project a feature space (our dataset consisting of n d-dimensional samples) onto a smaller subspace that represents our data \"well\" and has a good class separation. A possible application would be a pattern classification task, where we want to reduce the computational costs and the error of parameter estimation by reducing the number of dimensions of our feature space by extracting a subspace that describes our data \"best\"."
+"Here, our desired outcome of the linear discriminant analysis is to project a feature space (our dataset consisting of n d-dimensional samples) onto a smaller subspace that represents our data \"well\" and provides good class separation. A possible application would be a pattern classification task, where we want to reduce the computational costs and the error of parameter estimation by reducing the number of dimensions of our feature space, extracting a subspace that describes our data \"best\"."
 ]
 },
 {
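
The projection this cell describes can be written out in a few lines of NumPy. The following is a minimal from-scratch sketch, assuming a randomly generated three-class toy dataset; all data and variable names here are illustrative and not taken from the notebook:

```python
# Minimal LDA-projection sketch: scatter matrices, eigenvectors, projection.
# Toy data only; assumes 3 classes in a 4-dimensional feature space.
import numpy as np

np.random.seed(1)
means = [np.zeros(4), np.array([3., 3., 0., 0.]), np.array([0., 3., 3., 0.])]
X = np.vstack([np.random.randn(50, 4) + m for m in means])  # 150 x 4
y = np.repeat([0, 1, 2], 50)

overall_mean = X.mean(axis=0)
S_W = np.zeros((4, 4))  # within-class scatter matrix
S_B = np.zeros((4, 4))  # between-class scatter matrix
for c in np.unique(y):
    Xc = X[y == c]
    mean_c = Xc.mean(axis=0)
    S_W += (Xc - mean_c).T.dot(Xc - mean_c)
    d = (mean_c - overall_mean).reshape(-1, 1)
    S_B += len(Xc) * d.dot(d.T)

# Eigendecomposition of S_W^-1 S_B; keep the two leading eigenvectors
# (at most n_classes - 1 = 2 discriminants are meaningful here).
eig_vals, eig_vecs = np.linalg.eig(np.linalg.inv(S_W).dot(S_B))
top = np.argsort(eig_vals.real)[::-1][:2]
W = eig_vecs[:, top].real          # 4 x 2 projection matrix

X_lda = X.dot(W)                   # samples projected onto 2 discriminants
print(X_lda.shape)                 # (150, 2)
```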
@@ -1052,40 +1052,40 @@
 "level": 4,
 "metadata": {},
 "source": [
-"Principal Component Analysis (PCA) Vs. Multiple Discriminant Analysis (MDA)"
+"Principal Component Analysis (PCA) vs. Linear Discriminant Analysis (LDA)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Both Multiple Discriminant Analysis (MDA) and Principal Component Analysis (PCA) are linear transformation methods and closely related to each other. In PCA, we are interested to find the directions (components) that maximize the variance in our dataset, where in MDA, we are additionally interested to find the directions that maximize the separation (or discrimination) between different classes (for example, in pattern classification problems where our dataset consists of multiple classes. In contrast two PCA, which ignores the class labels).\n",
+"Both Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are linear transformation methods that are closely related to each other. In PCA, we are interested in finding the directions (components) that maximize the variance in our dataset, whereas in LDA, we are additionally interested in finding the directions that maximize the separation (or discrimination) between different classes (for example, in pattern classification problems where our dataset consists of multiple classes), in contrast to PCA, which ignores the class labels.\n",
 "\n",
-"**In other words, via PCA, we are projecting the entire set of data (without class labels) onto a different subspace, and in MDA, we are trying to determine a suitable subspace to distinguish between patterns that belong to different classes. Or, roughly speaking in PCA we are trying to find the axes with maximum variances where the data is most spread (within a class, since PCA treats the whole data set as one class), and in MDA we are additionally maximizing the spread between classes.**\n",
+"**In other words, via PCA, we are projecting the entire set of data (without class labels) onto a different subspace, and in LDA, we are trying to determine a suitable subspace to distinguish between patterns that belong to different classes. Or, roughly speaking: in PCA we are trying to find the axes with maximum variance, along which the data is most spread out (within a class, since PCA treats the whole dataset as one class), and in LDA we are additionally maximizing the spread between classes.**\n",
 "\n",
-"In typical pattern recognition problems, a PCA is often followed by an MDA."
+"In typical pattern recognition problems, a PCA is often followed by an LDA."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"![](../Images/mda_overview.png)"
+"![](../Images/lda_overview.png)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"If you are interested, you can find more information about the MDA in my IPython notebook \n",
-"[Stepping through a Multiple Discriminant Analysis - using Python's NumPy and matplotlib](http://nbviewer.ipython.org/github/rasbt/pattern_classification/blob/master/dimensionality_reduction/projection/multiple_discriminant_analysis.ipynb?create=1)."
+"If you are interested, you can find more information about the LDA in my IPython notebook \n",
+"[Stepping through a Linear Discriminant Analysis - using Python's NumPy and matplotlib](http://nbviewer.ipython.org/github/rasbt/pattern_classification/blob/master/dimensionality_reduction/projection/linear_discriminant_analysis.ipynb?create=1)."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Like we did in the PCA section above, we will use a `scikit-learn` funcion, [`sklearn.lda.LDA`](http://scikit-learn.org/stable/modules/generated/sklearn.lda.LDA.html) in order to transform our training data onto 2 dimensional subspace, where MDA is basically the more generalized form of an LDA (Linear Discriminant Analysis):"
+"Like we did in the PCA section above, we will use the `scikit-learn` function [`sklearn.lda.LDA`](http://scikit-learn.org/stable/modules/generated/sklearn.lda.LDA.html) to transform our training data onto a 2-dimensional subspace:"
 ]
 },
 {
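
For reference, the `sklearn.lda.LDA` call referenced above might look like the following sketch. The Iris data is a stand-in for the notebook's training set (an assumption), and the try/except covers the fact that later scikit-learn releases moved the estimator to `sklearn.discriminant_analysis`:

```python
# Hedged sketch of the sklearn.lda.LDA usage referenced above.
from sklearn.datasets import load_iris

try:
    from sklearn.lda import LDA  # import path in scikit-learn of that era
except ImportError:
    # moved here in later scikit-learn releases
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

iris = load_iris()               # stand-in dataset (an assumption)
X, y = iris.data, iris.target

lda = LDA(n_components=2)        # request a 2-dimensional subspace
X_lda = lda.fit_transform(X, y)  # supervised: class labels are required
print(X_lda.shape)               # (150, 2)
```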
@@ -1137,14 +1137,14 @@
 "level": 4,
 "metadata": {},
 "source": [
-"MDA for feature extraction"
+"LDA for feature extraction"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"If we want to use MDA for projecting our data onto a smaller subspace (i.e., for dimensionality reduction), we can directly set the number of components to keep via `LDA(n_components=...)`; this is analogous to the [PCA function](#PCA-for-feature-extraction), which we have seen above.\n"
+"If we want to use LDA for projecting our data onto a smaller subspace (i.e., for dimensionality reduction), we can directly set the number of components to keep via `LDA(n_components=...)`; this is analogous to the [PCA function](#PCA-for-feature-extraction), which we have seen above.\n"
 ]
 },
 {
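
To make the PCA analogy concrete, a hedged side-by-side sketch follows; note that `PCA.fit_transform` takes only the data, while `LDA.fit_transform` also needs the class labels (the Iris data is again an illustrative assumption):

```python
# PCA vs. LDA with n_components: unsupervised vs. supervised projection.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

try:
    from sklearn.lda import LDA  # older import path, as in the notebook
except ImportError:
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

iris = load_iris()
X, y = iris.data, iris.target

X_pca = PCA(n_components=2).fit_transform(X)     # ignores class labels
X_lda = LDA(n_components=2).fit_transform(X, y)  # uses class labels
print(X_pca.shape, X_lda.shape)                  # (150, 2) (150, 2)
```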