3 new tutorials added

This commit is contained in:
Sebastian Raschka 2014-04-29 23:02:49 -04:00
parent 8dc6fe765c
commit 778b5613db
18 changed files with 615 additions and 0 deletions

Binary files added (contents not shown):

- Images/pytest_01.png (170 KiB)
- Images/pytest_02.png (148 KiB)
- Images/pytest_02_2.png (38 KiB)
- Images/pytest_03.png (173 KiB)
- Images/pytest_04.png (198 KiB)
- Images/pytest_05.png (48 KiB)
- Images/pytest_06.png (88 KiB)
- Images/pytest_07.png (76 KiB)
- Images/pytest_08.png (133 KiB)
- Images/pytest_09.png (113 KiB)
- Images/pytest_10.png (65 KiB)
- Images/pytest_11.png (97 KiB)
- Images/pytest_12.png (124 KiB)
- Images/pytest_13.png (45 KiB)
- (file name not shown) (94 KiB)


@ -11,3 +11,7 @@ Syntax examples for useful Python functions, methods, and modules
- [A collection of not so obvious Python stuff you should know!](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb?create=1)
- [Python's scope resolution for variable names and the LEGB rule](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1)
### Links to Markdown files
- [A thorough guide to SQLite database operations in Python](./sqlite3_howto/Readme.md)
- [Unit testing in Python - Why we want to make it a habit](./tutorials/unit_testing.md)
- [Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks](./tutorials/installing_scientific_packages.md)


@ -0,0 +1,321 @@
## Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks
_\-- written by Sebastian Raschka_ on March 13, 2014
![](../Images/python_sci_pack_ing.png)
* * *
#### Sections
• Anaconda and Miniconda
• Consider a virtual environment
• Installing pip
• Installing NumPy
• Installing SciPy
• Installing matplotlib
• Installing IPython
• Updating installed packages
* * *
## Anaconda and Miniconda
Alternatively, instead of going through all the manual steps listed in the
following sections, there is the [Anaconda Python
distribution](https://store.continuum.io/cshop/anaconda/) for scientific
computing. Although Anaconda is distributed by Continuum Analytics, it is
completely free and includes more than 125 packages for science and data
analysis.
The installation procedure is nicely summarized here:
<http://docs.continuum.io/anaconda/install.html>
If this is too much, [Miniconda](http://repo.continuum.io/miniconda/)
might be right for you. Miniconda is basically just a Python distribution with
the Conda package manager, which lets us install a list of Python packages
into a specified conda environment.
$[bash]> conda create -n myenv python=3
$[bash]> conda install -n myenv numpy scipy matplotlib ipython
Note: environments will be created in `ROOT_DIR/envs` by default. You can use
the `-p` flag instead of the `-n` flag in the conda commands above in order to
specify a custom path.
If we decided in favor of Anaconda or Miniconda, we are basically done at this
point. The following sections explain a more (semi-)manual approach to
installing the packages individually using `pip`.
## Consider a virtual environment
In order to not mess around with our system packages, we should consider
setting up a virtual environment when we want to install the additional
scientific packages.
To set up a new virtual environment, we can use the following command
$[bash]> python3 -m venv /path_to/my_virtual_env
and activate it via
$[bash]> source /path_to/my_virtual_env/bin/activate
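To confirm from within Python that the virtual environment is really the one in use, we can compare the interpreter prefixes. This is only a small sketch; the helper name `in_virtual_env` is made up for this illustration, and it assumes an environment created with `venv` on Python 3.3 or newer.

```python
import sys


def in_virtual_env():
    """Return True if the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points to the environment, while
    # sys.base_prefix still points to the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtual_env())
print("interpreter prefix:", sys.prefix)
```

If the first line prints `False` after activation, the shell is probably still picking up a different `python3`.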
## Installing pip
`pip` is a tool for installing and managing Python packages. It makes the
installation process for Python packages a lot easier, since they don't have
to be downloaded manually.
If you haven't installed the `pip` package for your version of Python yet,
I'd suggest downloading it from <https://pypi.python.org/pypi/pip>, unzipping it,
and installing it from the unzipped directory via
$[bash]> python3 setup.py install
## Installing NumPy
Installing NumPy should be straightforward now using `pip`
$[bash]> python3 -m pip install numpy
The installation will probably take a few minutes due to the source files that
have to be compiled for your machine. Once it is installed, `NumPy` should be
available in Python via
>> import numpy
If you want to see a few examples of how to operate with NumPy arrays, you can
check out my [Matrix Cheatsheet for Moving from MATLAB matrices to NumPy
arrays](http://sebastianraschka.com/Articles/2014_matlab_vs_numpy.html)
## Installing SciPy
While the `clang` compiler worked fine for compiling the C source code for
`numpy`, we now need an additional Fortran compiler in order to install
`scipy`.
#### Installing a Fortran Compiler
Unfortunately, MacOS 10.9 Mavericks doesn't come with a Fortran compiler, but
it is pretty easy to download and install one.
For example, `gfortran` for MacOS 10.9 can be downloaded from
<http://coudert.name/software/gfortran-4.8.2-Mavericks.dmg>
Just double-click on the downloaded .DMG container and follow the familiar
MacOS X installation procedure. Once it is installed, the `gfortran` compiler
should be available from the command line. We can test it by typing
$[bash]> gfortran -v
Among other information, we will see the current version, e.g.,
gcc version 4.8.2 (GCC)
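If we prefer to check for the required compilers from a script before starting a long build, a minimal sketch using only the standard library could look like this. The helper name `find_tools` is an assumption made for this example; it merely looks the tools up on the `PATH` and does not verify their versions.

```python
import shutil


def find_tools(names):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


# clang compiles the C sources, gfortran the Fortran sources
for tool, path in find_tools(["clang", "gfortran"]).items():
    print("{}: {}".format(tool, path or "not found"))
```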
#### Installing SciPy
Now, we should be good to go to install `SciPy` using `pip`.
$[bash]> python3 -m pip install scipy
After it has been installed successfully (which might also take a couple of
minutes due to the source code compilation), it should be available in Python via
>> import scipy
## Installing matplotlib
The installation process for matplotlib should go very smoothly using `pip`; I
haven't encountered any hurdles here.
$[bash]> python3 -m pip install matplotlib
After successful installation, it can be imported in Python via
>> import matplotlib
The `matplotlib` library has become my favorite data plotting tool recently;
you can check out some examples in my little matplotlib-gallery on GitHub:
<https://github.com/rasbt/matplotlib_gallery>
## Installing IPython
#### Installing pyzmq
The IPython kernel requires the `pyzmq` package to run. `pyzmq` contains
Python bindings for ØMQ, which is a lightweight and fast messaging
implementation. It can be installed via `pip`.
$[bash]> python3 -m pip install pyzmq
#### Installing pyside
When I was trying to install the `pyside` package, the installation complained
about a missing `cmake`. It can be downloaded from:
<http://www.cmake.org/files/v2.8/cmake-2.8.12.2-Darwin64-universal.dmg>
Just as we did with `gfortran` in the Installing SciPy section, double-click
on the downloaded .DMG container and follow the familiar MacOS X installation
procedure.
We can confirm that it was successfully installed by typing
$[bash]> cmake --version
into the command line where it would print something like
cmake version 2.8.12.2
#### Installing IPython
Now, we should finally be able to install IPython with all its further
dependencies (pygments, Sphinx, jinja2, docutils, markupsafe) via
$[bash]> python3 -m pip install ipython[all]
By doing this, we would install IPython to a custom location, e.g.,
`/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython`.
You can find the path to this location by importing IPython in Python and then
printing its path:
>> import IPython
>> IPython.__path__
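As an alternative that does not import the package itself, the location can also be looked up via `importlib`. This is a sketch under the assumption of Python 3.4 or newer (where `importlib.util.find_spec` is available); the helper name `package_location` is made up for this example.

```python
import importlib.util


def package_location(name):
    """Return the directories of an installed package, or None if it is missing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages list their directories here; plain modules only have `origin`.
    if spec.submodule_search_locations:
        return list(spec.submodule_search_locations)
    return [spec.origin]


print(package_location("IPython"))
```

If IPython is not installed for the current interpreter, the call prints `None` instead of raising an `ImportError`.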
Finally, we can set an `alias` in our `.bash_profile` or `.bashrc` file to
conveniently run IPython from the console. E.g.,
alias ipython3="python3 /Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/terminal/ipapp.py"
(Don't forget to `source` the `.bashrc` or `.bash_profile` file afterwards)
Now we can run
$[bash]> ipython3
from your shell terminal to launch the interactive IPython shell, and
$[bash]> ipython3 notebook
to bring up the awesome IPython notebook in our browser, respectively.
## Updating installed packages
Finally, if we want to keep our freshly installed packages up to date, we'd
run `pip` with the `--upgrade` flag, for example
$[bash]> python3 -m pip install numpy --upgrade
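Before upgrading, it can be handy to see which versions are currently installed. The following sketch assumes a much newer interpreter than the one used in this tutorial (Python 3.8 or later, where `importlib.metadata` is part of the standard library); the helper name `installed_versions` is made up for this example.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


packages = ["numpy", "scipy", "matplotlib", "ipython", "pyzmq"]
for name, version in installed_versions(packages).items():
    print("{}: {}".format(name, version or "not installed"))
```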

tutorials/unit_testing.md Normal file

@ -0,0 +1,290 @@
## Unit testing in Python - Why we want to make it a habit
#### Sections
Advantages of unit testing
Main components of a typical unit test
The different unit test frameworks in Python
Installing py.test
A py.test example walkthrough
Writing some code we want to test
Creating a "test" file
Testing edge cases and refining our code
* * *
## Advantages of unit testing
Traditionally, for every piece of code we write (be it a single function
or class method), we would feed it some arbitrary inputs to make sure that it
works the way we expected. And this might sound like a reasonable
approach given that everything works as it should and if we do not plan to
make any changes to the code until the end of days. Of course, this is rarely
the case.
Suppose we want to modify our code by refactoring it, or by tweaking it for
improved efficiency: Do we really want to manually type the previous test
cases all over again to make sure we didn't break anything? Or suppose we are
planning to pass our code along to our co-workers: What reason do they have to
trust it? How can we make their life easier by providing evidence that
everything was tested and is supposed to work properly?
Surely, no one wants to spend hours or even days of mundane work to test code
that was inherited before it can be put to use in good conscience.
There must be a cleverer way, an automated and more systematic approach…
This is where unit tests come into play. Once we have designed the interface
(_here:_ the in- and outputs of our functions and methods), we can write down
several test cases and let them be checked every time we make changes to our
code - without the tedious work of typing everything all over again, and
without the risk of forgetting anything or omitting crucial tests simply
due to laziness.
**This is especially important in scientific research, where your whole project depends on the correct analysis and assessment of any data - and there is probably no more convenient way to convince both you and the rightly skeptical reviewer that you just made a(nother) groundbreaking discovery.**
## Main components of a typical unit test
In principle, unit testing is really no more than a more systematic way to
automate the code testing process, where the term "unit" is typically defined as
an isolated test case that consists of the following components:
\- a so-called "fixture" (e.g., a function, a class or class method, or even a
data file)
\- an action on the fixture (e.g., calling a function with a particular input)
\- an expected outcome (e.g., the expected return value of a function)
\- the actual outcome (e.g., the actual return value of a function call)
\- a verification message (e.g., a report whether the actual return value
matches the expected return value or not)
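To make the components listed above a bit more concrete, here is a minimal sketch in plain Python. The function `add` and the test name `test_add` are hypothetical and only serve as an illustration; the comments map each line to one of the components.

```python
def add(a, b):
    """The fixture: a (hypothetical) function we want to test."""
    return a + b


def test_add():
    # action on the fixture: call the function with a particular input
    actual = add(2, 3)
    # expected outcome
    expected = 5
    # verification: the message is reported if actual and expected differ
    assert actual == expected, "expected {}, got {}".format(expected, actual)


test_add()  # a test framework would normally find and call this for us
```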
## The different unit test frameworks in Python
In Python, we have the luxury to be able to choose from a variety of good and
capable unit testing frameworks. Probably, the most popular and most widely
used ones are:
\- the [unittest](http://docs.python.org/3.3/library/unittest.html) module -
part of the Python Standard Library
\- [nose](https://nose.readthedocs.org/en/latest/index.html)
\- [py.test](http://pytest.org/latest/index.html)
All of them work very well, and they are all sufficient for basic unit
testing. Some people might prefer to use _nose_ over the more "basic"
_unittest_ module. And many people are moving to the more recent _py.test_
framework, since it offers some nice extensions and even more advanced and
useful features. However, it shall not be the focus of this tutorial to
discuss all the details of the different unit testing frameworks and weigh
them against each other. The screenshot below shows what the simple execution
of _py.test_ and _nose_ may look like. To provide you with a little bit more
background information: Both _nose_ and _py.test_ are crawling a subdirectory
tree while looking for Python script files that start with the naming prefix
"test". If those script files contain functions, classes, and class methods
that also start with the prefix "test", the contained code will be executed by
the unit testing frameworks.
![../Images/pytest_01.png](../Images/pytest_01.png)
* * *
Command line syntax:
`py.test <file/directory>` \- default unit testing with detailed report
`py.test -q <file/directory>` \- default unit testing with summarized report
(quiet mode)
`nosetests` \- default unit testing with summarized report
`nosetests -v` \- default unit testing with detailed report (verbose mode)
* * *
For the further sections of this tutorial, we will be using _py.test_, but
everything is also compatible with the _nose_ framework, and for the simple
examples below it would not matter which framework we picked.
There is one little difference in the default behavior, though, and
it might also answer the question: "How does the framework know where to find
the test code to execute?"
By default, _py.test_ descends into all subdirectories (from the current
working directory or a particular folder that you provided as additional
argument) looking for Python scripts that start with the prefix "test". If
there are functions, classes, or class methods contained in these scripts that
also start with the prefix "test", those will be executed by the unit testing
framework. The basic behavior of _nose_ is quite similar, but in contrast to
browsing through all subdirectories, it will only consider those that start
with the prefix "test" to look for the respective Python unit test code. Thus,
it is a good habit to put all your test code under a directory starting with
the prefix "test" even if you use _py.test_ \- your _nose_ colleagues will
thank you!
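The two search rules described above can be sketched in a few lines of Python. Note that this is a deliberately simplified model of the behavior as it is described in this tutorial, not the actual implementation of either framework; all function names and the example paths are made up for this illustration.

```python
from pathlib import PurePosixPath


def is_test_file(path):
    """A file counts as test code if it is a .py file starting with 'test'."""
    p = PurePosixPath(path)
    return p.suffix == ".py" and p.name.startswith("test")


def collect_all_dirs(paths):
    """Rule described for py.test: look into every subdirectory."""
    return [p for p in paths if is_test_file(p)]


def collect_test_dirs_only(paths):
    """Rule described for nose: only enter directories starting with 'test'."""
    selected = []
    for p in paths:
        parent_dirs = PurePosixPath(p).parts[:-1]
        if is_test_file(p) and all(d.startswith("test") for d in parent_dirs):
            selected.append(p)
    return selected


paths = [
    "tests/test_math.py",
    "src/test_helpers.py",
    "src/helpers.py",
    "tests/data.py",
]
print(collect_all_dirs(paths))        # both test_*.py files
print(collect_test_dirs_only(paths))  # only the one under tests/
```

In this model, `src/test_helpers.py` is only picked up by the first rule, which is exactly why a dedicated directory with the prefix "test" keeps both frameworks happy.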
The figure below shows how the _nose_ and _py.test_ unit test frameworks would
descend the subdirectory tree looking for Python script files that start with
the prefix "test".
![../Images/pytest_02.png](../Images/pytest_02.png)
_Note: Interestingly,_ nose _seems to be twice as fast as_ py.test _in the
example above, and I was curious if it is due to the fact that_ py.test
_searches all subdirectories (_ nose _only searches those that start with
"test"). Although there is a tiny speed difference when I specify the test
code containing folder directly,_ nose _still seems to be faster. However, I
don't know how it scales, and it might be an interesting experiment to test
for much larger projects._
![../Images/pytest_02_2.png](../Images/pytest_02_2.png)
## Installing py.test
Installing py.test is pretty straightforward. We can install it directly from
the command line via
pip install -U pytest
or
easy_install -U pytest
If this doesn't work for you, you can visit the _py.test_ website
(<http://pytest.org/latest/>), download the package, and try to install it
"manually":
~/Desktop/pytest-2.5.0> python3 setup.py install
If it was installed correctly, we can now run _py.test_ in any directory from
the command line via
py.test <file or directory>
or
python -m pytest <file or directory>
## A py.test example walkthrough
For the following example we will be using _py.test_; however, _nose_ works
pretty similarly, and as I mentioned in the previous section, I only want to
focus on the essentials of unit testing here. Note that _py.test_ has a lot of
advanced and useful features to offer that we won't touch in this tutorial,
e.g., setting break points for debugging, etc. (if you want to learn more,
please take a look at the complete _py.test_ documentation:
<http://pytest.org/latest/contents.html#toc>).
### Writing some code we want to test
Assume we wrote two very simple functions that we want to test, either as
small scripts or part of a larger package. The first function,
"multiple_of_three", is supposed to check whether a number is a multiple of
the number 3 or not. We want the function to return the boolean value True if
this is the case, and else it should return False. The second function,
"filter_multiples_of_three", takes a list as input argument and is supposed to
return a subset of the input list containing only those numbers that are
multiples of 3.
![../Images/pytest_03.png](../Images/pytest_03.png)
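Since the code is only shown as a screenshot, here is a text version of what such a first draft might look like. This is an assumed reconstruction based on the description above, not necessarily identical to the code in the image; the parameter names are made up.

```python
def multiple_of_three(num):
    """Return True if num is a multiple of 3, and False otherwise."""
    if num % 3 == 0:
        return True
    return False


def filter_multiples_of_three(numbers):
    """Return a list with only those elements that are multiples of 3."""
    return [num for num in numbers if multiple_of_three(num)]


print(multiple_of_three(9))
print(filter_multiples_of_three([1, 3, 6, 7]))
```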
### Creating a "test" file
Next, we write a small unit test to check if our function works for some
simple input cases:
![../Images/pytest_04.png](../Images/pytest_04.png)
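In text form, such a test file might look like the following sketch. The file name `test_multiple_of_three.py` and the test function names are assumptions; in a real project we would import `multiple_of_three` from its own module, but it is repeated here so that the example runs on its own.

```python
# hypothetical file: test_multiple_of_three.py

def multiple_of_three(num):
    """First draft of the function under test (normally imported)."""
    return num % 3 == 0


def test_multiples():
    # numbers that are multiples of 3 should evaluate to True
    assert multiple_of_three(3) is True
    assert multiple_of_three(9) is True


def test_non_multiples():
    # numbers that are not multiples of 3 should evaluate to False
    assert multiple_of_three(1) is False
    assert multiple_of_three(4) is False
```

Because both the file name and the function names start with the prefix "test", the test framework would pick them up without any further configuration.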
Great! When we run the py.test unit testing framework, we see that everything
works as expected.
![../Images/pytest_05.png](../Images/pytest_05.png)
But what about edge cases?
### Testing edge cases and refining our code
In order to check if our function is yet robust enough to handle special
cases, e.g., 0 as input, we extend our unit test code. Here, assume that we
don't want 0 to evaluate to True, since we don't consider 3 to be a factor of
0.
![../Images/pytest_06.png](../Images/pytest_06.png)
![../Images/pytest_07.png](../Images/pytest_07.png)
As we can see from the _py.test report_, our test just failed. So let us go
back and fix our code to handle this special case.
![../Images/pytest_08.png](../Images/pytest_08.png)
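As a text sketch, the fix could look as follows. This is an assumed version that adds a `!= 0` check to the if-statement; the screenshot may differ in its details, and the test name `test_zero` is made up.

```python
def multiple_of_three(num):
    """Return True if num is a non-zero multiple of 3, and False otherwise."""
    if num != 0 and num % 3 == 0:
        return True
    return False


def test_zero():
    # by our convention, 0 does not count as a multiple of 3
    assert multiple_of_three(0) is False


def test_regular_cases_still_pass():
    assert multiple_of_three(9) is True
    assert multiple_of_three(4) is False
```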
So far so good. When we execute _py.test_ again (image not shown), we see that
our code handles 0 correctly now. Let us add some more edge cases: Negative
integers, decimal floating-point numbers, and large integers.
![../Images/pytest_09.png](../Images/pytest_09.png)
![../Images/pytest_10.png](../Images/pytest_10.png)
According to the unit test report, we face another problem here: Our code
considers 3 as a factor of -9 (negative 9). For the sake of this example,
let's assume that we don't want this to happen: We'd like to consider only
positive numbers to be multiples of 3. In order to account for those cases, we
need to make another small modification to our code by changing `!=0` to `>0`
in the if-statement.
![../Images/pytest_11.png](../Images/pytest_11.png)
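A text sketch of this step, with `> 0` in the if-statement and a few edge-case tests, might look like this. The concrete test values are assumptions chosen for illustration and are not taken from the screenshots.

```python
def multiple_of_three(num):
    """Return True only for positive multiples of 3."""
    if num > 0 and num % 3 == 0:
        return True
    return False


def test_edge_cases():
    assert multiple_of_three(0) is False           # zero
    assert multiple_of_three(-9) is False          # negative integer
    assert multiple_of_three(3.1) is False         # decimal floating-point number
    assert multiple_of_three(3 * 10**20) is True   # large multiple of 3
    assert multiple_of_three(10**20) is False      # large non-multiple
```

Python integers have arbitrary precision, so the large-integer cases do not need any special handling.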
After running the _py.test_ utility again, we are certain that our code can
also handle negative numbers correctly now. And once we are satisfied with the
general behavior of our current code, we can move on to testing the next
function "filter_multiples_of_three", which depends on the correctness of
"multiple_of_three".
![../Images/pytest_12.png](../Images/pytest_12.png)
![../Images/pytest_13.png](../Images/pytest_13.png)
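For completeness, here is a text sketch of what tests for the second function might look like. Both functions are repeated so that the example is self-contained; the test inputs are assumptions and not copied from the screenshots.

```python
def multiple_of_three(num):
    """Return True only for positive multiples of 3."""
    return num > 0 and num % 3 == 0


def filter_multiples_of_three(numbers):
    """Return a list with only those elements that are multiples of 3."""
    return [num for num in numbers if multiple_of_three(num)]


def test_filter_multiples_of_three():
    assert filter_multiples_of_three([1, 2, 3, 4, 5, 6]) == [3, 6]
    assert filter_multiples_of_three([1, 2, 4]) == []   # no multiples at all
    assert filter_multiples_of_three([]) == []          # empty input list
    assert filter_multiples_of_three([0, -3, 9]) == [9]  # edge cases from before
```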
This time, our code seems to be "bug"-free, and we are confident that it can
handle all the scenarios we could currently think of. If we plan to make any
further modifications to the code in the future, nothing can be more convenient
than just re-running our previous tests in order to make sure that we didn't
break anything.
If you have any questions or need more explanations, you are welcome to
provide feedback in the comment section below.