3 new tutorials added
BIN
Images/pytest_01.png
Normal file
After Width: | Height: | Size: 170 KiB |
BIN
Images/pytest_02.png
Normal file
After Width: | Height: | Size: 148 KiB |
BIN
Images/pytest_02_2.png
Normal file
After Width: | Height: | Size: 38 KiB |
BIN
Images/pytest_03.png
Normal file
After Width: | Height: | Size: 173 KiB |
BIN
Images/pytest_04.png
Normal file
After Width: | Height: | Size: 198 KiB |
BIN
Images/pytest_05.png
Normal file
After Width: | Height: | Size: 48 KiB |
BIN
Images/pytest_06.png
Normal file
After Width: | Height: | Size: 88 KiB |
BIN
Images/pytest_07.png
Normal file
After Width: | Height: | Size: 76 KiB |
BIN
Images/pytest_08.png
Normal file
After Width: | Height: | Size: 133 KiB |
BIN
Images/pytest_09.png
Normal file
After Width: | Height: | Size: 113 KiB |
BIN
Images/pytest_10.png
Normal file
After Width: | Height: | Size: 65 KiB |
BIN
Images/pytest_11.png
Normal file
After Width: | Height: | Size: 97 KiB |
BIN
Images/pytest_12.png
Normal file
After Width: | Height: | Size: 124 KiB |
BIN
Images/pytest_13.png
Normal file
After Width: | Height: | Size: 45 KiB |
BIN
Images/python_sci_pack_ing.png
Normal file
After Width: | Height: | Size: 94 KiB |
|
@ -11,3 +11,7 @@ Syntax examples for useful Python functions, methods, and modules
|
|||
- [A collection of not so obvious Python stuff you should know!](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb?create=1)
|
||||
- [Python's scope resolution for variable names and the LEGB rule](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1)
|
||||
|
||||
### Links to Markdown files
|
||||
- [A thorough guide to SQLite database operations in Python](./sqlite3_howto/Readme.md)
|
||||
- [Unit testing in Python - Why we want to make it a habit](./tutorials/unit_testing.md)
|
||||
- [Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks](./tutorials/installing_scientific_packages.md)
|
||||
|
|
321
tutorials/installing_scientific_packages.md
Normal file
|
@ -0,0 +1,321 @@
|
|||
|
||||
|
||||
## Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks
|
||||
|
||||
_\-- written by Sebastian Raschka_ on March 13, 2014
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
![](../Images/python_sci_pack_ing.png)
|
||||
|
||||
* * *
|
||||
|
||||
#### Sections
|
||||
|
||||
• Anaconda and Miniconda
|
||||
• Consider a virtual environment
|
||||
• Installing pip
|
||||
• Installing NumPy
|
||||
• Installing SciPy
|
||||
• Installing matplotlib
|
||||
• Installing IPython
|
||||
• Updating installed packages
|
||||
|
||||
|
||||
|
||||
* * *
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Anaconda and Miniconda
|
||||
|
||||
|
||||
|
||||
Alternatively, instead of going through all the manual steps listed in the
|
||||
following sections, there is the [Anaconda Python
|
||||
distribution](https://store.continuum.io/cshop/anaconda/) for scientific
|
||||
computing. Although Anaconda is distributed by Continuum Analytics, it is
|
||||
completely free and includes more than 125 packages for science and data
|
||||
analysis.
|
||||
The installation procedure is nicely summarized here:
|
||||
<http://docs.continuum.io/anaconda/install.html>
|
||||
|
||||
If this is too much, [Miniconda](http://repo.continuum.io/miniconda/)
might be right for you. Miniconda is basically just a Python distribution with
the Conda package manager, which lets us install a list of Python packages
into a specified conda environment.
|
||||
|
||||
|
||||
|
||||
$[bash]> conda create -n myenv python=3
|
||||
$[bash]> conda install -n myenv numpy scipy matplotlib ipython
|
||||
|
||||
|
||||
Note: environments will be created in `ROOT_DIR/envs` by default. To specify a
custom path, use the `-p` flag instead of the `-n` flag in the conda commands
above.
|
||||
|
||||
If we decided in favor of Anaconda or Miniconda, we are basically done at this
point. The following sections explain a more (semi-)manual approach to
installing the packages individually using `pip`.
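Whichever route we take, a quick way to confirm that the packages are visible to the interpreter is to ask Python's import machinery for them. The following is a minimal sketch; the helper name `missing_modules` is made up for illustration, and the list of names simply mirrors the packages discussed in this tutorial.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    # find_spec() looks a top-level module up without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages covered in this tutorial.
wanted = ["numpy", "scipy", "matplotlib", "IPython"]
missing = missing_modules(wanted)
print("missing:", missing if missing else "none")
```

Run it with the interpreter of the environment you want to check, for example after activating the conda environment.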
|
||||
|
||||
|
||||
|
||||
|
||||
## Consider a virtual environment
|
||||
|
||||
|
||||
In order to not mess around with our system packages, we should consider
|
||||
setting up a virtual environment when we want to install the additional
|
||||
scientific packages.
|
||||
To set up a new virtual environment, we can use the following command
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m venv /path_to/my_virtual_env
|
||||
|
||||
|
||||
and activate it via
|
||||
|
||||
|
||||
|
||||
$[bash]> source /path_to/my_virtual_env/bin/activate
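To double-check from within Python that the activated environment is really the one in use, we can compare two interpreter attributes. This is a small sketch, assuming an environment created with `venv`; the function name `in_virtual_env` is hypothetical.

```python
import sys


def in_virtual_env():
    """Return True if the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtual_env())
print("interpreter prefix:", sys.prefix)
```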
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing pip
|
||||
|
||||
|
||||
`pip` is a tool for installing and managing Python packages. It makes the
|
||||
installation process for Python packages a lot easier, since they don't have
|
||||
to be downloaded manually.
|
||||
If you haven't installed the `pip` package for your version of Python yet,
I'd suggest downloading it from <https://pypi.python.org/pypi/pip>, unzipping
it, and installing it from the unzipped directory via
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 setup.py install
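One possible way to check that `pip` ended up in the interpreter we intend to use is to call it as a module from that same interpreter. The sketch below assumes nothing beyond the standard library; `pip_version` is an illustrative helper name.

```python
import subprocess
import sys


def pip_version():
    """Return the output of `python -m pip --version`, or None if pip is unavailable."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None  # pip is not installed for this interpreter
    return result.stdout.strip()


print(pip_version())
```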
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing NumPy
|
||||
|
||||
|
||||
Installing NumPy should be straightforward now using `pip`
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install numpy
|
||||
|
||||
|
||||
The installation will probably take a few minutes due to the source files that
|
||||
have to be compiled for your machine. Once it is installed, `NumPy` should be
|
||||
available in Python via
|
||||
|
||||
|
||||
|
||||
>>> import numpy
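Beyond a bare import, a tiny computation gives a bit more confidence that the compiled parts of NumPy work. The following is only a sketch: the function name `numpy_smoke_test` is invented here, and the import is guarded so that the script also runs (and reports `None`) when NumPy is not installed.

```python
try:
    import numpy as np
except ImportError:  # NumPy is not installed for this interpreter
    np = None


def numpy_smoke_test():
    """Multiply a small matrix by its transpose; return None if NumPy is missing."""
    if np is None:
        return None
    a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
    return np.dot(a, a.T).tolist()


print(numpy_smoke_test())
```

With NumPy installed, this prints `[[5, 14], [14, 50]]`.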
|
||||
|
||||
|
||||
If you want to see a few examples of how to operate with NumPy arrays, you can
|
||||
check out my [Matrix Cheatsheet for Moving from MATLAB matrices to NumPy
|
||||
arrays](http://sebastianraschka.com/Articles/2014_matlab_vs_numpy.html)
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing SciPy
|
||||
|
||||
|
||||
While the `clang` compiler worked fine for compiling the C source code for
|
||||
`numpy`, we now need an additional Fortran compiler in order to install
|
||||
`scipy`.
|
||||
|
||||
|
||||
|
||||
#### Installing a Fortran Compiler
|
||||
|
||||
Unfortunately, MacOS 10.9 Mavericks doesn't come with a Fortran compiler, but
|
||||
it is pretty easy to download and install one.
|
||||
For example, `gfortran` for MacOS 10.9 can be downloaded from
|
||||
<http://coudert.name/software/gfortran-4.8.2-Mavericks.dmg>
|
||||
|
||||
Just double-click on the downloaded .DMG container and follow the familiar
|
||||
MacOS X installation procedure. Once it is installed, the `gfortran` compiler
|
||||
should be available from the command line. We can test it by typing
|
||||
|
||||
|
||||
|
||||
$[bash]> gfortran -v
|
||||
|
||||
|
||||
Among other information, we will see the current version, e.g.,
|
||||
|
||||
|
||||
|
||||
gcc version 4.8.2 (GCC)
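The same kind of check can be scripted in Python, which is handy when several build tools are needed. A minimal sketch using the standard library follows; `locate_tools` is a hypothetical helper, and the tool names are the compilers mentioned in this section.

```python
import shutil


def locate_tools(names):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


for tool, path in locate_tools(["clang", "gfortran"]).items():
    print(tool, "->", path if path else "not found")
```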
|
||||
|
||||
|
||||
|
||||
#### Installing SciPy
|
||||
|
||||
Now, we should be good to go to install `SciPy` using `pip`.
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install scipy
|
||||
|
||||
|
||||
Once it has been installed successfully (this might also take a couple of
minutes due to the source code compilation), it should be available in Python via
|
||||
|
||||
|
||||
|
||||
>>> import scipy
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing matplotlib
|
||||
|
||||
|
||||
The installation process for matplotlib should go very smoothly using `pip`; I
haven't encountered any hurdles here.
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install matplotlib
|
||||
|
||||
|
||||
After successful installation, it can be imported in Python via
|
||||
|
||||
|
||||
|
||||
>>> import matplotlib
|
||||
|
||||
|
||||
The `matplotlib` library has recently become my favorite data plotting tool.
You can check out some examples in my little matplotlib-gallery on GitHub:
|
||||
<https://github.com/rasbt/matplotlib_gallery>
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing IPython
|
||||
|
||||
|
||||
|
||||
#### Installing pyzmq
|
||||
|
||||
The IPython kernel requires the `pyzmq` package to run. `pyzmq` contains
Python bindings for ØMQ, which is a lightweight and fast messaging
implementation. It can be installed via `pip`.
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install pyzmq
|
||||
|
||||
|
||||
|
||||
|
||||
#### Installing pyside
|
||||
|
||||
When I tried to install the `pyside` package, the installation complained about
a missing `cmake`. It can be downloaded from:
|
||||
|
||||
<http://www.cmake.org/files/v2.8/cmake-2.8.12.2-Darwin64-universal.dmg>
|
||||
|
||||
Just as we did with `gfortran` in the Installing SciPy section, double-click
|
||||
on the downloaded .DMG container and follow the familiar MacOS X installation
|
||||
procedure.
|
||||
We can confirm that it was successfully installed by typing
|
||||
|
||||
|
||||
|
||||
$[bash]> cmake --version
|
||||
|
||||
|
||||
into the command line where it would print something like
|
||||
|
||||
|
||||
|
||||
cmake version 2.8.12.2
|
||||
|
||||
|
||||
|
||||
#### Installing IPython
|
||||
|
||||
Now, we should finally be able to install IPython with all its further
|
||||
dependencies (pygments, Sphinx, jinja2, docutils, markupsafe) via
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install ipython[all]
|
||||
|
||||
|
||||
By doing this, we would install IPython to a custom location, e.g.,
`/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython`.
|
||||
|
||||
You can find the path to this location by importing IPython in Python and then
printing its path:
|
||||
|
||||
|
||||
|
||||
>>> import IPython
>>> IPython.__path__
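If we prefer to look up the location without actually importing the package, the import machinery can report it. This is a sketch under the assumption that only the standard library is available; `package_dirs` is an illustrative name.

```python
import importlib.util


def package_dirs(name):
    """Return the directories of an installed package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.submodule_search_locations is None:
        return []  # not installed, or a plain module rather than a package
    return list(spec.submodule_search_locations)


print(package_dirs("IPython"))
```

An empty list means that the package is not visible to this interpreter.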
|
||||
|
||||
|
||||
Finally, we can set an `alias` in our `.bash_profile` or `.bashrc` file to
conveniently run IPython from the console. E.g.,
|
||||
|
||||
|
||||
|
||||
alias ipython3="python3 /Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/terminal/ipapp.py"
|
||||
|
||||
(Don't forget to `source` the `.bashrc` or `.bash_profile` file afterwards)
|
||||
|
||||
Now we can run
|
||||
|
||||
|
||||
|
||||
$[bash]> ipython3
|
||||
|
||||
|
||||
from your shell terminal to launch the interactive IPython shell, and
|
||||
|
||||
|
||||
|
||||
$[bash]> ipython3 notebook
|
||||
|
||||
|
||||
to bring up the awesome IPython notebook in our browser, respectively.
|
||||
|
||||
|
||||
|
||||
|
||||
## Updating installed packages
|
||||
|
||||
|
||||
Finally, if we want to keep our freshly installed packages up to date, we'd
|
||||
run `pip` with the `--upgrade` flag, for example
|
||||
|
||||
|
||||
|
||||
$[bash]> python3 -m pip install numpy --upgrade
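Before and after an upgrade, it can be useful to see which versions are actually installed. The sketch below relies on `importlib.metadata`, which is an assumption about the interpreter: it is only part of the standard library from Python 3.8 on, so it is not available on the Python 3.3 setup described above. The helper name `installed_version` is made up for illustration.

```python
from importlib import metadata  # assumption: Python 3.8 or newer


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for dist in ["numpy", "scipy", "matplotlib", "ipython"]:
    print(dist, installed_version(dist))
```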
|
||||
|
||||
|
||||
|
||||
|
||||
|
290
tutorials/unit_testing.md
Normal file
|
@ -0,0 +1,290 @@
|
|||
|
||||
## Unit testing in Python - Why we want to make it a habit
|
||||
|
||||
|
||||
|
||||
#### Sections
|
||||
|
||||
Advantages of unit testing
|
||||
Main components of a typical unit test
|
||||
The different unit test frameworks in Python
|
||||
Installing py.test
|
||||
A py.test example walkthrough
|
||||
Writing some code we want to test
|
||||
Creating a "test" file
|
||||
Testing edge cases and refining our code
|
||||
|
||||
* * *
|
||||
|
||||
|
||||
|
||||
## Advantages of unit testing
|
||||
|
||||
Traditionally, for every piece of code we write (let it be a single function
|
||||
or class method), we would feed it some arbitrary inputs to make sure that it
|
||||
works the way we have expected. And this might sound like a reasonable
|
||||
approach given that everything works as it should and if we do not plan to
|
||||
make any changes to the code until the end of days. Of course, this is rarely
|
||||
the case.
|
||||
Suppose we want to modify our code by refactoring it, or by tweaking it for
|
||||
improved efficiency: Do we really want to manually type the previous test
|
||||
cases all over again to make sure we didn't break anything? Or suppose we are
|
||||
planning to pass our code along to our co-workers: What reason do they have to
|
||||
trust it? How can we make their life easier by providing evidence that
|
||||
everything was tested and is supposed to work properly?
|
||||
Surely, no one wants to spend hours or even days of mundane work to test code
|
||||
that was inherited before it can be put to use in good conscience.
|
||||
There must be a cleverer way, an automated and more systematic approach…
|
||||
This is where unit tests come into play. Once we have designed the interface
(_here:_ the in- and outputs of our functions and methods), we can write down
several test cases and let them be checked every time we make changes to our
code - without the tedious work of typing everything all over again, and
without the risk of forgetting anything or omitting crucial tests simply
due to laziness.
|
||||
**This is especially important in scientific research, where your whole project depends on the correct analysis and assessment of any data - and there is probably no more convenient way to convince both you and the rightly skeptical reviewer that you just made a(nother) groundbreaking discovery.**
|
||||
|
||||
|
||||
|
||||
|
||||
## Main components of a typical unit test
|
||||
|
||||
In principle, unit testing is really no more than a systematic way to
automate the code testing process. The term "unit" is typically defined as
an isolated test case that consists of the following components:
|
||||
|
||||
- a so-called "fixture" (e.g., a function, a class or class method, or even a data file)
- an action on the fixture (e.g., calling a function with a particular input)
- an expected outcome (e.g., the expected return value of a function)
- the actual outcome (e.g., the actual return value of a function call)
- a verification message (e.g., a report whether the actual return value matches the expected return value or not)
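The components listed above can be sketched in a few lines of plain Python. The function `add` is a made-up fixture chosen only for illustration; the test function labels each of the remaining components in its comments.

```python
def add(a, b):
    """Fixture: the piece of code under test (a hypothetical example function)."""
    return a + b


def test_add():
    expected = 5        # expected outcome
    actual = add(2, 3)  # action on the fixture, producing the actual outcome
    # verification: the message is reported only if the comparison fails
    assert actual == expected, "add(2, 3) returned %r, expected %r" % (actual, expected)


test_add()
```

A unit testing framework takes over the last line: it finds and calls the test functions for us and collects the verification results into a report.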
|
||||
|
||||
|
||||
|
||||
|
||||
## The different unit test frameworks in Python
|
||||
|
||||
In Python, we have the luxury to be able to choose from a variety of good and
|
||||
capable unit testing frameworks. Probably, the most popular and most widely
|
||||
used ones are:
|
||||
|
||||
- the [unittest](http://docs.python.org/3.3/library/unittest.html) module, part of the Python Standard Library
- [nose](https://nose.readthedocs.org/en/latest/index.html)
- [py.test](http://pytest.org/latest/index.html)
|
||||
|
||||
All of them work very well, and they are all sufficient for basic unit
|
||||
testing. Some people might prefer to use _nose_ over the more "basic"
|
||||
_unittest_ module. And many people are moving to the more recent _py.test_
|
||||
framework, since it offers some nice extensions and even more advanced and
|
||||
useful features. However, it shall not be the focus of this tutorial to
discuss all the details of the different unit testing frameworks and weigh
them against each other. The screenshot below shows what a simple run
of _py.test_ and _nose_ may look like. To provide a little more
background information: both _nose_ and _py.test_ crawl a subdirectory
tree while looking for Python script files that start with the naming prefix
|
||||
"test". If those script files contain functions, classes, and class methods
|
||||
that also start with the prefix "test", the contained code will be executed by
|
||||
the unit testing frameworks.
|
||||
|
||||
![../Images/pytest_01.png](../Images/pytest_01.png)
|
||||
|
||||
|
||||
|
||||
* * *
|
||||
|
||||
Command line syntax:
|
||||
- `py.test <file/directory>`: default unit testing with detailed report
- `py.test -q <file/directory>`: default unit testing with summarized report (quiet mode)
- `nosetests`: default unit testing with summarized report
- `nosetests -v`: default unit testing with detailed report (verbose mode)
|
||||
|
||||
* * *
|
||||
|
||||
|
||||
|
||||
For the further sections of this tutorial, we will be using _py.test_, but
|
||||
everything is also compatible with the _nose_ framework, and for the simple
examples below it would not matter which framework we picked.
There is one little difference in the default behavior, though, and
|
||||
it might also answer the question: "How does the framework know where to find
|
||||
the test code to execute?"
|
||||
By default, _py.test_ descends into all subdirectories (from the current
|
||||
working directory or a particular folder that you provided as additional
|
||||
argument) looking for Python scripts that start with the prefix "test". If
|
||||
there are functions, classes, or class methods contained in these scripts that
|
||||
also start with the prefix "test", those will be executed by the unit testing
|
||||
framework. The basic behavior of _nose_ is quite similar, but in contrast to
|
||||
browsing through all subdirectories, it will only consider those that start
|
||||
with the prefix "test" to look for the respective Python unit test code. Thus,
|
||||
it is a good habit to put all your test code under a directory starting with
|
||||
the prefix "test" even if you use _py.test_ \- your _nose_ colleagues will
|
||||
thank you!
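The two search strategies described above can be imitated with a short directory walk. This is a deliberately simplified sketch of the idea, not the actual implementation of either framework (both apply more detailed naming rules); the function name `find_test_files` and its `only_test_dirs` switch are assumptions made for this illustration.

```python
import os


def find_test_files(root, only_test_dirs=False):
    """Collect Python files below `root` whose names start with "test".

    With only_test_dirs=True, the walk only descends into subdirectories
    that start with "test" (the nose-like behavior described above);
    otherwise it descends into every subdirectory (the py.test-like behavior).
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if only_test_dirs:
            # Pruning dirnames in place tells os.walk which folders to skip.
            dirnames[:] = [d for d in dirnames if d.startswith("test")]
        for name in filenames:
            if name.startswith("test") and name.endswith(".py"):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)
```

Calling `find_test_files(".")` and `find_test_files(".", only_test_dirs=True)` in a project folder shows which files each strategy would pick up.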
|
||||
The figure below shows how the _nose_ and _py.test_ unit test frameworks would
|
||||
descend the subdirectory tree looking for Python script files that start with
|
||||
the prefix "test".
|
||||
![../Images/pytest_02.png](../Images/pytest_02.png)
|
||||
|
||||
_Note: Interestingly,_ nose _seems to be twice as fast as_ py.test _in the
|
||||
example above, and I was curious if it is due to the fact that_ py.test
|
||||
_searches all subdirectories (_ nose _only searches those that start with
|
||||
"test"). Although there is a tiny speed difference when I specify the test
|
||||
code containing folder directly,_ nose _still seems to be faster. However, I
|
||||
don't know how it scales, and it might be an interesting experiment to test
|
||||
for much larger projects._
|
||||
|
||||
![../Images/pytest_02_2.png](../Images/pytest_02_2.png)
|
||||
|
||||
|
||||
|
||||
|
||||
## Installing py.test
|
||||
|
||||
Installing py.test is pretty straightforward. We can install it directly from
|
||||
the command line via
|
||||
|
||||
|
||||
|
||||
pip install -U pytest
|
||||
|
||||
|
||||
|
||||
or
|
||||
|
||||
|
||||
|
||||
easy_install -U pytest
|
||||
|
||||
|
||||
|
||||
If this doesn't work for you, you can visit the _py.test_ website
|
||||
(<http://pytest.org/latest/>), download the package, and try to install it
|
||||
"manually":
|
||||
|
||||
|
||||
|
||||
~/Desktop/pytest-2.5.0> python3 setup.py install
|
||||
|
||||
|
||||
|
||||
If it was installed correctly, we can now run _py.test_ in any directory from
|
||||
the command line via
|
||||
|
||||
|
||||
|
||||
py.test <file or directory>
|
||||
|
||||
|
||||
|
||||
or
|
||||
|
||||
|
||||
|
||||
python -m pytest <file or directory>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## A py.test example walkthrough
|
||||
|
||||
For the following example we will be using _py.test_; however, _nose_ works
|
||||
pretty similarly, and as I mentioned in the previous section, I only want to
|
||||
focus on the essentials of unit testing here. Note that _py.test_ has a lot of
|
||||
advanced and useful features to offer that we won't touch in this tutorial,
|
||||
e.g., setting break points for debugging, etc. (if you want to learn more,
|
||||
please take a look at the complete _py.test_ documentation:
|
||||
<http://pytest.org/latest/contents.html#toc>).
|
||||
|
||||
|
||||
|
||||
### Writing some code we want to test
|
||||
|
||||
Assume we wrote two very simple functions that we want to test, either as
|
||||
small scripts or part of a larger package. The first function,
|
||||
"multiple_of_three", is supposed to check whether a number is a multiple of
|
||||
the number 3 or not. We want the function to return the boolean value True if
|
||||
this is the case, and else it should return False. The second function,
|
||||
"filter_multiples_of_three", takes a list as input argument and is supposed to
|
||||
return a subset of the input list containing only those numbers that are
|
||||
multiples of 3.
|
||||
|
||||
![../Images/pytest_03.png](../Images/pytest_03.png)
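For readers who cannot view the screenshot, a first version of the two functions might look roughly as follows. This is a reconstruction from the description above, not a transcription of the image, so details such as parameter names are assumptions.

```python
def multiple_of_three(num):
    """Return True if num is a multiple of 3, else False (first version)."""
    if num % 3 == 0:
        return True
    else:
        return False


def filter_multiples_of_three(numbers):
    """Return the elements of `numbers` that are multiples of 3."""
    return [num for num in numbers if multiple_of_three(num)]


print(multiple_of_three(9))                        # True
print(filter_multiples_of_three([1, 3, 4, 6, 8]))  # [3, 6]
```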
|
||||
|
||||
|
||||
|
||||
### Creating a "test" file
|
||||
|
||||
Next, we write a small unit test to check if our function works for some
|
||||
simple input cases:
|
||||
|
||||
|
||||
![../Images/pytest_04.png](../Images/pytest_04.png)
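In text form, such a test file could look like the sketch below. The file name and test names are hypothetical, and the function under test is repeated inline only to keep the example self-contained; in a real project it would be imported from its own module.

```python
# test_multiple_of_three.py (hypothetical file name; the "test" prefix
# is what lets py.test and nose find the file)


def multiple_of_three(num):
    # In a real project this would be imported from the module under test.
    return num % 3 == 0


def test_multiples():
    assert multiple_of_three(3) is True
    assert multiple_of_three(9) is True


def test_non_multiples():
    assert multiple_of_three(4) is False
    assert multiple_of_three(10) is False
```

Each function whose name starts with `test` is collected and run as a separate test case.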
|
||||
|
||||
Great, when we run our py.test unit testing framework, we see that everything
|
||||
works as expected!
|
||||
|
||||
|
||||
![../Images/pytest_05.png](../Images/pytest_05.png)
|
||||
|
||||
But what about edge cases?
|
||||
|
||||
|
||||
|
||||
|
||||
### Testing edge cases and refining our code
|
||||
|
||||
In order to check if our function is already robust enough to handle special
|
||||
cases, e.g., 0 as input, we extend our unit test code. Here, assume that we
|
||||
don't want 0 to evaluate to True, since we don't consider 3 to be a factor of
|
||||
0.
|
||||
|
||||
![../Images/pytest_06.png](../Images/pytest_06.png)
|
||||
![../Images/pytest_07.png](../Images/pytest_07.png)
|
||||
|
||||
As we can see from the _py.test report_, our test just failed. So let us go
|
||||
back and fix our code to handle this special case.
|
||||
|
||||
![../Images/pytest_08.png](../Images/pytest_08.png)
|
||||
|
||||
So far so good, when we execute _py.test_ again (image not shown) we see that
|
||||
our code handles 0 correctly now. Let us add some more edge cases: Negative
|
||||
integers, decimal floating-point numbers, and large integers.
|
||||
|
||||
![../Images/pytest_09.png](../Images/pytest_09.png)
|
||||
![../Images/pytest_10.png](../Images/pytest_10.png)
|
||||
|
||||
According to the unit test report, we face another problem here: Our code
|
||||
considers 3 as a factor of -9 (negative 9). For the sake of this example,
|
||||
let's assume that we don't want this to happen: We'd like to consider only
|
||||
positive numbers to be multiples of 3. In order to account for those cases, we
|
||||
need to make another small modification to our code by changing `!=0` to `>0`
|
||||
in the if-statement.
|
||||
|
||||
![../Images/pytest_11.png](../Images/pytest_11.png)
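As a text counterpart to the screenshot, the refined function and a few edge-case tests might be sketched like this. The concrete test values are illustrative assumptions, not the ones from the image.

```python
def multiple_of_three(num):
    """Return True only for positive multiples of 3."""
    if num % 3 == 0 and num > 0:
        return True
    else:
        return False


def test_edge_cases():
    assert multiple_of_three(0) is False          # zero is excluded by design
    assert multiple_of_three(-9) is False         # negative numbers are excluded
    assert multiple_of_three(4.5) is False        # 4.5 % 3 leaves a remainder of 1.5
    assert multiple_of_three(3 * 10**12) is True  # large integer
```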
|
||||
|
||||
After running the _py.test_ utility again, we are certain that our code can
|
||||
also handle negative numbers correctly now. And once we are satisfied with the
|
||||
general behavior of our current code, we can move on to testing the next
|
||||
function "filter_multiples_of_three", which depends on the correctness of
|
||||
"multiple_of_three".
|
||||
|
||||
![../Images/pytest_12.png](../Images/pytest_12.png)
|
||||
![../Images/pytest_13.png](../Images/pytest_13.png)
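A possible text version of such a test for the second function is sketched below. Both functions are repeated in condensed form so the example runs on its own, and the inputs are assumptions chosen for illustration.

```python
def multiple_of_three(num):
    return num % 3 == 0 and num > 0


def filter_multiples_of_three(numbers):
    return [num for num in numbers if multiple_of_three(num)]


def test_filter_multiples_of_three():
    assert filter_multiples_of_three([1, 2, 3, 4, 5, 6]) == [3, 6]
    assert filter_multiples_of_three([0, -9, 7]) == []  # edge cases are dropped
    assert filter_multiples_of_three([]) == []          # empty input, empty output
```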
|
||||
|
||||
This time, our code seems to be "bug"-free, and we are confident that it can
handle all the scenarios we could currently think of. If we plan to make any
further modifications to the code in the future, nothing is more convenient
than simply re-running our previous tests to make sure that we didn't break
anything.
|
||||
|
||||
If you have any questions or need more explanations, you are welcome to
|
||||
provide feedback in the comment section below.
|
||||
|
||||
|
||||
|
||||
|