{
"metadata": {
"name": "",
"signature": "sha256:cf7223086a74b13d1ae2228a4c8545c401765a90cdb3eca418f18138a4afdaab"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to the GitHub repository](https://github.com/rasbt/python_reference)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%load_ext watermark\n",
"%watermark -a 'Sebastian Raschka' -v -d -p pandas"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Sebastian Raschka 24/01/2015 \n",
"\n",
"CPython 3.4.2\n",
"IPython 2.3.1\n",
"\n",
"pandas 0.15.2\n"
]
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font size=\"1.5em\">[More information](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/ipython_magic/watermark.ipynb) about the `watermark` magic command extension.</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Things in Pandas I Wish I'd Had Known Earlier"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is just a small but growing collection of pandas snippets that I find occasionally and particularly useful -- consider it as my personal notebook. Suggestions, tips, and contributions are very, very welcome!"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Sections"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [Loading Some Example Data](#Loading-Some-Example-Data)\n",
"- [Renaming Columns](#Renaming-Columns)\n",
" - [Converting Column Names to Lowercase](#Converting-Column-Names-to-Lowercase)\n",
" - [Renaming Particular Columns](#Renaming-Particular-Columns)\n",
"- [Applying Computations Rows-wise](#Applying-Computations-Rows-wise)\n",
" - [Changing Values in a Column](#Changing-Values-in-a-Column)\n",
" - [Adding a New Column](#Adding-a-New-Column)\n",
"- [Missing Values aka NaNs](#Missing-Values-aka-NaNs)\n",
" - [Selecting NaN Rows](#Selecting-NaN-Rows)\n",
" - [Selecting non-NaN Rows](#Selecting-non-NaN-Rows)\n",
" - [Filling NaN Rows](#Filling-NaN-Rows)\n",
"- [Appending Rows to a DataFrame](#Appending-Rows-to-a-DataFrame)\n",
"- [Sorting and Reindexing DataFrames](#Sorting-and-Reindexing-DataFrames)\n",
"- [Updating Columns](#Updating-Columns)\n",
"- [Chaining Conditions - Using Bitwise Operators](#Chaining-Conditions---Using-Bitwise-Operators)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Loading Some Example Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I am heavily into sports prediction (via a machine learning approach) these days. So, let us use a (very) small subset of the soccer data that I am just working with."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"df = pd.read_csv('https://raw.githubusercontent.com/rasbt/python_reference/master/Data/some_soccer_data.csv')\n",
"df"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PLAYER</th>\n",
" <th>SALARY</th>\n",
" <th>GP</th>\n",
" <th>G</th>\n",
" <th>A</th>\n",
" <th>SOT</th>\n",
" <th>PPG</th>\n",
" <th>P</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Sergio Ag\u00fcero\\n Forward \u2014 Manchester City</td>\n",
" <td> $19.2m</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Eden Hazard\\n Midfield \u2014 Chelsea</td>\n",
" <td> $18.9m</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez\\n Forward \u2014 Arsenal</td>\n",
" <td> $17.6m</td>\n",
" <td>NaN</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Yaya Tour\u00e9\\n Midfield \u2014 Manchester City</td>\n",
" <td> $16.6m</td>\n",
" <td> 18</td>\n",
" <td> 7</td>\n",
" <td> 1</td>\n",
" <td> 19</td>\n",
" <td> 10.99</td>\n",
" <td> 197.91</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> \u00c1ngel Di Mar\u00eda\\n Midfield \u2014 Manchester United</td>\n",
" <td> $15.0m</td>\n",
" <td> 13</td>\n",
" <td> 3</td>\n",
" <td>NaN</td>\n",
" <td> 13</td>\n",
" <td> 10.17</td>\n",
" <td> 132.23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Santiago Cazorla\\n Midfield \u2014 Arsenal</td>\n",
" <td> $14.8m</td>\n",
" <td> 20</td>\n",
" <td> 4</td>\n",
" <td>NaN</td>\n",
" <td> 20</td>\n",
" <td> 9.97</td>\n",
" <td> NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> David Silva\\n Midfield \u2014 Manchester City</td>\n",
" <td> $14.3m</td>\n",
" <td> 15</td>\n",
" <td> 6</td>\n",
" <td> 2</td>\n",
" <td> 11</td>\n",
" <td> 10.35</td>\n",
" <td> 155.26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea</td>\n",
" <td> $14.0m</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino\\n Forward \u2014 West Brom</td>\n",
" <td> $13.8m</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard\\n Midfield \u2014 Liverpool</td>\n",
" <td> $13.8m</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 2,
"text": [
" PLAYER SALARY GP G A SOT \\\n",
"0 Sergio Ag\u00fcero\\n Forward \u2014 Manchester City $19.2m 16 14 3 34 \n",
"1 Eden Hazard\\n Midfield \u2014 Chelsea $18.9m 21 8 4 17 \n",
"2 Alexis S\u00e1nchez\\n Forward \u2014 Arsenal $17.6m NaN 12 7 29 \n",
"3 Yaya Tour\u00e9\\n Midfield \u2014 Manchester City $16.6m 18 7 1 19 \n",
"4 \u00c1ngel Di Mar\u00eda\\n Midfield \u2014 Manchester United $15.0m 13 3 NaN 13 \n",
"5 Santiago Cazorla\\n Midfield \u2014 Arsenal $14.8m 20 4 NaN 20 \n",
"6 David Silva\\n Midfield \u2014 Manchester City $14.3m 15 6 2 11 \n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 10 \n",
"8 Saido Berahino\\n Forward \u2014 West Brom $13.8m 21 9 0 20 \n",
"9 Steven Gerrard\\n Midfield \u2014 Liverpool $13.8m 20 5 1 11 \n",
"\n",
" PPG P \n",
"0 13.12 209.98 \n",
"1 13.05 274.04 \n",
"2 11.19 223.86 \n",
"3 10.99 197.91 \n",
"4 10.17 132.23 \n",
"5 9.97 NaN \n",
"6 10.35 155.26 \n",
"7 10.47 209.49 \n",
"8 7.02 147.43 \n",
"9 7.50 150.01 "
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Renaming Columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Converting Column Names to Lowercase"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Converting column names to lowercase\n",
"\n",
"df.columns = [c.lower() for c in df.columns]\n",
"\n",
"# or\n",
"# df.rename(columns=lambda x : x.lower())\n",
"\n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>gp</th>\n",
" <th>g</th>\n",
" <th>a</th>\n",
" <th>sot</th>\n",
" <th>ppg</th>\n",
" <th>p</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea</td>\n",
" <td> $14.0m</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino\\n Forward \u2014 West Brom</td>\n",
" <td> $13.8m</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard\\n Midfield \u2014 Liverpool</td>\n",
" <td> $13.8m</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
" player salary gp g a sot ppg \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 10 10.47 \n",
"8 Saido Berahino\\n Forward \u2014 West Brom $13.8m 21 9 0 20 7.02 \n",
"9 Steven Gerrard\\n Midfield \u2014 Liverpool $13.8m 20 5 1 11 7.50 \n",
"\n",
" p \n",
"7 209.49 \n",
"8 147.43 \n",
"9 150.01 "
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Renaming Particular Columns"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df = df.rename(columns={'p': 'points', \n",
" 'gp': 'games',\n",
" 'sot': 'shots_on_target',\n",
" 'g': 'goals',\n",
" 'ppg': 'points_per_game',\n",
" 'a': 'assists',})\n",
"\n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea</td>\n",
" <td> $14.0m</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino\\n Forward \u2014 West Brom</td>\n",
" <td> $13.8m</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard\\n Midfield \u2014 Liverpool</td>\n",
" <td> $13.8m</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
" player salary games goals assists \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 \n",
"8 Saido Berahino\\n Forward \u2014 West Brom $13.8m 21 9 0 \n",
"9 Steven Gerrard\\n Midfield \u2014 Liverpool $13.8m 20 5 1 \n",
"\n",
" shots_on_target points_per_game points \n",
"7 10 10.47 209.49 \n",
"8 20 7.02 147.43 \n",
"9 11 7.50 150.01 "
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Applying Computations Rows-wise"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Changing Values in a Column"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Processing `salary` column\n",
"\n",
"df['salary'] = df['salary'].apply(lambda x: x.strip('$m'))\n",
"df.tail()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Santiago Cazorla\\n Midfield \u2014 Arsenal</td>\n",
" <td> 14.8</td>\n",
" <td> 20</td>\n",
" <td> 4</td>\n",
" <td>NaN</td>\n",
" <td> 20</td>\n",
" <td> 9.97</td>\n",
" <td> NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> David Silva\\n Midfield \u2014 Manchester City</td>\n",
" <td> 14.3</td>\n",
" <td> 15</td>\n",
" <td> 6</td>\n",
" <td> 2</td>\n",
" <td> 11</td>\n",
" <td> 10.35</td>\n",
" <td> 155.26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino\\n Forward \u2014 West Brom</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard\\n Midfield \u2014 Liverpool</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
" player salary games goals assists \\\n",
"5 Santiago Cazorla\\n Midfield \u2014 Arsenal 14.8 20 4 NaN \n",
"6 David Silva\\n Midfield \u2014 Manchester City 14.3 15 6 2 \n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea 14.0 20 2 14 \n",
"8 Saido Berahino\\n Forward \u2014 West Brom 13.8 21 9 0 \n",
"9 Steven Gerrard\\n Midfield \u2014 Liverpool 13.8 20 5 1 \n",
"\n",
" shots_on_target points_per_game points \n",
"5 20 9.97 NaN \n",
"6 11 10.35 155.26 \n",
"7 10 10.47 209.49 \n",
"8 20 7.02 147.43 \n",
"9 11 7.50 150.01 "
]
}
],
"prompt_number": 5
},
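{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `salary` values are still strings after stripping. As a minimal sketch (not part of the original workflow), the cell below converts strings of this format to floats on a small stand-in Series. The names `raw_salary` and `salary_float` are illustrative, and the sketch assumes that every stripped value parses as a number."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Stand-in data in the same format as the `salary` column (illustrative only)\n",
"raw_salary = pd.Series(['$19.2m', '$18.9m', '$17.6m'])\n",
"\n",
"# Strip '$' and 'm' from both ends, then cast the remaining text to float\n",
"salary_float = raw_salary.str.strip('$m').astype(float)\n",
"salary_float"
],
"language": "python",
"metadata": {},
"outputs": []
},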
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Adding a New Column"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df['team'] = pd.Series('', index=df.index)\n",
"\n",
"# or\n",
"df.insert(loc=8, column='position', value='') \n",
"\n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" <td> </td>\n",
" <td> </td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino\\n Forward \u2014 West Brom</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> </td>\n",
" <td> </td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard\\n Midfield \u2014 Liverpool</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> </td>\n",
" <td> </td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
" player salary games goals assists \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea 14.0 20 2 14 \n",
"8 Saido Berahino\\n Forward \u2014 West Brom 13.8 21 9 0 \n",
"9 Steven Gerrard\\n Midfield \u2014 Liverpool 13.8 20 5 1 \n",
"\n",
" shots_on_target points_per_game points position team \n",
"7 10 10.47 209.49 \n",
"8 20 7.02 147.43 \n",
"9 11 7.50 150.01 "
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Processing `player` column\n",
"\n",
"def process_player_col(text):\n",
" name, rest = text.split('\\n')\n",
" position, team = [x.strip() for x in rest.split(' \u2014 ')]\n",
" return pd.Series([name, team, position])\n",
"\n",
"df[['player', 'team', 'position']] = df.player.apply(process_player_col)\n",
"\n",
"# modified after tip from reddit.com/user/hharison\n",
"#\n",
"# Alternative (inferior) approach:\n",
"#\n",
"#for idx,row in df.iterrows():\n",
"# name, position, team = process_player_col(row['player'])\n",
"# df.ix[idx, 'player'], df.ix[idx, 'position'], df.ix[idx, 'team'] = name, position, team\n",
" \n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> Midfield</td>\n",
" <td> Liverpool</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 8,
"text": [
" player salary games goals assists shots_on_target \\\n",
"7 Cesc F\u00e0bregas 14.0 20 2 14 10 \n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"9 Steven Gerrard 13.8 20 5 1 11 \n",
"\n",
" points_per_game points position team \n",
"7 10.47 209.49 Midfield Chelsea \n",
"8 7.02 147.43 Forward West Brom \n",
"9 7.50 150.01 Midfield Liverpool "
]
}
],
"prompt_number": 8
},
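{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible alternative, sketched on a one-row stand-in frame: the vectorized string methods can split the column without a Python-level function. This assumes a reasonably recent pandas in which `Series.str.split` accepts `expand=True`; `demo`, `name_rest`, and `pos_team` are hypothetical names used only for this illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Stand-in row in the same format as the raw `player` column (illustrative only)\n",
"demo = pd.DataFrame({'player': ['Eden Hazard\\n Midfield \u2014 Chelsea']})\n",
"\n",
"# First split at the newline: column 0 holds the name, column 1 the rest\n",
"name_rest = demo['player'].str.split('\\n', expand=True)\n",
"# Second split at the dash: column 0 holds the position, column 1 the team\n",
"pos_team = name_rest[1].str.split(' \u2014 ', expand=True)\n",
"\n",
"demo['player'] = name_rest[0].str.strip()\n",
"demo['position'] = pos_team[0].str.strip()\n",
"demo['team'] = pos_team[1].str.strip()\n",
"demo"
],
"language": "python",
"metadata": {},
"outputs": []
},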
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Missing Values aka NaNs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Selecting NaN Rows"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Selecting all rows that have NaNs in the `assists` column\n",
"\n",
"df[df['assists'].isnull()]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> \u00c1ngel Di Mar\u00eda</td>\n",
" <td> 15.0</td>\n",
" <td> 13</td>\n",
" <td> 3</td>\n",
" <td>NaN</td>\n",
" <td> 13</td>\n",
" <td> 10.17</td>\n",
" <td> 132.23</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester United</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Santiago Cazorla</td>\n",
" <td> 14.8</td>\n",
" <td> 20</td>\n",
" <td> 4</td>\n",
" <td>NaN</td>\n",
" <td> 20</td>\n",
" <td> 9.97</td>\n",
" <td> NaN</td>\n",
" <td> Midfield</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
" player salary games goals assists shots_on_target \\\n",
"4 \u00c1ngel Di Mar\u00eda 15.0 13 3 NaN 13 \n",
"5 Santiago Cazorla 14.8 20 4 NaN 20 \n",
"\n",
" points_per_game points position team \n",
"4 10.17 132.23 Midfield Manchester United \n",
"5 9.97 NaN Midfield Arsenal "
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Selecting non-NaN Rows"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df[df['assists'].notnull()]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 19.2</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Eden Hazard</td>\n",
" <td> 18.9</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 17.6</td>\n",
" <td>NaN</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Yaya Tour\u00e9</td>\n",
" <td> 16.6</td>\n",
" <td> 18</td>\n",
" <td> 7</td>\n",
" <td> 1</td>\n",
" <td> 19</td>\n",
" <td> 10.99</td>\n",
" <td> 197.91</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> David Silva</td>\n",
" <td> 14.3</td>\n",
" <td> 15</td>\n",
" <td> 6</td>\n",
" <td> 2</td>\n",
" <td> 11</td>\n",
" <td> 10.35</td>\n",
" <td> 155.26</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> Midfield</td>\n",
" <td> Liverpool</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 10,
"text": [
" player salary games goals assists shots_on_target \\\n",
"0 Sergio Ag\u00fcero 19.2 16 14 3 34 \n",
"1 Eden Hazard 18.9 21 8 4 17 \n",
"2 Alexis S\u00e1nchez 17.6 NaN 12 7 29 \n",
"3 Yaya Tour\u00e9 16.6 18 7 1 19 \n",
"6 David Silva 14.3 15 6 2 11 \n",
"7 Cesc F\u00e0bregas 14.0 20 2 14 10 \n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"9 Steven Gerrard 13.8 20 5 1 11 \n",
"\n",
" points_per_game points position team \n",
"0 13.12 209.98 Forward Manchester City \n",
"1 13.05 274.04 Midfield Chelsea \n",
"2 11.19 223.86 Forward Arsenal \n",
"3 10.99 197.91 Midfield Manchester City \n",
"6 10.35 155.26 Midfield Manchester City \n",
"7 10.47 209.49 Midfield Chelsea \n",
"8 7.02 147.43 Forward West Brom \n",
"9 7.50 150.01 Midfield Liverpool "
]
}
],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Filling NaN Rows"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Filling NaN cells with default value 0\n",
"\n",
"df = df.fillna(value=0)\n",
"df"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 19.2</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Eden Hazard</td>\n",
" <td> 18.9</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 17.6</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Yaya Tour\u00e9</td>\n",
" <td> 16.6</td>\n",
" <td> 18</td>\n",
" <td> 7</td>\n",
" <td> 1</td>\n",
" <td> 19</td>\n",
" <td> 10.99</td>\n",
" <td> 197.91</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> \u00c1ngel Di Mar\u00eda</td>\n",
" <td> 15.0</td>\n",
" <td> 13</td>\n",
" <td> 3</td>\n",
" <td> 0</td>\n",
" <td> 13</td>\n",
" <td> 10.17</td>\n",
" <td> 132.23</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester United</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Santiago Cazorla</td>\n",
" <td> 14.8</td>\n",
" <td> 20</td>\n",
" <td> 4</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 9.97</td>\n",
" <td> 0.00</td>\n",
" <td> Midfield</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> David Silva</td>\n",
" <td> 14.3</td>\n",
" <td> 15</td>\n",
" <td> 6</td>\n",
" <td> 2</td>\n",
" <td> 11</td>\n",
" <td> 10.35</td>\n",
" <td> 155.26</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Cesc F\u00e0bregas</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Steven Gerrard</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> Midfield</td>\n",
" <td> Liverpool</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 11,
"text": [
" player salary games goals assists shots_on_target \\\n",
"0 Sergio Ag\u00fcero 19.2 16 14 3 34 \n",
"1 Eden Hazard 18.9 21 8 4 17 \n",
"2 Alexis S\u00e1nchez 17.6 0 12 7 29 \n",
"3 Yaya Tour\u00e9 16.6 18 7 1 19 \n",
"4 \u00c1ngel Di Mar\u00eda 15.0 13 3 0 13 \n",
"5 Santiago Cazorla 14.8 20 4 0 20 \n",
"6 David Silva 14.3 15 6 2 11 \n",
"7 Cesc F\u00e0bregas 14.0 20 2 14 10 \n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"9 Steven Gerrard 13.8 20 5 1 11 \n",
"\n",
" points_per_game points position team \n",
"0 13.12 209.98 Forward Manchester City \n",
"1 13.05 274.04 Midfield Chelsea \n",
"2 11.19 223.86 Forward Arsenal \n",
"3 10.99 197.91 Midfield Manchester City \n",
"4 10.17 132.23 Midfield Manchester United \n",
"5 9.97 0.00 Midfield Arsenal \n",
"6 10.35 155.26 Midfield Manchester City \n",
"7 10.47 209.49 Midfield Chelsea \n",
"8 7.02 147.43 Forward West Brom \n",
"9 7.50 150.01 Midfield Liverpool "
]
}
],
"prompt_number": 11
},
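{
"cell_type": "markdown",
"metadata": {},
"source": [
"Filling every column with the same value is not always appropriate. As a hedged sketch on stand-in data, `fillna` also accepts a dict that maps column names to fill values; the frame `demo` and the chosen fill values are illustrative assumptions, not a recommendation for this dataset."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Stand-in data with one missing value per column (illustrative only)\n",
"demo = pd.DataFrame({'assists': [3.0, np.nan, 7.0],\n",
"                     'points': [209.98, 132.23, np.nan]})\n",
"\n",
"# Fill each column with its own default: 0 for assists, the column mean for points\n",
"filled = demo.fillna(value={'assists': 0, 'points': demo['points'].mean()})\n",
"filled"
],
"language": "python",
"metadata": {},
"outputs": []
},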
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Appending Rows to a DataFrame"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Adding an \"empty\" row to the DataFrame\n",
"\n",
"import numpy as np\n",
"\n",
"df = df.append(pd.Series(\n",
" [np.nan]*len(df.columns), # Fill cells with NaNs\n",
" index=df.columns), \n",
" ignore_index=True)\n",
"\n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>8 </th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9 </th>\n",
" <td> Steven Gerrard</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> Midfield</td>\n",
" <td> Liverpool</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
" player salary games goals assists shots_on_target \\\n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"9 Steven Gerrard 13.8 20 5 1 11 \n",
"10 NaN NaN NaN NaN NaN NaN \n",
"\n",
" points_per_game points position team \n",
"8 7.02 147.43 Forward West Brom \n",
"9 7.50 150.01 Midfield Liverpool \n",
"10 NaN NaN NaN NaN "
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Filling cells with data\n",
"\n",
"df.loc[df.index[-1], 'player'] = 'New Player'\n",
"df.loc[df.index[-1], 'salary'] = 12.3\n",
"df.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>8 </th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9 </th>\n",
" <td> Steven Gerrard</td>\n",
" <td> 13.8</td>\n",
" <td> 20</td>\n",
" <td> 5</td>\n",
" <td> 1</td>\n",
" <td> 11</td>\n",
" <td> 7.50</td>\n",
" <td> 150.01</td>\n",
" <td> Midfield</td>\n",
" <td> Liverpool</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td> New Player</td>\n",
" <td> 12.3</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" <td> NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
" player salary games goals assists shots_on_target \\\n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"9 Steven Gerrard 13.8 20 5 1 11 \n",
"10 New Player 12.3 NaN NaN NaN NaN \n",
"\n",
" points_per_game points position team \n",
"8 7.02 147.43 Forward West Brom \n",
"9 7.50 150.01 Midfield Liverpool \n",
"10 NaN NaN NaN NaN "
]
}
],
"prompt_number": 13
},
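{
"cell_type": "markdown",
"metadata": {},
"source": [
"A note for newer pandas releases: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the `df.append` call in this section only runs on older versions such as the 0.15.2 used here. The cell below is a minimal sketch of one possible replacement based on `pd.concat`. The `demo` frame, the `new_row` name, and all values in them are made up for illustration; they are not part of the example data."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame (not the soccer data used in this notebook)\n",
"demo = pd.DataFrame({'player': ['A', 'B'],\n",
"                     'salary': [10.0, 12.5],\n",
"                     'games': [20, 18]})\n",
"\n",
"# One-row frame holding only the values that are already known\n",
"new_row = pd.DataFrame([{'player': 'New Player', 'salary': 12.3}])\n",
"\n",
"# Stack the new row below the existing rows and renumber the index;\n",
"# columns missing from `new_row` (here: `games`) are filled with NaN\n",
"demo = pd.concat([demo, new_row], ignore_index=True)\n",
"demo.tail(3)"
],
"language": "python",
"metadata": {},
"outputs": []
},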
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Sorting and Reindexing DataFrames"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Sorting the DataFrame by a certain column (from highest to lowest)\n",
"\n",
"df = df.sort('goals', ascending=False)\n",
"df.head()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 19.2</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 17.6</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Eden Hazard</td>\n",
" <td> 18.9</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Yaya Tour\u00e9</td>\n",
" <td> 16.6</td>\n",
" <td> 18</td>\n",
" <td> 7</td>\n",
" <td> 1</td>\n",
" <td> 19</td>\n",
" <td> 10.99</td>\n",
" <td> 197.91</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
" player salary games goals assists shots_on_target \\\n",
"0 Sergio Ag\u00fcero 19.2 16 14 3 34 \n",
"2 Alexis S\u00e1nchez 17.6 0 12 7 29 \n",
"8 Saido Berahino 13.8 21 9 0 20 \n",
"1 Eden Hazard 18.9 21 8 4 17 \n",
"3 Yaya Tour\u00e9 16.6 18 7 1 19 \n",
"\n",
" points_per_game points position team \n",
"0 13.12 209.98 Forward Manchester City \n",
"2 11.19 223.86 Forward Arsenal \n",
"8 7.02 147.43 Forward West Brom \n",
"1 13.05 274.04 Midfield Chelsea \n",
"3 10.99 197.91 Midfield Manchester City "
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Optional reindexing of the DataFrame after sorting\n",
"\n",
"df.index = range(1,len(df.index)+1)\n",
"df.head()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 19.2</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 17.6</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> Eden Hazard</td>\n",
" <td> 18.9</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Yaya Tour\u00e9</td>\n",
" <td> 16.6</td>\n",
" <td> 18</td>\n",
" <td> 7</td>\n",
" <td> 1</td>\n",
" <td> 19</td>\n",
" <td> 10.99</td>\n",
" <td> 197.91</td>\n",
" <td> Midfield</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 15,
"text": [
" player salary games goals assists shots_on_target \\\n",
"1 Sergio Ag\u00fcero 19.2 16 14 3 34 \n",
"2 Alexis S\u00e1nchez 17.6 0 12 7 29 \n",
"3 Saido Berahino 13.8 21 9 0 20 \n",
"4 Eden Hazard 18.9 21 8 4 17 \n",
"5 Yaya Tour\u00e9 16.6 18 7 1 19 \n",
"\n",
" points_per_game points position team \n",
"1 13.12 209.98 Forward Manchester City \n",
"2 11.19 223.86 Forward Arsenal \n",
"3 7.02 147.43 Forward West Brom \n",
"4 13.05 274.04 Midfield Chelsea \n",
"5 10.99 197.91 Midfield Manchester City "
]
}
],
"prompt_number": 15
},
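{
"cell_type": "markdown",
"metadata": {},
"source": [
"On recent pandas releases `DataFrame.sort` no longer exists; `sort_values` (introduced in pandas 0.17) takes its place. The cell below sketches the same sort-then-renumber step under that assumption. The `demo` frame and its values are hypothetical and only serve as an illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame (not the soccer data used in this notebook)\n",
"demo = pd.DataFrame({'player': ['A', 'B', 'C'], 'goals': [5, 14, 9]})\n",
"\n",
"# Sort from the highest to the lowest number of goals\n",
"demo = demo.sort_values('goals', ascending=False)\n",
"\n",
"# Discard the old row labels, then number the rows 1..n\n",
"demo = demo.reset_index(drop=True)\n",
"demo.index = demo.index + 1\n",
"demo"
],
"language": "python",
"metadata": {},
"outputs": []
},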
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Updating Columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Creating a dummy DataFrame with changes in the `salary` column\n",
"\n",
"df_2 = df.copy()\n",
"# `.loc` slices by label and includes both endpoints: rows 1 and 2\n",
"df_2.loc[1:2, 'salary'] = [20.0, 15.0]\n",
"df_2.head(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 20</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 15</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
" player salary games goals assists shots_on_target \\\n",
"1 Sergio Ag\u00fcero 20 16 14 3 34 \n",
"2 Alexis S\u00e1nchez 15 0 12 7 29 \n",
"3 Saido Berahino 13.8 21 9 0 20 \n",
"\n",
" points_per_game points position team \n",
"1 13.12 209.98 Forward Manchester City \n",
"2 11.19 223.86 Forward Arsenal \n",
"3 7.02 147.43 Forward West Brom "
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Temporarily use the `player` column as the index so that\n",
"# `update` can align the rows of both DataFrames by player name\n",
"\n",
"df.set_index('player', inplace=True)\n",
"df_2.set_index('player', inplace=True)\n",
"df.head(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" <tr>\n",
" <th>player</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Sergio Ag\u00fcero</th>\n",
" <td> 19.2</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Alexis S\u00e1nchez</th>\n",
" <td> 17.6</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Saido Berahino</th>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
" salary games goals assists shots_on_target \\\n",
"player \n",
"Sergio Ag\u00fcero 19.2 16 14 3 34 \n",
"Alexis S\u00e1nchez 17.6 0 12 7 29 \n",
"Saido Berahino 13.8 21 9 0 20 \n",
"\n",
" points_per_game points position team \n",
"player \n",
"Sergio Ag\u00fcero 13.12 209.98 Forward Manchester City \n",
"Alexis S\u00e1nchez 11.19 223.86 Forward Arsenal \n",
"Saido Berahino 7.02 147.43 Forward West Brom "
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Update the `salary` column\n",
"df.update(other=df_2['salary'], overwrite=True)\n",
"df.head(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" <tr>\n",
" <th>player</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Sergio Ag\u00fcero</th>\n",
" <td> 20</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Alexis S\u00e1nchez</th>\n",
" <td> 15</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Saido Berahino</th>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
" salary games goals assists shots_on_target \\\n",
"player \n",
"Sergio Ag\u00fcero 20 16 14 3 34 \n",
"Alexis S\u00e1nchez 15 0 12 7 29 \n",
"Saido Berahino 13.8 21 9 0 20 \n",
"\n",
" points_per_game points position team \n",
"player \n",
"Sergio Ag\u00fcero 13.12 209.98 Forward Manchester City \n",
"Alexis S\u00e1nchez 11.19 223.86 Forward Arsenal \n",
"Saido Berahino 7.02 147.43 Forward West Brom "
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Reset the index (moves `player` back into a regular column)\n",
"df.reset_index(inplace=True)\n",
"df.head(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Sergio Ag\u00fcero</td>\n",
" <td> 20</td>\n",
" <td> 16</td>\n",
" <td> 14</td>\n",
" <td> 3</td>\n",
" <td> 34</td>\n",
" <td> 13.12</td>\n",
" <td> 209.98</td>\n",
" <td> Forward</td>\n",
" <td> Manchester City</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 15</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Saido Berahino</td>\n",
" <td> 13.8</td>\n",
" <td> 21</td>\n",
" <td> 9</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 7.02</td>\n",
" <td> 147.43</td>\n",
" <td> Forward</td>\n",
" <td> West Brom</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 19,
"text": [
" player salary games goals assists shots_on_target \\\n",
"0 Sergio Ag\u00fcero 20 16 14 3 34 \n",
"1 Alexis S\u00e1nchez 15 0 12 7 29 \n",
"2 Saido Berahino 13.8 21 9 0 20 \n",
"\n",
" points_per_game points position team \n",
"0 13.12 209.98 Forward Manchester City \n",
"1 11.19 223.86 Forward Arsenal \n",
"2 7.02 147.43 Forward West Brom "
]
}
],
"prompt_number": 19
},
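{
"cell_type": "markdown",
"metadata": {},
"source": [
"`update` matches rows by index label rather than by position, which is why `player` is temporarily used as the index in the cells above. The sketch below illustrates this on made-up data (the names `left` and `new_salary` and all values are hypothetical): the replacement values are listed in a different order and one label is missing, and the row without a counterpart should keep its old value."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data (not the soccer data used in this notebook)\n",
"left = pd.DataFrame({'salary': [19.2, 17.6, 13.8]}, index=['A', 'B', 'C'])\n",
"\n",
"# Listed in a different order and without an entry for 'B';\n",
"# the Series name tells `update` which column to align with\n",
"new_salary = pd.Series([15.0, 20.0], index=['C', 'A'], name='salary')\n",
"\n",
"# Modifies `left` in place and returns None\n",
"left.update(new_salary)\n",
"left"
],
"language": "python",
"metadata": {},
"outputs": []
},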
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Chaining Conditions - Using Bitwise Operators"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Selecting only those players who play for either Arsenal or Chelsea\n",
"\n",
"df[ (df['team'] == 'Arsenal') | (df['team'] == 'Chelsea') ]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 15</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Eden Hazard</td>\n",
" <td> 18.9</td>\n",
" <td> 21</td>\n",
" <td> 8</td>\n",
" <td> 4</td>\n",
" <td> 17</td>\n",
" <td> 13.05</td>\n",
" <td> 274.04</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Santiago Cazorla</td>\n",
" <td> 14.8</td>\n",
" <td> 20</td>\n",
" <td> 4</td>\n",
" <td> 0</td>\n",
" <td> 20</td>\n",
" <td> 9.97</td>\n",
" <td> 0.00</td>\n",
" <td> Midfield</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> Cesc F\u00e0bregas</td>\n",
" <td> 14.0</td>\n",
" <td> 20</td>\n",
" <td> 2</td>\n",
" <td> 14</td>\n",
" <td> 10</td>\n",
" <td> 10.47</td>\n",
" <td> 209.49</td>\n",
" <td> Midfield</td>\n",
" <td> Chelsea</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
" player salary games goals assists shots_on_target \\\n",
"1 Alexis S\u00e1nchez 15 0 12 7 29 \n",
"3 Eden Hazard 18.9 21 8 4 17 \n",
"7 Santiago Cazorla 14.8 20 4 0 20 \n",
"9 Cesc F\u00e0bregas 14.0 20 2 14 10 \n",
"\n",
" points_per_game points position team \n",
"1 11.19 223.86 Forward Arsenal \n",
"3 13.05 274.04 Midfield Chelsea \n",
"7 9.97 0.00 Midfield Arsenal \n",
"9 10.47 209.49 Midfield Chelsea "
]
}
],
"prompt_number": 20
},
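{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parentheses around each comparison are required because `|` and `&` bind more tightly than `==`. When several values of the same column are combined with `|`, `Series.isin` may be a more compact alternative. The cell below is a small sketch on a made-up `demo` frame whose contents are hypothetical."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame (not the soccer data used in this notebook)\n",
"demo = pd.DataFrame({'player': ['A', 'B', 'C', 'D'],\n",
"                     'team': ['Arsenal', 'Chelsea', 'Liverpool', 'Arsenal']})\n",
"\n",
"# Same rows as (demo['team'] == 'Arsenal') | (demo['team'] == 'Chelsea')\n",
"demo[demo['team'].isin(['Arsenal', 'Chelsea'])]"
],
"language": "python",
"metadata": {},
"outputs": []
},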
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Selecting forwards from Arsenal only\n",
"\n",
"df[ (df['team'] == 'Arsenal') & (df['position'] == 'Forward') ]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>games</th>\n",
" <th>goals</th>\n",
" <th>assists</th>\n",
" <th>shots_on_target</th>\n",
" <th>points_per_game</th>\n",
" <th>points</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Alexis S\u00e1nchez</td>\n",
" <td> 15</td>\n",
" <td> 0</td>\n",
" <td> 12</td>\n",
" <td> 7</td>\n",
" <td> 29</td>\n",
" <td> 11.19</td>\n",
" <td> 223.86</td>\n",
" <td> Forward</td>\n",
" <td> Arsenal</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 22,
"text": [
" player salary games goals assists shots_on_target \\\n",
"1 Alexis S\u00e1nchez 15 0 12 7 29 \n",
"\n",
" points_per_game points position team \n",
"1 11.19 223.86 Forward Arsenal "
]
}
],
"prompt_number": 22
}
],
"metadata": {}
}
]
}