column types

rasbt 2015-01-28 22:47:03 -05:00
parent 40d5dd93a5
commit c20584ea6c

@ -1,7 +1,7 @@
{ {
"metadata": { "metadata": {
"name": "", "name": "",
"signature": "sha256:2dd2df7144acb424c4be84b5679ebc4d06bf16dc6a3e0f7fd3d1ed0595479f64" "signature": "sha256:c8ab1a3c99e7c72951c91e74991b8837884cd9e3863f1cd1833651e180ff32bd"
}, },
"nbformat": 3, "nbformat": 3,
"nbformat_minor": 0, "nbformat_minor": 0,
@ -38,7 +38,7 @@
] ]
} }
], ],
"prompt_number": 2 "prompt_number": 1
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -99,7 +99,10 @@
"- [Sorting and Reindexing DataFrames](#Sorting-and-Reindexing-DataFrames)\n", "- [Sorting and Reindexing DataFrames](#Sorting-and-Reindexing-DataFrames)\n",
"- [Updating Columns](#Updating-Columns)\n", "- [Updating Columns](#Updating-Columns)\n",
"- [Chaining Conditions - Using Bitwise Operators](#Chaining-Conditions---Using-Bitwise-Operators)\n", "- [Chaining Conditions - Using Bitwise Operators](#Chaining-Conditions---Using-Bitwise-Operators)\n",
"- [Getting an Overview of the Column Types](#Getting-an-Overview-of-the-Column-Types)" "- [Column Types](#Column-Types)\n",
" - [Printing Column Types](#Printing-Column-Types)\n",
" - [Selecting by Column Type](#Selecting-by-Column-Type)\n",
" - [Converting Column Types](#Converting-Column-Types)"
] ]
}, },
{ {
@ -278,7 +281,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 3, "prompt_number": 2,
"text": [ "text": [
" PLAYER SALARY GP G A SOT \\\n", " PLAYER SALARY GP G A SOT \\\n",
"0 Sergio Ag\u00fcero\\n Forward \u2014 Manchester City $19.2m 16 14 3 34 \n", "0 Sergio Ag\u00fcero\\n Forward \u2014 Manchester City $19.2m 16 14 3 34 \n",
@ -306,7 +309,7 @@
] ]
} }
], ],
"prompt_number": 3 "prompt_number": 2
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -420,7 +423,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 4, "prompt_number": 3,
"text": [ "text": [
" player salary gp g a sot ppg \\\n", " player salary gp g a sot ppg \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 10 10.47 \n", "7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 10 10.47 \n",
@ -434,7 +437,7 @@
] ]
} }
], ],
"prompt_number": 4 "prompt_number": 3
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -525,7 +528,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 5, "prompt_number": 4,
"text": [ "text": [
" player salary games goals assists \\\n", " player salary games goals assists \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 \n", "7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea $14.0m 20 2 14 \n",
@ -539,7 +542,7 @@
] ]
} }
], ],
"prompt_number": 5 "prompt_number": 4
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -671,7 +674,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 6, "prompt_number": 5,
"text": [ "text": [
" player salary games goals assists \\\n", " player salary games goals assists \\\n",
"5 Santiago Cazorla\\n Midfield \u2014 Arsenal 14.8 20 4 NaN \n", "5 Santiago Cazorla\\n Midfield \u2014 Arsenal 14.8 20 4 NaN \n",
@ -689,7 +692,7 @@
] ]
} }
], ],
"prompt_number": 6 "prompt_number": 5
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -786,7 +789,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 7, "prompt_number": 6,
"text": [ "text": [
" player salary games goals assists \\\n", " player salary games goals assists \\\n",
"7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea 14.0 20 2 14 \n", "7 Cesc F\u00e0bregas\\n Midfield \u2014 Chelsea 14.0 20 2 14 \n",
@ -800,7 +803,7 @@
] ]
} }
], ],
"prompt_number": 7 "prompt_number": 6
}, },
{ {
"cell_type": "code", "cell_type": "code",
@ -893,7 +896,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 8, "prompt_number": 7,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"7 Cesc F\u00e0bregas 14.0 20 2 14 10 \n", "7 Cesc F\u00e0bregas 14.0 20 2 14 10 \n",
@ -907,7 +910,7 @@
] ]
} }
], ],
"prompt_number": 8 "prompt_number": 7
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1027,7 +1030,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 9, "prompt_number": 8,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"0 sergio ag\u00fcero 19.2 16 14 3 34 \n", "0 sergio ag\u00fcero 19.2 16 14 3 34 \n",
@ -1045,7 +1048,7 @@
] ]
} }
], ],
"prompt_number": 9 "prompt_number": 8
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1105,7 +1108,7 @@
] ]
} }
], ],
"prompt_number": 10 "prompt_number": 9
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1186,7 +1189,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 11, "prompt_number": 10,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"4 \u00e1ngel di mar\u00eda 15.0 13 3 NaN 13 \n", "4 \u00e1ngel di mar\u00eda 15.0 13 3 NaN 13 \n",
@ -1198,7 +1201,7 @@
] ]
} }
], ],
"prompt_number": 11 "prompt_number": 10
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1355,7 +1358,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 12, "prompt_number": 11,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"0 sergio ag\u00fcero 19.2 16 14 3 34 \n", "0 sergio ag\u00fcero 19.2 16 14 3 34 \n",
@ -1379,7 +1382,7 @@
] ]
} }
], ],
"prompt_number": 12 "prompt_number": 11
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1565,7 +1568,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 13, "prompt_number": 12,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"0 sergio ag\u00fcero 19.2 16 14 3 34 \n", "0 sergio ag\u00fcero 19.2 16 14 3 34 \n",
@ -1593,7 +1596,7 @@
] ]
} }
], ],
"prompt_number": 13 "prompt_number": 12
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1701,7 +1704,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 14, "prompt_number": 13,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"8 saido berahino 13.8 21 9 0 20 \n", "8 saido berahino 13.8 21 9 0 20 \n",
@ -1715,7 +1718,7 @@
] ]
} }
], ],
"prompt_number": 14 "prompt_number": 13
}, },
{ {
"cell_type": "code", "cell_type": "code",
@ -1795,7 +1798,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 15, "prompt_number": 14,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"8 saido berahino 13.8 21 9 0 20 \n", "8 saido berahino 13.8 21 9 0 20 \n",
@ -1809,7 +1812,7 @@
] ]
} }
], ],
"prompt_number": 15 "prompt_number": 14
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -1937,7 +1940,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 16, "prompt_number": 15,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"0 sergio ag\u00fcero 19.2 16 14 3 34 \n", "0 sergio ag\u00fcero 19.2 16 14 3 34 \n",
@ -1955,7 +1958,7 @@
] ]
} }
], ],
"prompt_number": 16 "prompt_number": 15
}, },
{ {
"cell_type": "code", "cell_type": "code",
@ -2060,7 +2063,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 17, "prompt_number": 16,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"1 sergio ag\u00fcero 19.2 16 14 3 34 \n", "1 sergio ag\u00fcero 19.2 16 14 3 34 \n",
@ -2078,7 +2081,7 @@
] ]
} }
], ],
"prompt_number": 17 "prompt_number": 16
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -2181,7 +2184,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 18, "prompt_number": 17,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"1 sergio ag\u00fcero 20 16 14 3 34 \n", "1 sergio ag\u00fcero 20 16 14 3 34 \n",
@ -2195,7 +2198,7 @@
] ]
} }
], ],
"prompt_number": 18 "prompt_number": 17
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -2292,7 +2295,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 19, "prompt_number": 18,
"text": [ "text": [
" salary games goals assists shots_on_target \\\n", " salary games goals assists shots_on_target \\\n",
"player \n", "player \n",
@ -2308,7 +2311,7 @@
] ]
} }
], ],
"prompt_number": 19 "prompt_number": 18
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -2402,7 +2405,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 20, "prompt_number": 19,
"text": [ "text": [
" salary games goals assists shots_on_target \\\n", " salary games goals assists shots_on_target \\\n",
"player \n", "player \n",
@ -2418,7 +2421,7 @@
] ]
} }
], ],
"prompt_number": 20 "prompt_number": 19
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -2504,7 +2507,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 21, "prompt_number": 20,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"0 sergio ag\u00fcero 20 16 14 3 34 \n", "0 sergio ag\u00fcero 20 16 14 3 34 \n",
@ -2518,7 +2521,7 @@
] ]
} }
], ],
"prompt_number": 21 "prompt_number": 20
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@ -2632,7 +2635,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 22, "prompt_number": 21,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"1 alexis s\u00e1nchez 15 0 12 7 29 \n", "1 alexis s\u00e1nchez 15 0 12 7 29 \n",
@ -2648,7 +2651,7 @@
] ]
} }
], ],
"prompt_number": 22 "prompt_number": 21
}, },
{ {
"cell_type": "code", "cell_type": "code",
@ -2700,7 +2703,7 @@
], ],
"metadata": {}, "metadata": {},
"output_type": "pyout", "output_type": "pyout",
"prompt_number": 23, "prompt_number": 22,
"text": [ "text": [
" player salary games goals assists shots_on_target \\\n", " player salary games goals assists shots_on_target \\\n",
"1 alexis s\u00e1nchez 15 0 12 7 29 \n", "1 alexis s\u00e1nchez 15 0 12 7 29 \n",
@ -2710,6 +2713,72 @@
] ]
} }
], ],
"prompt_number": 22
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Column Types"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Printing Column Types"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"types = df.columns.to_series().groupby(df.dtypes).groups\n",
"types"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 23,
"text": [
"{dtype('float64'): ['games',\n",
" 'goals',\n",
" 'assists',\n",
" 'shots_on_target',\n",
" 'points_per_game',\n",
" 'points'],\n",
" dtype('O'): ['player', 'salary', 'position', 'team']}"
]
}
],
"prompt_number": 23 "prompt_number": 23
}, },
{ {
@ -2722,48 +2791,144 @@
}, },
{ {
"cell_type": "heading", "cell_type": "heading",
"level": 1, "level": 3,
"metadata": {}, "metadata": {},
"source": [ "source": [
"Getting an Overview of the Column Types" "Selecting by Column Type"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[[back to section overview](#Sections)]"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"collapsed": false, "collapsed": false,
"input": [ "input": [
"types = df.columns.to_series().groupby(df.dtypes).groups\n", "# select string columns\n",
"for t in types.items():\n", "df.loc[:, (df.dtypes == np.dtype('O')).values].head()"
" print(t)"
], ],
"language": "python", "language": "python",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"output_type": "stream", "html": [
"stream": "stdout", "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>player</th>\n",
" <th>salary</th>\n",
" <th>position</th>\n",
" <th>team</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> sergio ag\u00fcero</td>\n",
" <td> 20</td>\n",
" <td> forward</td>\n",
" <td> manchester city</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> alexis s\u00e1nchez</td>\n",
" <td> 15</td>\n",
" <td> forward</td>\n",
" <td> arsenal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> saido berahino</td>\n",
" <td> 13.8</td>\n",
" <td> forward</td>\n",
" <td> west brom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> eden hazard</td>\n",
" <td> 18.9</td>\n",
" <td> midfield</td>\n",
" <td> chelsea</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> yaya tour\u00e9</td>\n",
" <td> 16.6</td>\n",
" <td> midfield</td>\n",
" <td> manchester city</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 24,
"text": [ "text": [
"(dtype('O'), ['player', 'salary', 'position', 'team'])\n", " player salary position team\n",
"(dtype('float64'), ['games', 'goals', 'assists', 'shots_on_target', 'points_per_game', 'points'])\n" "0 sergio ag\u00fcero 20 forward manchester city\n",
"1 alexis s\u00e1nchez 15 forward arsenal\n",
"2 saido berahino 13.8 forward west brom\n",
"3 eden hazard 18.9 midfield chelsea\n",
"4 yaya tour\u00e9 16.6 midfield manchester city"
] ]
} }
], ],
"prompt_number": 29 "prompt_number": 24
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Converting Column Types"
]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"collapsed": false, "collapsed": false,
"input": [], "input": [
"df['salary'] = df['salary'].astype(float)"
],
"language": "python", "language": "python",
"metadata": {}, "metadata": {},
"outputs": [] "outputs": [],
"prompt_number": 25
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"types = df.columns.to_series().groupby(df.dtypes).groups\n",
"types"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 26,
"text": [
"{dtype('float64'): ['salary',\n",
" 'games',\n",
" 'goals',\n",
" 'assists',\n",
" 'shots_on_target',\n",
" 'points_per_game',\n",
" 'points'],\n",
" dtype('O'): ['player', 'position', 'team']}"
]
}
],
"prompt_number": 26
} }
], ],
"metadata": {} "metadata": {}