diff --git a/README.md b/README.md index a6a6686..8dcdf81 100755 --- a/README.md +++ b/README.md @@ -4,16 +4,33 @@ Python Tutorials and References Useful functions, tutorials, and other Python-related things -###Links to view the IPython Notebooks +###Links to view the IPython Notebooks and Markdown files -- [Python benchmarks via `timeit`](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/timeit_tests.ipynb?create=1) -- [Implementing the least squares fit method for linear regression and speeding it up via Cython](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/cython_least_squares.ipynb?create=1) -- [Benchmarks of different palindrome functions](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/palindrome_timeit.ipynb?create=1) -- [A collection of not so obvious Python stuff you should know!](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb?create=1) -- [Python's scope resolution for variable names and the LEGB rule](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1) -- [Happy Mother's Day](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/funstuff/happy_mothers_day.ipynb?create=1) +
+
+
-### Links to Markdown files -- [A thorough guide to SQLite database operations in Python](./sqlite3_howto/README.md) -- [Unit testing in Python - Why we want to make it a habit](./tutorials/unit_testing.md) -- [Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks](./tutorials/installing_scientific_packages.md) +**// Python tips and tutorials** + +- A collection of not so obvious Python stuff you should know! [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb?create=1) +- Python's scope resolution for variable names and the LEGB rule [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1) +- A thorough guide to SQLite database operations in Python [[Markdown]](./sqlite3_howto/README.md) +- Unit testing in Python - Why we want to make it a habit [[Markdown]](./tutorials/unit_testing.md) +- Installing Scientific Packages for Python3 on MacOS 10.9 Mavericks [[Markdown]](./tutorials/installing_scientific_packages.md) + +**// benchmarks** + +- Python benchmarks via `timeit` [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/timeit_tests.ipynb?create=1) +- Implementing the least squares fit method for linear regression and speeding it up via Cython [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/cython_least_squares.ipynb?create=1) +- Benchmarks of different palindrome functions [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/benchmarks/palindrome_timeit.ipynb?create=1) + + + +**// other** + +- Happy Mother's Day [[IPython nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/funstuff/happy_mothers_day.ipynb?create=1) + + +**// useful snippets** + +- convert 'tab-delimited' to 'comma-separated' CSV files [[IPython 
nb]](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/useful_scripts/fix_tab_csv.ipynb?create=1) diff --git a/useful_scripts/fix_tab_csv.ipynb b/useful_scripts/fix_tab_csv.ipynb new file mode 100644 index 0000000..496f89f --- /dev/null +++ b/useful_scripts/fix_tab_csv.ipynb @@ -0,0 +1,94 @@ +{ + "metadata": { + "name": "", + "signature": "sha256:996358a25da6fc77c66d183e79209307af06bd2f9abb0656d3bb70cfc2fe597a" + }, + "nbformat": 3, + "nbformat_minor": 0, + "worksheets": [ + { + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Sebastian Raschka 05/09/2014" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Fixing CSV files" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have a directory `../CSV_files_raw/` with CSV files where some of them have 'tab-separated' and some of them 'comma-separated' columns. \n", + "Here, we will 'fix' them, i.e., have them all comma-separated, and save them to a new directory `../CSV_fixed`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we create a dictionary with the file basenames as keys. The values are lists of the file paths to the raw and new fixed CSV files. 
e.g., \n", + "\n", + " {\n", + " 'abc.csv': ['../CSV_files_raw/abc.csv', '../CSV_fixed/abc.csv'], \n", + " 'def.csv': ['../CSV_files_raw/def.csv', '../CSV_fixed/def.csv'], \n", + " ...\n", + " }" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "import sys\n", + "import os\n", + "\n", + "raw_dir = '../CSV_files_raw/'\n", + "fixed_dir = '../CSV_fixed'\n", + "\n", + "if not os.path.exists(fixed_dir):\n", + " os.mkdir(fixed_dir)\n", + "\n", + "f_dict = {os.path.basename(f):[os.path.join(raw_dir, f),\n", + " os.path.join(fixed_dir, f)]\n", + " for f in os.listdir(raw_dir) if f.endswith('.csv')} " + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 8 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we can replace the tabs with commas for the new files very easily:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "for f in f_dict.keys():\n", + " with open(f_dict[f][0], 'r') as raw, open(f_dict[f][1], 'w') as fixed:\n", + " for line in raw:\n", + " line = line.rstrip('\\n').split('\\t')\n", + " fixed.write(','.join(line) + '\\n')" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 11 + } + ], + "metadata": {} + } + ] +} \ No newline at end of file
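
The notebook's loop joins the split fields with plain commas, so a tab-delimited field that itself contains a comma would end up looking like two columns. As a hedged alternative, the conversion could go through the standard-library `csv` module, which quotes such fields when writing. The sketch below is not part of the notebook: the function names `detect_delimiter`, `convert_to_comma_separated`, and `convert_directory` are hypothetical, and the rule "a tab in the first line means the file is tab-delimited" is an assumption that may not hold for every file.

```python
import csv
import os


def detect_delimiter(path):
    """Guess the delimiter from the first line (assumed heuristic):
    a tab if the line contains one, otherwise a comma."""
    with open(path, 'r', newline='') as handle:
        first_line = handle.readline()
    return '\t' if '\t' in first_line else ','


def convert_to_comma_separated(raw_path, fixed_path):
    """Read raw_path with its guessed delimiter and write it to
    fixed_path as a comma-separated file."""
    delimiter = detect_delimiter(raw_path)
    with open(raw_path, 'r', newline='') as raw, \
            open(fixed_path, 'w', newline='') as fixed:
        reader = csv.reader(raw, delimiter=delimiter)
        # The writer quotes any field that contains a comma;
        # lineterminator='\n' keeps the line endings the notebook writes.
        writer = csv.writer(fixed, lineterminator='\n')
        for row in reader:
            writer.writerow(row)


def convert_directory(raw_dir, fixed_dir):
    """Convert every .csv file in raw_dir into fixed_dir."""
    os.makedirs(fixed_dir, exist_ok=True)
    for name in sorted(os.listdir(raw_dir)):
        if name.endswith('.csv'):
            convert_to_comma_separated(os.path.join(raw_dir, name),
                                       os.path.join(fixed_dir, name))
```

With the directories used in the notebook, this would be called as `convert_directory('../CSV_files_raw/', '../CSV_fixed')`. Files that are already comma-separated are read and rewritten with the same delimiter, so they pass through with their columns intact.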