This page collects short Python snippets for common data processing tasks. Below each snippet is an IPython %loadpy magic command that loads the snippet into your session. It is used like this:
In [1]: %loadpy http://econpy.pythonanywhere.com/s/foo.py
The %loadpy IPython magic function places the content of a script at your prompt without executing it. This makes %loadpy handy when you want to make a quick edit to a remote or local Python script before running it. If you don't need to make any changes, just press Enter once the code has loaded to execute it. (In recent IPython versions, %loadpy is an alias of %load.)
» Return only the unique elements of a list while preserving the order of the original list.
def uniquify(myList, idfun=None):
    # idfun maps each item to the key used to detect duplicates;
    # by default the item itself is the key.
    if idfun is None:
        def idfun(x): return x
    seen, result = set(), []
    for item in myList:
        marker = idfun(item)
        if marker in seen:
            continue
        seen.add(marker)
        result.append(item)
    return result
%loadpy http://econpy.pythonanywhere.com/s/uniquify.py
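As a usage illustration, the sketch below calls uniquify with and without idfun. The function body repeats the logic of the snippet above so that the block runs on its own; the sample lists and the choice of str.lower as the key function are made up for this example.

```python
def uniquify(myList, idfun=None):
    # Same logic as the snippet above, repeated so this block runs standalone.
    if idfun is None:
        def idfun(x): return x
    seen, result = set(), []
    for item in myList:
        marker = idfun(item)
        if marker in seen:
            continue
        seen.add(marker)
        result.append(item)
    return result

# Plain de-duplication: the first occurrence of each value is kept.
numbers = uniquify([3, 1, 3, 2, 1])

# With idfun, items are compared by their marker (here the lowercase form),
# so 'Apple' and 'apple' count as duplicates and the first one wins.
words = uniquify(['Apple', 'apple', 'Pear', 'PEAR', 'fig'], idfun=str.lower)

print(numbers)  # [3, 1, 2]
print(words)    # ['Apple', 'Pear', 'fig']
```

Note that the markers must be hashable, so a list of lists would need an idfun such as tuple.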
» Remove all the HTML tags from a string.
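def striptags(raw_html):
    # tag[0] is True while the scan is inside a tag; a one-element list
    # lets the nested function update it.
    tag = [False]
    def checkit(i):
        if tag[0]:
            # Inside a tag: stay inside until '>' is seen.
            tag[0] = (i != '>')
            return False
        elif i == '<':
            tag[0] = True
            return False
        return True
    return ''.join(i for i in raw_html if checkit(i))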
%loadpy http://econpy.pythonanywhere.com/s/striptags.py
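The striptags function above scans character by character, so it leaves entities such as &lt; untouched and can be confused by a '>' inside an attribute value. Where that matters, one possible alternative (not the method used on this page) is the standard library's HTMLParser. The following is a minimal sketch for Python 3; the names TextOnly and striptags_parser are hypothetical.

```python
from html.parser import HTMLParser  # Python 3 standard library


class TextOnly(HTMLParser):
    """Collects only the text between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()  # convert_charrefs defaults to True
        self.parts = []

    def handle_data(self, data):
        # Called for text content; entities are already decoded here.
        self.parts.append(data)


def striptags_parser(raw_html):
    parser = TextOnly()
    parser.feed(raw_html)
    parser.close()  # flush any text still buffered
    return ''.join(parser.parts)


print(striptags_parser('<p>Hello, <b>world</b>!</p>'))  # Hello, world!
print(striptags_parser('<i>5 &lt; 7</i>'))              # 5 < 7
```

This sketch also returns the text inside script and style elements, so it is not a sanitizer; it only separates markup from text.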
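» Create a list of file names in a directory dirname. Set subdir = True (False) to include (exclude) files in sub-directories of dirname. Any extra arguments are file extensions to keep, given without the leading dot (e.g. 'txt'); if none are given, all files are returned.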
from os import listdir, path
def filelist(dirname, subdir, *args):
    # args: file extensions to keep, without the dot (e.g. 'txt').
    f = []
    for i in listdir(dirname):
        d = path.join(dirname, i)
        if path.isfile(d):
            if len(args) == 0:
                f.append(d)
            elif path.splitext(d)[1][1:] in args:
                f.append(d)
        elif path.isdir(d) and subdir:
            f += filelist(d, subdir, *args)
    return f
%loadpy http://econpy.pythonanywhere.com/s/filelist.py
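To show how subdir and the extension arguments interact, the sketch below runs filelist against a throwaway directory tree. The function body repeats the logic of the snippet above so that the block runs on its own; the temporary directory and the file names a.txt, b.csv and sub/c.txt are made up for this example.

```python
import shutil
import tempfile
from os import listdir, makedirs, path


def filelist(dirname, subdir, *args):
    # Same logic as the snippet above, repeated so this block runs standalone.
    f = []
    for i in listdir(dirname):
        d = path.join(dirname, i)
        if path.isfile(d):
            if len(args) == 0:
                f.append(d)
            elif path.splitext(d)[1][1:] in args:
                f.append(d)
        elif path.isdir(d) and subdir:
            f += filelist(d, subdir, *args)
    return f


# Build a small demo tree: a.txt, b.csv, and sub/c.txt.
root = tempfile.mkdtemp()
makedirs(path.join(root, 'sub'))
for name in ('a.txt', 'b.csv', path.join('sub', 'c.txt')):
    with open(path.join(root, name), 'w') as fh:
        fh.write('demo\n')

top_txt = filelist(root, False, 'txt')  # .txt files in root only
all_txt = filelist(root, True, 'txt')   # .txt files, including sub/
everything = filelist(root, True)       # no extensions given: every file

# listdir makes no ordering promise, so sort before displaying.
print(sorted(path.relpath(p, root) for p in all_txt))

shutil.rmtree(root)  # remove the demo tree
```

Because the order of the returned list follows listdir, sort it first if the order matters, for example when the files are to be merged in a fixed sequence.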
» Merge the files in a list of files myFileList into a single file outputFile.
def mergefiles(myFileList, outputFile):
    # The with blocks close both files even if a read or write fails.
    with open(outputFile, 'w') as g:
        for i in myFileList:
            print('Writing file: %s' % i)
            with open(i) as f:
                g.write(f.read())
    print('File created: %s' % outputFile)
%loadpy http://econpy.pythonanywhere.com/s/mergefiles.py
It is often useful to combine the previous two functions into a single file-merging procedure. To do so, first cd into the directory that contains your files (dirname), then run:
In [2]: %loadpy http://econpy.pythonanywhere.com/s/filelist.py
In [3]: %loadpy http://econpy.pythonanywhere.com/s/mergefiles.py
In [4]: mergefiles(filelist('.', False, 'txt'), 'myOutputFile.txt')
This command creates the file myOutputFile.txt, which contains the concatenated contents of all .txt files in the working directory. Files in sub-directories are excluded because subdir is False.
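One thing to watch for: because myOutputFile.txt sits in the same directory and has the same extension as the inputs, running the command a second time would put the earlier output file in the input list. The sketch below is one way to do the same merge outside IPython while skipping the output file and fixing the merge order. The helper name merge_txt and the demo files are hypothetical, not part of the scripts on this page.

```python
import os
import shutil
import tempfile


def merge_txt(dirname, output_name):
    """Merge every .txt file in dirname into dirname/output_name."""
    # Sort for a reproducible order and skip the output file itself.
    inputs = sorted(
        os.path.join(dirname, n)
        for n in os.listdir(dirname)
        if n.endswith('.txt')
        and n != output_name
        and os.path.isfile(os.path.join(dirname, n))
    )
    with open(os.path.join(dirname, output_name), 'w') as out:
        for p in inputs:
            with open(p) as src:
                out.write(src.read())
    return inputs


# Demo in a throwaway directory with two small input files.
demo_dir = tempfile.mkdtemp()
for name, text in (('a.txt', 'one\n'), ('b.txt', 'two\n')):
    with open(os.path.join(demo_dir, name), 'w') as fh:
        fh.write(text)

merge_txt(demo_dir, 'myOutputFile.txt')
merge_txt(demo_dir, 'myOutputFile.txt')  # second run: output is not re-read

with open(os.path.join(demo_dir, 'myOutputFile.txt')) as fh:
    merged = fh.read()
print(repr(merged))  # 'one\ntwo\n'

shutil.rmtree(demo_dir)  # remove the demo directory
```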