common

pyxshell.common defines some helpful pipeline components, borrowed straight from UNIX but adapted to work with arbitrary Python objects instead of just simple text streams.

For example, the list of functions below can be generated with:

>>> import sys
>>> from pyxshell.common import *
>>> ("    *    :func:`%s`"%c for c in cat("pyxshell/common.py") | grep("def") | cut(1) | cut(0,"(") | filter(lambda i:"default" not in i) | uniq() ) | sort() | glue("\n") > sys.stdout
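
The | syntax relies on operator overloading. As a rough sketch of how such a component might be wired up using only the standard library (Pipe, pipe(), grep_sub() and upper() are hypothetical names chosen for illustration; this is not pyxshell's actual implementation, and it does not cover the > redirection):

```python
class Pipe(object):
    """Hypothetical wrapper that lets a generator function be used with `|`."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, source):
        # Python calls this for `source | self` when `source` (a list, a
        # generator, ...) does not implement `|` for this operand itself.
        return self.func(iter(source), *self.args, **self.kwargs)


def pipe(func):
    """Decorator: calling the decorated name builds a Pipe bound to its arguments."""
    def make(*args, **kwargs):
        return Pipe(func, *args, **kwargs)
    return make


@pipe
def grep_sub(stdin, needle):
    # Keep only the items that contain `needle`.
    for item in stdin:
        if needle in item:
            yield item


@pipe
def upper(stdin):
    # Upper-case every item passing through.
    for item in stdin:
        yield item.upper()


result = list(["cat", "cabbage", "dog"] | grep_sub("ca") | upper())
# result == ['CAT', 'CABBAGE']
```

Each stage returns a plain generator, so items flow through the chain lazily, one at a time.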

Defined in this module:

* cat()
* curl()
* cut()
* dir_file()
* dos2unix()
* echo()
* expand()
* filter()
* glue()
* grep_e()
* grep_in()
* grep()
* head()
* join()
* map()
* pretty_printer()
* sed()
* sh()
* sort()
* tail()
* tee()
* traverse()
* uniq()
* unix2dos()
* wc()

pyxshell.common.cat(*args, **kwargs)

Read a file. Passes directly through to a call to open().

>>> src_file = __file__.replace('.pyc', '.py')
>>> for line in cat(src_file):
...     if line.startswith('def cat'):
...          print repr(line)
'def cat(*args, **kwargs):\n'

pyxshell.common.curl(*args, **kwargs)

Fetch a URL, yielding output line-by-line.

>>> UNLICENSE = 'http://unlicense.org/UNLICENSE'
>>> for line in curl(UNLICENSE): 
...     print line,
This is free and unencumbered software released into the public domain.
...

pyxshell.common.cut(*args, **kwargs)

Yields the requested fields of each string, after splitting it into a list on the delimiter. fields may be a single index or a list of indices. If delimiter is None, the string is split on whitespace. If fields is None, every field is returned.

>>> list( iter( ["You don't NEED to follow ME","You don't NEED to follow ANYBODY!"] ) | cut(1,"NEED to"))
[' follow ME', ' follow ANYBODY!']
>>> list( iter( ["I say you are Lord","and I should know !","I've followed a few !"] ) | cut([4]) )
[['Lord'], ['!'], ['!']]
>>> list( iter( ["You don't NEED to follow ME","You don't NEED to follow ANYBODY!"] ) | cut([0,1],"NEED to"))
[["You don't ", ' follow ME'], ["You don't ", ' follow ANYBODY!']]
>>> list( iter( ["I say you are Lord","and I should know !","I've followed a few !"] ) | cut([4,1]) )
[['Lord', 'say'], ['!', 'I'], ['!', 'followed']]
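
The selection rules can be sketched for a single line with str.split(). This is an illustration of the behaviour shown in the examples, not the library's code; cut_fields() is a hypothetical helper name:

```python
def cut_fields(line, fields=None, delimiter=None):
    """Split `line` on `delimiter` and pick the requested field(s)."""
    parts = line.split(delimiter)  # delimiter=None splits on runs of whitespace
    if fields is None:
        return parts  # no selection: every field
    if isinstance(fields, int):
        return parts[fields]  # single index: one item
    return [parts[i] for i in fields]  # list of indices, in the given order


single = cut_fields("You don't NEED to follow ME", 1, "NEED to")
# single == ' follow ME'
several = cut_fields("and I should know !", [4, 1])
# several == ['!', 'I']
```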

pyxshell.common.dir_file(*args, **kwargs)

Yields the file name and its absolute path as a tuple, expanding the home directory and environment variables if necessary.

pyxshell.common.dos2unix(*args, **kwargs)

Replace DOS-like newline characters by UNIX-like ones.

>>> list( iter(["dos\r\n","unix\n"]) | dos2unix() )
['dos\n', 'unix\n']

pyxshell.common.echo(*args, **kwargs)

Yield a single item. Equivalent to iter([item]), but nicer-looking.

>>> list(echo(1))
[1]
>>> list(echo('hello'))
['hello']

pyxshell.common.expand(*args, **kwargs)

Yields the file names matching each of the given 'filepatterns'.
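
No example is given for expand(). A minimal sketch of the idea, assuming the patterns are shell-style globs and that ~ and environment variables are expanded first (both are assumptions; expand_sketch() is a hypothetical stand-in, not the library function):

```python
import glob
import os


def expand_sketch(*filepatterns):
    """Yield every file name matching each of the given glob patterns."""
    for pattern in filepatterns:
        # Expand "~" and environment variables before globbing (an assumption).
        pattern = os.path.expandvars(os.path.expanduser(pattern))
        # Sorted only to make the order of the results predictable.
        for name in sorted(glob.glob(pattern)):
            yield name
```

Called as expand_sketch("*.py", "docs/*.rst"), it would yield the matches of the first pattern, then those of the second.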

pyxshell.common.filter(*args, **kwargs)

Only pass through items for which predicate(item) is truthy.

>>> list(xrange(5) | filter(lambda x: x % 2 == 0))
[0, 2, 4]

pyxshell.common.glue(*args, **kwargs)

Join all the lines in the stream, using the given delimiter. The default delimiter is a space.

>>> list( [[[1],[2]],[[3],[4]],[[5],[6]]] | traverse() | map(str) | glue(" ") )
['1 2 3 4 5 6']

pyxshell.common.grep(*args, **kwargs)

Filters strings on stdin according to a given pattern. Uses a regular expression if the pattern can be compiled; otherwise falls back to built-in operators (in() or ==()).

>>> list( range(10) | grep(5) )
[5]
>>> list( range(10) | grep([5]) )
[5]
>>> list( ['cat', 'cabbage', 'conundrum', 'cathedral'] | grep('cat') )
['cat', 'cathedral']
>>> list( ['cat', 'cabbage', 'conundrum', 'cathedral'] | grep('^c.*e') )
['cabbage', 'cathedral']
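
The fallback described above could look roughly like the following. This is a sketch of the decision logic only, under the assumption that "cannot be compiled" means re.compile() raises TypeError; matches() is a hypothetical name, not part of pyxshell:

```python
import re


def matches(pattern, item):
    """Return True if `item` should pass a grep-like filter for `pattern`."""
    try:
        regex = re.compile(pattern)
    except TypeError:
        # Not compilable as a regex (e.g. an int or a list):
        # try membership first, then plain equality.
        try:
            return item in pattern
        except TypeError:
            return item == pattern
    return regex.search(item) is not None


words = ["cat", "cabbage", "conundrum", "cathedral"]
by_regex = [w for w in words if matches("^c.*e", w)]
# by_regex == ['cabbage', 'cathedral']
by_value = [n for n in range(10) if matches(5, n)]
by_list = [n for n in range(10) if matches([5], n)]
# by_value == by_list == [5]
```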

pyxshell.common.grep_e(*args, **kwargs)

Filter strings on stdin for the given regex (uses re.search()).

>>> list(iter(['cat', 'cabbage', 'conundrum', 'cathedral']) | grep_e(r'^ca'))
['cat', 'cabbage', 'cathedral']

pyxshell.common.grep_in(*args, **kwargs)

Filter strings on stdin for any string in a given list (uses in()).

>>> list(iter(['cat', 'cabbage', 'conundrum', 'cathedral']) | grep_in(["cat","cab"]))
['cat', 'cabbage', 'cathedral']
>>> list( range(10) | grep_in(5) )
[5]

pyxshell.common.head(*args, **kwargs)

Yield only a given number of lines, then stop.

If size=None, yield all the lines of the stream.

>>> list( iter(range(10)) | head() )
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list( iter(range(10)) | head(5) )
[0, 1, 2, 3, 4]
>>> list( iter(range(10)) | head(0) )
[]
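
The same behaviour can be approximated with itertools.islice(), which also shows why a head-like stage can stop an endless stream. This is a sketch, and head_sketch() is a hypothetical stand-in rather than the library function:

```python
from itertools import count, islice


def head_sketch(stream, size=None):
    """Yield at most `size` items; size=None yields the whole stream."""
    # islice() stops pulling from `stream` once `size` items were produced.
    return islice(stream, size)


first_five = list(head_sketch(iter(range(10)), 5))
# first_five == [0, 1, 2, 3, 4]
from_endless = list(head_sketch(count(), 3))  # count() never ends by itself
# from_endless == [0, 1, 2]
```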

pyxshell.common.join(*args, **kwargs)

Join the list items of each input line with the given delimiter. The default delimiter is a space.

>>> list( iter( ["- Yes, we are all different!  - I'm not!"] ) | cut() | join() )
["- Yes, we are all different! - I'm not!"]
>>> list( iter( ["- Yes, we are all different!  - I'm not!"] ) | cut(delimiter="all") | join("NOT") )
["- Yes, we are NOT different!  - I'm not!"]

pyxshell.common.map(*args, **kwargs)

Map each item on stdin through the given function.

>>> list(xrange(5) | map(lambda x: x + 2))
[2, 3, 4, 5, 6]

pyxshell.common.pretty_printer(*args, **kwargs)

Pretty print each item on stdin and pass it straight through.

>>> for item in iter([{'a': 1}, ['b', 'c', 3]]) | pretty_printer():
...     pass
{'a': 1}
['b', 'c', 3]

pyxshell.common.sed(*args, **kwargs)

Apply re.sub() to each line on stdin with the given pattern/repl.

>>> list(iter(['cat', 'cabbage']) | sed(r'^ca', 'fu'))
['fut', 'fubbage']

Upon encountering a non-matching line of input, sed() will pass it through as-is. If you want to change this behaviour to only yield lines which match the given pattern, pass exclusive=True:

>>> list(iter(['cat', 'nomatch']) | sed(r'^ca', 'fu'))
['fut', 'nomatch']
>>> list(iter(['cat', 'nomatch']) | sed(r'^ca', 'fu', exclusive=True))
['fut']
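
One way to get both behaviours from the standard library is re.subn(), which reports how many substitutions were made. A sketch under that assumption (sed_sketch() is a hypothetical stand-in; the library may implement exclusive differently):

```python
import re


def sed_sketch(lines, pattern, repl, exclusive=False):
    """Apply a regex substitution to each line, optionally dropping non-matches."""
    regex = re.compile(pattern)
    for line in lines:
        new_line, count = regex.subn(repl, line)
        # count == 0 means the pattern did not match this line.
        if count or not exclusive:
            yield new_line


passthrough = list(sed_sketch(["cat", "nomatch"], r"^ca", "fu"))
# passthrough == ['fut', 'nomatch']
only_matches = list(sed_sketch(["cat", "nomatch"], r"^ca", "fu", exclusive=True))
# only_matches == ['fut']
```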

pyxshell.common.sh(*args, **kwargs)

Run a shell command, send it input, and produce its output.

>>> print ''.join(echo("h\ne\nl\nl\no") | sh('sort -u'))
e
h
l
o

>>> for line in sh('echo Hello World'):
...     print line,
Hello World
>>> for line in sh('false', check_success=True):
...     print line, 
Traceback (most recent call last):
...
CalledProcessError: Command '['false']' returned non-zero exit status 1
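
For comparison, a similar round trip can be written directly with the subprocess module. This is only a sketch: run_lines() is a hypothetical helper, and the two commands are portable stand-ins for sort -u and false that are run through the Python interpreter itself:

```python
import subprocess
import sys


def run_lines(argv, input_text="", check_success=False):
    """Run `argv`, send it `input_text`, and return its output as a list of lines."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    output, _ = proc.communicate(input_text)
    if check_success and proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)
    return output.splitlines(True)  # keep the line endings


# Stand-ins for `sort -u` and `false` (assumptions made for portability).
SORT_UNIQUE = [sys.executable, "-c",
               "import sys; sys.stdout.writelines(sorted(set(sys.stdin)))"]
ALWAYS_FAIL = [sys.executable, "-c", "import sys; sys.exit(1)"]

lines = run_lines(SORT_UNIQUE, "h\ne\nl\nl\no\n")
# lines == ['e\n', 'h\n', 'l\n', 'o\n']

try:
    run_lines(ALWAYS_FAIL, check_success=True)
    failed = False
except subprocess.CalledProcessError:
    failed = True  # a non-zero exit status is reported as an exception
```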

pyxshell.common.tail(*args, **kwargs)

Yield the given number of lines at the end of the stream.

If size=None, yield all the lines. Otherwise, tail() waits for the data stream to end before yielding any lines.

>>> list( iter(range(10)) | tail() )
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list( iter(range(10)) | tail(5) )
[5, 6, 7, 8, 9]
>>> list( iter(range(10)) | tail(0) )
[]
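
The need to wait for the end of the stream follows from how the last items have to be tracked. A sketch using a bounded collections.deque (tail_sketch() is a hypothetical stand-in, not the library's code):

```python
from collections import deque


def tail_sketch(stream, size=None):
    """Yield the last `size` items of `stream`; size=None yields everything."""
    if size is None:
        # No limit: items can be forwarded as soon as they arrive.
        for item in stream:
            yield item
    else:
        # A bounded deque keeps only the last `size` items, but the whole
        # stream has to be consumed before the first one can be yielded.
        for item in deque(stream, maxlen=size):
            yield item


last_five = list(tail_sketch(iter(range(10)), 5))
# last_five == [5, 6, 7, 8, 9]
```

Memory use stays proportional to size rather than to the length of the stream.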

pyxshell.common.tee(*args, **kwargs)

Save the input stream in a given file and forward it to the next pipe.

>>> out=[];range(10) | map(str) | glue(",") | tee(sys.stdout) > out ; print out
0,1,2,3,4,5,6,7,8,9['0,1,2,3,4,5,6,7,8,9']
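
The "write, then forward" idea fits in a few lines. In this sketch the sink is assumed to be any object with a write() method, as sys.stdout is in the example above; tee_sketch() is a hypothetical stand-in, not the library function:

```python
import io


def tee_sketch(stream, sink):
    """Write each item to `sink`, then pass it on unchanged."""
    for item in stream:
        sink.write(item)
        yield item


sink = io.StringIO()
forwarded = list(tee_sketch(iter(["0,1,2", ",3,4"]), sink))
# forwarded == ['0,1,2', ',3,4'] and sink.getvalue() == '0,1,2,3,4'
```

Because it is a generator, nothing is written until a later stage actually consumes the items.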

pyxshell.common.traverse(*args, **kwargs)

Recursively browse all items, as if nested levels were flattened. Yields all items of the corresponding flat list.

>>> list( [[1],[[2,3]],[[[4],[5]]]] | traverse() )
[1, 2, 3, 4, 5]
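
A recursive generator is enough to sketch this flattening. The version below assumes that only lists and tuples are descended into and that everything else, strings included, is treated as a leaf; traverse_sketch() is a hypothetical stand-in, not the library's code:

```python
def traverse_sketch(items):
    """Yield the leaves of arbitrarily nested lists/tuples, depth first."""
    for item in items:
        if isinstance(item, (list, tuple)):
            # Descend into the nested level and re-yield its leaves.
            for leaf in traverse_sketch(item):
                yield leaf
        else:
            yield item


flat = list(traverse_sketch([[1], [[2, 3]], [[[4], [5]]]]))
# flat == [1, 2, 3, 4, 5]
```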

pyxshell.common.unix2dos(*args, **kwargs)

Replace UNIX-like newline characters by DOS-like ones.

>>> list( iter(["dos\r\n","unix\n"]) | unix2dos() )
['dos\r\n', 'unix\r\n']

pyxshell.common.wc(*args, **kwargs)

Return a list indicating the total number of [lines, words, characters] in the whole stream.

>>> list( [["It's every man's right to have babies if he wants them."],["But you can't have babies. "],["Don't you oppress me."]] | wc() )
[3, 20, 103]
>>> list( [["It's every man's right to have babies if he wants them."],["But you can't have babies. "],["Don't you oppress me."]] | wc("lines") )
[3]
>>> list( [["It's every man's right to have babies if he wants them."],["But you can't have babies. "],["Don't you oppress me."]] | wc("words") )
[20]
>>> list( [["It's every man's right to have babies if he wants them."],["But you can't have babies. "],["Don't you oppress me."]] | wc("characters") )
[103]
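
The three totals can be reproduced by hand. The sketch below assumes a flat stream of strings, whereas the examples above wrap each string in a list, and it counts words by splitting on whitespace; count_stream() is a hypothetical helper, not the library function:

```python
def count_stream(lines):
    """Return [lines, words, characters] totals for an iterable of strings."""
    n_lines = n_words = n_chars = 0
    for line in lines:
        n_lines += 1
        n_words += len(line.split())  # words are whitespace-separated
        n_chars += len(line)  # every character counts, spaces included
    return [n_lines, n_words, n_chars]


totals = count_stream([
    "It's every man's right to have babies if he wants them.",
    "But you can't have babies. ",
    "Don't you oppress me.",
])
# totals == [3, 20, 103]
```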
