Guinea Pig
Quick Start
Running wordcount.py
Set up a directory that contains the file gp.py and a second script called wordcount.py which contains this code:
# always start like this
from gp import *
import sys

# supporting routines can go here
def tokens(line):
    for tok in line.split():
        yield tok.lower()

# always subclass Planner
class WordCount(Planner):

    wc = ReadLines('corpus.txt') | FlattenBy(by=tokens) | Group(by=lambda x:x, reducingWith=ReduceToCount())

# always end like this
if __name__ == "__main__":
    WordCount().main(sys.argv)
Then type the command:
% python tutorial/wordcount.py --store wc
After a couple of seconds the command returns, and you can view the word counts with
% head wc.gp
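To get a feel for what the wc view in wordcount.py computes, here is a rough plain-Python sketch of the same read, flatten, group-and-count steps. It does not use Guinea Pig and is not how gp.py executes a plan; the function name word_count and the sample lines are made up for illustration, and it works on an in-memory list rather than corpus.txt.

```python
from collections import Counter

def tokens(line):
    # Same helper as in wordcount.py: split on whitespace and lowercase.
    for tok in line.split():
        yield tok.lower()

def word_count(lines):
    # Hypothetical stand-in for the pipeline: flatten every line into
    # tokens, then group identical tokens and count each group.
    counts = Counter()
    for line in lines:
        counts.update(tokens(line))
    return counts

if __name__ == "__main__":
    # Made-up sample input standing in for the lines of corpus.txt.
    sample = ["The quick brown fox", "jumps over the lazy dog"]
    for word, n in sorted(word_count(sample).items()):
        print(word, n)
```

The sketch holds all counts in memory at once, which is fine for a small file but is exactly the limitation a tool like Guinea Pig is meant to avoid on larger corpora.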