Mahout

I modified Mahout's examples/bin/classify-20newsgroups.sh script, adding:

	export MASTER=spark://127.0.0.1:7077
	export SPARK_HOME=... spark install directory ...
	WORK_DIR=~/usr/data/mahoutspark
	alg=naivebayes
      

WORK_DIR must match the Mahout directories given in aether.prop. The values alg=naivebayes and alg=cnaivebayes correspond to mahoutalgorithm=bayes and mahoutalgorithm=cbayes, respectively. The download and unpacking of the newsgroups data can be replaced with, for instance, the top-level Dewey classes: directories named philosophy and psychology, religion, social sciences, etc., each containing sample books or documents used to build the classification model. For more about the settings in aether.prop, see here.
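
As a rough illustration of replacing the newsgroups download with Dewey class directories, the sketch below creates one directory per top-level class. The CORPUS_DIR name (dewey-all) and the underscore-style directory names are assumptions made for the example, not something the Mahout script or aether.prop prescribes. If WORK_DIR is unset, the sketch falls back to a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch only: lay out one directory per top-level Dewey class, in place of
# the 20 Newsgroups download/unpack step. CORPUS_DIR and the directory names
# are hypothetical; adjust them to whatever the rest of the script reads.
set -euo pipefail

WORK_DIR="${WORK_DIR:-$(mktemp -d)}"   # falls back to a scratch directory
CORPUS_DIR="${WORK_DIR}/dewey-all"     # hypothetical name

# The ten top-level Dewey classes, written as directory-friendly names.
classes=(
  general_works
  philosophy_and_psychology
  religion
  social_sciences
  language
  science
  technology
  arts_and_recreation
  literature
  history_and_geography
)

for class in "${classes[@]}"; do
  mkdir -p "${CORPUS_DIR}/${class}"
done

# Each directory should then receive sample books or documents for its class,
# e.g. cp /path/to/samples/religion/*.txt "${CORPUS_DIR}/religion/"
ls "${CORPUS_DIR}"
```

Each directory name then acts as the class label for the documents placed inside it, in the same way the newsgroup names do in the original script.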