Installing Mahout 0.4 on a Hadoop 0.20.2 cluster

1: Download the binary distribution and unpack it.

 mahout-distribution-0.4.tar.gz    
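Unpacking could look like the sketch below. The helper name unpack_mahout is hypothetical, and the ~/soft target in the usage note is only an assumption chosen to match the paths used in step 2.

```shell
# Hypothetical helper (not part of Mahout): unpack a release tarball
# into a target directory, creating the directory if needed.
unpack_mahout() {
    # $1 = path to the tarball, e.g. mahout-distribution-0.4.tar.gz
    # $2 = destination directory, e.g. "$HOME/soft" (assumed)
    mkdir -p "$2" || return 1
    tar -xzf "$1" -C "$2"
}
```

Usage, for example: unpack_mahout mahout-distribution-0.4.tar.gz "$HOME/soft"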

 

2: Configure the environment variables

PATH=$PATH:$HOME/bin:/usr/lib/jvm/java-6-sun-1.6.0.24/bin:/home/yangze/soft/pig-0.9.2/bin:/home/yangze/soft/hadoop-0.20.2/bin:/usr/local/python-2.7.3/bin
export PATH
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24
export PIG_CLASSPATH=/home/yangze/soft/pig-0.9.2/conf
export HADOOP_HOME="/home/yangze/soft/hadoop-0.20.2"
export HADOOP_CONF_DIR=/home/yangze/soft/hadoop-0.20.2/conf
export MAHOUT_HOME=/home/yangze/soft/mahout-distribution-0.4
export MAHOUT_CONF_DIR=/home/yangze/soft/mahout-distribution-0.4/conf
export PATH=$PATH:$HADOOP_HOME/bin:$MAHOUT_HOME/bin
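A mistyped path in these exports tends to show up only later as a confusing runtime error, so a quick check that each variable names an existing directory can help. This is a minimal sketch; check_dir_var is a hypothetical helper name, not something shipped with Hadoop or Mahout.

```shell
# Hypothetical helper: report whether the directory held in a variable exists.
# $1 = variable name (used for the message only), $2 = its value.
check_dir_var() {
    if [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 missing: $2" >&2
        return 1
    fi
}
```

After reloading the shell profile, it could be called as: check_dir_var MAHOUT_HOME "$MAHOUT_HOME" and likewise for HADOOP_HOME, HADOOP_CONF_DIR and MAHOUT_CONF_DIR.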


 

Verify the installation:

[Screenshot: verification output]
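One way to verify from the command line, assuming the environment from step 2 has been loaded into the current shell, is to confirm that the launchers resolve on PATH. The helper name require_cmd is hypothetical.

```shell
# Hypothetical helper: confirm a command resolves on PATH and print its location.
require_cmd() {
    if path=$(command -v "$1"); then
        echo "$1 -> $path"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}
```

For example: require_cmd hadoop && require_cmd mahout. Running mahout with no arguments is also a common smoke test, since the launcher is expected to answer with its list of valid program names; treat that behaviour as an assumption for 0.4 rather than a confirmed fact.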

3: Run a k-means clustering test:

Copy the test data to HDFS. Download URL: http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
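Before uploading, it can be worth confirming that the download is complete. The HDFS listing further down reports 288374 bytes for this file, so the local copy should have the same size. A minimal sketch, with check_size as a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if a local file has the expected size in bytes.
# $1 = file path, $2 = expected size in bytes.
check_size() {
    actual=$(wc -c < "$1") || return 1
    actual=$(echo "$actual" | tr -d ' ')   # some wc implementations pad the count
    if [ "$actual" -eq "$2" ]; then
        echo "size ok: $actual bytes"
    else
        echo "size mismatch: got $actual, expected $2" >&2
        return 1
    fi
}
```

For example: check_size /home/yangze/newdisk/data/synthetic_control.data 288374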

yangze@master:~/soft/hadoop-0.20.2$ bin/hadoop fs -mkdir ./mahoutdata
yangze@master:~/soft/hadoop-0.20.2$ bin/hadoop fs -put /home/yangze/newdisk/data/synthetic_control.data ./mahoutdata
yangze@master:~/soft/hadoop-0.20.2$ bin/hadoop fs -ls ./mahoutdata
Found 1 items
-rw-r--r--   2 yangze supergroup     288374 2014-01-21 15:47 /user/yangze/mahoutdata/synthetic_control.data

yangze@master:~/soft/hadoop-0.20.2$ bin/hadoop fs -mv ./mahoutdata  ./testdata

yangze@master:~/soft/hadoop-0.20.2$ mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job

 


 

View the results:

[Screenshot: clustering results]
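The results can also be inspected from the command line. The sketch below assumes the job wrote to a relative HDFS directory named output, which is believed to be the default for this example when it is run without arguments; that default is an assumption, and list_job_output is a hypothetical helper name.

```shell
# Hypothetical helper: list an HDFS directory, defaulting to "output"
# (an assumed default output directory for the example job).
list_job_output() {
    dir="${1:-output}"
    hadoop fs -ls "$dir"
}
```

Calling list_job_output with no argument lists output; a different directory can be passed as the first argument.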

 

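Question: does mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job not need a data directory to be specified? How is the data directory determined? With the directory named mahoutdata the job failed with an error; after renaming it to testdata it worked.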
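A likely explanation, offered as an assumption rather than something confirmed here: when started with no arguments, this example job appears to fall back to built-in defaults and reads from a relative HDFS directory named testdata. That would account for mahoutdata failing and testdata working. The sketch below makes the dependency explicit by checking for the directory before launching the job; run_kmeans_example is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: refuse to start the example unless the
# assumed default input directory "testdata" exists in HDFS.
run_kmeans_example() {
    if ! hadoop fs -test -d testdata; then
        echo "HDFS directory 'testdata' not found; upload or rename the data first" >&2
        return 1
    fi
    mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
}
```

With this wrapper, a missing or misnamed input directory produces a short message up front instead of a failure partway through the job.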
