Hadoop实战 (Hadoop in Action): Solving Problems Running C++ Programs with Hadoop Pipes

Note: I am using hadoop-1.2.1, and the development environment is openSUSE 12.3 x64. Hadoop is installed under /usr/lib/hadoop, and the Hadoop commands have already been added to the system PATH.
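For reference, a minimal sketch of what that PATH setup looks like in ~/.bashrc (the install path below is an assumption based on the Makefile later in this post; adjust it to your own layout):

export HADOOP_INSTALL=/usr/lib/hadoop-1.2.1   # appended to ~/.bashrc
export PATH=$PATH:$HADOOP_INSTALL/bin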

The following four posts are the main references I used while solving the problem:

1. http://www.cnblogs.com/lanxuezaipiao/p/3648853.html — this post points out that the 64-bit libhadooppipes.a and libhadooputils.a libraries have to be built by ourselves, because the ones shipped officially are 32-bit.

2. http://guoyunsky.iteye.com/blog/1709654

3. http://blog.csdn.net/keljony/article/details/29872915

4. http://blog.csdn.net/sigxxl/article/details/12293435

Posts 2, 3, and 4 explain how to fix the Hadoop Pipes "Server failed to authenticate" error, which is also the main topic of this article.

The example discussed here is the Hadoop Pipes example from Section 3.5 of Hadoop实战 (Hadoop in Action), 2nd Edition.

(1) Below is the wordcount.cpp code from the book. I typed it in essentially verbatim; for convenience I only added "using namespace std;" near the top and included the <vector> header.

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"
#include <vector>

using namespace std;

const std::string WORDCOUNT = "WORDCOUNT";
const std::string INPUT_WORDS = "INPUT_WORDS";
const std::string OUTPUT_WORDS = "OUTPUT_WORDS";

class WordCountMap : public HadoopPipes::Mapper
{
public:
    HadoopPipes::TaskContext::Counter* inputWords;

    WordCountMap(HadoopPipes::TaskContext& context)
    {
        inputWords = context.getCounter(WORDCOUNT, INPUT_WORDS);
    }

    void map(HadoopPipes::MapContext& context)
    {
        vector<string> words = HadoopUtils::splitString(context.getInputValue(), " ");

        for(unsigned int i = 0; i < words.size(); ++i)
        {
            context.emit(words[i], "1");
        }

        context.incrementCounter(inputWords, words.size());
    }
};

class WordCountReduce : public HadoopPipes::Reducer
{
public:
    HadoopPipes::TaskContext::Counter* outputWords;

    WordCountReduce(HadoopPipes::TaskContext& context)
    {
        outputWords = context.getCounter(WORDCOUNT, OUTPUT_WORDS);
    }

    void reduce(HadoopPipes::ReduceContext& context)
    {
        int sum = 0;
        while(context.nextValue())
        {
            sum += HadoopUtils::toInt(context.getInputValue());
        }

        context.emit(context.getInputKey(), HadoopUtils::toString(sum));
        context.incrementCounter(outputWords, 1);
    }
};

int main(int argc, char* argv[])
{
    return HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>());
}

(2) Below is the Makefile as given in the book.

HADOOP_INSTALL=/usr/lib/hadoop-1.2.1
PLATFORM=Linux-i386-32
CC=g++
CPPFLAGS= -m32 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include
wordcount:	wordcount.cpp
	$(CC) ${CPPFLAGS} $< -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@
clean:
	rm -f *.o wordcount

The libraries used by default are 32-bit. If you are on a 64-bit system, change

PLATFORM=Linux-i386-32

to

PLATFORM=Linux-amd64-64

(The book is slightly misleading here: it says "if you have an AMD CPU, use Linux-amd64-64", when it should say "if your system is 64-bit, use Linux-amd64-64".) Also change

CPPFLAGS= -m32 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include

to

CPPFLAGS= -m64 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include

After these changes, running make fails with the following linker errors:

g++ -m64 -I/usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/include wordcount.cpp -Wall -L/usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o wordcount
/usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/libhadooppipes.a(HadoopPipes.o): In function `HadoopPipes::BinaryProtocol::createDigest(std::string&, std::string&)':
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:426: undefined reference to `EVP_sha1'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:426: undefined reference to `HMAC_Init'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:427: undefined reference to `HMAC_Update'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:429: undefined reference to `HMAC_Final'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:430: undefined reference to `HMAC_CTX_cleanup'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:436: undefined reference to `BIO_f_base64'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:436: undefined reference to `BIO_new'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:437: undefined reference to `BIO_s_mem'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:437: undefined reference to `BIO_new'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:438: undefined reference to `BIO_push'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:439: undefined reference to `BIO_write'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:440: undefined reference to `BIO_ctrl'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:441: undefined reference to `BIO_ctrl'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:446: undefined reference to `BIO_free_all'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:426: undefined reference to `EVP_sha1'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:426: undefined reference to `HMAC_Init'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:427: undefined reference to `HMAC_Update'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:429: undefined reference to `HMAC_Final'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:430: undefined reference to `HMAC_CTX_cleanup'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:436: undefined reference to `BIO_f_base64'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:436: undefined reference to `BIO_new'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:437: undefined reference to `BIO_s_mem'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:437: undefined reference to `BIO_new'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:438: undefined reference to `BIO_push'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:439: undefined reference to `BIO_write'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:440: undefined reference to `BIO_ctrl'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:441: undefined reference to `BIO_ctrl'
/usr/lib/hadoop-1.2.1/src/c++/pipes/impl/HadoopPipes.cc:446: undefined reference to `BIO_free_all'
collect2: error: ld returned 1 exit status
make: *** [wordcount] Error 1

This error occurs because two linker options, -lcrypto and -lssl, are missing from the compile command. Change

	$(CC) ${CPPFLAGS} $< -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@

to

	$(CC) ${CPPFLAGS} $< -lcrypto -lssl -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@

With this change, make compiles cleanly. My Makefile is attached below; it contains the modifications above, and for convenience it also includes a clean target.

HADOOP_INSTALL=/usr/lib/hadoop-1.2.1
PLATFORM=Linux-amd64-64
CC=g++
CPPFLAGS= -m64 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include
wordcount:	wordcount.cpp
	$(CC) ${CPPFLAGS} $< -lcrypto -lssl -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@
clean:
	rm -f *.o wordcount

(3) Run the generated wordcount program.

Upload the generated wordcount binary to the bin directory on HDFS. Use the following commands to create the bin directory and upload the file:

hadoop dfs -mkdir bin
hadoop dfs -put wordcount bin 

Upload the required input files to the input directory on HDFS, and delete the output directory on HDFS if it already exists. These two steps are simple, so the exact commands are not repeated here; a sketch is given below.
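A minimal sketch, assuming the input files are named file01 and file02 in the current local directory (these names are hypothetical; substitute your own data):

hadoop dfs -mkdir input           # skip if the input directory already exists
hadoop dfs -put file01 input      # upload the input files
hadoop dfs -put file02 input
hadoop dfs -rmr output            # remove a stale output directory, if any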

Run the wordcount program stored in the HDFS bin directory with the following command:

hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input input -output output -program bin/wordcount

At this point the following error appears:

[email protected]:~/MapReduce/wordcount_cpp$ ~/hadoop-1.1.2/bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input /user/xxl/input/file0* -output /user/xxl/output/outputfile -program bin/wordcount
13/10/04 22:29:21 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/10/04 22:29:22 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/10/04 22:29:22 WARN snappy.LoadSnappy: Snappy native library not loaded
13/10/04 22:29:22 INFO mapred.FileInputFormat: Total input paths to process : 2
13/10/04 22:29:22 INFO mapred.JobClient: Running job: job_201310041509_0017
13/10/04 22:29:23 INFO mapred.JobClient:  map 0% reduce 0%
13/10/04 22:29:32 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000000_0, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000000_0: Server failed to authenticate. Exiting
13/10/04 22:29:32 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000001_0, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000001_0: Server failed to authenticate. Exiting
13/10/04 22:29:40 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000000_1, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000000_1: Server failed to authenticate. Exiting
13/10/04 22:29:40 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000001_1, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000001_1: Server failed to authenticate. Exiting
13/10/04 22:29:48 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000000_2, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000000_2: Server failed to authenticate. Exiting
13/10/04 22:29:48 INFO mapred.JobClient: Task Id : attempt_201310041509_0017_m_000001_2, Status : FAILED
java.io.IOException
	at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
	at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
	at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
	at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201310041509_0017_m_000001_2: Server failed to authenticate. Exiting
13/10/04 22:29:59 INFO mapred.JobClient: Job complete: job_201310041509_0017
13/10/04 22:29:59 INFO mapred.JobClient: Counters: 7
13/10/04 22:29:59 INFO mapred.JobClient:   Job Counters
13/10/04 22:29:59 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=65416
13/10/04 22:29:59 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/10/04 22:29:59 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/10/04 22:29:59 INFO mapred.JobClient:     Launched map tasks=8
13/10/04 22:29:59 INFO mapred.JobClient:     Data-local map tasks=8
13/10/04 22:29:59 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/10/04 22:29:59 INFO mapred.JobClient:     Failed map tasks=1
13/10/04 22:29:59 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201310041509_0017_m_000000
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1327)
	at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
	at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
	at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)

The key message here is "Server failed to authenticate". The last three posts listed at the beginning of this article deal mainly with this problem, but each of their solutions is slightly incomplete, so below are the steps I actually followed.

The root cause is linking against the wrong libraries: the ones shipped with Hadoop are 32-bit by default, so on a 64-bit system you have to build them yourself and put them in the right place. Since my system is 64-bit, I needed to build both the pipes library and the utils library.
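If you want to confirm the architecture of the libraries you are currently linking against, something like the following should work (the path matches the Makefile above; adjust it to yours):

objdump -a /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/libhadooppipes.a | grep 'file format'
# Members reported as elf64-x86-64 are 64-bit objects; elf32-i386 means 32-bit.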

1. Build the library needed by pipes

Go to the directory containing the pipes source code, which on my machine is /usr/lib/hadoop-1.2.1/src/c++/pipes, and run the following commands to build the required libhadooppipes.a:

./configure
make install

Building pipes went smoothly. When it finishes, the generated libhadooppipes.a appears in this directory. Next, either copy it to the location referenced by the wordcount Makefile, or change the library path in the Makefile; I chose the former and copied libhadooppipes.a to the directory referenced by the Makefile, /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/. To be safe, I also copied the two headers Pipes.hh and TemplateFactory.hh from /usr/lib/hadoop-1.2.1/src/c++/pipes/api/hadoop/ to /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/include/.
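A sketch of those copy steps, run from /usr/lib/hadoop-1.2.1/src/c++/pipes and assuming libhadooppipes.a was produced in that directory as described above. Because wordcount.cpp includes the headers as "hadoop/Pipes.hh", I assume they belong under the hadoop/ subdirectory of the include path; adjust if your layout differs:

sudo cp libhadooppipes.a /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/
sudo cp api/hadoop/Pipes.hh api/hadoop/TemplateFactory.hh \
     /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/include/hadoop/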

2. Build the library needed by utils

Go to the directory containing the utils source code, which on my machine is /usr/lib/hadoop-1.2.1/src/c++/utils, and run the following commands to build the required libhadooputils.a:

./configure
make install

When building utils, however, ./configure failed with the following output:

[email protected]:/usr/local/hadoop/src/c++/pipes$ ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking for style of include used by make... GNU
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking minix/config.h usability... no
checking minix/config.h presence... no
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... 64
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking for HMAC_Init in -lssl... no
configure: error: Cannot find libssl.so
./configure: line 5166: exit: please: numeric argument required
./configure: line 5166: exit: please: numeric argument required
[email protected]:/usr/local/hadoop/src/c++/pipes$

The key message is "error: Cannot find libssl.so". Running rpm -qa | grep ssl showed that openssl and the related packages were already installed, so the library itself was not missing. The fix I eventually found online is to edit the configure script.

In my copy of configure I changed lines 4744 and 4804 from

LIBS="-lssl $LIBS" 

to

LIBS="-lssl -lcrypto $LIBS"

You should search your own configure script for LIBS="-lssl $LIBS" and change every occurrence; the line numbers may differ from mine. A one-line sketch of this edit follows.
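The same edit done with sed instead of a text editor (back up configure first; the pattern assumes the line looks exactly as shown above):

cp configure configure.bak
sed -i 's/LIBS="-lssl \$LIBS"/LIBS="-lssl -lcrypto $LIBS"/g' configure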

After this change, ./configure passes.

Then run make. Compilation fails again: one of the .cc files uses sleep() and related functions without declaring them. Opening that file and adding #include <unistd.h> to its includes fixes the problem.
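A hypothetical sketch of that fix (the post does not name the file, so impl/SomeFile.cc below is a placeholder; use whatever file your make error points at):

# Suppose make reports that 'sleep' was not declared in impl/SomeFile.cc.
# Prepend the missing header to that file:
sed -i '1i #include <unistd.h>' impl/SomeFile.cc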

Running make again now succeeds and produces the required libhadooputils.a.

When it finishes, the generated libhadooputils.a appears in this directory. As before, either copy it to the location referenced by the wordcount Makefile or change the library path in the Makefile; I chose the former and copied libhadooputils.a to /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/. To be safe, I also copied the two headers SerialUtils.hh and StringUtils.hh from /usr/lib/hadoop-1.2.1/src/c++/utils/api/hadoop/ to /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/include/.
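A sketch of those copy steps, run from /usr/lib/hadoop-1.2.1/src/c++/utils (as with the pipes headers, I assume the utils headers belong under the hadoop/ subdirectory of the include path):

sudo cp libhadooputils.a /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/lib/
sudo cp api/hadoop/SerialUtils.hh api/hadoop/StringUtils.hh \
     /usr/lib/hadoop-1.2.1/c++/Linux-amd64-64/include/hadoop/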

(4) At this point, all of the problems are solved.

Delete the output directory on HDFS, rebuild the wordcount program, and upload it to the bin directory on HDFS again (the old wordcount binary was linked against the old libraries and cannot be used). Then rerun the job to get the expected results:

hadoop dfs -rmr output
make                           # run in the wordcount source directory; rebuilds wordcount via the Makefile
hadoop dfs -rm bin/wordcount   # delete the old wordcount binary from the bin directory on HDFS
hadoop dfs -put wordcount bin  # upload the newly built wordcount binary
hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input input -output output -program bin/wordcount