Environment: Win7 + Hadoop 1.2.1 + Eclipse 3.7 + Maven 3
1. Download Hadoop 1.2.1 on Win7.
2. Install the Hadoop plugin for Eclipse. Note that the plugin misbehaves under Eclipse 4.x, so do not use any 4.x release; I used Eclipse Indigo, i.e. version 3.7.
I have a video on installing and configuring the Eclipse plugin: http://pan.baidu.com/s/1sjUhsh3
3. Recompile hadoop-1.2.1\src\core\org\apache\hadoop\fs\FileUtil.java from the Hadoop source tree.
Video on importing the Hadoop source into Eclipse: http://pan.baidu.com/s/1gdVhCOV
As explained in the blog post http://blog.csdn.net/poisonchry/article/details/27535333, FileUtil.java under hadoop-1.2.1\src\core\org\apache\hadoop\fs\ must be modified. The change is:
private static void checkReturnValue(boolean rv, File p, FsPermission permission)
        throws IOException {
    /* The entire method body is commented out:
    if (!rv) {
        throw new IOException("Failed to set permissions of path: " + p
                + " to " + String.format("%04o", permission.toShort()));
    }
    */
}
For some reason, exporting a jar directly from Eclipse kept failing for me, but that does not matter. Go into the Eclipse workspace, open the Hadoop source project, and in its bin folder locate core\org\apache\hadoop\fs\FileUtil.class. Then go to the Hadoop 1.2.1 directory on Win7, open hadoop-core-1.2.1.jar with WinRAR, and copy that FileUtil.class over the corresponding entry in the jar.
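The WinRAR step can also be scripted: a .jar file is an ordinary zip archive, so the JDK's zip FileSystem (Java 7+) can overwrite the single entry in place. This is a minimal sketch; the class name and the paths in main() are hypothetical stand-ins for your own workspace and Hadoop directory.

```java
import java.nio.file.*;

public class PatchJar {
    // Copy the file `clazz` over the entry `entryPath` inside the zip/jar at `jar`.
    public static void patch(Path jar, Path clazz, String entryPath) throws Exception {
        // Open the jar as a zip filesystem; the cast disambiguates the overload.
        try (FileSystem zipFs = FileSystems.newFileSystem(jar, (ClassLoader) null)) {
            Files.copy(clazz, zipFs.getPath(entryPath),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical paths: adjust to your machine.
        patch(Paths.get("C:\\hadoop\\hadoop-1.2.1\\hadoop-core-1.2.1.jar"),
              Paths.get("C:\\workspace\\hadoop-1.2.1\\bin\\org\\apache\\hadoop\\fs\\FileUtil.class"),
              "/org/apache/hadoop/fs/FileUtil.class");
    }
}
```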
4. Create a new Maven project with the following pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>cn.howso</groupId>
  <artifactId>hadoopmaven</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <hadoop.version>1.2.1</hadoop.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>${hadoop.version}</version>
      <scope>system</scope>
      <systemPath>C:\hadoop\hadoop-1.2.1\hadoop-core-1.2.1.jar</systemPath>
    </dependency>
    <dependency>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-mapper-asl</artifactId>
      <version>1.8.8</version>
    </dependency>
    <dependency>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-core-asl</artifactId>
      <version>1.8.8</version>
    </dependency>
    <dependency>
      <groupId>commons-httpclient</groupId>
      <artifactId>commons-httpclient</artifactId>
      <version>3.0.1</version>
    </dependency>
    <dependency>
      <groupId>commons-cli</groupId>
      <artifactId>commons-cli</artifactId>
      <version>1.2</version>
    </dependency>
    <dependency>
      <groupId>commons-configuration</groupId>
      <artifactId>commons-configuration</artifactId>
      <version>1.6</version>
    </dependency>
    <dependency>
      <groupId>org.hamcrest</groupId>
      <artifactId>hamcrest-all</artifactId>
      <version>1.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.mrunit</groupId>
      <artifactId>mrunit</artifactId>
      <version>1.1.0</version>
      <classifier>hadoop2</classifier>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-minicluster</artifactId>
      <version>${hadoop.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-test</artifactId>
      <version>${hadoop.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.sun.jersey</groupId>
      <artifactId>jersey-core</artifactId>
      <version>1.8</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <finalName>hadoopx</finalName>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>2.5</version>
        <configuration>
          <outputDirectory>${basedir}</outputDirectory>
          <archive>
            <manifest>
              <mainClass>hadoopmaven.Driver</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
Naturally, the groupId, artifactId, and so on should be named after your own project; they have no effect on the build itself.
Note the hadoop-core dependency: it is declared with system scope against a local jar, namely the hadoop-core-1.2.1.jar file we just patched.
5. Write any MapReduce job you like (I followed the example in the blog post http://www.cnblogs.com/formyjava/p/5219191.html), then run the Driver class directly, or right-click it and choose Run As -> Run on Hadoop.
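If you want a feel for what such a job computes before touching Hadoop at all, here is the per-record logic of the classic WordCount example extracted as plain Java, so it can be tried without a cluster. The class and method names are illustrative, not the Hadoop API; the linked blog post shows the real Mapper/Reducer/Driver wiring.

```java
import java.util.*;

public class WordCountLogic {
    // What the mapper emits for one input line: a (word, 1) pair per token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String tok : line.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(tok, 1));
            }
        }
        return out;
    }

    // What the reducer does with all counts collected for one word: sum them.
    public static int reduce(Iterable<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }
}
```

The framework's contribution is everything between these two methods: grouping every (word, 1) pair by key across all mappers before handing the values to reduce().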