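I hit an IOException while modifying the sorting code from section 5.3 of 陆喜恒, Hadoop实战 (2nd edition).
Environment: Mac OS X 10.9.5, IntelliJ IDEA 13.1.5, Hadoop 1.2.1
The exception details are as follows: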
14/10/06 03:08:51 INFO mapred.JobClient: Task Id : attempt_201410021756_0043_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.IntWritable, recieved org.apache.hadoop.io.Text
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1024)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:690)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at main.ch5.ReSort$Map.map(ReSort.java:51)
    at main.ch5.ReSort$Map.map(ReSort.java:43)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
The relevant code is as follows:
public static class Map extends Mapper<LongWritable, Text, IntWritable, Text> {
    // ...
}

public static class Reduce extends Reducer<IntWritable, Text, IntWritable, IntWritable> {
    // ...
}

public static void main(String[] args){
    // ...
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);
    // ...
}
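The Map output types match the Reduce input types. Yet according to the error message, the map output value was expected to be IntWritable while a Text was received, so the two types do not match. The error message also disagrees with the code, because the Map class is declared to emit <IntWritable, Text>.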
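I will leave the root-cause analysis for later, when I read the source code, and fix the problem first. It is enough to configure the job's MapOutputKeyClass and MapOutputValueClass parameters and set them to the Map output types: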
job.setMapOutputKeyClass(/*K2*/IntWritable.class);
job.setMapOutputValueClass(/*V2*/Text.class);
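As a rough illustration of why this setting helps: as far as I know, in Hadoop 1.x the job configuration falls back to the job's final output value class when no map output value class has been set, which would explain why IntWritable was expected here. The sketch below models only that fallback in plain Java. JobModel, collect, the configuration keys, and the stand-in IntWritable and Text classes are all hypothetical names invented for this example; none of it is Hadoop's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the fallback behaviour; NOT Hadoop's real classes.
public class MapOutputFallbackSketch {

    // Stand-ins for the Writable types (hypothetical, for illustration only).
    static class IntWritable {}
    static class Text {}

    // Hypothetical miniature of a job configuration.
    static class JobModel {
        private final Map<String, Class<?>> conf = new HashMap<>();

        void setOutputValueClass(Class<?> c) { conf.put("output.value", c); }
        void setMapOutputValueClass(Class<?> c) { conf.put("map.output.value", c); }

        Class<?> getOutputValueClass() { return conf.get("output.value"); }

        // Assumed behaviour: if the map output value class was never set,
        // use the job's final output value class instead.
        Class<?> getMapOutputValueClass() {
            Class<?> c = conf.get("map.output.value");
            return c != null ? c : getOutputValueClass();
        }
    }

    // Models the kind of type check performed when the mapper writes a value.
    static String collect(JobModel job, Object value) {
        Class<?> expected = job.getMapOutputValueClass();
        if (value.getClass() != expected) {
            return "Type mismatch in value from map: expected "
                    + expected.getSimpleName() + ", received "
                    + value.getClass().getSimpleName();
        }
        return "ok";
    }

    public static void main(String[] args) {
        JobModel job = new JobModel();
        job.setOutputValueClass(IntWritable.class);

        // Map output value class unset: the check compares Text against IntWritable.
        System.out.println(collect(job, new Text()));

        // After the fix: the map output value class is declared explicitly.
        job.setMapOutputValueClass(Text.class);
        System.out.println(collect(job, new Text()));
    }
}
```

In this model the first call reports a mismatch and the second one passes, which matches what I saw before and after adding the two setMapOutput...Class calls. Whether the real implementation follows exactly this path still needs to be confirmed against the Hadoop source.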
Time: 2024-10-07 08:11:42