1. Combiner
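A combiner runs between map and reduce, on the map side. It works like a reducer: it combines map output that shares a key before the data is sent to the reducer, which cuts down the data shuffled over the network. Hadoop may run the combiner zero, one, or several times, so the job must produce the same result without it.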
http://hadooptutorial.wikispaces.com/Custom+combiner
http://wiki.apache.org/hadoop/HadoopMapReduce
http://blog.optimal.io/3-differences-between-a-mapreduce-combiner-and-reducer/
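To make the effect of a combiner concrete, here is a minimal plain-Java sketch of a word count. It does not use the Hadoop API; the class name CombinerSketch and its methods are made up for illustration. It assumes the combine step is the same per-key sum as the reduce step, which is the usual setup for word count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a combiner; no Hadoop classes are involved.
class CombinerSketch {

    // "Map" step of word count: emit (word, 1) for every word in one input split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Sums the values per key. Serves as both the combiner and the reducer here.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }

    // Returns the records that would be sent from the map side to the reducer.
    static List<Map.Entry<String, Integer>> shuffle(List<String> splits, boolean useCombiner) {
        List<Map.Entry<String, Integer>> sent = new ArrayList<>();
        for (String split : splits) {
            List<Map.Entry<String, Integer>> mapOutput = map(split);
            if (useCombiner) {
                // Local aggregation: one record per distinct word in this split.
                sent.addAll(sumByKey(mapOutput).entrySet());
            } else {
                sent.addAll(mapOutput);
            }
        }
        return sent;
    }
}
```

Calling sumByKey on the result of shuffle gives the same final counts with or without the combiner; only the number of records that cross from map to reduce changes. In a real job the combiner class is registered with Job.setCombinerClass.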
2. Partitioner
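A partitioner also sits between map and reduce. It decides which reducer each map output record is sent to, so all records with the same key reach the same reducer. The default HashPartitioner uses the hash of the whole key; a custom partitioner can partition on part of the key instead.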
http://hadooptutorial.wikispaces.com/Custom+partitioner
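The partitioning rule can be sketched in plain Java without the Hadoop API. The first method uses the same arithmetic as Hadoop's default HashPartitioner. The second is a hypothetical custom rule: it assumes a composite key written as "natural#secondary" (a format invented for this example) and partitions on the natural part only. The class and method names are made up as well.

```java
// Plain-Java sketch of partitioning logic; no Hadoop classes are involved.
class PartitionerSketch {

    // Same arithmetic as the default HashPartitioner: clear the sign bit
    // of the hash, then take it modulo the number of reducers.
    static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical custom rule for composite keys of the form "natural#secondary":
    // only the natural part picks the reducer, so every record of one natural key
    // lands on the same reducer regardless of its secondary part.
    static int naturalKeyPartition(String compositeKey, int numReduceTasks) {
        String naturalKey = compositeKey.split("#", 2)[0];
        return hashPartition(naturalKey, numReduceTasks);
    }
}
```

In a real job the same logic would live in the getPartition method of a Partitioner subclass, registered with Job.setPartitionerClass.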
3. Sort and group
The SortComparator decides how map output keys are sorted, while the GroupComparator (grouping comparator) decides which map output keys within the reducer go to the same reduce method call.
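The interplay of the two comparators is easiest to see in a secondary sort. The following plain-Java sketch simulates it without the Hadoop API. The Key class, the two comparators, and reduceCalls are hypothetical names; the sketch assumes a composite key made of a natural key and a secondary field, sorts on both, and groups on the natural key only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java simulation of sort and grouping comparators; no Hadoop classes are involved.
class SortGroupSketch {

    // Hypothetical composite key: a natural key plus a secondary field to order by.
    static final class Key {
        final String natural;
        final int secondary;

        Key(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Sort comparator: order by natural key, then by the secondary field.
    static final Comparator<Key> SORT =
            Comparator.comparing((Key k) -> k.natural).thenComparingInt(k -> k.secondary);

    // Grouping comparator: keys with the same natural key count as equal.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.natural);

    // Returns one list per simulated reduce() call, holding the secondary
    // fields in the order that call would see them.
    static List<List<Integer>> reduceCalls(List<Key> mapOutputKeys) {
        List<Key> sorted = new ArrayList<>(mapOutputKeys);
        sorted.sort(SORT);
        List<List<Integer>> calls = new ArrayList<>();
        Key previous = null;
        for (Key key : sorted) {
            // A new reduce() call starts whenever the grouping comparator sees a change.
            if (previous == null || GROUP.compare(previous, key) != 0) {
                calls.add(new ArrayList<>());
            }
            calls.get(calls.size() - 1).add(key.secondary);
            previous = key;
        }
        return calls;
    }
}
```

Each simulated reduce call receives all records of one natural key, already ordered by the secondary field. In a real job the comparators are registered with Job.setSortComparatorClass and Job.setGroupingComparatorClass.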
4. Whole picture
http://stackoverflow.com/questions/18395998/hadoop-map-reduce-secondary-sorting
Time: 2024-11-08 19:33:06