```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/**
 * Created by user on 16/3/17.
 */
public interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}
```
- Writable defines just two methods: one writes the object's state to a DataOutput binary stream, and the other reads the object's state back from a DataInput binary stream. ("State" here simply means the values of the object's fields, i.e. everything needed to reconstruct the object.)
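To make the "state" idea concrete, here is a minimal sketch of a custom Writable. `IntPairWritable` is a hypothetical name, not a Hadoop class; the Writable interface is repeated here so the sketch compiles standalone. Its state is just two int fields, and `write`/`readFields` must serialize and deserialize the fields in the same order.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Repeated here so the sketch compiles on its own without Hadoop.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical example: a Writable whose "state" is two int fields.
public class IntPairWritable implements Writable {
    private int first;
    private int second;

    // A no-arg constructor is required so the framework can
    // instantiate the object before calling readFields.
    public IntPairWritable() {}

    public IntPairWritable(int first, int second) {
        this.first = first;
        this.second = second;
    }

    public int getFirst()  { return first; }
    public int getSecond() { return second; }

    @Override
    public void write(DataOutput out) throws IOException {
        // Write the fields in a fixed order...
        out.writeInt(first);
        out.writeInt(second);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // ...and read them back in exactly the same order.
        first = in.readInt();
        second = in.readInt();
    }
}
```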
```java
IntWritable writable = new IntWritable(163);

public static byte[] serialize(Writable writable) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataOutputStream dataOut = new DataOutputStream(out);
    writable.write(dataOut);
    dataOut.close();
    return out.toByteArray();
}
```
- This is mainly a test of how an IntWritable is serialized: the helper captures the bytes that `write` produces into an in-memory byte array.
```java
public static byte[] deserialize(Writable writable, byte[] bytes) throws IOException {
    ByteArrayInputStream in = new ByteArrayInputStream(bytes);
    DataInputStream dataIn = new DataInputStream(in);
    writable.readFields(dataIn);
    dataIn.close();
    return bytes;
}
```
- This deserializes an IntWritable: `readFields` repopulates the object's state from the byte array.
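Putting the two helpers together gives a full round trip. This is a self-contained sketch: since Hadoop's IntWritable may not be on the classpath, a minimal stand-in with the same `write`/`readFields` behavior is defined locally.

```java
import java.io.*;

public class SerializationDemo {
    // Minimal stand-ins so the sketch runs without Hadoop on the classpath.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static class IntWritable implements Writable {
        private int value;
        IntWritable() {}
        IntWritable(int value) { this.value = value; }
        int get() { return value; }
        public void write(DataOutput out) throws IOException { out.writeInt(value); }
        public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    static byte[] serialize(Writable writable) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dataOut = new DataOutputStream(out);
        writable.write(dataOut);
        dataOut.close();
        return out.toByteArray();
    }

    static void deserialize(Writable writable, byte[] bytes) throws IOException {
        DataInputStream dataIn = new DataInputStream(new ByteArrayInputStream(bytes));
        writable.readFields(dataIn);
        dataIn.close();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new IntWritable(163));
        System.out.println(bytes.length);   // an int serializes to exactly 4 bytes

        IntWritable copy = new IntWritable();
        deserialize(copy, bytes);
        System.out.println(copy.get());     // 163: the state survived the round trip
    }
}
```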
```java
public interface WritableComparable<T> extends Writable, Comparable<T> {
}

public interface RawComparator<T> extends Comparator<T> {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}
```
- IntWritable implements the WritableComparable interface, which is a subinterface of both Writable and Comparable. Comparison is central to MapReduce: during the sort phase, each key must be compared against the other keys. Hadoop additionally provides RawComparator, which extends java.util.Comparator.
- Implementations of this interface can compare records read straight from a stream, without first deserializing them into objects, which avoids the cost of object creation.
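The raw-comparison idea can be sketched without Hadoop at all. `DataOutput.writeInt` writes big-endian bytes, so a comparator can decode the int directly from the two byte buffers; the `compare(byte[], int, int, byte[], int, int)` signature below mirrors RawComparator's, but the class and helper names are illustrative, not Hadoop APIs.

```java
import java.io.*;

public class RawCompareDemo {
    // Decode a big-endian int straight from the byte array,
    // instead of deserializing into an IntWritable object first.
    static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    // Same shape as RawComparator.compare: offsets and lengths into raw buffers.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Serialize an int the way DataOutput does (big-endian).
    static byte[] toBytes(int v) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new DataOutputStream(out).writeInt(v);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] a = toBytes(163);
        byte[] b = toBytes(67);
        // Compare the serialized forms directly: no objects are created.
        System.out.println(compare(a, 0, 4, b, 0, 4) > 0);  // true: 163 > 67
    }
}
```

This is exactly the saving the note describes: in the sort phase, comparing the raw buffers skips one object allocation and one `readFields` call per comparison.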
Time: 2024-10-07 09:11:30