Related benchmark data published by others: http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
Test dimensions
- Serialization time
- Deserialization time
- Size in bytes
Test code
Prepare the protobuf files
Message.proto code
import "InnerMessage.proto";

package demo;

option java_package = "com.agapple.protobuf.data";
option java_outer_classname = "MessageProtos";
option optimize_for = SPEED; // alternatives: CODE_SIZE, LITE_RUNTIME
option java_generic_services = false;

message Message {
    required string strObj = 1 [default = "hello"];
    optional int32 int32Obj = 2;
    optional int64 int64Obj = 3;
    optional uint32 uint32Obj = 4;
    optional uint64 uint64Obj = 5;
    optional sint32 sint32Obj = 6;
    optional sint64 sint64Obj = 7;
    optional fixed32 fixed32Obj = 8;
    optional fixed64 fixed64Obj = 9;
    optional sfixed32 sfixed32Obj = 10;
    optional sfixed64 sfixed64Obj = 11;
    optional bool boolObj = 12;
    optional bytes bytesObj = 13;
    optional float folatObj = 14 [deprecated = true];
    repeated double doubleObj = 15 [packed = true];
    // This field must be active: the bean-building code calls setInnerMessage(...)
    optional InnerMessage innerMessage = 16;
}
InnerMessage.proto code
import "EnumType.proto";

package demo;

option java_package = "com.agapple.protobuf.data";
option java_outer_classname = "InnerMessageProtos";

message InnerMessage {
    optional string name = 1 [default = "name"];
    optional int32 id = 2;
    optional EnumType type = 3 [default = UNIVERSAL];
}
EnumType.proto code
package demo;

option java_package = "com.agapple.protobuf.data";
option java_outer_classname = "EnumTypeProtos";

enum EnumType {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
}
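These definitions cover nearly every type protobuf supports, including nested message types, enum types, and the various int, uint, bool, and bytes types.
As for dependencies, Message.proto depends on the InnerMessage type, and InnerMessage in turn contains a custom enum type, EnumType.
For details on how these types are used, see: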
http://code.google.com/intl/zh/apis/protocolbuffers/docs/reference/java-generated.html
http://code.google.com/intl/zh/apis/protocolbuffers/docs/proto.html
Generate the protobuf Java beans
Shell code
cd /home/ljh/work/code/src/main/java
/home/ljh/work/protobuf/bin/protoc --proto_path=com/agapple/protobuf/ --java_out=. com/agapple/protobuf/EnumType.proto
/home/ljh/work/protobuf/bin/protoc --proto_path=com/agapple/protobuf/ --java_out=. com/agapple/protobuf/InnerMessage.proto
/home/ljh/work/protobuf/bin/protoc --proto_path=com/agapple/protobuf/ --java_out=. com/agapple/protobuf/Message.proto
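The files are compiled with the protoc compiler that ships with protobuf; --proto_path specifies where the .proto files live. Detailed documentation: http://code.google.com/intl/zh/apis/protocolbuffers/docs/proto.html#generating
Running the script generates the three corresponding Java source files: MessageProtos, InnerMessageProtos, and EnumTypeProtos.
Finally, the code that builds the protobuf bean used in the test: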
import com.google.protobuf.ByteString;
import com.agapple.protobuf.data.MessageProtos;
import com.agapple.protobuf.data.InnerMessageProtos;
import com.agapple.protobuf.data.EnumTypeProtos.EnumType;

private static MessageProtos.Message getProtobufBean() {
    MessageProtos.Message.Builder messageBuilder = MessageProtos.Message.newBuilder();

    messageBuilder.setStrObj("message");
    messageBuilder.setFolatObj(1f);
    messageBuilder.addDoubleObj(1d);
    messageBuilder.addDoubleObj(2d);
    messageBuilder.setBoolObj(true);
    messageBuilder.setBytesObj(ByteString.copyFrom(new byte[] { 1, 2, 3 }));
    messageBuilder.setInt32Obj(32);
    messageBuilder.setInt64Obj(64L);
    messageBuilder.setSint32Obj(232);
    messageBuilder.setSint64Obj(264);
    messageBuilder.setFixed32Obj(532);
    messageBuilder.setFixed64Obj(564);
    messageBuilder.setSfixed32Obj(2532);
    messageBuilder.setSfixed64Obj(2564);
    messageBuilder.setUint32Obj(632);
    messageBuilder.setUint64Obj(664);

    // Nested message with an enum field
    InnerMessageProtos.InnerMessage.Builder innerMessageBuilder = InnerMessageProtos.InnerMessage.newBuilder();
    innerMessageBuilder.setId(1);
    innerMessageBuilder.setName("inner");
    innerMessageBuilder.setType(EnumType.PRODUCTS);
    messageBuilder.setInnerMessage(innerMessageBuilder);

    return messageBuilder.build();
}
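Prepare plain POJO beans
Likewise, to allow a fair comparison with JSON, XML, and Java serialization, three plain POJO beans were created: MessagePojo, InnerMessagePojo, and EnumTypePojo.
Their properties match those of the proto beans.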
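The post does not show the POJO classes themselves. The following is only a hypothetical sketch of what such a bean might look like, reduced to a handful of fields; the class name MessagePojoSketch and the choice of Java types are assumptions, not the author's code. It implements Serializable because the same bean is later fed to Java serialization.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the original MessagePojo is not shown in the post.
// Only a few of the sixteen fields are reproduced here.
class MessagePojoSketch implements Serializable {

    private static final long serialVersionUID = 1L;

    private String strObj;                                      // mirrors: string strObj = 1
    private int int32Obj;                                       // mirrors: int32 int32Obj = 2
    private long int64Obj;                                      // mirrors: int64 int64Obj = 3
    private boolean boolObj;                                    // mirrors: bool boolObj = 12
    private byte[] bytesObj;                                    // mirrors: bytes bytesObj = 13
    private List<Double> doubleObj = new ArrayList<Double>();   // mirrors: repeated double doubleObj = 15

    public String getStrObj() { return strObj; }
    public void setStrObj(String strObj) { this.strObj = strObj; }

    public int getInt32Obj() { return int32Obj; }
    public void setInt32Obj(int int32Obj) { this.int32Obj = int32Obj; }

    public long getInt64Obj() { return int64Obj; }
    public void setInt64Obj(long int64Obj) { this.int64Obj = int64Obj; }

    public boolean isBoolObj() { return boolObj; }
    public void setBoolObj(boolean boolObj) { this.boolObj = boolObj; }

    public byte[] getBytesObj() { return bytesObj; }
    public void setBytesObj(byte[] bytesObj) { this.bytesObj = bytesObj; }

    public List<Double> getDoubleObj() { return doubleObj; }
    public void setDoubleObj(List<Double> doubleObj) { this.doubleObj = doubleObj; }
}
```

Plain getters and setters are used so that bean-based libraries such as Jackson can discover the properties without extra configuration.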
Build the bean object
Java code
private static MessagePojo getPojoBean() {
    MessagePojo bean = new MessagePojo();

    bean.setStrObj("message");
    bean.setFolatObj(1f);
    List<Double> doubleObj = new ArrayList<Double>();
    doubleObj.add(1d);
    doubleObj.add(2d);
    bean.setDoubleObj(doubleObj);
    bean.setBoolObj(true);
    bean.setBytesObj(new byte[] { 1, 2, 3 });
    bean.setInt32Obj(32);
    bean.setInt64Obj(64L);
    bean.setSint32Obj(232);
    bean.setSint64Obj(264);
    bean.setFixed32Obj(532);
    bean.setFixed64Obj(564);
    bean.setSfixed32Obj(2532);
    bean.setSfixed64Obj(2564);
    bean.setUint32Obj(632);
    bean.setUint64Obj(664);

    // Nested bean with an enum field, same values as the protobuf bean
    InnerMessagePojo innerMessagePojo = new InnerMessagePojo();
    innerMessagePojo.setId(1);
    innerMessagePojo.setName("inner");
    innerMessagePojo.setType(EnumTypePojo.PRODUCTS);
    bean.setInnerMessage(innerMessagePojo);

    return bean;
}
The actual test code
Define the test callback interface
Java code
interface TestCallback {

    String getName();

    byte[] writeObject(Object source);

    Object readObject(byte[] bytes);
}
The common test template
Java code
private static void testTemplate(TestCallback callback, Object source, int count) {
    int warmup = 10;
    // Warm up first so that the classes involved are loaded and do not affect the measurement
    for (int i = 0; i < warmup; i++) {
        byte[] bytes = callback.writeObject(source);
        callback.readObject(bytes);
    }
    restoreJvm(); // run GC

    // Run the measured test
    long start = System.nanoTime();
    long size = 0L;
    for (int i = 0; i < count; i++) {
        byte[] bytes = callback.writeObject(source);
        size = size + bytes.length;
        callback.readObject(bytes);
        // System.out.println(callback.readObject(bytes));
        bytes = null;
    }
    long nscost = (System.nanoTime() - start);
    // integerFormat is a number formatter defined elsewhere in the test class (not shown in this post)
    System.out.println(callback.getName() + " total cost=" + integerFormat.format(nscost)
                       + "ns , each cost=" + integerFormat.format(nscost / count)
                       + "ns , and byte sizes = " + size / count);
    restoreJvm(); // run GC
}
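The test template uses a warmup phase: the target methods are executed a fixed number of times before measurement starts, so that class loading and JIT compilation do not distort the results. After each template run it also triggers System.gc (via restoreJvm), so that the next round of tests starts from a clean heap.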
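To make the warmup idea concrete, here is a minimal, JDK-only sketch of the same measure-after-warmup pattern, using plain Java serialization as the codec. The class name WarmupTimingSketch and its method names are assumptions for illustration and are not part of the original test. Note that the 10 warmup iterations used in the post are mainly enough to load classes; HotSpot typically compiles a method only after many more invocations, so the sketch lets the caller choose a larger warmup count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Illustrative sketch only; names are hypothetical.
final class WarmupTimingSketch {

    /** Serializes an object with plain JDK serialization. */
    static byte[] write(Object source) {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bout.toByteArray();
    }

    /** Deserializes an object written by write(). */
    static Object read(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns { average ns per round trip, average bytes per message }. */
    static long[] measure(Object source, int warmup, int count) {
        // Warmup: results are discarded; this loads classes and gives the JIT a chance to run
        for (int i = 0; i < warmup; i++) {
            read(write(source));
        }
        System.gc(); // a hint only; the JVM may ignore it

        long totalBytes = 0L;
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            byte[] bytes = write(source);
            totalBytes += bytes.length;
            read(bytes);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { elapsed / count, totalBytes / count };
    }

    public static void main(String[] args) {
        ArrayList<Double> payload = new ArrayList<Double>();
        payload.add(1d);
        payload.add(2d);
        long[] result = measure(payload, 10_000, 100_000);
        System.out.println("each cost=" + result[0] + "ns, bytes=" + result[1]);
    }
}
```

Running it with a small and a large warmup value is a quick way to see how much of the measured time comes from the not-yet-compiled code path.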
The corresponding restoreJvm method:
Java code
import java.lang.management.ManagementFactory;

private static void restoreJvm() {
    int maxRestoreJvmLoops = 10;
    long memUsedPrev = memoryUsed();
    for (int i = 0; i < maxRestoreJvmLoops; i++) {
        System.runFinalization();
        System.gc();

        long memUsedNow = memoryUsed();
        // break early if have no more finalization and get constant mem used
        if ((ManagementFactory.getMemoryMXBean().getObjectPendingFinalizationCount() == 0)
            && (memUsedNow >= memUsedPrev)) {
            break;
        } else {
            memUsedPrev = memUsedNow;
        }
    }
}

private static long memoryUsed() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
}
Finally, the test cases themselves:
Java code
final int testCount = 1000 * 500;
final MessageProtos.Message protoObj = getProtobufBean();
final MessagePojo pojoOBj = getPojoBean();

// Serializable test
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "Serializable Test";
    }

    @Override
    public byte[] writeObject(Object source) {
        try {
            ByteArrayOutputStream bout = new ByteArrayOutputStream();
            ObjectOutputStream output = new ObjectOutputStream(bout);
            output.writeObject(source);
            return bout.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            ByteArrayInputStream bin = new ByteArrayInputStream(bytes);
            ObjectInputStream input = new ObjectInputStream(bin);
            return input.readObject();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
}, pojoOBj, testCount);

// protobuf test
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "protobuf test";
    }

    @Override
    public byte[] writeObject(Object source) {
        if (source instanceof MessageProtos.Message) {
            MessageProtos.Message message = (MessageProtos.Message) source;
            return message.toByteArray();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            return MessageProtos.Message.parseFrom(bytes);
        } catch (InvalidProtocolBufferException e) {
            e.printStackTrace();
        }
        return null;
    }
}, protoObj, testCount);

// JSON test (Jackson)
final ObjectMapper objectMapper = new ObjectMapper();
final JavaType javaType = TypeFactory.type(pojoOBj.getClass());
// JSON configuration not to serialize null field
objectMapper.getSerializationConfig().setSerializationInclusion(JsonSerialize.Inclusion.NON_NULL);
// JSON configuration not to throw exception on empty bean class
objectMapper.getSerializationConfig().disable(SerializationConfig.Feature.FAIL_ON_EMPTY_BEANS);
// JSON configuration for compatibility
objectMapper.configure(Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
objectMapper.configure(Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);

testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "Jackson Test";
    }

    @Override
    public byte[] writeObject(Object source) {
        try {
            return objectMapper.writeValueAsBytes(source);
        } catch (IOException e) {
            // JsonGenerationException and JsonMappingException are IOException subclasses
            e.printStackTrace();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            return objectMapper.readValue(bytes, 0, bytes.length, javaType);
        } catch (IOException e) {
            // JsonParseException and JsonMappingException are IOException subclasses
            e.printStackTrace();
        }
        return null;
    }
}, pojoOBj, testCount);

// XStream test
final XStream xstream = new XStream();
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "Xstream test";
    }

    @Override
    public byte[] writeObject(Object source) {
        return xstream.toXML(source).getBytes();
    }

    @Override
    public Object readObject(byte[] bytes) {
        return xstream.fromXML(new ByteArrayInputStream(bytes));
    }
}, pojoOBj, testCount);
Binary serialization test based on hessian 3.1.5
XML code
<dependency>
    <groupId>com.caucho</groupId>
    <artifactId>hessian</artifactId>
    <version>3.1.5</version>
</dependency>
Three cases were tested:
- Hessian 2 protocol
- Hessian 2 protocol + deflate compression
- Hessian 1 protocol
Test code:
Java code
// Hessian 2 without deflate
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "hessian 2 with no deflat";
    }

    @Override
    public byte[] writeObject(Object source) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            Hessian2Output out = new Hessian2Output(bos);
            // out.startMessage();
            out.writeObject(source);
            // out.completeMessage();
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            ByteArrayInputStream bin = new ByteArrayInputStream(bytes);
            Hessian2Input in = new Hessian2Input(bin);
            // in.startMessage();
            Object obj = in.readObject();
            // in.completeMessage();
            return obj;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}, pojoOBj, testCount);

// Hessian 2 with deflate
final Deflation envelope = new Deflation();
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "hessian 2 with deflat";
    }

    @Override
    public byte[] writeObject(Object source) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            Hessian2Output out = new Hessian2Output(bos);
            out = envelope.wrap(out);
            out.writeObject(source);
            out.flush();
            out.close(); // remember to close, so the compressed stream is completed
            return bos.toByteArray();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            ByteArrayInputStream bin = new ByteArrayInputStream(bytes);
            Hessian2Input in = new Hessian2Input(bin);
            in = envelope.unwrap(in);
            Object obj = in.readObject();
            in.close();
            return obj;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}, pojoOBj, testCount);

// Hessian 1 without deflate
testTemplate(new TestCallback() {

    @Override
    public String getName() {
        return "hessian 1 with no deflat";
    }

    @Override
    public byte[] writeObject(Object source) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            HessianOutput out = new HessianOutput(bos);
            out.writeObject(source);
            out.flush();
            return bos.toByteArray();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    public Object readObject(byte[] bytes) {
        try {
            ByteArrayInputStream bin = new ByteArrayInputStream(bytes);
            HessianInput in = new HessianInput(bin);
            Object obj = in.readObject();
            in.close();
            return obj;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}, pojoOBj, testCount);
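The deflate variant above wraps the Hessian stream in a compression envelope. As a rough illustration of what that extra step costs and buys, the sketch below compresses and restores a byte array with the JDK's java.util.zip classes. It is not Hessian's Deflation envelope; the class name DeflateSketch and its methods are hypothetical and stand in only for the general deflate step.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative sketch only; this is plain java.util.zip, not Hessian's envelope.
final class DeflateSketch {

    /** Compresses the input with the default deflate level. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end(); // release native resources
        return out.toByteArray();
    }

    /** Restores the bytes produced by compress(). */
    static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and not finished means the stream is truncated or needs a dictionary
                if (n == 0 && !inflater.finished()
                    && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported deflate stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid deflate stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = new byte[1000]; // highly repetitive input compresses well
        byte[] packed = compress(raw);
        System.out.println("raw=" + raw.length + " bytes, packed=" + packed.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(raw, decompress(packed)));
    }
}
```

Every message pays for one compress and one decompress call on top of the serializer itself, which is why the compressed variant can only be slower than the uncompressed one.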
Test results
Serialization time comparison
Byte size comparison
The exact numbers:
Metric | protobuf | jackson | xstream | Serializable | hessian2 | hessian2 + deflate | hessian1 |
Serialization (ns) | 1154 | 5421 | 92406 | 10189 | 26794 | 100766 | 29027 |
Deserialization (ns) | 1334 | 8743 | 117329 | 64027 | 37871 | 188432 | 37596 |
Size (bytes) | 97 | 311 | 664 | 824 | 374 | 283 | 495 |
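- protobuf beats every other serialization method tested, in both processing time and space. Its output is roughly 1/9 the size of Java serialization (97 vs. 824 bytes), and it is about an order of magnitude faster, at around 1 µs per operation. Drawback: the object structure is constrained, since it has to be declared up front in a .proto schema, so it is best suited to internal systems.
- JSON still has a size advantage: about 1/2.6 of Java serialization. Its serialization and deserialization times are close to each other, in the 5 to 9 µs range. Note that this test used Jackson; an ordinary json-lib would probably not perform this well, and is likely to be about on par with Java serialization.
- XML has a small, not very significant, size advantage over Java serialization. Serialization takes about an order of magnitude longer than Java serialization, at around 100 µs per operation.
- The long-established Java serialization turned out somewhat disappointing.
- The Hessian results are a bit surprising: its serialized size is larger than JSON's (374 vs. 311 bytes), and it is also slower than Jackson, so it loses on both counts.
- With compression, Hessian saves more than 20% in bytes but is roughly 4 to 5 times slower: a classic trade of time for space.
Overall, Google protobuf is the strongest performer.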
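The ratios quoted above can be re-derived from the results table. The small sketch below does only that arithmetic; the class name ResultRatios and its helper methods are hypothetical, and the constants are copied from the table in this post.

```java
// Illustrative sketch only: recomputes the ratios from the figures in the results table.
final class ResultRatios {

    /** Plain ratio, e.g. how many times larger or slower one value is than another. */
    static double ratio(long numerator, long denominator) {
        return (double) numerator / denominator;
    }

    /** Percentage saved when going from 'before' to 'after'. */
    static double savingPercent(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // Sizes in bytes
        System.out.printf("Java vs protobuf size:            %.1fx%n", ratio(824, 97));
        System.out.printf("Java vs Jackson size:             %.1fx%n", ratio(824, 311));
        System.out.printf("Hessian 2 saving with deflate:    %.1f%%%n", savingPercent(374, 283));
        // Times in ns
        System.out.printf("Java vs protobuf serialize:       %.1fx%n", ratio(10189, 1154));
        System.out.printf("Java vs protobuf deserialize:     %.1fx%n", ratio(64027, 1334));
        System.out.printf("Hessian 2 deflate vs plain (ser): %.1fx%n", ratio(100766, 26794));
        System.out.printf("Hessian 2 deflate vs plain (de):  %.1fx%n", ratio(188432, 37871));
    }
}
```

Worked through by hand, Java serialization is about 8.5 times larger than protobuf and about 2.6 times larger than Jackson, deflate saves about 24% on Hessian 2, and the compressed Hessian 2 variant is about 3.8 times slower to serialize and about 5 times slower to deserialize.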
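Summary
Going forward, protobuf is worth considering for internal systems and for data cache storage, and JSON for interaction with external systems.
Readers who are interested can look into how Google protobuf marshals data: http://code.google.com/intl/zh/apis/protocolbuffers/docs/encoding.html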
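Much of protobuf's compactness comes from the varint encoding described in that guide: integers are written 7 bits at a time, low-order group first, with the high bit of each byte marking that more bytes follow, and sint32/sint64 values are first ZigZag-mapped so that small negative numbers stay small. The sketch below is a simplified illustration of those two ideas, not the library's implementation; the class name VarintSketch and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only; simplified from the encoding guide, not protobuf's own code.
final class VarintSketch {

    /** Encodes an unsigned value as a base-128 varint, low-order group first. */
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // 7 payload bits + continuation bit
            value >>>= 7;
        }
        out.write((int) value); // last byte has the continuation bit clear
        return out.toByteArray();
    }

    /** Decodes a single varint from the start of the array. */
    static long decodeVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift >= 64) {
                throw new IllegalArgumentException("varint too long");
            }
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    /** ZigZag mapping used for sint32: 0, -1, 1, -2, ... become 0, 1, 2, 3, ... */
    static int zigZag32(int n) {
        return (n << 1) ^ (n >> 31);
    }

    public static void main(String[] args) {
        // int32Obj = 32 from the test bean fits in a single payload byte
        System.out.println("varint(32) length: " + encodeVarint(32).length);
        // sint32Obj = 232 is ZigZag-mapped before being written as a varint
        System.out.println("zigzag(232): " + zigZag32(232));
        System.out.println("round trip 300: " + decodeVarint(encodeVarint(300)));
    }
}
```

For example, encodeVarint(300) yields the two bytes 0xAC 0x02, whereas a fixed-width int would always take four. Small field values such as those in the test bean therefore cost only a byte or two each, plus a one-byte tag for low field numbers.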