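protobuf, short for Protocol Buffers, is an efficient data interchange format released by Google. Like XML and Thrift, it is a data interchange protocol (Thrift additionally provides RPC). Where XML is a structured text format, protobuf is a binary format, so it is more efficient to transmit, serialize, and parse, which is the main reason for its popularity.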
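protobuf uses its own compiler to compile protocol files and generate code in the target language, which makes serializing and parsing data convenient. At the time of writing, Google provided implementations for three languages: Java, C++, and Python. Each implementation includes the compiler support and the library files for that language.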
- message
message Order {
  required uint64 uid = 1;
  required float cost = 2;
  optional string tag = 3;
}
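When protobuf compiles this into C++ code, it generates the corresponding XXX.pb.h and XXX.pb.cc. Each message becomes a class that holds the corresponding data members, the functions that operate on them, and the matching serialization and parsing functions.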
class Order : public ::google::protobuf::Message {
 public:
  ...
  // accessors -------------------------------------------------------
  ...
  ::google::protobuf::uint64 uid_;
  ::std::string* tag_;
  float cost_;
};
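required: the field must be set and cannot be empty; otherwise the message is considered "uninitialized". Building an "uninitialized" message throws a RuntimeException, and parsing one throws an IOException (these exception names come from the Java API; in C++, serializing such a message may fail a GOOGLE_CHECK and parsing it returns false). Apart from this, a "required" field behaves exactly like an "optional" field.
optional: the field may be set or left unset.
repeated: the field may be repeated any number of times, including zero. The order of the repeated values is preserved in the protocol buffer. Think of the field as an array whose size is managed automatically.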
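The three field rules can be sketched with a hand-written stand-in class. This is not protoc output: `OrderSketch` and the `items` field are hypothetical, and only the accessor naming (`set_`, `has_`, `add_`, `_size`) and `IsInitialized()` follow the conventions of the generated C++ API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hand-written stand-in for a generated message class; NOT protoc output.
// Hypothetical schema used for illustration:
//   required uint64 uid   = 1;
//   optional string tag   = 2;
//   repeated string items = 3;
class OrderSketch {
 public:
  void set_uid(std::uint64_t v) { uid_ = v; has_uid_ = true; }
  void set_tag(const std::string& v) { tag_ = v; has_tag_ = true; }
  // A repeated field is appended to and keeps insertion order.
  void add_items(const std::string& v) { items_.push_back(v); }

  bool has_uid() const { return has_uid_; }
  bool has_tag() const { return has_tag_; }
  int items_size() const { return static_cast<int>(items_.size()); }
  const std::string& items(int index) const { return items_[index]; }

  // Only required fields decide whether the message is initialized;
  // optional and repeated fields may stay empty.
  bool IsInitialized() const { return has_uid_; }

 private:
  std::uint64_t uid_ = 0;
  std::string tag_;
  std::vector<std::string> items_;
  bool has_uid_ = false;
  bool has_tag_ = false;
};
```

A message with only `items` filled in is still uninitialized; it becomes initialized once `set_uid()` has been called, whether or not `tag` is set.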
- Enums
enum Corpus {
  UNIVERSAL = 0;
  WEB = 1;
  IMAGES = 2;
  LOCAL = 3;
  NEWS = 4;
  PRODUCTS = 5;
  VIDEO = 6;
}
Running protoc --cpp_out=. enum_test.proto generates the following C++ code:
enum Corpus {
  UNIVERSAL = 0,
  WEB = 1,
  IMAGES = 2,
  LOCAL = 3,
  NEWS = 4,
  PRODUCTS = 5,
  VIDEO = 6
};
- Basic data types
- message in detail:
//xxx.proto
message Order {
  required uint64 uid = 1;
  required float cost = 2;
  optional string tag = 3;
}

//xxx.pb.h
class Order : public ::google::protobuf::Message {
 public:
  ...
  // accessors -------------------------------------------------------

  // required uint64 uid = 1;
  inline bool has_uid() const;
  inline void clear_uid();
  static const int kUidFieldNumber = 1;
  inline ::google::protobuf::uint64 uid() const;
  inline void set_uid(::google::protobuf::uint64 value);

  // required float cost = 2;
  inline bool has_cost() const;
  inline void clear_cost();
  static const int kCostFieldNumber = 2;
  inline float cost() const;
  inline void set_cost(float value);

  // optional string tag = 3;
  inline bool has_tag() const;
  inline void clear_tag();
  static const int kTagFieldNumber = 3;
  inline const ::std::string& tag() const;
  inline void set_tag(const ::std::string& value);
  inline void set_tag(const char* value);
  inline void set_tag(const char* value, size_t size);
  inline ::std::string* mutable_tag();
  inline ::std::string* release_tag();
  inline void set_allocated_tag(::std::string* tag);

  // @@protoc_insertion_point(class_scope:Order)
 private:
  inline void set_has_uid();
  inline void clear_has_uid();
  inline void set_has_cost();
  inline void clear_has_cost();
  inline void set_has_tag();
  inline void clear_has_tag();

  ::google::protobuf::uint32 _has_bits_[1];

  ::google::protobuf::uint64 uid_;
  ::std::string* tag_;
  float cost_;
};
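For every data member of a message, protobuf automatically generates the related accessor functions. The main ones for each field are has_uid(), clear_uid(), uid(), and set_uid(). They check whether the field has been set, clear the field together with its "set" record, read the field, and set the field, respectively. For the uid field in the example, the functions are implemented as follows: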
//xxx.pb.h
// required uint64 uid = 1;
inline bool Order::has_uid() const {
  return (_has_bits_[0] & 0x00000001u) != 0;
}
inline void Order::set_has_uid() {
  _has_bits_[0] |= 0x00000001u;
}
inline void Order::clear_has_uid() {
  _has_bits_[0] &= ~0x00000001u;
}
inline void Order::clear_uid() {
  uid_ = GOOGLE_ULONGLONG(0);
  clear_has_uid();
}
inline ::google::protobuf::uint64 Order::uid() const {
  // @@protoc_insertion_point(field_get:Order.uid)
  return uid_;
}
inline void Order::set_uid(::google::protobuf::uint64 value) {
  set_has_uid();
  uid_ = value;
  // @@protoc_insertion_point(field_set:Order.uid)
}
::google::protobuf::uint32 _has_bits_[1];
The bits of _has_bits_ record whether each field has been set: the masks 0x01, 0x02, 0x04, ... mark whether the 1st, 2nd, 3rd, ... field has been set.
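The presence bitmap can be tried out in isolation with a small stand-in. This is a minimal sketch, not the generated code: `HasBitsSketch` and its method names are hypothetical, and it assumes a message with at most 32 fields so that a single 32-bit word is enough.

```cpp
#include <cstdint>

// Standalone sketch of a presence bitmap like _has_bits_[0];
// hypothetical helper, not protoc output.
class HasBitsSketch {
 public:
  // index is the zero-based position of the field (0 for the 1st field,
  // 1 for the 2nd, ...); it must be in the range 0..31.
  void Set(int index) { bits_ |= Mask(index); }
  void Clear(int index) { bits_ &= ~Mask(index); }
  bool Test(int index) const { return (bits_ & Mask(index)) != 0; }
  std::uint32_t raw() const { return bits_; }

 private:
  // Field 0 -> 0x01, field 1 -> 0x02, field 2 -> 0x04, ...
  static std::uint32_t Mask(int index) { return std::uint32_t{1} << index; }
  std::uint32_t bits_ = 0;
};
```

Setting the 1st and 3rd fields leaves `raw()` at 0x05; clearing the 1st field again leaves 0x04, which is the same arithmetic that `set_has_uid()` and `clear_has_uid()` perform with the constant 0x00000001u.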
// Serialization ---------------------------------------------------
// Methods for serializing in protocol buffer format. Most of these
// are just simple wrappers around ByteSize() and SerializeWithCachedSizes().

// Write a protocol buffer of this message to the given output. Returns
// false on a write error. If the message is missing required fields,
// this may GOOGLE_CHECK-fail.
bool SerializeToCodedStream(io::CodedOutputStream* output) const;
// Like SerializeToCodedStream(), but allows missing required fields.
bool SerializePartialToCodedStream(io::CodedOutputStream* output) const;

// Write the message to the given zero-copy output stream. All required
// fields must be set.
bool SerializeToZeroCopyStream(io::ZeroCopyOutputStream* output) const;
bool SerializePartialToZeroCopyStream(io::ZeroCopyOutputStream* output) const;

// Serialize the message and store it in the given string. All required
// fields must be set.
bool SerializeToString(string* output) const;
bool SerializePartialToString(string* output) const;

// Serialize the message and store it in the given byte array. All required
// fields must be set.
bool SerializeToArray(void* data, int size) const;
bool SerializePartialToArray(void* data, int size) const;

string SerializeAsString() const;
string SerializePartialAsString() const;

// Like SerializeToString(), but appends to the data to the string's existing
// contents. All required fields must be set.
bool AppendToString(string* output) const;
bool AppendPartialToString(string* output) const;

// Serialize the message and write it to the given file descriptor. All
// required fields must be set.
bool SerializeToFileDescriptor(int file_descriptor) const;
bool SerializePartialToFileDescriptor(int file_descriptor) const;

// Serialize the message and write it to the given C++ ostream. All
// required fields must be set.
bool SerializeToOstream(ostream* output) const;
bool SerializePartialToOstream(ostream* output) const;
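To make concrete what these methods write, here is a hand-rolled encoder for the Order message that follows the documented protobuf wire format (a key of `(field_number << 3) | wire_type`, base-128 varints, little-endian 32-bit floats, length-prefixed strings). It is only a sketch: `AppendVarint` and `AppendOrderSketch` are hypothetical names, the library's real implementation is different, and the code assumes `float` is a 32-bit IEEE 754 value. Like AppendToString(), it appends to the existing contents of the output string.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Append a base-128 varint: 7 payload bits per byte, least-significant
// group first, high bit set on every byte except the last.
void AppendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Hand-rolled encoder (hypothetical, not the library code) for:
//   required uint64 uid = 1; required float cost = 2; optional string tag = 3;
// Pass tag == nullptr to leave the optional field unset.
void AppendOrderSketch(std::uint64_t uid, float cost, const std::string* tag,
                       std::string* out) {
  static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

  out->push_back(static_cast<char>(0x08));  // key: field 1, wire type 0 (varint)
  AppendVarint(uid, out);

  out->push_back(static_cast<char>(0x15));  // key: field 2, wire type 5 (32-bit)
  std::uint32_t bits = 0;
  std::memcpy(&bits, &cost, sizeof(bits));
  for (int i = 0; i < 4; ++i) {             // little-endian byte order
    out->push_back(static_cast<char>((bits >> (8 * i)) & 0xFF));
  }

  if (tag != nullptr) {
    out->push_back(static_cast<char>(0x1A));  // key: field 3, wire type 2 (length-delimited)
    AppendVarint(static_cast<std::uint64_t>(tag->size()), out);
    out->append(*tag);
  }
}
```

With uid = 1, cost = 1.0f, and tag = "a", the sketch appends the 10 bytes 08 01 15 00 00 80 3F 1A 01 61. Unlike SerializeToString(), it never checks that required fields are present, since the caller always passes uid and cost explicitly.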