A Google Protobuf Network Transport Scheme with Automatic Message-Type Reflection

陳碩 (giantchen_AT_gmail)

Blog.csdn.net/Solstice t.sina.com.cn/giantchen

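This article addresses one problem: after receiving protobuf data, how do we automatically create the concrete Protobuf Message object and then deserialize into it? "Automatically" means that when a new protobuf Message type is added to the program, this part of the code needs no change and no message type has to be registered by hand. In fact, Google Protobuf itself has strong reflection support and can create a Message object of a concrete type from its type name; we can simply use it.

This article assumes the reader knows what Google Protocol Buffers is; it is not a protobuf tutorial.

The examples are in C++. Other languages probably have similar solutions; additions are welcome.

The sample code for this article is at: https://github.com/chenshuo/recipes/tree/master/protobuf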

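Two problems when using protobuf in network programming

Google Protocol Buffers (Protobuf) is an excellent library. It defines a compact, extensible binary message format that is particularly well suited to network data transfer. It provides bindings for many languages, which greatly eases the development of distributed programs and frees a system from being written in a single language.

Using protobuf in network programming requires solving two problems:

1. Length. Data packed by protobuf carries no length information or terminator, so the application must split the byte stream correctly itself when sending and receiving.
2. Type. Data packed by protobuf carries no type information, so the sender must pass the type to the receiver, which creates the concrete Protobuf Message object and then deserializes into it.

The first problem is easy to solve. The usual approach is to put a fixed-size length header in front of every message, as in the LengthHeaderCodec I implemented in 《Muduo 網絡編程示例之二： Boost.Asio 的聊天服務器》 ("Muduo Network Programming Examples, Part 2: A Boost.Asio Chat Server"); the code is at http://code.google.com/p/muduo/source/browse/trunk/examples/asio/chat/codec.h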
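To make the length-header idea concrete, here is a minimal sketch of splitting a TCP byte stream into messages. It is not the LengthHeaderCodec from muduo; the function names frame() and extractMessages() are made up for illustration, and the 4-byte big-endian header is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Prepend a 4-byte length header to a payload.
// Assumption: the header is big-endian (network byte order).
std::string frame(const std::string& payload)
{
  const uint32_t n = static_cast<uint32_t>(payload.size());
  std::string out;
  out.push_back(static_cast<char>((n >> 24) & 0xFF));
  out.push_back(static_cast<char>((n >> 16) & 0xFF));
  out.push_back(static_cast<char>((n >> 8) & 0xFF));
  out.push_back(static_cast<char>(n & 0xFF));
  out += payload;
  return out;
}

// Extract every complete message from buf and erase the consumed bytes.
// An incomplete trailing message stays in buf until more data arrives.
std::vector<std::string> extractMessages(std::string& buf)
{
  std::vector<std::string> messages;
  std::size_t pos = 0;
  while (buf.size() - pos >= 4)
  {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(buf.data() + pos);
    const uint32_t n = (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
                       (uint32_t(p[2]) << 8) | uint32_t(p[3]);
    if (buf.size() - pos - 4 < n)
      break;  // body not fully received yet
    messages.push_back(buf.substr(pos + 4, n));
    pos += 4 + static_cast<std::size_t>(n);
  }
  buf.erase(0, pos);
  return messages;
}
```

The receiver appends whatever recv() returns to buf and calls extractMessages() after each read; a message that straddles two reads is simply returned on the later call.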

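The second problem is also easy to solve, because Protobuf has built-in support for it. Strangely, though, a quick web search turned up many home-grown hacks.

Home-grown hacks

All of them put a header in front of the protobuf data, containing an int length and type information. The hacks for the type information come in two main kinds:

1. Put an int typeId in the header; the receiver uses a switch-case to select the message type and handler.
2. Put a string typeName in the header; the receiver uses a look-up table to select the message type and handler.

Both approaches have problems.

The first approach requires typeId to be unique and to map one-to-one to protobuf message types. If the messages are used only in a narrow scope, for example when sender and receiver are both programs you maintain yourself, uniqueness is easy to guarantee with nothing more than a version control tool. If the messages are used widely, for example across a whole company where distributed programs built by different departments may talk to each other, then a company-wide authority is needed to allocate typeIds, and every new message type has to be registered with it, which is a nuisance.

The second approach is somewhat better. Uniqueness of typeName is easy to arrange, because the package name can be included (that is, use the message's fully qualified type name); as long as departments agree on namespaces in advance, names will neither clash nor repeat. But every new message type still requires a manual change to the code that initializes the look-up table, which is a nuisance.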
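For illustration, the hand-maintained look-up table of the second approach might look roughly like the sketch below. Msg, Query and Answer are hypothetical stand-ins rather than protobuf-generated classes, and createByName() is a made-up name.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for generated message classes (not protobuf types).
struct Msg
{
  virtual ~Msg() {}
  virtual std::string name() const = 0;
};
struct Query : Msg
{
  std::string name() const override { return "muduo.Query"; }
};
struct Answer : Msg
{
  std::string name() const override { return "muduo.Answer"; }
};

typedef std::unique_ptr<Msg> (*Creator)();

template <typename T>
std::unique_ptr<Msg> create()
{
  return std::unique_ptr<Msg>(new T);
}

// The hand-maintained table: every new message type needs a new line here.
const std::map<std::string, Creator>& creatorTable()
{
  static const std::map<std::string, Creator> table = {
    {"muduo.Query", &create<Query>},
    {"muduo.Answer", &create<Answer>},
    // A type that is missing from this list cannot be created by name.
  };
  return table;
}

// Returns a null pointer when typeName was never added to the table.
std::unique_ptr<Msg> createByName(const std::string& typeName)
{
  const auto it = creatorTable().find(typeName);
  if (it == creatorTable().end())
    return nullptr;
  return it->second();
}
```

The weak spot is the initializer list: nothing forces it to stay in sync with the .proto files, so a forgotten entry only shows up at run time as a message the receiver cannot create.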

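In fact, there is no need to reinvent the wheel: protobuf already ships with a solution.

Creating Message objects by reflection from the type name

Google Protobuf itself has strong reflection support and can create a Message object of a concrete type from its type name. Strangely, the official tutorial does not explicitly mention this usage, and I suspect many people are unaware of it, so I thought it worth a blog post.

Below is a Protobuf class diagram drawn by 陳碩.

[Figure: protobuf_classdiagram]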

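I suspect most people care about and use the left half of the diagram: MessageLite, Message, the generated message types (Person, AddressBook) and so on, and pay less attention to the right half: Descriptor, DescriptorPool, MessageFactory.

The class that plays the key role in the diagram is Descriptor: each concrete Message type has one corresponding Descriptor object. Although we never call its member functions directly, Descriptor is the bridge in "creating a Message object of a concrete type from its type name". The red arrows in the diagram trace that creation process, which is described in detail below.

How it works, in brief

The Protobuf Message class uses the prototype pattern: Message declares a virtual function New() that returns a new instance whose type is the same as the dynamic type of the object it is called on. In other words, given a Message* pointer, we can create an object of the same concrete Message type without knowing what that type is.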
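The prototype pattern itself can be shown without protobuf. In the sketch below, Message, Query and Answer are hypothetical stand-ins that mimic only the New() part of the real interface, and cloneType() is a made-up helper.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// Hypothetical stand-in for google::protobuf::Message.
class Message
{
 public:
  virtual ~Message() {}
  // Returns a new, default-constructed object of the same dynamic type.
  virtual Message* New() const = 0;
};

class Query : public Message
{
 public:
  Query* New() const override { return new Query; }  // covariant return type
};

class Answer : public Message
{
 public:
  Answer* New() const override { return new Answer; }
};

// Knows nothing about Query or Answer, yet creates an object of the right type.
std::unique_ptr<Message> cloneType(const Message& prototype)
{
  return std::unique_ptr<Message>(prototype.New());
}
```

cloneType() compiles against the base class alone, which is exactly why a generic codec can create messages without including any generated header.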

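Each concrete Message type has a default instance, obtainable through ConcreteMessage::default_instance() or through MessageFactory::GetPrototype(const Descriptor*). So the problem becomes: 1. how to get hold of a MessageFactory; 2. how to get hold of a Descriptor*.

Of course, ConcreteMessage::descriptor() returns the Descriptor* we want. But how do we call a static member function of ConcreteMessage when we do not know what ConcreteMessage is? It looks like a chicken-and-egg problem.

Our hero is DescriptorPool, which can look up a Descriptor* by type name. All we need is the right DescriptorPool and a call to DescriptorPool::FindMessageTypeByName(const string& type_name). See the light?

Before solving the problem for good, let's run a simple test to check that what I said above is right.

A simple test

The proto file used as the example in this article is query.proto; see https://github.com/chenshuo/recipes/blob/master/protobuf/query.proto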

package muduo;

message Query {
  required int64 id = 1;
  required string questioner = 2;
  repeated string question = 3;
}

message Answer {
  required int64 id = 1;
  required string questioner = 2;
  required string answerer = 3;
  repeated string solution = 4;
}

message Empty {
  optional int32 id = 1;
}

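Query.questioner and Answer.answerer are the process identifiers I described in my previous article, 《分布式系統中的進程標識》 ("Process Identification in Distributed Systems").

The following code verifies the invariants that hold among ConcreteMessage::default_instance(), ConcreteMessage::descriptor(), MessageFactory::GetPrototype() and DescriptorPool::FindMessageTypeByName():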

https://github.com/chenshuo/recipes/blob/master/protobuf/descriptor_test.cc#L15

  // Excerpt from a test function. Descriptor, DescriptorPool, Message and
  // MessageFactory live in namespace google::protobuf; this excerpt assumes
  // they, cout and endl are visible through using-declarations.
  typedef muduo::Query T;

  // type name -> Descriptor: the pool returns the very same Descriptor object.
  std::string type_name = T::descriptor()->full_name();
  cout << type_name << endl;
  const Descriptor* descriptor = DescriptorPool::generated_pool()->FindMessageTypeByName(type_name);
  assert(descriptor == T::descriptor());
  cout << "FindMessageTypeByName() = " << descriptor << endl;
  cout << "T::descriptor()         = " << T::descriptor() << endl;
  cout << endl;

  // Descriptor -> prototype: the factory returns the default instance.
  const Message* prototype = MessageFactory::generated_factory()->GetPrototype(descriptor);
  assert(prototype == &T::default_instance());
  cout << "GetPrototype()        = " << prototype << endl;
  cout << "T::default_instance() = " << &T::default_instance() << endl;
  cout << endl;

  // prototype -> new object of the same concrete type, owned by the caller.
  T* new_obj = dynamic_cast<T*>(prototype->New());
  assert(new_obj != NULL);
  assert(new_obj != prototype);
  assert(typeid(*new_obj) == typeid(T::default_instance()));
  cout << "prototype->New() = " << new_obj << endl;
  cout << endl;
  delete new_obj;

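The key code for creating a Message automatically from its type name

Everything is in place; let's get to work:

1. Use DescriptorPool::generated_pool() to obtain a DescriptorPool object; it contains every protobuf Message type linked into the program at build time.
2. Use DescriptorPool::FindMessageTypeByName() to look up the Descriptor by type name.
3. Use MessageFactory::generated_factory() to obtain a MessageFactory object; it can create every protobuf Message type linked into the program at build time.
4. Use MessageFactory::GetPrototype() to find the default instance of the concrete Message type.
5. Finally, use prototype->New() to create the object.

Sample code: https://github.com/chenshuo/recipes/blob/master/protobuf/codec.h#L69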

  Message* createMessage(const std::string& typeName)
  {
    Message* message = NULL;
    const Descriptor* descriptor = DescriptorPool::generated_pool()->FindMessageTypeByName(typeName);
    if (descriptor)
    {
      const Message* prototype = MessageFactory::generated_factory()->GetPrototype(descriptor);
      if (prototype)
      {
        message = prototype->New();
      }
    }
    return message;
  }

Usage: https://github.com/chenshuo/recipes/blob/master/protobuf/descriptor_test.cc#L49

  Message* newQuery = createMessage("muduo.Query");
  assert(newQuery != NULL);
  assert(typeid(*newQuery) == typeid(muduo::Query::default_instance()));
  cout << "createMessage(\"muduo.Query\") = " << newQuery << endl;

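The ancients did not deceive me :-)

Note that createMessage() returns a pointer to a dynamically allocated object. The caller is responsible for deleting it; otherwise memory leaks. In muduo I use shared_ptr<Message> to manage the lifetime of Message objects automatically.

Thread safety

According to Google's documentation, the MessageFactory and DescriptorPool functions we use are thread-safe, and so is Message::New(). Moreover, they are all const member functions.

With the key problem solved, what remains is to design a protobuf transport format that carries the length and the message type.

Protobuf transport format

陳碩 designed a simple format containing the protobuf data together with its length and type information, plus a check sum at the end of the message. The format is shown in the figure below; each box in the figure is 32 bits wide.

[Figure: protobuf_wireformat1]

In C struct pseudocode: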

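  struct ProtobufTransportFormat __attribute__ ((__packed__))
  {
    int32_t  len;                          // number of bytes that follow this field
    int32_t  nameLen;                      // length of typeName, including the trailing '\0'
    char     typeName[nameLen];
    char     protobufData[len-nameLen-8];  // 8 = sizeof nameLen + sizeof checkSum
    int32_t  checkSum; // adler32 of nameLen, typeName and protobufData
  };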

Note that this format does not require 32-bit alignment; our decoder handles unaligned messages automatically.
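The layout above can be exercised without protobuf by treating the serialized message as opaque bytes. The following is a minimal sketch, not the codec.h from the repository: encode(), decode(), appendInt32() and readInt32() are made-up names, big-endian integers are an assumption (the byte order is not stated here), and Adler-32 is written out by hand so the block has no zlib dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Adler-32 as specified in RFC 1950; zlib's adler32() computes the same value.
uint32_t adler32(const std::string& data)
{
  uint32_t a = 1, b = 0;
  for (unsigned char c : data)
  {
    a = (a + c) % 65521;
    b = (b + a) % 65521;
  }
  return (b << 16) | a;
}

// Append a 32-bit value in big-endian order (the byte order is an assumption).
void appendInt32(std::string* out, uint32_t v)
{
  for (int shift = 24; shift >= 0; shift -= 8)
    out->push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a 32-bit big-endian value byte by byte, so alignment never matters.
uint32_t readInt32(const std::string& buf, std::size_t pos)
{
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v = (v << 8) | static_cast<unsigned char>(buf[pos + i]);
  return v;
}

// typeName is sent with its trailing '\0'; payload is already-serialized bytes.
std::string encode(const std::string& typeName, const std::string& payload)
{
  std::string body;  // nameLen + typeName + protobufData: the checksummed part
  appendInt32(&body, static_cast<uint32_t>(typeName.size() + 1));
  body.append(typeName.c_str(), typeName.size() + 1);
  body += payload;

  std::string out;
  appendInt32(&out, static_cast<uint32_t>(body.size() + 4));  // len excludes itself
  out += body;
  appendInt32(&out, adler32(body));
  return out;
}

// Returns true and fills the outputs if buf holds exactly one valid message.
bool decode(const std::string& buf, std::string* typeName, std::string* payload)
{
  if (buf.size() < 4) return false;
  const std::size_t len = readInt32(buf, 0);
  // Smallest message: nameLen (4) + one-letter name plus '\0' (2) + checkSum (4).
  if (len < 10 || buf.size() != 4 + len) return false;
  const std::string body = buf.substr(4, len - 4);
  if (readInt32(buf, len) != adler32(body)) return false;  // last 4 bytes
  const std::size_t nameLen = readInt32(body, 0);
  if (nameLen < 2 || nameLen > body.size() - 4) return false;
  if (body[4 + nameLen - 1] != '\0') return false;
  *typeName = body.substr(4, nameLen - 1);
  *payload = body.substr(4 + nameLen);
  return true;
}
```

In a real codec the payload would come from Message::SerializeToString() on the sending side, and on the receiving side typeName would be handed to createMessage() before parsing the payload into the new object.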

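Example

Packing a muduo.Query object in this format gives:

[Figure: protobuf_wireexample]

Design decisions

These were my considerations when designing this transport format:

signed int. The length fields use signed 32-bit ints rather than unsigned ints for portability, because Java has no unsigned types. Besides, Protobuf is generally used to pack data smaller than 1 MB, so the extra range of an unsigned int would bring no benefit.
check sum. Although TCP is a reliable transport protocol and Ethernet has a CRC-32 check, network transmission must still allow for data corruption; for critical network applications a check sum is indispensable. With a compact binary format such as protobuf, corruption cannot be spotted by eye, so a check sum is needed.
adler32. I chose adler32 rather than the more common CRC-32 because it is cheaper to compute and faster, with strength roughly comparable to CRC-32. In addition, both zlib and java.util.zip support it directly, so we do not have to implement it ourselves.
type name ends with '\0'. This is for ease of troubleshooting: in a packet captured with tcpdump the type name is easy to spot by eye, without counting bytes according to nameLen. At the same time, nameLen is included for the receiver's convenience; it saves a strlen(), trading space for time.
No version number. A notable strength of Protobuf Message is that optional fields make a protocol version number unnecessary (anyone who puts a version number in a protobuf Message has not understood protobuf's design), so that the programs on both sides can be upgraded independently and the system can evolve. Adding a version number to my transport format would be superfluous. For details see page 57, "Choice of message format", of my 《分布式系統的工程化開發方法》 ("An Engineering Approach to Developing Distributed Systems").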

Sample code

For simplicity, std::string is used as the output of packing; this is for illustration only.

Encoding code: https://github.com/chenshuo/recipes/blob/master/protobuf/codec.h#L35

Decoding code: https://github.com/chenshuo/recipes/blob/master/protobuf/codec.h#L99

Test code: https://github.com/chenshuo/recipes/blob/master/protobuf/codec_test.cc

If the code above compiles but fails at run time with "cannot open shared object file", running sudo ldconfig usually fixes it, provided libprotobuf.so is in /usr/local/lib and /etc/ld.so.conf lists that directory.

$ make all # if you have boost installed, you can run make whole

$ ./codec_test
./codec_test: error while loading shared libraries: libprotobuf.so.6: cannot open shared object file: No such file or directory

$ sudo ldconfig

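Integration with muduo

The muduo network library will integrate support for the transport format described here (expected in version 0.1.9). I will write a separate short article on converting between Protobuf Message and muduo::net::Buffer. Packing with muduo::net::Buffer is even simpler than the std::string code above, since it is a buffer class designed specifically for non-blocking network libraries.

In addition, we can write a codec that performs the conversion automatically, just as asio/chat/codec.h does. Client code then receives Message objects directly and sends Message objects directly, without ever dealing with Buffer objects.

Message dispatching

So far we have solved automatic message creation. Another common task in network programming is dispatching Messages of different types to different handler functions, which can likewise be done with the help of Descriptor. In muduo I implemented two dispatchers, ProtobufDispatcherLite and ProtobufDispatcher, with which users register handlers for different message types. They are expected in version 0.1.9; you can have an early look: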

Basic version, where the user does the down-casting: https://github.com/chenshuo/recipes/blob/master/protobuf/dispatcher_lite.cc

Advanced version, which uses template techniques to save the user typing: https://github.com/chenshuo/recipes/blob/master/protobuf/dispatcher.cc
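The general shape of such a dispatcher can be sketched without protobuf. The code below is not muduo's ProtobufDispatcher: Message, Query and Answer are hypothetical stand-ins, and the handler map is keyed by a type-name string where the text above suggests using the Descriptor. The template wrapper illustrates how the down-cast can be hidden from user code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for google::protobuf::Message.
struct Message
{
  virtual ~Message() {}
  virtual std::string typeName() const = 0;
};
struct Query : Message
{
  std::string typeName() const override { return "muduo.Query"; }
  std::string question;
};
struct Answer : Message
{
  std::string typeName() const override { return "muduo.Answer"; }
  std::string solution;
};

class Dispatcher
{
 public:
  typedef std::function<void(const Message&)> Callback;

  // defaultCallback handles every type that has no registered handler.
  explicit Dispatcher(Callback defaultCallback)
    : defaultCallback_(std::move(defaultCallback)) {}

  // T must be default-constructible, only so that we can ask for its type name.
  template <typename T>
  void registerCallback(std::function<void(const T&)> cb)
  {
    callbacks_[T().typeName()] = [cb](const Message& m) {
      cb(static_cast<const T&>(m));  // safe: the key guarantees the dynamic type
    };
  }

  void onMessage(const Message& m) const
  {
    const auto it = callbacks_.find(m.typeName());
    if (it != callbacks_.end())
      it->second(m);
    else
      defaultCallback_(m);
  }

 private:
  std::map<std::string, Callback> callbacks_;
  Callback defaultCallback_;
};
```

User handlers take const Query& or const Answer& directly, so the one static_cast inside registerCallback() is the only cast in the program.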

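Protobuf RPC based on muduo?

Google Protobuf also supports RPC, but unfortunately it provides only a framework and the networking code was not open-sourced; muduo could fill exactly that gap. I have not yet decided whether muduo should also support RPC with protobuf messages as the message format. muduo still has a lot to be done and I have many blog posts I want to write, so RPC will have to wait.

Note: Remote Procedure Call (RPC) has a narrow and a broad meaning. In the narrow sense it usually refers specifically to ONC RPC, the thing used to implement NFS. In the broad sense, anything that "does network communication in the guise of a function call" can be called RPC, for example Java RMI, .Net Remoting, Apache Thrift, libevent RPC, XML-RPC and so on.

(To be continued)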

Posted on 2011-04-03 15:56 by 陳碩
