Building Hybrid Systems with Boost.Python
Author: David Abrahams
Contact: dave@boost-consulting.com
Organization: Boost Consulting
Date: 2003-03-19
Author: Ralf W. Grosse-Kunstleve
Copyright: Copyright David Abrahams and Ralf W. Grosse-Kunstleve 2003. All rights reserved
Translators: 王志勇, Slowness Chen, 金慶
Translation updated: 2008-05-29
Abstract
Boost.Python is an open source C++ library which provides a concise IDL-like interface for binding C++ classes and functions to Python. Leveraging the full power of C++ compile-time introspection and of recently developed metaprogramming techniques, this is achieved entirely in pure C++, without introducing a new syntax. Boost.Python's rich set of features and high-level interface make it possible to engineer packages from the ground up as hybrid systems, giving programmers easy and coherent access to both the efficient compile-time polymorphism of C++ and the extremely convenient run-time polymorphism of Python.
Introduction
Python and C++ are in many ways as different as two languages could be: while C++ is usually compiled to machine-code, Python is interpreted. Python's dynamic type system is often cited as the foundation of its flexibility, while in C++ static typing is the cornerstone of its efficiency. C++ has an intricate and difficult compile-time meta-language, while in Python, practically everything happens at runtime.
Yet for many programmers, these very differences mean that Python and C++ complement one another perfectly. Performance bottlenecks in Python programs can be rewritten in C++ for maximal speed, and authors of powerful C++ libraries choose Python as a middleware language for its flexible system integration capabilities. Furthermore, the surface differences mask some strong similarities:
- 'C'-family control structures (if, while, for...)
- Support for object-orientation, functional programming, and generic programming (these are both multi-paradigm programming languages.)
- Comprehensive operator overloading facilities, recognizing the importance of syntactic variability for readability and expressivity.
- High-level concepts such as collections and iterators.
- High-level encapsulation facilities (C++: namespaces, Python: modules) to support the design of re-usable libraries.
- Exception-handling for effective management of error conditions.
- C++ idioms in common use, such as handle/body classes and reference-counted smart pointers, mirror Python reference semantics.
Given Python's rich 'C' interoperability API, it should in principle be possible to expose C++ type and function interfaces to Python with an analogous interface to their C++ counterparts. However, the facilities provided by Python alone for integration with C++ are relatively meager. Compared to C++ and Python, 'C' has only very rudimentary abstraction facilities, and support for exception-handling is completely missing. 'C' extension module writers are required to manually manage Python reference counts, which is both annoyingly tedious and extremely error-prone. Traditional extension modules also tend to contain a great deal of boilerplate code repetition which makes them difficult to maintain, especially when wrapping an evolving API.
These limitations have led to the development of a variety of wrapping systems. SWIG is probably the most popular package for the integration of C/C++ and Python. A more recent development is SIP, which was specifically designed for interfacing Python with the Qt graphical user interface library. Both SWIG and SIP introduce their own specialized languages for customizing inter-language bindings. This has certain advantages, but having to deal with three different languages (Python, C/C++ and the interface language) also introduces practical and mental difficulties. The CXX package demonstrates an interesting alternative. It shows that at least some parts of Python's 'C' API can be wrapped and presented through a much more user-friendly C++ interface. However, unlike SWIG and SIP, CXX does not include support for wrapping C++ classes as new Python types.
The features and goals of Boost.Python overlap significantly with many of these other systems. That said, Boost.Python attempts to maximize convenience and flexibility without introducing a separate wrapping language. Instead, it presents the user with a high-level C++ interface for wrapping C++ classes and functions, managing much of the complexity behind-the-scenes with static metaprogramming. Boost.Python also goes beyond the scope of earlier systems by providing:
The key insight that sparked the development of Boost.Python is that much of the boilerplate code in traditional extension modules could be eliminated using C++ compile-time introspection. Each argument of a wrapped C++ function must be extracted from a Python object using a procedure that depends on the argument type. Similarly the function's return type determines how the return value will be converted from C++ to Python. Of course argument and return types are part of each function's type, and this is exactly the source from which Boost.Python deduces most of the information required.
This approach leads to user guided wrapping: as much information is extracted directly from the source code to be wrapped as is possible within the framework of pure C++, and some additional information is supplied explicitly by the user. Mostly the guidance is mechanical and little real intervention is required. Because the interface specification is written in the same full-featured language as the code being exposed, the user has unprecedented power available when she does need to take control.
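Boost.Python performs this deduction at compile time, from the C++ function type. As a loose runtime analogue only, the sketch below uses Python 3 annotations to pick one converter per argument. The names `Unsigned`, `CONVERTERS`, `wrap`, and `repeat` are hypothetical and are not part of Boost.Python.

```python
import inspect


class Unsigned:
    """Hypothetical marker annotation standing in for C++ 'unsigned'."""


def to_unsigned(value):
    # Reject anything that is not a non-negative integer.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise TypeError("expected a non-negative integer")
    return value


def to_str(value):
    if not isinstance(value, str):
        raise TypeError("expected a string")
    return value


# Hypothetical registry: parameter annotation -> argument converter.
CONVERTERS = {Unsigned: to_unsigned, str: to_str}


def wrap(func):
    """Build a checked wrapper by reading the function's own signature."""
    params = inspect.signature(func).parameters.values()
    converters = [CONVERTERS[p.annotation] for p in params]

    def wrapper(*args):
        if len(args) != len(converters):
            raise TypeError("wrong number of arguments")
        return func(*(conv(arg) for conv, arg in zip(converters, args)))

    return wrapper


def repeat(text: str, count: Unsigned):
    return text * count


checked_repeat = wrap(repeat)
```

The point of the sketch is that the author of `repeat` writes no conversion code: the wrapper is derived entirely from information the function already carries.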
Boost.Python Design Goals
The primary goal of Boost.Python is to allow users to expose C++ classes and functions to Python using nothing more than a C++ compiler. In broad strokes, the user experience should be one of directly manipulating C++ objects from Python.
However, it's also important not to translate all interfaces too literally: the idioms of each language must be respected. For example, though C++ and Python both have an iterator concept, they are expressed very differently. Boost.Python has to be able to bridge the interface gap.
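To make the iterator gap concrete, here is a minimal Python 3 sketch. `CppStyleRange` is a hypothetical pure-Python stand-in for a C++ sequence that is traversed with a begin/end pair of positions, and `as_python_iterator` adapts it to Python's iterator protocol. It illustrates the idea only and is not Boost.Python's mechanism.

```python
class CppStyleRange:
    """Hypothetical stand-in for a C++ sequence traversed by positions."""

    def __init__(self, items):
        self._items = list(items)

    def begin(self):
        return 0

    def end(self):
        return len(self._items)

    def deref(self, pos):
        return self._items[pos]


def as_python_iterator(rng):
    """Walk [begin, end) in the C++ style, but yield values the Python way."""
    pos = rng.begin()
    while pos != rng.end():
        yield rng.deref(pos)
        pos += 1


letters = list(as_python_iterator(CppStyleRange("abc")))
```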
It must be possible to insulate Python users from crashes resulting from trivial misuses of C++ interfaces, such as accessing already-deleted objects. By the same token the library should insulate C++ users from the low-level Python 'C' API, replacing error-prone 'C' interfaces like manual reference-count management and raw PyObject pointers with more-robust alternatives.
Support for component-based development is crucial, so that C++ types exposed in one extension module can be passed to functions exposed in another without loss of crucial information like C++ inheritance relationships.
Finally, all wrapping must be non-intrusive, without modifying or even seeing the original C++ source code. Existing C++ libraries have to be wrappable by third parties who only have access to header files and binaries.
Hello Boost.Python World
And now for a preview of Boost.Python, and how it improves on the raw facilities offered by Python. Here's a function we might want to expose:
#include <stdexcept>

char const* greet(unsigned x)
{
    static char const* const msgs[] = { "hello", "Boost.Python", "world!" };

    if (x > 2)
        throw std::range_error("greet: index out of range");

    return msgs[x];
}
To wrap this function in standard C++ using the Python 'C' API, we'd need something like this:
#include <Python.h>

extern "C" // all Python interactions use 'C' linkage and calling convention
{
    // Wrapper to handle argument/result conversion and checking.
    // A METH_VARARGS function receives (self, args).
    PyObject* greet_wrap(PyObject* self, PyObject* args)
    {
        int x;
        if (PyArg_ParseTuple(args, "i", &x))    // extract/check arguments
        {
            char const* result = greet(x);      // invoke wrapped function
            return PyString_FromString(result); // convert result to Python
        }
        return 0;                               // error occurred
    }

    // Table of wrapped functions to be exposed by the module
    static PyMethodDef methods[] = {
        { "greet", greet_wrap, METH_VARARGS, "return one of 3 parts of a greeting" }
        , { NULL, NULL, 0, NULL } // sentinel
    };

    // module initialization function; must be named init<module name>
    DL_EXPORT(void) inithello()
    {
        (void) Py_InitModule("hello", methods); // add the methods to the module
    }
}
Now here's the wrapping code we'd use to expose it with Boost.Python:
#include <boost/python.hpp>
using namespace boost::python;
BOOST_PYTHON_MODULE(hello)
{
    def("greet", greet, "return one of 3 parts of a greeting");
}
and here it is in action:
>>> import hello
>>> for x in range(3):
...     print hello.greet(x)
...
hello
Boost.Python
world!
Aside from the fact that the 'C' API version is much more verbose, it's worth noting a few things that it doesn't handle correctly:
- The original function accepts an unsigned integer, and the Python 'C' API only gives us a way of extracting signed integers. The Boost.Python version will raise a Python exception if we try to pass a negative number to hello.greet, but the other one will proceed to do whatever the C++ implementation does when converting a negative integer to unsigned (usually wrapping to some very large number), and pass the incorrect translation on to the wrapped function.
- That brings us to the second problem: if the C++ greet() function is called with a number greater than 2, it will throw an exception. Typically, if a C++ exception propagates across the boundary with code generated by a 'C' compiler, it will cause a crash. As you can see in the first version, there's no C++ scaffolding there to prevent this from happening. Functions wrapped by Boost.Python automatically include an exception-handling layer which protects Python users by translating unhandled C++ exceptions into a corresponding Python exception.
- A slightly more-subtle limitation is that the argument conversion used in the Python 'C' API case can only get that integer x in one way. PyArg_ParseTuple can't convert Python long objects (arbitrary-precision integers) which happen to fit in an unsigned int but not in a signed long, nor will it ever handle a wrapped C++ class with a user-defined implicit operator unsigned int() conversion. Boost.Python's dynamic type conversion registry allows users to add arbitrary conversion methods.
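The first point in the list above can be observed from Python 3 with the standard ctypes module, which, like a plain C conversion, performs no overflow checking. In this sketch `unchecked_to_unsigned` and `checked_to_unsigned` are hypothetical helper names, and the choice of OverflowError is an assumption; the text only says that Boost.Python raises a Python exception.

```python
import ctypes

# Largest value of a C 'unsigned int' on this platform.
UINT_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_uint)) - 1


def unchecked_to_unsigned(value):
    """Mimic the silent wrap-around of a signed-to-unsigned conversion."""
    return ctypes.c_uint(value).value


def checked_to_unsigned(value):
    """Hypothetical checked conversion that refuses out-of-range input."""
    if not 0 <= value <= UINT_MAX:
        raise OverflowError("value does not fit in unsigned int")
    return value
```

With the unchecked version, -1 becomes `UINT_MAX`, which is far outside the valid index range of `greet`; the checked version reports the problem before the wrapped function is ever called.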
Library Overview
This section outlines some of the library's major features. Except as necessary to avoid confusion, details of library implementation are omitted.
Exposing Classes
C++ classes and structs are exposed with a similarly-terse interface. Given:
#include <string>

struct World
{
    void set(std::string msg) { this->msg = msg; }
    std::string greet() { return msg; }
    std::string msg;
};
The following code will expose it in our extension module:
#include <boost/python.hpp>
using namespace boost::python;
BOOST_PYTHON_MODULE(hello)
{
    class_<World>("World")
        .def("greet", &World::greet)
        .def("set", &World::set)
        ;
}
Although this code has a certain pythonic familiarity, people sometimes find the syntax a bit confusing because it doesn't look like most of the C++ code they're used to. All the same, this is just standard C++. Because of their flexible syntax and operator overloading, C++ and Python are great for defining domain-specific (sub)languages (DSLs), and that's what we've done in Boost.Python. To break it down:
class_<World>("World")
constructs an unnamed object of type class_<World> and passes "World" to its constructor. This creates a new-style Python class called World in the extension module, and associates it with the C++ type World in the Boost.Python type conversion registry. We might have also written:
class_<World> w("World");
but that would've been more verbose, since we'd have to name w again to invoke its def() member function:
w.def("greet", &World::greet)
There's nothing special about the location of the dot for member access in the original example: C++ allows any amount of whitespace on either side of a token, and placing the dot at the beginning of each line allows us to chain as many successive calls to member functions as we like with a uniform syntax. The other key fact that allows chaining is that class_<> member functions all return a reference to *this.
So the example is equivalent to:
class_<World> w("World");
w.def("greet", &World::greet);
w.def("set", &World::set);
It's occasionally useful to be able to break down the components of a Boost.Python class wrapper in this way, but the rest of this article will stick to the terse syntax.
For completeness, here's the wrapped class in use:
>>> import hello
>>> planet = hello.World()
>>> planet.set('howdy')
>>> planet.greet()
'howdy'
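The chaining idiom described in this section, where every def() returns a reference to *this, can be mimicked in a few lines of Python 3. `ClassBuilder` and `def_` are hypothetical names chosen for this sketch (`def` itself is a Python keyword); they only imitate the shape of class_<>, not its behavior.

```python
class ClassBuilder:
    """Hypothetical stand-in for class_<T>: collects methods, then builds a class."""

    def __init__(self, name):
        self.name = name
        self.methods = {}

    def def_(self, name, func):
        self.methods[name] = func
        return self  # returning self is what makes chaining possible

    def build(self):
        return type(self.name, (object,), dict(self.methods))


def set_msg(self, msg):
    self.msg = msg


def greet(self):
    return self.msg


# Leading dots, one call per line, as in the C++ wrapper code.
World = (
    ClassBuilder("World")
    .def_("greet", greet)
    .def_("set", set_msg)
    .build()
)
```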
Constructors
Since our World class is just a plain struct, it has an implicit no-argument (nullary) constructor. Boost.Python exposes the nullary constructor by default, which is why we were able to write:
>>> planet = hello.World()
However, well-designed classes in any language may require constructor arguments in order to establish their invariants. Unlike Python, where __init__ is just a specially-named method, in C++ constructors cannot be handled like ordinary member functions. In particular, we can't take their address: &World::World is an error. The library provides a different interface for specifying constructors. Given:
struct World
{
    World(std::string msg); // added constructor
    ...
we can modify our wrapping code as follows:
class_<World>("World", init<std::string>())
...
Of course, a C++ class may have additional constructors, and we can expose those as well by passing more instances of init<...> to def():
class_<World>("World", init<std::string>())
    .def(init<double, double>())
    ...
Boost.Python allows wrapped functions, member functions, and constructors to be overloaded to mirror C++ overloading.
Data Members and Properties
Any publicly-accessible data members in a C++ class can be easily exposed as either readonly or readwrite attributes:
class_<World>("World", init<std::string>())
    .def_readonly("msg", &World::msg)
    ...
and can be used directly in Python:
>>> planet = hello.World('howdy')
>>> planet.msg
'howdy'
This does not result in adding attributes to the World instance __dict__, which can result in substantial memory savings when wrapping large data structures. In fact, no instance __dict__ will be created at all unless attributes are explicitly added from Python. Boost.Python owes this capability to the new Python 2.2 type system, in particular the descriptor interface and property type.
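The same descriptor machinery is available to pure Python code. The Python 3 sketch below is an analogy only: it uses `__slots__` to suppress the per-instance __dict__ and a read-only property to serve the attribute, which approximates the effect described above without claiming to be how Boost.Python implements it.

```python
class World:
    # __slots__ suppresses the per-instance __dict__; storage lives in a
    # slot descriptor defined on the class instead.
    __slots__ = ("_msg",)

    def __init__(self, msg):
        self._msg = msg

    @property
    def msg(self):
        # Read-only attribute served by a descriptor on the class.
        return self._msg


planet = World("howdy")
```

Reading `planet.msg` goes through the descriptor, and assigning to it raises AttributeError because no setter was supplied, much like an attribute exposed with def_readonly.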
In C++, publicly-accessible data members are considered a sign of poor design because they break encapsulation, and style guides usually dictate the use of "getter" and "setter" functions instead. In Python, however, __getattr__, __setattr__, and since 2.2, property mean that attribute access is just one more well-encapsulated syntactic tool at the programmer's disposal. Boost.Python bridges this idiomatic gap by making Python property creation directly available to users. If msg were private, we could still expose it as an attribute in Python as follows:
class_<World>("World", init<std::string>())
    .add_property("msg", &World::greet, &World::set)
    ...
The example above mirrors the familiar usage of properties in Python 2.2+:
>>> class World(object):
...     def __init__(self, msg):
...         self.__msg = msg
...     def greet(self):
...         return self.__msg
...     def set(self, msg):
...         self.__msg = msg
...     msg = property(greet, set)
Operator Overloading
The ability to write arithmetic operators for user-defined types has been a major factor in the success of both languages for numerical computation, and the success of packages like NumPy attests to the power of exposing operators in extension modules. Boost.Python provides a concise mechanism for wrapping operator overloads. The example below shows a fragment from a wrapper for the Boost rational number library:
class_<rational<int> >("rational_int")
    .def(init<int, int>()) // constructor, e.g. rational_int(3,4)
    .def("numerator", &rational<int>::numerator)
    .def("denominator", &rational<int>::denominator)
    .def(-self)        // __neg__ (unary minus)
    .def(self + self)  // __add__ (homogeneous)
    .def(self * self)  // __mul__
    .def(self + int()) // __add__ (heterogeneous)
    .def(int() + self) // __radd__
    ...
The magic is performed using a simplified application of "expression templates" [VELD1995], a technique originally developed for optimization of high-performance matrix algebra expressions. The essence is that instead of performing the computation immediately, operators are overloaded to construct a type representing the computation. In matrix algebra, dramatic optimizations are often available when the structure of an entire expression can be taken into account, rather than evaluating each operation "greedily". Boost.Python uses the same technique to build an appropriate Python method object based on expressions involving self.
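The core trick, overloading operators so that an expression produces a description of the computation rather than a result, can be sketched in Python 3. In C++ the description is encoded in a type at compile time; here it is an ordinary runtime object. `OperatorSpec`, `SelfPlaceholder`, and `self_` are hypothetical names for this illustration.

```python
class OperatorSpec:
    """Records which special method an expression asks for; nothing is computed."""

    def __init__(self, method_name, other_type=None):
        self.method_name = method_name
        self.other_type = other_type  # None means "same type as self"


class SelfPlaceholder:
    """Hypothetical analogue of the `self` object used in the wrapper code."""

    def _spec(self, method_name, other):
        other_type = None if isinstance(other, SelfPlaceholder) else type(other)
        return OperatorSpec(method_name, other_type)

    def __neg__(self):
        return OperatorSpec("__neg__")

    def __add__(self, other):
        return self._spec("__add__", other)

    def __radd__(self, other):
        return self._spec("__radd__", other)

    def __mul__(self, other):
        return self._spec("__mul__", other)


self_ = SelfPlaceholder()

# The same five expressions as in the rational_int wrapper.
specs = [-self_, self_ + self_, self_ * self_, self_ + int(), int() + self_]
names = [(s.method_name, s.other_type) for s in specs]
```

A def()-like function could then inspect each `OperatorSpec` and install the matching method, which is the role the expression's type plays in the C++ library.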
Inheritance
C++ inheritance relationships can be represented to Boost.Python by adding an optional bases<...> argument to the class_<...> template parameter list as follows:
class_<Derived, bases<Base1,Base2> >("Derived")
...
This has two effects:
- When the class_<...> is created, Python type objects corresponding to Base1 and Base2 are looked up in Boost.Python's registry, and are used as bases for the new Python Derived type object, so methods exposed for the Python Base1 and Base2 types are automatically members of the Derived type. Because the registry is global, this works correctly even if Derived is exposed in a different module from either of its bases.
- C++ conversions from Derived to its bases are added to the Boost.Python registry. Thus wrapped C++ methods expecting (a pointer or reference to) an object of either base type can be called with an object wrapping a Derived instance. Wrapped member functions of class T are treated as though they have an implicit first argument of T&, so these conversions are necessary to allow the base class methods to be called for derived objects.
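The first effect, looking up already-registered base types by their C++ identity and using them as Python bases, can be sketched in pure Python 3. `REGISTRY` and `expose` are hypothetical names, and C++ types are identified here by plain strings; the real registry is internal to the library and keyed on C++ type information.

```python
# Hypothetical global registry: C++ type name -> Python class object.
REGISTRY = {}


def expose(cpp_name, bases=(), methods=None):
    """Create a Python class whose bases are looked up in the registry."""
    python_bases = tuple(REGISTRY[b] for b in bases) or (object,)
    cls = type(cpp_name, python_bases, dict(methods or {}))
    REGISTRY[cpp_name] = cls
    return cls


Base1 = expose("Base1", methods={"hello": lambda self: "hello from Base1"})
Base2 = expose("Base2", methods={"bye": lambda self: "bye from Base2"})

# Derived names its bases only by C++ identity; it could live in another module.
Derived = expose("Derived", bases=("Base1", "Base2"))
```

Because the lookup goes through one shared table, the code exposing `Derived` needs no direct reference to the Python objects for its bases.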
Of course it's possible to derive new Python classes from wrapped C++ class instances. Because Boost.Python uses the new-style class system, that works very much as for the Python built-in types. There is one significant detail in which it differs: the built-in types generally establish their invariants in their __new__ function, so that derived classes do not need to call __init__ on the base class before invoking its methods:
>>> class L(list):
...     def __init__(self):
...         pass
...
>>> L().reverse()
>>>
Because C++ object construction is a one-step operation, C++ instance data cannot be constructed until the arguments are available, in the __init__ function:
>>> class D(SomeBoostPythonClass):
...     def __init__(self):
...         pass
...
>>> D().some_boost_python_method()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: bad argument type for built-in operation
This happened because Boost.Python couldn't find instance data of type SomeBoostPythonClass within the D instance; D's __init__ function masked construction of the base class. It could be corrected by either removing D's __init__ function or having it call SomeBoostPythonClass.__init__(...) explicitly.
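The failure and its fix can be reproduced without any extension module. In this Python 3 sketch, `SomeWrappedClass` is a hypothetical pure-Python stand-in whose instance data exists only after its own __init__ has run, which is the property that matters here.

```python
class SomeWrappedClass:
    """Hypothetical stand-in for a wrapped C++ class."""

    def __init__(self):
        # Stands in for constructing the C++ instance data.
        self._cpp_instance = object()

    def some_method(self):
        if not hasattr(self, "_cpp_instance"):
            raise TypeError("no instance data found")
        return "ok"


class Broken(SomeWrappedClass):
    def __init__(self):
        pass  # masks construction of the base class


class Fixed(SomeWrappedClass):
    def __init__(self):
        # Explicitly construct the base part first.
        SomeWrappedClass.__init__(self)
```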
Virtual Functions
Deriving new types in Python from extension classes is not very interesting unless they can be used polymorphically from C++. In other words, Python method implementations should appear to override the implementation of C++ virtual functions when called through base class pointers/references from C++. Since the only way to alter the behavior of a virtual function is to override it in a derived class, the user must build a special derived class to dispatch a polymorphic class' virtual functions:
//
// interface to wrap:
//
class Base
{
 public:
    virtual int f(std::string x) { return 42; }
    virtual ~Base() {}
};

// Takes a non-const reference because Base::f is a non-const member function
int calls_f(Base& b, std::string x) { return b.f(x); }

//
// Wrapping Code
//

// Dispatcher class
struct BaseWrap : Base
{
    // Store a pointer to the Python object
    BaseWrap(PyObject* self_) : self(self_) {}
    PyObject* self;

    // Default implementation, for when f is not overridden
    int f_default(std::string x) { return this->Base::f(x); }
    // Dispatch implementation
    int f(std::string x) { return call_method<int>(self, "f", x); }
};

...
    def("calls_f", calls_f);
    class_<Base, BaseWrap>("Base")
        .def("f", &Base::f, &BaseWrap::f_default)
        ;
Now here's some Python code which demonstrates:
>>> class Derived(Base):
...     def f(self, s):
...         return len(s)
...
>>> calls_f(Base(), 'foo')
42
>>> calls_f(Derived(), 'forty-two')
9
Things to notice about the dispatcher class:
- The key element which allows overriding in Python is the call_method invocation, which uses the same global type conversion registry as the C++ function wrapping does to convert its arguments from C++ to Python and its return type from Python to C++.
- Any constructor signatures you wish to wrap must be replicated with an initial PyObject* argument.
- The dispatcher must store this argument so that it can be used to invoke call_method.
- The f_default member function is needed when the function being exposed is not pure virtual; there's no other way Base::f can be called on an object of type BaseWrap, since it overrides f.
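Seen from the Python side, the dispatch step amounts to looking a method up by name on the stored object, calling it, and checking that the result has the expected type. The Python 3 sketch below imitates that with a hypothetical `call_method` helper. It is not the library's implementation, and its type check is far simpler than the conversion registry mentioned above.

```python
def call_method(result_type, obj, name, *args):
    """Hypothetical analogue: look up `name` on obj, call it, check the result."""
    result = getattr(obj, name)(*args)
    if not isinstance(result, result_type):
        raise TypeError("cannot convert result to " + result_type.__name__)
    return result


class Base:
    def f(self, x):
        return 42


class Derived(Base):
    def f(self, s):
        return len(s)


def calls_f(b, x):
    # Stand-in for the C++ function that calls f through a Base reference.
    return call_method(int, b, "f", x)
```

Because the lookup happens by name at call time, an override defined in a Python subclass is found automatically, and the base implementation is used otherwise.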
Deeper Reflection on the Horizon?
Admittedly, this formula is tedious to repeat, especially on a project with many polymorphic classes. That it is necessary reflects some limitations in C++'s compile-time introspection capabilities: there's no way to enumerate the members of a class and find out which are virtual functions. At least one very promising project has been started to write a front-end which can generate these dispatchers (and other wrapping code) automatically from C++ headers.
Pyste is being developed by Bruno da Silva de Oliveira. It builds on GCC_XML, which generates an XML version of GCC's internal program representation. Since GCC is a highly-conformant C++ compiler, this ensures correct handling of the most-sophisticated template code and full access to the underlying type system. In keeping with the Boost.Python philosophy, a Pyste interface description is neither intrusive on the code being wrapped, nor expressed in some unfamiliar language: instead it is a 100% pure Python script. If Pyste is successful it will mark a move away from wrapping everything directly in C++ for many of our users. It will also allow us the choice to shift some of the metaprogram code from C++ to Python. We expect that soon, not only our users but the Boost.Python developers themselves will be "thinking hybrid" about their own code. (Translator's note: Pyste is no longer maintained; its successor is Py++.)
Serialization
Serialization is the process of converting objects in memory to a form that can be stored on disk or sent over a network connection. The serialized object (most often a plain string) can be retrieved and converted back to the original object. A good serialization system will automatically convert entire object hierarchies. Python's standard pickle module is just such a system. It leverages the language's strong runtime introspection facilities for serializing practically arbitrary user-defined objects. With a few simple and unintrusive provisions this powerful machinery can be extended to also work for wrapped C++ objects. Here is an example:
    #include <string>

    struct World
    {
        World(std::string a_msg) : msg(a_msg) {}
        std::string greet() const { return msg; }
        std::string msg;
    };

    #include <boost/python.hpp>
    using namespace boost::python;

    // Pickle support: report the constructor arguments needed to rebuild a World.
    struct World_picklers : pickle_suite
    {
        static tuple
        getinitargs(World const& w) { return make_tuple(w.greet()); }
    };

    // Expose World to Python as hello.World, with pickling enabled.
    BOOST_PYTHON_MODULE(hello)
    {
        class_<World>("World", init<std::string>())
            .def("greet", &World::greet)
            .def_pickle(World_picklers())
            ;
    }
Now let's create a World object and put it to rest on disk:

    >>> import hello
    >>> import pickle
    >>> a_world = hello.World("howdy")
    >>> pickle.dump(a_world, open("my_world", "wb"))
In a potentially different script on a potentially different computer with a potentially different operating system:

    >>> import pickle
    >>> resurrected_world = pickle.load(open("my_world", "rb"))
    >>> resurrected_world.greet()
    'howdy'
Of course the cPickle module can also be used for faster processing.
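For readers on current interpreters, here is a minimal sketch of how a script might pick the fastest available pickler. It assumes, beyond what the text states, that the script may run either on Python 2, where cPickle is a separate module, or on Python 3, where that module no longer exists and the standard pickle module uses its C accelerator when one is available. The `payload` dictionary is purely illustrative.

```python
# Prefer the separate C implementation where it exists (Python 2);
# on Python 3 the import fails and the standard module is used instead.
try:
    import cPickle as pickle
except ImportError:
    import pickle

payload = {"greeting": "howdy", "lucky_number": 13}  # illustrative data only

blob = pickle.dumps(payload)    # serialize to an in-memory string/bytes object
restored = pickle.loads(blob)   # rebuild an equal, independent object
```

Either module exposes the same dump/load interface, so the rest of a script does not need to know which one was imported.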
Boost.Python's pickle_suite fully supports the pickle protocol defined in the standard Python documentation. Like a __getinitargs__ function in Python, the pickle_suite's getinitargs() is responsible for creating the argument tuple that will be used to reconstruct the pickled object. The other elements of the Python pickling protocol, __getstate__ and __setstate__, can be optionally provided via C++ getstate and setstate functions. C++'s static type system allows the library to ensure at compile-time that nonsensical combinations of functions (e.g. getstate without setstate) are not used.
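The division of labour between constructor arguments and additional state can be illustrated in pure Python. The sketch below is an analogy only and does not show Boost.Python's internals: it uses `__reduce__`, the general hook that current Python versions provide, to return the two pieces that getinitargs and getstate would supply, and `__setstate__` to restore the second. The class `CountingWorld` and its `visits` attribute are hypothetical names introduced for this example.

```python
import pickle


class CountingWorld(object):
    """Hypothetical stand-in for a wrapped class with extra state."""

    def __init__(self, msg):
        self.msg = msg
        self.visits = 0  # state that the constructor arguments do not capture

    def greet(self):
        self.visits += 1
        return self.msg

    def __reduce__(self):
        init_args = (self.msg,)          # role of getinitargs
        state = {"visits": self.visits}  # role of getstate
        return (CountingWorld, init_args, state)

    def __setstate__(self, state):
        # role of setstate: applied after the object has been re-created
        self.visits = state["visits"]


original = CountingWorld("howdy")
original.greet()
clone = pickle.loads(pickle.dumps(original))
```

The clone is first rebuilt from `init_args` and then handed the saved state, which is why providing a state getter without a matching setter would make no sense.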
Enabling serialization of more complex C++ objects requires a little more work than is shown in the example above. Fortunately the object interface (see next section) greatly helps in keeping the code manageable.
Object interface
Experienced 'C' language extension module authors will be familiar with the ubiquitous PyObject*, manual reference-counting, and the need to remember which API calls return "new" (owned) references or "borrowed" (raw) references. These constraints are not just cumbersome but also a major source of errors, especially in the presence of exceptions.
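To make the bookkeeping concrete, the following sketch observes a reference count from the Python side. It assumes the CPython interpreter, where `sys.getrefcount` is available; the absolute numbers it reports are implementation details, so only the differences are meaningful. The names `payload` and `holder` are illustrative.

```python
import sys

payload = object()
baseline = sys.getrefcount(payload)

holder = [payload]                        # the list takes one more reference
with_holder = sys.getrefcount(payload)

del holder                                # destroying the list releases it again
after_release = sys.getrefcount(payload)
```

In a hand-written 'C' extension every such increment and decrement has to be issued explicitly and kept balanced on every code path, including error paths.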
Boost.Python provides a class object which automates reference counting and provides conversion to Python from C++ objects of arbitrary type. This significantly reduces the learning effort for prospective extension module writers.
Creating an object from any other type is extremely simple:

    object s("hello, world");  // s manages a Python string
object has templated interactions with all other types, with automatic to-python conversions. It happens so naturally that it's easily overlooked:

    object ten_Os = 10 * s[4];  // -> "oooooooooo"
In the example above, 4 and 10 are converted to Python objects before the indexing and multiplication operations are invoked.
The extract<T> class template can be used to convert Python objects to C++ types:

    double x = extract<double>(o);
If a conversion in either direction cannot be performed, an appropriate exception is thrown at runtime.
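A rough Python-side analogy for a failed conversion: `math.sqrt` has to obtain a C double from its argument and raises `TypeError` when it cannot. Treating this as comparable to a failed extract<double> is an assumption made for illustration; the sketch does not reproduce Boost.Python's own exception or message. The helper `try_sqrt` is a hypothetical name.

```python
import math


def try_sqrt(value):
    """Return (True, result) on success, or (False, error message) on failure."""
    try:
        return True, math.sqrt(value)   # needs a value convertible to a C double
    except TypeError as error:
        return False, str(error)


ok_int, root = try_sqrt(16)             # an int converts cleanly
ok_str, message = try_sqrt("sixteen")   # a str does not
```

The failure surfaces only when the offending value is actually encountered, which is what "thrown at runtime" means in practice.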
The object type is accompanied by a set of derived types that mirror the Python built-in types such as list, dict, tuple, etc. as much as possible. This enables convenient manipulation of these high-level types from C++:

    dict d;
    d["some"] = "thing";
    d["lucky_number"] = 13;
    list l = d.keys();
This almost looks and works like regular Python code, but it is pure C++. Of course we can wrap C++ functions which accept or return object instances.
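For comparison, here is the plain Python that the C++ snippets in this section mirror. It is a side-by-side sketch rather than part of the library; the variable names follow the C++ examples, and the use of `float` as a counterpart to extract<double> is an assumption made for illustration.

```python
s = "hello, world"
ten_os = 10 * s[4]              # s[4] is "o", repeated ten times

d = {}
d["some"] = "thing"
d["lucky_number"] = 13
keys = list(d.keys())           # counterpart of `list l = d.keys();`

x = float(d["lucky_number"])    # loosely comparable to extract<double>
```

Apart from the type declarations and semicolons, the two versions read almost identically, which is the point the section makes.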
Thinking hybrid
Because of the practical and mental difficulties of combining programming languages, it is common to settle on a single language at the outset of any development effort. For many applications, performance considerations dictate the use of a compiled language for the core algorithms. Unfortunately, due to the complexity of the static type system, the price we pay for runtime performance is often a significant increase in development time. Experience shows that writing maintainable C++ code usually takes longer and requires far more hard-earned working experience than developing comparable Python code. Even when developers are comfortable working exclusively in compiled languages, they often augment their systems with some type of ad hoc scripting layer for the benefit of their users without ever availing themselves of the same advantages.
Boost.Python enables us to think hybrid. Python can be used for rapidly prototyping a new application; its ease of use and the large pool of standard libraries give us a head start on the way to a working system. If necessary, the working code can be used to discover rate-limiting hotspots. To maximize performance these can be reimplemented in C++, together with the Boost.Python bindings needed to tie them back into the existing higher-level procedure.
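One way to discover such hotspots in a working prototype is the standard library profiler. The sketch below is only an illustration under stated assumptions: the functions `build_points`, `pairwise_sum` and `run` are hypothetical stand-ins for real application code, and the quadratic loop is deliberately chosen to dominate the run time.

```python
import cProfile
import pstats


def build_points(n):
    """Cheap setup step: linear in n."""
    return [(i * 0.5, i * 0.25) for i in range(n)]


def pairwise_sum(points):
    """Quadratic inner loop: the kind of code worth moving to C++."""
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            dx = x1 - x2
            dy = y1 - y2
            total += dx * dx + dy * dy
    return total


def run(n=200):
    return pairwise_sum(build_points(n))


profiler = cProfile.Profile()
profiler.enable()
result = run()
profiler.disable()

# stats.stats maps (file, line, function name) to timing tuples;
# index 2 of each tuple is the time spent inside the function itself.
stats = pstats.Stats(profiler)
hotspot = max(stats.stats.items(), key=lambda item: item[1][2])[0][2]
```

Here `hotspot` names the function with the largest own time. In a hybrid system that function would be the candidate for a C++ reimplementation, while `run` and the surrounding script stay in Python.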
Of course, this top-down approach is less attractive if it is clear from the start that many algorithms will eventually have to be implemented in C++. Fortunately Boost.Python also enables us to pursue a bottom-up approach. We have used this approach very successfully in the development of a toolbox for scientific applications. The toolbox started out mainly as a library of C++ classes with Boost.Python bindings, and for a while the growth was mainly concentrated on the C++ parts. However, as the toolbox is becoming more complete, more and more newly added functionality can be implemented in Python.
This figure shows the estimated ratio of newly added C++ and Python code over time as new algorithms are implemented. We expect this ratio to level out near 70% Python. Being able to solve new problems mostly in Python rather than a more difficult statically typed language is the return on our investment in Boost.Python. The ability to access all of our code from Python allows a broader group of developers to use it in the rapid development of new applications.
Development history
The first version of Boost.Python was developed in 2000 by Dave Abrahams at Dragon Systems, where he was privileged to have Tim Peters as a guide to "The Zen of Python". One of Dave's jobs was to develop a Python-based natural language processing system. Since it was eventually going to be targeting embedded hardware, it was always assumed that the compute-intensive core would be rewritten in C++ to optimize speed and memory footprint [1]. The project also wanted to test all of its C++ code using Python test scripts [2]. The only tool we knew of for binding C++ and Python was SWIG, and at the time its handling of C++ was weak. It would be false to claim any deep insight into the possible advantages of Boost.Python's approach at this point. Dave's interest and expertise in fancy C++ template tricks had just reached the point where he could do some real damage, and Boost.Python emerged as it did because it filled a need and because it seemed like a cool thing to try.
This early version was aimed at many of the same basic goals we've described in this paper, differing most noticeably by having a slightly more cumbersome syntax and by lack of special support for operator overloading, pickling, and component-based development. These last three features were quickly added by Ullrich Koethe and Ralf Grosse-Kunstleve [3], and other enthusiastic contributors arrived on the scene to contribute enhancements like support for nested modules and static member functions.
By early 2001 development had stabilized and few new features were being added. However, a disturbing new fact came to light: Ralf had begun testing Boost.Python on pre-release versions of a compiler using the EDG front-end, and the mechanism at the core of Boost.Python responsible for handling conversions between Python and C++ types was failing to compile. As it turned out, we had been exploiting a very common bug in the implementation of all the C++ compilers we had tested. We knew that as C++ compilers rapidly became more standards-compliant, the library would begin failing on more platforms. Unfortunately, because the mechanism was so central to the functioning of the library, fixing the problem looked very difficult.
Fortunately, later that year Lawrence Berkeley and later Lawrence Livermore National Labs contracted with Boost Consulting for support and development of Boost.Python, and there was a new opportunity to address fundamental issues and ensure a future for the library. A redesign effort began with the low-level type conversion architecture, building in standards-compliance and support for component-based development (in contrast to version 1, where conversions had to be explicitly imported and exported across module boundaries). A new analysis of the relationship between the Python and C++ objects was done, resulting in more intuitive handling for C++ lvalues and rvalues.
The emergence of a powerful new type system in Python 2.2 made the choice of whether to maintain compatibility with Python 1.5.2 easy: the opportunity to throw away a great deal of elaborate code for emulating classic Python classes alone was too good to pass up. In addition, Python iterators and descriptors provided crucial and elegant tools for representing similar C++ constructs. The development of the generalized object interface allowed us to further shield C++ programmers from the dangers and syntactic burdens of the Python 'C' API. A great number of other features including C++ exception translation, improved support for overloaded functions, and most significantly, CallPolicies for handling pointers and references, were added during this period.
In October 2002, version 2 of Boost.Python was released. Development since then has concentrated on improved support for C++ runtime polymorphism and smart pointers. Peter Dimov's ingenious boost::shared_ptr design in particular has allowed us to give the hybrid developer a consistent interface for moving objects back and forth across the language barrier without loss of information. At first, we were concerned that the sophistication and complexity of the Boost.Python v2 implementation might discourage contributors, but the emergence of Pyste and several other significant feature contributions have laid those fears to rest. Daily questions on the Python C++-sig and a backlog of desired improvements show that the library is getting used. To us, the future looks bright.
Conclusions
Boost.Python achieves seamless interoperability between two rich and complementary language environments. Because it leverages template metaprogramming to introspect about types and functions, the user never has to learn a third syntax: the interface definitions are written in concise and maintainable C++. Also, the wrapping system doesn't have to parse C++ headers or represent the type system: the compiler does that work for us.
Computationally intensive tasks play to the strengths of C++ and are often impossible to implement efficiently in pure Python, while jobs like serialization that are trivial in Python can be very difficult in pure C++. Given the luxury of building a hybrid software system from the ground up, we can approach design with new confidence and power.
Citations
[VELD1995] | T. Veldhuizen, "Expression Templates," C++ Report, Vol. 7 No. 5, June 1995, pp. 26-31. http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html |
Footnotes
[1] | In retrospect, it seems that "thinking hybrid" from the ground up might have been better for the NLP system: the natural component boundaries defined by the pure Python prototype turned out to be inappropriate for getting the desired performance and memory footprint out of the C++ core, which eventually caused some redesign overhead on the Python side when the core was moved to C++. |
[2] | We also have some reservations about driving all C++ testing through a Python interface, unless that's the only way it will be ultimately used. Any transition across language boundaries with such different object models can inevitably mask bugs. |
[3] | These features were expressed very differently in v1 of Boost.Python. |