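A situation we often run into when programming is a duplicate symbol definition. If several object files contain definitions of a global symbol with the same name, linking those object files fails with a multiple-definition error. For example, if object file A and object file B each define a global integer variable named global and both initialize it, the linker reports an error when it links A and B: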
b.o:(.data+0x0): multiple definition of `global'
a.o:(.data+0x0): first defined here
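A symbol definition of this kind is called a strong symbol (Strong Symbol). Other symbol definitions are called weak symbols (Weak Symbol).
In C/C++, the compiler treats functions and initialized global variables as strong symbols by default, and uninitialized global variables as weak symbols. (More precisely, in C an uninitialized global is emitted as a COMMON symbol, which behaves like a weak one, only when -fcommon is in effect. Since GCC 10 the default is -fno-common, under which such a variable is an ordinary strong definition.) We can also use GCC's "__attribute__((weak))" to turn any strong symbol into a weak symbol. Note that "strong" and "weak" describe symbol definitions, not symbol references. Consider the following program: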
extern int ext;
int weak;
int strong = 1;
__attribute__((weak)) int weak2 = 2;
int main()
{
    return 0;
}
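In this program, "weak" and "weak2" are weak symbols, "strong" and "main" are strong symbols, and "ext" is neither strong nor weak, because it is only a reference to an external variable.
Given the notions of strong and weak symbols, the linker handles a global symbol that is defined more than once according to the following rules:
Rule 1: A strong symbol may not be defined more than once (that is, different object files may not contain strong symbols with the same name). If there are several strong definitions, the linker reports a multiple-definition error.
Rule 2: If a symbol is strong in one object file and weak in all the others, the linker chooses the strong symbol.
Rule 3: If a symbol is weak in every object file, the linker chooses the one that occupies the most space. For example, if object file A defines the global variable global as an int (4 bytes) and object file B defines global as a double (8 bytes), then after A and B are linked the symbol global occupies 8 bytes. (Avoid defining the same weak symbol with different types; doing so easily leads to program errors that are very hard to find.)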
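The three rules above can be modeled as a small selection function. This is only a sketch of the decision logic, not how any real linker is implemented: the struct sym_def type, the resolve_symbol function, and the two error codes are hypothetical names introduced for illustration, and ties between equally sized weak definitions are assumed to go to the first one seen.

```c
#include <stddef.h>

/* One definition of a symbol as seen in one object file (hypothetical model). */
struct sym_def {
    int    is_strong;  /* 1 = strong definition, 0 = weak definition */
    size_t size;       /* bytes occupied by the symbol */
};

#define RESOLVE_MULTIPLE_STRONG (-1)  /* rule 1 violated */
#define RESOLVE_NO_DEFINITION   (-2)  /* no definition supplied at all */

/* Returns the index of the chosen definition, or a negative error code. */
int resolve_symbol(const struct sym_def *defs, size_t count)
{
    int strong_index = -1;  /* the single strong definition, if any */
    int largest_weak = -1;  /* the largest weak definition seen so far */

    for (size_t i = 0; i < count; i++) {
        if (defs[i].is_strong) {
            if (strong_index >= 0)
                return RESOLVE_MULTIPLE_STRONG;  /* rule 1: two strong definitions */
            strong_index = (int)i;
        } else if (largest_weak < 0 || defs[i].size > defs[largest_weak].size) {
            largest_weak = (int)i;               /* candidate for rule 3 */
        }
    }

    if (strong_index >= 0)
        return strong_index;                     /* rule 2: strong beats weak */
    if (largest_weak >= 0)
        return largest_weak;                     /* rule 3: largest weak wins */
    return RESOLVE_NO_DEFINITION;
}
```

Passing the int/double example from rule 3 as two weak entries of sizes 4 and 8 selects the second entry, the 8-byte one.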
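Weak References and Strong References
The references to symbols in external object files that we have seen so far must all be resolved correctly when the object files are finally linked into an executable. If no definition of the symbol is found, the linker reports an undefined-symbol error. Such a reference is called a strong reference (Strong Reference). Its counterpart is the weak reference (Weak Reference): if the symbol has a definition, the linker resolves the reference to it; if the symbol is undefined, the linker does not report an error for that reference. The linker handles strong and weak references in almost the same way, except that it does not treat an undefined weak reference as an error. For an undefined weak reference the linker normally substitutes 0, or some other special value, so that program code can detect the situation.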
In GCC, we can declare a reference to an external function as a weak reference by putting the "__attribute__((weak))" extension on its declaration. (The related weakref attribute, described at the end of this section, requires a static declaration and is normally paired with an alias target, so the plain weak attribute is used here.) Consider the following code:
__attribute__((weak)) void foo();
int main()
{
    foo();
}
We can compile this into an executable, and GCC does not report a link error. When we run the executable, however, it fails at run time: when main tries to call foo, the address of foo is 0, so the call is an access to an illegal address. An improved version is:
__attribute__((weak)) void foo();
int main()
{
    if (foo)
        foo();
}
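Weak symbols and weak references are very useful for libraries. A weak symbol defined in a library can be overridden by a strong symbol defined by the user, so that a program can use its own version of a library function. A program can also declare its references to certain extension modules as weak references: when the extension module is linked with the program, the module works normally; when some modules are left out, the program still links and merely lacks the corresponding features. This makes a program's features easier to trim and combine.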
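Both library patterns can be sketched in a few lines of C. The names lib_banner, ext_module_run, and run_with_optional_module are hypothetical and exist only for this illustration; the sketch assumes GCC (or a compatible compiler) on an ELF target.

```c
/* Default implementation shipped by a hypothetical library. Because the
 * definition is weak, a strong lib_banner() in any other object file
 * replaces it at link time without a multiple-definition error. */
__attribute__((weak)) const char *lib_banner(void)
{
    return "default banner";
}

/* Optional extension module: declared weak and not defined in this file.
 * If no linked object defines it, its address resolves to 0. */
__attribute__((weak)) int ext_module_run(int input);

/* Uses the extension when it was linked in, otherwise falls back. */
int run_with_optional_module(int input)
{
    if (ext_module_run)
        return ext_module_run(input);
    return input;  /* fallback: pass the value through unchanged */
}
```

Built on its own, lib_banner returns the default string and run_with_optional_module returns its argument unchanged. Linking an additional object file that defines a strong lib_banner or an ext_module_run changes the behavior without touching this file.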
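On Linux, a program designed to support both single-threaded and multi-threaded operation can use a weak reference to determine whether it was linked against the single-threaded or the multi-threaded Glibc (that is, whether -lpthread was given at build time), and then run its single-threaded or multi-threaded version accordingly. The program declares a weak reference to the pthread_create function and decides at run time, based on whether the pthread library was linked in, which version to execute: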
#include <stdio.h>
#include <pthread.h>
int pthread_create(pthread_t*, const pthread_attr_t*,
                   void* (*)(void*), void*) __attribute__ ((weak));
int main()
{
    if (pthread_create)
    {
        printf("This is multi-thread version!\n");
        // run the multi-thread version
        // main_multi_thread()
    }
    else
    {
        printf("This is single-thread version!\n");
        // run the single-thread version
        // main_single_thread()
    }
    return 0;
}
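Compiling and running it gives:
$ gcc pthread.c -o pt
$ ./pt
This is single-thread version!
$ gcc pthread.c -lpthread -o pt
$ ./pt
This is multi-thread version!
Note that glibc 2.34 and later merge libpthread into libc itself, so on such systems pthread_create is always defined and both builds can be expected to report the multi-thread version.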
The official GCC documentation describes weak and weakref as follows:
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#Function-Attributes
weak
The weak attribute causes the declaration to be emitted as a weak symbol rather than a global. This is primarily useful in defining library functions which can be overridden in user code, though it can also be used with non-function declarations. Weak symbols are supported for ELF targets, and also for a.out targets when using the GNU assembler and linker.
weakref
weakref ("target")
The weakref attribute marks a declaration as a weak reference.
Without arguments, it should be accompanied by an alias attribute naming the target symbol. Optionally, the target may be given as an argument to weakref itself. In either case, weakref implicitly marks the declaration as weak. Without a target, given as an argument to weakref or to alias, weakref is equivalent to weak.
static int x() __attribute__ ((weakref ("y")));
/* is equivalent to... */
static int x() __attribute__ ((weak, weakref, alias ("y")));
/* and to... */
static int x() __attribute__ ((weakref));
static int x() __attribute__ ((alias ("y")));
A weak reference is an alias that does not by itself require a definition to be given for the target symbol. If the target symbol is only referenced through weak references, then it becomes a weak undefined symbol. If it is directly referenced, however, then such strong references prevail, and a definition will be required for the symbol, not necessarily in the same translation unit.
The effect is equivalent to moving all references to the alias to a separate translation unit, renaming the alias to the aliased symbol, declaring it as weak, compiling the two separate translation units and performing a reloadable link on them.
At present, a declaration to which weakref is attached can only be static.
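As a rough illustration of the documentation quoted above, the following sketch uses a static weakref alias to call an external function only when it has been linked in. The names optional_hook, real_hook, and call_hook_if_present are hypothetical, and the sketch assumes GCC on an ELF target.

```c
/* Static alias that weakly refers to the external symbol "real_hook".
 * Nothing in this file defines real_hook, and because it is referenced
 * only through this alias, it becomes a weak undefined symbol. */
static void optional_hook(void) __attribute__((weakref("real_hook")));

/* Returns 1 if real_hook was linked in and called, 0 otherwise. */
int call_hook_if_present(void)
{
    if (optional_hook) {
        optional_hook();
        return 1;
    }
    return 0;
}
```

Because the alias is static, the name optional_hook stays private to this translation unit; only the weak undefined reference to real_hook is visible to the linker.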