http://blog.csdn.net/wqf363/article/details/1420554
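If you are writing cross-platform C/C++ code that reads or writes data from files or the network, you need to understand that some issues vary with the language, the compiler, and the platform. The main ones are data alignment, padding, type sizes, byte order, and whether char is signed by default.
Alignment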
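On a given machine, particular kinds of data are aligned on particular boundaries. If data is not correctly aligned, the result can be reduced performance or even a crash. When you read data from an I/O source, make sure it ends up correctly aligned. For details, see my other blog post: "Factors That Affect Byte Alignment"
Padding
Padding is the gap between the elements of an aggregate, usually inserted for the sake of alignment. The amount of padding can differ between compilers and platforms. Do not assume that the size of a struct or the offsets of its members are the same under every compiler and platform. Do not read or write an entire struct in a single operation, because the program that writes it may use different padding from the program that reads it. The same applies to bit-fields.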
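Type sizes
The sizes of data types vary with the compiler and the platform. In C/C++, the sizes of the built-in types are entirely up to the compiler (within certain limits). Do not read or write types whose size is not fixed. In other words, do not read or write bool, enum, long, int, short, float, or double directly. (Translator's note: on Linux, using the cross-platform names below requires including the header <arpa/inet.h>. In addition, C99 added the header stdint.h, which provides a standard, portable set of integer types such as int32_t and uint32_t; it is included by <inttypes.h>.)
Use these | Instead of these |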
int8, uint8 | char, signed char, unsigned char, enum, bool |
int16, uint16 | short, signed short, unsigned short, enum |
int32, uint32 | int, signed int, unsigned int, long, signed long, unsigned long, enum |
int64, uint64 | long, signed long, unsigned long |
int128, uint128 | long long, signed long long, unsigned long long |
float32 | float |
float64 | double |
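One possible way to get the names in the table above is to define them on top of the C99 `<stdint.h>` types and let the compiler check the sizes. This is only a sketch: the typedef names mirror the table, `static_assert` requires C11, and the `float32`/`float64` mappings assume that `float` and `double` are 4 and 8 bytes on the target, which the language does not guarantee. No `int128` is defined because standard C has no portable 128-bit integer type.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* exact-width integer types (C99) */

/* Integer names from the table, mapped to exact-width types. */
typedef int8_t   int8;
typedef uint8_t  uint8;
typedef int16_t  int16;
typedef uint16_t uint16;
typedef int32_t  int32;
typedef uint32_t uint32;
typedef int64_t  int64;
typedef uint64_t uint64;

/* Assumption: float and double are 4 and 8 bytes on this target. */
typedef float  float32;
typedef double float64;

/* Stop the build, rather than corrupt data, if an assumption is wrong. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(float32) == 4, "float32 must be 4 bytes");
static_assert(sizeof(float64) == 8, "float64 must be 8 bytes");
```

Checking at compile time means a port to an unusual platform fails loudly during the build instead of silently producing incompatible files.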
Data Type Ranges
Microsoft C/C++ recognizes the types shown in the table below.
Type Name | Bytes | Other Names | Range of Values |
int | * | signed, signed int | System dependent |
unsigned int | * | unsigned | System dependent |
__int8 | 1 | char, signed char | –128 to 127 |
__int16 | 2 | short, short int, signed short int | –32,768 to 32,767 |
__int32 | 4 | signed, signed int | –2,147,483,648 to 2,147,483,647 |
__int64 | 8 | none | –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
char | 1 | signed char | –128 to 127 |
unsigned char | 1 | none | 0 to 255 |
short | 2 | short int, signed short int | –32,768 to 32,767 |
unsigned short | 2 | unsigned short int | 0 to 65,535 |
long | 4 | long int, signed long int | –2,147,483,648 to 2,147,483,647 |
unsigned long | 4 | unsigned long int | 0 to 4,294,967,295 |
enum | * | none | Same as int |
float | 4 | none | 3.4E +/- 38 (7 digits) |
double | 8 | none | 1.7E +/- 308 (15 digits) |
long double | 10 | none | 1.2E +/- 4932 (19 digits) |
The long double data type (80-bit, 10-byte precision) is mapped directly to double (64-bit, 8-byte precision) in Windows NT and Windows 95.
Signed and unsigned are modifiers that can be used with any integral type. The char type is signed by default, but you can specify /J to make it unsigned by default.
The int and unsigned int types have the size of the system word. This is two bytes (the same as short and unsigned short) in MS-DOS and 16-bit versions of Windows, and 4 bytes in 32-bit operating systems. However, portable code should not depend on the size of int.
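In addition, here is an example of the differences between 32-bit and 64-bit platforms:
On Linux on POWER, the ILP32 model is used in 32-bit environments and the LP64 model in 64-bit environments. The two models differ in the size of long integers and pointers.
A system can have two kinds of data types: base data types and derived data types.
Base data types are all the data types defined by the C and C++ language specifications. The table below compares the base data types on Linux on POWER and Solaris:
Table 4. Base data types (sizes in bits)
| Linux on POWER | Solaris |
Base type | ILP32 | LP64 | ILP32 | LP64 |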
char | 8 | 8 | 8 | 8 |
short | 16 | 16 | 16 | 16 |
int | 32 | 32 | 32 | 32 |
float | 32 | 32 | 32 | 32 |
long | 32 | 64 | 32 | 64 |
pointer | 32 | 64 | 32 | 64 |
long long | 64 | 64 | 64 | 64 |
double | 64 | 64 | 64 | 64 |
long double | 64/128* | 64/128* | 128 | 128 |
Table 5. Derived data types
OS | gid_t | mode_t | pid_t | uid_t | wint_t |
Solaris ILP32 | long | unsigned long | long | long | long |
Solaris LP64 | int | unsigned int | int | int | int |
Linux ILP32 | unsigned int | unsigned int | int | unsigned int | unsigned int |
Linux LP64 | unsigned int | unsigned int | int | unsigned int | unsigned int |
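To see which of the models in Table 4 a given compiler and target actually use, the sizes can be inspected directly. The sketch below is illustrative only: `DataModel`, `classify_data_model`, and `host_data_model` are hypothetical names, and any combination of sizes other than the two models discussed here is simply reported as "other".

```c
#include <assert.h>
#include <stddef.h>

/* The two models discussed in the text, plus a catch-all. */
enum DataModel { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the sizes, in bytes, of int, long,
   and a pointer. */
enum DataModel classify_data_model(size_t int_size, size_t long_size,
                                   size_t ptr_size)
{
    assert(int_size > 0 && long_size > 0 && ptr_size > 0);
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return MODEL_ILP32;   /* int, long, pointer: all 32 bits */
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return MODEL_LP64;    /* long and pointer: 64 bits */
    return MODEL_OTHER;
}

/* Data model of the compiler and target that build this file. */
enum DataModel host_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

A check like this is useful for diagnostics, but code that performs I/O should still rely on fixed-size types rather than branch on the detected model.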
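Byte order
Byte order is the order in which the bytes of a value are stored in memory. Different processors store multi-byte data in different orders. Little-endian processors store the least significant byte first (in other words, the reverse of the order in which the number is written). Big-endian processors store the most significant byte first (the same as the written order). If the byte order of a value differs from that of the processor reading or writing it, the value must be converted first. In addition, a network byte order (big-endian) is defined to standardize the byte order used in network transmission. For details, see my other blog post: "Byte-Order Conversion in Network Communication"
char - signed or unsigned?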
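A little-known fact is that char may be either signed or unsigned by default; this depends entirely on the compiler. As a result, converting a char to another type (such as int) gives results that differ from compiler to compiler. For example: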
char x;
int y;
read( fd, &x, 1 ); // reads one byte whose value is 0xff
y = x; // y is 255 or -1, depending on the compiler
Do not read data into a plain char. State explicitly whether it is signed or unsigned.