If you have to maintain ported code that uses functions like sscanf, you may build the format strings with special macros that expand into the necessary size specifiers. Here is an example of a macro that helps create portable code for various systems:
// PR_SIZET on Win64 = "I"
// PR_SIZET on Win32 = ""
// PR_SIZET on Linux64 = "z"
// ...
size_t u;
scanf("%" PR_SIZET "u", &u);
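As a minimal sketch, the macro could be defined along the following lines. The original shows only the values PR_SIZET is expected to take, so the preprocessor conditions and the helper functions FormatSize and ParseSize are assumptions made for illustration. Compilers that support the C99/C++11 format strings also accept "%zu" directly, which makes such a macro unnecessary there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical definition: the source only lists the expected values.
#if defined(_WIN64)
#  define PR_SIZET "I"   // Microsoft-specific size prefix
#elif defined(_WIN32)
#  define PR_SIZET ""    // size_t has the same size as unsigned here
#else
#  define PR_SIZET "z"   // standard C99 size prefix
#endif

// Formats a size_t value using the portable specifier.
std::string FormatSize(std::size_t value) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%" PR_SIZET "u", value);
    return std::string(buf);
}

// Parses a size_t value; returns false if the text is not a number.
bool ParseSize(const char *text, std::size_t *out) {
    assert(text != nullptr && out != nullptr);
    return std::sscanf(text, "%" PR_SIZET "u", out) == 1;
}
```

Keeping the specifier in one macro means that a new platform requires a change in one place only, not in every format string.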
Here is one more example. Strange as it looks, the code below, shown in abridged form, was used in the UNDO/REDO subsystem of a real application:
// Here the pointers were saved in the form of a string
int *p1, *p2;
....
char str[128];
sprintf(str, "%X %X", p1, p2);
// In another function this string was processed
// in this way:
void foo(char *str)
{
int *p1, *p2;
sscanf(str, "%X %X", &p1, &p2);
// The result is incorrect values of pointers p1 and p2.
...
}
Manipulating the pointers with "%X" resulted in incorrect program behavior on a 64-bit system. This example shows how dangerous the depths of large and complex projects written over many years can be. If your project is rather large and old, you might encounter very interesting fragments like this one.
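If pointers really have to travel through a string, the "%p" specifier is the one intended for them. The sketch below is not the code of the application mentioned above; the function names SavePointers and LoadPointers are hypothetical. It relies on the C standard's guarantee that a value written with "%p" can be read back with "%p" during the same program execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes both pointers into str; returns false if the buffer is too small.
bool SavePointers(char *str, std::size_t size, int *p1, int *p2) {
    assert(str != nullptr);
    int n = std::snprintf(str, size, "%p %p",
                          static_cast<void *>(p1), static_cast<void *>(p2));
    return n > 0 && static_cast<std::size_t>(n) < size;
}

// Reads the pointers back; returns false if the string cannot be parsed.
bool LoadPointers(const char *str, int **p1, int **p2) {
    assert(str != nullptr && p1 != nullptr && p2 != nullptr);
    void *v1 = nullptr;
    void *v2 = nullptr;
    if (std::sscanf(str, "%p %p", &v1, &v2) != 2)
        return false;
    *p1 = static_cast<int *>(v1);
    *p2 = static_cast<int *>(v2);
    return true;
}
```

The text produced by "%p" is implementation-defined, so such strings should not be stored in files or passed between different programs.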
Diagnosis
Types that change their size on a 64-bit system, i.e. memsize-types, are dangerous when passed to functions with a variable number of arguments. The PVS-Studio static analyzer warns the programmer about such types with the V111 diagnostic warning.
If the types of the arguments have not changed their sizes, the code is considered correct and no warnings are generated. Here is an example of code that is correct from the analyzer's viewpoint:
printf("%d", 10*5);
CString str;
size_t n = sizeof(float);
str.Format(StrFormat, static_cast<int>(n));
The course authors: Andrey Karpov (karpov@viva64.com), Evgeniy Ryzhkov (evg@viva64.com).
The rightholder of the course "Lessons on development of 64-bit C/C++ applications" is OOO "Program Verification Systems". The company develops software in the sphere of source program code analysis. The company's site: http://www.viva64.com.
Contacts: e-mail: support@viva64.com, Tula, 300027, PO box 1800.
Lesson 11. Pattern 3. Shift operations
It is easy to make a mistake in code that works with separate bits. The pattern of 64-bit errors under consideration relates to shift operations. Here is an example of code:
ptrdiff_t SetBitN(ptrdiff_t value, unsigned bitNum) {
ptrdiff_t mask = 1 << bitNum;
return value | mask;
}
This code works well on a 32-bit architecture and lets you set any bit with a number from 0 to 31 to one. After porting the program to a 64-bit platform you need to set bits from 0 to 63, but this code will never set the bits numbered 32-63. Note that the numeric literal "1" has int type, so an overflow occurs when it is shifted by 32 positions, as shown in Figure 1. As a result, we get 0 (Figure 1-B) or 1 (Figure 1-C), depending on the compiler implementation. Formally, shifting an int by a number of bits equal to or greater than its width is undefined behavior in C++, which is why the outcome varies.
Figure 1 - a) Correct setting of the 31-st bit in a 32-bit code; b,c) - Incorrect setting of the 32-nd bit on a 64-bit system (two variants of behavior)
To correct the code, we must give the constant "1" the same type as the variable mask:
ptrdiff_t mask = ptrdiff_t(1) << bitNum;
Note also that the non-corrected code will lead to one more interesting error. When setting the 31-st bit on a 64-bit system, the function's result will be the value 0xffffffff80000000 (see Figure 2). The result of the expression 1 << 31 is the negative number -2147483648. This number is presented in a 64-bit integer variable as 0xffffffff80000000.
Figure 2 - The error of setting the 31-st bit on a 64-bit system.
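The corrected shift can be checked with a small sketch. It is an illustration under stated assumptions, not the original function: std::uint64_t stands in for a 64-bit memsize-type so that the results are the same on every platform, and the name SetBitN64 is hypothetical. An unsigned type is used so that bit 31 and bit 63 can be set without any sign-related surprises.

```cpp
#include <cassert>
#include <cstdint>

// uint64_t stands in for a 64-bit memsize-type (assumption of this sketch).
std::uint64_t SetBitN64(std::uint64_t value, unsigned bitNum) {
    assert(bitNum < 64);
    // The constant is widened BEFORE the shift, so no bits are lost.
    std::uint64_t mask = std::uint64_t(1) << bitNum;
    return value | mask;
}
```

With this version SetBitN64(0, 31) yields 0x0000000080000000 and SetBitN64(0, 32) yields 0x0000000100000000.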
You should remember and take into consideration the effects of shifting values of different types. To better understand all said above, consider some interesting expressions with shifts in a 64-bit system shown in Table 1.
Table 1 - Expressions with shifts and their results in a 64-bit system (we used Visual C++ 2005 compiler)
The type of error we have described is dangerous not only from the viewpoint of program correctness but from the viewpoint of security as well. By manipulating the input data of such incorrect functions, an attacker can potentially obtain rights they should not have, for example when access permission masks defined by separate bits are processed. Questions related to exploiting errors in 64-bit code for cracking and compromising applications are described in the article "Safety of 64-bit code".
Now a subtler example:
struct BitFieldStruct {
unsigned short a:15;
unsigned short b:13;
};
BitFieldStruct obj;
obj.a = 0x4000;
size_t addr = obj.a << 17; //Sign Extension
printf("addr 0x%Ix\n", addr);
//Output on 32-bit system: 0x80000000
//Output on 64-bit system: 0xffffffff80000000
In the 32-bit environment, the order of calculating the expression will be as shown in Figure 3.
Figure 3 - Calculation of expression in 32-bit code
Note that the bit field of "unsigned short" type is promoted to "signed int" when "obj.a << 17" is calculated. To make this clear, consider the following code:
#include <stdio.h>
template <typename T> void PrintType(T)
{
printf("type is %s %d-bit\n",
(T)-1 < 0 ? "signed" : "unsigned", static_cast<int>(sizeof(T)*8));
}
struct BitFieldStruct {
unsigned short a:15;
unsigned short b:13;
};
int main(void)
{
BitFieldStruct bf;
PrintType( bf.a );
PrintType( bf.a << 2);
return 0;
}
Result:
type is unsigned 16-bit
type is signed 32-bit
Now let us see the consequence of this promotion to a signed type in 64-bit code. The sequence of calculating the expression is shown in Figure 4.
Figure 4 - Calculation of expression in 64-bit code
The structure member "obj.a" is converted from a bit field of "unsigned short" type to "int". The expression "obj.a << 17" has "int" type, but it is converted to ptrdiff_t and then to size_t before it is assigned to the variable addr. As a result, we get the value 0xffffffff80000000 instead of the expected 0x0000000080000000.
Be careful when working with bit fields. To avoid the situation described in our example we need only to explicitly convert "obj.a" to size_t type.
...
size_t addr = size_t(obj.a) << 17;
printf("addr 0x%Ix\n", addr);
//Output on 32-bit system: 0x80000000
//Output on 64-bit system: 0x80000000
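The two calculation sequences can be modeled with fixed-width types, which makes the effect reproducible on any platform. This is a sketch under assumptions: int is taken to be 32 bits wide, negative values are assumed to use two's complement, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models "size_t addr = obj.a << 17" on a 64-bit system with a 32-bit int.
std::uint64_t ShiftAfterPromotion(std::uint16_t field) {
    assert(field <= 0x7FFF);  // the value must fit into a 15-bit field
    // The shift is performed in 32 bits; the result is then a signed int.
    std::int32_t asInt = static_cast<std::int32_t>(
        static_cast<std::uint32_t>(field) << 17);
    // Widening a negative int to a 64-bit type sign-extends it.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(asInt));
}

// Models the fix "size_t addr = size_t(obj.a) << 17".
std::uint64_t ShiftAfterWidening(std::uint16_t field) {
    assert(field <= 0x7FFF);
    // The value is widened first, so the shift is performed in 64 bits.
    return static_cast<std::uint64_t>(field) << 17;
}
```

For the value 0x4000 the first function returns 0xffffffff80000000 and the second returns 0x80000000. For values that do not reach bit 31 after the shift, both functions agree, which is why such errors can stay unnoticed for a long time.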
Diagnosis
Potentially unsafe shifts are detected by PVS-Studio static analyzer when it detects an implicit extension of a 32-bit type to memsize type. The analyzer will warn you about the unsafe construct with the diagnostic warning V101. The shift operation is not suspicious by itself. But the analyzer detects an implicit extension of int type to memsize type when it is assigned to a variable, and informs the programmer about it to check the code fragment that may contain an error. Correspondingly, when there is no extension, the analyzer considers the code safe. For example: "int mask = 1 << bitNum;".
Lesson 12. Pattern 4. Virtual functions
Sometimes you encounter errors that are nobody's fault, but they are errors all the same. Imagine that a long, long time ago (in Visual Studio 6.0) a project was developed that contained the class CSampleApp, derived from CWinApp. The base class had the virtual function WinHelp. The derived class overrode this function and performed all the necessary actions. It looked as shown in Figure 1.
Figure 1 - Correct operable code created in Visual Studio 6.0
Then the project is ported to Visual Studio 2005 where the prototype of the function WinHelp has changed. But nobody notices it because the types DWORD and DWORD_PTR coincide in the 32-bit mode and the program still works well (Figure 2).
Figure 2 - Incorrect yet operable 32-bit code
The error waits to occur on a 64-bit system where the sizes of the types DWORD and DWORD_PTR differ (Figure 3). It turns out that the classes contain two DIFFERENT functions WinHelp in the 64-bit mode. Of course it is incorrect. Note that such traps may hide not only in MFC where some functions have different types of the arguments but in the code of your applications and third-party libraries as well.
Figure 3 - The error occurs in the 64-bit code
Let us consider one more error using an example taken from real life. There is a wonderful component library called BCGControlBar. You have probably heard of it because some components made by BCGSoft Ltd are included in the Microsoft Visual Studio 2008 Feature Pack. If you download the trial version of this library, install it and search the .h-files for the word "WinHelp", you will see that wherever this function is supposedly overridden, the parameter DWORD is used instead of DWORD_PTR. This means that the Help system will behave incorrectly in these classes when the code is ported to a 64-bit system.
Why can such an error still exist in the code of so popular a library? We think the point is that the company's clients have access to the source codes of this library and they may always easily correct these codes. Besides, the function WinHelp is used very rarely nowadays. HtmlHelp is used much more frequently - and it does have the right parameter DWORD_PTR in BCGControlBar. But the fact remains. There is an error in real code and the compiler does not detect it. Such errors may stay hidden for many years.
Note. This text is being written in December, 2009, and it is most likely that this error will be corrected in the next versions, especially as we have written about it to the developers of the library.
Diagnosis
Errors related to virtual functions in 64-bit code can be detected by the static analyzer PVS-Studio. The analyzer will warn you about dangerous virtual functions with the diagnostic warning V301.
A virtual function is considered dangerous if:
- The function is declared both in the base class and in the derived class.
- The types of the functions' arguments do not coincide but are equivalent on a 32-bit system (for example: unsigned, size_t) and are not equivalent on a 64-bit one.
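The situation can be reproduced without MFC. In the sketch below the class names and the typedefs Dword32 and DwordPtr64 are hypothetical stand-ins for DWORD and DWORD_PTR as they look on a 64-bit system; fixed-width types are used so that the behavior does not depend on the platform the sample is built for.

```cpp
#include <cstdint>

// Stand-ins for the Windows types on a 64-bit system (assumption).
typedef std::uint32_t Dword32;      // plays the role of DWORD
typedef std::uint64_t DwordPtr64;   // plays the role of DWORD_PTR

struct BaseApp {
    virtual ~BaseApp() {}
    // The prototype after the library update.
    virtual int WinHelp(DwordPtr64 /*data*/, unsigned /*cmd*/) { return 0; }
};

struct SampleApp : BaseApp {
    // The old prototype. The signature differs, so this is a NEW function
    // that hides the base one instead of overriding it.
    // Adding the C++11 keyword "override" here would turn the mistake
    // into a compile-time error.
    virtual int WinHelp(Dword32 /*data*/, unsigned /*cmd*/) { return 1; }
};

// Calls WinHelp the way a framework would: through the base class.
int CallThroughBase(BaseApp &app) {
    return app.WinHelp(DwordPtr64(0), 0u);
}
```

A call through the base class reaches BaseApp::WinHelp and returns 0, so the code written in the derived class is never executed.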
Lesson 13. Pattern 5. Address arithmetic
We have deliberately chosen the 13th lesson to discuss errors related to address arithmetic. Errors related to pointer arithmetic in 64-bit systems are the most insidious, and we hope that the number 13 will make you more attentive.
The main idea of the pattern is this: use only memsize-types in address arithmetic to avoid errors in 64-bit code.
Consider this code:
unsigned short a16, b16, c16;
char *pointer;
...
pointer += a16 * b16 * c16;
This sample works correctly with pointers if the result of the expression "a16 * b16 * c16" does not exceed INT_MAX (2147483647). This code could always work correctly on a 32-bit platform, because on a 32-bit architecture a program cannot allocate enough memory to create an array of such a size. On a 64-bit architecture this limitation is gone, and the size of an array may well exceed INT_MAX items. Suppose we want to shift the pointer by 6,000,000,000 bytes, so the variables a16, b16 and c16 have the values 3000, 2000 and 1000 respectively. When the expression "a16 * b16 * c16" is calculated, all the variables are first promoted to "int" type, according to C++ rules, and only then multiplied. An overflow occurs during the multiplication. The incorrect result is extended to the type ptrdiff_t and the pointer is calculated incorrectly.
You should be very attentive and avoid possible overflows when dealing with pointer arithmetic. It is good to use memsize-types or explicit type conversions in those expressions that contain pointers. Using an explicit type conversion we may rewrite our code sample in the following way:
unsigned short a16, b16, c16;
char *pointer;
...
pointer += static_cast<ptrdiff_t>(a16) *
static_cast<ptrdiff_t>(b16) *
static_cast<ptrdiff_t>(c16);
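The difference between the two calculations can be shown with numbers. The sketch below is a model, not the original code: signed overflow is undefined behavior in C++, so the 32-bit case is imitated by keeping the low 32 bits of the exact product, which is what typically happens in practice. The function names are hypothetical.

```cpp
#include <cstdint>

// Models the 32-bit calculation: only the low 32 bits of the product survive.
std::uint32_t OffsetIn32Bits(unsigned short a, unsigned short b,
                             unsigned short c) {
    std::uint64_t exact = static_cast<std::uint64_t>(a) * b * c;
    return static_cast<std::uint32_t>(exact);
}

// The calculation with every operand widened to a 64-bit type first.
std::int64_t OffsetIn64Bits(unsigned short a, unsigned short b,
                            unsigned short c) {
    return static_cast<std::int64_t>(a) *
           static_cast<std::int64_t>(b) *
           static_cast<std::int64_t>(c);
}
```

For the values 3000, 2000 and 1000 the first function returns 1705032704, while the second returns the intended 6000000000.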
If you think that inaccurately written programs encounter troubles only when dealing with large data amounts, we have to disappoint you. Consider an interesting code sample working with an array that contains just 5 items. This code works in the 32-bit version and does not work in the 64-bit one:
int A = -2;
unsigned B = 1;
int array[5] = { 1, 2, 3, 4, 5 };
int *ptr = array + 3;
ptr = ptr + (A + B); //Invalid pointer value on 64-bit platform
printf("%i\n", *ptr); //Access violation on 64-bit platform
Let us follow the algorithm of calculating the expression "ptr + (A + B)":
- According to C++ rules, the variable A of the type int is converted to unsigned.
- A and B are summed and we get the value 0xFFFFFFFF of unsigned type.
- The expression "ptr + 0xFFFFFFFFu" is calculated.
The result of this process depends upon the size of the pointer on a particular architecture. If the addition takes place in a 32-bit program, the expression is equivalent to "ptr - 1" and the program successfully prints the value "3". In a 64-bit program, the value 0xFFFFFFFFu is honestly added to the pointer. As a result, the pointer ends up far outside the array, and we run into trouble when trying to access the item it points to.
Like in the first case, we recommend you to use only memsize-types in pointer arithmetic to avoid the situation described above. Here are two ways to correct the code:
ptr = ptr + (ptrdiff_t(A) + ptrdiff_t(B));
ptrdiff_t A = -2;
size_t B = 1;
...
ptr = ptr + (A + B);
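The offsets that actually get added to the pointer can be computed separately. The following sketch assumes that int and unsigned are 32 bits wide and that negative values use two's complement; the function names are hypothetical.

```cpp
#include <cstdint>

// The offset a 64-bit program adds: the unsigned sum is zero-extended.
std::int64_t OffsetOn64Bit(int a, unsigned b) {
    unsigned sum = a + b;  // "a" is converted to unsigned before the addition
    return static_cast<std::int64_t>(sum);
}

// The offset a 32-bit program effectively adds: the same 32 bits,
// interpreted at the width of the pointer.
std::int32_t OffsetOn32Bit(int a, unsigned b) {
    unsigned sum = a + b;
    return static_cast<std::int32_t>(sum);
}

// The corrected calculation: both operands are widened before the addition.
std::int64_t OffsetWithMemsizeTypes(int a, unsigned b) {
    return static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
}
```

For A = -2 and B = 1 the offsets are 4294967295, -1 and -1 respectively.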
You may argue and propose this way:
int A = -2;
int B = 1;
...
ptr = ptr + (A + B);
Yes, this code can work, but it is bad for several reasons:
- It trains programmers to be inaccurate when working with pointers. You might forget all the details of the code some time later and again redefine one of the variables with unsigned type by mistake.
- It is potentially dangerous to use non-memsize types together with the pointers. Suppose the variable Delta of int type participates in an expression with a pointer. This expression is quite correct. But an error may hide in the process of calculating the variable Delta because 32 bits might be not enough to perform the necessary calculations while working with large data arrays. You can automatically avoid this danger by using a memsize-type for the variable Delta.
- Code that uses size_t, ptrdiff_t and other memsize-types when working with pointers compiles to more efficient binary code. We will speak about it in one of the following lessons.
Array indexing
We single out this type of error only to make our description more structured, because array indexing with square brackets is just another way of writing the address arithmetic discussed above.
In programs that process large amounts of data you may encounter errors related to indexing large arrays, or infinite loops. The following example contains two errors at once:
const size_t size = ...;
char *array = ...;
char *end = array + size;
for (unsigned i = 0; i != size; ++i)
{
const int one = 1;
end[-i - one] = 0;
}
The first error is that an infinite loop may occur if the size of the processed data exceeds 4 Gbytes (0xFFFFFFFF), because the variable "i" has "unsigned" type and can never reach a value larger than 0xFFFFFFFF. This is possible but not certain - it depends on the code the compiler builds. For example, the infinite loop may be present in the debug build and disappear completely in the release build, because the compiler decides to optimize the code by using a 64-bit register for the counter, and the loop becomes correct. All this adds confusion, and code that worked yesterday stops working today.
The second error is related to the negative index values used to walk the array from end to beginning. This code works in 32-bit mode but crashes in 64-bit mode on the very first iteration of the loop, because an access outside the array's bounds occurs. Let us consider the cause of this behavior.
Although everything written below is the same as in the example with "ptr = ptr + (A + B)", we resort to this repetition deliberately. We need to show you that a danger may hide even in simple constructs and take various forms.
According to C++ rules, the expression "-i - one" will be calculated on a 32-bit system in the following way (i = 0 at the first step):
- The expression "-i" has "unsigned" type and equals 0x00000000u.
- The variable "one" is extended from the type "int" to the type "unsigned" and equals 0x00000001u. Note: the type "int" is extended (according to C++ standard) to the type "unsigned" if it participates in an operation where the second argument has the type "unsigned".
- Two values of the type "unsigned" participate in a subtraction operation and its result equals 0x00000000u - 0x00000001u = 0xFFFFFFFFu. Note that the result has "unsigned" type.
On a 32-bit system, accessing an array by the index 0xFFFFFFFFu is equivalent to using the index "-1", i.e. end[0xFFFFFFFFu] is analogous to end[-1]. As a result, the array item is processed correctly. But the picture is different on a 64-bit system: the type "unsigned" is extended to the signed "ptrdiff_t" and the array index equals 0x00000000FFFFFFFFi64. This results in an access far outside the array's bounds.
To correct the code you need to use such types as ptrdiff_t and size_t.
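One possible corrected version of the loop is sketched below. Wrapping it in a function named ZeroFromEnd is an assumption made to keep the sample self-contained; the loop body follows the original example.

```cpp
#include <cassert>
#include <cstddef>

// Fills the buffer with zeros, walking it from the end to the beginning.
void ZeroFromEnd(char *array, std::size_t size) {
    assert(array != nullptr || size == 0);
    char *end = array + size;
    for (std::size_t i = 0; i != size; ++i) {  // memsize-type counter
        const std::ptrdiff_t one = 1;
        // The index is calculated entirely in the signed memsize-type,
        // so it is a genuine negative number on any platform.
        end[-static_cast<std::ptrdiff_t>(i) - one] = 0;
    }
}
```

The counter can now reach any possible array size, and the index passed to the square brackets equals -1, -2 and so on in both 32-bit and 64-bit builds.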
To completely convince you that you should use only memsize-types for indexing and in address arithmetic expressions, here is the code sample for you to consider.
class Region {
float *array;
int Width, Height, Depth;
float GetCell(int x, int y, int z) const;
...
};
float Region::GetCell(int x, int y, int z) const {
return array[x + y * Width + z * Width * Height];
}
This code is taken from a real program of mathematical modeling where the amount of memory is the most important resource, so the capability of using more than 4 Gbytes on a 64-bit architecture significantly increases the computational power. Programmers often use one-dimensional arrays in programs like this to save memory while treating them as three-dimensional arrays. For this purpose, they use functions analogous to GetCell which provide access to the necessary items. But the code above will work correctly only with arrays that contain less than INT_MAX items because it is 32-bit "int" types that are used to calculate the item's index.
Programmers often make a mistake trying to correct the code in this way:
float Region::GetCell(int x, int y, int z) const {
return array[static_cast<ptrdiff_t>(x) + y * Width +
z * Width * Height];
}
They know that, according to C++ rules, the expression to calculate the index has the type "ptrdiff_t" and hope to avoid an overflow thereby. But the overflow may occur inside the expression "y * Width" or "z * Width * Height" because it is still the type "int" which is used to calculate them.
If you want to correct the code without changing the types of the variables participating in the expression, you may explicitly convert each variable to a memsize-type:
float Region::GetCell(int x, int y, int z) const {
return array[ptrdiff_t(x) +
ptrdiff_t(y) * ptrdiff_t(Width) +
ptrdiff_t(z) * ptrdiff_t(Width) *
ptrdiff_t(Height)];
}
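The difference between the incomplete correction and the complete one can be shown with numbers. The sketch below is a model under assumptions: int is taken to be 32 bits wide, and its overflow (undefined behavior in real code) is imitated by keeping the low 32 bits of each product. All function names are hypothetical.

```cpp
#include <cstdint>

// Imitates a multiplication performed in 32-bit int arithmetic.
static std::int64_t MulAsInt32(std::int64_t a, std::int64_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a * b));
}

// Only x is widened: the products are still calculated in 32 bits.
std::int64_t IndexPartialFix(int x, int y, int z, int width, int height) {
    return static_cast<std::int64_t>(x) + MulAsInt32(y, width) +
           MulAsInt32(MulAsInt32(z, width), height);
}

// Every operand is widened before it takes part in a multiplication.
std::int64_t IndexFullFix(int x, int y, int z, int width, int height) {
    return static_cast<std::int64_t>(x) +
           static_cast<std::int64_t>(y) * static_cast<std::int64_t>(width) +
           static_cast<std::int64_t>(z) * static_cast<std::int64_t>(width) *
               static_cast<std::int64_t>(height);
}
```

For a region with Width = Height = 2000 and the cell (0, 0, 1500), the partially corrected expression yields 1705032704 while the fully corrected one yields 6000000000. For small regions both give the same result, so tests on small data will not reveal the error.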
Another - better - solution is to change the types of the variables to a memsize-type:
typedef ptrdiff_t TCoord;
class Region {
float *array;
TCoord Width, Height, Depth;
float GetCell(TCoord x, TCoord y, TCoord z) const;
...
};
float Region::GetCell(TCoord x, TCoord y, TCoord z) const {
return array[x + y * Width + z * Width * Height];
}
Diagnosis
Address arithmetic errors are well diagnosed by PVS-Studio tool. The analyzer warns you about potentially dangerous expressions with the diagnostic warnings V102 and V108.
When possible, the analyzer tries to determine when a non-memsize type used in address arithmetic is safe, and refrains from generating a warning for that fragment. As a result, the analyzer's behavior may sometimes seem strange. In such cases we ask users to take their time and examine the situation. Consider the following code:
char Arr[] = { '0', '1', '2', '3', '4' };
char *p = Arr + 2;
cout << p[0u + 1] << endl;
cout << p[0u - 1] << endl; //V108
This code works correctly in 32-bit mode and prints the numbers 3 and 1 on the screen. When testing this code, we get a warning on only one line, the one with the expression "p[0u - 1]". And this warning is quite right! If you compile and launch this code sample in 64-bit mode, you will see the value 3 printed on the screen, and the program will crash right after it.
If you are sure that the indexing is correct, you may change the corresponding parameter of the analyzer on the settings tab Settings: General or use filters. You may also use an explicit type conversion.
Lesson 14. Pattern 6. Changing an array's type
Sometimes it is necessary (or simply convenient) to present array items in the form of items of another type. The following code example shows dangerous and safe type conversions:
int array[4] = { 1, 2, 3, 4 };
enum ENumbers { ZERO, ONE, TWO, THREE, FOUR };
//safe cast (for MSVC)
ENumbers *enumPtr = (ENumbers *)(array);
cout << enumPtr[1] << " ";
//unsafe cast
size_t *sizetPtr = (size_t *)(array);
cout << sizetPtr[1] << endl;
//Output on 32-bit system: 2 2
//Output on 64-bit system: 2 17179869187
As you may see, the output result of the program differs in the 32-bit and 64-bit versions. On the 32-bit system, the access to the array items is correct because the sizes of the types size_t and "int" coincide, so we see the result "2 2".
On the 64-bit system, the output result is "2 17179869187" because it is the value 17179869187 which is located in the 1-st item of the array sizetPtr (see Figure 1). Sometimes it is this behavior you need but usually it is considered an error.
Note. The type enum in Visual C++ compiler by default coincides with the type int, i.e. it is a 32-bit type. You may use enum of another size only with the help of an extension considered non-standard in Visual C++. So the example above is correct from the viewpoint of Visual C++ compiler but from the viewpoint of other compilers conversion of a pointer to int items to a pointer to enum items may be also incorrect.
Figure 1 - Arrangement of array items in memory
To avoid this error, you should give up unsafe type conversions and modify the program. Another way is to create a new array and copy the values from the original array into it.
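The copying approach might look like the sketch below. The function name CopyToSizeT is hypothetical, and the sketch assumes that the source values are non-negative, because a negative int turns into a very large number when converted to size_t.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a new array of size_t items from an array of int items.
std::vector<std::size_t> CopyToSizeT(const int *items, std::size_t count) {
    std::vector<std::size_t> result;
    result.reserve(count);
    for (std::size_t i = 0; i != count; ++i) {
        assert(items[i] >= 0);  // assumption: the values are non-negative
        // Each item is converted by value, not reinterpreted in memory.
        result.push_back(static_cast<std::size_t>(items[i]));
    }
    return result;
}
```

For the array { 1, 2, 3, 4 } the item with index 1 of the new array equals 2 in both 32-bit and 64-bit builds, at the cost of additional memory and copying time.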
You may encounter the described error pattern most often in the code fragments where programmers try to use pointer values as unique 32-bit identifiers.
Diagnosis
Unsafe changes of an array's type are diagnosed by the tool PVS-Studio. The analyzer warns you about potentially dangerous type conversions with the diagnostic warning V114. Accordingly, the analyzer responds only to those constructs that may cause an error on a 64-bit system. For example, the following code sample is correct and the analyzer will not call the programmer's attention to it:
void **pointersArray;
ptrdiff_t correctID = ((ptrdiff_t *)pointersArray)[index];
Lesson 15. Pattern 7. Pointer packing
A lot of errors that occur when porting code to 64-bit systems are related to the changed relation between the size of a pointer and the size of the basic integer types. In an environment with the ILP32 data model, the basic integer types and pointers have the same size. Unfortunately, 32-bit code relies on this assumption everywhere. Pointers are often cast to int, unsigned, long, DWORD and other inappropriate types.
You should always keep in mind that only memsize-types may be used for the integer representation of pointers. We believe it is better to use the type uintptr_t because it expresses the intention more clearly and makes the code more portable, protecting it from future changes.
Consider two small examples.
1) char *p;
p = (char *) ((int)p & PAGEOFFSET);
2) DWORD tmp = (DWORD)malloc(ArraySize);
...
int *ptr = (int *)tmp;
Neither example takes into account that the pointer's size might differ from 32 bits. An explicit type conversion is used that cuts off the high-order bits of the pointer - an evident error on a 64-bit system. Below are the corrected samples, where integer memsize-types (intptr_t and DWORD_PTR) are used to pack the pointers:
1) char *p;
p = (char *) ((intptr_t)p & PAGEOFFSET);
2) DWORD_PTR tmp = (DWORD_PTR)malloc(ArraySize);
...
int *ptr = (int *)tmp;
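The same idea can be written portably with the type uintptr_t from the header <cstdint> (available since C++11 on mainstream platforms). In the sketch below the mask value 0xFFF is a hypothetical stand-in for PAGEOFFSET, whose real value is not given in the text, and the function names are assumptions.

```cpp
#include <cstdint>

// Hypothetical stand-in for PAGEOFFSET: the low 12 bits of an address.
const std::uintptr_t kPageOffsetMask = 0xFFF;

// Extracts the offset within a page without truncating the pointer.
std::uintptr_t OffsetInPage(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) & kPageOffsetMask;
}

// Packs a pointer into an integer and unpacks it again.
void *PackAndUnpack(void *p) {
    std::uintptr_t packed = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void *>(packed);
}
```

Because uintptr_t is wide enough to hold any object pointer, the round trip returns the original address wherever the object is located in memory.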
The two examples discussed above are dangerous because the program failure might stay undetected for a long time. The program may work correctly when dealing with small amounts of data on a 64-bit system, as long as the processed addresses remain inside the first four Gbytes of memory. But later, when the program is launched to solve large real-world tasks, memory will be allocated outside this area. The code of the examples will then lead to unpredictable program behavior when processing a pointer to an object located outside it.
The next code sample will reveal itself right away at the first run:
void GetBufferAddr(void **retPtr) {
...
// Access violation on 64-bit system
*retPtr = p;
}
unsigned bufAddress;
GetBufferAddr((void **)&bufAddress);
To correct it you should also use a type capable of storing the pointer.
size_t bufAddress;
GetBufferAddr((void **)&bufAddress); //OK
Sometimes it is just necessary to pack a pointer into a 32-bit type. It usually happens when you need to work with obsolete API functions. In these cases you should resort to special functions such as LongToIntPtr, PtrToUlong, etc.
To sum it up, I would like to note that it is bad style to pack a pointer into types that are always 64 bits wide. The code below will have to be corrected again when 128-bit systems appear:
PVOID p;
// Bad style. The 128-bit time will come.
__int64 n = __int64(p);
p = PVOID(n);
It is said that Microsoft Research developers are already working on making the kernels of Windows 8 and Windows 9 compatible with a 128-bit architecture. So write good code from the start.
Diagnosis
Packing of pointers into 32-bit types is diagnosed by the tool PVS-Studio with the diagnostic warnings V114 and V202.
We should note that simple errors related to conversions of pointers to 32-bit types are well diagnosed by Visual C++ compiler. For example, the compiler will warn you about the error in the code we have considered above:
char *p;
p = (char *) ((int)p & PAGEOFFSET);
Visual C++ will generate the warning "warning C4311: 'type cast' : pointer truncation from 'char *' to 'int'". But the example with GetBufferAddr will not be suspicious to Visual C++. So, PVS-Studio is more reliable than Visual C++ when you want to make sure that your code has no such errors.
Let us consider one more important point related to the PVS-Studio analyzer. Although some of the errors can be detected with the help of Visual C++ warnings, in practice this is often impossible. Many warnings are usually disabled in large and old projects (especially if they contain large third-party libraries), so there is little chance that all 64-bit errors will be detected. On the other hand, enabling these warnings is not attractive either: there may be a great many of them, but only some will really help you detect 64-bit errors. The rest will be irrelevant to the task and will tell you about sloppiness in the code rather than about potential 64-bit issues.
PVS-Studio is invaluable in this case. It will detect only those potential errors in the code which are directly related to the issues of porting the code to the 64-bit architecture.
Lesson 16. Pattern 8. Memsize-types in unions
A union is special in that all its members share the same memory space, i.e. they overlap. Although you may access this memory through any member of the union, you should choose the member so that the result makes sense.
You should be very attentive dealing with unions that include pointers and other members of a memsize-type.
When you need to work with a pointer as an integer number, it may be convenient to use a union and work with the numerical representation of the type without explicit conversions. Consider the example:
union PtrNumUnion {
char *m_p;
unsigned m_n;
} u;
u.m_p = str;
u.m_n += delta;
This sample is correct for 32-bit systems and incorrect for 64-bit ones. Changing the member m_n on a 64-bit system we work only with a part of the pointer m_p (see Figure 1).
Figure 1 - The union format on the 32-bit and 64-bit systems
You should use a type that corresponds to the pointer's size:
union PtrNumUnion {
char *m_p;
size_t m_n; //type fixed
} u;
Another usual way of using a union is to represent one member as a set of several smaller members. For example, you may need to split a value of size_t type into bytes to implement the table algorithm of counting zero bits:
union SizetToBytesUnion {
size_t value;
struct {
unsigned char b0, b1, b2, b3;
} bytes;
};
SizetToBytesUnion u;
u.value = value;
size_t zeroBitsN = TranslateTable[u.bytes.b0] +
TranslateTable[u.bytes.b1] +
TranslateTable[u.bytes.b2] +
TranslateTable[u.bytes.b3];
This code contains a fundamental algorithmic error: it assumes that the type size_t consists of 4 bytes. At present it is hardly possible to search for algorithmic errors automatically, but what we can do is find all the unions and check whether they contain memsize-types. Having found such a union, we may discover the error in it and rewrite the code in the following way:
union SizetToBytesUnion {
size_t value;
unsigned char bytes[sizeof(size_t)];
};
SizetToBytesUnion u;
u.value = value;
size_t zeroBitsN = 0;
for (size_t i = 0; i != sizeof(u.bytes); ++i)
zeroBitsN += TranslateTable[u.bytes[i]];
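The same calculation can also be written without a union at all. The sketch below is an alternative technique, not the code of the original: the bytes are copied with memcpy, and the helper ZeroBitsInByte is a hypothetical replacement for a TranslateTable lookup, whose contents are not shown in the text.

```cpp
#include <climits>
#include <cstddef>
#include <cstring>

// Stand-in for one TranslateTable lookup: counts zero bits in a byte.
unsigned ZeroBitsInByte(unsigned char byte) {
    unsigned zeros = 0;
    for (int bit = 0; bit != CHAR_BIT; ++bit)
        if (((byte >> bit) & 1u) == 0)
            ++zeros;
    return zeros;
}

// Counts zero bits in a size_t value of whatever size the platform uses.
std::size_t CountZeroBits(std::size_t value) {
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));  // split the value into bytes
    std::size_t zeroBitsN = 0;
    for (std::size_t i = 0; i != sizeof(bytes); ++i)
        zeroBitsN += ZeroBitsInByte(bytes[i]);
    return zeroBitsN;
}
```

The number of processed bytes is derived from sizeof, so the function needs no changes when the size of size_t changes.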
Diagnosis
The tool PVS-Studio allows the programmer to quickly find and look through all the unions that contain memsize-types in the program code. The analyzer generates the diagnostic warning V117 for those structures the programmer should consider when porting the code to a 64-bit system.
Lesson 17. Pattern 9. Mixed arithmetic
I hope you have already rested from the 13th lesson and are now ready to study one more important error pattern related to arithmetic expressions in which types of different capacities participate.
Mixed use of memsize-types and non-memsize types in expressions may lead to incorrect results on 64-bit systems. Such errors usually show up when the range of the input values changes. Consider some examples:
size_t Count = BigValue;
for (unsigned Index = 0; Index != Count; ++Index)
{ ... }
This is an example of an infinite loop that occurs if Count > UINT_MAX. Suppose that this code works well on a 32-bit system, where the number of iterations is always less than UINT_MAX. But the 64-bit version of the program can process more data and may need more iterations. Since the values of the variable Index lie within the range [0..UINT_MAX], the condition "Index != Count" will always be true, and the loop will never end.
Note. Consider that this sample may work well at some particular settings of the compiler. Sometimes it is a source of much confusion because the code seems to be correct. In one of the following lessons we will tell you about phantom errors that reveal themselves only some time later. If you are already longing to learn why the code behaves so strangely, see the article "A 64-bit horse that can count".
To correct the code you should use only memsize-types in the expressions. In our example we may change the type of the variable Index from "unsigned" to size_t.
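A minimal sketch of the corrected loop shape, where the index has the same type as the bound. The helper names are hypothetical and serve only to make the condition testable.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// True if a loop "for (unsigned Index = 0; Index != count; ++Index)"
// is able to terminate at all, i.e. count is reachable by an unsigned index.
inline bool UnsignedIndexCanReach(std::size_t count) {
  return count <= std::numeric_limits<unsigned>::max();
}

// Corrected loop: the index is a memsize-type, like the bound itself.
inline std::size_t CountIterations(std::size_t count) {
  std::size_t iterations = 0;
  for (std::size_t index = 0; index != count; ++index)
    ++iterations;
  return iterations;
}
```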
Another frequent error is using expressions of the following kind:
int x, y, z;
ptrdiff_t SizeValue = x * y * z;
We have already examined such examples with an arithmetic overflow that occurs when calculating expressions using non-memsize types. The result was incorrect of course. The search and detection of this code fragment was complicated by the fact that compilers usually do not generate any warnings on it. From the viewpoint of C++ language it is an absolutely correct construct: several variables of "int" type are multiplied together, after that the result is implicitly extended to the type ptrdiff_t and is assigned to a variable.
Here is a small code sample that shows the danger of inaccurate expressions with mixed types (these results were obtained in Microsoft Visual C++ 2005 in the 64-bit compilation mode):
int x = 100000;
int y = 100000;
int z = 100000;
ptrdiff_t size = 1; // Result:
ptrdiff_t v1 = x * y * z; // -1530494976
ptrdiff_t v2 = ptrdiff_t (x) * y * z; // 1000000000000000
ptrdiff_t v3 = x * y * ptrdiff_t (z); // 141006540800000
ptrdiff_t v4 = size * x * y * z; // 1000000000000000
ptrdiff_t v5 = x * y * z * size; // -1530494976
ptrdiff_t v6 = size * (x * y * z); // -1530494976
ptrdiff_t v7 = size * (x * y) * z; // 141006540800000
ptrdiff_t v8 = ((size * x) * y) * z; // 1000000000000000
ptrdiff_t v9 = size * (x * (y * z)); // -1530494976
All the operands in such expressions must be cast to a type of a larger capacity while performing the calculations. Remember that an expression like
ptrdiff_t v2 = ptrdiff_t (x) + y * z;
does not guarantee a correct result at all. It guarantees only that the expression "ptrdiff_t (x) + y * z" will have the type "ptrdiff_t".
So, if the expression's result must have a memsize-type, there must be only memsize-types in the expression too. Here is the correct version:
ptrdiff_t v2 = ptrdiff_t (x) + ptrdiff_t (y) * ptrdiff_t (z); // OK!
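As a hedged illustration, the rule can be wrapped into small helpers. The function names are hypothetical, and std::int64_t is used instead of ptrdiff_t so that the sketch gives the same result on 32-bit and 64-bit builds.

```cpp
#include <cassert>
#include <cstdint>

// Every operand is widened before the multiplication takes place.
inline std::int64_t Volume(int x, int y, int z) {
  return static_cast<std::int64_t>(x) *
         static_cast<std::int64_t>(y) *
         static_cast<std::int64_t>(z);
}

// The same rule for x + y * z: y * z must not be computed in "int".
inline std::int64_t MulAdd(int x, int y, int z) {
  return static_cast<std::int64_t>(x) +
         static_cast<std::int64_t>(y) * static_cast<std::int64_t>(z);
}
```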
However, it is not always necessary to convert all the arguments to a memsize-type. If an expression consists of identical operators, you may convert only the first argument to the memsize-type. Consider an example:
int c();
int d();
int a, b;
ptrdiff_t v2 = ptrdiff_t (a) * b * c() * d();
The order in which the subexpressions of such an expression are evaluated is unspecified. More exactly, the compiler may evaluate the subexpressions (for example, the calls of the functions c() and d()) in any order it considers the most efficient, even if the subexpressions have side effects. The order in which the side effects appear is not specified either. But since multiplication is a left-associative operator, the operands will be grouped in the following way:
ptrdiff_t v2 = ((ptrdiff_t (a) * b) * c()) * d();
As a result, each of the operands will be converted to the type "ptrdiff_t" before the multiplication and we will get the correct result.
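The types involved can be checked at compile time. The sketch below assumes a C++11 compiler; the bodies of c() and d() are hypothetical placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

inline int c() { return 2; }  // hypothetical placeholder
inline int d() { return 5; }  // hypothetical placeholder

// int * int is still computed in "int", whatever it is assigned to later.
static_assert(std::is_same<decltype(1 * 1), int>::value,
              "int * int has the type int");

// Once the leftmost operand is ptrdiff_t, every following multiplication
// in the left-associative chain is performed in ptrdiff_t.
static_assert(std::is_same<decltype(std::ptrdiff_t(1) * 1 * c() * d()),
                           std::ptrdiff_t>::value,
              "the whole chain has the type ptrdiff_t");

inline std::ptrdiff_t Product(int a, int b) {
  return std::ptrdiff_t(a) * b * c() * d();
}
```

Note that an expression like "a * b * ptrdiff_t(c())" also has the type ptrdiff_t, yet "a * b" in it is still multiplied as "int", which is why the position of the cast matters.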
Note. If there are integer calculations in your program and they need the control over overflows, resort to the class SafeInt - you may learn about its implementation and see its description in MSDN.
Mixed use of types may also result in the changes in program logic:
ptrdiff_t val_1 = -1;
unsigned int val_2 = 1;
if (val_1 > val_2)
printf ("val_1 is greater than val_2\n");
else
printf ("val_1 is not greater than val_2\n");
//Output on 32-bit system: "val_1 is greater than val_2"
//Output on 64-bit system: "val_1 is not greater than val_2"
According to C++ rules, on a 32-bit system the variable val_1 is converted to the type "unsigned int" and takes the value 0xFFFFFFFFu, so the condition "0xFFFFFFFFu > 1" is fulfilled. On a 64-bit system, however, it is the variable val_2 that gets extended to the type "ptrdiff_t", so it is the expression "-1 > 1" which is checked. Figures 1 and 2 give the outlines of the transformations that take place.
Figure 1 - Transformations taking place in the 32-bit version of the code
Figure 2 - Transformations taking place in the 64-bit version of the code
If you need to make the code behave in the same way as before, you should change the type of the variable val_2:
ptrdiff_t val_1 = -1;
size_t val_2 = 1;
if (val_1 > val_2)
printf ("val_1 is greater than val_2\n");
else
printf ("val_1 is not greater than val_2\n");
Actually, it would be more correct not to compare signed and unsigned types at all, but this issue lies beyond the current topic.
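One possible way to avoid the implicit conversion altogether is to handle the sign explicitly. The helper below is a hypothetical sketch, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Compares a signed value with an unsigned one by mathematical value.
inline bool IsGreater(std::ptrdiff_t lhs, std::size_t rhs) {
  if (lhs < 0)
    return false;  // a negative number is never greater than an unsigned one
  return static_cast<std::size_t>(lhs) > rhs;  // safe: lhs is non-negative
}
```

With this helper the answer for -1 and 1 is the same on 32-bit and 64-bit systems.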
We have considered only simple expressions. But the described issues may occur when using other C++ constructs too:
extern int Width, Height, Depth;
size_t GetIndex(int x, int y, int z) {
return x + y * Width + z * Width * Height;
}
...
MyArray[GetIndex(x, y, z)] = 0.0f;
If there is a large array (containing more than INT_MAX items), this code will be incorrect and we will be directed to the wrong items of the array MyArray. Although it is the value of "size_t" type which is returned, the expression "x + y * Width + z * Width * Height" is calculated using the type "int". I think you have already guessed what the corrected code will look like:
extern int Width, Height, Depth;
size_t GetIndex(int x, int y, int z) {
return (size_t)(x) +
(size_t)(y) * (size_t)(Width) +
(size_t)(z) * (size_t)(Width) * (size_t)(Height);
}
Or a bit simpler:
extern int Width, Height, Depth;
size_t GetIndex(int x, int y, int z) {
return (size_t)(x) +
(size_t)(y) * Width +
(size_t)(z) * Width * Height;
}
In the next example again we have a mixture of a memsize-type (the pointer) and a 32-bit "unsigned" type:
extern char *begin, *end;
unsigned GetSize() {
return end - begin;
}
The result of the expression "end - begin" has the type "ptrdiff_t". Since the function returns the type "unsigned", there occurs an implicit type conversion that leads to a loss of the most significant bits of the result. So, if the pointers begin and end refer to the beginning and the end of an array whose size is more than UINT_MAX bytes (4 GB), the function will return an incorrect result.
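A hedged sketch of a corrected variant: the function returns a memsize-type, so no bits are lost. Taking the pointers as parameters instead of globals is a choice made here only to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Returns the distance between two pointers into the same array as size_t.
inline std::size_t GetSize(const char* begin, const char* end) {
  assert(begin <= end);  // precondition of this sketch
  return static_cast<std::size_t>(end - begin);
}
```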
And one more example. Here we are going to consider not a returned value but a formal argument of a function:
void foo(ptrdiff_t delta);
int i = -2;
unsigned k = 1;
foo(i + k);
This code resembles the example with incorrect pointer arithmetic discussed in the 13th lesson, doesn't it? Right, here we have the same issue. The expression "i + k" is calculated in the type "unsigned" and yields 0xFFFFFFFF. We get an incorrect result when this actual argument is implicitly extended to the type "ptrdiff_t": the function receives 4294967295 instead of -1.
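The effect can be reproduced on any platform if a 64-bit type is used for the parameter. In the sketch below std::int64_t stands in for a 64-bit ptrdiff_t, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Mirrors "foo(i + k)": the sum is computed in "unsigned" and widened afterwards.
inline std::int64_t MixedSum(int i, unsigned k) {
  return i + k;
}

// Both operands are widened first, so the sum keeps its sign.
inline std::int64_t WidenedSum(int i, unsigned k) {
  return static_cast<std::int64_t>(i) + static_cast<std::int64_t>(k);
}
```

For i = -2 and k = 1 the first function yields the largest value of "unsigned", while the second one yields -1.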
Diagnosis
Errors that occur on 64-bit systems when integer types and memsize-types are used together are present in many C++ syntactic constructs. To diagnose these errors several diagnostic warnings are used. PVS-Studio analyzer warns the programmer about possible errors with the help of these warnings: V101, V103, V104, V105, V106, V107, V109, V110, V121.
Let us return to the example we have considered earlier:
int c();
int d();
int a, b;
ptrdiff_t x = ptrdiff_t(a) * b * c() * d();
Although the expression itself multiplies together the arguments extending their types to "ptrdiff_t", an error may hide in the procedure of calculating these arguments. That is why the analyzer still warns you about the mixed types: "V104: Implicit type conversion to memsize type in an arithmetic expression".
PVS-Studio also allows you to find potentially unsafe expressions that hide behind explicit type conversions. By default, the analyzer does not generate warnings concerning explicit type conversions; to get them, you should enable the warnings V201 and V202. For example:
TCHAR *begin, *end;
unsigned size = static_cast<unsigned>(end - begin);
The warnings V201 and V202 will help you detect such incorrect code fragments.
Still the analyzer will pay no attention to type conversions which are safe from the viewpoint of the 64-bit code:
const int *constPtr;
int *ptr = const_cast<int *>(constPtr);
float f = float(constPtr[0]);
char ch = static_cast<char>(sizeof(double));
Lesson 18. Pattern 10. Storage of integer values in double
The type double has the capacity of 64 bits and is compatible with the standard IEEE-754 on 32-bit and 64-bit systems.
Note. IEEE 754 is a widely used standard for the floating-point number format, employed both in software and in many hardware (CPU and FPU) implementations of arithmetic operations. Many compilers use this standard to store floating-point numbers and perform mathematical operations on them.
Some programmers use the type double to store and work with integer types:
size_t a = size_t(-1);
double b = a;
--a;
--b;
size_t c = b; // x86: a == c
// x64: a != c
This code may be justified when it is executed on a 32-bit system because the significand of the type double has 52 explicitly stored bits and can hold a 32-bit integer value without loss. But when you save a 64-bit integer number into double, the exact value may be lost (see Figure 1).
Figure 1 - The number of significant bits in the types size_t and double
Perhaps an approximate number will do in your program, but I would like to warn you just in case that you may encounter such consequences on the new architecture. And in no case would I advise you to mix integer arithmetic and floating-point arithmetic.
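The loss of precision is easy to check. The following is a minimal sketch with a hypothetical helper; it uses std::uint64_t so that the behavior does not depend on the size of size_t.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the value is unchanged after a trip through double.
inline bool SurvivesDoubleRoundTrip(std::uint64_t value) {
  const double asDouble = static_cast<double>(value);
  // 2^64 itself does not fit into uint64_t; converting it back would be
  // undefined behavior, so such values are reported as "not preserved".
  if (asDouble >= 18446744073709551616.0)
    return false;
  return static_cast<std::uint64_t>(asDouble) == value;
}
```

Any 32-bit value passes this check, while 2^53 + 1 is already rounded to a neighboring representable number.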
Diagnosis
This error pattern is rather rare. However, these rare errors are in no way less dangerous. The analyzer PVS-Studio warns you about a potential error with the help of the diagnostic warning V113. If you need to find explicit type conversions (from memsize-types to double and vice versa), you may enable the warning V203.
Lesson 19. Pattern 11. Serialization and data interchange
Compatibility with existing data interchange protocols is an important part of porting a program solution to a new platform. You need to be able to read the existing projects' formats, exchange data between 32-bit and 64-bit processes, etc.
Basically, the errors of this pattern concern serialization of memsize-types and data interchange operations they are used in:
1) size_t PixelsCount;
fread(&PixelsCount, sizeof(PixelsCount), 1, inFile);
2) __int32 value_1;
SSIZE_T value_2;
inputStream >> value_1 >> value_2;
These samples contain errors of two kinds: using types of changeable capacity in binary interfaces and ignoring the byte order.
Using types of changeable capacity
Do not use types that change their sizes depending upon the development environment in binary data interchange interfaces. The C++ standard does not fix the sizes of the built-in integer types, so they cannot be used for this purpose. That is why developers of software development tools and programmers themselves create data types that have a strictly fixed size such as __int8, __int16, INT32, word64, etc.
These types enable data interchange between programs on various platforms although it requires some additional efforts. The two examples shown above are written incorrectly and it will become clear when some data types change their sizes from 32 bits to 64 bits. Keeping in mind the necessity of supporting obsolete data types, you may correct these samples in the following way:
1) size_t PixelsCount;
unsigned __int32 tmp;
fread(&tmp, sizeof(tmp), 1, inFile);
PixelsCount = static_cast<size_t>(tmp);
2) __int32 value_1;
__int32 value_2;
inputStream >> value_1 >> value_2;
But this way of correcting the code is not the best. The program may process more data after being ported to a 64-bit system, so 32-bit types in the code may become a great obstacle. In this case you may leave the obsolete code as it is to keep it compatible with the obsolete data format but fix the incorrect types. Then you may create a new binary data format taking the previous errors into consideration. One more way out is to abandon binary formats and use a text format or other formats provided by various libraries.
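The first corrected sample can be sketched with standard fixed-width types. This is an illustration only: the function names are hypothetical, std::uint32_t from <cstdint> is used in place of the compiler-specific types, and the field is still written in the host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>

// Writes the count as a 32-bit field; refuses values the old format cannot hold.
inline bool WritePixelsCount(std::ostream& out, std::size_t pixelsCount) {
  if (pixelsCount > std::numeric_limits<std::uint32_t>::max())
    return false;
  const std::uint32_t tmp = static_cast<std::uint32_t>(pixelsCount);
  out.write(reinterpret_cast<const char*>(&tmp), sizeof(tmp));
  return static_cast<bool>(out);
}

// Reads the 32-bit field and only then widens it to size_t.
inline bool ReadPixelsCount(std::istream& in, std::size_t& pixelsCount) {
  std::uint32_t tmp = 0;
  if (!in.read(reinterpret_cast<char*>(&tmp), sizeof(tmp)))
    return false;
  pixelsCount = static_cast<std::size_t>(tmp);
  return true;
}
```

The field occupies 4 bytes in the stream regardless of whether the program is built as 32-bit or 64-bit.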
Ignoring the byte order
Even after solving the issue of type sizes, you may encounter the problem of incompatibility of binary formats. The cause lies in a different data representation. It is most often related to a different byte order.
A byte order is a method of writing the bytes of multibyte numbers (see also Figure 1). The little-endian byte order means that the writing begins with the least-significant byte and ends with the most-significant byte. This writing order is employed in the memory of personal computers with x86 and x86-64-processors. The big-endian byte order means that the writing begins with the most-significant byte and ends with the least-significant byte. This order is a standard for TCP/IP protocols. That is why the big-endian byte order is often called the network byte order. This byte order is used in Motorola 68000 and SPARC processors.
Figure 1 - The byte order in a 64-bit type in little-endian and big-endian systems
While developing a binary interface or data format, you should remember about the byte order. If the 64-bit system you are porting your 32-bit application to has a byte order different from that of your application, you will have to adjust your code to this difference. To convert the big-endian byte order into the little-endian byte order and vice versa you may use the functions htonl(), htons(), bswap_64, etc.
Note. Many systems lack functions like bswap_64 while the function ntohl() allows you to reverse only 32-bit values. They forgot to add a version of this function for 64-bit types for some reason. If you need to change the byte order in a 64-bit variable, see the discussion of the topic "64 bit ntohl() in C++ ?" on stackoverflow.com site - there are several examples of how to implement this function.
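If no library function is available, a 64-bit byte swap can be written by hand. The sketch below is one possible portable implementation under a hypothetical name; it is not taken from the discussion mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Reverses the order of the 8 bytes in a 64-bit value.
inline std::uint64_t ByteSwap64(std::uint64_t value) {
  std::uint64_t result = 0;
  for (int i = 0; i < 8; ++i) {
    result = (result << 8) | (value & 0xFFu);  // append the lowest byte
    value >>= 8;                               // move on to the next byte
  }
  return result;
}
```

Applying the function twice gives back the original value, which makes it usable for both directions of the conversion.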
Diagnosis
Unfortunately, PVS-Studio does not provide diagnosis for this pattern of 64-bit errors because this process cannot be formalized (we cannot compose a diagnostic rule). The only thing we can recommend is to look through all the code fragments that are responsible for writing and reading data as well as for sending data to other processes, through the COM technology, for instance.
We would be glad if any of our readers proposed ideas on how to detect errors of this kind, at least partly.
Lesson 20. Pattern 12. Exceptions
Throwing and handling exceptions using integer types is bad practice in C++ programming. You should use more informative types for these purposes, for example types derived from the class std::exception. But sometimes you have to deal with low-quality code like this:
char *ptr1;
char *ptr2;
try {
try {
throw ptr2 - ptr1;
}
catch (int) {
std::cout << "catch 1: on x86" << std::endl;
}
}
catch (ptrdiff_t) {
std::cout << "catch 2: on x64" << std::endl;
}
You should be very attentive and avoid generation or processing of exceptions using memsize-types because it may result in changes of program logic. To correct this code you may replace "catch (int)" with "catch (ptrdiff_t)". A more correct way is to use a special class to pass the information about an error that has occurred.
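A special class for passing error information might look like the sketch below. The class and function names are hypothetical; the point is that the handler is selected by the class type, not by the size of an integer.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Carries the pointer distance inside a dedicated exception type.
class DistanceError : public std::runtime_error {
 public:
  explicit DistanceError(std::ptrdiff_t distance)
      : std::runtime_error("unexpected pointer distance"),
        distance_(distance) {}
  std::ptrdiff_t distance() const { return distance_; }

 private:
  std::ptrdiff_t distance_;
};

// Throws and catches the exception; the same handler runs on x86 and x64.
inline std::ptrdiff_t MeasureViaException(const char* ptr1, const char* ptr2) {
  std::ptrdiff_t caught = 0;
  try {
    throw DistanceError(ptr2 - ptr1);
  } catch (const DistanceError& e) {
    caught = e.distance();
  }
  return caught;
}
```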
Diagnosis
We have not encountered errors of this type in practice yet but the tool PVS-Studio can detect them. The diagnostic message V115 will be shown when an exception is generated with the help of a memsize-type, while the warning V116 will be generated when a memsize-type is used in catch operator.
Lesson 21. Pattern 13. Data alignment
Processors work more efficiently when the data are aligned properly and some processors cannot work with non-aligned data at all. When you try to work with non-aligned data on IA-64 (Itanium) processors, it will lead to generating an exception, as shown in the following example:
#pragma pack (1) // Also set by key /Zp in MSVC
struct AlignSample {
unsigned size;
void *pointer;
} object;
void foo(void *p) {
object.pointer = p; // Alignment fault
}
If you have to work with non-aligned data on Itanium, you should specify this explicitly to the compiler. For example, you may use a special macro UNALIGNED:
#pragma pack (1) // Also set by key /Zp in MSVC
struct AlignSample {
unsigned size;
void *pointer;
} object;
void foo(void *p) {
*(UNALIGNED PVOID *)&object.pointer = p; //Very slow
}
In this case the compiler generates a special code to deal with the non-aligned data. It is not very efficient since the access to the data will be several times slower. If your purpose is to make the structure's size smaller, you can get the best result arranging the data in decreasing order of their sizes. We will speak about it in more detail in one of the next lessons.
Exceptions are not generated when you address non-aligned data on the architecture x64 but you still should avoid them - first, because the access to these data is very much slower, and second, because you may want to port the program to the platform IA-64 in the future.
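Where the UNALIGNED macro is not available, a commonly used portable alternative is to copy the bytes with std::memcpy. Note that this is a different technique from the macro shown above, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Stores a pointer at an address that may be misaligned.
inline void StorePointerUnaligned(unsigned char* dest, void* p) {
  std::memcpy(dest, &p, sizeof(p));
}

// Loads a pointer from an address that may be misaligned.
inline void* LoadPointerUnaligned(const unsigned char* src) {
  void* p = nullptr;
  std::memcpy(&p, src, sizeof(p));
  return p;
}
```

The code never dereferences a misaligned pointer-to-pointer, so it does not depend on how the processor treats non-aligned data.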
Consider one more code sample that does not take data alignment into account:
struct MyPointersArray {
DWORD m_n;
PVOID m_arr[1];
} object;
...
malloc( sizeof(DWORD) + 5 * sizeof(PVOID) );
...
If we want to allocate an amount of memory needed to store an object of MyPointersArray type that contains 5 pointers, we should consider that the beginning of the array m_arr will be aligned on an 8-byte boundary. The arrangement of data in memory in various systems (Win32/Win64) is shown in Figure 1.
Figure 1 - Data alignment in memory in Win32 and Win64 systems
The correct calculation of the size looks as follows:
struct MyPointersArray {
DWORD m_n;
PVOID m_arr[1];
} object;
...
malloc( FIELD_OFFSET(struct MyPointersArray, m_arr) +
5 * sizeof(PVOID) );
...
In this code we find out the offset of the structure's last member and add the total size of the array items to it. You can find out the offset of a structure's or class's member with the help of the macro "offsetof" or FIELD_OFFSET.
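The difference between the two calculations can be sketched with standard types. Here std::uint32_t and void* stand in for DWORD and PVOID, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct PointersArray {
  std::uint32_t m_n;  // stands in for DWORD
  void* m_arr[1];     // stands in for PVOID m_arr[1]
};

// Size based on the real offset of m_arr, including any padding before it.
inline std::size_t AllocationSize(std::size_t count) {
  return offsetof(PointersArray, m_arr) + count * sizeof(void*);
}

// The naive calculation that ignores the padding after m_n.
inline std::size_t NaiveSize(std::size_t count) {
  return sizeof(std::uint32_t) + count * sizeof(void*);
}
```

When pointers are 8 bytes wide and aligned on an 8-byte boundary, the naive size for 5 pointers is 4 bytes short.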
Always use these macros to know the offset in the structure without relying on knowing the types' sizes and alignment. Here is an example of code where the address of a structure's member is calculated correctly:
struct TFoo {
DWORD_PTR whatever;
int value;
} object;
int *valuePtr =
(int *)((size_t)(&object) + offsetof(TFoo, value)); // OK
Linux-developers may encounter one more trouble related to alignment. You may learn what it is from our blog-post "Change of type alignment and the consequences".
Diagnosis
Since work with non-aligned data does not cause an error on the x64 architecture and only reduces performance, the tool PVS-Studio does not warn you about packed structures. But if the performance of an application is crucial to you, we recommend you to look through all the fragments in the program where "#pragma pack" is used. This is more relevant for the architecture IA-64 but PVS-Studio analyzer is not designed to verify programs for IA-64 yet. If you deal with Itanium-based systems and are planning to purchase PVS-Studio, write to us and we will discuss the issues of adapting our tool to IA-64 specifics.
PVS-Studio tool allows you to find errors related to calculation of objects' sizes and offsets. The analyzer detects dangerous arithmetic expressions containing several operators sizeof() (it signals a potential error). The number of the corresponding diagnostic message is V119.
However, it is correct in many cases to use several sizeof() operators in one expression and the analyzer ignores such constructs. Here is an example of safe expressions with several sizeof operators:
int MyArray[] = { 1, 2, 3 };
size_t MyArraySize =
sizeof(MyArray) / sizeof(MyArray[0]); //OK
assert(sizeof(unsigned) < sizeof(size_t)); //OK
size_t strLen = sizeof(String) - sizeof(TCHAR); //OK
Appendix
Figure 2 represents types' sizes and their alignment. To learn about objects' sizes and their alignment on various platforms, see the code sample given in the blog-post "Change of type alignment and the consequences".
Figure 2 - Types' sizes and their alignment.
Lesson 22. Pattern 14. Overloaded functions
When porting a 32-bit program to a 64-bit platform, you may encounter changes in its logic related to the use of overloaded functions. If a function is overloaded for 32-bit and 64-bit values, a call to it with an argument of a memsize-type will be resolved to different overloads on different systems. This technique may be useful, as in this code, for example:
static size_t GetBitCount(const unsigned __int32 &) {
return 32;
}
static size_t GetBitCount(const unsigned __int64 &) {
return 64;
}
size_t a;
size_t bitCount = GetBitCount(a);
But this change of logic is potentially dangerous. Imagine a program that uses a class to implement a stack. This class is special in that it allows you to store values of different types:
class MyStack {
...
public:
void Push(__int32 &);
void Push(__int64 &);
void Pop(__int32 &);
void Pop(__int64 &);
} stack;
ptrdiff_t value_1;
stack.Push(value_1);
...
int value_2;
stack.Pop(value_2);
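A careless programmer puts values of different types ("ptrdiff_t" and "int") into the stack and then takes them out of it. On a 32-bit system the sizes of these types coincide and everything works. On a 64-bit system the size of "ptrdiff_t" changes, so Push(__int64 &) is called while Pop(__int32 &) is still used, and more bytes get into the stack than are taken out of it.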
I think this type of errors is clear to you and you understand that one should be very careful about calls to overloaded functions when passing actual arguments of a memsize-type.
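The mismatch can be modeled on any platform with fixed-width types. ByteStack below is a hypothetical, simplified stand-in for MyStack that stores raw bytes; pushing a 64-bit value and popping a 32-bit one leaves 4 stray bytes behind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

class ByteStack {
 public:
  void Push(const std::int32_t& v) { PushBytes(&v, sizeof(v)); }
  void Push(const std::int64_t& v) { PushBytes(&v, sizeof(v)); }
  void Pop(std::int32_t& v) { PopBytes(&v, sizeof(v)); }
  void Pop(std::int64_t& v) { PopBytes(&v, sizeof(v)); }
  std::size_t Size() const { return bytes_.size(); }

 private:
  void PushBytes(const void* p, std::size_t n) {
    const unsigned char* b = static_cast<const unsigned char*>(p);
    bytes_.insert(bytes_.end(), b, b + n);
  }
  void PopBytes(void* p, std::size_t n) {
    assert(bytes_.size() >= n);  // popping more than was pushed is a bug
    std::memcpy(p, bytes_.data() + (bytes_.size() - n), n);
    bytes_.resize(bytes_.size() - n);
  }
  std::vector<unsigned char> bytes_;
};
```

In the original example the 64-bit overload is selected silently, simply because ptrdiff_t has become wider.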
Diagnosis
PVS-Studio does not diagnose this pattern of 64-bit errors. First, it is explained by the fact that we have not encountered such an error in a real application yet, and second, diagnosis of such constructs involves some difficulties. Please write to us if you encounter such an error in real code.
Lesson 23. Pattern 15. Growth of structures' sizes
A growth of structures' sizes is not an error by itself but it may lead to consumption of an unreasonably large memory amount and therefore to performance penalty. Let us consider this pattern not as an error but as a cause of 64-bit code inefficiency.
Data in structures of C++ language are aligned in such a way as to make the access to them most effective. Some microprocessors cannot address non-aligned data at all and the compiler has to generate a special code to deal with them. Those microprocessors that can address non-aligned data do it much less efficiently. That is why the C++ compiler leaves empty locations between structures' fields to align them on the addresses of machine words and therefore speed up the access to them. You may disable alignment using special #pragma directives to reduce the amount of memory being consumed but we are not interested in this way now. The amount of memory being used may often be greatly reduced by simply changing the order of fields in the structure without performance penalty.
Consider the following structure:
struct MyStruct
{
bool m_bool;
char *m_pointer;
int m_int;
};
This structure will take 12 bytes on a 32-bit system and we cannot make it less. Each field is aligned on a 4-byte boundary. Even if we move m_bool to the end, it will not change anything. The compiler will still make the structure's size multiple of 4 bytes to align such structures in arrays.
In the 64-bit build mode the structure MyStruct will take 24 bytes. It is clear. First there is one byte for m_bool and 7 vacant bytes for the purpose of alignment because a pointer takes 8 bytes and must be aligned on an 8-byte boundary. Then there are 4 bytes for m_int and 4 vacant bytes to align the structure on an 8-byte boundary.
Fortunately, we may easily fix it by moving m_bool to the end of the structure, as shown below:
struct MyStructOpt
{
char *m_pointer;
int m_int;
bool m_bool;
};
The structure MyStructOpt takes 16 bytes instead of 24. The arrangement of the fields is represented in Figure 1. This is a considerable saving if we use, for instance, 10 million items: we will save 80 Mbytes of memory and, what is more significant, we will enhance performance. If there are only a few structures, their sizes do not matter, and the access will be performed with the same speed. But when there are many items, such things as the cache, the number of memory accesses, etc. become significant. You may say with certainty that 160 Mbytes of data will take less time to process than 240 Mbytes. Even a simple read access to all the array items will be faster.
Figure 1 - Arrangement of the fields in the structures MyStruct and MyStructOpt
It is not always possible or convenient to change the order of fields in structures. But if there are millions of such structures, you must find some time for refactoring. The result of such simple optimization as changing the field order may be very great.
You may ask according to what rules the compiler aligns the data. We will answer briefly, but if you want to study this issue in more detail, read the book by Jeffrey Richter "Programming Applications for MS Windows". This question is considered rather thoroughly there.
In general, the alignment rule is as follows: each field is aligned on an address that is a multiple of this field's size. A field of size_t type on a 64-bit system will be aligned on an 8-byte boundary, int on a 4-byte boundary, short on a 2-byte boundary. Fields of char type are not aligned. The size of the structure itself is rounded up to a multiple of the size of its largest item. Let us explain this type of alignment by an example:
struct ABCD
{
size_t m_a;
char m_b;
};
The items will take 8 + 1 = 9 bytes. But if we want to create an array of structures ABCD[2], the size of the structure being 9 bytes, the field m_a of the second structure will lie on the non-aligned address. Therefore the compiler will add 7 empty bytes to the structure to make its size 16 bytes.
The process of optimizing a field arrangement may seem complicated. But there is a very simple and very effective method: you just need to arrange the fields in decreasing order of their sizes. This will be quite enough. In this case, the fields will be arranged without unnecessary gaps. For example, take the following structure of 40 bytes:
struct MyStruct
{
int m_int;
size_t m_size_t;
short m_short;
void *m_ptr;
char m_char;
};
By simply sorting the sequence of the fields in decreasing order of their sizes:
struct MyStructOpt
{
void *m_ptr;
size_t m_size_t;
int m_int;
short m_short;
char m_char;
};
we make this structure's size only 24 bytes.
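The effect of sorting the fields can be checked with sizeof. The sketch below uses fixed-width types as stand-ins (std::uint64_t for size_t and for the pointer on a 64-bit system), so the structures are hypothetical models of the ones above; the exact sizes still depend on the alignment rules of the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fields in the original, unsorted order.
struct Unsorted {
  std::int32_t m_int;
  std::uint64_t m_size_t;  // stands in for size_t
  std::int16_t m_short;
  std::uint64_t m_ptr;     // stands in for void *
  char m_char;
};

// The same fields in decreasing order of their sizes.
struct Sorted {
  std::uint64_t m_ptr;
  std::uint64_t m_size_t;
  std::int32_t m_int;
  std::int16_t m_short;
  char m_char;
};

// Number of bytes saved per object by reordering the fields.
inline std::size_t BytesSaved() {
  return sizeof(Unsorted) - sizeof(Sorted);
}
```

With 8-byte alignment for the 64-bit fields the two structures take 40 and 24 bytes, which agrees with the numbers given above.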
Diagnosis
The tool PVS-Studio allows you to find structures in the code of 64-bit applications, whose sizes may be reduced by rearranging the fields in them. The analyzer generates the diagnostic message V401 on non-optimal structures.
The analyzer does not always generate a warning about inefficient structures because it tries to avoid too many unnecessary warnings. For example, the analyzer does not generate a message for complex derived classes because there are usually very few objects of such classes. For example:
class MyWindow : public CWnd {
bool m_isActive;
size_t m_sizeX, m_sizeY;
char m_color[3];
...
};
You may reduce this structure's size but there is no practical sense in it.
Lesson 24. Phantom errors
We have finished studying the patterns of 64-bit errors and the last thing we will speak about, concerning these errors, is in what ways they may occur in programs.
The point is that it is not so easy to demonstrate by an example, such as the following code sample, that 64-bit code will cause an error when "N" takes large values:
size_t N = ...
for (int i = 0; i != N; ++i)
{
...
}
You may try such a simple sample and see that it works. What matters is the way the optimizing compiler builds the code. Whether the code works or not depends upon the size of the loop's body. In examples the body is always small, and 64-bit registers may be used for counters. In real programs with large loop bodies an error easily occurs when the compiler saves the value of the variable "i" in memory. Now let us figure out what the rather obscure text you have just read means.
When describing the errors, we often used the term "a potential error" or the phrase "an error may occur". In general, it is explained by the fact that one and the same code may be considered both correct and incorrect depending upon its purpose. Here is a simple example: using a variable of "int" type to index array items. If we address an array of graphics windows with this variable, everything is okay. We do not need to, or, rather, simply cannot work with billions of windows. But when we use a variable of "int" type to index array items in 64-bit mathematical programs or databases, we may encounter troubles when the number of the items exceeds the range 0..INT_MAX.
But there is one more, subtler, reason for calling the errors "potential": whether an error reveals itself or not depends not only upon the input data but the mood of the compiler's optimizer. Most of the errors we have considered in our lessons easily reveal themselves in debug-versions and remain "potential" in release-versions. But not every program built in the debug mode can be debugged at large data amounts. There might be a case when the debug-version is tested only at small data sets while the exhaustive testing and final user testing at real data are performed in the release-version where the errors may stay hidden.
We first encountered the specifics of the optimizing Visual C++ 2005 compiler when preparing the program OmniSample. This is a project included in the PVS-Studio distribution kit and intended to demonstrate all the errors diagnosed by the Viva64 analyzer. The samples included in this project must work correctly in 32-bit mode and cause errors in 64-bit mode. Everything was all right in the debug-version, but the release-version caused some trouble. Code that should have hung or crashed in 64-bit mode worked! The reason lay in optimization. The way out was to complicate the samples' code with additional constructs and to add the keyword "volatile", which you may see in the code of the project OmniSample.
The same holds for Visual C++ 2008/2010. The generated code will of course differ a bit, but everything written here applies both to Visual C++ 2005 and to Visual C++ 2008/2010.
If you think it is a good thing that some errors do not reveal themselves, put this idea out of your head. Code with such errors is very unstable. Any subtle change, even one not directly related to the error, may change the program's behavior. Just in case, I want to point out that this is not the compiler's fault: the reason lies in hidden defects in the code. Below we will show you some samples with phantom errors that disappear and reappear with subtle code changes in release-versions, and hunting for them may be very long and tiresome.
Consider the first code sample, which works in the release-version although it should not:
int index = 0;
size_t arraySize = ...;
for (size_t i = 0; i != arraySize; i++)
array[index++] = BYTE(i);
This code correctly fills the whole array with values even if the array's size is much larger than INT_MAX. In theory this is impossible because the variable index has "int" type: sooner or later an overflow must lead to accessing items by a negative index. But optimization gives us the following code:
0000000140001040 mov byte ptr [rcx+rax],cl
0000000140001043 add rcx,1
0000000140001047 cmp rcx,rbx
000000014000104A jne wmain+40h (140001040h)
As you may see, 64-bit registers are used and there is no overflow. But let us alter the code very slightly:
int index = 0;
size_t arraySize = ...;
for (size_t i = 0; i != arraySize; i++)
{
array[index] = BYTE(index);
++index;
}
Suppose the code looks nicer this way. I think you will agree that it is functionally the same. But the result is quite different: a program crash. Consider the code generated by the compiler:
0000000140001040 movsxd rcx,r8d
0000000140001043 mov byte ptr [rcx+rbx],r8b
0000000140001047 add r8d,1
000000014000104B sub rax,1
000000014000104F jne wmain+40h (140001040h)
This is the very overflow that should have occurred in the previous example. The value of the register r8d = 0x80000000 is sign-extended into rcx as 0xffffffff80000000. The result is a write outside the array.
Here is another example of optimization and of how easy it is to spoil everything:
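The effect of the movsxd instruction can be reproduced in portable code. The following is a minimal sketch rather than the compiler's actual logic, and the function name SignExtendedOffset is hypothetical. It widens a 32-bit signed value to 64 bits in the same way, copying the sign bit into the upper half.

```cpp
#include <cstdint>

// Hypothetical helper: mimics movsxd by widening a signed 32-bit index to
// 64 bits. The sign bit is replicated into the upper 32 bits.
std::uint64_t SignExtendedOffset(std::int32_t index) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(index));
}
```

Passing INT32_MIN, whose bit pattern is 0x80000000, yields 0xffffffff80000000, the same value that ends up in rcx in the listing above. Small non-negative indexes pass through unchanged, which is why the defect stays invisible on small data sets.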
unsigned index = 0;
for (size_t i = 0; i != arraySize; ++i) {
array[index++] = 1;
if (array[i] != 1) {
printf("Error\n");
break;
}
}
This is the assembler code:
0000000140001040 mov byte ptr [rdx],1
0000000140001043 add rdx,1
0000000140001047 cmp byte ptr [rcx+rax],1
000000014000104B jne wmain+58h (140001058h)
000000014000104D add rcx,1
0000000140001051 cmp rcx,rdi
0000000140001054 jne wmain+40h (140001040h)
The compiler has decided to use the 64-bit register rdx to store the variable index. As a result, the code can correctly process an array containing more than UINT_MAX items.
But the peace is fragile. Just make the code a bit more complex and it will become incorrect:
volatile unsigned volatileVar = 1;
...
unsigned index = 0;
for (size_t i = 0; i != arraySize; ++i) {
array[index] = 1;
index += volatileVar;
if (array[i] != 1) {
printf("Error\n");
break;
}
}
Using the expression "index += volatileVar;" instead of "index++" makes the compiler use 32-bit registers for the index, and this causes the overflows:
0000000140001040 mov ecx,r8d
0000000140001043 add r8d,dword ptr [volatileVar (140003020h)]
000000014000104A mov byte ptr [rcx+rax],1
000000014000104E cmp byte ptr [rdx+rax],1
0000000140001052 jne wmain+5Fh (14000105Fh)
0000000140001054 add rdx,1
0000000140001058 cmp rdx,rdi
000000014000105B jne wmain+40h (140001040h)
Finally, let us consider an interesting but large example. Unfortunately, we cannot make it shorter because we need to preserve the behavior we want to show you. What makes these errors especially dangerous is that it is impossible to predict what a slight change in the code might lead to.
ptrdiff_t UnsafeCalcIndex(int x, int y, int width) {
int result = x + y * width;
return result;
}
...
int domainWidth = 50000;
int domainHeight = 50000;
for (int x = 0; x != domainWidth; ++x)
for (int y = 0; y != domainHeight; ++y)
array[UnsafeCalcIndex(x, y, domainWidth)] = 1;
This code cannot fill the array consisting of 50000*50000 items correctly. It cannot do so because an overflow must occur when calculating the expression "int result = x + y * width;".
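The arithmetic behind this claim is easy to check when the index is computed in a 64-bit type. The sketch below uses two hypothetical helpers (LargestRequestedIndex and IndexFitsInt are illustrative names, not part of the course code) to find the largest index the two loops request and to test whether it fits into "int".

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper: the largest value of x + y * width that the loops
// produce, i.e. for x = width - 1 and y = height - 1, computed in 64 bits.
std::int64_t LargestRequestedIndex(std::int64_t width, std::int64_t height) {
  return (width - 1) + (height - 1) * width;
}

// Hypothetical helper: true if the index is representable as "int".
bool IndexFitsInt(std::int64_t index) {
  return index >= INT_MIN && index <= INT_MAX;
}
```

For a 50000 by 50000 domain the largest index is 2499999999, which is greater than INT_MAX (2147483647 for a 32-bit "int"), so IndexFitsInt returns false and the "int" expression inside UnsafeCalcIndex cannot hold the result.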
By a miracle, the array is filled correctly in the release-version. The function UnsafeCalcIndex is inlined into the loop, where 64-bit registers are used:
0000000140001052 test rsi,rsi
0000000140001055 je wmain+6Ch (14000106Ch)
0000000140001057 lea rcx,[r9+rax]
000000014000105B mov rdx,rsi
000000014000105E xchg ax,ax
0000000140001060 mov byte ptr [rcx],1
0000000140001063 add rcx,rbx
0000000140001066 sub rdx,1
000000014000106A jne wmain+60h (140001060h)
000000014000106C add r9,1
0000000140001070 cmp r9,rbx
0000000140001073 jne wmain+52h (140001052h)
All this happened because the function UnsafeCalcIndex is simple and can be easily inlined. But when you make it a bit more complex, or the compiler decides that it should not be inlined, an error will occur that reveals itself on large amounts of data.
Let us modify (complicate) the function UnsafeCalcIndex a bit. Note that the function's logic has not been changed in the least:
ptrdiff_t UnsafeCalcIndex(int x, int y, int width) {
int result = 0;
if (width != 0)
result = y * width;
return result + x;
}
The result is a crash caused by an access outside the array:
0000000140001050 test esi,esi
0000000140001052 je wmain+7Ah (14000107Ah)
0000000140001054 mov r8d,ecx
0000000140001057 mov r9d,esi
000000014000105A xchg ax,ax
000000014000105D xchg ax,ax
0000000140001060 mov eax,ecx
0000000140001062 test ebx,ebx
0000000140001064 cmovne eax,r8d
0000000140001068 add r8d,ebx
000000014000106B cdqe
000000014000106D add rax,rdx
0000000140001070 sub r9,1
0000000140001074 mov byte ptr [rax+rdi],1
0000000140001078 jne wmain+60h (140001060h)
000000014000107A add rdx,1
000000014000107E cmp rdx,r12
0000000140001081 jne wmain+50h (140001050h)
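For contrast, here is a sketch of a variant that does not depend on the optimizer. It rests on an assumption not stated in this lesson: the remedy is to perform the whole calculation in a memsize-type. The name SafeCalcIndex is hypothetical.

```cpp
#include <cstddef>

// Hypothetical counterpart of UnsafeCalcIndex: all operands are ptrdiff_t,
// so on a 64-bit system the multiplication and addition are done in 64 bits
// regardless of whether the function is inlined.
std::ptrdiff_t SafeCalcIndex(std::ptrdiff_t x, std::ptrdiff_t y,
                             std::ptrdiff_t width) {
  return x + y * width;
}
```

Calling SafeCalcIndex(x, y, domainWidth) with "int" arguments converts them to ptrdiff_t before any arithmetic takes place, so on a 64-bit system the largest index of a 50000 by 50000 domain is computed without overflow in debug and release builds alike.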
I hope we have managed to show you how easily a working 64-bit program may stop working after a harmless-looking correction or after being built with a different version of the compiler.
You will now also understand some of the strange constructs and peculiarities in the code of the OmniSample project: they were added deliberately so that the simple examples demonstrate an error even when code optimization is enabled.
Lesson 25. Working with patterns of 64-bit errors in practice