This commit is contained in:
Miloslav Ciz 2023-02-12 14:35:35 +01:00
parent 68f77b2118
commit 3de5192a17
9 changed files with 44 additions and 22 deletions

View file

@ -6,11 +6,11 @@ Unless specified otherwise, this article supposes the C99 standard of the C lang
## Undefined/Unspecified Behavior
Undefined, unspecified and implementation-defined behaviors are kinds of unpredictable and sometimes non-intuitive behavior of certain operations that may differ between compilers, platforms or runs because they are not defined by the language specification; this is mostly done on purpose as to allow some implementation freedom which allows implementing the language in a way that is most efficient on given platform. This behavior may be completely random (unpredictable) or implementation-specified (consistent within each implementation but potentially different between implementations). In any case, one has to be very careful about letting such behavior influence computations. Note that tools such as [cppcheck](cppcheck.md) can help find undefined behavior in code. Description of some of these behaviors follow.
Undefined, unspecified and implementation-defined behaviors are kinds of unpredictable and sometimes non-intuitive behavior of certain operations that may differ between compilers, platforms or runs because they are not defined by the language specification; this is mostly done on purpose as to allow some implementation freedom which allows implementing the language in a way that is most efficient on given platform. This behavior may be completely random (unpredictable) or implementation-specified (consistent within each implementation but potentially different between implementations). In any case, one has to be very careful about letting such behavior influence computations. Note that tools such as [cppcheck](cppcheck.md) can help find undefined behavior in code. Description of some such behavior follows.
**Data type sizes including int and char may not be the same on each platform**. Even though we almost take it for granted than char is 8 bits wide, in theory it can be wider. The int (and unsigned int) type width should reflect the architectures native integer type, so nowadays mostly it's mostly 32 or 64 bits. To deal with this we can use the standard library `limits.h` and `stdint.h` headers.
**No specific [endianness](endian.md) is enforced**. Nowadays little endian is what you'll encounter on most platforms, but e.g. [PowerPC](ppc.md) uses big endian.
**No specific [endianness](endian.md) or even encoding of numbers is specified**. Nowadays little endian and [two's complement](twos_complement.md) is what you'll encounter on most platforms, but e.g. [PowerPC](ppc.md) uses big endian ordering.
**Order of evaluation of operands and function arguments is not specified**. I.e. in an expression or function call it is not defined which operands or arguments will be evaluated first, the order may be completely random and the order may differ even when evaluating the same expression at another time. This is demonstrated by the following code:
@ -32,9 +32,9 @@ int main(void)
}
```
**Overflow behavior of signed type operations is not specified.** Sometimes we suppose that e.g. addition of two signed integers that are past the data type's limit will produce two's complement overflow, but in fact this operation's behavior is undefined, C99 doesn't say what representation should be used for numbers. For [portability](portability.md), predictability and safety **it is safer to use unsigned types**.
**Overflow behavior of signed type operations is not specified.** Sometimes we suppose that e.g. addition of two signed integers that are past the data type's limit will produce two's complement overflow (wrap around), but in fact this operation's behavior is undefined, C99 doesn't say what representation should be used for numbers. For [portability](portability.md), predictability and safety **it is safer to use unsigned types** (but safety may come at the cost of performance, i.e. you prevent compiler from performing some optimizations based on undefined behavior).
**Bit shifts by type width or more are undefined.** Also bit shifts by negative values are undefined. So e.g. `x >> 8` is undefined if width of the data type of `x` is 8 bits.
**Bit shifts by type width or more are undefined.** Also bit shifts by negative values are undefined. So e.g. `x >> 8` is undefined if width of the data type of `x` is 8 bits or fewer.
**Char data type signedness is not defined**. The signedness can be explicitly "forced" by specifying `signed char` or `unsigned char`.
@ -44,11 +44,23 @@ Besides being extra careful about writing memory safe code, one needs to also kn
## Different Behavior Between C And C++ (And Different C Standards)
C is **not** a subset of C++, i.e. not every C program is a C++ program (for simple example imagine a C program in which we use the word `class` as an identifier). Furthermore a C program that is at the same time also a C++ program may behave differently when compiled as C vs C++. Of course, all of this may also apply between different standards of C, not just between C and C++.
C is **not** a subset of C++, i.e. not every C program is a C++ program (for simple example imagine a C program in which we use the word `class` as an identifier: it is a valid C program but not a C++ program). Furthermore a C program that is at the same time also a C++ program may behave differently when compiled as C vs C++, i.e. there may be a [semantic](semantics.md) difference. Of course, all of this may also apply between different standards of C, not just between C and C++.
For portability sake it is good to try to write C code that will also compile as C++ (and behave the same). For this we should know some basic differences in behavior between C and C++.
TODO: specific examples
One difference lies for example in [pointers](pointer.md) to string literals. While in C it is possible to have non-const pointers such as
```
char *s = "abc";
```
C++ requires any such pointer to be `const`, i.e.:
```
const char *s = "abc";
```
TODO: more examples
## Compiler Optimizations