From 3b27159e33efe497d48b55de2df1632320bdc786 Mon Sep 17 00:00:00 2001 From: Miloslav Ciz Date: Mon, 20 Mar 2023 21:29:34 +0100 Subject: [PATCH] Update --- c_pitfalls.md | 10 +++++----- function.md | 8 +++++--- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/c_pitfalls.md b/c_pitfalls.md index 1d4557a..873919a 100644 --- a/c_pitfalls.md +++ b/c_pitfalls.md @@ -6,9 +6,9 @@ Unless specified otherwise, this article supposes the C99 standard of the C lang ## Undefined/Unspecified Behavior -Undefined, unspecified and implementation-defined behaviors are kinds of unpredictable and sometimes non-intuitive behavior of certain operations that may differ between compilers, platforms or runs because they are not defined by the language specification; this is mostly done on purpose as to allow some implementation freedom which allows implementing the language in a way that is most efficient on given platform. This behavior may be completely random (unpredictable) or implementation-specified (consistent within each implementation but potentially different between implementations). In any case, one has to be very careful about letting such behavior influence computations. Note that tools such as [cppcheck](cppcheck.md) can help find undefined behavior in code. Description of some such behavior follows. +Undefined (completely unpredictable), unspecified (safe but potentially differing) and implementation-defined (consistent within implementation but potentially differing between them) behavior poses a kind of unpredictability and sometimes non-intuitive, tricky behavior of certain operations that may differ between compilers, platforms or runs because they are not exactly described by the language specification; this is mostly done on purpose so as to allow some implementation freedom which allows implementing the language in a way that is most efficient on given platform. One has to be very careful about not letting such behavior break the program on platforms different from the one the program is developed on. Note that tools such as [cppcheck](cppcheck.md) can help find undefined behavior in code. Description of some such behavior follows. -**Data type sizes including int and char may not be the same on each platform**. Even though we almost take it for granted than char is 8 bits wide, in theory it can be wider. The int (and unsigned int) type width should reflect the architectures native integer type, so nowadays mostly it's mostly 32 or 64 bits. To deal with this we can use the standard library `limits.h` and `stdint.h` headers. +**Data type sizes including int and char may not be the same on each platform**. Even though we almost take it for granted that char is 8 bits wide, in theory it can be wider (even though `sizeof(char)` is always 1). The int (and unsigned int) type width should reflect the architectures native integer type, so nowadays it's mostly 32 or 64 bits. To deal with this we can use the standard library `limits.h` and `stdint.h` headers. **No specific [endianness](endian.md) or even encoding of numbers is specified**. Nowadays little endian and [two's complement](twos_complement.md) is what you'll encounter on most platforms, but e.g. [PowerPC](ppc.md) uses big endian ordering. @@ -32,13 +32,13 @@ int main(void) } ``` -**Overflow behavior of signed type operations is not specified.** Sometimes we suppose that e.g. addition of two signed integers that are past the data type's limit will produce two's complement overflow (wrap around), but in fact this operation's behavior is undefined, C99 doesn't say what representation should be used for numbers. For [portability](portability.md), predictability and safety **it is safer to use unsigned types** (but safety may come at the cost of performance, i.e. you prevent compiler from performing some optimizations based on undefined behavior). +**Overflow behavior of signed type operations is not specified.** Sometimes we suppose that e.g. addition of two signed integers that are past the data type's limit will produce two's complement overflow (wrap around), but in fact this operation's behavior is undefined, C99 doesn't say what representation should be used for numbers. For [portability](portability.md), predictability and preventing bugs **it is safer to use unsigned types** (but safety may come at the cost of performance, i.e. you prevent compiler from performing some optimizations based on undefined behavior). **Bit shifts by type width or more are undefined.** Also bit shifts by negative values are undefined. So e.g. `x >> 8` is undefined if width of the data type of `x` is 8 bits or fewer. **Char data type signedness is not defined**. The signedness can be explicitly "forced" by specifying `signed char` or `unsigned char`. -**[Floating point](float.md) results are not completely defined**, no representation (such as IEEE 754) is defined and there may appear small differences in floating operations under different machines or e.g. compiler optimization settings -- this may lead to [nondeterminism](determinism.md). +**[Floating point](float.md) results are not precisely specified**, no representation (such as IEEE 754) is specified and there may appear small differences in float operations under different machines or e.g. compiler optimization settings -- this may lead to [nondeterminism](determinism.md). ## Memory Unsafety @@ -66,7 +66,7 @@ TODO: more examples ## Compiler Optimizations -C compilers perform automatic optimizations and and other transformations of the code, especially when you tell it to optimize aggressively (`-O3`) which is a standard practice to make programs run faster. However this makes compilers perform a lot of [magic](magic.md) and may lead to unexpected and unintuitive undesired behavior such as bug or even the "unoptimization of code". { I've seen a code I've written have bigger size when I set the `-Os` flag (optimize for smaller size). ~drummyfish } +C compilers perform automatic optimizations and other transformations of the code, especially when you tell it to optimize aggressively (`-O3`) which is a standard practice to make programs run faster. However this makes compilers perform a lot of [magic](magic.md) and may lead to unexpected and unintuitive undesired behavior such as bugs or even the "unoptimization of code". { I've seen a code I've written have bigger size when I set the `-Os` flag (optimize for smaller size). ~drummyfish } Aggressive optimization may firstly lead to tiny bugs in your code manifesting in very weird ways, it may happen that a line of code somewhere which may somehow trigger some tricky [undefined behavior](undefined_behavior.md) may cause your program to crash in some completely different place. Compilers exploit undefined behavior to make all kinds of big brain reasoning and when they see code that MAY lead to undefined behavior a lot of chain reasoning may lead to very weird compiled results. Remember that undefined behavior, such as overflow when adding signed integers, doesn't mean the result is undefined, it means that ANYTHING CAN HAPPEN, the program may just start printing nonsensical stuff on its own or your computer may explode. So it may happen that the line with undefined behavior will behave as you expect but somewhere later on the program will just shit itself. For these reasons if you encounter a very weird bug, try to disable optimizations and see if it goes away -- if it does, you may be dealing with this kind of stuff. Also check your program with tools like [cppcheck](cppcheck.md). diff --git a/function.md b/function.md index 68a16f4..b9d6a18 100644 --- a/function.md +++ b/function.md @@ -4,7 +4,7 @@ Function is a very basic term in [mathematics](math.md) and [programming](progra ## Mathematical Functions -In mathematics functions can be defined and viewed from different angles but a function is basically anything that assigns each member of some set *A* (so called *domain*) exactly one member of a potentially different set *B* (so called *codomain*). A typical example of a function is an equation that from one "input number" computes another number, for example: +In mathematics functions can be defined and viewed from different angles, but it is essentially anything that assigns each member of some [set](set.md) *A* (so called *domain*) exactly one member of a potentially different set *B* (so called *codomain*). A typical example of a function is an equation that from one "input number" computes another number, for example: *f(x) = x / 2* @@ -50,9 +50,11 @@ Plotting functions of multiple parameters is more difficult because we need more Functions can have certain properties such as: -- being **[bijective](bijection.md)**: A bijective function pairs exactly one element from the domain with one element from codomain and vice versa, i.e. for every result (element of codomain) of the function it is possible to unambiguously say which input created it. For bijective functions we can create **[inverse functions](inverse_function.md)** that reverse the mapping (e.g. [arcus sine](asin.md) is the inverse of a [sin](sin.md) function that's limited to the interval where it is bijective). For example *f(x) = 2 * x* is bijective with its inverse function being *f^(-1)(x) = x / 2*, but *f2(x) = x^2* is not bijective because e.g. both 1 and -1 give the result of 1. +- being **[bijective](bijection.md)**: Pairs exactly one element from the domain with one element from codomain and vice versa, i.e. for every result (element of codomain) of the function it is possible to unambiguously say which input created it. For bijective functions we can create **[inverse functions](inverse_function.md)** that reverse the mapping (e.g. [arcus sine](asin.md) is the inverse of a [sin](sin.md) function that's limited to the interval where it is bijective). For example *f(x) = 2 * x* is bijective with its inverse function being *f^(-1)(x) = x / 2*, but *f2(x) = x^2* is not bijective because e.g. both 1 and -1 give the result of 1. - being an **[even function](even_function.md)**: For this function it holds that *f(x) = f(-x)*, i.e. the plotted function is symmetric by the vertical axis. Example is the [cosine](cos.md) function. - being an **[odd function](odd_function.md)**: For this function it holds that *-f(x) = f(-x)*, i.e. the plotted function is symmetric by the center point [0,0]. Example is the [sine](sin.md) function. +- being **[differentiable](differentiable.md)**: Its [derivative](derivative.md) is defined everywhere. +- ... In context of functions we may encounter the term [composition](composition.md) which simply means chaining the functions. E.g. the composition of functions *f(x)* and *g(x)* is written as *(f o g)(x)* which is the same as *f(g(x))*. @@ -79,7 +81,7 @@ Functions commonly used in mathematics range from the trivial ones (such as the ## Programming Functions -In programming the definition of a function is less strict, even though some languages, namely [functional](functional.md) ones, are built around purely mathematical functions -- for distinction we call these strictly mathematical functions **pure**. In traditional languages functions may or may not be pure, a function here normally means a **subprogram** which can take parameters and return a value, just as a mathematical function, but it can further break some of the rules of mathematical functions -- for example it may have so called **[side effects](side_effect.md)**, i.e. performing additional actions besides just returning a number (such as modifying data in memory, printing something to the screen etc.), or use randomness and internal states, i.e. potentially returning different numbers when invoked (called) multiple times with exactly the same parameters. These functions are called **impure**; in programming a *function* without an adjective is implicitly expected to be impure. Thanks to allowing side effects these functions don't have to actually return any value, their purpose may be to just invoke some behavior such as writing something to the screen, initializing some hardware etc. The following piece of code demonstrates this in [C](c.md): +In programming the definition of a function is less strict, even though some languages, namely [functional](functional.md) ones, are built around purely mathematical functions -- for distinction we call these strictly mathematical functions **pure**. In traditional languages functions may or may not be pure, a function here normally means a **subprogram** which can take parameters and return a value, just as a mathematical function, but it can further break some of the rules of mathematical functions -- for example it may have so called **[side effects](side_effect.md)**, i.e. performing additional actions besides just returning a number (such as modifying data in memory which can be read by others, printing something to the screen etc.), or use randomness and internal states, i.e. potentially returning different numbers when invoked (called) multiple times with exactly the same arguments. These functions are called **impure**; in programming a *function* without an adjective is implicitly expected to be impure. Thanks to allowing side effects these functions don't have to actually return any value, their purpose may be to just invoke some behavior such as writing something to the screen, initializing some hardware etc. The following piece of code demonstrates this in [C](c.md): ``` int max(int a, int b, int c) // pure function