This commit is contained in:
Miloslav Ciz 2022-09-19 08:56:30 +02:00
parent 3bef179616
commit 561ddf8822
20 changed files with 309 additions and 19 deletions

View file

@ -10,7 +10,7 @@ These are mainly for [C](c.md), but may be usable in other languages as well.
- **gprof is a utility you can use to profile your code**.
- **`<stdint.h>` has fast type nicknames**, types such as `uint_fast32_t` which picks the fastest type of at least given width on given platform.
- **Keywords such as `inline`, `static` and `const` can help compiler optimize well**.
- **Optimize the [bottlenecks](bottleneck.md)!** Optimizing in the wrong place is a complete waste of time. If you're optimizing a part of code that's taking 1% of your program's run time, you will never speed up your program by more than that 1% even if you speed up the specific part by 10000%. Bottlenecks are usually inner-most loops of the main program loop, you can identify them with [profiling](profiling.md).
- **Optimize the [bottlenecks](bottleneck.md)!** Optimizing in the wrong place is a complete waste of time. If you're optimizing a part of code that's taking 1% of your program's run time, you will never speed up your program by more than that 1% even if you speed up the specific part by 10000%. Bottlenecks are usually inner-most loops of the main program loop, you can identify them with [profiling](profiling.md). Generally initialization code that runs only once in a long time doesn't need much optimization -- no one is going to care if a program starts up 1 millisecond faster (but of course in special cases such as launching many processes this may start to matter).
- **You can almost always trade space (memory usage) for time (CPU demand) and vice versa** and you can also fine-tune this. You typically gain speed by precomputation (look up tables, more demanding on memory) and memory with compression (more demanding on CPU).
- **Be smart, use [math](math.md)**. Example: let's say you want to compute the radius of a zero-centered [bounding sphere](bounding_sphere.md) of an *N*-point [point cloud](point_cloud.md). Naively you might be computing the Euclidean distance (*sqrt(x^2 + y^2 + z^2)*) to each point and taking a maximum of them, however you can just find the maximum of squared distances (*x^2 + y^2 + z^2*) and return a square root of that maximum. This saves you a computation of *N - 1* square roots.
- **Learn about [dynamic programming](dynamic_programming.md)**.
@ -31,6 +31,7 @@ These are mainly for [C](c.md), but may be usable in other languages as well.
- For the sake of embedded platforms **avoid [floating point](floating_point.md)** as that is often painfully slowly emulated in software. Use [fixed point](fixed_point.md).
- **Early branching can create a speed up** (instead of branching inside the loop create two versions of the loop and branch in front of them). This is a kind of space-time tradeoff.
- **Reuse variable to save space**. A warning about this one: readability may suffer, mainstreamers will tell you you're going against "good practice", and some compilers may do this automatically anyway. Be sure to at least make this clear in your comments. Anyway, on a lower level and/or with dumber compilers you can just reuse variables that you used for something else rather than creating a new variable that takes additional RAM; the only prerequisite for "merging" variables is that the variables aren't used at the same time.
- **What's fast on one platform may be slow on another**. This depends on the instruction set as well as on compiler, operating system, quirks of the hardware and other details. You always need to test on the hardware itself.
- **You can optimize critical parts of code in [assembly](assembly.md)**, i.e. manually write the assembly code that takes most of the running time of the program, with as few and as inexpensive instructions as possible (but beware, popular compilers are very smart and it's often hard to beat them). But note that such code loses portability! So ALWAYS have a C (or whatever language you are using) [fallback](fallback.md) code for other platforms, use [ifdefs](ifdef.md) to switch to the fallback version on platforms running on different assembly languages.
## When To Actually Optimize?