Update
This commit is contained in:
parent
68f77b2118
commit
3de5192a17
9 changed files with 44 additions and 22 deletions
|
@ -37,6 +37,7 @@ These are mainly for [C](c.md), but may be usable in other languages as well.
|
|||
- **What's fast on one platform may be slow on another**. This depends on the instruction set as well as on compiler, operating system, available hardware, [driver](driver.md) implementation and other details. In the end you always need to test on the specific platform to be sure about how fast it will run. A good approach is to optimize for the weakest platform you want to support -- if it runs fasts on a weak platform, a "better" platform will most likely still run it fast.
|
||||
- **Mental calculation tricks**, e.g. multiplying by one less or more than a power of two is equal to multiplying by power of two and subtracting/adding once, for example *x * 7 = x * 8 - x*; the latter may be faster as a multiplication by power of two (bit shift) and addition/subtraction may be faster than single multiplication, especially on some primitive platform without hardware multiplication. However this needs to be tested on the specific platform. Smart compilers perform these optimizations automatically, but not every compiler is high level and smart.
|
||||
- **Else should be the less likely branch**, try to make if conditions so that the if branch is the one with higher probability of being executed -- this can help branch prediction.
|
||||
- **You can save space by "squeezing" variables** -- this is a space-time tradeoff, it's a no brainer but nubs may be unaware of it -- for example you may store 2 4bit values in a single `char` variable (8bit data type), one in the lower 4bits, one in the higher 4bits (use bit shifts etc.). So instead of 16 memory-aligned booleans you may create one `int` and use its individual bits for each boolean value. This is useful in environments with extremely limited RAM such as 8bit Arduinos.
|
||||
- **You can optimize critical parts of code in [assembly](assembly.md)**, i.e. manually write the assembly code that takes most of the running time of the program, with as few and as inexpensive instructions as possible (but beware, popular compilers are very smart and it's often hard to beat them). But note that such code loses [portability](portability.md)! So ALWAYS have a C (or whatever language you are using) [fallback](fallback.md) code for other platforms, use [ifdefs](ifdef.md) to switch to the fallback version on platforms running on different assembly languages.
|
||||
- **[Parallelism](parallelism.md) ([multithreading](multithreading.md), [compute shaders](compute_shader.md), ...) can astronomically accelerate many programs**, it is one of the most effective techniques of speeding up programs -- we can simply perform several computations at once and save a lot of time -- but there are a few notes. Firstly not all problems can be parallelized, some problem are sequential in nature, even though most problems can probably be parallelized to some degree. Secondly it is hard to do, opens the door for many new types of bugs, requires hardware support (software simulated parallelism can't work here of course) and introduces [dependencies](dependency.md); in other words it is huge [bloat](bloat.md), we don't recommend parallelization unless a very, very good reason is given. Optional use of [SIMD](simd.md) instructions can be a reasonable midway to going full parallel computation.
|
||||
- **Specialized hardware (e.g. a [GPU](gpu.md)) astronomically accelerates programs**, but as with the previous point, portablity and simplicity greatly suffers, your program becomes bloated and gains dependencies, always consider using specialized hardware and offer software fallbacks.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue