# Optimization

Optimization means making a program more efficient (in terms of some metric such as speed or memory usage) while preserving its functionality.

## Rules & Tips

- Tell your compiler to actually optimize (`-O3`, `-Os` etc.).
- gprof is a utility you can use to profile your code.
- `<stdint.h>` provides types such as `uint_fast32_t`, which is the fastest unsigned type of at least the given width (here 32 bits) on the given platform.
- Keywords such as `inline`, `static` and `const` can help the compiler optimize well.
- Optimize the bottlenecks! Optimizing in the wrong place is a complete waste of time. If you're optimizing a part of the code that takes 1% of your program's run time, you will never speed up your program by more than that 1%, even if you speed up the specific part by 10000%.
- You can almost always trade space (memory usage) for time (CPU demand) and vice versa, and you can also fine-tune this. You typically gain speed with precomputation (lookup tables, more demanding on memory) and save memory with compression (more demanding on CPU).
- Avoid branches (ifs) in hot code. Mispredicted branches disrupt instruction prefetching and pipelining and are often a source of great performance losses. Don't forget that you can compare and use the result of the comparison without any branching (e.g. `x = (y == 5) + 1;`).
- Use iteration instead of recursion if possible (calling a function has a cost and recursion also uses stack space).
- You can use good-enough approximations instead of completely accurate calculations, e.g. taxicab distance instead of Euclidean distance, and gain speed or memory without trading one for the other.
- Operations on static data can be accelerated with accelerating structures (indices for database lookups, spatial grids for collision checking, ...).
- Use powers of 2 whenever possible; this is efficient thanks to computers working in binary. Not only may this help with utilization and alignment of memory, but more importantly the compiler can turn multiplication and division into mere bit shifts, which can be a significant speedup (especially for division).
- Write cache-friendly code (minimize long jumps in memory).
- Compare to 0 if possible. There's usually an instruction that just checks the zero flag, which is faster than loading and comparing two arbitrary numbers.
- Consider moving computation from run time to compile time. E.g. if you make the resolution of your game a constant (as opposed to a variable), the compiler will be able to partially precompute expressions with the display dimensions and so speed up your program (but you won't be able to change the resolution dynamically).
- On some platforms such as ARM the first few arguments to a function are passed in registers (the rest go on the stack), so it's better to have few parameters in functions.
- Optimize once you already have working code. As Donald Knuth put it: "premature optimization is the root of all evil".
- Use your own caches. For example, if you're frequently working with some database item, you'd better pull it into memory and work with it there, then write it back once you're done (as opposed to communicating with the DB back and forth).
- A single compilation unit (one big program without linking) can help the compiler optimize because it can see the whole code at once, not just its parts.
- Search the literature for algorithms with a better complexity class (sorts are a nice example).
- For the sake of embedded platforms consider avoiding floating point, as it is often emulated in software there, which is painfully slow.
- Early branching can bring a speedup (instead of branching inside the loop, create two versions of the loop and branch in front of them).
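
The tip about avoiding branches can be illustrated with a minimal C sketch. The function names `pick_branching`, `pick_branchless` and `count_above` are made up for this example. It relies only on the fact that a comparison in C evaluates to 0 or 1. Keep in mind that an optimizing compiler may already produce branchless code for the first version, so measure before rewriting.

```c
#include <stdint.h>

/* Branching version: the result is chosen with an if. */
int pick_branching(int y)
{
  if (y == 5)
    return 2;

  return 1;
}

/* Branchless version, same expression as in the tip: the comparison
   yields 0 or 1 and is used directly in arithmetic. */
int pick_branchless(int y)
{
  return (y == 5) + 1;
}

/* Hypothetical helper: counts values above a threshold by adding the
   comparison result instead of branching on it. */
uint32_t count_above(const int *values, uint32_t n, int threshold)
{
  uint32_t count = 0;

  for (uint32_t i = 0; i < n; ++i)
    count += (values[i] > threshold);

  return count;
}
```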
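
The power-of-2 tip could look roughly like this for a 2D grid stored in a flat array. The grid width of 64 and all names here are assumptions made for the example. With unsigned values, multiplying by the width is a left shift, dividing is a right shift and the remainder is a bit mask.

```c
#include <stdint.h>

#define GRID_WIDTH_LOG2 6                   /* assumed: 2^6 = 64 cells per row */
#define GRID_WIDTH (1u << GRID_WIDTH_LOG2)

/* y * GRID_WIDTH + x, with the multiplication written as a shift. */
uint32_t cell_index(uint32_t x, uint32_t y)
{
  return (y << GRID_WIDTH_LOG2) + x;
}

/* index / GRID_WIDTH, written as a shift. */
uint32_t cell_row(uint32_t index)
{
  return index >> GRID_WIDTH_LOG2;
}

/* index % GRID_WIDTH, written as a mask of the low bits. */
uint32_t cell_column(uint32_t index)
{
  return index & (GRID_WIDTH - 1);
}
```

In practice it is usually enough to write `index / GRID_WIDTH` and `index % GRID_WIDTH` with a compile-time constant width and let the compiler pick the shifts. The explicit form above only shows what it ends up doing.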
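
As a sketch of trading memory for speed with precomputation, the following hypothetical example counts set bits using a 256-byte lookup table. The names `bit_count_table`, `bit_count_table_init` and `count_bits` are invented here. The table costs memory and has to be filled once, but afterwards each query takes just four table reads.

```c
#include <stdint.h>

/* Precomputed number of set bits for every possible byte value. */
static uint8_t bit_count_table[256];

/* Fills the table; must be called once before count_bits(). The bit
   count of i is its lowest bit plus the bit count of i / 2, which is
   already in the table when we get to i. */
void bit_count_table_init(void)
{
  bit_count_table[0] = 0;

  for (int i = 1; i < 256; ++i)
    bit_count_table[i] = (uint8_t) ((i & 1) + bit_count_table[i / 2]);
}

/* Counts set bits of a 32 bit value one byte at a time. */
uint32_t count_bits(uint32_t value)
{
  return (uint32_t) bit_count_table[value & 0xff] +
         bit_count_table[(value >> 8) & 0xff] +
         bit_count_table[(value >> 16) & 0xff] +
         bit_count_table[value >> 24];
}
```

Whether the table really beats direct computation depends on the platform (cache behavior, availability of a dedicated instruction), so this is something to profile rather than assume.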
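
The early branching tip can be sketched like this. Both functions are hypothetical and do the same work. The first one tests the `negate` flag in every iteration. The second one tests it once and then runs one of two specialized loops.

```c
#include <stddef.h>

/* Branch inside the loop: the flag is checked for every element. */
void scale_branch_inside(int *data, size_t n, int factor, int negate)
{
  for (size_t i = 0; i < n; ++i)
  {
    if (negate)
      data[i] = -data[i] * factor;
    else
      data[i] = data[i] * factor;
  }
}

/* Early branching: the flag is checked once, in front of two versions
   of the loop that contain no branch on it. */
void scale_branch_early(int *data, size_t n, int factor, int negate)
{
  if (negate)
  {
    for (size_t i = 0; i < n; ++i)
      data[i] = -data[i] * factor;
  }
  else
  {
    for (size_t i = 0; i < n; ++i)
      data[i] = data[i] * factor;
  }
}
```

The price is duplicated code, i.e. a bigger program. Compilers can sometimes do this transformation themselves at higher optimization levels, so check the actual gain by profiling.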