Update

2025-05-03 16:21:08 +02:00 · 2025-05-03 16:21:08 +02:00 · 783a41a7cf
commit 783a41a7cf
parent 46a27e1930
18 changed files with 2157 additions and 2060 deletions
--- a/float.md
+++ b/float.md
@@ -10,6 +10,16 @@ And there is more: floating point behavior really depends on the language you're

 { Really as I'm now getting down the float rabbit hole I'm seeing what a huge mess it all is, I'm not nearly an expert on this so maybe I've written some BS here, which just confirms how messy floats are. Anyway, from the articles I'm reading even being an expert on this issue doesn't seem to guarantee a complete understanding of it :) Just avoid floats if you can. ~drummyfish }

+For starters, consider the following snippet (let's assume the standard 32 bit IEEE float etc.):
+
+```
+for (float f = 0; f < 20000000; f++)
+  if (((int) f) % 4096 == 0) // once in a while output current f
+    printf("%f\n",f);
+```
+
+Take a look at the code and guess what it does. The loop should count up to 20 million and stop, right? NOPE. The loop will never end because *f* will never reach 20 million -- and no, it's not because 20 million is too high a value, in fact it's laughably low considering that float can store values up to the order of 10^38. What gives then? Upon running the loop you'll notice it gets stuck at the value 16777216.0 (2^24), which is the point beyond which the gap between neighboring floats grows larger than 1. Float cannot represent the next integer, 16777217, so under the default rounding mode the result of the increment gets rounded right back to 16777216 and *f* never changes again. And that's just a very basic, innocent looking loop.
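
To see the rounding in isolation, here is a minimal sketch, assuming *float* is the 32 bit IEEE type with the default rounding mode. The function names `stuck_at` and `count_safely` are made up for this illustration; the second one shows one possible fix, which is to keep the counter in an integer and only convert it to float for output. Under these assumptions `stuck_at(16777216.0f)` returns 1 while `stuck_at(16777215.0f)` returns 0.

```c
#include <stdio.h>

/* Returns 1 if adding one to f leaves it unchanged, otherwise 0
   (hypothetical helper, assumes 32 bit IEEE float). */
int stuck_at(float f)
{
  volatile float next = f + 1.0f; /* volatile forces rounding to float */
  return next == f;
}

/* The loop from above with an integer counter: the counter stays exact,
   only the printed float may be rounded. Returns the number of steps. */
int count_safely(int limit)
{
  int steps = 0;

  for (int i = 0; i < limit; i++)
  {
    float f = (float) i;

    if (i % 4096 == 0) /* once in a while output current f */
      printf("%f\n", f);

    steps++;
  }

  return steps;
}
```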
+
 Is floating point literally evil? Well, of course not, but it is extremely overused. You may need it for precise scientific simulations, e.g. [numerical integration](numerical_integration.md), but as our [small3dlib](small3dlib.md) shows, you can comfortably do even [3D rendering](3d_rendering.md) without it. So always consider whether you REALLY need float. **You mostly do NOT need it**.

 **Simple example of avoiding floating point**: many noobs think that if they e.g. need to multiply some integer *x* by let's say 2.34 they have to use floating point. This is of course false and just proves most devs don't know elementary school [math](math.md). Multiplying *x* by 2.34 is the same as *(x * 234) / 100*, which we can [optimize](optimization.md) to an approximately equal division by a power of two as *(x * 2396) / 1024*. Indeed, given e.g. *x = 56* we get the same integer result 131 in both cases, all without touching floating point.
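
The two variants can be sketched in C as follows. The function names are hypothetical, and both functions assume a non-negative *x* small enough that the multiplication doesn't overflow *int* (integer division truncates, it doesn't round). Keep in mind the power of two version is only approximate, since 2396 / 1024 is about 2.3398: for *x = 100* it gives 233 while the exact version gives 234.

```c
/* Multiplies x by 2.34 with integers only: exact scaling by 234 / 100,
   result truncated to an integer. */
int mul_2_34(int x)
{
  return (x * 234) / 100;
}

/* Approximation with a power of two divisor (2396 / 1024 ~= 2.3398),
   which a compiler can typically turn into a bit shift. */
int mul_2_34_pow2(int x)
{
  return (x * 2396) / 1024;
}
```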
@@ -60,6 +70,52 @@ The standard specifies many formats that are either binary or decimal and use va

 **Example?** Let's say we have float (binary32) value `11000000111100000000000000000000`: first bit (sign) is 1 so the number is negative. Then we have 8 bits of exponent: `10000001` (129) which converted from the biased format (subtracting 127) gives exponent value of 2. Then mantissa bits follow: `11100000000000000000000`. As we're dealing with a normal number (exponent bits are neither all 1s nor all 0s), we have to imagine the implicit `1.` in front of mantissa, i.e. our actual mantissa is `1.11100000000000000000000` = 1.875. The final number is therefore -1 * 1.875 * 2^2 = -7.5.
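
The decoding steps above can be sketched in C like this. It is only an illustration under stated assumptions: `decode_normal_float` and `bits_to_float` are made up names, the first handles normal numbers only, and the second assumes the machine's *float* really is binary32 with the same byte order as `uint32_t`. The example bit pattern written in hexadecimal is `0xC0F00000`, for which both functions give -7.5.

```c
#include <stdint.h>
#include <string.h>

/* Decodes a NORMAL binary32 bit pattern step by step; zero, denormals,
   infinity and NaN are deliberately not handled in this sketch. */
float decode_normal_float(uint32_t bits)
{
  int sign = (bits >> 31) & 0x01;                   /* 1 sign bit */
  int exponent = (int) ((bits >> 23) & 0xff) - 127; /* 8 bits, minus bias */
  uint32_t fraction = bits & 0x7fffff;              /* 23 mantissa bits */

  /* prepend the implicit "1.": mantissa = 1 + fraction / 2^23 */
  float result = 1.0f + (float) fraction / 8388608.0f;

  for (int i = 0; i < exponent; i++) /* multiply by 2^exponent ... */
    result *= 2.0f;

  for (int i = 0; i > exponent; i--) /* ... or divide if it's negative */
    result /= 2.0f;

  return sign ? -result : result;
}

/* Cross-check: lets the hardware interpret the same 32 bits. */
float bits_to_float(uint32_t bits)
{
  float f;
  memcpy(&f, &bits, sizeof(f));
  return f;
}
```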

+The following table shows the approximate, order-of-magnitude resolution (i.e. the distance to the next representable value) of float (32 bit) and double (64 bit) near a given stored value:
+
+| value   | float      | double     |
+| ------- | ---------- | ---------- |
+| 10^-20  | 3 * 10^-28 | 6 * 10^-37 |
+| 10^-19  | 2 * 10^-27 | 5 * 10^-36 |
+| 10^-18  | 4 * 10^-26 | 8 * 10^-35 |
+| 10^-17  | 3 * 10^-25 | 6 * 10^-34 |
+| 10^-16  | 2 * 10^-24 | 5 * 10^-33 |
+| 10^-15  | 4 * 10^-23 | 8 * 10^-32 |
+| 10^-14  | 3 * 10^-22 | 6 * 10^-31 |
+| 10^-13  | 2 * 10^-21 | 5 * 10^-30 |
+| 10^-12  | 4 * 10^-20 | 8 * 10^-29 |
+| 10^-11  | 3 * 10^-19 | 6 * 10^-28 |
+| 10^-10  | 2 * 10^-18 | 5 * 10^-27 |
+| 10^-9   | 4 * 10^-17 | 8 * 10^-26 |
+| 10^-8   | 3 * 10^-16 | 7 * 10^-25 |
+| 10^-7   | 3 * 10^-15 | 5 * 10^-24 |
+| 10^-6   | 4 * 10^-14 | 8 * 10^-23 |
+| 10^-5   | 3 * 10^-13 | 7 * 10^-22 |
+| 10^-4   | 3 * 10^-12 | 5 * 10^-21 |
+| 10^-3   | 4 * 10^-11 | 9 * 10^-20 |
+| 10^-2   | 3 * 10^-10 | 7 * 10^-19 |
+| 10^-1   | 3 * 10^-9  | 5 * 10^-18 |
+| 1       | 5 * 10^-8  | 9 * 10^-17 |
+| 10      | 4 * 10^-7  | 7 * 10^-16 |
+| 100     | 3 * 10^-6  | 6 * 10^-15 |
+| 1000    | 2 * 10^-5  | 4 * 10^-14 |
+| 10000   | 4 * 10^-4  | 7 * 10^-13 |
+| 100000  | 3 * 10^-3  | 6 * 10^-12 |
+| 1000000 | 0.02       | 4 * 10^-11 |
+| 10^7    | 0.42       | 7 * 10^-10 |
+| 10^8    | 3.38       | 6 * 10^-9  |
+| 10^9    | 27.10      | 5 * 10^-8  |
+| 10^10   | 433.68     | 8 * 10^-7  |
+| 10^11   | 3469.44    | 6 * 10^-6  |
+| 10^12   | 27755.57   | 5 * 10^-5  |
+| 10^13   | 444089.21  | 8 * 10^-4  |
+| 10^14   | 3552713.75 | 6 * 10^-3  |
+| 10^15   | 28421710   | 0.05       |
+| 10^16   | 454747360  | 0.84       |
+| 10^17   | 3637978880 | 6.77       |
+| 10^18   | 29103831040 | 54.21     |
+| 10^19   | 4 * 10^11  | 867.36     |
+| 10^20   | 3 * 10^12  | 6938.89    |
+
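If you want to measure the gap near some value yourself, one way is to step to the neighboring bit pattern and subtract. The following is a minimal sketch with made up function names, assuming *float* is binary32, *double* is binary64 and the argument is positive, finite and below the type's maximum. Exact gaps are always powers of two (e.g. `float_gap(10000000.0f)` is exactly 1 and `float_gap(16777216.0f)` is exactly 2), so expect them to agree with the rounded table entries in order of magnitude only.

```c
#include <stdint.h>
#include <string.h>

/* Gap between x and the next larger float. For positive finite values the
   next representable number is simply "bit pattern + 1". */
float float_gap(float x)
{
  uint32_t bits;
  float next;

  memcpy(&bits, &x, sizeof(bits)); /* reinterpret the float as an integer */
  bits++;                          /* step to the neighboring value */
  memcpy(&next, &bits, sizeof(next));

  return next - x;
}

/* Same idea for double, using a 64 bit integer. */
double double_gap(double x)
{
  uint64_t bits;
  double next;

  memcpy(&bits, &x, sizeof(bits));
  bits++;
  memcpy(&next, &bits, sizeof(next));

  return next - x;
}
```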
 ## See Also

 - [posit](posit.md)