This commit is contained in:
Miloslav Ciz 2023-10-30 21:42:20 +01:00
parent 5414a07fa2
commit c25381cf0e
9 changed files with 38 additions and 21 deletions

View file

@ -35,7 +35,7 @@ Note a few things: firstly our format is [shit](shit.md) because some numbers ha
Secondly notice the non-uniform distribution of our numbers: while we have a nice resolution close to 0 (we can represent 1/16, 2/16, 3/16, ...), our resolution in high numbers is low (the highest number we can represent is 56 but the second highest is 48, we can NOT represent e.g. 50 exactly). Realize that obviously with 6 bits we can still represent only 64 numbers at most! So float is NOT a magical way to get more numbers, with integers on 6 bits we can represent numbers from 0 to 63 spaced exactly by 1 and with our floating point we can represent numbers spaced as close as 1/16th but only in the region near 0, we pay the price of having big gaps in higher numbers.
Also notice that thing like simple addition of numbers become more difficult and time consuming, you have to include conversions and [rounding](rounding.md) -- while with fixed point addition is a single machine instruction, same as integer addition, here with software implementation we might end up with dozens of instructions (specialized hardware can perform addition fast but still, not all computer have that hardware).
Also notice that things like simple addition of numbers become more difficult and time consuming, you have to include conversions and [rounding](rounding.md) -- while with fixed point addition is a single machine instruction, same as integer addition, here with software implementation we might end up with dozens of instructions (specialized hardware can perform addition fast but still, not all computer have that hardware).
Rounding errors will appear and accumulate during computations: imagine the operation 48 + 1/8. Both numbers can be represented in our system but not the result (48.125). We have to round the result and end up with 48 again. Imagine you perform 64 such additions in succession (e.g. in a loop): mathematically the result should be 48 + 64 * 1/8 = 56, which is a result we can represent in our system, but we will nevertheless get the wrong result (48) due to rounding errors in each addition. So the behavior of float can be **non intuitive** and dangerous, at least for those who don't know how it works.