Miloslav Ciz 2025-01-06 23:39:23 +01:00
parent e6f6091f16
commit 36021baff3
18 changed files with 1978 additions and 1902 deletions

@@ -45,7 +45,7 @@ IEEE 754 is THE standard that basically all computers use for floating point now
Numbers in this standard are signed, have positive and negative zero (oops), can represent plus and minus [infinity](infinity.md) and different [NaNs](nan.md) (not a number). In fact there are thousands to billions of different NaNs which are basically wasted values. These inefficiencies are addressed by the mentioned [posits](posit.md).
Briefly the representation is as follows (hold on to your chair): the leftmost bit is the sign bit, then the exponent follows (the number of bits depends on the specific format), the rest of the bits is the mantissa. In the mantissa an implicit `1.` is considered (except when the exponent is all 0s), i.e. we "imagine" `1.` in front of the mantissa bits but this 1 is not physically stored. The exponent is in so-called biased format, i.e. we have to subtract half (rounded down) of the maximum possible value to get the real value (e.g. if we have 8 bits for the exponent and the directly stored value is 120, we have to subtract 255 / 2 = 127 (rounded down) to get the real exponent value, in this case we get -7). However two values of the exponent have special meaning; all 0s signify a so-called denormalized (also subnormal) number in which we consider the exponent to be that which is otherwise lowest possible (e.g. -126 in case of an 8 bit exponent) but we do NOT consider the implicit 1 in front of the mantissa (we instead consider `0.`), i.e. this allows storing [zero](zero.md) (positive and negative) and very small numbers. All 1s in the exponent signify either [infinity](infinity.md) (positive and negative) in case the mantissa is all 0s, or a [NaN](nan.md) otherwise -- considering here we have the whole mantissa plus sign bit unused, we actually have many different NaNs ([WTF](wtf.md)), but usually we only distinguish two kinds of NaNs: quiet (qNaN) and signaling (sNaN, throws an [exception](exception.md)) that are distinguished by the leftmost bit in the mantissa (1 for qNaN, 0 for sNaN).
Briefly the representation is as follows (hold on to your chair): the leftmost bit is the sign bit, then the exponent follows (the number of bits depends on the specific format), the rest of the bits is the mantissa. In the mantissa an implicit `1.` is considered (except when the exponent is all 0s), i.e. we "imagine" `1.` in front of the mantissa bits but this 1 is not physically stored. The exponent is in so-called biased format, i.e. we have to subtract half (rounded down) of the maximum possible value to get the real value (e.g. if we have 8 bits for the exponent and the directly stored value is 120, we have to subtract 255 / 2 = 127 (rounded down) to get the real exponent value, in this case we get -7). However two values of the exponent have special meaning; all 0s signify a so-called denormalized (also subnormal) { Lol in Spanish subnormal means retarded. ~drummyfish } number in which we consider the exponent to be that which is otherwise lowest possible (e.g. -126 in case of an 8 bit exponent) but we do NOT consider the implicit 1 in front of the mantissa (we instead consider `0.`), i.e. this allows storing [zero](zero.md) (positive and negative) and very small numbers. All 1s in the exponent signify either [infinity](infinity.md) (positive and negative) in case the mantissa is all 0s, or a [NaN](nan.md) otherwise -- considering here we have the whole mantissa plus sign bit unused, we actually have many different NaNs ([WTF](wtf.md)), but usually we only distinguish two kinds of NaNs: quiet (qNaN) and signaling (sNaN, throws an [exception](exception.md)) that are distinguished by the leftmost bit in the mantissa (1 for qNaN, 0 for sNaN).
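
The layout described above can be tried out with a small sketch. It is not part of the original article and only assumes the common 32 bit binary format (1 sign bit, 8 exponent bits, 23 mantissa bits, bias 127); the function names `decode_float32`, `classify` and `value` are made up for this illustration.

```python
import struct

def decode_float32(x):
    """Round x to a 32 bit float and split its bits into
    (sign, stored exponent, mantissa)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # leftmost bit
    exponent = (bits >> 23) & 0xFF  # next 8 bits, still biased
    mantissa = bits & 0x7FFFFF      # remaining 23 bits, no implicit 1
    return sign, exponent, mantissa

def classify(sign, exponent, mantissa):
    """Name the kind of value the fields encode."""
    if exponent == 0:               # all 0s: zero or subnormal
        return "zero" if mantissa == 0 else "subnormal"
    if exponent == 0xFF:            # all 1s: infinity or NaN
        if mantissa == 0:
            return "infinity"
        # leftmost mantissa bit: 1 for qNaN, 0 for sNaN
        return "qNaN" if mantissa >> 22 else "sNaN"
    return "normal"

def value(sign, exponent, mantissa):
    """Rebuild the number from its fields (finite values only)."""
    if exponent == 0:
        real_exponent, leading = -126, 0   # subnormal: "0." in front
    else:
        real_exponent, leading = exponent - 127, 1  # implicit "1."
    return (-1) ** sign * (leading + mantissa / 2 ** 23) * 2.0 ** real_exponent

print(decode_float32(2.0 ** -7))          # stored exponent 120 means real exponent -7
print(classify(*decode_float32(-0.0)))
print(classify(*decode_float32(1e-40)))
print(value(*decode_float32(0.15625)))
```

Going through `struct` rather than doing float arithmetic keeps the sketch honest about the actual stored bits; a 64 bit version would, under the same assumptions, only differ in the field widths (11 and 52 bits) and the bias (1023).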
The standard specifies many formats that are either binary or decimal and use various numbers of bits. The most relevant ones are the following: