151 lines
8.4 KiB
Markdown
151 lines
8.4 KiB
Markdown
# Fixed Point
|
|
|
|
Fixed point arithmetic is a simple and often [good enough](good_enough.md) method of computer representation of [fractional](rational_number.md) numbers (i.e. numbers with higher precision than [integers](integer.md), e.g. 4.03), as opposed to [floating point](float.md) which is a more complicated way of doing this which in most cases we consider a worse, [bloated](bloat.md) alternative. Probably in 99% cases when you think you need floating point, fixed point will do just fine. Fixed point arithmetic is not to be confused with fixed point of a function in mathematics, a completely unrelated term.
|
|
|
|
Fixed point has at least these advantages over floating point:
|
|
|
|
- **It doesn't require a special hardware coprocessor** for efficient execution and so doesn't introduce a [dependency](dependency.md). Programs using floating point will run extremely slowly on systems without float hardware support as they have to emulate the complex hardware in software, while fixed point will run just as fast as integer arithmetic. For this reason fixed point is very often used in [embedded](embedded.md) computers.
|
|
- It is **natural, easier to understand and therefore better predictable**, less tricky, [KISS](kiss.md), [suckless](sukless.md). (Float's IEEE 754 standard is 58 pages long, the paper *What Every Computer Scientist Should Know About Floating-Point Arithmetic* has 48 pages.)
|
|
- Is easier to implement and so **supported in many more systems**. Any language or format supporting integers also supports fixed point.
|
|
- It isn't ugly and in [two's complement](twos_complement.md) **doesn't waste values** (unlike IEEE 754 with positive and negative zero, denormalized numbers, many [NaNs](nan.md) etc.).
|
|
- Some simpler (i.e. better) programming languages such as [comun](comun.md) don't support float at all, while fixed point can be used in any language that supports integers.
|
|
|
|
## How It Works
|
|
|
|
Fixed point uses a fixed (hence the name) number of digits (bits in binary) for the integer part and the rest for the fractional part (whereas floating point's fractional part varies in size). I.e. we split the binary representation of the number into two parts (integer and fractional) by IMAGINING a radix point at some place in the binary representation. That's basically it. Fixed point therefore spaces numbers [uniformly](uniformity.md), as opposed to floating point whose spacing of numbers is non-uniform.
|
|
|
|
So, **we can just use an integer data type as a fixed point data type**, there is no need for libraries or special hardware support. We can also perform operations such as addition the same way as with integers. For example if we have a binary integer number represented as `00001001`, 9 in decimal, we may say we'll be considering a radix point after let's say the sixth place, i.e. we get `000010.01` which we interpret as 2.25 (2^2 + 2^(-2)). The binary value we store in a variable is the same (as the radix point is only imagined), we only INTERPRET it differently.
|
|
|
|
We may look at it this way: we still use integers but we use them to count smaller fractions than 1. For example in a 3D game where our basic spatial unit is 1 meter our variables may rather contain the number of centimeters (however in practice we should use powers of two, so rather 1/128ths of a meter). In the example in previous paragraph we count 1/4ths (we say our **scaling factor** is 1/4), so actually the number represented as `00000100` is what in floating point we'd write as `1.0` (`00000100` is 4 and 4 * 1/4 = 1), while `00000001` means `0.25`.
|
|
|
|
This has just one consequence: **we have to [normalize](normalization.md) results of multiplication and division** (addition and subtraction work just as with integers, we can normally use the `+` and `-` operators). I.e. when multiplying, we have to divide the result by the inverse of the fractions we're counting, i.e. by 4 in our case (1/(1/4) = 4). Similarly when dividing, we need to MULTIPLY the result by this number. This is because we are using fractions as our units and when we multiply two numbers in those units, the units multiply as well, i.e. in our case multiplying two numbers that count 1/4ths give a result that counts 1/16ths, we need to divide this by 4 to get the number of 1/4ths back again (this works the same as e.g. units in physics, multiplying number of meters by number of meters gives meters squared). For example the following integer multiplication:
|
|
|
|
`00001000` * `00000010` = `00010000` (8 * 2 = 16)
|
|
|
|
in our system has to be normalized like this:
|
|
|
|
(`000010.00` * `000000.10`) / 4 = `000001.00` (2.0 * 0.5 = 1.0)
|
|
|
|
SIDE NOTE: in practice you may see division replaced by the shift operator (instead of `/4` you'll see `>> 2`).
|
|
|
|
With this normalization we also have to **think about how to bracket expressions to prevent rounding errors and [overflows](overflow.md)**, for example instead of `(x / y) * 4` we may want to write `(x * 4) / y`; imagine e.g. *x* being `00000010` (0.5) and *y* being `00000100` (1.0), the former would result in 0 (incorrect, rounding error) while the latter correctly results in 0.5. The bracketing depends on what values you expect to be in the variables so it can't really be done automatically by a compiler or library (well, it might probably be somehow handled at [runtime](runtime.md), but of course, that will be slower). There are also ways to prevent overflows e.g. with clever [bit hacks](bit_hack.md).
|
|
|
|
The normalization and bracketing are basically the only things you have to think about, apart from this everything works as with integers. Remember that **this all also works with negative number in [two's complement](twos_complement.md)**, so you can use a signed integer type without any extra trouble.
|
|
|
|
Remember to **always use a power of two scaling factor** -- this is crucial for performance. I.e. you want to count 1/2th, 1/4th, 1/8ths etc., but NOT 1/10ths, as might be tempting. Why are power of two good here? Because computers work in binary and so the normalization operations with powers of two (division and multiplication by the scaling factor) can easily be optimized by the compiler to a mere [bit shift](bit_shift.md), an operation much faster than multiplication or division.
|
|
|
|
## Code Example
|
|
|
|
For start let's compare basic arithmetic operations in [C](c.md) written with floating point and the same code written with fixed point. Consider the floating point code first:
|
|
|
|
```
|
|
float
|
|
a = 21,
|
|
b = 3.0 / 4.0,
|
|
c = -10.0 / 3.0;
|
|
|
|
a = a * b; // multiplication
|
|
a += c; // addition
|
|
a /= b; // division
|
|
a -= 10; // subtraction
|
|
a /= 3; // division
|
|
|
|
printf("%f\n",a);
|
|
```
|
|
|
|
Equivalent code with fixed point may look as follows:
|
|
|
|
```
|
|
#define UNIT 1024 // our "1.0" value
|
|
|
|
int
|
|
a = 21 * UNIT,
|
|
b = (3 * UNIT) / 4, // note the brackets, (3 / 4) * UNIT would give 0
|
|
c = (-10 * UNIT) / 3;
|
|
|
|
a = (a * b) / UNIT; // multiplication, we have to normalize
|
|
a += c; // addition, no normalization needed
|
|
a = (a * UNIT) / b; // division, normalization needed, note the brackets
|
|
a -= 10 * UNIT; // subtraction
|
|
a /= 3; // division by a number NOT in UNITs, no normalization needed
|
|
|
|
printf("%d.%d%d%d\n", // writing a nice printing function is left as an exercise :)
|
|
a / UNIT,
|
|
((a * 10) / UNIT) % 10,
|
|
((a * 100) / UNIT) % 10,
|
|
((a * 1000) / UNIT) % 10);
|
|
```
|
|
|
|
These examples output `2.185185` and `2.184`, respectively.
|
|
|
|
Now consider another example: a simple [C](c.md) program using fixed point with 10 fractional bits, computing [square roots](sqrt.md) of numbers from 0 to 10.
|
|
|
|
```
|
|
#include <stdio.h>
|
|
|
|
typedef int Fixed;
|
|
|
|
#define UNIT_FRACTIONS 1024 // 10 fractional bits, 2^10 = 1024
|
|
|
|
#define INT_TO_FIXED(x) ((x) * UNIT_FRACTIONS)
|
|
|
|
Fixed fixedSqrt(Fixed x)
|
|
{
|
|
// stupid brute force square root
|
|
|
|
int previousError = -1;
|
|
|
|
for (int test = 0; test <= x; ++test)
|
|
{
|
|
int error = x - (test * test) / UNIT_FRACTIONS;
|
|
|
|
if (error == 0)
|
|
return test;
|
|
else if (error < 0)
|
|
error *= -1;
|
|
|
|
if (previousError > 0 && error > previousError)
|
|
return test - 1;
|
|
|
|
previousError = error;
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
|
|
void fixedPrint(Fixed x)
|
|
{
|
|
printf("%d.%03d",x / UNIT_FRACTIONS,
|
|
((x % UNIT_FRACTIONS) * 1000) / UNIT_FRACTIONS);
|
|
}
|
|
|
|
int main(void)
|
|
{
|
|
for (int i = 0; i <= 10; ++i)
|
|
{
|
|
printf("%d: ",i);
|
|
|
|
fixedPrint(fixedSqrt(INT_TO_FIXED(i)));
|
|
|
|
putchar('\n');
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
```
|
|
|
|
The output is:
|
|
|
|
```
|
|
0: 0.000
|
|
1: 1.000
|
|
2: 1.414
|
|
3: 1.732
|
|
4: 2.000
|
|
5: 2.236
|
|
6: 2.449
|
|
7: 2.645
|
|
8: 2.828
|
|
9: 3.000
|
|
10: 3.162
|
|
``` |