Fixed Point
Fixed point arithmetic is a simple and often good enough way to represent fractional numbers in a computer (i.e. numbers with higher precision than integers, e.g. 4.03), as opposed to floating point, which is a more complicated way of doing this and which in most cases we consider a worse, bloated alternative. Probably in 99% of cases when you think you need floating point, fixed point will do just fine (this is also advocated e.g. in the book Starting Forth). Fixed point arithmetic is not to be confused with the fixed point of a function in mathematics (a fixed point of a function f(x) is such x that f(x) = x), a completely unrelated term.
Fixed point has at least these advantages over floating point:
- It doesn't require a special hardware coprocessor for efficient execution and so doesn't introduce a dependency. Programs using floating point will run extremely slowly on systems without float hardware support as they have to emulate the complex hardware in software, while fixed point will run just as fast as integer arithmetic. For this reason fixed point is very often used in embedded computers.
- It is natural, easier to understand and therefore more predictable, less tricky, KISS, suckless. (Float's IEEE 754 standard is 58 pages long; the paper What Every Computer Scientist Should Know About Floating-Point Arithmetic has 48 pages.)
- It is easier to implement and so supported in many more systems. Any language or format supporting integers also supports fixed point.
- It isn't ugly and in two's complement doesn't waste values (unlike IEEE 754 with positive and negative zero, denormalized numbers, many NaNs etc.).
- It is usually well defined, or at least better than float. While specifics of float -- such as exact precision, rounding rules etc. -- may be unspecified on many systems, with fixed point we usually know the result of any operation, or at least know the tricky cases we have to watch for (such as overflows). If we make an engine using floating point, it may behave differently on a computer that uses a different standard for floating point; with fixed point our engine will behave the same everywhere.
- Some simpler (i.e. better) programming languages such as comun don't support float at all, while fixed point can be used in any language that supports integers.
- ...
What are the disadvantages? Well, we may for example lack precision sometimes, which can manifest e.g. as "jerky" movement in 3D engines, though there are always tricks and ways to fix this (increasing precision, interpolation, smoothing filters, ...), even if it may sometimes be a bit more complicated. While older/simpler computers will benefit from fixed point, the big/"modern" computers on the other hand will suffer from it, because there we'll typically be using our own software implementation which has to compete with hardware accelerated floating point (still, we argue to rather favor the older/simpler computers). Also fixed point won't offer all the comfort that floating point nowadays comes with, such as the handling of overflows etc. This is of course not an inherent disadvantage of fixed point but rather the status quo of the computing industry: floating point has been pimped and is delivered on a silver platter. In general using fixed point is a bit more work -- we have to correctly choose the precision, manually adjust the order of arithmetic operations, check for/prevent overflows etc. -- but in the end we get a better program. We have to argue for doing things well rather than quickly.
How It Works
Fixed point uses a fixed (hence the name) number of digits (bits in binary) for the integer part and the rest for the fractional part (whereas floating point's fractional part varies in size). I.e. we split the binary representation of the number into two parts (integer and fractional) by IMAGINING a radix point at some place in the binary representation. That's basically it. Fixed point therefore spaces numbers uniformly, as opposed to floating point whose spacing of numbers is non-uniform.
So, we can just use an integer data type as a fixed point data type, there is no need for libraries or special hardware support. We can also perform operations such as addition the same way as with integers. For example if we have a binary integer number represented as `00001001`, 9 in decimal, we may say we'll be considering a radix point after let's say the sixth place, i.e. we get `000010.01`, which we interpret as 2.25 (2^2 + 2^(-2)). The binary value we store in a variable is the same (as the radix point is only imagined), we only INTERPRET it differently.
We may look at it this way: we still use integers but we use them to count smaller fractions than 1. For example in a 3D game where our basic spatial unit is 1 meter our variables may rather contain the number of centimeters (however in practice we should use powers of two, so rather 1/128ths of a meter). In the example in the previous paragraph we count 1/4ths (we say our scaling factor is 1/4), so actually the number represented as `00000100` is what in floating point we'd write as `1.0` (`00000100` is 4 and 4 * 1/4 = 1), while `00000001` means `0.25`.
This has just one consequence: we have to normalize results of multiplication and division (addition and subtraction work just as with integers, we can normally use the `+` and `-` operators). I.e. when multiplying, we have to divide the result by the inverse of the fractions we're counting, i.e. by 4 in our case (1/(1/4) = 4). Similarly when dividing, we need to MULTIPLY the result by this number. This is because we are using fractions as our units, and when we multiply two numbers in those units, the units multiply as well: in our case multiplying two numbers that count 1/4ths gives a result that counts 1/16ths, so we need to divide it by 4 to get the number of 1/4ths back again (this works the same as e.g. units in physics, multiplying a number of meters by a number of meters gives meters squared). For example the following integer multiplication:
`00001000 * 00000010 = 00010000` (8 * 2 = 16)
in our system has to be normalized like this:
`(000010.00 * 000000.10) / 4 = 000001.00` (2.0 * 0.5 = 1.0)
SIDE NOTE: in practice you may see division replaced by the shift operator (instead of `/ 4` you'll see `>> 2`).
With this normalization we also have to think about how to bracket expressions to prevent rounding errors and overflows, for example instead of `(x / y) * 4` we may want to write `(x * 4) / y`; imagine e.g. x being `00000010` (0.5) and y being `00000100` (1.0): the former would result in 0 (incorrect, a rounding error) while the latter correctly results in 0.5. The bracketing depends on what values you expect to be in the variables so it can't really be done automatically by a compiler or library (well, it might probably be somehow handled at runtime, but of course that will be slower). There are also ways to prevent overflows, e.g. with clever bit hacks.
The normalization and bracketing are basically the only things you have to think about, apart from this everything works as with integers. Remember that this all also works with negative numbers in two's complement, so you can use a signed integer type without any extra trouble.
Remember to always use a power of two scaling factor -- this is crucial for performance. I.e. you want to count 1/2ths, 1/4ths, 1/8ths etc., but NOT 1/10ths, as might be tempting. Why are powers of two good here? Because computers work in binary, so the normalization operations with powers of two (division and multiplication by the scaling factor) can easily be optimized by the compiler to a mere bit shift, an operation much faster than multiplication or division.
Code Example
For start let's compare basic arithmetic operations in C written with floating point and the same code written with fixed point. Consider the floating point code first:
```
float
  a = 21,
  b = 3.0 / 4.0,
  c = -10.0 / 3.0;

a = a * b; // multiplication
a += c;    // addition
a /= b;    // division
a -= 10;   // subtraction
a /= 3;    // division

printf("%f\n",a);
```
Equivalent code with fixed point may look as follows:
```
#define UNIT 1024 // our "1.0" value

int
  a = 21 * UNIT,
  b = (3 * UNIT) / 4, // note the brackets, (3 / 4) * UNIT would give 0
  c = (-10 * UNIT) / 3;

a = (a * b) / UNIT; // multiplication, we have to normalize
a += c;             // addition, no normalization needed
a = (a * UNIT) / b; // division, normalization needed, note the brackets
a -= 10 * UNIT;     // subtraction
a /= 3;             // division by a number NOT in UNITs, no normalization needed

printf("%d.%d%d%d\n", // writing a nice printing function is left as an exercise :)
  a / UNIT,
  ((a * 10) / UNIT) % 10,
  ((a * 100) / UNIT) % 10,
  ((a * 1000) / UNIT) % 10);
```
These examples output `2.185185` and `2.184`, respectively.
Now consider another example: a simple C program using fixed point with 10 fractional bits, computing square roots of numbers from 0 to 10.
```
#include <stdio.h>

typedef int Fixed;

#define UNIT_FRACTIONS 1024 // 10 fractional bits, 2^10 = 1024

#define INT_TO_FIXED(x) ((x) * UNIT_FRACTIONS)

Fixed fixedSqrt(Fixed x)
{
  // stupid brute force square root
  int previousError = -1;

  for (int test = 0; test <= x; ++test)
  {
    int error = x - (test * test) / UNIT_FRACTIONS;

    if (error == 0)
      return test;
    else if (error < 0)
      error *= -1;

    if (previousError > 0 && error > previousError)
      return test - 1;

    previousError = error;
  }

  return 0;
}

void fixedPrint(Fixed x)
{
  printf("%d.%03d",x / UNIT_FRACTIONS,
    ((x % UNIT_FRACTIONS) * 1000) / UNIT_FRACTIONS);
}

int main(void)
{
  for (int i = 0; i <= 10; ++i)
  {
    printf("%d: ",i);
    fixedPrint(fixedSqrt(INT_TO_FIXED(i)));
    putchar('\n');
  }

  return 0;
}
```
The output is:
```
0: 0.000
1: 1.000
2: 1.414
3: 1.732
4: 2.000
5: 2.236
6: 2.449
7: 2.645
8: 2.828
9: 3.000
10: 3.162
```