# Bit Hack
Bit [hacks](hacking.md) (also bit tricks, bit [magic](magic.md), bit twiddling etc.) are simple, clever formulas for performing useful operations on [binary](binary.md) numbers. Some operations, such as checking whether a number is a power of two or reversing the bits of a number, can be done very efficiently with these hacks, without loops, [branching](branchless.md) and other undesirably slow operations. This can increase speed and/or decrease the size and/or memory usage of code, which helps us [optimize](optimization.md). Many of these hacks can be found on the [web](www.md) and there are also books, such as *Hacker's Delight*, which document them.
## Basics
Basic bit manipulation techniques are common and part of general knowledge, so they won't be listed under hacks, but for the sake of completeness and for beginners reading this we should mention them here. Let's see the basic bit manipulation operators in [C](c.md):
- `|` (bitwise [OR](or.md)): Performs the logical OR on all corresponding bits of two operands, e.g. `0b0110 | 0b1100` gives `0b1110` (14 in decimal). This is used to set bits and combine flags (options) into a single numeric value that can easily be passed to a function etc. For example to set the lowest bit of a number to 1 just do `myNumber | 1`. Now consider e.g. `#define OPTION_A 0b0001`, `#define OPTION_B 0b0010` and `#define OPTION_C 0b0100`, now we can make a single number that represents a set of selected options e.g. as `OPTION_C | OPTION_B` (the value will be `0b0110` and says that options B and C have been selected).
- `&` (bitwise [AND](and.md)): Performs the logical AND on all corresponding bits of two operands, e.g. `0b0110 & 0b1100` gives `0b0100` (4 in decimal). This may be used to mask out specific bits, to check if specific bits are set (useful to check the set flags as mentioned above) or to clear (set to zero) specific bits. Consider the flag example from above, if we want to check if value *x* has e.g. the option B set, we simply do `x & OPTION_B` which results in a non-zero value if the option is set. Another example may be `myNumber & 0b00001111` (in practice you'll see hexadecimal values, i.e. `myNumber & 0x0F`) which keeps only the lowest 4 bits of *myNumber* and zeroes the rest (for unsigned or non-negative values this is equivalent to the operation [modulo](mod.md) 16).
- `~` (bitwise [NOT](not.md)): Flips every bit of the number -- pretty straightforward. This is used e.g. for clearing bits as `x & ~(1 << 3)` (clear 4th bit of *x*).
- `^` (bitwise [XOR](xor.md)): Performs the logical XOR on all corresponding bits of two operands, e.g. `0b0110 ^ 0b1100` gives `0b1010` (10 in decimal). This is used to e.g. flip specific bits.
- `<<` and `>>` (bitwise shift left/right): Shifts the bits left or right (WATCH OUT: shifting by the data type width or more, or by a negative amount, is undefined behavior in C). This is typically used to perform fast multiplication (left) and division (right) by powers of two (2, 4, 8, 16, ...), just as we can quickly multiply/divide by 10 in decimal by shifting the decimal point. E.g. `5 << 3` is the same as 5 * 2^3 = 5 * 8 = 40.
- We also sometimes use the logical (i.e. NOT bitwise) operators `&&` (AND), `||` (OR) and `!` (NOT); they differ from the bitwise operators in that firstly they work with the whole value (i.e. not individual bits), considering 0 to be *false* and anything else to be *true*, and secondly they may employ a bit more complexity, e.g. [short circuit](short_circuit.md) evaluation.
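
The flag idiom described above can be sketched roughly like this. The `OPTION_*` names come from the text, the helper function names are made up just for this example, and hexadecimal literals are used because `0b` literals are not part of older C standards:

```c
#define OPTION_A 0x01 /* 0b0001 */
#define OPTION_B 0x02 /* 0b0010 */
#define OPTION_C 0x04 /* 0b0100 */

/* Hypothetical helpers, named only for this example. */

unsigned int flagsSet(unsigned int flags, unsigned int option)
{
  return flags | option;        /* OR sets the option's bit */
}

unsigned int flagsClear(unsigned int flags, unsigned int option)
{
  return flags & ~option;       /* AND with inverted mask clears it */
}

unsigned int flagsToggle(unsigned int flags, unsigned int option)
{
  return flags ^ option;        /* XOR flips it */
}

int flagsHas(unsigned int flags, unsigned int option)
{
  return (flags & option) != 0; /* non-zero AND means the bit is set */
}
```

For example `flagsSet(flagsSet(0, OPTION_B), OPTION_C)` gives `0b0110`, i.e. options B and C selected.
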
## Specific Bit Hacks
{ Work in progress. I'm taking these from various sources such as the *Hacker's Delight* book or web and rewriting them a bit, always testing. Some of these are my own. ~drummyfish }
TODO: stuff from this gophersite: gopher://bitreich.org/0/thaumaturgy/bithacks
Unless noted otherwise we suppose [C](c.md) syntax and semantics and integer [data types](data_type.md), but of course we mainly want to express formulas and patterns you can use anywhere, not just in C. Keep in mind all potential dangers, for example it may sometimes be better to write idiomatic code and let the compiler do the optimization that's best for the given platform, also of course readability will worsen etc. Nevertheless as a hacker you should know about these tricks, they're useful for low level code etc.
**2^N**: `1 << N`
**[absolute value](abs.md) of x ([two's complement](twos_complement.md))**:
```
int t = x >> (sizeof(x) * 8 - 1);
x = (x + t) ^ t;
```
**average x and y without overflow**: `(x & y) + ((x ^ y) >> 1)` { TODO: works with unsigned, not sure about signed. ~drummyfish }
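
Why the averaging formula works: `x + y` equals `2 * (x & y) + (x ^ y)`, because AND gives the bits set in both numbers (they count twice in the sum) and XOR gives the bits set in only one of them. Halving that never needs the full sum. A minimal sketch for unsigned 32 bit values (the function name is just made up for this example):

```c
#include <stdint.h>

/* Floor of (x + y) / 2 without computing x + y, so nothing overflows.
   Bits set in both numbers contribute fully, bits set in only one of
   them contribute half. */
uint32_t averageU32(uint32_t x, uint32_t y)
{
  return (x & y) + ((x ^ y) >> 1);
}
```

Here e.g. `averageU32(4000000000u, 4000000000u)` gives 4000000000, while the naive `(x + y) / 2` would wrap around in 32 bits.
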
**clear (to 0) Nth bit of x**: `x & ~(1 << N)`
**clear (to 0) rightmost 1 bit of x**: `x & (x - 1)`
**conditionally add (subtract etc.) x and y based on condition c (c is 0 or 1)**: `x + ((0 - c) & y)`, this avoids branches AND ALSO multiplication by c, of course you may replace + with other operators.
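
A small sketch of the conditional add, assuming unsigned 32 bit operands (the function name is hypothetical). The point is that `0 - c` turns the condition into a mask of all zeros or all ones:

```c
#include <stdint.h>

/* Returns x + y if c is 1 and x if c is 0; c must be exactly 0 or 1.
   0 - c is all zero bits for c == 0 and all one bits for c == 1, so
   ANDing it with y either drops y or keeps it unchanged. */
uint32_t addIf(uint32_t x, uint32_t y, uint32_t c)
{
  return x + ((0 - c) & y);
}
```

Since comparisons in C result in 0 or 1, they can be passed directly as the condition, e.g. `addIf(score, bonus, lives > 3)`.
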
**count 0 bits of x**: Count 1 bits and subtract from data type width.
**count 1 bits of x (8 bit)**: We add neighboring bits in parallel, then neighboring groups of 2 bits, then neighboring groups of 4 bits.
```
x = (x & 0x55) + ((x >> 1) & 0x55);
x = (x & 0x33) + ((x >> 2) & 0x33);
x = (x & 0x0f) + (x >> 4);
```
**count 1 bits of x (32 bit)**: Analogous to 8 bit version.
```
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f);
x = (x & 0x00ff00ff) + ((x >> 8) & 0x00ff00ff);
x = (x & 0x0000ffff) + (x >> 16);
```
**count leading 0 bits in x (8 bit)**:
```
int r = (x == 0);
if (x <= 0x0f) { r += 4; x <<= 4; }
if (x <= 0x3f) { r += 2; x <<= 2; }
if (x <= 0x7f) { r += 1; }
```
**count leading 0 bits in x (32 bit)**: Analogous to 8 bit version.
```
int r = (x == 0);
if (x <= 0x0000ffff) { r += 16; x <<= 16; }
if (x <= 0x00ffffff) { r += 8; x <<= 8; }
if (x <= 0x0fffffff) { r += 4; x <<= 4; }
if (x <= 0x3fffffff) { r += 2; x <<= 2; }
if (x <= 0x7fffffff) { r += 1; }
```
**divide x by 2^N**: `x >> N`
**divide x by 3 (unsigned at least 16 bit, x < 256)**: `((x + 1) * 85) >> 8`, we use kind of a [fixed point](fixed_point.md) multiplication by reciprocal (1/3), on some platforms this may be faster than using the divide instruction, but not always (also compilers often do this for you). { I checked this particular trick and it gives exact results for any x < 256, however this may generally not be the case for other constants than 3. Still even if not 100% accurate this can be used to approximate division. ~drummyfish }
**divide x by 5 (unsigned at least 16 bit, x < 256)**: `((x + 1) * 51) >> 8`, analogous to divide by 3.
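
The claim that both division formulas are exact for x < 256 can be checked by brute force, roughly like this (the function name is made up for this example):

```c
/* Returns 1 if both formulas agree with real integer division for
   every x from 0 to 255, otherwise 0. */
int divisionHacksExact(void)
{
  for (unsigned int x = 0; x < 256; ++x)
    if ((((x + 1) * 85) >> 8) != x / 3 ||
        (((x + 1) * 51) >> 8) != x / 5)
      return 0;

  return 1;
}
```

The range limit matters: for x = 258 the divide by 3 formula already gives 85 instead of 86.
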
**get Nth bit of x**: `(x >> N) & 0x01`
**is x a power of 2?**: `x && ((x & (x - 1)) == 0)`
**is x even?**: `(x & 0x01) == 0`
**is x odd?**: `(x & 0x01)`
**isolate rightmost 0 bit of x**: `~x & (x + 1)`
**isolate rightmost 1 bit of x**: `x & (~x + 1)` (in [two's complement](twos_complement.md) equivalent to `x & -x`)
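
Isolating and clearing the rightmost 1 bit combine nicely into a loop that visits only the bits that are set, skipping all the zeros. A sketch under the assumption of unsigned 32 bit values (function name and interface are made up for illustration):

```c
#include <stdint.h>

/* Stores each set bit of x as a one-bit mask (lowest first) into out,
   which must have room for 32 entries; returns how many were stored. */
int splitSetBits(uint32_t x, uint32_t out[32])
{
  int n = 0;

  while (x)
  {
    out[n++] = x & (~x + 1); /* isolate rightmost 1 bit */
    x &= x - 1;              /* clear rightmost 1 bit */
  }

  return n;
}
```

For `0x2c` (binary 101100) this stores `0x04`, `0x08` and `0x20` and returns 3.
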
**log base 2 of x**: Count leading 0 bits and subtract the count from (data type width - 1); this gives the logarithm rounded down.
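
As a sketch, the logarithm can be built on top of the 32 bit leading zero count listed earlier. Function names are made up for this example, and returning -1 for x = 0 is just a convention chosen here (the logarithm of 0 is undefined):

```c
#include <stdint.h>

/* Count of leading 0 bits, same steps as the 32 bit version above. */
int countLeadingZeros32(uint32_t x)
{
  int r = (x == 0);

  if (x <= 0x0000ffff) { r += 16; x <<= 16; }
  if (x <= 0x00ffffff) { r += 8; x <<= 8; }
  if (x <= 0x0fffffff) { r += 4; x <<= 4; }
  if (x <= 0x3fffffff) { r += 2; x <<= 2; }
  if (x <= 0x7fffffff) { r += 1; }

  return r;
}

/* Logarithm base 2 rounded down; gives -1 for x == 0. */
int log2Floor32(uint32_t x)
{
  return 31 - countLeadingZeros32(x);
}
```
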
**maximum of x and y**: `x ^ ((0 - (x < y)) & (x ^ y))`
**minimum of x and y**: `x ^ ((0 - (x > y)) & (x ^ y))`
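
The maximum and minimum formulas rely on the fact that `x ^ (x ^ y)` is `y`, so the comparison only decides whether the `x ^ y` part gets masked away. A sketch assuming 32 bit [two's complement](twos_complement.md) integers (function names are made up for this example):

```c
#include <stdint.h>

int32_t maxI32(int32_t x, int32_t y)
{
  /* mask is all ones if x < y (result is then y), else zero (result x) */
  return x ^ ((0 - (x < y)) & (x ^ y));
}

int32_t minI32(int32_t x, int32_t y)
{
  /* mask is all ones if x > y (result is then y), else zero (result x) */
  return x ^ ((0 - (x > y)) & (x ^ y));
}
```
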
**multiply x by 2^N**: `x << N`
**multiply by 7 (and other numbers close to 2^N)**: `(x << 3) - x`
**next higher or equal power of 2 from x (32 bit)**:
```
x--;
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x++;
x += (x == 0); /* x is unsigned; an input of 0 wraps around to 0 here, make it 1 */
```
**[parity](parity.md) of x (8 bit)**:
```
x ^= x >> 1;
x ^= x >> 2;
x = (x ^ (x >> 4)) & 0x01;
```
**reverse bits of x (8 bit)**: We switch neighboring bits, then switch neighboring groups of 2 bits, then neighboring groups of 4 bits.
```
x = ((x >> 1) & 0x55) | ((x & 0x55) << 1);
x = ((x >> 2) & 0x33) | ((x & 0x33) << 2);
x = ((x >> 4) & 0x0f) | (x << 4);
```
**reverse bits of x (32 bit)**: Analogous to the 8 bit version.
```
x = ((x >> 1) & 0x55555555) | ((x & 0x55555555) << 1);
x = ((x >> 2) & 0x33333333) | ((x & 0x33333333) << 2);
x = ((x >> 4) & 0x0f0f0f0f) | ((x & 0x0f0f0f0f) << 4);
x = ((x >> 8) & 0x00ff00ff) | ((x & 0x00ff00ff) << 8);
x = ((x >> 16) & 0x0000ffff) | (x << 16);
```
**rotate x left by N (8 bit)**: `(x << N) | (x >> (8 - N))` (watch out, in C: N < 8, if storing in wider type also do `& 0xff`)
**rotate x right by N (8 bit)**: analogous to left rotation, `(x >> N) | (x << (8 - N))`
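
A sketch of both rotations wrapped in functions (names are made up for this example). Here the shift amount is reduced modulo 8 first so that any N is safe, and the result is cast back to 8 bits because C promotes the operands to a wider type before shifting:

```c
#include <stdint.h>

uint8_t rotateLeft8(uint8_t x, unsigned int n)
{
  n &= 7; /* keep n in range 0 to 7 */
  return (uint8_t) ((x << n) | (x >> (8 - n)));
}

uint8_t rotateRight8(uint8_t x, unsigned int n)
{
  n &= 7; /* keep n in range 0 to 7 */
  return (uint8_t) ((x >> n) | (x << (8 - n)));
}
```

E.g. `rotateLeft8(0x81, 1)` gives `0x03`: the highest bit leaves on the left and comes back in on the right.
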
**set (to 1) Nth bit of x**: `x | (1 << N)`
**set (to 1) the rightmost 0 bit of x**: `x | (x + 1)`
**set or clear Nth bit of x to b**: `(x & ~(1 << N)) | (b << N)`
**sign of x (returns 1, 0 or -1)**: `(x > 0) - (x < 0)`
**swap x and y (without tmp var.)**: `x ^= y; y ^= x; x ^= y;` or `x -= y; y += x; x = y - x;`
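
One thing to watch out for with the XOR swap: if both operands are actually the same variable (e.g. two pointers to the same memory), the first step zeroes it and the value is lost. A sketch of a guarded version follows; the function name is made up for this example and the guard of course brings back a comparison:

```c
/* Swaps the values a and b point to without a temporary variable.
   Without the check, swapping a variable with itself would zero it,
   because x ^ x is 0. */
void swapXor(unsigned int *a, unsigned int *b)
{
  if (a != b)
  {
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
  }
}
```
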
**toggle Nth bit of x**: `x ^ (1 << N)`
**toggle x between A and B**: `(x ^ A) ^ B`
**x and y have different signs?**: `(x > 0) != (y > 0)`, `(x >= 0) != (y >= 0)` etc. (the variants differ in whether 0 is grouped with negative or positive numbers)
TODO: the ugly hacks that use conversion to/from float?
## See Also
- [De Morgan's laws](de_morgans_laws.md)
- [fast inverse square root](fast_inverse_sqrt.md)
- [optimization](optimization.md)