less_retarded_wiki/bit_hack.md
2024-12-12 21:58:18 +01:00

8.9 KiB

Bit Hack

Bit hacks (also bit tricks, bit magic, bit twiddling etc.) are simple clever formulas for performing useful operations with binary numbers. Some operations, such as checking if a number is power of two or reversing bits in a number, can be done very efficiently with these hacks, without using loops, branching and other undesirably slow operations, potentially increasing speed and/or decreasing size and/or memory usage of code -- this can help us optimize. Many of these can be found on the web and there are also books such as Hacker's Delight which document such hacks.

Basics

Basic bit manipulation techniques are common and part of general knowledge so they won't be listed under hacks, but for sake of completeness and beginners reading this we should mention them here. Let's see the basic bit manipulation operators in C:

  • | (bitwise OR): Performs the logical OR on all corresponding bits of two operands, e.g. 0b0110 | 0b1100 gives 0b1110 (14 in decimal). This is used to set bits and combine flags (options) into a single numeric value that can easily be passed to function etc. For example to set the lowest bit of a number to 1 just do myNumber | 1. Now consider e.g. #define OPTION_A 0b0001, #define OPTION_B 0b0010 and #define OPTION_C 0b0100, now we can make a single number that represents a set of selected options e.g. as OPTION_C | OPTION_B (the value will be 0101 and says that options B and C have been selected).
  • & (bitwise AND): Performs the logical AND on all corresponding bits of two operands, e.g. 0b0110 & 0b1100 gives 0b0100 (4 in decimal). This may be used to mask out specific bits, to check if specific bits are set (useful to check the set flags as mentioned above) or to clear (set to zero) specific bits. Consider the flag example from above, if we want to check if value x has e.g. the option B set, we simply do x & OPTION_B which results in non-zero value if the option is set. Another example may be myNumber & 0b00001111 (in practice you'll see hexadecimal values, i.e. myNumber & 0x0F) which masks out the lowest 4 bits of myNumber (which is equivalent to the operation modulo 16).
  • ~ (bitwise NOT): Flips every bit of the number -- pretty straightforward. This is used e.g. for clearing bits as x & ~(1 << 3) (clear 4th bit of x).
  • ^ (bitwise XOR): Performs the logical XOR on all corresponding bits of two operands, e.g. 0b0110 ^ 0b1100 gives 0b1010 (10 in decimal). This is used to e.g. flip specific bits.
  • << and >> (binary shift left/right): Performs bitwise shift left or right (WATCH OUT: shifting by data type width or more is undefined behavior in C). This is typically used to perform fast multiplication (left) and division (right) by powers of two (2, 4, 8, 16, ...), just as we can quickly multiply/divide by 10 in decimal by shifting the decimal point. E.g. 5 << 3 is the same as 5 * 2^3 = 5 * 8 = 40.
  • We also sometimes use the logical (i.e. NOT bitwise) operators && (AND), || (OR) and ! (NOT); the difference against bitwise operators is that firstly they work with the whole value (i.e. not individual bits), considering 0 to be false and anything else to be true, and secondly they may employ a bit more complexity, e.g. short circuit evaluation.

Specific Bit Hacks

{ Work in progress. I'm taking these from various sources such as the Hacker's Delight book or web and rewriting them a bit, always testing. Some of these are my own. ~drummyfish }

TODO: stuff from this gophersite: gopher://bitreich.org/0/thaumaturgy/bithacks

Unless noted otherwise we suppose C syntax and semantics and integer data types, but of course we mainly want to express formulas and patterns you can use anywhere, not just in C. Keep in mind all potential dangers, for example it may sometimes be better to write an idiomatic code and let compiler do the optimization that's best for given platform, also of course readability will worsen etc. Nevertheless as a hacker you should know about these tricks, it's useful for low level code etc.

2^N: 1 << N

absolute value of x (two's complement):

  int t = x >> (sizeof(x) * 8 - 1);
  x = (x + t) ^ t;

average x and y without overflow: (x & y) + ((x ^ y) >> 1) { TODO: works with unsigned, not sure about signed. ~drummyfish }

clear (to 0) Nth bit of x: x & ~(1 << N)

clear (to 0) rightmost 1 bit of x: x & (x - 1)

conditionally add (subtract etc.) x and y based on condition c (c is 0 or 1): x + ((0 - c) & y), this avoids branches AND ALSO multiplication by c, of course you may replace + by another operators.

count 0 bits of x: Count 1 bits and subtract from data type width.

count 1 bits of x (8 bit): We add neighboring bits in parallel, then neighboring groups of 2 bits, then neighboring groups of 4 bits.

  x = (x & 0x55) + ((x >> 1) & 0x55);
  x = (x & 0x33) + ((x >> 2) & 0x33);
  x = (x & 0x0f) + (x >> 4);

count 1 bits of x (32 bit): Analogous to 8 bit version.

  x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f);
  x = (x & 0x00ff00ff) + ((x >> 8) & 0x00ff00ff);
  x = (x & 0x0000ffff) + (x >> 16);

count leading 0 bits in x (8 bit):

  int r = (x == 0);
  if (x <= 0x0f) { r += 4; x <<= 4; }
  if (x <= 0x3f) { r += 2; x <<= 2; }
  if (x <= 0x7f) { r += 1; }

count leading 0 bits in x (32 bit): Analogous to 8 bit version.

  int r = (x == 0);
  if (x <= 0x0000ffff) { r += 16; x <<= 16; }
  if (x <= 0x00ffffff) { r += 8; x <<= 8; }
  if (x <= 0x0fffffff) { r += 4; x <<= 4; }
  if (x <= 0x3fffffff) { r += 2; x <<= 2; }
  if (x <= 0x7fffffff) { r += 1; }

divide x by 2^N: x >> N

divide x by 3 (unsigned at least 16 bit, x < 256): ((x + 1) * 85) >> 8, we use kind of a fixed point multiplication by reciprocal (1/3), on some platforms this may be faster than using the divide instruction, but not always (also compilers often do this for you). { I checked this particular trick and it gives exact results for any x < 256, however this may generally not be the case for other constants than 3. Still even if not 100% accurate this can be used to approximate division. ~drummyfish }

divide x by 5 (unsigned at least 16 bit, x < 256): ((x + 1) * 51) >> 8, analogous to divide by 3.

get Nth bit of x: (x >> N) & 0x01

is x a power of 2?: x && ((x & (x - 1)) == 0)

is x even?: (x & 0x01) == 0

is x odd?: (x & 0x01)

isolate rightmost 0 bit of x: ~x & (x + 1)

isolate rightmost 1 bit of x: x & (~x + 1) (in two's complement equivalent to x & -x)

log base 2 of x: Count leading 0 bits, subtract from data type width - 1.

maximum of x and y: x ^ ((0 - (x < y)) & (x ^ y))

minimum of x and y: x ^ ((0 - (x > y)) & (x ^ y))

multiply x by 2^N: x << N

multiply by 7 (and other numbers close to 2^N): (x << 3) - x

next higher or equal power of 2 from x (32 bit):

  x--;
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  x = x + 1 + (x == 0);

parity of x (8 bit):

  x ^= x >> 1;
  x ^= x >> 2;
  x = (x ^ (x >> 4)) & 0x01;

reverse bits of x (8 bit): We switch neighboring bits, then switch neighboring groups of 2 bits, then neighboring groups of 4 bits.

  x = ((x >> 1) & 0x55) | ((x & 0x55) << 1);
  x = ((x >> 2) & 0x33) | ((x & 0x33) << 2);
  x = ((x >> 4) & 0x0f) | (x << 4);

reverse bits of x (32 bit): Analogous to the 8 bit version.

  x = ((x >> 1) & 0x55555555) | ((x & 0x55555555) << 1);
  x = ((x >> 2) & 0x33333333) | ((x & 0x33333333) << 2);
  x = ((x >> 4) & 0x0f0f0f0f) | ((x & 0x0f0f0f0f) << 4);
  x = ((x >> 8) & 0x00ff00ff) | ((x & 0x00ff00ff) << 8);
  x = ((x >> 16) & 0x0000ffff) | (x << 16);

rotate x left by N (8 bit): (x << N) | (x >> (8 - N)) (watch out, in C: N < 8, if storing in wider type also do & 0xff)

rotate x right by N (8 bit): analogous to left rotation, (x >> N) | (x << (8 - N))

set (to 1) Nth bit of x: x | (1 << N)

set (to 1) the rightmost 0 bit of x: x | (x + 1)

set or clear Nth bit of x to b: (x & ~(1 << N)) | (b << N)

sign of x (returns 1, 0 or -1): (x > 0) - (x < 0)

swap x and y (without tmp var.): x ^= y; y ^= x; x ^= y; or x -= y; y += x; x = y - x;

toggle Nth bit of x: x ^ (1 << N)

toggle x between A and B: (x ^ A) ^ B

x and y have different signs?: (x > 0) == (y > 0), (x <= 0) == (y <= 0) etc. (differs on 0:0 behavior)

TODO: the ugly hacks that use conversion to/from float?

See Also