less_retarded_wiki/binary.md
2022-12-25 22:32:33 +01:00

Binary

The word binary in general refers to having two choices; in computer science binary refers to the base 2 numeral system, i.e. a system of writing numbers with only two symbols, usually 1s and 0s. We can write any number in binary just as we can with our everyday decimal system, but binary is more convenient for computers because this system is easy to implement in electronics (a switch can be on or off, i.e. 1 or 0; systems with more than two digit values were tried but were unsuccessful, they failed miserably in reliability). The word binary is also by extension used for non-textual computer files such as native executable programs or asset files for games.

One binary digit can be used to store exactly 1 bit of information. So the number of places we have for writing a binary number (e.g. in computer memory) is called a number of bits or bit width. A bit width N allows for storing 2^N values (e.g. with 2 bits we can store 4 values: 0, 1, 2 and 3).
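The relationship between bit width and the number of storable values can be checked with a small sketch. The function names `value_count` and `all_patterns` are made up for this illustration.

```python
def value_count(bit_width):
    """Number of distinct values that fit in bit_width bits, i.e. 2^N."""
    return 2 ** bit_width

def all_patterns(bit_width):
    """List every bit pattern of the given width as a string of 0s and 1s."""
    return [format(i, "0{}b".format(bit_width))
            for i in range(value_count(bit_width))]

print(value_count(2))   # 4
print(all_patterns(2))  # ['00', '01', '10', '11']
print(value_count(8))   # 256
```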

At the basic level binary works just like the decimal (base 10) system we're used to. While the decimal system uses powers of 10, binary uses powers of 2.

For example let's have a number that's written as 10135 in decimal. The first digit from the right (5) says the number of 10^(0)s (= 1) in the number, the second digit (3) says the number of 10^(1)s (= 10), the third digit (1) says the number of 10^(2)s (= 100) etc. Similarly if we now have a number 100101 in binary, the first digit from the right (1) says the number of 2^(0)s (= 1), the second digit (0) says the number of 2^(1)s (= 2), the third digit (1) says the number of 2^(2)s (= 4) etc. Therefore this binary number can be converted to decimal by simply computing 1 * 2^0 + 0 * 2^1 + 1 * 2^2 + 0 * 2^3 + 0 * 2^4 + 1 * 2^5 = 1 + 4 + 32 = 37.
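The positional sum above can be written as a short function. This is only a sketch; the name `binary_to_decimal` is made up here and the input is assumed to be a string containing only 0s and 1s.

```python
def binary_to_decimal(digits):
    """Convert a string of 0s and 1s to an integer by summing powers of 2."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        # position 0 is the rightmost digit, which is worth 2^0
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("100101"))  # 37
```

Python's built-in `int("100101", 2)` gives the same result; the loop is spelled out only to show the principle.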

To convert from decimal to binary we can use a simple algorithm that's derived from the above. Let's say we have a number X we want to write in binary. We will write digits from right to left. The first (rightmost) digit is the remainder after integer division of X by 2. Then we divide the number by 2 (integer division, i.e. the remainder is thrown away). The second digit is again the remainder after division by 2. Then we divide the number by 2 again. This continues until the number is 0. For example let's convert the number 22 to binary: first digit = 22 % 2 = 0; 22 / 2 = 11; second digit = 11 % 2 = 1; 11 / 2 = 5; third digit = 5 % 2 = 1; 5 / 2 = 2; fourth digit = 2 % 2 = 0; 2 / 2 = 1; fifth digit = 1 % 2 = 1; 1 / 2 = 0. As the digits were produced from right to left, the result is 10110.
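The repeated division can be sketched like this. The name `decimal_to_binary` is made up, the input is assumed to be a non-negative integer, and returning "0" for zero is an added assumption (the loop as described would otherwise produce no digits at all).

```python
def decimal_to_binary(x):
    """Write non-negative integer x in binary by repeated division by 2."""
    if x == 0:
        return "0"  # assumption: zero is written as a single 0 digit
    digits = ""
    while x > 0:
        digits = str(x % 2) + digits  # remainder becomes the next digit to the left
        x //= 2                       # integer division, remainder discarded
    return digits

print(decimal_to_binary(22))  # 10110
```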

TODO: operations in binary

In binary it is very simple and fast to divide and multiply by (powers of) 2, just as it is simple to divide and multiply by (powers of) 10 in decimal (we just shift the radix point, e.g. the binary number 1011 multiplied by 4 is 101100, we just added two zeros at the end). This is why as a programmer you should prefer working with powers of two.
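In most programming languages this shifting is exposed as the bit shift operators `<<` and `>>`. The following sketch assumes non-negative integers; the function names are made up for the illustration.

```python
def multiply_by_power_of_two(x, k):
    """Multiply x by 2^k by shifting its bits k places to the left."""
    return x << k

def divide_by_power_of_two(x, k):
    """Integer-divide x by 2^k by shifting its bits k places to the right."""
    return x >> k

print(format(multiply_by_power_of_two(0b1011, 2), "b"))  # 101100
print(divide_by_power_of_two(44, 2))  # 11
print(divide_by_power_of_two(45, 2))  # 11 (the remainder is discarded)
```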

Binary can be very easily converted to and from hexadecimal and octal because 1 hexadecimal (octal) digit always maps to exactly 4 (3) binary digits. E.g. the hexadecimal number F0 is 11110000 in binary.
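Because of this fixed mapping the conversion can be done digit by digit, without any arithmetic on the whole number. A minimal sketch follows; the function names are made up, and inputs are assumed to be valid digit strings without any prefix.

```python
def hex_to_binary(hex_digits):
    """Replace each hexadecimal digit with its 4 binary digits."""
    return "".join(format(int(d, 16), "04b") for d in hex_digits)

def octal_to_binary(oct_digits):
    """Replace each octal digit with its 3 binary digits."""
    return "".join(format(int(d, 8), "03b") for d in oct_digits)

def binary_to_hex(bits):
    """Group binary digits by 4 from the right, one hex digit per group."""
    # pad with zeros on the left so the length is a multiple of 4
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    return "".join(format(int(padded[i:i + 4], 2), "X")
                   for i in range(0, len(padded), 4))

print(hex_to_binary("F0"))        # 11110000
print(octal_to_binary("17"))      # 001111
print(binary_to_hex("11110000"))  # F0
print(binary_to_hex("101101"))    # 2D
```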

We can work with the binary representation the same way as with decimal, i.e. we can e.g. write negative numbers such as -110101 or rational numbers such as 1011.001101. However in computer memory there are no other symbols than 1 and 0, so we can't use extra symbols such as - or . to represent such values. So if we want to represent numbers other than non-negative integers, we literally have to only use 1s and 0s and choose a specific representation, or format of numbers -- there are several formats for representing e.g. signed (potentially negative) or rational numbers, each with pros and cons. The following are the most common number representations:

  • two's complement: Allows storing integers, both positive, negative and zero. It is probably the most common representation of integers because of its great advantages: basic operations (+, -, *) are performed exactly the same as with "normal" binary numbers, and there is no negative zero (which would be an inconvenience and waste of memory). Negating a number (from negative to positive and vice versa) is done simply by inverting all the bits and adding 1. The leftmost bit signifies the number's sign (0 = +, 1 = -).
  • sign-magnitude: Allows storing integers, both positive, negative and zero. It's pretty straightforward: the leftmost bit in a number serves as a sign, 0 = +, 1 = -, and the rest of the number is the distance from zero in "normal" representation. So e.g. 0011 is 3 while 1011 is -3. The disadvantage is there are two values for zero (positive, 0000 and negative, 1000) which wastes a value and presents a computational inconvenience, and operations with these numbers are more complicated and slower (checking the sign requires extra code).
  • one's complement: Allows storing integers, both positive, negative and zero. The leftmost bit signifies a sign, in the same way as with sign-magnitude, but numbers are negated differently: a positive number is turned into negative (and vice versa) by inverting all bits. So e.g. 0011 is 3 while 1100 is -3. The disadvantage is there are two values for zero (positive, 0000 and negative, 1111) which wastes a value and presents a computational inconvenience, and some operations with these numbers may be more complex.
  • fixed point: Allows storing rational numbers (fractions), i.e. numbers with a radix point (such as 1101.011), which can also be positive, negative or zero. It works by supposing a radix point at some fixed position in the binary representation, e.g. if we have an 8 bit number, we may consider 5 leftmost bits to represent the whole part and 3 rightmost bits to be the fractional part (so e.g. the number 11010110 represents 11010.110). The advantage here is extreme simplicity (we can use normal integer numbers as fixed point simply by imagining a radix point). The disadvantage may be low precision and small range of representable values.
  • floating point: Allows storing rational numbers in great ranges, both positive, negative and zero, plus some additional values such as infinity and not a number. It allows the radix point to be shifted which gives a potential for storing extremely big and extremely small numbers at the same time. The disadvantage is that float is extremely complex, bloated, wastes some values and for fast execution requires a special hardware unit (which most "normal" computers nowadays have, but which is missing e.g. in some embedded systems).
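The integer and fixed point formats above can be compared with a small sketch. It assumes a 4 bit width for the signed formats (the constant `BITS` and all function names are made up for this illustration) and does no range checking, so inputs are expected to fit in the chosen width.

```python
BITS = 4               # assumed width for this illustration
MASK = 2 ** BITS - 1   # 1111, keeps only the lowest BITS bits

def to_bits(pattern):
    """Format a non-negative integer as a BITS wide string of 0s and 1s."""
    return format(pattern, "0{}b".format(BITS))

def twos_complement(n):
    """Encode n (from -8 to 7 for 4 bits) in two's complement."""
    return to_bits(n & MASK)

def negate_twos_complement(bits):
    """Negate a two's complement number: invert all bits, then add 1."""
    inverted = ~int(bits, 2) & MASK
    return to_bits((inverted + 1) & MASK)

def sign_magnitude(n):
    """Encode n (from -7 to 7 for 4 bits): sign bit followed by distance from zero."""
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), "0{}b".format(BITS - 1))

def ones_complement(n):
    """Encode n (from -7 to 7 for 4 bits): negatives are the inverted positive."""
    if n >= 0:
        return to_bits(n)
    return to_bits(~(-n) & MASK)

def fixed_point_value(bits, fraction_bits):
    """Read an unsigned bit string with the radix point fraction_bits from the right."""
    return int(bits, 2) / 2 ** fraction_bits

print(twos_complement(-3))                # 1101
print(negate_twos_complement("0011"))     # 1101
print(sign_magnitude(-3))                 # 1011
print(ones_complement(-3))                # 1100
print(fixed_point_value("11010110", 3))   # 26.75, i.e. 11010.110
```

Note that only two's complement gives a single pattern for zero here; `sign_magnitude` and `ones_complement` leave the patterns 1000 and 1111 respectively as the "negative zero".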

As anything can be represented with numbers, binary can be used to store any kind of information such as text, images, sounds and videos. See data structures and file formats.
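As a small illustration, text can be turned into binary by mapping each character to a number and writing that number in binary. The sketch below assumes the ASCII encoding and 8 bits per character, which is just one of many possible choices; the function names are made up.

```python
def text_to_bits(text):
    """Encode ASCII text as groups of 8 binary digits, one group per character."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))

def bits_to_text(bits):
    """Decode space separated groups of 8 binary digits back to ASCII text."""
    return bytes(int(group, 2) for group in bits.split()).decode("ascii")

print(text_to_bits("Hi"))                 # 01001000 01101001
print(bits_to_text("01001000 01101001"))  # Hi
```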

See Also