less_retarded_wiki/turing_machine.md
2023-12-13 20:50:14 +01:00

13 KiB

Turing Machine

Turing machine is a mathematical model of a computer which works in a quite simple way but has nevertheless the full computational power that's possible to be achieved. Turing machine is one of the most important tools of theoretical computer science as it presents a basic model of computation (i.e. a mathematical system capable of performing general mathematical calculations) for studying computers and algorithms -- in fact it stood at the beginning of theoretical computer science when Alan Turing invented it in 1936 and used it to mathematically prove essential things about computers; for example that their computational power is inevitably limited (see computability) -- he showed that even though Turing machine has the full computational power we can hope for, there exist problems it is incapable of solving (and so will be any other computer equivalent to Turing machine, even human brain). Since then many other so called Turing complete systems (systems with the exact same computational power as a Turing machine) have been invented and discovered, such as lambda calculus or Petri nets, however Turing machine still remains not just relevant, but probably of greatest importance, not only historically, but also because it is similar to physical computers in the way it works.

The advantage of a Turing machine is that it's firstly very simple (it's basically a finite state automaton operating on a memory tape), so it can be mathematically grasped very easily, and secondly it is, unlike many other systems of computations, actually similar to real computers in principle, mainly by its sequential instruction execution and possession of an explicit memory tape it operates on (equivalent to RAM in traditional computers). However note that a pure Turing machine cannot exist in reality because there can never exist a computer with infinite amount of memory which Turing machine possesses; computers that can physically exist are really equivalent to finite state automata, i.e. the "weakest" kind of systems of computation. However we can see our physical computers as approximations of a Turing machine that in most relevant cases behave the same, so we do tend to theoretically view computers as "Turing machines with limited memory".

In Turing machine data and program are separated (data is stored on the tape, program is represented by the control unit), i.e. it is closer to Harvard architecture than von Neumann architecture.

Is there anything computationally more powerful than a Turing machine? Well, yes, but it's just kind of "mathematical fantasy". See e.g. oracle machine which adds a special "oracle" device to a Turing machine to make it magically solve undecidable problems.

How It Works

Turing machine has a few different versions (such as one with multiple memory tapes, memory tape unlimited in both directions, one with non-determinism, ones with differently defined halting conditions etc.), which are however equivalent in computing power, so here we will describe just one of the most common variants.

A Turing machine is composed of:

  • memory tape: Memory composed of infinitely many cells (numbered 0, 1, 2, ...), each cell can hold exactly one symbol from some given alphabet (can be e.g. just symbols 0 and 1) OR the special blank symbol. At the beginning all memory cells contain the blank symbol. Memory holds the data on which we perform computation.
  • read/write head: Head that is positioned above a memory cell, can be moved to left or right. At the beginning the head is at memory cell 0.
  • control unit: The program (algorithm) that's "loaded" on the machine (the controls unit by itself is really a finite state automaton). It is composed of:
    • a set of N (finitely many) states {Q0, Q1, ... QN-1}: The machine is always in one of these states. One state is defined as starting (this is the state the machine is in at the beginning), one is the end state (the one which halts the machine when it is reached).
    • a set of finitely many rules in the format [stateFrom, inputSymbol, stateTo, outputSymbol, headShift], where stateFrom is the current state, inputSymbol is symbol currently under the read/write head, stateTo is the state the machine will transition to, outputSymbol is the symbol that will be written to the memory cell under read/write head and headShift is the direction to shift the read/write head in (either left, right or none). There must not be conflicting rules (ones with the same combination of stateFrom and inputSymbol).

The machine halts either when it reaches the end state, when it tries to leave the tape (go left from memory cell 0) or when it encounters a situation for which it has no defined rule.

The computation works like this: the input data we want to process (for example a string we want to reverse) are stored in the memory before we run the machine. Then we run the machine and wait until it finishes, then we take what's present in the memory as the machine's output (i.e. for example the reversed string). That is a Turing machine doesn't have a traditional I/O (such as a "printf" function), it only works entirely on data in memory!

Let's see a simple example: we will program a Turing machine that takes a binary number on its output and adds 1 to it (for simplicity we suppose a fixed number of bits so an overflow may happen). Let us therefore suppose symbols 0 and 1 as the tape alphabet. The control unit will have the following rules:

stateFrom inputSymbol stateTo outputSymbol headShift
goRight non-blank goRight inputSymbol right
goRight blank add1 blank left
add1 0 add0 1 left
add1 1 add1 0 left
add0 0 add0 0 left
add0 1 add0 1 left
end

Our start state will be goRight and end will be the end state, though we won't need the end state as our machine will always halt by leaving the tape. The states are made so as to first make the machine step by cells to the right until it finds the blank symbol, then it will step one step left and switch to the adding mode. Adding works just as we are used to, with potentially carrying 1s over to the highest orders etc.

Now let us try inputting the binary number 0101 (5 in decimal) to the machine: this means we will write the number to the tape and run the machine as so:

goRight
  _V_ ___ ___ ___ ___ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
    goRight
  ___ _V_ ___ ___ ___ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
        goRight
  ___ ___ _V_ ___ ___ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
            goRight
  ___ ___ ___ _V_ ___ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
                goRight
  ___ ___ ___ ___ _V_ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
              add1
  ___ ___ ___ _V_ ___ ___ ___ ___
 | 0 | 1 | 0 | 1 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
          add1
  ___ ___ _V_ ___ ___ ___ ___ ___
 | 0 | 1 | 0 | 0 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
      add0
  ___ _V_ ___ ___ ___ ___ ___ ___
 | 0 | 1 | 1 | 0 |   |   |   |    ...
 '---'---'---'---'---'---'---'---
  add0
  _V_ ___ ___ ___ ___ ___ ___ ___
 | 0 | 1 | 1 | 0 |   |   |   |    ...
 '---'---'---'---'---'---'---'---

 END

Indeed, we see the number we got at the output is 0110 (6 in decimal, i.e. 5 + 1). Even though this way of programming is very tedious, it actually allows us to program everything that is possible to be programmed, even whole operating systems, neural networks, games such as Doom and so on. Here is C code that simulates the above shown Turing machine with the same input:

#include <stdio.h>

#define CELLS 2048       // ideal Turing machine would have an infinite tape...
#define BLANK 0xff       // blank tape symbol
#define STATE_END   0xff
#define SHIFT_NONE  0
#define SHIFT_LEFT  1
#define SHIFT_RIGHT 2

unsigned int state;            // 0 = start state, 0xffff = end state    
unsigned int headPosition;
unsigned char tape[CELLS];     // memory tape

unsigned char input[] =        // what to put on the tape at start
  { 0, 1, 0, 1 };

unsigned char rules[] =
{
// state symbol newstate newsymbol shift
   0,    0,     0,       0,        SHIFT_RIGHT, // moving right
   0,    1,     0,       1,        SHIFT_RIGHT, // moving right
   0,    BLANK, 1,       BLANK,    SHIFT_LEFT,  // moved right
   1,    0,     2,       1,        SHIFT_LEFT,  // add 1
   1,    1,     1,       0,        SHIFT_LEFT,  // add 1
   2,    0,     2,       0,        SHIFT_LEFT,  // add 0
   2,    1,     2,       1,        SHIFT_LEFT   // add 0
};

void init(void)
{
  state = 0;
  headPosition = 0;

  for (unsigned int i = 0; i < CELLS; ++i)
    tape[i] = i < sizeof(input) ? input[i] : BLANK;
}

void print(void)
{
  printf("state %d, tape: ",state);

  for (unsigned int i = 0; i < 32; ++i)
    printf("%c%c",tape[i] != BLANK ? '0' + tape[i] : '.',i == headPosition ?
      '<' : ' ');

  putchar('\n');
}

// Returns 1 if running, 0 if halted.
unsigned char step(void)
{
  const unsigned char *rule = rules;

  for (unsigned int i = 0; i < sizeof(rules) / 5; ++i)
  {
    if (rule[0] == state && rule[1] == tape[headPosition]) // rule matches?
    {
      state = rule[2];
      tape[headPosition] = rule[3];

      if (rule[4] == SHIFT_LEFT)
      {
        if (headPosition == 0)
          return 0;  // trying to shift below cell 0
        else
          headPosition--;
      }
      else if (rule[4] == SHIFT_RIGHT)
        headPosition++;

      return state != STATE_END;
    }

    rule += 5;
  }

  return 0;
}

int main(void)
{
  init();

  print();

  while (step())
    print();

  puts("halted");

  return 0;
}

And here is the program's output:

state 0, tape: 0<1 0 1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1<0 1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0<1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0 1<. . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0 1 .<. . . . . . . . . . . . . . . .
state 1, tape: 0 1 0 1<. . . . . . . . . . . . . . . . .
state 1, tape: 0 1 0<0 . . . . . . . . . . . . . . . . .
state 2, tape: 0 1<1 0 . . . . . . . . . . . . . . . . .
state 2, tape: 0<1 1 0 . . . . . . . . . . . . . . . . .
halted

Universal Turing machine is an extremely important type of Turing machine: one that is able to simulate another Turing machine -- we can see it as a Turing machine interpreter of a Turing machine. The Turing machine that's to be simulated is encoded into a string (which can then be seen as a programming language -- the format of the string can vary, but it somehow has to encode the rules of the control unit) and this string, along with an input to the simulated machine, is passed to the universal machine which executes it. This is important because now we can see Turing machines themselves as programs and we may use Turing machines to analyze other Turing machines, to become self hosted etc. It opens up a huge world of possibilities.

Non-deterministic Turing machine is a modification of Turing machine which removes the limitation of determinism, i.e. which allows for having multiple different "conflicting" rules defined for the same combination of state and input. During execution such machine can conveniently choose which of these rules to follow, or, imagined differently, we may see the machine as executing all possible computations in parallel and then retroactively leaving in place only the most convenient path (e.g. that which was fastest or the one which finished without getting stuck in an infinite loop). Surprisingly a non-deterministic Turing machine is computationally equivalent to a deterministic Turing machine, though of course a non-deterministic machine may be faster (see especially P vs NP).

Turing machines can be used to define computable formal languages. Let's say we want to define language L (which may be anything such as a programming language) -- we may do it by programming a Turing machine that takes on its input a string (a word) and outputs "yes" if that string belongs to the language, or "no" if it doesn't. This is again useful for the theory of decidability/computability.

See Also