Turing Machine
Turing machine is a mathematical model of a computer which works in a quite simple way but nevertheless has the full computational power, i.e. it is able to perform any possible computation which could be achieved by any other means. Turing machine is one of the most important tools of theoretical computer science as it presents a basic model of computation (i.e. a mathematical system capable of performing general mathematical calculations) for studying computers and algorithms -- in fact it stood at the beginning of theoretical computer science when Alan Turing invented it in 1936 and used it to mathematically prove essential things about computers; for example that their computational power is inevitably limited (see computability) -- he showed that even though Turing machine has the full computational power we can hope for, there exist problems it is incapable of solving (and so will be any other computer equivalent to Turing machine, even human brain). Since then many other so called Turing complete systems (systems with the exact same computational power as a Turing machine) have been invented and discovered, such as lambda calculus or Petri nets extended with inhibitor arcs (plain Petri nets are strictly weaker), however Turing machine still remains not just relevant, but probably of greatest importance, not only historically, but also because it is similar to physical computers in the way it works.
NOTE: It seems the term "computational power" is sometimes used as a measure of the speed of computation, but here we mean something different -- computational power to us is just the pure ABILITY to compute something in finite time. Turing machines can also be used to theoretically study the "speed" of computation (see computational complexity), but in this introduction we are merely interested in what CAN be computed, no matter in how many steps or how much memory it will require (as long as it's not infinitely many). I.e. when we say a Turing machine can in theory run for example the latest GTA game, we don't mean it would be efficient, fast etc. We are talking pure theoretical possibility.
The advantage of a Turing machine is that it's firstly very simple (it's basically a finite state automaton operating on a memory tape), so it can be mathematically grasped very easily, and secondly it is, unlike many other systems of computations, actually similar to real computers in principle, mainly by its sequential instruction execution and possession of an explicit memory tape it operates on (equivalent to RAM in traditional computers). However note that a pure Turing machine cannot exist in reality because there can never exist a computer with infinite amount of memory which Turing machine possesses; computers that can physically exist are really equivalent to finite state automata, i.e. the "weakest" kind of systems of computation. However we can see our physical computers as approximations of a Turing machine that in most relevant cases behave the same, so we do tend to theoretically view computers as "Turing machines with limited memory".
{ Although purely hypothetically we could entertain an idea of a computer that's capable of manufacturing itself a new tape cell whenever one is needed, which could then have something like unbounded memory tape, but still it would be limited at least by the amount of matter in observable universe. ~drummyfish }
In the "vanilla" Turing machine data and program are separated (data is stored on the tape, program is represented by the control unit), i.e. it is closer to Harvard architecture than von Neumann architecture; however (since with Turing machine we can program anything we could on any other computer) using this basic concept of a Turing machine we can construct so called universal Turing machine, i.e. basically an interpreter of Turing machines, which runs a program (the Turing machine being interpreted) that's stored in the same memory as the data.
Is there anything computationally more powerful than a Turing machine? Well, yes, but it's just kind of "mathematical fantasy". See e.g. oracle machine which adds a special "oracle" device to a Turing machine to make it magically solve undecidable problems.
How It Works
Turing machine has a few different versions (such as one with multiple memory tapes, memory tape unlimited in both directions, one with non-determinism, ones with differently defined halting conditions etc.), which are however equivalent in computing power, so here we will describe just one of the most common variants.
A Turing machine is composed of:
- memory tape: Memory composed of infinitely many cells (numbered 0, 1, 2, ...), each cell can hold exactly one symbol from some given alphabet (can be e.g. just symbols 0 and 1) OR the special blank symbol. At the beginning all memory cells contain the blank symbol. Memory holds the data on which we perform computation.
- read/write head: Head that is positioned above a memory cell, can be moved to left or right. At the beginning the head is at memory cell 0.
- control unit: The program (algorithm) that's "loaded" on the machine (the control unit by itself is really a finite state automaton). It is composed of:
- a set of N (finitely many) states {Q0, Q1, ... QN-1}: The machine is always in one of these states. One state is defined as starting (this is the state the machine is in at the beginning), one is the end state (the one which halts the machine when it is reached).
- a set of finitely many rules in the format [stateFrom, inputSymbol, stateTo, outputSymbol, headShift], where stateFrom is the current state, inputSymbol is symbol currently under the read/write head, stateTo is the state the machine will transition to, outputSymbol is the symbol that will be written to the memory cell under read/write head and headShift is the direction to shift the read/write head in (either left, right or none). There must not be conflicting rules (ones with the same combination of stateFrom and inputSymbol).
The machine halts either when it reaches the end state, when it tries to leave the tape (go left from memory cell 0) or when it encounters a situation for which it has no defined rule.
The computation works like this: the input data we want to process (for example a string we want to reverse) are stored in the memory before we run the machine. Then we run the machine and wait until it finishes, then we take what's present in the memory as the machine's output (i.e. for example the reversed string). That is, a Turing machine doesn't have a traditional I/O (such as a "printf" function), it works entirely on data in memory!
Let's see a simple example: we will program a Turing machine that takes a binary number on its input and adds 1 to it (for simplicity we suppose a fixed number of bits so an overflow may happen). Let us therefore suppose symbols 0 and 1 as the tape alphabet. The control unit will have the following rules:
stateFrom | inputSymbol | stateTo | outputSymbol | headShift |
---|---|---|---|---|
goRight | non-blank | goRight | inputSymbol | right |
goRight | blank | add1 | blank | left |
add1 | 0 | add0 | 1 | left |
add1 | 1 | add1 | 0 | left |
add0 | 0 | add0 | 0 | left |
add0 | 1 | add0 | 1 | left |
end |
Our start state will be goRight and end will be the end state, though we won't need the end state as our machine will always halt by leaving the tape. The states are made so as to first make the machine step by cells to the right until it finds the blank symbol, then it will step one step left and switch to the adding mode. Adding works just as we are used to, with potentially carrying 1s over to the highest orders etc.
Now let us try inputting the binary number 0101 (5 in decimal) to the machine: this means we will write the number to the tape and run the machine as so:
goRight
_V_ ___ ___ ___ ___ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
goRight
___ _V_ ___ ___ ___ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
goRight
___ ___ _V_ ___ ___ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
goRight
___ ___ ___ _V_ ___ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
goRight
___ ___ ___ ___ _V_ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
add1
___ ___ ___ _V_ ___ ___ ___ ___
| 0 | 1 | 0 | 1 | | | | ...
'---'---'---'---'---'---'---'---
add1
___ ___ _V_ ___ ___ ___ ___ ___
| 0 | 1 | 0 | 0 | | | | ...
'---'---'---'---'---'---'---'---
add0
___ _V_ ___ ___ ___ ___ ___ ___
| 0 | 1 | 1 | 0 | | | | ...
'---'---'---'---'---'---'---'---
add0
_V_ ___ ___ ___ ___ ___ ___ ___
| 0 | 1 | 1 | 0 | | | | ...
'---'---'---'---'---'---'---'---
END
Indeed, we see the number we got at the output is 0110 (6 in decimal, i.e. 5 + 1). Even though this way of programming is very tedious, it actually allows us to program everything that is possible to be programmed, even whole operating systems, neural networks, games such as Doom and so on. Here is C code that simulates the above shown Turing machine with the same input:
#include <stdio.h>

#define CELLS 2048     // ideal Turing machine would have an infinite tape...
#define BLANK 0xff     // blank tape symbol
#define STATE_END 0xff

#define SHIFT_NONE 0
#define SHIFT_LEFT 1
#define SHIFT_RIGHT 2

unsigned int state;        // 0 = start state, STATE_END (0xff) = end state
unsigned int headPosition;
unsigned char tape[CELLS]; // memory tape

unsigned char input[] =    // what to put on the tape at start
  { 0, 1, 0, 1 };

unsigned char rules[] =
{
// state symbol newstate newsymbol shift
  0,     0,     0,       0,        SHIFT_RIGHT, // moving right
  0,     1,     0,       1,        SHIFT_RIGHT, // moving right
  0,     BLANK, 1,       BLANK,    SHIFT_LEFT,  // moved right
  1,     0,     2,       1,        SHIFT_LEFT,  // add 1
  1,     1,     1,       0,        SHIFT_LEFT,  // add 1
  2,     0,     2,       0,        SHIFT_LEFT,  // add 0
  2,     1,     2,       1,        SHIFT_LEFT   // add 0
};

void init(void)
{
  state = 0;
  headPosition = 0;

  for (unsigned int i = 0; i < CELLS; ++i)
    tape[i] = i < sizeof(input) ? input[i] : BLANK;
}

void print(void)
{
  printf("state %u, tape: ",state);

  for (unsigned int i = 0; i < 32; ++i) // only the first 32 cells are shown
    printf("%c%c",tape[i] != BLANK ? '0' + tape[i] : '.',i == headPosition ?
      '<' : ' ');

  putchar('\n');
}

// Returns 1 if running, 0 if halted.
unsigned char step(void)
{
  const unsigned char *rule = rules;

  for (unsigned int i = 0; i < sizeof(rules) / 5; ++i) // 5 bytes per rule
  {
    if (rule[0] == state && rule[1] == tape[headPosition]) // rule matches?
    {
      state = rule[2];
      tape[headPosition] = rule[3];

      if (rule[4] == SHIFT_LEFT)
      {
        if (headPosition == 0)
          return 0; // trying to shift below cell 0
        else
          headPosition--;
      }
      else if (rule[4] == SHIFT_RIGHT)
      {
        if (headPosition >= CELLS - 1)
          return 0; // ran out of our finite simulated tape
        else
          headPosition++;
      }

      return state != STATE_END;
    }

    rule += 5;
  }

  return 0; // no rule for this state and symbol: halt
}

int main(void)
{
  init();
  print();

  while (step())
    print();

  puts("halted");

  return 0;
}
And here is the program's output:
state 0, tape: 0<1 0 1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1<0 1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0<1 . . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0 1<. . . . . . . . . . . . . . . . .
state 0, tape: 0 1 0 1 .<. . . . . . . . . . . . . . . .
state 1, tape: 0 1 0 1<. . . . . . . . . . . . . . . . .
state 1, tape: 0 1 0<0 . . . . . . . . . . . . . . . . .
state 2, tape: 0 1<1 0 . . . . . . . . . . . . . . . . .
state 2, tape: 0<1 1 0 . . . . . . . . . . . . . . . . .
halted
Universal Turing machine is an extremely important type of Turing machine: one that is able to simulate another Turing machine -- we can see it as a Turing machine interpreter of a Turing machine. The Turing machine that's to be simulated is encoded into a string (which can then be seen as a programming language -- the format of the string can vary, but it somehow has to encode the rules of the control unit) and this string, along with an input to the simulated machine, is passed to the universal machine which executes it. This is important because now we can see Turing machines themselves as programs and we may use Turing machines to analyze other Turing machines, to become self hosted etc. It opens up a huge world of possibilities.
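To make the idea of "a machine encoded as a string" a bit more tangible, here is a minimal sketch in C. Note it is NOT a real universal Turing machine (the interpreter here is a C function, not a Turing machine), it only shows the principle of treating the rules as data. The string format (rules separated by `;`, each written as `from in to out shift`, with `_` as blank), the function name `runEncoded`, the constant `MAX_RULES` and the step limit are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RULES 32 // assumed limit, only for this sketch

// One decoded rule: in state `from` reading `in` go to state `to`,
// write `out` and shift the head ('L', 'R' or 'N').
typedef struct { int from; char in; int to; char out; char shift; } Rule;

// Interprets the machine encoded in `program` on `tape`, which is a writable,
// zero terminated string padded with '_' (blank). State 0 is the start state.
// Returns the number of performed steps, or -1 if maxSteps was exceeded.
int runEncoded(const char *program, char *tape, int maxSteps)
{
  assert(program != NULL && tape != NULL);

  Rule rules[MAX_RULES];
  int ruleCount = 0;

  while (ruleCount < MAX_RULES) // decode the string into a rule table
  {
    Rule *r = &rules[ruleCount];
    int used = 0;

    if (sscanf(program," %d %c %d %c %c%n",
        &r->from,&r->in,&r->to,&r->out,&r->shift,&used) != 5)
      break; // nothing more to decode

    ruleCount++;
    program += used;

    while (*program == ' ' || *program == ';')
      program++; // skip the rule separator
  }

  int state = 0, head = 0, steps = 0;
  int tapeLen = (int) strlen(tape);

  while (steps < maxSteps)
  {
    const Rule *match = NULL;

    for (int i = 0; i < ruleCount; ++i)
      if (rules[i].from == state && rules[i].in == tape[head])
      {
        match = &rules[i];
        break;
      }

    if (match == NULL)
      return steps; // no rule defined for this situation: halt

    state = match->to;
    tape[head] = match->out;
    steps++;

    if (match->shift == 'L')
    {
      if (head == 0)
        return steps; // leaving the tape to the left: halt
      head--;
    }
    else if (match->shift == 'R')
    {
      if (head + 1 >= tapeLen)
        return steps; // ran out of the finite simulated tape
      head++;
    }
  }

  return -1;
}
```

With the incrementing machine from above encoded as `"0 0 0 0 R;0 1 0 1 R;0 _ 1 _ L;1 0 2 1 L;1 1 1 0 L;2 0 2 0 L;2 1 2 1 L"` and a tape declared as `char tape[] = "0101____";`, the call `runEncoded(program,tape,1000)` leaves `0110____` on the tape. Swapping the string for a different one runs a different machine without touching the interpreter, which is the whole point.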
Non-deterministic Turing machine is a modification of Turing machine which removes the limitation of determinism, i.e. which allows for having multiple different "conflicting" rules defined for the same combination of state and input. During execution such machine can conveniently choose which of these rules to follow, or, imagined differently, we may see the machine as executing all possible computations in parallel and then retroactively leaving in place only the most convenient path (e.g. that which was fastest or the one which finished without getting stuck in an infinite loop). Surprisingly a non-deterministic Turing machine is computationally equivalent to a deterministic Turing machine, though of course a non-deterministic machine may be faster (see especially P vs NP).
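The equivalence can be illustrated by letting a deterministic program simply try out every choice. The following C sketch does this with a small, made up non-deterministic machine that accepts binary strings containing "11" by "guessing" where the pair starts. The names (`NRule`, `accepts`, `acceptsInput`, `ACCEPT`), the tape size and the depth limit are assumptions of this example; also note that it searches depth first with a step limit for brevity, whereas the usual textbook construction searches breadth first so that it can't get stuck in one infinite branch.

```c
#include <assert.h>
#include <string.h>

#define TAPE_CELLS 16 // assumed finite tape, only for this sketch
#define ACCEPT 99     // hypothetical number of the accepting state

typedef struct { int from; char in; int to; char out; int move; } NRule;

// State 0 reading '1' has TWO applicable rules, i.e. a non-deterministic choice.
static const NRule nrules[] =
{
  {0, '0', 0,      '0', +1},
  {0, '1', 0,      '1', +1}, // choice A: keep scanning to the right
  {0, '1', 1,      '1', +1}, // choice B: guess that "11" starts here
  {1, '1', ACCEPT, '1',  0}
};

// Deterministically explores all choices, depth first, for at most `depth`
// steps. `tape` must hold exactly TAPE_CELLS symbols plus the terminator.
// Returns 1 if some sequence of choices reaches ACCEPT, otherwise 0.
int accepts(int state, int head, const char *tape, int depth)
{
  if (state == ACCEPT)
    return 1;

  if (depth == 0)
    return 0;

  for (size_t i = 0; i < sizeof(nrules) / sizeof(nrules[0]); ++i)
  {
    const NRule *r = &nrules[i];

    if (r->from != state || r->in != tape[head])
      continue; // rule not applicable here

    char copy[TAPE_CELLS + 1];
    strcpy(copy,tape); // every branch works on its own copy of the tape
    copy[head] = r->out;

    int next = head + r->move;

    if (next < 0 || next >= TAPE_CELLS)
      continue; // this branch left the tape, it can't accept

    if (accepts(r->to,next,copy,depth - 1))
      return 1;
  }

  return 0; // no choice led to acceptance
}

// Puts `input` on a blank ('_') tape and runs the search from state 0.
int acceptsInput(const char *input)
{
  char tape[TAPE_CELLS + 1];
  size_t len = strlen(input);

  assert(len < TAPE_CELLS); // the input must fit on the sketch's tape

  memset(tape,'_',TAPE_CELLS);
  memcpy(tape,input,len);
  tape[TAPE_CELLS] = '\0';

  return accepts(0,0,tape,2 * TAPE_CELLS);
}
```

For example `acceptsInput("0110")` returns 1 because the branch that takes choice B at the first `1` succeeds, while `acceptsInput("0101")` returns 0 because every branch gets stuck. The price of determinism is time: in general the number of branches grows exponentially with the number of steps.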
Turing machines can be used to define computable formal languages. Let's say we want to define language L (which may be anything such as a programming language) -- we may do it by programming a Turing machine that takes on its input a string (a word) and outputs "yes" if that string belongs to the language, or "no" if it doesn't. This is again useful for the theory of decidability/computability.
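As a small illustration, the following C sketch simulates a machine that decides the language of words made of some number of 0s followed by the same number of 1s (e.g. 01, 0011, 000111, also the empty word). It repeatedly marks one 0 as X and one matching 1 as Y; reaching the accepting state means "yes", halting anywhere else means "no". The state numbering, the marker symbols X and Y and the names `DRule`, `inLanguage` and `ACCEPT` are assumptions chosen for this example.

```c
#include <assert.h>
#include <string.h>

#define TAPE_CELLS 64 // assumed finite tape, only for this sketch
#define ACCEPT (-1)   // hypothetical "yes" state

typedef struct { int from; char in; int to; char out; int move; } DRule;

static const DRule drules[] =
{
  {0, '0', 1,      'X', +1}, // mark the leftmost unmarked 0
  {0, 'Y', 3,      'Y', +1}, // no 0s left: check that only Ys remain
  {0, '_', ACCEPT, '_',  0}, // empty word
  {1, '0', 1,      '0', +1}, // go right, looking for an unmarked 1
  {1, 'Y', 1,      'Y', +1},
  {1, '1', 2,      'Y', -1}, // mark the matching 1
  {2, '0', 2,      '0', -1}, // go back left to the last X
  {2, 'Y', 2,      'Y', -1},
  {2, 'X', 0,      'X', +1},
  {3, 'Y', 3,      'Y', +1},
  {3, '_', ACCEPT, '_',  0}  // everything was matched
};

// Returns 1 ("yes") if `word` consists of n 0s followed by n 1s, else 0 ("no").
int inLanguage(const char *word)
{
  char tape[TAPE_CELLS];
  size_t len = strlen(word);

  assert(len < TAPE_CELLS); // at least one blank must follow the word

  if (strspn(word,"01") != len)
    return 0; // not over the input alphabet {0,1} (checked outside the machine)

  memset(tape,'_',sizeof(tape));
  memcpy(tape,word,len);

  int state = 0, head = 0;

  while (state != ACCEPT)
  {
    const DRule *r = NULL;

    for (size_t i = 0; i < sizeof(drules) / sizeof(drules[0]); ++i)
      if (drules[i].from == state && drules[i].in == tape[head])
      {
        r = &drules[i];
        break;
      }

    if (r == NULL)
      return 0; // halted without accepting: "no"

    tape[head] = r->out;
    state = r->to;
    head += r->move;

    if (head < 0 || head >= TAPE_CELLS)
      return 0; // left the simulated tape
  }

  return 1;
}
```

So `inLanguage("0011")` gives 1 while `inLanguage("001")` and `inLanguage("10")` give 0. Importantly this machine halts on every input, which is what makes it a decider of the language rather than something that may run forever on words outside of it.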
How Can Something So Simple Have The Maximum Computational Power And How Do We Know It Is So?
Well, at first people are usually surprised that Turing machine can "compute anything that any other computer can", but given that we don't care about efficiency at all it can be shown that with a Turing machine we can emulate all that's needed to perform any algorithm: we can have a sequence of instructions (just states transitioning to other states unconditionally), branching (conditional state transitions) and loops (transitioning to previous states) -- that's all we really need. Curiously it turns out that many systems, such as some card games for example, have this property without even having been intended to, i.e. it's not rare for a randomly encountered system to be Turing complete (and it's usually proven to be Turing complete by showing how it can emulate a Turing machine). Even if it would be a challenge to make a big program with Turing machine, we can just start with simple things like basic arithmetic operations, data types, etc. and eventually we reimplement any fancy programming language. If we accept that let's say the C programming language (or any other language we can imagine) can "program anything", we can show (in a long and boring way, but you can try it) that C can really be implemented with a Turing machine and so the machine can compute anything that can be computed with C. All the fancy things like data types, control structures and preprocessors are just sugar made from very simple basics.
There is so called Church–Turing thesis which is basically a claim -- unprovable but taken as truth, or maybe rather a definition -- that anything we would intuitively call an "algorithm" (a finite series of exact, simple steps leading to computing some result) can always be performed by some Turing machine, or maybe it's better to say in reverse: we say that Turing machine DEFINES what an algorithm is and kind of dare anyone to come up with something that would shake this claim, like finding out that something can be computed only in a way that's fundamentally impossible to be recorded as a series of exact steps. In fact Turing machine was created because we didn't have a rigorous, mathematically precise definition of what an algorithm is; before this we could only intuitively talk about the "series of simple steps" without knowing what a "simple" or "step" can really mean, what kind of resources we can use and so on, so Alan Turing created the machine, and indeed when we examine it in detail, the claim that it embodies the definition of an algorithm becomes "obviously true", so people basically agree that algorithm equals a Turing machine and that nothing more computationally powerful can exist, save for maybe a few religious fanatics who say there is something magical in human brain that's more than pure computation and which cannot be imitated by algorithms, but these are absolutely irrational beliefs. Holding on to mathematics we have to accept that any computable function must by definition be computable by performing a number of calculation steps and this can always be done by a Turing machine. If a human with pen and paper can compute it, a Turing machine can too.
How Does It Relate To A Practical Computer?
Quite a lot actually -- Turing machines, unlike some equivalent models of computation, are the basis of practically every electronic digital computer you will encounter; we might even say today's computers are just very pimped up, greatly optimized fancy Turing machines with added input/output devices, also having the practical limitation of only having finite memory. We may see the computer's CPU as the finite state machine -- the control unit -- i.e. the hardcoded basic program that's burned in the hardware; this control unit is actually hardwired to implement the universal Turing machine, i.e. to interpret a program that's stored in memory, which achieves the programmability of the computer. How to effectively encode programs so that they are small and fast to interpret is a matter of designing instruction set architectures. Everything around this, like multiple CPU cores, GPUs, caches, buses etc. are just things to make it all faster and more effective.
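The "fixed control unit interpreting a program stored in memory" idea can be sketched in C like this: one array serves as memory for both the program and its data and a small fixed loop (standing in for the hardwired CPU) fetches, decodes and executes instructions from it. The four instruction toy set (`OP_HALT`, `OP_INC`, `OP_DEC`, `OP_JNZ`), the three byte instruction format and the name `runStored` are invented for this example and don't correspond to any real CPU.

```c
#include <assert.h>

#define MEM_SIZE 32 // tiny memory shared by program and data

// Hypothetical instruction set, each instruction takes 3 bytes: op, a, t.
enum
{
  OP_HALT, // stop
  OP_INC,  // mem[a]++
  OP_DEC,  // mem[a]--
  OP_JNZ   // if mem[a] != 0, continue at address t
};

// The fixed "control unit": runs the program stored in mem from address 0.
// Returns the number of executed instructions, or -1 if maxSteps was exceeded.
int runStored(unsigned char mem[MEM_SIZE], int maxSteps)
{
  int pc = 0; // program counter, address of the next instruction

  for (int steps = 1; steps <= maxSteps; ++steps)
  {
    assert(pc + 2 < MEM_SIZE); // the whole instruction must lie in memory

    unsigned char op = mem[pc];                // fetch
    unsigned char a = mem[pc + 1] % MEM_SIZE;  // operands are wrapped so
    unsigned char t = mem[pc + 2] % MEM_SIZE;  // they always stay in memory

    pc += 3;

    switch (op)                                // decode and execute
    {
      case OP_INC: mem[a]++; break;
      case OP_DEC: mem[a]--; break;
      case OP_JNZ: if (mem[a] != 0) pc = t; break;
      default: return steps; // OP_HALT (or an unknown instruction)
    }
  }

  return -1;
}
```

For instance the stored program `OP_JNZ,20,6, OP_HALT,0,0, OP_DEC,20,0, OP_INC,21,0, OP_JNZ,20,6, OP_HALT,0,0` adds the number in cell 20 to the number in cell 21: with 3 and 4 in those cells the run ends with 0 and 7. Changing what the computer does means changing bytes in memory, not the loop, just like a universal Turing machine is reprogrammed by changing its tape.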