less_retarded_wiki/brainfuck.md
2024-09-22 01:02:44 +02:00


Brainfuck

Brainfuck is an extremely simple, minimalist, untyped esoteric programming language; simple by its specification (consisting of only 8 commands) but very hard to program in (it is a so-called Turing tarpit). It works similarly to a pure Turing machine. In a way it is kind of beautiful in its simplicity, and it is very easy to write your own brainfuck interpreter (or compiler) -- in fact the Brainfuck author's goal was to make a language for which the smallest compiler could be made.

There exist self-hosted Brainfuck interpreters and compilers (i.e. themselves written in Brainfuck) which is pretty fucked up. The smallest one is probably the one called dbfi which has only slightly above 400 characters, that's incredible!!! (Esolang wiki states that it's one of the smallest self interpreters among imperative languages). Of course, Brainfuck quines (programs printing their own source code) also exist, but it's not easy to make them -- one example found on the web was a little over 2100 characters long.

The language is based on a 1964 language P´´ which was published in a mathematical paper; it is very similar to Brainfuck except for having no I/O. Brainfuck itself was made in 1993 by Urban Müller, who wrote a compiler for it for Amiga and eventually managed to get it under 200 bytes.

Since then Brainfuck has seen tremendous success in the esolang community as the lowest common denominator language: just as mathematicians use Turing machines in proofs, esolang programmers use brainfuck in similar ways -- many esolangs just compile to brainfuck or use brainfuck in proofs of Turing completeness etc. This is thanks to Brainfuck being an actual, implemented and working language with I/O and working on real computers, not just some abstract mathematical model. For example if one wants to encode a program as an integer number, one can simply take the binary representation of the program's Brainfuck implementation. Brainfuck also has many derivatives and modifications (esolang wiki currently lists over 600 such languages), e.g. Brainfork (Brainfuck with multithreading), Boolfuck (has only binary cells), Brainfuck++ (adds more features like networking), Pi (encodes a Brainfuck program in errors against pi digits), Unary (encodes Brainfuck with a single symbol) etc.

In LRS programs brainfuck may be seriously used as a super simple scripting language.

Brainfuck can be trivially translated to comun like this: remove all comments from brainfuck program, then replace +, -, >, <, ., ,, [ and ] with ++ , -- , $>0 , $<0 , ->' , $<0 <- , @' and . , respectively, and prepend $>0 .

Specification

The "vanilla" brainfuck operates as follows:

We have a linear memory of cells and a data pointer which initially points to the 0th cell. The size and count of the cells are implementation-defined, but usually a cell is 8 bits wide and there are at least 30000 cells.

A program consists of these possible commands:

  • +: increment the data cell under data pointer
  • -: decrement the data cell under data pointer
  • >: move the data pointer to the right
  • <: move the data pointer to the left
  • [: jump after corresponding ] if value under data pointer is zero
  • ]: jump after corresponding [ if value under data pointer is not zero
  • .: output value under data pointer as an ASCII character
  • ,: read value and store it to the cell under data pointer

Characters in the source code that don't correspond to any command are normally ignored, so they can conveniently be used for comments.

Brainfuck source code files usually have .bf or .b extension.

Implementation

This is a very simple C implementation of a Brainfuck interpreter:

#include <stdio.h>

const char program[] = ",[.-]"; // your program here

#define CELLS 30000
char tape[CELLS];

int main(void)
{
  unsigned int cell = 0;
  const char *i = program;
  int bDir, bCount;

  while (*i != 0)
  {
    switch (*i)
    {
      case '>': cell++; break;
      case '<': cell--; break;
      case '+': tape[cell]++; break;
      case '-': tape[cell]--; break;
      case '.': putchar(tape[cell]); fflush(stdout); break;
      case ',': scanf("%c",tape + cell); break;
      case '[':
      case ']':
        if ((tape[cell] == 0) == (*i == ']'))
          break;

        bDir = (*i == '[') ? 1 : -1;
        bCount = 0;

        while (1)
        {
          if (*i == '[')
            bCount += bDir;
          else if (*i == ']')
            bCount -= bDir;

          if (bCount == 0)
            break;

          i += bDir;
        }

        break;

      default: break;
    }

    i++;
  }

  return 0;
}

TODO: comun implementation

Advanced Brainfuck implementations may include optimizations, for example things like >>><<> may be reduced to >> etc.

And here is a Brainfuck to C transpiler, written in C, which EVEN does a simple optimization of this kind: it groups runs of identical additions, subtractions and shifts into single statements (it doesn't cancel opposite commands such as >< though). It will allow you to compile Brainfuck to native executables. The code is possibly even simpler than the interpreter:

#include <stdio.h>

int main(void)
{
  int c = 0, cNext = 0; // initialized so that the first NEXT reads no garbage

  puts("#include <stdio.h>\nunsigned char m[1024];\n"
       "unsigned char *c = m;\nint main(void) {");

#define NEXT { c = cNext; cNext = getchar(); }

  NEXT NEXT

  while (c != EOF)
  {
    switch (c)
    {
      case '>': case '<': case '+': case '-':
      {
        unsigned int n = 1;

        while (cNext == c)
        {
          NEXT
          n++;
        }

        printf("  %s %c= %u;\n",(c == '<' || c == '>') ? "c" : "*c",
          (c == '>' || c == '+') ? '+' : '-',n);

        break;
      }

      case '.': puts("  putchar(*c);"); break;
      case ',': puts("  *c = getchar();"); break;
      case '[': puts("  while (*c) {"); break;
      case ']': puts("  }"); break;
      default: break;
    }

    NEXT
  }

  puts("return 0; }");
  return 0;
}

Programs

Here are some simple programs in brainfuck.

Print HI:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ . + .

Read two 0-9 numbers (as ASCII digits) and add them (the sum must be at most 9 to be printed as a single digit):

,>,[<+>-]<------------------------------------------------.

Variants

Brainfuck became an inspiration to a plethora of derivative languages (esolang wiki currently lists over 700), many of which pay homage to their ancestor by including the word fuck in the name. Oftentimes we see extensions adding new features or languages that just translate to Brainfuck, i.e. are defined in terms of Brainfuck. Some notable Brainfuck derivatives include: Fuck (only has a single memory cell), Brainfork (adds multithreading with new command Y), Unary (program source code only uses one character, compiled to BF), Mierda (just replaces the commands with Spanish words), Brainfuck++, Agony etc.

Making Brainfuck Usable: Defining Macrofucker

{ There probably exist BF derivatives in this spirit, it's very natural, I didn't bother checking too much, here I just want to derive this from scratch myself, for educational purposes. ~drummyfish }

What if we want to actually write a more complex program in Brainfuck? How do we tame the beast and get out of the Turing tarpit? We may build a metalanguage on top of Brainfuck that will offer more convenient constructs and will compile to Brainfuck, and maybe we'll learn something about building and bootstrapping computing environments along the way :) We may do this e.g. with a simple system of preprocessing macros, i.e. we will create a language with more advanced commands that will be replaced by plain Brainfuck commands -- on the level of source code -- before it gets executed. This turns out to be a quite effective approach that enables us to create sort of a Forth-like language in which we may program quite complex things with the stack-based computing paradigm.

Hmmm okay, what name do we give the language? Let's call it Macrofucker. It will work like this:

  • Vanilla Brainfuck commands work normally, they'll be simply copied.
  • Additionally we introduce macros. A macro will be defined as: :M<commands>;. : and ; are simply keywords delimiting the macro definition, M is the macro name, which we'll for simplicity's sake limit to single uppercase letters only (so we won't be able to make more macros than there are letters), and <commands> are just commands that will be copy-pasted wherever the macro is used.
  • A macro will be used by simply writing its name, i.e. if we have macro M defined (anywhere in the source code), we can use it by simply writing M. Optionally we may call it with a numeric parameter as MX, where X is a decimal number. If no parameter is given, we consider it 0. A macro may be invoked even inside another macro.
  • Inside a macro definition we may use the symbol $ that will make the next character be repeated the macro's argument number of times -- i.e. if the macro was called with, let's say, argument 3, then $> will output >>>. This symbol can also be used in the same sense in front of a macro invocation.

For example consider the following piece of code:

:X[-]$+; >X10 >X11 >X12 >X13

We first define a macro called X that serves for storing constants in cells. The macro first zeroes the cell ([-]) and then repeats the character + the argument number of times. Then we use the macro 4 times, with constants 10, 11, 12 and 13. We also shift right before each macro invocation so it's as if we're pushing the constants on the stack. This code will compile to:

>[-]++++++++++>[-]+++++++++++>[-]++++++++++++>[-]+++++++++++++

If we examine and run the code, we indeed find that we end up with the values 10, 11, 12 and 13 on the tape:

0 10 11 12 13
           ^

Implementing the preprocessor is about as simple as implementing Brainfuck itself: pretty easy. As soon as we have the preprocessor, we may start implementing a "library" of macros, i.e. we may expand Brainfuck by adding quite powerful commands -- the beauty of it is we'll be expanding the language in Macrofucker itself from now on, no more C code is required beyond writing the simple preprocessor. This is a very cool, minimalist approach of building complex things by adding simple but powerful extensions to very simple things, the kind of incremental programming approach that's masterfully applied in languages such as Forth and Lisp.

So here it is, the Macrofucker preprocessor in C, along with embedded code of the program it processes -- here we include a simple library that even includes things such as division, modulus and printing and reading decimal values:

#include <stdio.h>

const char program[] =
  // the library (WARNING: cells to the right may be changed):
  ":Z[-];"                                       // zero: c[0] = 0
  ":L$<;"                                        // left: c -= N
  ":R$>;"                                        // right: c += N
  ":I$+;"                                        // inc: c[0] += N
  ":D$-;"                                        // dec: c[0] -= N
  ":XZ$+;"                                       // const: c[0] = N
  ":N>Z+<[Z>-<]>[<$++>Z]<;"                      // not: c[0] = c[0] == 0 ? N + 1 : 0
  ":CZ>Z<<$<[-$>>+>+<$<<]$>>>[-<$<<+>>$>]<;"     // copy: c[0] = c[-(N + 1)]
  ":M>C<Z>[-<$-->]<;"                            // minus: c[0] *= -(N + 1)
  ":F>Z<[->+<]<$<[->$>+<$<]$>>>[-<<$<+>>$>]<;"   // flip: SWAP(c[0],c[-(N + 1)])
  ":A>C1[-<+>]<;"                                // add: c[0] += c[-1]
  ":S>C1[-<->]<;"                                // subtract: c[0] -= c[-1]
  ":T>C1>C1>Z<<-[->>A<<]>>[-L3+R3]L3;"           // times: c[0] *= c[-1]
  ":EC1>C1[-<->]<N;"                             // equals: c[0] = c[-2] == c[-1] ? 1 : 0
  ":GZ>C2>C2+<[->->CN[L3+R3Z]<<]<;"              // greater: c[0] = c[-2] > c[-1] ? 1 : 0
  ":B>C1>C1<<Z>>>GN[L3+>>S>GN]<F<;"              // by: c[1] = c[0] % c[-1]; c[0] = c[0] / c[-1]; c++
  ":P>X100>C1BF>X48A.L3X10>BF>X48A.<F>X48A.L4;"  // print: print byte as decimal
  ":VX48>,SFX100T>X48>,SFX10TF<A>X48>,SF<AF2L3;" // value: reads decimal number of three digits
  // the main program itself:
  "Z>V>C[>C1BN[L4+R4Z]<<-]<<P>X10.X2>E[X112.X114.X105.X109.X101.X10.Z]"
;

void process(const char *c, int topLevel)
{
  char f = *c;        // macro name to search
  unsigned int n = 0; // macro argument

  if (!topLevel)      // read the argument
  {
    c++;

    while (*c >= '0' && *c <= '9')
    {
      n = 10 * n + *c - '0';
      c++;
    }
  }

#define IS_MACRO(x) ((x) >= 'A' && (x) <= 'Z')
  c = program;

  while (*c)                     // search for the macro definition
  {
    if (topLevel || (c[0] == ':' && c[1] == f))
    {
      c += topLevel ? 0 : 2;     // skip the beginning macro chars

      while (*c && *c != ';')
      {
        if (*c == ':')
          while ((*++c) != ';'); // skip macro definitions
        else if (*c == '+' || *c == '-' || *c == '<' || *c == '>' ||
          *c == '[' || *c == ']' || *c == '.' || *c == ',')
          putchar(*c);           // normal BF commands
        else if (IS_MACRO(*c))
          process(c,0);          // macro
        else if (*c == '$')
        {
          c++;
          for (unsigned int i = 0; i < n; ++i)
          {
            if (IS_MACRO(*c))
              process(c,0);      // repeated macro invocation
            else
              putchar(*c);       // repeated plain character
          }
        }

        c++;
      }

      return;
    }

    c++;
  }
}

int main(void)
{
  process(program,1);
  putchar(0);    // allows separating program on stdin from program input
  //puts("013"); // program input may go here
  return 0;
}

The main program we have here is the example program from the algorithm article: it reads a number, prints the number of its divisors and says if the number is prime. Code of the Brainfuck program will be simply printed out on standard output and it can then be run using our Brainfuck interpreter above. Unlike "hello world" this is already a pretty cool problem we've solved with Brainfuck, and we didn't even need that much code to make it happen. Improving this further could allow us to make a completely usable (though, truth be told, probably slow) language. Isn't this just beautiful? Yes, it is :)

So just for completeness, here is a Macrofucker program that prints out the first 10 Fibonacci numbers:

:Z[-];                                        zero the cell
:L$<;                                         go left by n
:R$>;                                         go right by n
:XZ$+;                                        store constant n
:N>Z+<[Z>-<]>[<$++>Z]<;                       not
:CZ>Z<<$<[-$>>+>+<$<<]$>>>[-<$<<+>>$>]<;      copy
:F>Z<[->+<]<$<[->$>+<$<]$>>>[-<<$<+>>$>]<;    flip
:A>C1[-<+>]<;                                 add
:S>C1[-<->]<;                                 subtract
:GZ>C2>C2+<[->->CN[L3+R3Z]<<]<;               greater
:B>C1>C1<<Z>>>GN[L3+>>S>GN]<F<;               divide
:P>X100>C1BF>X48A.L3X10>BF>X48A.<F>X48A.L4;   print

main program

>X10   loop counter
>X0    first number
>X1    second number
<<

[-             loop
  R3
  C1 A         copy and add
  P > X10 .    print number and newline
  < F < F <<   go back and shift numbers
]

which translates to:

>[-]++++++++++>[-]>[-]+<<[->>>[-]>[-]<<<[->>+>+<<<]>>>[-<<<
+>>>]<>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<[-<+>]<>[-]++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++>[-]>[-]<<<[->>+>+<<<]>>
>[-<<<+>>>]<>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<>[-]>[-]<<<
[->>+>+<<<]>>>[-<<<+>>>]<<<[-]>>>[-]>[-]>[-]<<<<[->>>+>+<<<
<]>>>>[-<<<<+>>>>]<>[-]>[-]<<<<[->>>+>+<<<<]>>>>[-<<<<+>>>>
]<+<[->->[-]>[-]<<[->+>+<<]>>[-<<+>>]<>[-]+<[[-]>-<]>[<+>[-
]]<[<<<+>>>[-]]<<]<>[-]+<[[-]>-<]>[<+>[-]]<[<<<+>>>[-]>[-]<
<<[->>+>+<<<]>>>[-<<<+>>>]<[-<->]<>[-]>[-]>[-]<<<<[->>>+>+<
<<<]>>>>[-<<<<+>>>>]<>[-]>[-]<<<<[->>>+>+<<<<]>>>>[-<<<<+>>
>>]<+<[->->[-]>[-]<<[->+>+<<]>>[-<<+>>]<>[-]+<[[-]>-<]>[<+>
[-]]<[<<<+>>>[-]]<<]<>[-]+<[[-]>-<]>[<+>[-]]<]<>[-]<[->+<]<
[->+<]>>[-<<+>>]<<>[-]<[->+<]<[->+<]>>[-<<+>>]<>[-]++++++++
++++++++++++++++++++++++++++++++++++++++>[-]>[-]<<<[->>+>+<
<<]>>>[-<<<+>>>]<[-<+>]<.<<<[-]++++++++++>>[-]>[-]<<<[->>+>
+<<<]>>>[-<<<+>>>]<>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<<<[-
]>>>[-]>[-]>[-]<<<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<>[-]>[-]<<
<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<+<[->->[-]>[-]<<[->+>+<<]>>
[-<<+>>]<>[-]+<[[-]>-<]>[<+>[-]]<[<<<+>>>[-]]<<]<>[-]+<[[-]
>-<]>[<+>[-]]<[<<<+>>>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<[-
<->]<>[-]>[-]>[-]<<<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<>[-]>[-]
<<<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<+<[->->[-]>[-]<<[->+>+<<]
>>[-<<+>>]<>[-]+<[[-]>-<]>[<+>[-]]<[<<<+>>>[-]]<<]<>[-]+<[[
-]>-<]>[<+>[-]]<]<>[-]<[->+<]<[->+<]>>[-<<+>>]<<>[-]<[->+<]
<[->+<]>>[-<<+>>]<>[-]+++++++++++++++++++++++++++++++++++++
+++++++++++>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<[-<+>]<.<>[-
]<[->+<]<[->+<]>>[-<<+>>]<>[-]+++++++++++++++++++++++++++++
+++++++++++++++++++>[-]>[-]<<<[->>+>+<<<]>>>[-<<<+>>>]<[-<+
>]<.<<<<>[-]++++++++++.<>[-]<[->+<]<[->+<]>>[-<<+>>]<<>[-]<
[->+<]<[->+<]>>[-<<+>>]<<<]

which outputs:

001
001
002
003
005
008
013
021
034
055

See Also