less_retarded_wiki/forth.md
2024-08-06 22:19:44 +02:00

Forth

{ I'm a bit ashamed but I'm not really "fluent" at Forth, I just played around with it for a bit. Yes, I'm planning to get into it more after I do the other million things on my TODO list. Let me know if there is some BS, thank u <3 ~drummyfish }

Forth ("fourth generation" shortened to five characters due to technical limitations) is a very good, extremely minimal stack-based untyped programming language that uses postfix (reverse Polish) notation. Its vanilla form is super simple, miles simpler than C, it's very elegant and its compiler/interpreter can be made very easily, giving it high practical freedom (i.e. not being practically controlled by any central organization). As of writing this the smallest Forth implementation, milliforth, has just 340 bytes (!!!) of machine code, that's just incredible. Forth is used e.g. in space technology (e.g. RTX2010, a radiation hardened space processor directly executing Forth in hardware) and embedded systems as a way to write efficient low level programs that are, unlike those written in assembly, portable. Forth was the main influence for Comun, the LRS programming language, and it is also used by Collapse OS and Dusk OS as the main language. In its minimalism Forth competes a bit with Lisp.

{ There used to be a nice Forth wiki at wiki.forthfreak.net, now it has to be accessed via archive as it's dead. Also some nice site here https://www.forth.org/compilers.html. ~drummyfish }

{ There is also some discussion about how low level Forth really is, if it really is a language or something like a "metalanguage", or an "environment" to create your own language by defining your own words. Now this is not a place to go very deep on this but kind of a sum up may be this: Forth in its base version is very low level, however it's very extensible and many extend it to some kind of much higher level language, hence the debates. ~drummyfish }

It is usually presented as an interpreted language but may as well be compiled, in fact it maps pretty nicely to assembly. Even if interpreted, it can still be very fast. Forth systems traditionally include not just a compiler/interpreter but also an interactive environment, a kind of REPL shell.

There are several Forth standards, most notably ANS Forth from 1994 (the document is proprietary, sharing is allowed, 640 kB as txt). Besides others it also allows Forth to include optional floating point support, however Forth programmers highly prefer fixed point (as stated in the book Starting Forth). Then there is a newer Forth 2012 standard, but it's probably better to stick to the older one.
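To give a rough idea of what fixed point looks like in practice, here is a minimal sketch that assumes a scale factor of 100 (i.e. 1.00 is stored as 100; the factor is chosen just for this illustration). It relies on the standard word */ which multiplies two numbers and divides by a third, keeping a double-width intermediate result:

```forth
\ fixed point with an assumed scale of 100: 1.50 is stored as 150
150 250 100 */ .   \ 1.50 * 2.50, prints 375 (i.e. 3.75)
```

Only integer operations are used here, which is why this approach also works on hardware without a floating point unit.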

A free implementation is e.g. GNU Forth (gforth) or pforth (a possibly better option by LRS standards, favors portability over performance).

There is a book called Starting Forth that's freely downloadable and quite good at teaching the language.

Forth was invented by Charles Moore (NOT the one of the Moore's Law though) in 1968, for programming radio telescopes.

Language

Forth is case-insensitive (this may however not be the case in some implementations).

The language operates on an evaluation stack: e.g. the operation + takes the two values at the top of the stack, adds them together and pushes the result back on the stack (i.e. for example 1 2 + in Forth is the same as 1 + 2 in C). Besides this there are also some "advanced" features like variables living outside the stack, if you want to use them.
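A small sketch of how the stack evolves during evaluation may help here. The numbers are arbitrary, each comment shows the stack after the given step (top is on the right):

```forth
2        \ stack: 2
3        \ stack: 2 3
+        \ stack: 5
4        \ stack: 5 4
*        \ stack: 20
.        \ prints 20, the value is popped
```

In C this would be written as (2 + 3) * 4; note that postfix notation needs no brackets.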

In fact there are two stacks in Forth: the parameter stack (also data stack) and the return stack. The parameter stack is the "normal" stack on which we do most computations and on which we pass parameters and return values. The return stack is the stack on which return addresses are stored, BUT it is also used as a temporary stack so that we can let's say put aside a few values to dive deeper on the main stack, however this has to be done carefully -- before the end of a word ("function") is reached, the return stack must be restored to the original state of course.
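The following is a minimal sketch of using the return stack as temporary storage. The word name add-to-second is made up for this illustration; it adds the top value to the item that lies two positions below it, and restores the return stack before it ends:

```forth
\ add-to-second is a hypothetical name, not a built-in word
: add-to-second ( a b n -- a+n b )
  swap       \ stack: a n b
  >r         \ put b aside on the return stack, stack: a n
  +          \ stack: a+n
  r>         \ bring b back, return stack is restored
;

10 20 5 add-to-second . .   \ prints 20 15
```

The same effect could be had with rot + swap, the return stack just shows the general pattern of putting a value aside.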

The stack is composed of cells: the size of the cell is implementation defined. The values stored in cells are just binary, they don't have any data type, so whether a value in a given cell is considered signed or unsigned is up to the programmer -- some operators treat numbers as signed and some as unsigned (just like in comun); note that with many operators the distinction doesn't matter (e.g. addition doesn't care if the numbers are signed or not, but comparison does).
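A short sketch of the signed vs unsigned distinction, assuming two's complement representation (so that -1 is a cell with all bits set). It uses u<, the standard unsigned counterpart of <, which is not in the list of words below:

```forth
\ the same bit pattern (all ones) interpreted in two ways
-1 .         \ prints -1 (signed interpretation)
-1 u.        \ prints the largest unsigned value a cell can hold
-1 1 <  .    \ prints -1 (true): as signed, -1 is less than 1
-1 1 u< .    \ prints 0 (false): as unsigned, all ones is greater than 1
-1 1 +  .    \ prints 0: addition gives the same bits either way
```

What u. prints depends on the cell size of the given implementation.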

The basic abstraction of Forth is the so called word: a word is simply a string without spaces like abc or 1mm#3. A word simply represents some operations, which may include running native code, pushing numbers on the stack or calling other words, for example the word + performs the addition on top of the stack, dup duplicates the top of the stack etc. The programmer can define his own words -- so words are basically kind of "functions" or rather procedures (however words don't return anything or take any arguments in the traditional way, they all just invoke some operations -- arguments and return values are passed using the stack). Defining new words expands the current dictionary, so Forth basically extends itself as it's running. A word is defined like this:

: myword operation1 operation2 ... ;

For example a word that computes an average of the two values on top of the stack can be defined as:

: average + 2 / ;

Forth programmers use so called stack notation to document the function's "signature", i.e. what it does with the stack -- they write this notation in a comment above a defined word to signify to others what the word will do. Stack notation has the format ( before -- after ), for example the effect of the above defined average word would be written as ( a b -- avg ) in this notation.
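Here is a small sketch of words documented with stack notation, also showing how new words build on previously defined ones. The names square and sumsquares are made up for this illustration; placing the comment right after the word's name, as done here, is another common convention:

```forth
\ square and sumsquares are hypothetical names, not built-in words
: square ( n -- n*n )
  dup * ;

: sumsquares ( a b -- a*a+b*b )
  square swap square + ;

3 4 sumsquares .   \ prints 25
```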

Some built-in words include:

GENERAL:

+           add                     ( a b -- [a+b] )
-           subtract                ( a b -- [a-b] )
*           multiply                ( a b -- [a*b] )
/           divide                  ( a b -- [a/b] )
=           equals                  ( a b -- [-1 if a = b else 0] )
<>          not equals              ( a b -- [-1 if a != b else 0] )
<           less than               ( a b -- [-1 if a < b else 0] )
>           greater than            ( a b -- [-1 if a > b else 0] )
mod         modulo                  ( a b -- [a % b] )
dup         duplicate                 ( a -- a a )
drop        pop stack top             ( a -- )
swap        swap items              ( a b -- b a )
rot         rotate 3              ( a b c -- b c a )
pick        push Nth item   ( xN ... x0 N -- xN ... x0 xN )
.           pop & print number as signed
u.          pop & print number as unsigned
key         read char on top
.s          print stack
emit        pop & print top as char
cr          print newline
cells       times cell width          ( a -- [a * cell width in bytes] )
depth       push stack size           ( -- [stack size before the push] )
quit        return to interpreter (clears return stack, doesn't print "ok")
bye         exit Forth

RETURN STACK:

>r          pops value, pushes it to return stack
r>          pops value from return stack, pushes it
r@          pushes value from return stack (doesn't pop it)
i           pushes value from return stack (without pop)
i'          pushes second value from return stack (without pop)
j           pushes third value from return stack (without pop)

VARIABLES/CONSTS:

variable X      creates var named X (X is a word that pushes its addr)
N X !           stores value N to variable X
N X +!          adds value N to variable X
X @             pushes value of variable X to stack
N constant C    creates constant C with value N
C               pushes the value of constant C

SPECIAL:

( )                       comment (inline)
\                         comment (until newline)
." S"                     print string S (compiles in the string)
X if C then               if X, execute C (only in word def., X is popped)
X if C1 else C2 then      if X, execute C1 else C2 (only in word def.)
do C loop                 loops from stack top value up to (not including)
                          the value second from top, special word "i" will
                          hold the iteration val.
begin C until             repeats C until the value popped from top is != 0
begin C again             infinite loop
begin C1 while C2 repeat  loop with middle condition: after C1 the top is
                          popped, if it is 0 the loop ends, otherwise C2
                          is executed and the loop starts over
leave                     loop break (only for counted loops)
allot                     allocates memory, can be used for arrays
recurse                   recursively call the word currently being defined
see W                     shows the definition of word W

example programs:

100 1 2 + 7 * / . \ computes and prints 100 / ((1 + 2) * 7)
cr ." hey bitch " cr \ prints: hey bitch
: myloop 5 0 do i . loop ; myloop \ prints 0 1 2 3 4
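One more sketch that combines variables, constants and loops from the list above. All names here (total, upto, sumto, countdown) are made up for this illustration:

```forth
variable total            \ hypothetical variable
10 constant upto          \ hypothetical constant

\ sums the numbers 1 to upto using the variable total
: sumto ( -- sum )
  0 total !               \ total = 0
  upto 1 + 1 do           \ i goes from 1 to upto (end value is excluded)
    i total +!            \ total += i
  loop
  total @ ;

\ prints n, n - 1, ... down to 1
: countdown ( n -- )
  begin
    dup 0 >               \ condition: n > 0?
  while
    dup . 1 -             \ print n, decrement it
  repeat
  drop ;

sumto .                   \ prints 55
3 countdown               \ prints 3 2 1
```

Note that the loops live inside word definitions, just like if.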

TODO: local variables, addresses

Examples

Here is our standardized divisor tree program written in Forth:

\ takes x, pops it and recursively prints its divisor tree
: printdivisortree
  dup 3 <= if
    0 swap 1 swap          \ stack now: 0 1 x
  else
    >r 0 1 r>              \ stack now: a b x

    dup 2 / 1 + 2 do       \ find the closest divisors (a, b)
      dup i mod 0 = if     \ i divides x?
        2 pick 2 pick < if \ a < b?
          i
          swap
          >r               \ use return stack for tmp storage
          swap drop
          swap drop
          dup r@ swap /
          r>
        then
      then
    loop
  then

  ." ( "

  2 pick 0 <> if           \ divisors found?
    2 pick recurse
    dup .
    1 pick recurse
  else
    dup .
  then

  ." ) "

  drop drop drop
;

: digittonum
  dup dup 48 >= swap 57 <= and if
    48 -
  else
    drop -1
  then
;

: main
  begin \ main loop, read numbers from user
    0
    begin
      key

      dup 13 <> while \ newline?

      digittonum

      dup -1 = if
        bye
      then

      swap 10 * +
    repeat

    drop \ key

    dup 1000 < if
      printdivisortree cr
    else
      bye
    then
  again
;

main
bye

How To

Source code files usually have the .fs extension. We can use the mentioned gforth to run our files. Let's create a file my.fs; in it we write: { Hope the code is OK, I never actually programmed in Forth before. ~drummyfish }

: factorial
  dup 1 > if
    dup 1 - recurse *
  else
    drop 1
  then
;

5 factorial .

bye

We can run this simply with gforth my.fs, the program should write 120.

See Also