Miloslav Ciz 2024-08-08 22:37:16 +02:00
parent 8dbbd1acb0
commit 367af6e637
22 changed files with 1863 additions and 1829 deletions


@ -2,13 +2,17 @@
{ I'm a bit ashamed but I'm not really "fluent" at Forth, I just played around with it for a bit. Yes, I'm planning to get into it more after I do the other million things on my TODO list. Let me know if there is some BS, thank u <3 ~drummyfish }
Forth ("fourth generation" shortened to five characters due to technical limitations) is a very good, extremely [minimal](minimalism.md) [stack](stack.md)-based untyped [programming language](programming_language.md) that uses [postfix](notation.md) (reverse Polish) notation. Its vanilla form is super simple, it's miles simpler than [C](c.md), it's very [elegant](elegant.md) and its compiler/interpreter can be made very easily, giving it high practical freedom (i.e. not being practically controlled by any central organization). As of writing this the smallest Forth implementation, [milliforth](milliforth.md), has just **340 bytes** (!!!) of [machine code](machine_code.md), that's just incredible. Forth is used e.g. in [space](space.md) technology (e.g. [RTX2010](rtx2010.md), a radiation hardened space computer directly executing Forth) and [embedded](embedded.md) systems as a way to write efficient [low level](low_level.md) programs that are, unlike those written in [assembly](assembly.md), [portable](portability.md) (fun fact: there even exist computers directly running Forth in hardware). Forth was the main influence for [Comun](comun.md), the [LRS](lrs.md) programming language, it is also used by [Collapse OS](collapseos.md) and [Dusk OS](duskos.md) as the main language. In its minimalism Forth competes a bit with [Lisp](lisp.md).
Forth ("fourth generation" shortened to five characters due to technical limitations) is a very [elegant](beauty.md), extremely [minimal](minimalism.md) [stack](stack.md)-based [programming language](programming_language.md) that uses [postfix](notation.md) (reverse Polish) notation -- it is one of the very best programming languages ever conceived. Its vanilla form is super simple, it's much simpler than [C](c.md), it is cleverly designed and its compiler/interpreter can be made easily, giving it high practical freedom (i.e. not being practically controlled by any central organization). As of writing this the smallest Forth implementation, [milliforth](milliforth.md), has just **340 bytes** (!!!) of [machine code](machine_code.md), that's just incredible. Forth is used e.g. in [space](space.md) technology (e.g. [RTX2010](rtx2010.md), a radiation hardened space computer directly executing Forth) and [embedded](embedded.md) systems as a way to write efficient [low level](low_level.md) programs that are, unlike those written in [assembly](assembly.md), [portable](portability.md) (fun fact: there even exist computers directly running Forth in hardware). Forth was the main influence for [Comun](comun.md), the [LRS](lrs.md) programming language, it is also used by [Collapse OS](collapseos.md) and [Dusk OS](duskos.md) as the main language. In its minimalism Forth competes a bit with [Lisp](lisp.md).
**Forth is magical and may be the greatest thing yet conceived in computing**, it really looks like the pinnacle of programming. While in the world of "normal" programming languages you have to make tradeoffs, such as sacrificing performance for flexibility, Forth beats basically all traditional languages at EVERYTHING at once: [simplicity](minimalism.md), [beauty](beauty.md), memory compactness, flexibility, performance and [portability](portability.md). It is also more than just a programming language, it is just a system for computing and can serve for example as a [text editor](text_editor.md) or even an [operating system](os.md) (that is why e.g. DuskOS is written in Forth -- it is not as much written in Forth as it actually IS Forth). Of course you may ask: if it's so great, why isn't it used very much? Someone somewhere once summed it up like this: Forth gives one extreme freedom and this allows [retards](soydev.md) to make bad design and fuck things up -- [capitalism](capitalism.md) needs languages for monkeys, that's why [bad languages](rust.md) are widely used. And remember: popularity has never been a measure of quality -- the best art will never be mainstream, it can only be understood by a few.
Forth is a bit unique in its philosophy, it can really be hardly compared to traditional languages such as [C++](cpp.md) or [Java](java.md) -- while the "typical language" is always basically the same thing for the programmer (no matter the implementation) and provides a few predefined, highly complex, universal, hardcoded constructs that are simply there and cannot be changed (such as an [OOP](oop.md) system, templates, control structures, ...), **Forth adopts [Unix philosophy](unix_philosophy.md)** by defining just the concept of a word and maybe providing a few simple words and letting the programmer extend the language (that is even the compiler/interpreter itself) by defining new words out of the simpler ones, and this includes even things such as control structures (branches, loops, ...) for example. For instance: in traditional languages you have a few predefined formats in which you may write numbers, e.g. in C you may use decimal numbers as `123` or hexadecimal numbers as `0x7b`; in Forth you may change the base at any time to any value by assigning to the `base` variable, which will change how Forth parses and outputs numbers (while a number is considered any word that's not been found in dictionary). Almost everything in Forth can be modified this way, so pure Forth without any words is not much more than a description of a format of how words will be represented and handled on a very basic level -- something on the level of simplicity of let's say [lambda calculus](lambda_calculus.md) -- and only a *Forth system* of basic words (such as that defined by ANS Forth standard) provides a basic "practically usable" language. The point is this can still be extended yet further, without end or limitations.
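To illustrate the point about number bases, here is a minimal sketch using only the standard words `base`, `hex` and `decimal`; the printed results assume a common system such as gforth, which outputs hexadecimal digits in upper case:

```
decimal                  \ make sure we start in base 10
123 hex . decimal        \ prints 7B (123 shown in base 16)
hex 7B decimal .         \ prints 123 (7B was parsed in base 16)
2 base ! 1010 decimal .  \ prints 10 (1010 was parsed in base 2)
```

Note that `base` affects both parsing and printing, so each line switches back to `decimal` before going on.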
{ There used to be a nice Forth wiki at wiki.forthfreak.net, now it has to be accessed via archive as it's dead. Also some nice site here https://www.forth.org/compilers.html. ~drummyfish }
{ There is also some discussion about how low level Forth really is, if it really is a language or something like a "metalanguage", or an "environment" to create your own language by defining your own words. Now this is not a place to go very deep on this but kind of a sum up may be this: Forth in its base version is very low level, however it's very extensible and many extend it to some kind of much higher level language, hence the debates. ~drummyfish }
{ Since Forth adopts a kind of unique philosophy, there is some discussion about how low level Forth really is, if it really is a language or something like a "metalanguage", or an "environment" to create your own language by defining your own words. Now this is not a place to go very deep on this but kind of a sum up may be this: Forth in its base version is very low level, however it's very extensible and many Forth systems extend the base language to some kind of much higher level language, hence the debates. ~drummyfish }
It is usually presented as [interpreted](interpreter.md) language but may as well be [compiled](compiler.md), in fact it maps pretty nicely to [assembly](assembly.md). Even if interpreted, it can still be very fast. Forth systems traditionally include not just a compiler/interpreter but also an **interactive environment**, kind of [REPL](repl.md) language shell.
The language is usually presented as [interpreted](interpreter.md) but may perfectly well be [compiled](compiler.md) too, in fact it maps very well to [assembly](assembly.md). Some words may be written directly in machine code, so we may possibly see Forth as a kind of a "wrapper for assembly". And even if interpreted, the language can still be very fast thanks to its simplicity. Forth systems traditionally include not just a compiler/interpreter but also an **interactive environment**, kind of [REPL](repl.md) language shell.
There are several Forth standards, most notably ANS Forth from 1994 (the document is [proprietary](proprietary.md), sharing is allowed, 640 kB as txt). Besides others it also allows Forth to include optional [floating point](float.md) support, however Forth programmers highly prefer [fixed point](fixed_point.md) (as stated in the book *Starting Forth*). Then there is a newer Forth 2012 standard, but it's probably better to stick to the older one.
@ -28,7 +32,7 @@ In fact there are two stacks in Forth: the **parameter stack** (also data stack)
The stack is composed of **cells**: the size of the cell is implementation defined. The values stored in cells are just binary, they don't have any data type, so whether a value in given cell is considered signed or unsigned is up to the programmer -- some operators treat numbers as signed and some as unsigned (just like in [comun](comun.md)); note that with many operators the distinction doesn't matter (e.g. addition doesn't care if the numbers are signed or not, but comparison does).
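The signed/unsigned distinction can be seen by comparing the same two cells with `<` and `u<`. This is just a small sketch, assuming a two's complement system (which practically all Forth systems are), where -1 is stored as a cell with all bits set:

```
-1 1 < .     \ prints -1 (true): as signed numbers, -1 is less than 1
-1 1 u< .    \ prints 0 (false): as unsigned, the all-ones cell is the biggest value
-1 u.        \ prints the same cell as unsigned (exact value depends on cell size)
```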
Basic [abstraction](abstraction.md) of Forth is so called **word**: a word is simply a string without spaces like `abc` or `1mm#3`. A word represents simply some operations, which may include running native code, pushing numbers on stack or calling other words, for example the word `+` performs the addition on top of the stack, `dup` duplicates the top of the stack etc. The programmer can define his own words -- so words are basically kind of "[functions](function.md)" or rather procedures (however words don't return anything or take any arguments in traditional way, they all just invoke some operations -- arguments and return values are passed using the stack). Defining new words expands the current **dictionary**, so Forth basically extends itself as it's running. A word is defined like this:
Basic [abstraction](abstraction.md) of Forth is so called **word**: a word is simply a string without spaces like `abc` or `1mm#3`. A word represents simply some operations, which may include running native code, pushing numbers on stack or calling other words, for example the word `+` performs the addition on top of the stack, `dup` duplicates the top of the stack etc. The programmer can define his own words -- so words are basically kind of "[functions](function.md)" or rather procedures or routines (however words don't return anything or take any arguments in traditional way, they all just invoke some operations -- arguments and return values are passed using the stack). Defining new words expands the current **dictionary**, so Forth basically extends itself as it's running. A word is defined like this:
```
: myword operation1 operation2 ... ;
@ -40,6 +44,8 @@ For example a word that computes and average of the two values on top of the sta
: average + 2 / ;
```
Dictionary is a very important concept in Forth, it usually stores the words as a [linked list](list.md) that is searched starting from the most recently defined word -- this allows for example temporary shadowing of previously defined words with the same name.
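Shadowing may be sketched like this (the word names `answer` and `old-answer` are made up for the illustration; some systems, e.g. gforth, will print a harmless redefinition warning):

```
: answer 1 ;
: old-answer answer ;   \ compiled against the first answer
: answer 2 ;            \ newer definition shadows the older one
answer .                \ prints 2
old-answer .            \ prints 1 (still uses the old definition)
```

Words compiled before the redefinition keep pointing to the old definition, only later lookups find the new one.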
Forth programmers use so called **stack notation** to document the function's "signature", i.e. what it does with the stack -- they write this notation in a comment above a defined word to signify to others what the word will do. Stack notation has the format `( before -- after )`, for example the effect of the above defined `average` word would be written as `( a b -- avg )` in this notation.
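A short sketch of stack notation in practice follows; `square` is a made up word added just for the illustration, and the comment is placed right after the word's name, which is another common convention:

```
: average ( a b -- avg ) + 2 / ;
: square  ( n -- n*n )   dup * ;

10 20 average .   \ prints 15
7 square .        \ prints 49
```

The comment is ignored by Forth, it only serves the human reader.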
Some built-in words include:
@ -53,8 +59,13 @@ GENERAL:
/ divide ( a b -- [a/b] )
= equals ( a b -- [-1 if a = b else 0] )
<> not equals ( a b -- [-1 if a != b else 0] )
< less than ( a b -- [-1 if a < b else 0] )
> greater than ( a b -- [-1 if a > b else 0] )
< less than (signed) ( a b -- [-1 if a < b else 0] )
> greater than (signed) ( a b -- [-1 if a > b else 0] )
u< less than (unsigned) ( a b -- [-1 if a u< b else 0] )
u> greater than (unsigned) ( a b -- [-1 if a u> b else 0] )
0= equals zero ( a -- [-1 if a = 0 else 0] )
and bitwise and ( a b -- [a&b] )
or bitwise or ( a b -- [a|b] )
mod modulo ( a b -- [a % b] )
dup duplicate ( a -- a a )
drop pop stack top ( a -- )
@ -68,7 +79,7 @@ key read char on top
emit pop & print top as char
cr print newline
cells times cell width ( a -- [a * cell width in bytes] )
depth pop all & get d. ( a ... -- [previous stack size] )
depth gets stack size ( a ... -- [previous stack size] )
quit don't print "ok" at the end of execution
bye quit
@ -83,11 +94,12 @@ j pushes third value from return stack (without pop)
VARIABLES/CONSTS:
variable X creates var named X (X is a word that pushes its addr)
variable X creates var named X (X will be a word that pushes its addr.), allocates 1 cell
create X assigns X address (without allocating memory)
N X ! stores value N to variable X
N X +! adds value N to variable X
X @ pushes value of variable X to stack
N constant C creates constant C with value N
N constant C creates constant C with value N (C will be a new word)
C pushes the value of constant C
SPECIAL:
@ -95,6 +107,8 @@ SPECIAL:
( ) comment (inline)
\ comment (until newline)
." S" print string S (compiles in the string)
s" S" create string S (don't print, pushes pointer and length)
type print string (expects pointer and length)
X if C then if X, execute C (only in word def., X is popped)
X if C1 else C2 then if X, execute C1 else C2 (only in word def.)
do C loop loops from stack top value to stack second from,
@ -104,34 +118,39 @@ begin C while like begin/until but loops as long as top != 0
begin C again infinite loop
begin C1 while C2 repeat loop with middle condition
leave loop break (only for counted loops)
allot allocates memory, can be used for arrays
N allot allocates N bytes of memory (moves end-of-mem ptr), e.g. for arrays
here returns current end-of-mem address ("H" pointer)
exit exits from current word
recurse recursively call the word currently being defined
see W shows the definition of word W
see W shows (decompiles) the definition of word W
' W get address of word W
```
example programs:
Forth uses counted **strings** (unlike [C](c.md) which uses null-terminated strings), i.e. a string consists of an address pointing to the string start, and a number saying the length of the string.
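A minimal sketch of handling such address/length pairs, using the standard words `s"` and `type` (`greeting` is a made up word; using `s"` outside of a word definition is assumed to be supported, as it is e.g. in gforth):

```
s" hello" type        \ prints: hello
s" hello" nip .       \ drops the address, prints the length: 5
: greeting ( -- addr len ) s" hey there" ;
greeting type         \ prints: hey there
```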
TODO: local variables, addresses, arrays, compile-time behavior of words, strings, double words
## Examples
These are some tiny example programs:
```
100 1 2 + 7 * / . \ computes and prints 100 / ((1 + 2) * 7)
```
```
cr ." hey bitch " cr \ prints: hey bitch
cr ." hey bitch" cr \ prints: hey bitch
```
```
: myloop 5 0 do i . loop ; myloop \ prints 0 1 2 3 4
```
TODO: local variables, addresses
## Examples
Here is our standardized **[divisor tree](divisor_tree.md)** program written in Forth:
And here is our standardized **[divisor tree](divisor_tree.md)** program written in Forth:
```
\ takes x, pops it and recursively prints its divisor tree
: printdivisortree
: printDivisorTree
dup 3 <= if
0 swap 1 swap \ stack now: 0 1 x
else
@ -167,7 +186,7 @@ Here is our standardized **[divisor tree](divisor_tree.md)** program written in
drop drop drop
;
: digittonum
: digitToNum
dup dup 48 >= swap 57 <= and if
48 -
else
@ -177,13 +196,15 @@ Here is our standardized **[divisor tree](divisor_tree.md)** program written in
: main
begin \ main loop, read numbers from user
0
." enter a number: "
0 \ number to read
begin
key
dup 13 <> while \ newline?
digittonum
digitToNum
dup -1 = if
bye
@ -195,6 +216,7 @@ Here is our standardized **[divisor tree](divisor_tree.md)** program written in
drop \ key
dup 1000 < if
dup . cr
printDivisorTree cr
else
bye