Update

2024-07-18 20:20:45 +02:00 · 2024-07-18 20:20:45 +02:00 · fb2cdc5096
commit fb2cdc5096
parent cc40dcb437
11 changed files with 1799 additions and 1786 deletions
--- a/bootstrap.md
+++ b/bootstrap.md
@ -1,24 +1,26 @@
 # Bootstrap/Boot

-In general bootstrapping (from the idiom "pull yourself up by your bootstraps"), sometimes shortened to just *booting*, refers to a clever process of self-establishing some relatively complex system starting from something very small, without much external help. As an example imagine something like a "civilization bootstrapping kit" that contains only few primitive tools along with instructions on how to use those tools to mine ore, turn it into metal out of which one makes more tools which will be used to obtain more material and so on up until having basically all modern technology and factories set up in relatively short time ([civboot](civboot.md) is a project like this). The term *bootstrapping* is however especially relevant in relation to [computer](computer.md) technology -- here it has two main meanings:
+In general bootstrapping (from the idiom "pull yourself up by your bootstraps"), sometimes shortened to just *booting*, refers to a clever process of self-establishing a relatively complex system starting from something very small, without much external help. A nice example comes from the nature: a big plant that can do complex things such as reproduce itself initially grows ("bootstraps") from just a very tiny seed. As another example imagine something like a "civilization bootstrapping kit" that contains only a few primitive tools along with instructions on how to use those tools to mine ore, turn it into metal out of which one makes more tools which will be used to obtain more material and so on up until having basically all modern technology and factories set up in relatively short time ([civboot](civboot.md) is a project like this). The term *bootstrapping* is however especially relevant in relation to [computer](computer.md) technology -- here it has two main meanings:

 - The process by which a computer starts and sets up the [operating system](operating_system.md) after power on, which often involves several stages of loading various modules, running several bootloaders etc. This is traditionally called **booting** (*rebooting* means restarting the computer).
 - Utilizing the principle of bootstrapping for making greatly independent [software](software.md), i.e. software that doesn't [depend](depend.md) on other software as it can set itself up. This is usually what **bootstrapping** (the longer term) means. This is also greatly related to **[self hosting](self_hosting.md)**, another principle whose idea is to "implement technology using itself".

 ## Bootstrapping: Making Dependency-Free Software

-Bootstrapping as a general principle can aid us in creation of extremely free technology by greatly minimizing all its [dependencies](dependency.md), we are able to create a small amount of code that will self-establish our whole computing environment with only very small effort during the process. The topic mostly revolves around designing [programming language](programming_language.md) [compilers](compiler.md), but in general we may be talking about bootstrapping whole computing environments, operating systems etc.
+Bootstrapping -- as the general concept of letting a big thing grow out of a small seed -- may aid us in building extremely [free](free_software.md) (as in freedom), [portable](portability.md), self-contained (and also [secure](security.md)) technology by reducing all its [dependencies](dependency.md) to a minimum. If we are building a big computing environment (such as an operating system), we should make sure that all the big things it contains are made only with the smaller things that are further on built using yet smaller things and so on until some very tiny piece of code, i.e. we shall make sure there is always a way to set this whole system from the ground up, from a very small amount of initial code/tools. Being able to do this means our system is *bootstrappable* and it will allow us for example to set our whole system up on a completely new computing platform (e.g. a new CPU architecture) as long as we can set up that tiny initial prerequisite code. This furthermore removes the danger of dependencies that might kill our system and also allows security freaks to inspect the whole process of the system set up so that they can trust it (because even free software that sometime in the past touched a proprietary compiler can't generally be trusted -- see [trusting trust](trusting_trust.md)). I.e. bootstrapping means creating a very small amount of code that will self-establish our whole computing environment by first compiling small compilers that will then compile more complex compilers which will compile all the tools and programs etc. This topic is discussed for example in designing [programming language](programming_language.md) [compilers](compiler.md) and [operating systems](os.md). For examples of bootstrapping see e.g. [DuskOS](duskos.md) ([collapse](collapse.md)-ready operating system that bootstraps itself from a tiny amount of code), [GNU](gnu.md) [mes](mes.md) (bootstrapping system of the GNU operating system) or [comun](comun.md) (LRS programming language, now self hosted and bootstrappable e.g. from a few hundred lines of [C](c.md)).

-**Why be concerned with bootstrapping when we already have our systems set up?** There are many reasons, one of the notable ones is that we may lose our current technology due to societal [collapse](collapse.md), which is not improbable, it keeps happening throughout history over and over, so many people fear (rightfully so) that if by whatever disaster we lose our current computers, Internet etc., we will also lose with it all modern art, data, software we so painfully developed, digitized books, inventions and so on; not talking about the horrors that will follow if we're unable to quickly reestablish our computer networks we are so dependent on. Setting up what we currently have completely from scratch would be extremely difficult, a task for centuries -- just take a while to consider all the activity and knowledge that's required around the globe to create a single computer with all its billions of lines of code worth of software that makes it work. Knowledge of old technology gets lost -- to make modern computers we first needed older, primitive computers, but now that we only have modern computers no one remembers anymore how to make the older computers -- if we lose the current ones, we won't be able to make them, we will lack the tools. Another reason for bootstrapping is independence of technology which brings e.g. [freedom](freedom.md) (your operating system being able to be set up anywhere without some corporation's proprietary driver or hardware unit is pursued by many), robustness, simplicity, ability to bring existing software to new platforms and so on, i.e. things that are practical even in current world.
+**Why be concerned with bootstrapping when we already have our systems set up?** Besides the obvious elegance of this whole approach there are many other practical reasons -- as mentioned, some are concerned about "security", some want portability, control and independence -- one of other notable justifications is that we may lose our current technology due to societal [collapse](collapse.md), which is not improbable as it keeps happening throughout history over and over, so many people fear (rightfully so) that if by whatever disaster we lose our current computers, Internet etc., we will also lose with it all modern art, data, software we so painfully developed, digitized books and so on; not talking about the horrors that will follow if we're unable to quickly reestablish our computer networks we are so dependent on. Setting up what we currently have completely from scratch would be extremely difficult, a task for centuries -- just take a while to consider all the activity and knowledge that's required around the globe to create a single computer with all its billions of lines of code worth of software that makes it work. Knowledge of old technology gets lost -- to make modern computers we first needed older, primitive computers, but now that we only have modern computers no one remembers anymore how to make the older computers -- modern computers are sustaining themselves but once they're gone, we won't know how to make them again, i.e. if we lose computers, we will also lose tools for making computers. This applies on many levels (hardware, operating systems, programming languages and so on).

-**[Forth](forth.md) has traditionally been used for making bootstrapping environments**; [Dusk OS](duskos.md) is an example of such project. Similarly simple language such as [Lisp](lisp.md) and [comun](comun.md) will probably work too.
+Bootstrapping has to start with some initial prerequisite machine dependent binary code that kickstarts the self-establishing process, i.e. it's not possible to get rid of absolutely ALL binary code and have a pure bootstrappable code that would run on every computer -- that would require making a program that can native run on any computer, which can't be done -- but it is possible to get it to absolute minimum -- let's say a few dozen bytes of machine code that can even be hand-made on paper and can be easily inspected for "safety". This initial binary code is called *bootstrapping binary seed*. This code can be as simple as a mere translator of some extremely simple bytecode (that may consist only of handful of instructions) to the platform's assembly language. There even exists the extreme case of a single instruction computer, but in practice it's not necessary to go as far. The initial binary seed may then typically be used to translate a precompiled bytecode of our system's compiler to native runnable code and voila, we can now happily start compiling whatever we want.
+
+[Forth](forth.md) is a language that has traditionally been used for making bootstrapping environments; [Dusk OS](duskos.md) is an example of such project. Similarly simple language such as [Lisp](lisp.md) and [comun](comun.md) can work too (GNU Mes uses a combination of [Scheme](scheme.md) and C).

 **How to do this then?** To make a computing environment that can bootstrap itself you can do it like this:

-1. **Make a [simple](kiss.md) [programming language](programming_language.md) L**. You can choose e.g. the mentioned [Forth](forth.md) but you can even make your own, just remember to keep it extremely simple -- simplicity of the base language is the key feature here. The language will serve as tool for writing software for your platform, i.e. it will provide some comfort in programming (so that you don't have to write in assembly) but mainly it will be an **[abstraction](abstraction.md) layer** for the programs, it will allow them to run on any hardware/platform. The language therefore has to be **[portable](portability.md)**; it should probably abstracts things like [endianness](byte_sex.md), native integer size, control structures etc., so as to work nicely on all [CPUs](cpu.md), but it also mustn't have too much abstraction (such as [OOP](oop.md)) otherwise it will quickly get complicated. The language can compile e.g. to some kind of very simple [bytecode](bytecode.md) that will be easy to translate to any [assembly](assembly.md). At first you'll have to temporarily implement L in some already existing language, e.g. [C](c.md). NOTE: in theory you could just make bytecode, without making L, and just write your software in that bytecode, but the bytecode has to focus on being simple to translate, i.e. it will e.g. likely have few opcodes, which will be in conflict with making it at least somewhat comfortable to program on your platform. However one can try to make some compromise and it will save the complexity of translating language to bytecode, so it can be considered ([uxn](uxn.md) seems to be doing this).
-2. **Write L in itself, i.e. [self host](self_hosting.md) it**. This means you'll use L to write a [compiler](compiler.md) of L that outputs L's bytecode. Once you do this, you have a completely independent language and can throw away the original compiler of L written in another language. Now compile L with itself -- you'll get the bytecode of L compiler. At this point you can bootstrap L on any platform as long as you can execute the L bytecode on it -- this is why it was crucial to make L and its bytecode very simple. In theory it's enough to just interpret the bytecode but it's better to translate it to the platform's native machine code so that you get maximum efficiency (the nature of bytecode should make it so that it isn't really more diffiult to translate it than to interpret it). If for example you want to bootstrap on an [x86](x86.md) CPU, you'll have to write a program that translates the bytecode to x86 assembly; if we suppose that at the time of bootstrapping you will only have this x86 computer, you will have to write the translator in x86 assembly manually. If your bytecode really is simple and well made, it shouldn't be hard though (you will mostly be replacing your bytecode opcodes with given platform's machine code opcodes).
-3. **Further help make L bootstrapable**. This means making it even easier to execute the L bytecode on any given platform -- you may for example write the bytecode translators for common platforms like x86, ARM, RISC-V and so on. At this point you have L bootstrappable without any [work](work.md) on the platform you have translators for and on others it will just take a tiny bit of work to write its own translator.
-4. **Write everything else in L**. This means writing the platform itself and software such as various tools and libraries. You can potentially even use L to write a higher level language for yet more comfort in programming. Since everything here is written in L and L can be bootstrapped, everything here can be bootstrapped as well.
+1. **Make a [simple](kiss.md) [programming language](programming_language.md) L**. You can choose e.g. the mentioned [Forth](forth.md) but you can even make your own, just remember to keep it extremely simple -- simplicity of the base language is the key feature here. If you also need a more complex language, write it in L. The language L will serve as tool for writing software for your platform, i.e. it will provide some comfort in programming (so that you don't have to write in assembly) but mainly it will be an **[abstraction](abstraction.md) layer** for the programs, it will allow them to run on any hardware/platform. The language therefore has to be **[portable](portability.md)**; it should probably abstracts things like [endianness](byte_sex.md), native integer size, control structures etc., so as to work nicely on all [CPUs](cpu.md), but it also mustn't have too much abstraction (such as [OOP](oop.md)) otherwise it will quickly get complicated. The language can compile e.g. to some kind of very simple [bytecode](bytecode.md) that will be easy to translate to any [assembly](assembly.md). Make the bytecode very simple (and document it well) as its complexity will later on determine the complexity of the bootstrap binary seed. At first you'll have to temporarily implement L in some already existing language, e.g. [C](c.md). NOTE: in theory you could just make bytecode, without making L, and just write your software in that bytecode, but the bytecode has to focus on being simple to translate, i.e. it will probably have few opcodes for example, which will be in conflict with making it at least somewhat comfortable to program on your platform. However one can try to make some compromise and it will save the complexity of translating language to bytecode, so it can be considered ([uxn](uxn.md) seems to be doing this).
+2. **Write L in itself, i.e. [self host](self_hosting.md) it**. This means you'll use L to write a [compiler](compiler.md) of L that outputs L's bytecode. Once you do this, you have a completely independent language and can start using it instead of the original compiler of L written in another language. Now compile L with itself -- you'll get the bytecode of L compiler. At this point you can bootstrap L on any platform as long as you can execute the L bytecode on it -- this is why it was crucial to make L and its bytecode very simple. In theory it's enough to just interpret the bytecode but it's better to translate it to the platform's native machine code so that you get maximum efficiency (the nature of bytecode should make it so that it isn't really more diffiult to translate it than to interpret it). If for example you want to bootstrap on an [x86](x86.md) CPU, you'll have to write a program (L compiler [backend](backend.md)) that translates the bytecode to x86 assembly; if we suppose that at the time of bootstrapping you will only have this x86 computer, you will have to write the translator in x86 assembly manually. If your bytecode really is simple and well made, it shouldn't be hard though (you will mostly be replacing your bytecode opcodes with given platform's machine code opcodes). Once you have the x86 backend, you can completely bootstrap L's compiler on any x86 computer.
+3. **Further help make L bootstrapable**. This means making it even easier to execute the L bytecode on any given platform -- you may for example write backends (the bytecode translators) for common platforms like x86, ARM, RISC-V, C, Lisp and so on. At this point you have L bootstrappable without any [work](work.md) on the platforms for which you provide backends and on others it will just take a tiny bit of work to write its own translator.
+4. **Write everything else in L**. This means writing the platform itself and software such as various tools and libraries. You can potentially even use L to write a higher level language (e.g. C) for yet more comfort in programming. Since everything here is written in L and L can be bootstrapped, everything here can be bootstrapped as well.

 ## Booting: Computer Starting Up