Update

2024-08-22 22:58:37 +02:00 · 2024-08-22 22:58:37 +02:00 · 4bff69ec4a
commit 4bff69ec4a
parent d5e4b36718
12 changed files with 2056 additions and 1813 deletions
--- a/assembly.md
+++ b/assembly.md
@ -1,6 +1,6 @@
 # Assembly

-Assembly (also ASM) is, for any given [hardware](hw.md) computing platform ([ISA](isa.md), basically a [CPU](cpu.md) architecture), the lowest level [programming language](programming_language.md) that expresses typically a linear, unstructured (i.e. without nesting blocks of code) sequence of CPU instructions -- it maps (mostly) 1:1 to [machine code](machine_code.md) (the actual [binary](binary.md) CPU instructions) and basically only differs from the actual machine code by utilizing a more human readable form (it gives human friendly nicknames, or mnemonics, to different combinations of 1s and 0s). Assembly is converted by [assembler](assembler.md) into the the machine code, something akin a computer equivalent of the "[DNA](dna.md)", the lowest level instructions for the computer. Assembly is similar to [bytecode](bytecode.md), but bytecode is meant to be [interpreted](interpreter.md) or used as an intermediate representation in [compilers](compiler.md) while assembly represents actual native code run by hardware. In ancient times when there were no higher level languages (like [C](c.md) or [Fortran](fortran.md)) assembly was used to write computer programs -- nowadays most programmers no longer write in assembly (majority of [zoomer](zoomer.md) "[coders](coding.md)" probably never even touch anything close to it) because it's hard (takes a long time) and not [portable](portability.md), however programs written in assembly are known to be extremely fast as the programmer has absolute control over every single instruction (of course that is not to say you can't fuck up and write a slow program in assembly).
+Assembly (also ASM) is, for any given [hardware](hw.md) computing platform ([ISA](isa.md), basically a [CPU](cpu.md) architecture), the native, lowest level [programming language](programming_language.md) that expresses typically a linear, unstructured (i.e. without nesting blocks of code) sequence of very simple CPU instructions -- it maps (mostly) 1:1 to [machine code](machine_code.md) (the actual [binary](binary.md) CPU instructions) and basically only differs from the actual machine code by utilizing a more human readable form (it gives human friendly nicknames, or mnemonics, to different combinations of 1s and 0s). Assembly is converted by [assembler](assembler.md) into the the machine code, something akin a computer equivalent of the "[DNA](dna.md)", the lowest level instructions for the computer. Assembly is similar to [bytecode](bytecode.md), but bytecode is meant to be [interpreted](interpreter.md) or used as an intermediate representation in [compilers](compiler.md) and may even be quite high level while assembly represents actual native code run by the hardware. In ancient times when there were no higher level languages (like [C](c.md) or [Fortran](fortran.md)) assembly was used to write computer programs -- nowadays most programmers no longer write in assembly (majority of [zoomer](zoomer.md) "[coders](coding.md)" probably never even touch anything close to it) because it's hard (takes a long time) and not [portable](portability.md), however programs written in assembly are known to be extremely fast as the programmer has absolute control over every single instruction (of course that is not to say you can't fuck up and write a slow program in assembly) and is able to manually [optimize](optimization.md) every single detail about the program.

 { see this meme lol :D http://lolwut.info/images/4chan-g1.png ~drummyfish }

@ -12,7 +12,7 @@ The most common assembly languages you'll encounter nowadays are **[x86](x86.md)

 To be precise, a typical assembly language is actually more than a set of nicknames for machine code instructions, it may offer helpers such as [macros](macro.md) (something akin the C preprocessor), pseudoinstructions (commands that look like instructions but actually translate to e.g. multiple instructions), [comments](comment.md), directives, automatic inference of opcode from operands, named labels for jumps (as writing literal jump addresses would be extremely tedious) etc. I.e. it is still much easier to write in assembly than to write pure machine code even if you knew all opcodes from memory. For the same reason remember that just replacing assembly mnemonics with binary machine code instructions is not yet enough to make an executable program! More things have to be done such as [linking](linking.md) [libraries](library.md) and converting the result to some [executable format](executable_format.md) such as [elf](elf.md) which contains things like header with metainformation about the program etc.

-**How will programming in assembly differ from your mainstream high-level programming?** Quite a lot, assembly is extremely low level, so you get no handholding or much programming "safety" (apart from e.g. CPU operation modes), you have to do everything yourself -- you'll be dealing with things such as function [call conventions](call_convention.md), [interrupts](interrupt.md), [syscalls](syscall.md) and their conventions, counting CPU cycles of individual instructions, looking up exact hexadecimal memory addresses, opcodes, defining memory segments, dealing with [endianness](endianness.md), raw [goto](goto.md) jumps, [call frames](call_frame.md) etc. You have no branching (if-then-else), loops or functions, you make these yourself with gotos. You can't write expressions like `(a + 3 * b) / 10`, no, you have to write every step of how to evaluate this expression using registers, i.e. something like: load *a* to register *A*, load *b* to register *B*, multiply *B* by 3, add register *B* to *A*, divide *A* by 10. You don't have any [data types](data_type.md), you have to know yourself that your variables really represent signed values so when you're dividing, you have to use signed divide instruction instead of unsigned divide -- if you mess this up, no one will tell you, your program simply won't work. And so on.
+**How will programming in assembly differ from your mainstream high-level programming?** Quite a lot, assembly is extremely low level, so you get no handholding or much programming "safety" (apart from e.g. CPU operation modes), you have to do everything yourself -- for example assembly languages are **untyped**, i.e. no one is going to offer or check your data types, everything is just 1s and 0s. You will also be dealing with things such as function [call conventions](call_convention.md), call stack and call frames, [interrupts](interrupt.md), overflows, [system calls](syscall.md) and their conventions, counting CPU cycles of individual instructions, looking up exact hexadecimal memory addresses, studying opcodes, defining memory segments, dealing with [endianness](endianness.md), raw [goto](goto.md) jumps, manual [memory management](memory_management.md) etc. You have no branching (if-then-else), loops or functions, you make these yourself with gotos. You can't write expressions like `(a + 3 * b) / 10`, no, you have to write every single step of how to evaluate this expression using registers, i.e. something like: load *a* to register *A*, load *b* to register *B*, multiply *B* by 3, add register *B* to *A*, divide *A* by 10. As said, you don't have any [data types](data_type.md), you have to know yourself that your variables really represent let's say a signed value so when you're dividing, you have to use signed divide instruction instead of unsigned divide -- if you mess this up, no one will tell you, your program simply won't work. And so on.

 ## Typical Assembly Language

@ -24,7 +24,7 @@ The working of the language reflects the actual [hardware](hardware.md) architec

 Writing instructions works similarly to how you call a [function](function.md) in high level language: you write its name and then its [arguments](argument.md), but in assembly things are more complicated because an instruction may for example only allow certain kinds of arguments -- it may e.g. allow a register and immediate constant (kind of a number literal/constant), but not e.g. two registers. You have to read the documentation for each instruction. While in high level language you may write general [expressions](expression.md) as arguments (like `myFunc(x + 2 * y,myFunc2())`), here you can only pass specific values.

-There are also no complex [data types](data_type.md), assembly only works with numbers of different size, e.g. 16 bit integer, 32 bit integer etc. Strings are just sequences of numbers representing [ASCII](ascii.md) values, it is up to you whether you implement null terminated strings or Pascal style strings. [Pointers](pointer.md) are just numbers representing addresses. It is up to you whether you interpret a number as signed or unsigned (some instructions treat numbers as unsigned, some as signed, some don't care because it doesn't matter).
+There are also no advanced [data types](data_type.md), assembly only works with numbers of different sizes, e.g. 16 bit integer, 32 bit integer etc. Strings are just sequences of numbers representing [ASCII](ascii.md) values, it is up to you whether you implement null terminated strings or Pascal style strings. [Pointers](pointer.md) are just numbers representing addresses. It is up to you whether you interpret a number as signed or unsigned (some instructions treat numbers as unsigned, some as signed, some don't care because it doesn't matter).

 Instructions are typically written as three-letter abbreviations and follow some unwritten naming conventions so that different assembly languages at least look similar. Common instructions found in most assembly languages are for example: