You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

26 lines
5.5 KiB
Markdown

# Assembly
1 year ago
Assembly (also ASM) is, for any given hardware computing platform ([ISA](isa.md), basically a [CPU](cpu.md) architecture), the lowest level [programming language](programming_language.md) that expresses a linear (typically unstructured) sequence of instructions -- it maps (mostly) 1:1 to [machine code](machine_code.md) (the actual [binary](binary.md) CPU instructions) and basically only differs from the actual machine code by utilizing a more human readable form (it mostly just gives human friendly nicknames to combinations of 1s and 0s). Assembly is converted by [assembler](assembler.md) into the the machine code. Assembly is similar to [bytecode](bytecode.md), but bytecode is meant to be [interpreted](interpreter.md) or used as an intermediate representation in [compilers](compiler.md) while assembly represents actual native code run by hardware. In ancient times when there were no higher level languages (like [C](c.md) or [Fortran](fortran.md)) assembly was used to write computer programs -- nowadays most programmers no longer write in assembly (majority of zoomer "[coders](coding.md)" probably never even touch anything close to it) because it's hard (takes a long time) and not [portable](portability.md), however programs written in assembly are known to be extremely fast as the programmer has absolute control over every single instruction.
1 year ago
**Assembly is NOT a single language**, it differs for every architecture, i.e. every model of CPU has potentially different architecture, understands a different machine code and hence has a different assembly; therefore **assembly is not [portable](portability.md)** (i.e. the program won't generally work on a different type of CPU or under a different [OS](os.md))! For this reason (and also for the fact that "assembly is hard") you shouldn't write your programs directly in assembly but rather in a bit higher level language such as [C](c.md) (which can be compiled to any CPU's assembly). However you should know at least the very basics of programming in assembly as a good programmer will come in contact with it sometimes, for example during hardcore [optimization](optimization.md) (many languages offer an option to embed inline assembly in specific places), debugging, reverse engineering, when writing a C compiler for a completely new platform or even when designing one's own new platform.
The most common assembly languages you'll encounter nowadays are **[x86](x86.md)** (used by most desktop [CPUs](cpu.md)) and **[ARM](arm.md)** (used by most mobile CPUs) -- both are used by [proprietary](proprietary.md) hardware and though an assembly language itself cannot (as of yet) be [copyrighted](copyright.md), the associated architectures may be "protected" (restricted) e.g. by [patents](patent.md). **[RISC-V](risc_v.md)** on the other hand is an "[open](open.md)" alternative, though not yet so wide spread. Other assembly languages include e.g. [AVR](avr.md) (8bit CPUs used e.g. by some [Arduinos](arduino.md)) and [PowerPC](ppc.md).
To be precise, a typical assembly language is actually more than a set of nicknames for machine code instructions, it may offer helpers such as [macros](macro.md) (something aking the C preprocessor), pseudoinstructions (commands that look like instructions but actually translate to e.g. multiple instructions), [comments](comment.md), directives, named labels for jumps (as writing literal jump addresses would be extremely tedious) etc.
## Typical Assembly Language
1 year ago
Assembly languages are usually unstructured, i.e. there are no control structures such as `if` or `while` statements: these have to be manually implemented using labels and jump ([goto](goto.md)) instructions. There may exist macros that mimic control structures. The typical look of an assembly program is however still a single column of instructions with arguments, one per line.
1 year ago
The working of the language reflects the actual [hardware](hardware.md) architecture -- usually there is a small number (something like 16) of [registers](register.md) which may be called something like R0 to R15, or A, B, C etc. Sometimes registers may even be subdivided (e.g. in x86 there is an *eax* 32bit register and half of it can be used as the *ax* 16bit register). These registers are the fastest available memory (faster than the main RAM memory) and are used to perform calculations. Some registers are general purpose and some are special: typically there will be e.g. the FLAGS register which holds various 1bit results of performed operations (e.g. [overflow](overflow.md), zero result etc.). Some instructions may only work with some registers (e.g. there may be kind of a "[pointer](pointer.md)" register used to hold addresses along with instructions that work with this register, which is meant to implement [arrays](array.md)). Values can be moved between registers and the main memory.
Instructions are typically written as three-letter abbreviations and follow some unwritten naming conventions so that different assembly languages at least look similar. Common instructions found in most assembly languages are for example:
- MOV (move): move a number between registers and/or memory.
- JMP (jump): unconditional jump to far away instruction.
- BEQ (branch if equal): jump if result of previous comparison was equality.
- ADD (add): add two numbers.
- NOP (no operation): do nothing (used e.g. for delays).
- CMP (compare): compare two numbers and set relevant flags (typically for a subsequent conditional jump).
Assembly languages may offer simple helpers such as macros.