# Assembly
Assembly (also ASM) is, for any given hardware computing platform (ISA, basically a CPU architecture), the lowest level programming language that expresses a linear (typically unstructured) sequence of instructions -- it maps (mostly) 1:1 to machine code (the actual binary CPU instructions) and basically only differs from it by using a more human readable form (it mostly just gives human friendly nicknames to combinations of 1s and 0s). Assembly is converted into machine code by an assembler. Assembly is similar to bytecode, but bytecode is meant to be interpreted or used as an intermediate representation in compilers, while assembly represents actual native code run by hardware. In ancient times, when there were no higher level languages (like C or Fortran), assembly was used to write computer programs -- nowadays most programmers no longer write in assembly (the majority of zoomer "coders" probably never even touch anything close to it) because it's hard (takes a long time) and not portable; however, programs written in assembly can be extremely fast as the programmer has absolute control over every single instruction.
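
To make the "nicknames for 1s and 0s" idea concrete, here is a minimal sketch of the most trivial part of an assembler: a lookup table from mnemonics to bytes. The four entries are real single-byte x86 opcodes, but the function name `assembleOne` and the table layout are made up for illustration; a real assembler also has to deal with operands, labels and multi-byte encodings.

```c
#include <string.h>

/* One entry of the toy table: a mnemonic and the byte it stands for. */
struct Opcode
{
  const char *mnemonic;
  unsigned char byte;
};

/* A few x86 instructions that encode as a single byte with no operands. */
static const struct Opcode opcodes[] =
{
  { "nop",  0x90 },
  { "ret",  0xC3 },
  { "hlt",  0xF4 },
  { "int3", 0xCC }
};

/* Hypothetical helper: returns the opcode byte for a mnemonic,
   or -1 if the mnemonic is not in the table. */
int assembleOne(const char *mnemonic)
{
  for (size_t i = 0; i < sizeof(opcodes) / sizeof(opcodes[0]); ++i)
    if (strcmp(opcodes[i].mnemonic, mnemonic) == 0)
      return opcodes[i].byte;

  return -1;
}
```

Calling `assembleOne("nop")` gives `0x90`, while a mnemonic missing from the table, such as `"mov"`, gives `-1`.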
**Assembly is NOT a single language**, it differs for every architecture, i.e. each family of CPUs potentially has a different architecture, understands a different machine code and hence has a different assembly; therefore **assembly is not portable** (i.e. the program generally won't work on a different type of CPU or under a different OS)! For this reason (and also because "assembly is hard") you shouldn't write your programs directly in assembly but rather in a slightly higher level language such as C (which can be compiled to any CPU's assembly). However, you should know at least the very basics of programming in assembly, as a good programmer will come in contact with it sometimes, for example during hardcore optimization (many languages offer an option to embed inline assembly in specific places), debugging, reverse engineering, when writing a C compiler for a completely new platform or even when designing one's own new platform.
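
As a rough illustration of inline assembly and of why it hurts portability, the following sketch adds two numbers with an x86 `addl` instruction. It assumes a GCC-compatible compiler (the `__asm__` extended syntax is a compiler extension, not standard C) and falls back to plain C everywhere else; the function name `addInts` is made up for this example.

```c
/* Adds two ints: inline assembly on x86 with GCC or Clang, plain C elsewhere. */
int addInts(int a, int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  /* AT&T syntax: add the source (%1, holding b) to the destination (%0, holding a);
     "cc" tells the compiler that the flags register gets modified */
  __asm__ ("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
#else
  a += b; /* portable fallback for other compilers and CPUs */
#endif

  return a;
}
```

Note that the assembly branch needs a preprocessor guard at all: the same source line would simply fail to assemble on ARM or RISC-V, which is exactly the portability problem described above.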
The most common assembly languages you'll encounter nowadays are **x86** (used by most desktop CPUs) and **ARM** (used by most mobile CPUs) -- both are used by proprietary hardware and though an assembly language itself cannot (as of yet) be copyrighted, the associated architectures may be "protected" (restricted) e.g. by patents. **RISC-V**, on the other hand, is an "open" alternative, though not yet as widespread. Other assembly languages include e.g. AVR (8bit CPUs used e.g. by some Arduinos) and PowerPC.
To be precise, a typical assembly language is actually more than a set of nicknames for machine code instructions, it may offer helpers such as macros (something akin to the C preprocessor), pseudoinstructions (commands that look like instructions but actually translate to e.g. multiple instructions), comments, directives, named labels for jumps (as writing literal jump addresses would be extremely tedious) etc.
Assembly is extremely low level, so you get no handholding and not much programming "safety" (apart from e.g. CPU operation modes); you have to do everything yourself -- you'll be dealing with things such as function call conventions, interrupts, syscalls and their conventions, memory segments, endianness, raw addresses/goto jumps, call frames etc.
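
Endianness is one of those details that higher level code mostly hides from you. The following sketch shows one possible way to observe it from C; the function names `isLittleEndian` and `swapBytes32` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little endian machine, 0 otherwise. */
int isLittleEndian(void)
{
  uint16_t value = 0x0102;
  unsigned char firstByte;

  memcpy(&firstByte, &value, 1); /* copy the byte stored at the lowest address */

  return firstByte == 0x02; /* little endian stores the low byte first */
}

/* Reverses the byte order of a 32bit value, i.e. converts between
   little and big endian representation. */
uint32_t swapBytes32(uint32_t x)
{
  return
    ((x & 0x000000FFu) << 24) |
    ((x & 0x0000FF00u) << 8)  |
    ((x & 0x00FF0000u) >> 8)  |
    ((x & 0xFF000000u) >> 24);
}
```

For example `swapBytes32(0x11223344)` returns `0x44332211`; the result of `isLittleEndian()` depends on the machine the program runs on.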
## Typical Assembly Language
Assembly languages are usually unstructured, i.e. there are no control structures such as `if` or `while` statements: these have to be manually implemented using labels and jump (goto) instructions. There may exist macros that mimic control structures. The typical look of an assembly program is however still a single column of instructions with arguments, one per line.
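
To get a feel for this unstructured style without leaving C, here is a sketch of a loop written only with labels and `goto`, roughly the way it would have to be laid out in assembly. The function name `sumTo` and the label names are made up for illustration.

```c
/* Sums the integers 1 to n using only labels and jumps,
   equivalent to: for (i = 1; i <= n; ++i) sum += i; */
int sumTo(int n)
{
  int sum = 0;
  int i = 1;

loopStart:
  if (i > n)        /* compare, then conditionally jump out of the loop */
    goto loopEnd;

  sum += i;
  i += 1;
  goto loopStart;   /* unconditional jump back to the loop condition */

loopEnd:
  return sum;
}
```

Each `if (...) goto` pair corresponds to a compare instruction followed by a conditional jump, and the plain `goto` to an unconditional jump.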
The way the language works reflects the actual hardware architecture -- most architectures are based on registers, so usually there is a small number (something like 16) of registers which may be called something like R0 to R15, or A, B, C etc. Sometimes registers may even be subdivided (e.g. in x86 there is an *eax* 32bit register and its lower half can be used as the *ax* 16bit register). These registers are the fastest available memory (faster than the main RAM memory) and are used to perform calculations. Some registers are general purpose and some are special: typically there will be e.g. the FLAGS register which holds various 1bit results of performed operations (e.g. overflow, zero result etc.). Some instructions may only work with some registers (e.g. there may be kind of a "pointer" register used to hold addresses along with instructions that work with this register, which is meant to implement arrays). Values can be moved between registers and the main memory.
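
The following sketch imitates two of the ideas above in plain C: a 32bit register whose lower half can be accessed separately (like *eax* and *ax*), and a flags value filled in as a side effect of an addition. All names and the flag bit positions are made up for illustration and do not match the layout of any real FLAGS register; the carry flag here stands in for unsigned overflow.

```c
#include <stdint.h>

#define FLAG_ZERO  0x01 /* hypothetical bit: result was zero */
#define FLAG_CARRY 0x02 /* hypothetical bit: result did not fit into 8 bits */

/* Reads the low 16 bits of a 32bit value, like reading ax out of eax. */
uint16_t lowHalf(uint32_t reg32)
{
  return (uint16_t) (reg32 & 0xFFFFu);
}

/* Replaces the low 16 bits and keeps the high 16 bits,
   like writing ax while leaving the rest of eax untouched. */
uint32_t setLowHalf(uint32_t reg32, uint16_t value)
{
  return (reg32 & 0xFFFF0000u) | value;
}

/* Adds two 8bit numbers, stores the 8bit result and returns the flags. */
uint8_t add8WithFlags(uint8_t a, uint8_t b, uint8_t *result)
{
  unsigned int wide = (unsigned int) a + b; /* computed without wrapping */
  uint8_t flags = 0;

  *result = (uint8_t) wide; /* keep only the low 8 bits */

  if (*result == 0)
    flags |= FLAG_ZERO;

  if (wide > 0xFF)
    flags |= FLAG_CARRY;

  return flags;
}
```

For example adding 200 and 56 wraps around to 0, so both hypothetical flags end up set, which is the kind of information a subsequent conditional jump would test.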
Instructions are typically written as three-letter abbreviations and follow some unwritten naming conventions so that different assembly languages at least look similar. Common instructions found in most assembly languages are for example:
- MOV (move): move a number between registers and/or memory.
- JMP (jump): jump unconditionally to another, possibly far away, instruction.
- BEQ (branch if equal): jump if the previous comparison found the two values equal.
- ADD (add): add two numbers.
- NOP (no operation): do nothing (used e.g. for delays).
- CMP (compare): compare two numbers and set relevant flags (typically for a subsequent conditional jump).
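
To show how these instructions play together, here is a sketch of an interpreter for a made-up machine with four registers. The instruction set, its encoding, the extra HLT (halt) instruction and all names are assumptions for illustration and do not correspond to any real CPU; register indices and jump targets are trusted and not checked.

```c
enum Op { OP_MOV, OP_ADD, OP_CMP, OP_BEQ, OP_JMP, OP_NOP, OP_HLT };

struct Instr
{
  enum Op op;
  int a; /* register index, or jump target for BEQ and JMP */
  int b; /* register index, or constant for MOV */
};

/* Runs a program on the toy machine and returns the final value of R0. */
int run(const struct Instr *program, int length)
{
  int reg[4] = { 0, 0, 0, 0 };
  int equalFlag = 0; /* set by CMP, read by BEQ */
  int pc = 0;        /* index of the next instruction */

  while (pc >= 0 && pc < length)
  {
    struct Instr in = program[pc];
    pc += 1;

    switch (in.op)
    {
      case OP_MOV: reg[in.a] = in.b; break;                    /* Ra = constant */
      case OP_ADD: reg[in.a] += reg[in.b]; break;              /* Ra = Ra + Rb */
      case OP_CMP: equalFlag = reg[in.a] == reg[in.b]; break;  /* set the flag */
      case OP_BEQ: if (equalFlag) pc = in.a; break;            /* jump if equal */
      case OP_JMP: pc = in.a; break;                           /* always jump */
      case OP_NOP: break;                                      /* do nothing */
      case OP_HLT: return reg[0];                              /* stop */
    }
  }

  return reg[0];
}

/* Example program: sums the numbers 1 to 5 into R0. */
static const struct Instr sumProgram[] =
{
  { OP_MOV, 0, 0 }, /* 0: R0 = 0, the sum */
  { OP_MOV, 1, 1 }, /* 1: R1 = 1, the counter */
  { OP_MOV, 2, 1 }, /* 2: R2 = 1, the step */
  { OP_MOV, 3, 6 }, /* 3: R3 = 6, the limit */
  { OP_CMP, 1, 3 }, /* 4: compare counter with limit */
  { OP_BEQ, 9, 0 }, /* 5: if equal, jump to 9 */
  { OP_ADD, 0, 1 }, /* 6: sum += counter */
  { OP_ADD, 1, 2 }, /* 7: counter += step */
  { OP_JMP, 4, 0 }, /* 8: jump back to 4 */
  { OP_HLT, 0, 0 }  /* 9: halt */
};
```

Running `run(sumProgram, 10)` walks the loop five times and returns 15. The loop is built exactly as described in the list above: CMP sets a flag, BEQ leaves the loop based on it and JMP closes the loop.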
Assembly languages may offer simple helpers such as macros.
## Example
TODO: some C code and how it translates to different assembly langs
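```
#include <stdio.h>

char incrementDigit(char d)
{
  return d >= '0' && d < '9' ?
    d + 1 : // digits '0' to '8' go up by one
    '?';    // assumed fallback, the original value is missing here
}

int main(void)
{
  char c = getchar();
  putchar(incrementDigit(c)); // assumed: print the result
  return 0;
}
```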