Unix
"Those who don't know Unix are doomed to reinvent it, poorly." --obligatory quote by Henry Spencer
Unix (plural Unixes or Unices) is an old operating system developed since the 1960s as a research project of Bell Labs, which has become one of the most influential pieces of software in history and whose principles (e.g. the Unix philosophy, everything is a file, ...) live on in many so called Unix-like operating systems such as Linux and BSD (at least to some degree). The original system itself is no longer in use (it was later followed by a new project, Plan 9, which itself is now pretty old), the name UNIX is nowadays a trademark and a certification. However, as someone once said, Unix is not so much an operating system as a way of thinking.
In one aspect Unix has reached the highest level a piece of software can strive for: it has transcended its implementation and become a de facto standard. This means it has become a set of interface conventions, "paradigms", cultural and philosophical ideas rather than being a single system, it lives on as a concept that has many implementations. This is extremely important as we don't depend on any single Unix implementation but we have a great variety of choice between which we can switch without major issues. This is very important for freedom -- it prevents monopolization -- and it's one of the important reasons to use Unix-like systems.
The main highlights of Unix are possibly these:
- Unix philosophy: a kind of general mindset of software development, usually summed up as "do one thing well" (rather than "do everything but poorly") and "make programs work in collaboration with other programs", advising on using universal text interfaces for communication etc. This often comes with the idea of pipes, a way of chaining programs (typically using the pipe `|` operator, hence the name) by sending one program's output to another program's input.
- everything is a file: Unix chose to use the file abstraction to enable universal communication of programs with hardware and among themselves, i.e. on unices most things such as printing, reading keyboard, networking etc. will likely be implemented as reading from or writing to some special (sometimes just virtual) file. This has the advantage of being able to just use some file reading library or syscall, without having to directly access hardware or raw memory, which may be difficult, unsafe etc.
- Text centrism (great command line preference), value on portability (even over performance), sharing of source code, freedom of information and openness, connection to hacker culture, valuing human time over machine time, ...
- ...
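The first two points above (pipes and everything being a file) can be tried out with a small sketch like the following one. It only assumes a POSIX-like shell with the usual utilities and the `/dev/null` device file; the variable names are made up for illustration.

```shell
# Three small programs chained with pipes: printf produces lines,
# sort -u sorts them and drops duplicates, wc -l counts what is left
# (tr only strips the padding spaces some wc versions print).
unique_count=$(printf 'pear\napple\npear\n' | sort -u | wc -l | tr -d ' ')
echo "unique lines: $unique_count"

# "Everything is a file": /dev/null is a special device file that
# throws away whatever is written to it and reads as empty.
echo "this text disappears" > /dev/null
null_size=$(wc -c < /dev/null | tr -d ' ')
echo "bytes read from /dev/null: $null_size"
```

None of the three programs in the pipeline knows about the others, each just reads text and writes text, which is the whole point.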
Unix is greatly connected to software minimalism, however most unices are still not minimalist to the absolute extreme and many unix forks (e.g. GNU/Linux) just abandon minimalism as a priority. So the question stands: is Unix LRS or is it too bloated? The answer to this will be similar to our stance towards the C language (which itself was developed alongside Unix); from our point of view Unix -- i.e. its concepts and some of their existing implementations -- is relatively good, there is a lot of wisdom to take away (e.g. "do one thing well", modularity, "use text interfaces", ...), however these are intermixed with things which under stricter minimalism we may want to abandon (e.g. multiple users, file permissions and ownership, also "everything is a file" requires we buy into the file abstraction and will often also imply existence of a file system etc., which may be unnecessary, even multitasking could be dropped), so in some ways we see Unix as a temporary "least evil" tool on our way to truly good, extremely minimalist technology. DuskOS is an example of an operating system closer to the final idea of LRS. But for now Unix is very cool, some Unix-like systems are definitely a good choice nowadays.
There is a semi humorous group called the UNIX HATERS that has a mailing list and a whole book that criticizes Unix, arguing that the systems that came before it were much better -- though it's mostly just joking, they make some good points sometimes. It's like they are the biggest boomers for whom Unix is what Windows is to the Unix people.
History
In the 1960s, Bell Labs along with other groups were developing Multics, a kind of operating system -- however the project failed and was abandoned for its complexity and cost of development. In 1969 two Multics developers, Ken Thompson and Dennis Ritchie, then started to create their own system, this time with a different philosophy: that of simplicity (see Unix philosophy). They weren't alone in developing the system, a number of other hackers helped program such things as a file system, shell and simple utility programs. At VCF East 2019 Thompson said that they developed Unix as a working system in three weeks. At this point Unix was written in assembly.
In the early 1970s the system got funding as well as its name Unix (a pun on Multics). By now Thompson and Ritchie were developing a new language for Unix which would eventually become the C language. In version 4 (1973) Unix was rewritten in C.
Unix then started being sold commercially. This led to its fragmentation into different versions such as BSD or Solaris. In 1983 a version called System V was released which would become one of the most successful. The fragmentation and the lack of a unified standard led to the so called Unix Wars in the late 1980s, which led to a few Unix standards such as POSIX and the Single Unix Specification.
For zoomers and other noobs: Unix wasn't like Windows, it was more like DOS, things were done in text interface only (even a TUI or just colorful text was a luxury) -- if you use the command line in "Linux" nowadays, you'll get an idea of what it was like, except it was all even more primitive. Things we take for granted such as a mouse, copy-pastes, interactive text editors, having multiple user accounts or running multiple programs at once were either non-existent or advanced features in the early days. There weren't even personal computers back then, people accessed shared computers over terminals. Anything these guys did you have to see as done with stone tools -- they didn't have GPUs, gigahertz CPUs, gigabytes of RAM, scripting languages like Python or JavaScript, Google, Stack Overflow, wifi, mice, IDEs, multiple HD screens all around, none of that -- and yet they programmed faster, less buggy software that was much more efficient. If this doesn't make you think, then probably nothing will.
How To For Noobs
UNDER CONSTRUCTION
Note: here by "Unix" we will more or less assume a system conforming to some version of the POSIX standard.
This section will help complete noobs kickstart their journey with a Unix-like system such as GNU/Linux or BSD. Please be aware that each system has its additional specifics, for example package managers, init systems and so on -- these you must learn about elsewhere as here we may only cover the core parts those systems inherited from the original Unix. Having learned this though you should be able to somewhat fly any Unix-like system. Obviously we'll be making some simplifications here too.
Learning to use Unix practically means learning the command line plus a few extra things (such as various concepts, philosophies, conventions, file system structure etc.). Your system will have a way for you to enter the command line where you can interact with it only through textual commands (i.e. without GUI). Sometimes the system boots up to command line, sometimes you must click some icon (usually called terminal, term, shell, command line etc.), sometimes you can switch TTYs with CTRL+ALT+Fkeys etc. To command line virgins this will seem a little intimidating but it's absolutely required to know at least the basics, on Unices the command line is extremely powerful, efficient and much can only ever be achieved through command line.
The gist: unsurprisingly in command line you write commands -- many of these are actually tiny programs called Unix utilities (or just "utils"). These are tools for you to do whatever you want (including stuff that on normie systems is usually done by clicking with a mouse). For example `ls` is a program that writes out a list of files in the working directory, `cd` is a program that changes working directory etc. There are many more such programs and you must learn at least the most commonly used ones. Good news is that these programs are more or less the same on every Unix system so you just learn this once. There also exist other kinds of commands -- those defined by the shell language (shell is basically a fancy word for the textual interface), which allow us to combine the utilities together and even program the shell (we call this scripting). First learn the utils (see the list below).
PRO TIP: convenient features are often implemented, most useful ones include going through the history of previously typed commands with UP/DOWN keys and completing commands with the TAB key, which you'll find yourself using very frequently. Try it. It's enough to type just first few letters and then press tab, the command will be completed (at least as much as can be guessed).
You run a utility simply by writing its name, for example typing `ls` will show you a list of files in your current directory. Very important is the `man` command that shows you a manual page for another command, e.g. typing `man ls` should display a page explaining the `ls` utility in detail. Short help for a utility can also often be obtained by writing `-h` or `--help` after it, for example `grep --help` on systems whose `grep` supports it (beware that `-h` doesn't mean help everywhere, for `grep` it means something else, so when in doubt consult the manual page).
Unix utilities (and other programs) can also be invoked with arguments that specify more detail about what should be done. Arguments are written after the utility name and are separated by spaces (if the argument itself should contain a space, it must be enclosed between double quotes, e.g.: `"abc def"` is a single argument containing a space, but `abc def` are two arguments). For example the `cd` (change directory) utility must be given the name of a directory to go to, e.g. `cd mydirectory`.
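To see how the shell splits arguments, here is a minimal sketch. The function `count_args` is a hypothetical helper defined just for this demonstration (it is not a standard utility), it simply reports how many arguments it received.

```shell
# count_args is a made-up helper: it prints the number of arguments it got.
count_args() {
    echo "$#"
}

# Without quotes the space separates two arguments.
unquoted=$(count_args abc def)
# With double quotes the space becomes part of a single argument.
quoted=$(count_args "abc def")

echo "abc def     -> $unquoted arguments"
echo "\"abc def\"   -> $quoted argument"
```

The same rule applies to file names containing spaces: `cd my directory` passes two arguments to `cd`, while `cd "my directory"` passes one.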
Some arguments start with one or two minus characters (`-`), for example `-h` or `--help`. These are usually called flags and serve either to turn something on/off or to name other parameters. For example many utilities accept a `-s` flag which means "silent" and tells the utility to shut up and not write anything out. A flag oftentimes has a short and long form (the long one starting with two minus characters), so `-s` and `--silent` are the same thing. The other type of flag says what kind of argument the following argument is going to be -- for example a common one is `--output` (or `-o`) with which we specify the name of the output file, so for instance running a C compiler may look like `c99 -o myprogram mysourcecode.c` (we tell the compiler to name the final program "myprogram"; POSIX only guarantees the short `-o` form here, long forms such as `--output` are extensions of specific compilers). Short flags can usually be combined like so: instead of `-a -b -c` we can write just `-abc`. Flags accepted by utilities along with their meaning are documented in the manual pages (see above).
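Here is a small sketch of flags in action, using only `cmp` and `ls` with flags that POSIX defines. It assumes `mktemp -d` is available for creating a scratch directory (very common, but historically not part of POSIX); the file names are arbitrary.

```shell
dir=$(mktemp -d)                 # scratch directory so we don't touch real files
printf 'hello\n' > "$dir/a.txt"
printf 'hello\n' > "$dir/b.txt"

# -s makes cmp silent: it prints nothing and only reports through its exit status.
if cmp -s "$dir/a.txt" "$dir/b.txt"; then
    same=yes
else
    same=no
fi
echo "files identical: $same"

# Short flags can be combined: -a -l and -al ask for the same thing.
separate=$(ls -a -l "$dir")
combined=$(ls -al "$dir")
if [ "$separate" = "$combined" ]; then
    echo "ls -a -l and ls -al printed the same listing"
fi

rm -r "$dir"                     # clean up the scratch directory
```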
To run a program that's present in the current directory as a file you can't just write its name (like you could e.g. in DOS), it MUST be prefixed with `./` (shorthand for current directory), otherwise the shell thinks you're trying to run an INSTALLED program, i.e. it will be looking for the program in a directory where programs are installed. For example having a program named "myprogram" in current directory it will be run with `./myprogram`. Also note that to be able to run a file as a program it must have the executable mode set, which is done with `chmod +x myprogram` (you may have to do this if you e.g. download the program from the Internet). Programs can also take arguments just like we saw with the built-in utilities, so you can run a program like `./myprogram abc def --myflag`.
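The whole procedure can be sketched like this. Here "myprogram" is just a two line shell script created on the spot for the demonstration, and we assume `mktemp -d` exists and that the temporary directory allows executing files (which is the usual case).

```shell
start_dir=$(pwd)
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create a tiny program: a shell script that reports its arguments.
printf '#!/bin/sh\necho "got $# arguments: $*"\n' > myprogram

# Without ./ the shell searches only the installed programs and finds nothing.
if ! myprogram 2>/dev/null; then
    echo "plain 'myprogram' did not run"
fi

chmod +x myprogram                       # set the executable mode
output=$(./myprogram abc def --myflag)   # run it from the current directory
echo "$output"

cd "$start_dir" || exit 1
rm -r "$dir"
```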
Now to the very basic stuff: browsing directories, moving and deleting files etc. This is done with the following utils: `ls` (prints files in current directory), `pwd` (prints path to current directory), `cd` (travels to given directory, `cd ..` travels back), `cat` (outputs content of given file), `mkdir` (creates directory), `rm` (removes given file; to remove a directory the `-r` flag must be present, often seen together with `-f` as `rm -rf`), `cp` (copies file), `mv` (moves file, including directory -- note that moving also serves for renaming). As an exercise try these out (careful with `rm -rf`) and read manual pages of the commands (you'll find that `ls` can also tell you for example the file sizes and so on).
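A possible practice session with these utilities might look as follows. Everything happens inside a throwaway directory (created with `mktemp -d`, assumed to be available) so that nothing important can get deleted; the names `notes`, `todo.txt` etc. are arbitrary.

```shell
start_dir=$(pwd)
dir=$(mktemp -d)
cd "$dir" || exit 1
pwd                                     # shows where we are now

mkdir notes                             # create a directory
printf 'first line\n' > notes/todo.txt  # create a file with some text in it
content=$(cat notes/todo.txt)           # read the file back
echo "$content"

cp notes/todo.txt notes/copy.txt        # copy the file
mv notes/copy.txt notes/backup.txt      # moving within a directory = renaming
listing=$(ls notes)
echo "$listing"

cd notes || exit 1                      # go into the directory ...
cd ..                                   # ... and back up again

rm notes/backup.txt                     # remove a single file
rm -r notes                             # remove a directory with all its content
remaining=$(ls)                         # nothing should be left here now

cd "$start_dir" || exit 1
rm -r "$dir"
```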
Files and file system: On Unices the whole filesystem hierarchy starts with a directory called just `/` (the root directory), i.e. every absolute (full) path will always start with slash. For example pictures belonging to the user john may live under `/home/john/pictures`. It's also possible to use relative paths, i.e. ones that are considered to start in the current (working) directory. A dot (`.`) stands for current directory and two dots (`..`) for the directory "above" the current one. I.e. if our current directory is `/home/john`, we can list the pictures with `ls pictures` as well as `ls /home/john/pictures` or `ls ./pictures`. Absolute and relative paths are distinguished by the fact that the absolute one always starts with `/` while relative ones don't. There are several types of files, most importantly regular files (the "normal" files) and directories (there are more, such as symbolic links, sockets, block special files etc., but for now we'll be ignoring these). Unix has a paradigm stating that everything's a file, so notably accessing e.g. hardware devices is done by accessing special device files (placed in `/dev`).
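The equivalence of the different ways of writing a path can be checked with a sketch like this one. Since we can't assume a user john exists, a temporary directory (from `mktemp -d`, assumed available) stands in for `/home/john`, and `cat.png` is just an empty placeholder file.

```shell
start_dir=$(pwd)
home=$(mktemp -d)                  # stands in for /home/john
mkdir "$home/pictures"
: > "$home/pictures/cat.png"       # create an empty placeholder file

cd "$home" || exit 1
here=$(pwd)

relative=$(ls pictures)            # relative path
dotted=$(ls ./pictures)            # relative path with explicit current directory
absolute=$(ls "$home/pictures")    # absolute path, starts with /
echo "$relative / $dotted / $absolute"

cd pictures || exit 1
parent=$(cd .. && pwd)             # .. leads back to where we came from
echo "$parent"

cd "$start_dir" || exit 1
rm -r "$home"
```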
Files additionally have attributes, importantly so called permissions -- unfortunately these are a bit complicated, but as a mere user working with your own files you won't have to deal too much with them, only remember if you encounter issues with accessing files, it's likely due to this. In short: each file has an owner and then also a set of permissions that say who's allowed to do what with the file. There are three kinds of permissions: read (`r`), write (`w`) and execute (`x`), and ALL THREE are defined for the file's owner, for the file's group and for everyone else, plus there is a magical value suid/sgid/sticky we won't delve into. All of this is then usually written either as an octal number of 3 or 4 digits (each of the last three digits expresses the three permission bits for owner, group and others, an optional leading digit holds the suid/sgid/sticky bits) or as a string of 9 characters (containing the `r`/`w`/`x`/`-` characters, as shown e.g. by `ls -l`). Well, let's not dig much deeper now.
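For a quick feel of how permissions look in practice, here is a minimal sketch. It relies on `ls -l` starting each line with one character for the file type followed by nine permission characters, which we cut out with `cut`; the file name is arbitrary and `mktemp -d` is assumed to be available.

```shell
dir=$(mktemp -d)
file="$dir/secret.txt"
printf 'private\n' > "$file"

# Octal form: 6 = rw- for the owner, 4 = r-- for the group, 0 = --- for others.
chmod 640 "$file"
mode_octal=$(ls -l "$file" | cut -c1-10)
echo "$mode_octal"      # -rw-r-----

# Symbolic form: add the execute permission for the owner (u = user) only.
chmod u+x "$file"
mode_symbolic=$(ls -l "$file" | cut -c1-10)
echo "$mode_symbolic"   # -rwxr-----

rm -r "$dir"
```

The leading `-` says the file is a regular file, a directory would show `d` there.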
TODO: more more more
Here is a quick cheatsheet of the most common Unix utilities:
name | function | possible arguments (just some) |
---|---|---|
alias | create or display alias (nickname for another command) | name=command |
awk | text processing language (advanced) | |
bc | interactive calculator | |
c99 | C language compiler (advanced) | file, -o (output file) |
cd | change directory | directory name (.. means back) |
chmod | change file mode | +x (execute), +w (write), +r (read), file |
cmp | compare files | -s (silent), file1, file2 |
cp | copy files | -r (recursive, for dirs), file, newfile |
date | write date and/or time | format |
df | report free space on disk | -k (use KiB units) |
du | estimate size of file (useful for directories) | -k (use KiB units), -s (only total), file |
echo | write out string (usually for scripts) | |
ed | ed is the standard text editor | |
expr | evaluate expression (simple calculator) | expression (as separate arguments) |
false | return false value | |
grep | search for pattern in file | pattern, file, -i (case insensitive) |
head | show first N lines of a file | -n (count), file |
kill | terminate process or send a signal to it | processid, -9 (kill), -15 (terminate) |
ls | list directory (shows files in current dir.) | -s (show file sizes in blocks) |
man | show manual page for topic | topic |
mkdir | make directory | name |
mv | move (rename) file | -i (ask for rewrite), file, newfile |
pwd | print working directory | |
rm | remove files | -r (recursive, for dirs), -f (force) |
sed | stream editing util (replacing text etc.), see also regex | script, file |
sh | shell (the command line interpreter, usually for scripting) | -c (command string) |
sort | sort lines in file | -r (reverse), -u (unique), file |
tail | show last N lines of a file | -n (count), file |
true | return true value | |
uname | output system name and info | -a (all, output everything) |
vi | advanced text editor | |
wc | word count (count characters or lines in file, can tell exact file size) | -c (character), -l (lines), file |
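A few of the utilities from the table can be tried on a small text file like this. It's only a sketch: the file `fruits.txt` and its content are made up, and `mktemp -d` is assumed to be available for the scratch directory.

```shell
dir=$(mktemp -d)
printf 'banana\nApple\ncherry\napple pie\n' > "$dir/fruits.txt"

matches=$(grep -i apple "$dir/fruits.txt")   # lines containing "apple", any letter case
first=$(head -n 1 "$dir/fruits.txt")         # first line of the file
last=$(tail -n 1 "$dir/fruits.txt")          # last line of the file
product=$(expr 6 '*' 7)                      # * is quoted so the shell leaves it alone

echo "$matches"
echo "first: $first, last: $last"
echo "6 * 7 = $product"

sort -r "$dir/fruits.txt"                    # lines in reverse sorted order
wc -l "$dir/fruits.txt"                      # number of lines followed by the file name

rm -r "$dir"
```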
NOTES on the utilities:
- Typically there are two ways of feeding input data to a utility: either by specifying a file to read from or by feeding the input to the utility's standard input. This also applies to the output. Using standard input/output is a more "Unix" way as it allows us to chain the utilities with pipes, making one program feed its output to another as input.
- Utilities try to follow common conventions so that it's easier to guess and remember what flags mean etc., for example `-h` is commonly a flag for getting help, `-o` is one for specifying output file etc.
- Specific Unix systems will normally have more feature rich utilities, supporting additional flags and even adding new utilities. Check out manual pages on your system. You'll have to learn about common utils that aren't part of POSIX, e.g. wget, ssh, curl, sudo, apt and more.
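The first note can be illustrated with a short sketch: the same utility gives the same result whether the data comes from a file named as an argument or through standard input. The file `lines.txt` is made up for the example and `mktemp -d` is assumed to be available.

```shell
dir=$(mktemp -d)
printf 'two\nthree\none\n' > "$dir/lines.txt"

# Way 1: the input file is named as an argument.
from_file=$(sort "$dir/lines.txt")

# Way 2: the data arrives on standard input, here from cat through a pipe.
from_pipe=$(cat "$dir/lines.txt" | sort)

# Because sort writes to standard output, another utility can be chained after it.
smallest=$(sort "$dir/lines.txt" | head -n 1)

echo "$from_file"
echo "first after sorting: $smallest"

rm -r "$dir"
```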
Now on to a key feature of Unix: pipelines and redirects. Processes (running programs) on Unix have so called standard input (stdin) and standard output (stdout) -- these are streams of data (often textual but also binary) that the process takes on input and output respectively. There may also exist more streams (notably e.g. standard error output) but again, we'll ignore this now. When you run a program (utility etc.) in the command line, standard input will normally come from your keyboard and standard output will be connected to the terminal (i.e. you'll see it being written out in the command line). However sometimes you may want the program to take input from a file and/or to write its output to a file (imagine e.g. keeping logs), or you may even want one program to feed its output as an input to another program! This is very powerful as you may combine the many small utilities into more powerful units. See also Unix philosophy.
Most commonly used redirections are done like this:
- `command > file`: redirects output of `command` to the file `file` (rewriting its content if there is any).
- `command < file`: redirects input of `command` to come from `file`.
- `command >> file`: output of `command` will be appended (added to the end) to `file`.
Example:
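The following is a minimal sketch of the three redirections listed above, written for a POSIX-like shell. The log file name is arbitrary and `mktemp -d` is assumed to be available for the scratch directory.

```shell
dir=$(mktemp -d)
log="$dir/log.txt"

echo "first run" > "$log"      # > creates the file (or rewrites it)
echo "second run" >> "$log"    # >> appends to the end
echo "fresh start" > "$log"    # > again: the previous two lines are gone
echo "again" >> "$log"

# < makes the command read the file as its standard input.
line_count=$(wc -l < "$log" | tr -d ' ')
upper=$(tr '[:lower:]' '[:upper:]' < "$log")

echo "lines in log: $line_count"
echo "$upper"

rm -r "$dir"
```

Note the difference between `>` and `>>`: mixing them up is a classic way of losing a log file.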
TODO: stdin/out/err, utils, shell, sh (running programs, ...), usual "workflows" (man pages, history, arrows, tab-completion, ...), often used commands, examples, permissions, variables and exit codes, wildcards