C

{ We have a C tutorial! ~drummyfish }

C is a low-level, statically typed, imperative, compiled programming language, the go-to language of most less bloated software. It is the strongly preferred language of the suckless community as well as of most true experts, for example the Linux and OpenBSD developers, because of its good minimal design, level of control, uncontested performance and its well established and tested status.

C is usually not considered an easy language to learn because of its low-level nature: it requires a good understanding of how a computer actually works and doesn't prevent the programmer from shooting himself in the foot. The programmer is given full control (and therefore responsibility). There are things considered "tricky" which one must be aware of, such as the undefined behavior of certain operators and raw pointers. This can discourage a lot of modern "coding monkeys" from choosing C, but it's also what allows such great performance: undefined behavior lets the compiler choose the most efficient implementation.

History and Context

C was developed in 1972 at Bell Labs alongside the Unix operating system by Dennis Ritchie and Brian Kernighan, as a successor to the B language (a portable language with recursion) written by Dennis Ritchie and Ken Thompson, which was in turn inspired by the ALGOL language (code blocks, lexical scope, ...).

In 1973 Unix was rewritten in C. In 1978 Kernighan and Ritchie published a book called The C Programming Language, known as K&R, which became something akin to the C specification. In 1989 the ANSI C standard, also known as C89, was released by the American ANSI. The same standard was adopted a year later by the international ISO, so C90 refers to the same language. In 1999 ISO issued a new standard known as C99.

TODO

Standards

C is not a single language: there have been several standards over the years since its inception in the 1970s. The notable standards and versions are:

  • K&R C: C as described by its inventors in the book The C Programming Language, before official standardization. This is kind of too ancient nowadays.
  • C89/C90 (ANSI/ISO C): First fully standardized version, usable even today. Many hardcore C programmers stick to this version so as to enjoy maximum compiler support.
  • C95: A minor update of the previous standard, adds wide character support.
  • C99: Updated standard from the year 1999 striking a great balance between "modern" and "good old". This is a good version to use in LRS programs, but will be a little less supported than C89, even though still very well supported.
  • C11: Updated standard from the year 2011. This one is too bloated and isn't worth using.
  • C17/C18: Yet another update, even more bloated and not worth using either.
  • ...

LRS should use C99 or C89 as the newer versions are considered bloat and don't have such great support in compilers, making them less portable and therefore less free.

The standards of C99 and older are considered pretty future-proof and using them will help your program be future-proof as well. This is to a high degree due to C having been established and tested better than any other language; it is one of the oldest languages and a majority of the most essential software is written in C. A C compiler is one of the very first things a new hardware platform needs to implement, so C compilers will always be around, at least for historical reasons. C has also been very well designed in a relatively minimal fashion, before the advent of modern feature-creep and bullshit such as OOP which cripples almost all "modern" languages.

Compilers

  • gcc: the main "big name" that can compile all kinds of languages including C, used by default in many places, very bloated
  • clang: another big bloated compiler, kind of competes with gcc
  • tcc: tiny C compiler, suckless, cannot optimize as well as the big compilers but is pretty elegant
  • scc: another small/suckless C compiler
  • 8c, 8cc
  • ...

Standard Library

The standard library (libc) is a subject of lively debate: while its interface and behavior are given by the C standard, its implementation is up to each compiler. Since the standard library is so commonly used, we should take great care to ensure it's extremely well written. As you probably guessed, the popular implementations (glibc et al.) are bloat. Better alternatives thankfully exist.

Bad Things About C

C isn't perfect. It was one of the first relatively higher level languages and even though it has proven to be designed extremely well, some things didn't age well or were simply bad from the start. We still prefer this language as usually the best choice, but it's good to be aware of its downsides or smaller issues, if only for the sake of one day designing a better version of C. So, let's go:

  • The C specification (the ISO standard) is proprietary :( The language itself probably can't be copyrighted, nevertheless this may change in the future, and a proprietary spec lowers C's accessibility and moddability (you can't make derivative versions of the spec).
  • The specification is also long as fuck. A good, free language should have a simple definition. It could be simplified a lot by simplifying the language itself as well as dropping some truly legacy considerations (like BCD systems?) and removing a lot of undefined behavior.
  • Some behavior is weird and has exceptions, for example a function can return anything, including a struct, except for an array. This makes it awkward to e.g. implement vectors, which would best be made as arrays but which you want functions to return, so you may resort to hacks like wrapping them inside a struct just for this.
  • Some things could be made simpler, e.g. using reverse polish notation for expressions, rather than expressions with brackets and operator precedence, would make implementations much simpler, increasing sucklessness.
  • Some things like enums could be dropped entirely, they can easily be replaced with macros.
  • The preprocessor isn't exactly elegant, it has completely different syntax and rules from the main language, not very suckless.
  • The syntax isn't perfect, e.g. it's pretty weird that the condition after if has to be in brackets, it could be designed better. Keywords also might be better as single chars, like ? instead of if, and the weird long ass names with spaces like unsigned long long could be made nicer. A shorter, natural-language-neutral source code would probably be better. Both line and block comments could be implemented with a single character, e.g. # which would end either with a newline or another #.
  • Some basic things that are part of the standard library or extensions, like fixed width types and binary literals, could be part of the language itself.
  • TODO: moar

Basics

This is a quick overview, for a more in depth tutorial see C tutorial.

A simple program in C that writes "welcome to C" looks like this:

#include <stdio.h> // standard I/O library

int main(void)
{
  // this is the main program
    
  puts("welcome to C");

  return 0; // end with success
}

You can simply paste this code into a file named e.g. program.c and then compile the program from the command line like this:

gcc -o program program.c

Then if you run the program from the command line (./program on Unix-like systems) you should see the message.

Cheatsheet

It's pretty important you learn C, so here's a little cheat sheet for you.

data types (just some):

  • int: signed integer, at least 16 bits (-32767 to 32767) but usually more
  • unsigned int: unsigned integer, at least 16 bits (0 to 65535) but usually more
  • char: smallest integer type, at least 8 bits (1 byte, at least 256 values), used among other things for holding ASCII characters; whether plain char is signed depends on the platform
  • unsigned char: like char but unsigned (0 to 255)
  • float: floating point number (usually 32 bit)
  • double: like float but higher precision (usually 64 bit)
  • short: smaller signed integer, at least 16 bits (-32767 to 32767)
  • long: bigger signed integer, at least 32 bits (-2147483647 to 2147483647)
  • pointer: memory address (size depends on platform), always tied to a specific type, e.g. a pointer to integer: int *, pointer to double: double * etc.
  • array: a sequence of values of some type, e.g. an array of 10 integers: int myArray[10]
  • struct: structure of values of different types, e.g. struct myStruct { int myInt; char myChar; };
  • note: header stdint.h contains fixed-width data types such as uint32_t etc.
  • note: there is no string, a string is an array of chars which must end with a value 0 (string terminator)
  • note: there is no built-in bool (C99 provides one in the header stdbool.h), integers are used instead (0 = false, anything else = true)

branching aka if-then-else:

if (CONDITION)
{
  // do something here
}
else // optional
{
  // do something else here
}

for loop (repeat given number of times):

for (int i = 0; i < MAX; ++i)
{
  // do something here, you can use i
}

while loop (repeat while CONDITION holds):

while (CONDITION)
{
  // do something here
}

do while loop (same as while but CONDITION at the end):

do
{
  // do something here
} while (CONDITION);

function definition:

RETURN_TYPE myFunction (TYPE1 param1, TYPE2 param2, ...)
{ // return type can be void
  // do something here
}

See Also