less_retarded_wiki/oop.md
2024-06-08 16:41:14 +02:00

Object-Oriented Programming

"I invented the term 'object oriented' and C++ was not what I had in mind" --Alan Kay, inventor of OOP

Object-oriented programming (OOP, also object-obsessed programming, objectfuscated programming, capital-oriented programming or artificial inelegance) is a programming paradigm that tries to model reality as a collection of abstract objects that communicate with each other and obey some specific rules. While the idea itself isn't bad and can be useful in certain cases, and while pure OOP in very old languages like Smalltalk may even have been quite elegant, after its later adoption by capitalist businesses the concept has been twisted and degenerated to unbelievable levels -- OOP has become extremely overused, extremely badly implemented and downright forced in programming languages that nowadays try to apply this abstraction to every single program and concept, creating anti-patterns, unnecessary issues and of course huge amounts of bloat. We therefore see the OOP of today as a cancer of programming. OOP was basically a software development fashion wave that scarred the industry for decades and poisoned the minds of several generations. Nowadays, despite OOP still keeping many fans, the critical stance towards it isn't even controversial anymore; many have voiced the criticism over and over, usually the most competent programmers like Richard Stallman and Linus Torvalds and groups like suckless and bitreich. Ugly examples of OOP gone bad include Java and C++ (which at least doesn't force it). Other languages such as Python and JavaScript include OOP but have lightened it up a bit and at least allow you to avoid using it. You should probably learn OOP, but only to see why it's bad (and to actually understand 99% of code written nowadays). Stop objectifying programs!

A real life analogy to give a bit of high level overview: the original Smalltalk style OOP was kind of like when society invented democracy -- a simple idea which everyone understands (we are 10 cavemen, let's just vote on stuff mkay?) that's many times useful and works well, e.g. on a scale of a village or a small city. Then cities grew bigger (just as software did), into states and empires and the idea kept getting more and more complicated -- people just wanted to keep the democracy, apply it to everything and scale it indefinitely, but for that they had to add more complexity, they implemented representatives, parliaments, senates, presidents, vice presidents, ministers, judges, more and more bureaucracy, hybrid ideas (free market, controlled economy, ...), corruption and inefficiencies crept in, the system degenerated into what we have today -- a hugely expensive paperworking machine that's exploited and hacked, with laws so complicated no one really understands them, with magic, randomness and unpredictability, producing just waste and bullshit, crumbling under its own weight. This is also the way OOP went -- they started inventing static classes/methods, abstract classes/methods, multiple inheritances, interfaces, design patterns, overriding, hybrid paradigms and so on until we ended up with ugly abominations on which today's technology stands. Now a few things have to be noted. Firstly these abominations are a disaster, they came from our mistake of taking the original simple idea (simple small scale voting democracy) and saying "let's make this the only thing in the world and let's scale it a million times!" Such an idea is stupid from the start and there is no doubt about that. However another evil is that people are taught to do everything this way -- today's programmers will use the mainstream OOP everywhere, even in simple programs, they don't even think about whether they should, they are simply taught "always use this". This is like in real life wanting to govern a family by having elections each year to vote for the head of the family, then having members of the family vote for other members of the family to be their representatives that will talk for them (the same kind of craziness as wanting to strictly respect encapsulation even in trivial programs), then if someone wants to buy anything he has to ask for a budget several months in advance and have others vote on it while an elected anti-corruption committee is watching, etc. This kind of insanity is what's normal in software nowadays. Now a sane discussion can only be had about the usefulness and scope of the original, simple idea (simple voting in small groups, simple pure OOP) and here we say that it may be good, but only applied to just some specific situations, i.e. we say simple OOP is good for some problems but not all, just like voting is a good solution to some problems (e.g. a group of friends deciding where to go party), but not all (e.g. passengers in a car voting on which way to steer and which pedals to press).

Principles

Bear in mind that OOP doesn't have a single, crystal clear definition. It takes many forms and mutations depending on language and it is practically always combined with other paradigms such as the imperative paradigm, so things may be fuzzy.

Generally OOP programs solve problems by having objects that communicate with each other. Every object is specialized to do some thing, e.g. one handles drawing text, another one handles caching, another one handles rendering of pictures etc. Every object has its data (e.g. a human object has weight, race etc.) and methods (object's own functions, e.g. human may provide methods getHeight, drinkBeer or petCat). Objects may send messages to each other: e.g. a human object sends a message to another human object to get his name (in practice this means the first object calls a method of the other object just like we call functions, e.g.: human2.getName()).

Now many OO languages use so called class OOP. In these we define object classes, similarly to defining data types. A class is a "template" for an object, it defines methods and types of data to hold. Any object we create is then based on some class (e.g. we create the objects alice and bob of class Human, just as normally we create a variable x of type int). We say an object is an instance of a class, i.e. an object is a real manifestation of what a class describes, with specific data etc.

The more "lightweight" type of OOP is called classless OOP which is usually based on having so called prototype objects instead of classes. In these languages we can simply create objects without classes and then assign them properties and methods dynamically at runtime. Here instead of creating a Human class we rather create a prototype object that serves as a template for other objects. To create specific humans we clone the prototype human and modify the clone.

OOP furthermore comes with some basic principles such as:

  • encapsulation: Objects should NOT be able to access other objects' data directly -- they may only use the other objects' methods. For example an object shouldn't be able to access the height attribute of a Human object directly, it should only be able to access it via methods of that object such as getHeight. (This leads to the setter/getter antipattern.)
  • polymorphism: Different objects (e.g. of different classes) may have methods with the same name which behave differently for each object and we may just call that method without caring what kind of object that is (the correct implementation gets chosen at runtime). E.g. objects of both Human and Bomb classes may have a method setOnFire, which with the former will kill the human and with the latter will cause an explosion killing many humans. This is good e.g. in a case when we have an array of GUI components and want to perform e.g. resize on every one of them: we simply iterate over the whole array and call the method resize on each object without caring whether the object is a button, checkbox or a window.
  • inheritance: In class OOP classes form a hierarchy in which parent classes can have child classes, e.g. a class LivingBeing will have Human and Animal subclasses. Subclasses inherit stuff from the parent class and may add some more. However this leads to other antipatterns such as the diamond problem. Inheritance is nowadays regarded as bad even by normies and is being replaced by composition.

Why It's Shit

Just a brief summary of why the mainstream OOP is a fail:

  • OOP is just a bad abstraction for many problems that by their nature aren't object-oriented. OOP is not a silver bullet, yet it tries to behave as one. The greatest issue of OOP is that it's trying to solve everything. For example it forces the idea that data and algorithms should always come together, but that's simply a stupid statement in general, there is no justification for it, some data is simply data and some algorithms are simply algorithms. You may ask what else to use instead of OOP then -- see the section below.
  • For simple programs (which most programs should be) such as many Unix utilities OOP is simply completely unnecessary.
  • It is in conflict with Unix philosophy -- Unix philosophy advocates for making small programs that do one task well and for these, as mentioned above, OOP is more of a burden. "Doing one thing well" is also pretty much the definition of an object in OOP, and here the two paradigms clash -- if we adopt Unix philosophy, any program should basically be just a single object, negating the whole purpose of OOP. To truly make use of OOP we have to accept that a program will consist of multiple objects, i.e. that it will do several things at once -- in other words OOP advocates for creating monolithic programs (bloat).
  • OOP languages make you battle artificial restrictions rather than focus on solving the problem at hand.
  • A great number of the supposed "features" and design patterns (setters/getters, singletons, inheritance, ...) turned out to actually be antipatterns and burdens -- this isn't a controversial statement, even OOP proponents usually agree with this, they just try to somehow document and dodge all the traps.
  • OOP as any higher abstraction very often comes with overhead, memory footprints and performance loss (bloat) as well as more complex compilers, language specifications, more dependencies, magic etc.
  • The relatively elegant idea of pure OOP didn't catch on and the practically used OOP languages are abomination hybrids of imperative and OOP paradigms that just take more head space, create friction and unnecessary issues to solve. Sane languages now allow the choice to use OOP fully, partially or avoid it completely, which leads to a two-in-one overcomplication.
  • The naive idea of OOP that the real world is composed of nicely defined objects such as Humans and Trees also turned out to be completely off, we instead see shit like AbstractIntVisitorShitFactory etc. Everyone who has ever tried to make some kind of categorization knows it's usually asking for trouble: categories greatly overlap, have unclear borders, multiple parents etc.
  • The idea that OOP would lead to code reusability also completely failed, it's simply not the case at all, implementation code of specific classes is typically burdened with internal and external dependencies just like any other bloated code. OOPers believed that their paradigm would create a world full of reusable blackboxes, but that wasn't the case, OOP is neither necessary for blackboxing, nor has the practice shown it would contribute to it -- quite on the contrary, e.g. simple imperative header-only C libraries are much more reusable than those we find in the OOP world.
  • Good programmers don't need OOP because they know how to program -- OOP doesn't invent anything, it is merely a way of trying to force good programming mostly on incompetent programmers hired in companies, to prevent them from doing damage (e.g. with encapsulation). However this of course doesn't work, a shit programmer will always program shit, he will find his way to fuck up despite any obstacles and if you invent obstacles good enough for stopping him from fucking up, you'll also stop him from being able to program something that works well as you tie his hands. Yes, good programmers write shit buggy code too, but that's more of a symptom of bad, overcomplicated bloated capitalist design of technology that's just asking for bugs and errors -- here OOP is trying to cure symptoms of an inherently wrong direction, it is not addressing the root cause.
  • OOP just mostly repeats what other things like modules already do.
  • If you want to program in object-oriented way and have a good justification for it, you don't need an OOP language anyway, you can emulate all aspects of OOP in simple languages like C. So instead of building the idea into the language itself and dragging it along forever and everywhere, it would be better to have optional OOP libraries.
  • It generalizes and simplifies programming into a few rules of thumb such as encapsulation, again for the sake of inexperienced noobs. However there are no simple rules for how to program well, good programming requires a huge amount of experience and, as in any art, a good programmer knows when breaking the general rules is good. OOP doesn't let good programmers do this, it preaches things like "global variables bad", which is just too oversimplified and hurts good programming.
  • ...

Pure OOP (The "Legit" But Unused Kind Of OOP)

TODO

Similarly to how functional languages are based on some very simple mathematical system such as lambda calculus, pure object oriented languages have a similar thing, most notably the sigma calculus (defined in the paper called A Theory Of Primitive Objects by Abadi and Cardelli).

So Which Paradigm To Use Instead Of OOP?

After many people realized OOP is kind of shit, there has been a boom of "OOP alternatives" such as functional, traits, agent oriented programming, all kinds of "lightweight"/optional OOP etc etc. Which one to use?

In short: NONE, by default use the imperative paradigm (also here many times interchangeably called "procedural"). Remember this isn't to say you shouldn't ever apply a different paradigm, but imperative should be the default, most prevalent and suitable one to use in solving most problems. There is no need to invent anything new to "beat" OOP.

But why imperative? Why can't we simply improve OOP or come up with something ultra genius to replace it with? Why do we say OOP is bad because it's forced and now we are forcing the imperative paradigm? The answer is that the imperative paradigm is special because it is how computers actually work, it is not made up but rather it's the natural low level paradigm with minimum abstraction that reflects the underlying nature of computers. You may say this is just bullshit arbitrary rationalization but no, these properties make the imperative paradigm special among all other paradigms because:

  • Its implementation is simple and suckless/LRS because it maps nicely and naturally to the underlying hardware -- basically commands in a language simply translate to one or more instructions. This makes construction of compilers easy.
  • It's predictable and efficient, i.e. a programmer writing imperative code can see quite clearly how what he's writing will translate to the assembly instructions. This makes it possible to write highly efficient code, unlike high level paradigms that perform huge amounts of magic for translating foreign concepts to machine instructions -- and of course this magic may differ between compilers, i.e. what's efficient code in one compiler may be inefficient in another (similar situation arose e.g. in the world of OpenGL where driver implementation started to play a huge role and which led to the creation of a more low level API Vulkan).
  • It doesn't force high amounts of unnecessary high level abstraction. This means we MAY use any abstraction, even OOP, if we currently need it, e.g. via a library, but we aren't FORCED to use weird high level concepts on problems that can't be described easily in terms of those concepts. That is, if you're solving a non-OOP problem with OOP, you waste effort on translating that problem to OOP and the compiler then wastes more effort on un-OOPing it to translate it to instructions. With the imperative paradigm this can't happen because you're basically writing the instructions directly, which has to happen either way.
  • It is generally true that the higher the abstraction, the smaller its scope of application should be, so the default abstraction (paradigm) should be low level. This works e.g. in science: psychology is a high level abstraction but can only be applied to study human behavior, while quantum physics is a low level abstraction which applies to the whole universe.

Once computers start fundamentally working on a different paradigm, e.g. functional -- which BTW might happen with new types of computers such as quantum ones -- we may switch to that paradigm as the default, but until then imperative is the way to go.

History

TODO

Code Example

OK so let's dive into this for the sake of demonstration, here is some kind of C++ code along the lines of a typical OOP textbook example:

#include <iostream>

using namespace std;

class Animal // abstract class
{
  protected:
    static int animalsTotal;
    const char *name;

  public:
    Animal(const char *name);
    virtual ~Animal() {} // needed so that delete via Animal* is well defined
    const char *getName();
    virtual void makeSound() = 0;
    static int getAnimalsTotal();
};

int Animal::animalsTotal = 0;

int Animal::getAnimalsTotal()
{
  return animalsTotal;
}

class Cat: public Animal // cat is a subclass of animal
{
  protected:
    int treesClimbed;

  public:
    Cat(const char *name);
    virtual void makeSound();
    void climbTree();
};

class Dog: public Animal // dog is a subclass of animal
{
  protected:
    int ballFetched;

  public:
    Dog(const char *name);
    virtual void makeSound();
    void fetch();
};

Animal::Animal(const char *name)
{
  this->name = name;
  animalsTotal++;
}

const char *Animal::getName()
{
  return this->name;
}

Cat::Cat(const char *name): Animal(name)
{
  this->treesClimbed = 0;
}
    
Dog::Dog(const char *name): Animal(name)
{
  this->ballFetched = 0;
}

void Cat::climbTree()
{
  this->treesClimbed++;
}

void Dog::fetch()
{
  this->ballFetched++;
}

void Cat::makeSound()
{
  cout << "meow";
}

void Dog::makeSound()
{
  cout << "woof";
}

int main()
{
  #define ANIMALS 5

  Animal *animals[ANIMALS]; // pointers to animals

  animals[0] = new Cat("Mittens");
  animals[1] = new Dog("Doge");
  animals[2] = new Cat("Mr. Jinx");
  animals[3] = new Cat("Toby");
  animals[4] = new Dog("Hachiko");

  cout << "There are " << Animal::getAnimalsTotal() << " animals in total:" << endl;

  for (int i = 0; i < ANIMALS; ++i)
  {
    cout << animals[i]->getName() << ": ";
    animals[i]->makeSound();
    cout << endl;
  }

  for (int i = 0; i < ANIMALS; ++i)
    delete animals[i];

  return 0;
}

It should write out:

There are 5 animals in total:
Mittens: meow
Doge: woof
Mr. Jinx: meow
Toby: meow
Hachiko: woof

Now let's quickly go over the code (it's really a super quick summary, if you don't understand it grab some book on OOP).

The code defines 3 classes. The first class, Animal, represents an animal; it has an attribute called name that records the animal's given name. There is also a static attribute called animalsTotal -- a static attribute belongs to the class itself, NOT to the objects, i.e. it's basically a global variable that's just associated with the class. The class also has methods, such as getName, which simply returns the animal's name, or getAnimalsTotal -- this one is however a static method, meaning it belongs to the class and can be called without any object at hand. The Animal class is abstract, which means we cannot make objects of this class directly, it only serves as a base class for further subclasses. In C++ an abstract class is any class that has at least one pure virtual method, here the method makeSound -- such a method is marked with = 0 after it, which here means it doesn't have an implementation (it doesn't have to be implemented as there won't be any objects of this class on which the method could be called).

Then there are two subclasses of Animal: Cat and Dog. These are no longer abstract, i.e. we will be able to make cat and dog objects; these subclasses inherit the attributes of the parent class (the name attribute, i.e. cats and dogs will have their names) and methods, such as getName -- this method is a "normal" method, it behaves the same for all animal classes, it just returns the name, there's nothing special about it. However note the method makeSound -- this is a virtual method, meaning it will behave differently for each specific class, i.e. a cat makes a different sound than a dog -- so each of these classes has to implement its version of this method. Notice the methods that have the same name as the class (e.g. Cat::Cat or Dog::Dog) -- these are called constructors and are automatically called every time an object of the class is created.

In the main function we then create an array of 5 Animal pointers, i.e. pointers that can point to any animal subclass; then we create some cats and dogs and let the array point to them. Then we iterate over the array and call each object's makeSound method -- this demonstrates so called polymorphism: we don't care what the object is (if it's a cat or dog), we always call the same makeSound method and the language makes it so that the correct version for the object is called. Polymorphism is kind of one of the highlights of OOP, so it's good to stress it here -- it is connected to the virtual methods.

Now let's see how we could emulate this OOP code in a non-OOP language, just for the sake of it -- we'll do it in C (note that we are really trying to closely emulate the code above, NOT solve the problem the way in which it would normally be solved without OOP). It may look something like this (it can potentially be done in different ways, of course):

#include <stdio.h>
#include <stdlib.h>

typedef struct
{
  int _treesClimbed;
} Cat;

typedef struct
{
  int _ballsFetched;
} Dog;

typedef struct _Animal
{
  const char *_name;
  void (*makeSound)(struct _Animal *);

  union
  {
    Cat cat;
    Dog dog;
  } subclass;
} Animal;

int _AnimalAnimalsTotal;

void AnimalNew(Animal *this, const char *name)
{
  this->_name = name;
  _AnimalAnimalsTotal++;
}

int AnimalGetAnimalsTotal(void)
{
  return _AnimalAnimalsTotal;
}

const char *AnimalGetName(Animal *this)
{
  return this->_name;
}

void CatMakeSound(Animal *this)
{
  printf("meow");
}

void DogMakeSound(Animal *this)
{
  printf("woof");
}

void CatNew(Animal *this, const char *name)
{
  AnimalNew(this,name);
  this->subclass.cat._treesClimbed = 0;
  this->makeSound = CatMakeSound;
}

void DogNew(Animal *this, const char *name)
{
  AnimalNew(this,name);
  this->subclass.dog._ballsFetched = 0;
  this->makeSound = DogMakeSound;
}

void CatClimbTree(Animal *this)
{
  this->subclass.cat._treesClimbed++;
}

void DogFetch(Animal *this)
{
  this->subclass.dog._ballsFetched++;
}

int main(void)
{
  #define ANIMALS 5

  Animal *animals[ANIMALS];

  animals[0] = malloc(sizeof(Animal));
  CatNew(animals[0],"Mittens");

  animals[1] = malloc(sizeof(Animal));
  DogNew(animals[1],"Doge");

  animals[2] = malloc(sizeof(Animal));
  CatNew(animals[2],"Mr. Jinx");

  animals[3] = malloc(sizeof(Animal));
  CatNew(animals[3],"Toby");

  animals[4] = malloc(sizeof(Animal));
  DogNew(animals[4],"Hachiko");

  printf("There are %d animals in total:\n",AnimalGetAnimalsTotal());

  for (int i = 0; i < ANIMALS; ++i)
  {
    printf("%s: ",AnimalGetName(animals[i]));
    animals[i]->makeSound(animals[i]);   
    putchar('\n'); 
  }

  for (int i = 0; i < ANIMALS; ++i)
    free(animals[i]);

  return 0;
}

Here we implement the virtual methods with function pointers. We use normal functions instead of class methods and simply have their names prefixed with the class name. Inheritance is done with a union holding the subclass stuff. Private things are prefixed with _ -- we rely on people respecting this and not accessing these things directly.

Now let's see how we'd solve the same problem in C in a natural way:

#include <stdio.h>

#define ANIMAL_CAT 0
#define ANIMAL_DOG 1

typedef struct
{
  int type;
  const char *name;
} Animal;

int animalsTotal = 0;

Animal animalNew(int type, const char *name)
{
  animalsTotal++;

  Animal a;

  a.type = type;
  a.name = name;

  return a;
}

void animalMakeSound(Animal *animal)
{
  switch (animal->type)
  {
    case ANIMAL_CAT: printf("meow"); break;
    case ANIMAL_DOG: printf("woof"); break;
    default: break;
  }
}

int main(void)
{
  #define ANIMALS 5

  Animal animals[ANIMALS];

  animals[0] = animalNew(ANIMAL_CAT,"Mittens");
  animals[1] = animalNew(ANIMAL_DOG,"Doge");
  animals[2] = animalNew(ANIMAL_CAT,"Mr. Jinx");
  animals[3] = animalNew(ANIMAL_CAT,"Toby");
  animals[4] = animalNew(ANIMAL_DOG,"Hachiko");

  printf("There are %d animals in total:\n",animalsTotal);

  for (int i = 0; i < ANIMALS; ++i)
  {
    printf("%s: ",animals[i].name);
    animalMakeSound(&(animals[i]));
    putchar('\n'); 
  }

  return 0;
}

Notice the lack of bullshit. OOPers will argue something about scalability, but that's an argument that presupposes bloat, so it's invalid -- basically they tell you "just wait till you have 10 million lines of code, then it becomes elegant", but of course such code is already bad just by its size -- code of such size should never be written. They will also likely invent some highly artificial example tailored to suit OOP which you will however never meet in practice -- you can safely ignore these.