less_retarded_wiki/sw_rendering.md
2023-03-13 21:33:27 +01:00

11 KiB

Software Rendering

Sofware (SW) rendering refers to rendering computer graphics without the help of graphics card (GPU), i.e. computing images only with CPU. This mostly means rendering 3D graphics but can also refer to other kinds of graphics such as drawing fonts or video. Before GPUs were invented, all rendering was done in software, of course -- games such as Quake or Thief were designed with SW rendering and only added optional GPU acceleration later. SW rendering for traditional 3D graphics is also called software rasterization, as rasterization is the basis of current real-time 3D graphics.

SW rendering has advantages and disadvantages, though from our point of view its advantages prevail (at least given only capitalist GPUs exist nowadays). Firstly it is much slower than GPU graphics -- GPUs are designed to perform graphics-specific operations very quickly and, more importantly, they can process many pixels (and other elements) in parallel, while a CPU has to compute pixels sequentially one by one and that in addition to all other computations it is otherwise performing. This causes a much lower FPS in SW rendering. For this reasons SW rendering is also normally of lower quality (lower resolution, nearest neighbour texture filtering, ...) to allow workable FPS. Nevertheless thanks to the ginormous speeds of today's CPUs simple fullscreen SW rendering can be pretty fast on PCs and achieve even above 60 FPS; on slower CPUs (typically embedded) SW rendering is usable normally at around 30 FPS if resolutions are kept small.

On the other hand SW rendering is more portable (as it can be written purely in a portable language such as C), less bloated and eliminates the dependency on GPU so it will be supported almost anywhere as every computer has a CPU, while not all computers (such as embedded devices) have a GPU (or, if they have it, it may not be sufficient, supported or have a required driver). SW rendering may also be implemented in a simpler way and it may be easier to deal with as there is e.g. no need to write shaders in a special language, manage transfer of data between CPU and GPU or deal with parallel programming. SW rendering is the KISS approach.

SW rendering may also utilize a much wider variety of rendering techniques than only 3D rasterization traditionally used with GPUs and their APIs, thanks to not being limited by hard-wired pipelines, i.e. it is more flexible. This may include splatting, raytracing or BSP rendering (and many other "pseudo 3D" techniques).

A lot of software and rendering frameworks offer both options: accelerated rendering using GPU and SW rendering as a fallback (in case the first option is not possible). Sometimes there exists a rendering API that has both an accelerated and software implementation (e.g. TinyGL for OpenGL).

For simpler and even somewhat more complex graphics purely software rendering is mostly the best choice. LRS suggests you prefer this kind of rendering for its simplicity and portability, at least as one possible option. On devices with lower resolution not many pixels need to be computed so SW rendering can actually be pretty fast despite low specs, and on "big" computers there is nowadays usually an extremely fast CPU available that can handle comfortable FPS at higher resolutions. There is a LRS software renderer you can use: small3dlib.

SW renderers are also written for the purpose of verifying rendering hardware, i.e. as a reference implementation.

Note that SW rendering doesn't mean our program is never touching GPU at all, in fact most personal computers nowadays require some kind of GPU to even display anything. SW rendering only means that computation of the image to be displayed doesn't use any hardware specialized for this purpose.

Some SW renderers make use of specialized CPU instructions such as MMX which can make SW rendering faster thanks to handling multiple data in a single step. This is kind of a mid way: it is not using a GPU per se but only a mild form of hardware acceleration. The speed won't reach that of a GPU but will outperform a "pure" SW renderer. However the disadvantage of a hardware dependency is still present: the CPU has to support the MMX instruction set. Good renderers only use these instructions optionally and fall back to general implementation in case MMX is not supported.

Programming A Software Rasterizer

{ In case small3dlib is somehow not enough for you :) ~drummyfish }

Difficulty of this task depends on features you want -- a super simple flat shaded (no textures, no smooth shading renderer is relatively easy to make, especially if you don't need movable camera, can afford to use floating point etc. See the details of 3D rendering, especially how the GPU pipelines work, and try to imitate them in software. The core of these renderers is the triangle rasterization algorithm which, if you want, can be very simple -- even a naive one will give workable results -- or pretty complex and advanced, using various optimizations and things such as the top-left rule to guarantee no holes and overlaps of triangles. Remember this function will likely be the performance bottleneck of your renderer so you want to put effort into optimizing it to achieve good FPS. Once you have triangle rasterization, you can draw 3D models which consist of vertices (points in 3D space) and triangles between these vertices (it's very simple to load simple 3D models e.g. from the obj format) -- you simply project (using perspective) 3D position of each vertex to screen coordinates and draw triangles between these pixels with the rasterization algorithm. Here you need to also solve visibility, i.e. possible overlap of triangles on the screen and correctly drawing those nearer the view in front of those that are further away -- a very simple solution is a z buffer, but to save memory you can also e.g. sort the triangles by distance and draw them back-to-front (painter's algorithm). You may add a scene data structure that can hold multiple models to be rendered. If you additionally want to have movable camera and models that can be transformed (moved, rotated, scaled, ...), you will additionally need to look into some linear algebra and transform matrices that allow to efficiently compute positions of vertices of a transformed model against a transformed camera -- you do this the same way as basically all other 3D engines (look up e.g. some OpenGL tutorials, see model/view/projection matrices etc.). If you also want texturing, the matters get again a bit more complicated, you need to compute barycentric coordinates (special coordinates within a triangle) as you're rasterizing the triangle, and possibly apply perspective correction (otherwise you'll be seeing distortions). You then map the barycentrics of each rasterized pixel to UV (texturing) coordinates which you use to retrieve specific pixels from a texture. On top of all this you may start adding all the advanced features of typical engines such as acceleration structures that for example discard models that are completely out of view, LOD, instancing, MIP maps and so on.

Possible tricks, cheats and optimizations you may utilize include:

  • Using painter's algorithm (sorting triangles and drawing back to front) instead of z-buffer if you need to save a lot of RAM. But remember sorting doesn't work perfectly, glitches will inevitably appear, and you will probably gain overdraw penalty.
  • Ad previous point: you don't have to perform whole triangle sorting each frame if you need to save speed, it may be good enough to perform a constant continuous sorting by performing only a few iterations of some sorting algorithm per frame.
  • You may lower the quality of far-away objects in many ways, e.g. with LOD, only using affine texturing for them (as opposed to perspective-correct one) or even just using a constant color (average color of the texture), maybe even just drawing 2D sprites instead of 3D models etc. This may help a lot.
  • Generally use cheap approximations such as Gouraud (per-vertex) shading instead of Phong (per-pixel), nearest neighbour texture sampling, only approximate perspective correction (every N pixels), simplified handling of near-plane culling (e.g. just pushing the vertices in front of camera instead of actually culling a triangle) etc.
  • Use general optimization techniques: e.g. using power of two resolution for textures, fixed screen resolution that's known at compile time or inlining of your shader function will probably help performance.
  • TODO: MORE

Specific Renderers

These are some notable software renderers:

  • Build engine: While not a "true 3D", this was a very popular proprietary portal-rendering engine for older games like Duke Nukem 3D or Blood.
  • BRender: Old commercial renderer used in games such as Carmageddon, Croc or Harry Potter 1. Later made FOSS.
  • Dark Engine: Old proprietary game engine which includes a SW renderer, used mainly in the game Thief. The author writes about it at https://nothings.org/gamedev/thief_rendering.html.
  • id Tech: Multiple engines by Id software (later made FOSS) used for games like Quake included a software renderer. Quake's SW renderer was partially described in the Michael Abrash's Graphics Programming Black Book.
  • Irrlich: FOSS game engine including a software renderer as one of its backends.
  • Mesa: FOSS implementation of OpenGL that includes a software rasterizer.
  • small3dlib: LRS C 3D rasterizer, very simple.
  • SSRE: The guy who wrote LIL also made this renderer named Shitty Software Rendering Engine, accessible here.
  • System Shock engine: Old proprietary game engine.
  • TinyGL: Implements a subset of OpenGL.
  • Many old games in the 90s implemented their own software renderers, so you can look there.

See Also