Sofware (SW) rendering refers to [rendering](rendering.md) [computer graphics](graphics.md) without the help of [graphics card](gpu.md) (GPU), i.e. computing images only with [CPU](cpu.md). This mostly means rendering [3D](3d.md) graphics but can also refer to other kinds of graphics such as drawing [fonts](font.md) or [video](video.md). Before GPUs were invented, all rendering was done in software, of course -- games such as [Quake](quake.md) or Thief were designed with SW rendering and only added optional GPU acceleration later. SW rendering for traditional 3D graphics is also called software [rasterization](rasterization.md), as rasterization is the basis of current real-time 3D graphics.
SW rendering has advantages and disadvantages, though from our point of view its advantages prevail (at least given only capitalist GPUs exist nowadays). Firstly it is **much slower** than GPU graphics -- GPUs are designed to perform graphics-specific operations very quickly and, more importantly, they can process many pixels (and other elements) in [parallel](parallelism.md), while a CPU has to compute pixels sequentially one by one and that in addition to all other computations it is otherwise performing. This causes a much lower [FPS](fps.md) in SW rendering. For this reasons SW rendering is also normally of **lower quality** (lower resolution, [nearest neighbour](nn.md) texture filtering, ...) to allow workable FPS. Nevertheless thanks to the ginormous speeds of today's CPUs simple fullscreen SW rendering can be pretty fast on PCs and achieve even above 60 FPS; on slower CPUs (typically [embedded](embedded.md)) SW rendering is usable normally at around 30 FPS if resolutions are kept small.
On the other hand SW rendering is more [portable](portability.md) (as it can be written purely in a portable language such as [C](c.md)), less [bloated](bloat.md) and **eliminates the [dependency](dependency.md) on GPU** so it will be supported almost anywhere as every computer has a CPU, while not all computers (such as [embedded](embedded.md) devices) have a GPU (or, if they have it, it may not be sufficient, supported or have a required [driver](driver.md)). SW rendering may also be implemented in a simpler way and it may be easier to deal with as there is e.g. no need to write [shaders](shader.md) in a special language, manage transfer of data between CPU and GPU or deal with parallel programming. SW rendering is the [KISS](kiss.md) approach.
SW rendering may also utilize a much wider variety of rendering techniques than only 3D [rasterization](rasterization.md) traditionally used with [GPUs](gpu.md) and their [APIs](api.md), thanks to not being limited by hard-wired pipelines, i.e. it is more flexible. This may include [splatting](splatting.md), [raytracing](raytracing.md) or [BSP rendering](bsp.md) (and many other ["pseudo 3D"](pseudo3d.md) techniques).
A lot of software and rendering frameworks offer both options: accelerated rendering using GPU and SW rendering as a [fallback](fallback.md) (in case the first option is not possible). Sometimes there exists a rendering [API](api.md) that has both an accelerated and software implementation (e.g. [TinyGL](tinygl.md) for [OpenGL](opengl.md)).
For simpler and even somewhat more complex graphics **purely software rendering is mostly the best choice**. [LRS](lrs.md) suggests you prefer this kind of rendering for its simplicity and portability, at least as one possible option. On devices with lower resolution not many pixels need to be computed so SW rendering can actually be pretty fast despite low specs, and on "big" computers there is nowadays usually an extremely fast CPU available that can handle comfortable FPS at higher resolutions. There is a LRS software renderer you can use: [small3dlib](s3l.md).
SW renderers are also written for the purpose of verifying rendering hardware, i.e. as a [reference implementation](reference_implementation.md).
Note that SW rendering doesn't mean our program is never touching [GPU](gpu.md) at all, in fact most personal computers nowadays **require** some kind of GPU to even display anything. SW rendering only means that computation of the image to be displayed doesn't use any hardware specialized for this purpose.
Some SW renderers make use of specialized CPU instructions such as [MMX](mmx.md) which can make SW rendering faster thanks to handling multiple data in a single step. This is kind of a mid way: it is not using a GPU per se but only a mild form of hardware acceleration. The speed won't reach that of a GPU but will outperform a "pure" SW renderer. However the disadvantage of a hardware dependency is still present: the CPU has to support the MMX instruction set. Good renderers only use these instructions optionally and fall back to general implementation in case MMX is not supported.
{ In case [small3dlib](small3dlib.md) is somehow not enough for you :) ~drummyfish }
Difficulty of this task depends on features you want -- a super simple [flat shaded](flat_shading.md) (no textures, no smooth [shading](shading.md) renderer is relatively easy to make, especially if you don't need movable camera, can afford to use [floating point](float.md) etc. See the details of [3D rendering](3d_rendering.md), especially how the GPU pipelines work, and try to imitate them in software. The core of these renderers is the **[triangle](triangle.md) [rasterization](rasterization.md)** algorithm which, if you want, can be very simple -- even a naive one will give workable results -- or pretty complex and advanced, using various optimizations and things such as the [top-left rule](top_left_rule.md) to guarantee no holes and overlaps of triangles. Remember this function will likely be the performance [bottleneck](bottleneck.md) of your renderer so you want to put effort into [optimizing](optimization.md) it to achieve good [FPS](fps.md). Once you have triangle rasterization, you can draw 3D models which consist of vertices (points in 3D space) and triangles between these vertices (it's very simple to load simple 3D models e.g. from the [obj](obj.md) format) -- you simply project (using [perspective](perspective.md)) 3D position of each vertex to screen coordinates and draw triangles between these pixels with the rasterization algorithm. Here you need to also solve [visibility](visibility.md), i.e. possible overlap of triangles on the screen and correctly drawing those nearer the view in front of those that are further away -- a very simple solution is a [z buffer](z_buffer.md), but to save memory you can also e.g. [sort](sorting.md) the triangles by distance and draw them back-to-front ([painter's algorithm](painters_algorithm.md)). You may add a [scene](scene.md) data structure that can hold multiple models to be rendered. If you additionally want to have movable camera and models that can be transformed (moved, rotated, scaled, ...), you will additionally need to look into some [linear algebra](linear_algebra.md) and [transform matrices](transform_matrix.md) that allow to efficiently compute positions of vertices of a transformed model against a transformed camera -- you do this the same way as basically all other 3D engines (look up e.g. some [OpenGL](opengl.md) tutorials, see model/view/projection [matrices](matrix.md) etc.). If you also want texturing, the matters get again a bit more complicated, you need to compute [barycentric](barycentric.md) coordinates (special coordinates within a triangle) as you're rasterizing the triangle, and possibly apply [perspective correction](perspective_correction.md) (otherwise you'll be seeing distortions). You then map the barycentrics of each rasterized pixel to [UV](uv.md) (texturing) coordinates which you use to retrieve specific pixels from a texture. On top of all this you may start adding all the advanced features of typical engines such as [acceleration structures](acceleration_structure.md) that for example discard models that are completely out of view, [LOD](lod.mf), instancing, [MIP maps](mip_map.md) and so on.
Possible tricks, cheats and [optimizations](optimization.md) you may utilize include:
- Using painter's algorithm (sorting triangles and drawing back to front) instead of z-buffer if you need to save a lot of RAM. But remember sorting doesn't work perfectly, glitches will inevitably appear, and you will probably gain overdraw penalty.
- Ad previous point: you don't have to perform whole triangle sorting each frame if you need to save speed, it may be good enough to perform a constant continuous sorting by performing only a few iterations of some sorting algorithm per frame.
- You may lower the quality of far-away objects in many ways, e.g. with [LOD](lod.md), only using affine texturing for them (as opposed to perspective-correct one) or even just using a constant color (average color of the texture), maybe even just drawing 2D sprites instead of 3D models etc. This may help a lot.
- Generally use cheap [approximations](approximation.md) such as [Gouraud](gouraud.md) (per-vertex) [shading](shading.md) instead of [Phong](phong.md) (per-pixel), nearest neighbour texture sampling, only approximate perspective correction (every N pixels), simplified handling of near-plane culling (e.g. just pushing the vertices in front of camera instead of actually culling a triangle) etc.
- Use general [optimization](optimization.md) techniques: e.g. using power of two resolution for textures, fixed screen resolution that's known at compile time or inlining of your shader function will probably help performance.
- **[Build engine](build_engine.md)**: While not a "true 3D", this was a very popular [proprietary](proprietary.md) portal-rendering engine for older games like Duke Nukem 3D or Blood.
- **[BRender](brender.md)**: Old commercial renderer used in games such as Carmageddon, Croc or Harry Potter 1. Later made [FOSS](foss.md).
- **[Dark Engine](dark_engine.md)**: Old proprietary game engine which includes a SW renderer, used mainly in the game Thief. The author writes about it at https://nothings.org/gamedev/thief_rendering.html.
- **[id Tech](id_tech.md)**: Multiple engines by [Id software](id.md) (later made [FOSS](foss.md)) used for games like [Quake](quake.md) included a software renderer. Quake's SW renderer was partially described in the *Michael Abrash's Graphics Programming Black Book*.
- **[Irrlich](irrlicht.md)**: [FOSS](foss.md) game engine including a software renderer as one of its [backends](backend.md).
- **[Mesa](mesa.md)**: [FOSS](foss.md) implementation of [OpenGL](opengl.md) that includes a software rasterizer.
- **[small3dlib](s3l.md)**: [LRS](lrs.md) [C](c.md) 3D rasterizer, very simple.
- **SSRE**: The guy who wrote [LIL](lil.md) also made this renderer named Shitty Software Rendering Engine, accessible [here](http://runtimeterror.com/tech/ssre/).