12 KiB
Collision Detection
Geometric collision detection is a fundamental problem in certain types of programs, most notably those simulating physics of mechanical bodies (physics engines) -- collision detection means computing whether (and also how) geometric shapes (most typically either in a 2D or 3D space) overlap. This article will be considering collision detection in physics engines, but the problem (and its solutions) appears in other contexts as well (e.g. frustum culling in computer graphics, raytracting and so on). Collision detection is but one phase of simulating physics, it may further lead to so called collision resolution, a different phase that tries to deal with the detected collision (i.e. separate the bodies, update their velocities, make them "bounce off" etc.). Again, here we'll just be talking about pure collision detection. A physics engine will most likely be either 2D or 3D, hence we'll mostly be examining predominantly one of these two cases -- it probably won't surprise anyone that the 3D case is generally more complex.
There are two main types of collision detection:
- discrete: Detecting collisions only at one point in time (each engine tick or "frame") -- this is easier but can result in detecting the collisions in wrong ways or missing them completely (imagine a fast flying object that in one moment is wholly in front of a wall and at the next instant wholly behind it -- collision won't be detected even though the object should have crashed into the wall). Nevertheless this is completely usable, one just has to be careful enough about the extreme cases.
- continuous: Detecting collisions considering the continuous motion of the involved bodies (still done at discrete ticks but considering the whole motion since the last tick) -- this is more difficult to program and more costly to compute, but also correctly detects collisions even in extreme cases. Sometimes engines perform discrete detection by default and use continuous detection in special cases (e.g. when speeds become very high or in other error-prone situations). Continuous detection can be imagined as a collision detection of a higher dimensional bodies where the added dimension is time -- e.g. detecting collisions of 2D circles becomes detecting collisions of "tubes" in 3D space. If you don't want to go all the way to implementing continuous collisions, you may consider an in-between solution by detecting collisions in smaller steps (which may also be done only sometimes, e.g. only for high speed bodies or only when an actual discrete collision is detected).
Collision detection is non-trivial and usually quite difficult to implement for several reasons. For example we may need to detect NOT JUST the presence of the collision but also its additional parameters which are typically the exact point of collision, collision depth and collision normal (and maybe even more, depending on what we're doing, e.g. volume of overlap etc.) -- this is because these parameters are normally needed for subsequently resolving the collision (the bodies will be shifted along the normal by the collision depth to become separated and impulses will be applied at the collision point to update their velocities). Here we may have to make some additional decisions, for example if we'll require the point of collision to lie on the surface of both bodies or if we'll just want the "averaged point of collision" (note that we'll practically never detect the bodies touching in exactly one point, normally we'll be detecting an overlapped volume); then we also want to define what exactly the depth of collision means (do we want the maximum or depth along the normal?), what to do if something doesn't make sense (like a collision normal in point-point collision) and so on. However that's not nearly all, collision detections are hard because we need to detect general cases, i.e. not just collisions of body surfaces but of WHOLE VOLUMES (imagine e.g. a tiny cuboid inside an arbitrarily rotated bigger cone -- the surfaces don't collide but the bodies do). This may be mathematically quite hard and even if we find the solution, it may just be slow -- for example precise collision detection of arbitrary 3D triangle meshes are just impractically difficult, which is why we approximate them with simpler shapes, but even for relatively simple shapes like arbitrarily rotated 3D boxes the solution is still not as easy as it may seem at first glance. And as if this wasn't enough of a hassle, we will typically have to write not one, but many detection algorithms, each one for every possible pair of bodies that may collide -- so for example if we have spheres, boxes and capsules, we have to write detection algorithms for sphere-sphere, sphere-box, sphere-capsule, box-box, box-capsule and capsule-capsule. For N shapes we'll need (N^2 + N) / 2 algorithms.
Wanting our algorithms to be at least somewhat fast, we perform an optimization by dividing this into two phases:
- broad phase: Quickly estimates which bodies MAY collide, usually with bounding volumes (such as spheres or axis aligned bounding boxes) or space indexing and algorithms such as sweep and prune. This phase quickly opts-out of checking collision of objects that definitely CANNOT collide because they're e.g. too far away.
- narrow phase: Applying the precise, expensive collision detection on the potentially colliding pairs of bodies determined in the broad phase. This yields the real collisions.
In many cases it is also important to correctly detect the order of collisions in time -- it may well happen a body collides not with one but with multiple bodies at the time of collision detection and the computed behavior may vary widely depending on the order in which we consider them. Imagine that body A is colliding with body B and body C at the same time; in real life A may have first collided with B and be deflected so that it would have never hit C, or the other way around, or it might have collided with both. In continuous collision detection we know the order as we also have exact time coordinate of each collision (even though the detection itself is still computed at discrete time steps), i.e. we know which one happened first. With discrete collisions we may use heuristics such as the direction in which the bodies are moving, but this may fail in certain cases (consider e.g. collisions due to rotations).
On shapes: general rule is that mathematically simpler shapes are better for collision detection. Spheres (or circles in 2D) are the best, they are stupidly simple -- a collision of two spheres is simply decided by their distance (i.e. whether the distance of their center points is less that the sum of the radii of the spheres), which also determines the collision depth, point and the collision normal is always aligned with the vector pointing from one sphere center to the other. So if you can, use spheres -- it is even worth using multiple spheres to approximate more complex shapes if possible. Axis aligned shapes (i.e. ones whose orientation will always be aligned with principal axes) are also good -- very often you'll see axis-aligned cubes, boxes, squares or rectangles because checking their collisions is quite easy -- it's enough to check distances of their centers along each principal axis (we can notice that rectangle/box is basically a circle/sphere if we consider consider Chebyshev distance instead of Euclidean distance, so compared to circles/spheres we may even have a faster distance computation). Capsules ("extruded spheres"), infinite planes, half-planes, infinite cylinders (distance from a line) are also among the simpler shapes. Axis-aligned capsules are often used as a shape for game characters (including the player) as they're quite simple and work well e.g. for walking up stairs. Cylinders and cuboids with arbitrary rotation are a bit harder but still used a lot. Note this nice trick: when detecting collision between two arbitrarily rotated bodies, you can transform the problem so that you make one of the bodies axis-aligned; then the problem simplifies a lot and once you have the solution, transform it back to the original frame of reference. Triangle meshes (the shape most commonly used for graphical real-time 3D models) are very difficult but may be approximated e.g. by a convex hull which is manageable (a convex hull is an intersection of a number of half-spaces) -- if we really want to precisely collide full 3D meshes, we may split each one into several convex hulls (but we need to write the non-trivial splitting algorithm of course). Also note that more complex shapes max be created as a union of the simpler shapes, so a house shape may in fact be created just from one axis aligned box and another rotated box (for the roof) -- when detecting the collision with this house you'll simply detect the collision with both of the shapes it consists of. Shapes having fewer dimensions than your space may cause you trouble (e.g. line segments in 3D -- these will probably never collide with each other and if they do, what will the normal be? etc.), i.e. if possible in 3D you want to stick with shapes that have volume and in 2D with shapes that have area.
Up until now we have considered analytical collision detection, i.e. the mathematically precise solution that we get by solving an equation (such as in the case of two spheres and comparing their distance with the sum of their radii) -- this is based on analytic geometry and the fact that geometry can be handled by algebraic methods. This method is a cool approach to solving collisions if you can manage it, it is precise and often even fast, however -- as mentioned -- in many cases too hard and tedious. Therefore we may also consider a good enough approximate solutions using numerical methods instead, i.e. the "less elegant", more bruteforce kinds of methods that do things such as iterative checking of many points in space to see if there is a collision happening there or not. This will be less precise and may be slower for simpler shapes (though for very complex shapes it may be faster), however the algorithm will be simpler AND, very significantly, it can kind of work for general cases, i.e. you may just have one algorithm that will be able to check all kinds of shapes. Even numerical methods may be written well and optimizing them may result in a very good algorithm.
Numerical algorithms can be made in many ways; consider for example the following: you can divide your whole space into a grid ("Minecraft" style) and then for every shape you have to simply write a function that checks if a single point lies in the shape or not (this is much simpler than checking collisions between complex shapes) -- now to check if a collision is happening you may just check all the grid points in space and if you find a grid square whose center point lies in two shapes at once, you detected a collision (which you may then even check with greater precision by subdividing the cell itself and checking all the subcells etc.). This can be optimized for example with things like octrees that will allow you to skip big empty areas and not waste time on checking them. Or you don't have to have a global grid but you may just iterate over shapes and for each such shape check some N points that lie inside it (which you just somehow generate, e.g. by interpolating between its vertices or something), again checking if any of these points lies in another shape. Or you may be trying something like iteratively stepping towards where an intersection of two shapes should lies (like e.g. Newton's method). There are many possibilities here.
Signed distance functions can potentially be used to implement collision detection. { That's how I did it in tinyphysicsengine. ~drummyfish }
TODO: some actual algorithms