2024-10-30 16:31:42 +01:00

19 KiB

Raw Permalink Blame History

Debugging

Debugging is a term predominantly related to computer technology (but at times extended beyond it) where it signifies the practice of actively searching for bugs (errors, design flaws, defects, ...) and fixing them; most typically it occurs as part of software programming, but we may also talk about debugging hardware etc. Debugging is notoriously tedious and stressful, it can even take majority of the programmer's time and some bugs are extremely hard to track down, however systematic approaches can be applied to basically always succeed in fixing any bug. Debugging is sometimes humorously defined as "replacing old bugs with new ones".

Fun fact: the term debugging allegedly comes from the old times when it meant LITERALLY getting rid of bugs that broke computers by getting stuck in the relays.

Spare yourself debugging by testing as you go -- while programming it's best to at least quickly test the program is working after each small step change you make. Actually you should be writing automatic tests along with your main program that quickly tests that all you've programmed so far still works (see also regression). This way you discover a bug early and you know it's in the part you just changed so you find it and fix it quickly. If you don't do this and just write the whole program before even running it, your program will just crash and you won't have a clue why -- at this point you most likely have SEVERAL bugs working together and so even finding one or two of them will still leave your program crashing -- this situation is so shitty that the time you saved earlier won't nearly be worth it.

Which kind of bug is the biggest pain in the ass to debug? One take on answering this might the following: statistical bugs. That is bugs that aren't really an error in the code but an error in the mathematical model behind the code, and furthermore ones that don't manifest in a single place but only in the whole. For example when programming a monte carlo chess engine -- the code may be perfect, it's doing exactly what you want it to do, but the engine is making wrong moves because you actually want the wrong thing; debuggers won't help you, you cannot point a finger at a specific line in the code, you have to think deeply about statistics, sampling, probabilities and things that many times betray intuition, your brain starts to emit smoke and only then you realize you actually chose a wrong mathematical model and have to rewrite the whole thing. This is debugging on a high level.

Debugging Software

Debugging of programs will commonly occur in these steps:

Discovering bug: you notice a bug, this usually happens during testing but of course can also just happen during normal use etc.
Reproducing it: reproducing the bug is extremely important -- actually you probably can't move on without it. Reproducing denotes finding an exact way to make the bug manifest, e.g. "click this button while holding down this key" etc.
Locating it: now as you have a crashing program, you examine it and find WHY exactly it crashes, which line of code causes the crash etc.
Fixing it: finally once you know why and where the bug exists, you just make it go away. Sometimes a hotfix is quickly applied before implementing a proper fix.

For debugging your program it may greatly help to make debugging builds -- you may pass flags to your compiler that make the program better for debugging, e.g. with gcc you will likely want to use -g (generate debug symbols) and -Og (optimize for debugging) flags. Without this debuggers will still work but may be e.g. unable to tell you the name of function inside which the program crashed etc.

Also as with everything you get better at debugging with practice, especially when you learn about common types of bugs and how they manifest -- for example you'll learn to just quickly scan for the famous off by one bugs near any loop, you'll learn that when a value grows and then jumps to zero it's an overflow, that your program getting stuck and crashing after a while could mean infinite recursion etc.

The following are some of the most common methods used to debug software, roughly in order by which one typically applies them in practice.

Testing

Testing is an area of itself, it's the main method of finding bugs. There are many kind of testing like manual testing (just playing around with the program), automatic testing (automatized testing by a program), security/penetration testing, stress testing, whitebox/blackbox testing, unit testing, code reviews and whatnot. Formal verification is similar to testing that can reveal further bugs, but it's more difficult to do.

Eyeballing

A quick way to spot small bugs is obviously to just observe the source code, nevertheless this really just works for the small, extremely obvious bugs like writing = instead of == etc.

Searching The Internet

Even for certain obscure and less frequent errors just copy pasting the error message to the search engine usually reveals its most common cause on Stack Overflow. You will do this a lot when learning a new language/library/environment etc.

Manual Execution

In this method you try to go through the program yourself step by step, just as the computer would. By this you will find out just WHY and WHERE your program gets to a wrong result or to a line that makes it crash.

Printing

Using print statements is extremely popular and efficient method of locating bugs; the idea is to use the language's print functions to log what's happening. By this you can e.g. find where exactly (which line of source code) your program crashes, you simply insert printf("asdf\n"); somewhere and keep moving this print statement and re running the program until the text stops showing up on the screen - then you know the program crashes before it reached the print. Note that you can use the principle of binary search (also known as wolf fence algorithm) to move the print in the code so that you find the crash place relatively quickly. Besides this prints can of course also show you e.g. values in variables so you can e.g. check WHERE EXACTLY the value changes to a wrong value and so on.

The advantage of this is that you don't need any extra debugger, the method works basically everywhere and is actually very effective, it may be all you will need in 99% of cases. { TBH I don't even regularly use debugger, debugging with prints just works for me. ~drummyfish }

IMPORTANT NOTE especially for C programmers: output is usually line buffered, so in each print you HAVE TO add a newline (\n) at the end to make it print immediately. If you don't do this, it may happen that the print will be executed but the output will stay waiting in the output buffer as the program crashes so it won't show up on your screen. Similarly in other languages you may want to call some flush function etc.

Sometimes a bug can be super nasty and make the program crash always in random places, even depending e.g. on where you put the print statement, even if the print statement shouldn't really have an effect on the rest of the program. When you spot something like this, you probably have some super nasty bug related to undefined behavior or optimization, try to mess with optimization flags, use static analyzers, reduce your program to a minimum program that still bugs etc. This may take some time.

Logging

Logging is similar to the debug prints but it's something you just do automatically as you program (see also asserts below), logging system is a permanent part of the program, i.e. something that will stays as the program's feature rather than a temporary way of finding and fixing a specific bug. Logging means your program records what it's doing by printing it to the command line or into some text file -- this creates a log that will be useful for many things, including debugging. The advantage here is that if a user encounters a bug, he can just send the programmer his log file which the programmer can read and get some idea about what happened. For this logs should adhere to some rules and be a bit more sophisticated than mere quick printouts: firstly log outputs should be nice and more verbose (i.e. output e.g. step 225: variable x = 342 instead of asdf 225 342) so as to be understandable to anyone, they should be nicely formatted because a log will likely be long so it should be friendly to be filtered with regexes etc., it should also be possible to turn logs off. With bigger project there are also options to set different log levels (e.g. the highest level will print almost everything the program is doing, lower level will print only important steps and so on), set where to store the log (i.e. print to console, store to some specified file, ...) and so on.

Rubber Duck

Rubber duck debugging works like this: you try to explain your code to someone -- even someone who doesn't understand programming, for example rubber duck -- and in doing this you often spot some error in reasoning. Explaining code to a programmer may have a further advantage as he may ask you clever questions.

Reducing Your Program To Minimum

When dealing with a super nasty bug in a complex program that's dodging solutions by the simpler methods, it is useful to just copy your program elsewhere and there strip down everything off of it while still keeping the bug in place. I.e. you just keep deleting functions and all the program does while making sure the bug you're after is still happening. This will firstly eliminate places where you have to look for the bug but mainly will usually lead you to reducing the program to just a few lines of code that behave extremely weirdly, like a function whose behavior depends on where you put a print statement of if you use a wider data type etc. Then you usually find the problematic line or whatever it is that's causing the bug and once you know the line, you can look at it really carefully, google the behavior of each operator etc. to really find the bug.

Asserts

Assertions are checks for conditions that should always hold, for example if you're programming some game, it should always hold that the player is within the level boundaries at all times, so you can just regularly keep checking this condition in your program -- if this assert fails, there is probably some bug (maybe you calculated the position wrong, maybe some pointer overwrote your value, ...), and the location of this condition can also help you locate the bug (you will know approximately when and where in the code it happened). Similarly you can just watch all important variables and their relationships. In bigger projects adding asserts on the go is sometimes considered a "good habit" or is even a required practice, i.e. it is not something you start doing only when you discover a bug -- the purpose of asserts is more to discover bugs early and prevent disasters (running a code that's internally working bad) rather than help fix them (but they'll help with that too). Asserts can be implemented with special debuggers or libraries, however a more KISS way is to simply do it yourself, it's a simple condition check -- you should just make it so that you can disable all assert check easily because while you will use them in debugging, for the release build you'll want more performance, so you'll want to turn unnecessary condition checks off. For example in C you can make an assert macro like:

#ifdef DEBUG
  #define ass(cond,text) if (!(cond)) printf("ASSERT FAILED: %s!\n",text);
#else
  #define ass(c,t) ;
#endif

...
// assert correct player position:
ass(abs(playerPos.x) <= WORLD_BOUND && abs(playerPos.y) <= WORLD_BOUND,"player position")
...

Here if you don't define the DEBUG macro, the assert macro will just be an empty command that does nothing.

Finding The Breaking Change (Regression)

If something that used to work stops working, it's a regression. Here the first step towards fixing it is finding which exact change to the program broke it, i.e. find the last software version before the bug that didn't have the bug -- for this version control systems like git are very cool as they allows you to switch between different commits. In this search apply the binary search principle again (just like you search a word in a dictionary, i.e. keep checking the middle commit and move either before or after it depending on whether it already has the bug or not). Once you find the offending commit it's usually easy to spot the bug, you just have a relatively small amount of code in the commit which you can keep checking by parts if it's not immediately obvious (i.e. try to recreate the commit part by part and see when exactly it breaks, this moves you yet closer to the bug).

Recording The Data, Plotting, Visualizing

Some bugs may be not so easy to grasp by it being hard to even point out exactly what is happening wrong, for example slight graphical or sound glitches when you notice something a bit off, like a camera suddenly jumping a bit -- here it may help keep continuous track of various variables, e.g. the camera transformation and other vectors -- by purely printing them out or even plotting them -- and then note when the bug appeared: for example you may simply close the program immediately after you notice the bug, and then you know you should be looking for something weird happening near the end in the log of the data and their graphs. Maybe you'll notice the camera jumps when its rotation angle switches from negative to positive, when its rotation aligns with a principal axis or something similar -- this may very well point you to the core of the issue.

Debuggers And Other Debugging Tools Like Profilers

There exist many software tools specialized for just helping with debugging (there are even physical hardware debuggers etc.), they are either separate software (good) or integrated as part of some development environment (bad by Unix philosophy) such as an IDE, web browser etc. Nowadays a compiler itself will typically do some basic checks and give you warning about many things, but oftentimes you have to turn them on (check man pages of your compiler).

The most typical debugging tool is a debugger, a program that lets you to play around with the program as it's running, it typically allows doing things like like:

Step through the program line-by-line (typically there is are two options: step by lines and step by functions), sometimes even backwards in time.
Run the program and pause it exactly where you need (breakpoints).
Print stack trace, i.e. the exact chain of function calls at certain point in time. This is extremely useful if your program crashes, you will see not only at which line it crashed but exactly through which functions it got to that line, which is usually the important thing.
Inspect values in RAM, CPU registers etc.
Modify values in RAM, registers etc.
Modify code on-the-fly.
Assert if certain conditions hold.
Link lines of assembly to lines in original source code.
Warn about suspicious things.
...

As a free software C programmer you will most likely use gdb, the GNU debugger.

Furthermore there many are other useful tools such as:

dynamic program analyzer: A tool that will watch your program running and check for things like memory leaks, access of unallocated memory, suspicious behavior, unsafe behavior, call of obsolete functions and many others. The most famous such tool is probably valgrind, it's a good habit to just use valgrind from time to time to check our program. Similar things can also be done with gdb.
profiler: A kind of dynamic analyzer that focuses on statistical measuring of resource usage of your program, typically execution times of different functions, memory consumption or network usage. Basically it will watch your program run and then draw you graphs of which parts of your programs consume most resources. This usually helps optimization but can also serve to find bugs (you can spot where your memory starts leaking etc.). Some basic profiling can be done even without profiler, just inside your own program, but it can get tricky. One famous profiler is e.g. gprof.
static source code analyzer: Static analyzers look at the source code (or potentially even compiled binary) and try to find bugs/inefficiencies in it without running it, just by reasoning. Static analyzers often tell you about undefined behavior, potential overflows, unused code, unreachable branches, unsatisfiable conditions, confusing formatting and so on. This complement the dynamic analysis. Some tools to do this are e.g. cppcheck, pmccabe and splint, though thanks to compilers performing a lot of static analysis themselves these seem not as widely used as dynamic analyzers nowadays.
hex editor: Tool allowing you to mess with binary files, useful for any program that works with binary files. A simple hex viewer is e.g. hexdump.
emulators, virtual machines, ...: Running your program on different platform often reveals bugs -- while your program may work perfectly fine on your computer, it may start crashing on another because that computer may have different integer size, endianness, amount of RAM, timings, file system etcetc. Emulators and VMs allow you to test exactly this AND furthermore often allow easy inspection of the emulated machine's memory and so on.
...

Other Tips

Additionally these may help deal with bugs as well:

With weird bugs try to rebuild everything from scratch -- delete all object files, intermediate and temporary files and compile the whole project from scratch, turn it on an off again, it may be that there's some peculiar bug and/or weirdness in the build system or something.
If you had a working build, added something and it's bugging and you're stuck debugging it, it may be faster to just scratch your changes, revert to the working code and implement the thing again, more carefully. First time you may have just made some stupid oversight that's hard to find, you won't make it again the second time.
When stuck also maybe try the program on different computer, use different compiler, interpreter etc. -- this can at least give you more information, i.e. like that the bug is specific to your operating system or that it doesn't happen if there is more RAM etc.
...

Shotgun Debugging

This is kind of an improper YOLO way of trying to fix bugs, you just change a lot of stuff in your program in hopes a bug will go away, it rarely works, doesn't really get rid of the bug (just of its manifestation at best) and can at best perhaps be a simple hotfix. Remember if you don't understand how you fixed a bug, you didn't actually fix it.

TODO: mini gdb tutorial

19 KiB Raw Permalink Blame History