How To Debug Any Program
Debugging computer programs is a vital skill for any software engineer. For most of us, there exists a considerable range of problems which we are able to solve but not able to implement perfectly on the first attempt. I've often found myself staring at code that seems to be correct and wracking my brain for ideas on where it might contain bugs.
Re-reading code without any apparent errors and hoping for a revelation is an inconsistent strategy at best. Over the years, I’ve come to realize that a principled, action-oriented approach to debugging is both more efficient and more reliable.
This blog post addresses the fundamental principles for debugging computer programs in a systematic way. The ideas presented here may seem obvious. However, I believe that a deliberate understanding of fundamental principles is important to avoid bad habits and time-consuming mental errors that can be costly in an interview or competition environment.
Start by Questioning Everything
The following excerpt is from one of my favorite books: Zen and the Art of Motorcycle Maintenance. Although the excerpt discusses the debugging of a physical system, the underlying principles translate directly to debugging computer programs.
In statement of the problem, the main skill is in stating absolutely no more than you are positive you know. It is much better to enter a statement 'Solve Problem: Why doesn't cycle work?' which sounds dumb but is correct, than it is to enter a statement 'Solve Problem: What is wrong with the electrical system?' when you don't absolutely know the trouble is in the electrical system... This careful approach to the beginning questions keeps you from taking a major wrong turn which might cause you weeks of extra work or can even hang you up completely.
When a program is behaving unexpectedly, our attention tends to be drawn first to the most complex portions of the code. However, mistakes can come in all forms. I've personally been guilty of rushing to debug sophisticated portions of my code when the real bug was that I forgot to read in the input file. In the following section, we'll discuss how to reliably focus our attention on the portions of the program that need correction.
Then Question as Little as Possible
Suppose that we have a program and some input on which its behavior doesn’t match our expectations. The goal of debugging is to narrow our focus to as small a section of the program as possible. Once our area of interest is small enough, the value of the incorrect output that is being produced will typically tell us exactly what the bug is.
For example, if we have a function that's supposed to compute the sum of two integers and it's telling us that 2^30 + 2^30 = -2^31, it's fairly straightforward to understand and correct the issue. But if we let that unexpected result cascade through the rest of the program, compounded through other operations like map lookups or conditional logic, the final output will easily lose any discernible meaning. We might see the program print 23 instead of 24 as its final output and have no idea what it means or where the issue began.
In order to catch the point at which our program diverges from expected behavior, we must inspect the intermediate state of the program. Suppose that we select some point during execution of the program and print out all values in memory. We can inspect the results manually and decide whether they match our expectations. If they don't, we know for a fact that we can focus on the first half of the program. It either contains a bug, or our expectations of what it should produce were misguided. If the intermediate state does match our expectations, we can focus on the second half of the program. It either contains a bug, or our understanding of what input it expects was incorrect.
Question Things Efficiently
For practical purposes, inspecting intermediate state usually doesn't involve a complete memory dump. We'll typically print a small number of variables and check whether they have the properties we expect of them. Verifying the behavior of a section of code involves:
- Before it runs, inspecting all values in memory that may influence its behavior.
- Reasoning about the expected behavior of the code.
- After it runs, inspecting all values in memory that may be modified by the code.
Reasoning about expected behavior is typically the easiest step to perform even in the case of highly complex programs. Practically speaking, it's time-consuming and mentally strenuous to write debug output into your program and to read and decipher the resulting values. It is therefore advantageous to structure your code into functions and sections that pass a relatively small amount of information between themselves, minimizing the number of values you need to inspect.
It’s also helpful to design the memory model of your program in a way that is as clear and intuitive as possible to you. Maintaining clear and precise mathematical definitions for the contents of variables allows you to know which variables are relevant to a section of code and how you can expect them to change.
Finding the Right Question to Ask
We’ve assumed so far that we have available a test case on which our program behaves unexpectedly. Sometimes, getting to that point can be half the battle. There are a few different approaches to finding a test case on which our program fails. It is reasonable to attempt them in the following order:
- Verify correctness on the sample inputs.
- Test additional small cases generated by hand.
- Adversarially construct corner cases by hand.
- Re-read the problem to verify understanding of input constraints.
- Design large cases by hand and write a program to construct them.
- Write a generator to construct large random cases and a brute force oracle to verify outputs.
Making Your Code Question Itself
Sometimes it happens that we create a test case on which our program fails, but it's simply too large to inspect by hand. We can programmatically apply the same principles we use when debugging by hand by inserting state verification checks into our program.
Suppose that we've written a clever function to compute some property of a set of N intervals in O(N log N). If a simple O(N^2) approach exists, we can run it whenever the function is called and check that its output is identical to our own before exiting the function. If it doesn't, we can immediately terminate the program and announce that an issue exists inside this particular function.
Putting it All Together
The techniques laid out in this post provide a systematic roadmap for debugging any computer program. Over time, you can expect to build some intuition for where bugs might lie in particular programs. But even when intuition fails you, a step-by-step application of basic principles will steadily drive you towards a better understanding of the system you are working with and ultimately to the root cause of the bug you are trying to track down.