The only programmer who does not need to learn debugging is one who does not program, because the only source code files without bugs are empty files.
The moment you decide to take on debugging code, especially when it was not written by you, or when it was written by an old version of you that did not anticipate you would remain a programmer in the near future, things may get confusing. If this happens, you are on the right path, because confusion is only a step away from knowledge acquisition. In her talk Brain's API, Sasha Laundy made a point that:
Confusion is a sign you are about to learn something new
– Sasha Laundy
Having a positive mindset is the first step to debugging. As Sasha mentioned in that linked talk, rather than using words like "damn it, I am too dumb to get this", you could see a bug and be amazed by it with a tone like "interesting, where could this error have come from?"
Steps to debugging
- Identification and evaluation
- Isolation, Reduction, and Replication
Identification and evaluation
You write unit tests and have 100% test coverage on your application, and then your manager calls your attention to a problem that customers are having.
Client X says her cart is always empty, even when she adds items to it.
You are inclined to say "that shouldn't happen", but bugs are bound to happen. Your unit tests have only reduced the likelihood of bugs occurring. How do you identify what the problem is? You need a log or trace.
Logs are important, and it is always great to have some logging in place for your program. When program logs do not unmask the culprit, the server logs may contain bits of information that help track down the cause of the bug, but you may not know where to start or what to look for.
A lot of times, the obvious cause for the bug is shown in the program's stack trace. Before pulling every strand of your hair out, it is good to just carefully read through what errors the program had spat out. Stack traces include line and column numbers of the place of error. It is handy to know about tools like ack or ag for power-grepping through the codebase.
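As a rough illustration of searching a codebase for an identifier taken from a stack trace, here is a minimal sketch. It uses plain `grep` as a stand-in, since ack and ag may not be installed everywhere. The directory layout, file names, and the `cart.items` identifier are all made up for the example.

```shell
#!/bin/sh
# Sketch: locate every use of an identifier seen in a stack trace.
# The files and the "cart.items" identifier are hypothetical.
set -e

src=$(mktemp -d)
cd "$src"
mkdir -p cart checkout
printf 'function addItem(cart, item) {\n  cart.items.push(item)\n}\n' > cart/add.js
printf 'function total(cart) {\n  return cart.items.length\n}\n' > checkout/total.js

# -r: search directories recursively
# -n: print the line number of each match
# -F: treat the pattern as a literal string, not a regular expression
matches=$(grep -rnF "cart.items" cart checkout | sort)
echo "$matches"
```

With ack or ag the equivalent search is usually just the tool name followed by the pattern, run from the project root.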
A peek into tcpdump and strace could also save you a lot of time.
strace only works on Linux, but macOS has dtruss as an alternative.
To evaluate the bug, you may also need to reproduce the circumstances around its occurrence. To do that, there are questions to consider: Could it be due to network latency? Could an ad blocker be blocking a script? Could the user be behind a proxy that is blacklisted by the server? There are so many possibilities.
In a previous article, Charles Proxy was used to trick the browser by substituting a local file for a file located on the server. Charles Proxy can also be used to simulate a kind of MITM attack, where you intercept requests and make REST endpoints send different responses from what was originally sent. You can use Chrome DevTools for a lot of things like this: you can throttle the network and, according to this tweet,
2nd new favorite feature of Chrome Dev Tools is: Request Blocking! Simulate network failures. Achieve maximum rigidity!
– Dave Rupert (@davatron5000), June 29, 2017
you can block specific requests, simulating a network failure. There should be fallback measures taken for when things fail this way, that hopefully provide feedback to your users.
Rubberducking might also be a good way to evaluate the existing code and spot the bug (and maybe the reason for it). If you think buying a rubber duck to explain your code to is silly, or if you simply cannot afford to buy one, you can take advantage of self-messaging on a platform like Slack.
By going through all these evaluations, we may reproduce the bug and know what part of the code is responsible for it.
Isolation, Reduction, and Replication
When a bug has been identified, fixing it may require isolation or reduction. Code isolation is when you take an excerpt of a code base to create a reduced test case (RTC). This is also the point where you start considering getting help, if you are relatively new to the field. Before requesting help, you need to have an RTC that is easy on whoever is willing to help.
While they may sound similar, debugging by reduction is quite different from debugging by isolation. The term "Reduced Test Case" in isolation may make it even more confusing, but a good way to think about it is this: to isolate, you examine what you take out, and to reduce, you examine what you have left in. A reduced test case serves as a term for involving a second party to review your isolated code. It may be an isolation for you, but it is a reduced version of your problem for them.
You can reduce a codebase simply by gradually reversing your changes to a point where the bug does not occur, but this can be a very hectic and time-consuming task, so we do what programmers do best. Automate!
Version control systems are very important and used on most code bases today, so you can take advantage of them rather than building a new automated system for code reduction. With git, if you make frequent releases, you can check out a previous commit in history to see if the bug exists there.
$ git log --oneline --decorate --graph
* 2eff587 Add specs for X feature
* cd8ffb4 Replace imperative procedure with declarative library
$ git checkout 2eff587
This lists previous commits, and descriptive commit titles can help you find a point in history you would like to jump back to.
If the bug does not exist at that point in history, then you can go back to the future with
$ git checkout develop
assuming your working branch is develop. It could be master or some other branch.
Replicate the current working branch by copying it into a separate folder that is not tracked, and go back to the bug-free commit. Then gradually add things back in and see if the bug can be reproduced.
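The manual search through history can also be handed over to git itself with `git bisect`, which is not the approach described above but follows the same reduction idea: it binary-searches the commits between a known-good and a known-bad point. The sketch below builds a throwaway repository to demonstrate it. The repository, the `cart.conf` file, and the check script are all hypothetical, and the check would be replaced by whatever command reproduces your bug.

```shell
#!/bin/sh
# Sketch: let `git bisect run` find the first commit where a check starts failing.
# The repository, file names, and check are made up for illustration.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Dev"

# Build five commits; the "bug" (total=0) appears in commit 3 and stays.
for i in 1 2 3 4 5; do
  if [ "$i" -ge 3 ]; then
    echo "total=0" > cart.conf
  else
    echo "total=42" > cart.conf
  fi
  echo "change $i" >> notes.txt
  git add -A
  git commit -q -m "Change $i"
done

# The check lives outside the repository so checkouts cannot touch it.
# It must exit 0 on a good commit and non-zero (1-127, except 125) on a bad one.
check=$(mktemp)
echo 'grep -q "total=42" cart.conf' > "$check"

good=$(git rev-list --max-parents=0 HEAD)   # first commit, known to be bug-free

# Mark HEAD as bad and the first commit as good, then let git drive the search.
git bisect start HEAD "$good" > /dev/null
git bisect run sh "$check" > /dev/null

first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1           # return to the original branch
echo "Bug introduced by: $first_bad_subject"
```

In this toy history the search should land on the third commit. On a real project the quality of the result depends on the check reliably telling good from bad.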
A little problem with this is that you may have work you are not ready to commit yet, and git will refuse to check out another branch or commit if doing so would overwrite those uncommitted changes. Here is how you can get around that:
git add -A; git rm $(git ls-files --deleted) 2> /dev/null; git commit --no-verify -m "--wip-- [skip ci]"
That will create a work-in-progress (WIP) commit and let you check out any branch or old commit you wish. When you are done and need to remove the WIP commit, do this:
git log -n 1 | grep -q -c -e "--wip--" && git reset HEAD~1
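If you would rather not create a commit at all, `git stash` is a commonly used alternative to the WIP-commit trick. The following is only a sketch in a throwaway repository, and the file name and contents are invented for the example.

```shell
#!/bin/sh
# Sketch: park uncommitted work with `git stash`, visit an old commit, come back.
# The repository and file contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Dev"

echo "v1" > app.txt
git add -A
git commit -q -m "First version"
echo "v2" > app.txt
git add -A
git commit -q -m "Second version"

echo "half-finished idea" >> app.txt             # uncommitted work in progress

git stash push -q -u -m "wip before debugging"   # -u also stashes untracked files
git checkout -q HEAD~1                           # inspect the older commit
old_content=$(cat app.txt)

git checkout -q -                                # return to the branch we came from
git stash pop -q                                 # restore the uncommitted work
restored_last_line=$(tail -n 1 app.txt)
echo "old: $old_content, restored: $restored_last_line"
```

The stash keeps the working tree changes out of history, so there is no WIP commit to remember to undo afterwards.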
I think git is great for reduction, but if you think it is too complex or you have easier ways, then go for it. The goal is to apply a reduction strategy, not to follow strict technology rules.
Resolving the bug
- Disk usage
If an app is taking too long to load for a particular user, a likely problem could be the time required for content to be delivered. In such cases, you could consider moving to a CDN for content delivery. The size of assets being delivered can also play a big role here, as much as the problem can lie in the network. Compress and minify assets, and test that your server has a good Time to First Byte (TTFB). TTFB can be tested from the Network tab of devtools. If the TTFB of your development server is slower than that on production for your users, try to resolve that with the server administrator or operations department of your team.
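Outside the browser, TTFB can also be approximated from the command line. The sketch below wraps curl's `time_starttransfer` timing in a small helper. The function name `ttfb` and the example URL are placeholders, and the number reflects the machine running curl, not what your users experience.

```shell
#!/bin/sh
# Sketch: print the seconds curl waited before the first response byte arrived.
# The helper name `ttfb` is made up for this example.
ttfb() {
  # --silent hides the progress meter, --output discards the response body,
  # --write-out prints the selected timing variable after the transfer.
  curl --silent --output /dev/null --write-out '%{time_starttransfer}\n' "$1"
}

# Example call (placeholder URL, replace with your own endpoint):
# ttfb "https://example.com/"
```

Running it several times and from different networks gives a better picture than a single measurement.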
this is generally useful for any debugging process. Then, we can head over to the Memory tab where we are provided with options for the type of profiling we want:
By reducing memory consumption and managing resources properly, you can save users' device energy. An example is using requestAnimationFrame (rAF) rather than setInterval to drive animations, as browsers typically pause rAF callbacks when users move to a different browser tab. If a user suffers from quick energy drain, it is a bug that needs to be treated as seriously as any other bug.
You know WWW does not mean World Western Web, and some users are prone to network latency. If your users begin to have network problems, you need to simulate their network conditions, and, as mentioned earlier, Chrome DevTools and Charles Proxy handle this fine. In DevTools, just head over to the Network tab and use the throttling dropdown to simulate the network conditions your users would experience. Another common problem on the network is race conditions, so if you are loading multiple scripts, check that they are ordered according to their dependencies. You could add the defer attribute to some scripts to postpone their execution until the document has been parsed. You could also use async in rare cases.
If your application stores cached or temporary data on your user's device, and the user complains about storage being full each time they run your application, consider prioritizing what is stored and offloading temporary data to server-side storage where required. Stale caches may also cause problems here. In Chrome, you could open DevTools and right-click the reload button to check whether you are only having cache problems, or you could check Disable cache on the Network tab of DevTools.
duplication is far cheaper than the wrong abstraction
Write detailed commits, write documentation, and introduce code reviews, as someone else on your team may spot likely bugs before they hit production.
The steps given here are meant to help mostly with solo-debugging. Remember, there are communities you could always reach out to for help, once you are at the point where you create test cases. You could also report bugs to bug platforms of various browsers. Here are some: