(C) 2009 Hank Wallace
This series of articles concerns embedded systems design and programming, and how to do it with excellence, whether you are new to the discipline or a veteran. This article is about debugging techniques.
So you’re in the lab and the marketing manager walks in with one of your babies that has been returned from a customer. “The gizmoflotchy is whining and the floobydust collector overflows on alternate Mondays.” The problem is exhibiting itself right in front of your face. What to do?
You cannot connect the emulator because the proper adapter is not soldered to the PCB. You don’t want to depower the unit to do that because the CPU will reset and the problem will go away. The only tools you have are a scope and a meter, and what you can divine from event logs and terminal ports.
But you are not stumped because you know that the best emulator on planet Earth is the one between your ears. You wrote the program, and it’s still running in your brain. Emulators can be a crutch that discourages thinking, and they are no substitute for understanding how the program works, but you don’t have that problem.
Here are some things you can do:
- Collect all the information possible in the software domain. Event logs, crash dumps, lists of running threads and tasks, state machine states — all these things can give you a hint.
- Scope the hardware. Which devices are being polled or otherwise selected? What memory area is active? Which interrupt lines are active, and are their rates proper? Are the power supplies in spec? Are there any overheated parts?
- Make a list of all the possible areas of the design that could be causing the issue. Don’t rule anything out.
- Scan the code sections that are suspect.
- Consider all the data you have collected and brainstorm scenarios that could reproduce the problem. Try those on other units.
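The event logs in the first item only exist if the firmware was built to keep them. As a minimal sketch of one way to do that, assuming a small RAM ring buffer that a terminal port command could later dump, something like the following would work. All the names here (`evlog_put`, `evlog_get_recent`, `EVLOG_SIZE`, the entry fields) are hypothetical and not taken from any particular product.

```c
#include <stdint.h>

/* Hypothetical event log: a fixed-size ring buffer in RAM.
   The size is a power of two so the index can wrap with a mask. */
#define EVLOG_SIZE 16u

typedef struct {
    uint32_t tick;  /* timestamp, e.g. the system tick count */
    uint8_t  code;  /* which event occurred */
    uint16_t arg;   /* one event-specific detail */
} evlog_entry_t;

static evlog_entry_t evlog[EVLOG_SIZE];
static uint32_t evlog_count;  /* total events logged since reset */

/* Record an event, overwriting the oldest entry once the buffer is full.
   If this is called from both interrupt and main-line code, the caller
   must protect it (for example by briefly disabling interrupts). */
void evlog_put(uint32_t tick, uint8_t code, uint16_t arg)
{
    evlog_entry_t *e = &evlog[evlog_count & (EVLOG_SIZE - 1u)];
    e->tick = tick;
    e->code = code;
    e->arg  = arg;
    evlog_count++;
}

/* Copy out the n-th most recent entry (n = 0 is the newest).
   Returns 1 on success, 0 if that entry is not in the buffer.
   This sketch ignores wraparound of the 32-bit event counter. */
int evlog_get_recent(uint32_t n, evlog_entry_t *out)
{
    uint32_t held = (evlog_count < EVLOG_SIZE) ? evlog_count : EVLOG_SIZE;
    if (n >= held) {
        return 0;
    }
    *out = evlog[(evlog_count - 1u - n) & (EVLOG_SIZE - 1u)];
    return 1;
}
```

A debug command on the terminal port could walk `evlog_get_recent` from 0 upward and print each entry, giving you the last few events leading up to the misbehavior without resetting the unit.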
You have to have an open mind. A customer of mine was complaining that a program I wrote was resetting periodically, or locking up. It turned out to be an op-amp on an instrumentation board which was crowbarring with negative transients, preventing the analog signal from getting to the ADC. It was not my problem at all, but I had to rule out all software causes before they would seriously look at the hardware.
To help in situations like this, I generally leave test points on the PCB and drive them with signals indicative of certain functions within the product. For example, a timer interrupt should drive one test point, producing a pulse on every tick. Serial or data interrupts should do likewise. Passes through the main loop or through each thread can similarly pulse a test point or squirt some data out a debug serial port. A scope is good for checking signal activity, but more importantly it can view tracepoints in a running system without any emulator attached.
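The test point idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: `TESTPOINT_PORT` stands in for a memory-mapped GPIO output register (here it is a plain variable so the sketch compiles on any machine), and the bit assignments and function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped GPIO output register (assumption).
   On real hardware this would be the port wired to the test points. */
static volatile uint8_t TESTPOINT_PORT;

/* Hypothetical bit assignments, one test point per function. */
#define TP_TIMER_TICK ((uint8_t)(1u << 0))
#define TP_MAIN_LOOP  ((uint8_t)(1u << 1))

/* Note: these are read-modify-write operations. If interrupt and
   main-line code share one port, use the part's atomic set/clear
   registers if it has them, or guard the access. */
static void tp_high(uint8_t mask)
{
    TESTPOINT_PORT = (uint8_t)(TESTPOINT_PORT | mask);
}

static void tp_low(uint8_t mask)
{
    TESTPOINT_PORT = (uint8_t)(TESTPOINT_PORT & (uint8_t)~mask);
}

static void tp_toggle(uint8_t mask)
{
    TESTPOINT_PORT = (uint8_t)(TESTPOINT_PORT ^ mask);
}

static volatile uint32_t system_ticks;

/* Timer interrupt handler: the test point is high for the duration
   of the handler, so the scope shows one pulse per tick. */
void timer_tick_isr(void)
{
    tp_high(TP_TIMER_TICK);
    system_ticks++;
    tp_low(TP_TIMER_TICK);
}

/* Call once per pass through the main loop: the test point changes
   state on every pass, producing a square wave on the scope. */
void main_loop_pass(void)
{
    tp_toggle(TP_MAIN_LOOP);
}
```

With this arrangement, the pulse rate on the timer test point shows whether the tick interrupt is running at the proper rate, and the pulse width shows how long the handler takes. A main loop test point that stops changing tells you the loop has hung, even when nothing else about the unit is observable.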
And if you connect an emulator to a target for debug, understand that emulators that require changes to the program (such as PIC) invoke the Heisenberg Uncertainty Principle, or more precisely the observer effect: measuring a system changes it. That is, when you connect the emulator and change the system in order to instrument it, you may disturb it enough to make the problem go away.
I’m always very careful when using an emulator. Emulators are sometimes poorly designed, so we have to be cautious when counting on them for debugging services. All tools must be used intelligently.
Hank Wallace is the owner of Atlantic Quality Design, Inc., a consulting firm located in Fincastle, Virginia. He has experience in many areas of embedded software and hardware development, and system design. See www.aqdi.com for more information.