Troubleshooting Unexpected Behavior

Embedded Linux systems share a number of common diagnostic strategies and tools that can help determine why a system is behaving unexpectedly. Employing some or all of these tactics may not completely solve the problem, but they can provide valuable information, or at least help shape the question you eventually bring to the support team.

The first and most valuable action you can take when a system appears to be malfunctioning is to connect it to a serial console, if at all possible. The blinking patterns of the LEDs are sometimes useful, but just as often they are user-programmable lights that have no meaning to our engineering team. The serial console's output, on the other hand, will frequently give you or our engineering team immediately actionable information that can save hours, if not days, of diagnostic head-scratching.

The serial console is the first user interface to come alive on the SBC as it ships from Technologic Systems. As a result, its output can show how the system is booting before more complicated software packages have begun to run, and it can also show why those packages are not running. For this reason, access to the serial console is paramount at the start of any diagnostic investigation: it is one of the first interfaces to provide information and access during boot and generally one of the last during shutdown, so it remains available when everything else has failed. If something is failing at any point in the usage cycle, the serial console is likely to be one of the most fundamental tools at your disposal.
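When saving console output for later review or for a support request, it can help to record when each line arrived. The following is a minimal sketch, not a replacement for a terminal program: it assumes the serial port has already been configured (baud rate and so on) by other means, and the function name is purely illustrative.

```python
import time

def timestamp_lines(stream, clock=time.monotonic):
    """Yield each line from an iterable of bytes, prefixed with the
    number of seconds elapsed since reading began."""
    start = clock()
    for raw in stream:
        # Console output is usually ASCII; replace anything undecodable
        # rather than stopping the capture.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        yield f"[{clock() - start:8.3f}] {text}"
```

As a usage sketch, a serial device opened in binary mode (for example with open("/dev/ttyUSB0", "rb"), where the device path is an assumption that depends on your adapter) can be passed in as the stream, and each yielded line written to a log file.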

In addition to the raw output from early boot or late shutdown, the serial console gives access to other diagnostic utilities within the software of the SBC itself. These vary from one SBC to the next, but the output of the Linux 'dmesg' command can frequently provide enlightening details, depending on the nature of the error. Other operating systems and interfaces provide similar access to basic system logs.
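Kernel logs can be long, so a first pass that pulls out suspicious lines is often worthwhile. The sketch below assumes Python is available on the system (or that the log has been copied to a workstation); the keyword list is an assumption and should be tuned to the board and drivers in question.

```python
import re
import subprocess

# Assumed keywords; adjust for the hardware and drivers being investigated.
SUSPECT = re.compile(r"error|fail|timeout|oops|panic", re.IGNORECASE)

def suspicious_lines(log_text):
    """Return the kernel log lines that match any of the assumed keywords."""
    return [line for line in log_text.splitlines() if SUSPECT.search(line)]

def read_kernel_log():
    """Run dmesg and return its output as text.
    Some systems restrict dmesg to root."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

Calling suspicious_lines(read_kernel_log()) on the target returns the matching lines. A match is only a starting point for investigation; many systems log harmless warnings during a normal boot.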

The next tool in early diagnostics is a digital volt meter (DVM), another essential instrument. If the device is not powering on at all, a DVM can quickly reveal a problem with the power coming into the device, or a hardware-related problem on the SBC itself. An SBC will frequently have several clearly labelled "test points" where the DVM probes can be used to check the internal power rails. This can speed a solution from the engineering team or, in the worst case, quickly establish that an RMA is the proper solution.
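Once readings have been taken at the test points, comparing them against the nominal rail voltages is simple arithmetic. The sketch below illustrates the idea; the rail names, nominal voltages, and the 5% tolerance are all assumptions, and the real values should come from the manual or schematic for the specific SBC.

```python
# Hypothetical rail names and nominal voltages; take the real values and
# test-point locations from the documentation for your SBC.
NOMINAL_RAILS = {"5V": 5.0, "3.3V": 3.3, "1.8V": 1.8}

def out_of_tolerance(measured, nominal=NOMINAL_RAILS, tolerance=0.05):
    """Return the rails whose reading deviates from nominal by more than
    the given fraction (5% is an assumed limit, not a board specification)."""
    problems = {}
    for rail, reading in measured.items():
        expected = nominal[rail]
        if abs(reading - expected) > tolerance * expected:
            problems[rail] = (reading, expected)
    return problems
```

For example, readings of 4.95 V, 2.9 V, and 1.81 V on these three rails would flag only the 3.3 V rail, since 2.9 V is more than 5% below nominal. Including the measured values in a support request gives the engineering team something concrete to work from.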

The dreaded "restore the software to factory spec" must also take its place on this list. Many single-board computers are deceptively simple devices with an underlying layer of software that represents literally millions of lines of code and thousands of hours of software engineering. Sometimes the fastest route to a solution is to write the factory default image to bootable media and observe whether the system behaves as expected with the default software configuration. This action can help weed out many possible hardware-related issues, as well as provide a foundation for the next diagnostic step:

Finally, a minimal working example: create a list of steps to replicate the issue. This is not just useful to the engineering team providing support; it is frequently the path to an answer before the support team is even contacted. The list can be a literal "steps to replicate" the issue, or it may be an adaptation of your software, stripped down to the minimum number of lines of code required to demonstrate the problem. Many times the act of creating a minimal working example is sufficient to expose the root cause and lead directly to a solution, without the need to wait for a supporting engineer's response.

There are of course many other excellent troubleshooting resources and tools. Honorable mention goes to the venerable oscilloscope and the logic analyzer, both excellent tools for troubleshooting signal issues when communication fails or when, as it is sometimes put, "things don't come up right".

Whatever the trouble may ultimately be, we hope this article has provided some useful information. If you're still having trouble making things work, feel free to contact us!
