Refining Your Troubleshooting Skills

Much of the electronic equipment we receive for repair, well over 75%, arrives with no description of the fault. Usually, there is just a tag with three words like “Won’t turn on” or “Fault light flashing”. This is OK, because we specialize in repair and refurbishment without knowing the fault or even the function of the equipment. But we would like to be sure our clients have gone through some form of fault finding and diagnosis to rule out any other machine issue causing the apparent equipment error.

Over the past 14 years, Rom-Control has repaired all types of industrial electronics, ranging from power supplies to microprocessor-based PLCs, AC and DC variable speed drives, and power conditioning and surge protection systems. The types of equipment seem limitless, as do the industries that present these control systems to us for repair. Therefore, we thought it might be worthwhile to mention a few checks you can make to ensure all other parts of the machine are working before suspecting the guilty-looking electronic brain.

First Things First

Remember that understanding the problem is key to successful troubleshooting. Try to reproduce the problem and note what steps are required to do so. When determining how to reproduce a problem, you will also want to understand how repeatable it is. For example, when you follow those steps, does the issue occur 100% of the time or only sometimes?
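To put a number on repeatability, it helps to tally your attempts. The short Python sketch below is only an illustration: the function name and the list of ten attempts are made up for the example, and True simply means the fault appeared on that attempt.

```python
def reproduction_rate(outcomes):
    """Return the fraction of attempts in which the fault appeared."""
    if not outcomes:
        raise ValueError("record at least one attempt")
    return sum(1 for seen in outcomes if seen) / len(outcomes)


# Hypothetical record of ten power-up attempts (True = fault appeared).
attempts = [True, False, True, True, False, True, True, True, False, True]
rate = reproduction_rate(attempts)
print(f"Fault reproduced in {rate:.0%} of {len(attempts)} attempts")
```

A rate well below 100% suggests an intermittent issue, which usually calls for more attempts per test before drawing a conclusion.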

Divide and conquer

In a typical control system, multiple components are working together. When issues arise, they could be caused by any of the components or the communication between them. If you are unsure of an issue’s root cause, create a few tests to help divide the possibilities. In some circumstances, it may be possible to simplify the system to reduce the data you need to analyze. This is particularly helpful when debugging code or communications. To do this, simplify the system as much as possible by disconnecting devices and/or disabling code, then check whether the issue is resolved. If it is, the cause lies in something you removed, and you can reintroduce the pieces in stages until the issue returns.
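The halving idea can be sketched in a few lines of Python. This is a simplified illustration, not a procedure for any particular machine: the component names and the fault_present check are hypothetical, and the sketch assumes exactly one component causes the fault on its own. On real equipment the check is a manual step, and anything the machine needs in order to run stays connected throughout.

```python
def find_faulty_component(components, fault_present):
    """Narrow a fault down to one component by halving the candidates.

    components: names of the devices or code sections under suspicion.
    fault_present: callable that takes the list of candidates left
        connected and returns True if the fault still occurs.
    Assumes exactly one candidate causes the fault on its own.
    """
    candidates = list(components)
    if not candidates:
        raise ValueError("need at least one component to test")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if fault_present(half):
            candidates = half                    # culprit is in the half left connected
        else:
            candidates = candidates[len(half):]  # culprit is in the half removed
    return candidates[0]


# Hypothetical machine: the fault appears whenever "encoder" is connected.
machine = ["power supply", "PLC", "drive", "encoder", "HMI", "network switch"]
culprit = find_faulty_component(machine, lambda connected: "encoder" in connected)
print(culprit)
```

With six suspects, halving finds the culprit in at most three tests, where removing items one at a time could take up to five.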

Trial and error

The trial-and-error method works best for issues that can be repeated rapidly. The best approach is to change only one variable at a time and observe the effect it has on the issue. When using this method, be sure to document the change and result with each trial. Otherwise, after 15 trials you could find yourself trying to remember the result of the third one.

For slower issues, such as those that are time-dependent, it’s possible to attempt several changes with each trial and record the result. Minimize the number of changes in each trial and make no more than two or three changes at a time. Keep in mind that variables changed together may be dependent on each other, which can further complicate troubleshooting.
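A trial log does not need to be elaborate. As one possible way to keep it, the Python sketch below writes each trial as a row of CSV text. The Trial fields and the three example entries are invented for illustration, and the output goes to an in-memory buffer here only to keep the example self-contained.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Trial:
    number: int
    change: str   # what was changed in this trial
    result: str   # what was observed afterwards


def write_trials(trials, stream):
    """Write the trials as CSV rows with a header line."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(Trial)])
    writer.writeheader()
    for trial in trials:
        writer.writerow(asdict(trial))


# Hypothetical entries; each trial changes a single variable.
trials = [
    Trial(1, "Swapped 24 V supply for a known-good unit", "Fault still present"),
    Trial(2, "Reseated I/O card in slot 3", "Fault still present"),
    Trial(3, "Replaced sensor cable", "Fault cleared"),
]

buffer = io.StringIO()
write_trials(trials, buffer)
print(buffer.getvalue())
```

To keep the log on disk, pass a file opened with open("trials.csv", "w", newline="") in place of the buffer; the file name is only an example.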

Log Data

Computer log files are a good place to begin troubleshooting if a control-system issue involves any servers, Windows-based HMIs, or other devices that keep log files. Every Windows computer has a tool called Event Viewer, which can be found under Administrative Tools in the Control Panel. Its “System” and “Application” logs often provide valuable data. Certain applications may also maintain their own log files, which should be checked.

Many PLCs can also log data points and create trend charts on the fly. This method is effective for any timing or PLC code-related issues. For intermittent issues, setting up some type of data logger within the PLC may also help to capture the issue when you’re not physically present.
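The idea behind capturing an intermittent issue can be shown with a small simulation. The Python sketch below runs on a PC, not on a PLC, and every name in it (PreTriggerLogger, the readings, the 12.0 fault threshold) is hypothetical. It keeps the most recent samples in a fixed-size buffer and saves a copy at the moment a fault condition is seen, so the lead-up to the fault is preserved.

```python
from collections import deque


class PreTriggerLogger:
    """Keep the last `depth` samples and save a copy when a fault is seen."""

    def __init__(self, depth):
        self._recent = deque(maxlen=depth)  # oldest samples drop off automatically
        self.captures = []

    def sample(self, timestamp, value, fault):
        """Record one sample; on a fault, store the buffered history."""
        self._recent.append((timestamp, value))
        if fault:
            self.captures.append(list(self._recent))


# Simulated readings as (time, value); a value above 12.0 counts as a fault.
logger = PreTriggerLogger(depth=3)
readings = [(0, 10.0), (1, 10.1), (2, 10.0), (3, 14.7), (4, 10.2)]
for timestamp, value in readings:
    logger.sample(timestamp, value, fault=value > 12.0)

print(logger.captures)
```

On a real PLC the equivalent would be built with the logging or trending facilities of that controller; the sketch only shows why a pre-trigger history is useful when nobody is watching the machine.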

Keep an open mind

A methodical approach to troubleshooting can help reduce the time and effort it takes to determine the root cause of an issue and then drive a solution. Keep an open mind during the troubleshooting phase: there may be an aspect of the issue that you initially misunderstood and that, in turn, is preventing you from finding a solution. When you feel that you are at the end of the road with no other options, take a step back and talk to others not familiar with the issue for a fresh look at your approach.