Understand the machine
By Alan Friedman
When using vibration analysis or other predictive maintenance technologies, it is important to first identify the problem you want to solve and then consider the appropriate test or technology to solve it. More often than not, the cart is put before the horse and a technology is used with no particular goal in mind. We take tests because we are supposed to take tests, but we don’t know what we are testing for, nor do we have a clear economic justification for the testing. In the context of predictive maintenance, we could rephrase this and state that the first step in a predictive maintenance program is to “understand the machine.”
Understanding the machine, or the asset, has many layers of meaning. From a purely technical standpoint, the presumption is that we are using a predictive maintenance (PdM) technology to determine whether the machine or asset is degrading over time, operating out of specification or on the verge of failure. It follows that we need some idea of how the machine behaves when it is healthy and how it degrades over time or fails. In fact, if we do a mental study of how the machine fails, what we should test for should become obvious.
It is also important to understand how quickly the machine fails, as this tells us how frequently to test it, if it is worth testing at all. For example, if we have a knife that cuts some parts and we know the knife can cut 50 of them before becoming dull, then we don’t need to test the knife for sharpness; we simply change it after 50 parts are cut. On the other hand, a dynamic instability in a high-speed turbine can take it from perfect health to catastrophic failure in a matter of minutes. In this case it doesn’t make sense to test the machine on a quarterly basis. Instead, we should have a continuously monitoring protection system. A more common case is the machine that slowly wears out over time and can be tested for this wear. Test frequency should be based on the rate at which the machine fails, and it can be increased as the machine begins to show signs of deterioration.
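The three cases above can be sketched as a small decision rule. This is only an illustration under stated assumptions, not a prescribed method: the function names, the notion of a "warning time" (how long a detectable defect takes to become a failure), the default of three tests per warning window and the halving of the interval once deterioration appears are all hypothetical choices.

```python
from enum import Enum


class Strategy(Enum):
    SCHEDULED_REPLACEMENT = "replace on a fixed schedule, no testing"
    CONTINUOUS_MONITORING = "continuous protection system"
    PERIODIC_TESTING = "periodic testing"


def choose_strategy(life_is_predictable: bool,
                    warning_time_days: float,
                    shortest_practical_interval_days: float = 1.0) -> Strategy:
    """Pick a strategy from how the machine fails (illustrative rule only)."""
    if life_is_predictable:
        # The knife that dulls after a known number of cuts.
        return Strategy.SCHEDULED_REPLACEMENT
    if warning_time_days < shortest_practical_interval_days:
        # The turbine that fails within minutes: periodic tests cannot catch it.
        return Strategy.CONTINUOUS_MONITORING
    # The machine that wears slowly enough to be caught by periodic tests.
    return Strategy.PERIODIC_TESTING


def interval_between_tests_days(warning_time_days: float,
                                tests_per_warning_window: int = 3,
                                deteriorating: bool = False) -> float:
    """Space tests so that several fall inside the warning window."""
    interval = warning_time_days / tests_per_warning_window
    if deteriorating:
        # Assumed policy: test twice as often once wear has been seen.
        interval /= 2
    return interval


if __name__ == "__main__":
    # Hypothetical machine that gives roughly 180 days of warning.
    print(choose_strategy(False, warning_time_days=180).value)
    print(interval_between_tests_days(180))
    print(interval_between_tests_days(180, deteriorating=True))
```

The point of the sketch is the order of the questions: how the machine fails decides whether to test at all, and how fast it fails decides how often.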
To determine whether a machine is operating out of spec or is degrading, we must have some idea of what it looks like when it is healthy. This is called a “baseline” and forms the starting point or reference for later comparisons. If the asset is not new and we are not sure of its current health, then it is okay to use its current condition, whatever that may be, as a reference. I may not be in perfect physical health today, but I can certainly use today as a reference to determine whether my health is deteriorating. That may be more important to me than knowing how my health compares to some ideal of perfect health. How this reference is measured depends, of course, on how the machine degrades or fails and how we plan to test for this. If we are using vibration analysis, the baseline should be collected while the machine is operating in a particular state (speed and load), and future comparisons should be made with the machine tested under the same conditions.
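A comparison against the baseline might look like the following minimal sketch. The `Reading` fields, the tolerances on speed and load, and the use of a simple amplitude ratio are assumptions made for illustration; a real program would compare whatever quantities its chosen test produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    speed_rpm: float   # operating speed when the data was taken
    load_pct: float    # operating load, percent of rated load
    amplitude: float   # overall vibration level, in any consistent unit


def same_conditions(baseline: Reading, current: Reading,
                    speed_tol: float = 0.05,
                    load_tol_points: float = 10.0) -> bool:
    """True if speed and load are close enough to compare (assumed tolerances)."""
    speed_ok = abs(current.speed_rpm - baseline.speed_rpm) <= speed_tol * baseline.speed_rpm
    load_ok = abs(current.load_pct - baseline.load_pct) <= load_tol_points
    return speed_ok and load_ok


def change_from_baseline(baseline: Reading, current: Reading) -> float:
    """Return current amplitude as a multiple of the baseline amplitude."""
    if not same_conditions(baseline, current):
        raise ValueError("operating conditions differ from the baseline; retest")
    return current.amplitude / baseline.amplitude


if __name__ == "__main__":
    # Hypothetical readings: the baseline need not represent perfect health.
    baseline = Reading(speed_rpm=1780, load_pct=80, amplitude=2.0)
    today = Reading(speed_rpm=1775, load_pct=82, amplitude=3.0)
    print(change_from_baseline(baseline, today))
```

Refusing to compare readings taken at a different speed or load is deliberate: a change in operating state can look like a change in health.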
Once a baseline is in place and a test plan determined, we arrive at the point where the data must be analyzed and the condition of the machine assessed. This is most easily done by beginning with a list of the faults you are looking for (typically beginning with the most common) along with the indications these faults produce in the data. Instead of looking at the data and trying to interpret its meaning, begin with the list of faults and indications and, step by step, check the data to see whether each fault is present. That is to say, begin by thinking about what you are testing for and then see whether each test is positive or negative. It sounds logical, but most people take the opposite approach!
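The checklist approach can be sketched in a few lines. Everything specific here is a placeholder chosen for illustration: the fault names, the indication attached to each fault, the data keys and the threshold of twice the baseline are not diagnostic rules, and the data is assumed to be already expressed as a ratio to the baseline.

```python
from typing import Callable, Dict, List, NamedTuple


class FaultCheck(NamedTuple):
    fault: str
    indicated: Callable[[Dict[str, float]], bool]


# Assumed threshold: flag anything at least twice its baseline level.
THRESHOLD = 2.0

# Ordered with the (assumed) most common fault first.
CHECKLIST: List[FaultCheck] = [
    FaultCheck("unbalance", lambda d: d["1x"] >= THRESHOLD),
    FaultCheck("misalignment", lambda d: d["2x"] >= THRESHOLD),
    FaultCheck("bearing wear", lambda d: d["bearing_band"] >= THRESHOLD),
]


def run_checklist(data: Dict[str, float],
                  checklist: List[FaultCheck] = CHECKLIST) -> Dict[str, bool]:
    """Start from the faults, then ask the data a yes/no question for each."""
    return {check.fault: check.indicated(data) for check in checklist}


if __name__ == "__main__":
    # Hypothetical measurement, each value a multiple of its baseline.
    measurement = {"1x": 2.5, "2x": 1.1, "bearing_band": 1.0}
    for fault, positive in run_checklist(measurement).items():
        print(f"{fault}: {'positive' if positive else 'negative'}")
```

Each entry answers one question with a positive or a negative, so the analysis never starts from an unexplained pile of data.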
Beyond the purely technical aspects of understanding how the machine or asset fails and how to test for it, we must also consider the consequences of the machine failing. This gives us an economic incentive for testing the machine as well as a way to quantify the results of our actions and the return on investment for our monitoring program. For example: I have a $20 fan on my desk that I could test for unbalance and bearing problems using a vibration analysis system. I also know that if I frequently balance the fan I can extend its operating life. The consequences of the fan failing, however, are nil and the costs of testing and balancing it are higher than the cost of replacing it. Therefore, although I can be successful from a technical standpoint, I won’t be successful from an economic standpoint.
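The desk fan example can be put into rough numbers. Only the $20 price comes from the example above; the failure rate, the fraction of failures prevented and the annual cost of testing and balancing are invented figures, and the function itself is a hypothetical simplification that ignores downtime, safety and secondary damage.

```python
def annual_net_benefit(cost_per_failure: float,
                       failures_per_year: float,
                       fraction_prevented: float,
                       annual_program_cost: float) -> float:
    """Failure cost avoided per year minus what the monitoring costs per year."""
    avoided = cost_per_failure * failures_per_year * fraction_prevented
    return avoided - annual_program_cost


if __name__ == "__main__":
    # Desk fan: failure only costs a $20 replacement (consequences are nil).
    # Assumed: it would fail once every two years, monitoring prevents every
    # failure, and testing plus balancing costs $200 a year.
    fan = annual_net_benefit(cost_per_failure=20.0,
                             failures_per_year=0.5,
                             fraction_prevented=1.0,
                             annual_program_cost=200.0)
    print(f"Net benefit of monitoring the fan: ${fan:.2f} per year")
```

A negative result means the program is technically feasible but not worth running; the same calculation with a costly failure can come out strongly positive.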
If we consider both the technical and economic contexts of our program before we begin testing our assets, then we will be on the road to having a program that is both technically and economically sound. From the technical standpoint, we will choose the correct technology to monitor the asset, and because we will know what we are testing for, our analysis will be direct and simple. By framing this in an economic context from the start, we will have a way to justify the program and to measure its success over time. In conclusion, determine the problem first and then look for solutions.