Reliability Prediction for Mechatronic Drive Systems

Dipl.-Ing. Sebastian Bobrowski, Prof. Dr.-Ing. Wolfgang Schinköthe, University of Stuttgart, Institute of Design and Production in Precision Engineering (IKFF), Stuttgart, Germany, bobrowski@ikff.uni-stuttgart.de
Dr. rer. nat. Maik Döring, Prof. Dr. rer. nat. Uwe Jensen, University of Hohenheim, Institute of Applied Mathematics and Statistics (IAMS), Stuttgart, Germany, maik.doering@uni-hohenheim.de

Abstract

At present, few failure data are available for many systems and components, especially for mechatronic, electromechanical and mechanical components of machines and devices. Manufacturers gain information about the reliability and lifetime of their products for a specific application from endurance tests at customer-specific operating conditions. These experiments provide specific failure time data. However, they are time-consuming and expensive. Statistically sound statements require tests with adequate test lot sizes, yet it is impossible to cover all imaginable combinations of applied load profiles and impact parameters. In addition, findings gained through experiments are valid only for the specific test conditions and loads applied. On the other hand, developers need meaningful key data characterizing the applied components as early as possible in order to determine the overall reliability of the device or machine. Often, modified components based on the same technology are used with other load profiles. The available test data, however, cannot be applied as they are and first need to be prepared. Using a variety of existing data sets from endurance tests of similar components and other load cases, we can derive prognoses for newly developed components under new application environments. For this purpose, we further develop, adapt and test stochastic models based on well-known regression models of Survival Analysis for engineering applications.
The final objective of this research project is to develop prognosis tools that statistically predict the failure behaviour of mechatronic systems for values of the impact parameters that were not tested. The tools should be able to incorporate existing failure data into the prognosis. We demonstrate examples using data sets of DC motors and planetary gear drives, which were recorded at the IKFF facilities.

1 Introduction

1.1 Use of Test Data

In order to investigate the failure behaviour of mechatronic components and systems and to prove their reliability, suppliers and manufacturers perform endurance tests. The specimens, which are assumed to be identical, are tested with different load profiles at conditions close to the particular consumer application. The resulting test data, however, are often used only to achieve the goals of the specific data acquisition. Endurance tests require a high input of financial and hardware resources, for example test benches and specimens. For economic reasons, companies try to minimize the number of experiments. Consequently, often only few, heterogeneous test data are available, recorded using test lots with low numbers of specimens. The full potential of these heterogeneous experiments is not tapped. Simple tools and methods for using the test data in further reliability predictions are currently missing.

1.2 Load and Load Capacity

The reliability depends on many influences.

[Fig. 1: diagram of a test lot of identical specimens; failure probability over lifetime; spreading of the failures; failure probability distribution functions at operating conditions A and B; influences on the reliability.]
Fig. 1 Influence of different operating conditions on the reliability of identical specimens

Load. The load profile consists of type, intensity and duration of the load.
Not only the load profile and surrounding influences (such as temperature, humidity, salt water, dust and sand), but also the exact position and orientation can have a significant influence on the reliability, depending on the type and application of the specimens. These influencing factors affect the failure mechanisms, which can be simple or complex, depending on the system. In reality, the resulting test data for lifetimes are subject to spreading that can be described statistically. Fig. 1 illustrates the failure probability functions under two different operating conditions.

ISBN 978-3-8007-3537-2 75 VDE VERLAG GMBH Berlin Offenbach

Load Capacity. Failure probability distribution functions also vary when different specimens, a different design size or different subcomponents inside the specimen (for example different materials or tribology) are used. These differences originate from a varying, modified or related failure mechanism (for example effects of scaling of size). Both the load itself and the load-bearing capacity of the specimen affect the shape and location of the failure probability distribution function.

1.3 Benefit of a Reliability Prediction

As Fig. 2 illustrates, key figures of reliability for mechatronic drive systems can be estimated by means of reliability predictions. Endurance test results can be visualized in order to identify the current state of reliability and to derive recommendations for the component design.

[Fig. 2: diagram with the elements test planning, endurance tests, test results from related experiments, reliability prediction (also for operating conditions not tested), estimates, visualization, failure probability distributions, reliability key figures, reliability goals, decisions, design changes, material selection, changes of geometry.]
Fig. 2 Benefit of a reliability prediction

1.4 Exemplary Specimens

We demonstrate examples using data sets of DC motors with brushes in 3 different voltage types and of precision planetary gear drives with different numbers of stages and gear ratios [1, 5]. Within each of the two categories, the specimens share a comparable technology, which is an important requirement for modelling. The tested DC motors have equal housing diameter and length.
For different motor types (different nominal voltages), the coils inside the motors are different. For the planetary gears, the modular design allows several similar stages with equal module to be connected in series. Two different stage reduction ratios were investigated in our study. The two categories of specimens (Fig. 3) are very different and require an adaptation of the models. Comparing the procedure for setting up the prediction model for different system types yields further knowledge towards a general methodology for creating adequate regression models.

Fig. 3 Exemplary examination of different categories of specimens

1.5 Ways to generate a Reliability Prediction

There is a variety of approaches to setting up a reliability prediction. Fig. 4 shows established procedures applied by the industry. At its simplest, experts are consulted in order to give a rough estimate of the reliability behaviour of the systems considered. Alternatively, existing test data are plotted, for example in a Weibull diagram. On the basis of the plotted Weibull lines, the shape and location of the probability function of new systems are estimated.

[Fig. 4: diagram with the elements know-how of experts / experience of past tests; expert estimates (partly by intuition); manual test data analysis; estimates, for example assuming Weibull distributions; physical model; analytic description of failure mechanisms; regression model using related failure data; regression over impact parameters.]
Fig. 4 Exemplary approach to create reliability predictions for mechatronic drive systems

If the physical interactions of a failure mechanism (for example linear wear) are known, a physical lifetime model can be set up. In this model, the failure behaviour is expressed as a function of the impact parameters. If the system interactions are rather opaque, this approach becomes more difficult.
The mechanisms are often unknown in detail (for example electrical brush sparking at DC motors or impacts of tribology). Mutual dependencies might exist and intensification due to various superposed mechanisms can occur.
In most cases, not all quantities and interrelations that influence the lifetime can be monitored and controlled. Therefore, lifetimes are described by a statistical model using random variables. The unknown parameters of the statistical model are estimated from the lifetime data recorded in tests. Subsequently, key figures of the lifetime distributions can be evaluated. With this procedure, no description of the physical principles of the specific failure mechanism is necessary.

By means of regression models like the Cox proportional hazards model (see Section 2.2), lifetimes of different technical systems with similar components or load profiles can be described. During the modelling stage, the impact parameters, so-called covariates, which characterize the differences between the components or load profiles, have to be determined. The choice of covariates relies on physical or technical knowledge. Based on the variety of existing data from endurance tests of related components and a possibly small number of lifetime tests with the new component, regression and distribution parameters are estimated. Afterwards, predictions of the failure behaviour can be provided, also for untested load cases or new components. Finally, prediction tools that make use of existing failure data can be developed.

2 Statistical Modeling

2.1 Procedure

The execution and the evaluation of endurance tests are closely connected (Fig. 5). First, the regression model should be specified. The key impact parameters on the failure mechanism of the specimens investigated in the tests have to be recorded. Based on the data acquired during the tests, the model parameters are estimated. Subsequently, lifetime distributions or derived key figures can be predicted and visualized for arbitrary values of the explanatory variables (and thus for arbitrary load cases).
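The data that such a test campaign produces could be organised as in the following minimal sketch. It is only an illustration under stated assumptions: the names `LifetimeRecord`, `build_records` and `load_torque_mNm` are hypothetical, and the numbers are invented, not measurements from this study. A specimen that is still running when the test is stopped is stored with the stop time and marked as not failed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LifetimeRecord:
    """One specimen of an endurance test (hypothetical layout)."""
    time: float      # observed time in hours: failure time or end of test
    failed: bool     # True: failure observed, False: still running at test stop
    covariates: Dict[str, float] = field(default_factory=dict)


def build_records(failure_times: List[Optional[float]],
                  stop_time: float,
                  covariates: Dict[str, float]) -> List[LifetimeRecord]:
    """Convert the raw observations of one test lot into records.

    A value of None, or a failure later than stop_time, means that no
    failure was observed before the test was stopped.
    """
    records = []
    for t in failure_times:
        if t is None or t > stop_time:
            records.append(LifetimeRecord(stop_time, False, dict(covariates)))
        else:
            records.append(LifetimeRecord(t, True, dict(covariates)))
    return records


# Invented example lot: four specimens at one load level, test stopped at 900 h.
lot = build_records([410.0, 655.0, None, 980.0],
                    stop_time=900.0,
                    covariates={"load_torque_mNm": 5.0})
for record in lot:
    print(record)
```

Keeping the covariate values with every record means that lots tested at different loads or with different component types can later be pooled into one data set for the regression.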
[Fig. 5: flow chart. Execution of the test: record of experimental lifetime data (failure times, censoring indicators, covariate values). Evaluation of the test data: model setup / model choice (covariates); estimate of model parameters (regression coefficients, baseline hazard rate); deduction of reliability characteristics, also for non-measured values of covariates (mean time to failure, distribution functions, key figures).]
Fig. 5 Procedure to predict the reliability using the Cox regression model

The analysis of lifetimes requires mathematical models. Let $F(t) = P(T \le t)$ be the distribution function of the lifetime $T$ of an object; $F(t)$ describes the probability of a failure up to time $t$. Further, let $f(t) = dF(t)/dt$ be the density function and let $\lambda(t)$ be the hazard rate (failure rate) at time $t$, defined by

\[ \lambda(t) = \frac{f(t)}{1 - F(t)}. \]

Each of the three functions, the hazard rate, the density function and the distribution function, characterizes the distribution of the lifetime $T$. Simple classes of lifetime models are the parametric families of distributions like Weibull or lognormal. In many examples, the populations of systems are more heterogeneous, so that more complex models should be used. The regression models of Survival Analysis, which were originally developed for medical applications, are quite useful, since they take two features into account:

Censoring. In medical studies, a patient may still be alive at the point in time when the study is closed and the data are to be analyzed. Similarly, in mechanical engineering, endurance tests are stopped for economic reasons while some of the observed systems are still working. A lifetime $T$ is said to be right censored if one only knows that $T$ is larger than an observed right censoring value.

Covariates.
Returning to a medical study, the objective may be to compare the effects of different treatments on the survival time, possibly correcting for information available on each patient such as age and disease progression indicators. Likewise, life tests involve different technical systems or different surrounding conditions. Additional variables like load or temperature, so-called covariates, are introduced so that the objects can be characterized in more detail. This leaves us with a statistical regression problem.

In the regression models, the connection between censoring, covariates and the distribution of the lifetime is modeled by the intensity, which is related to the hazard rate. Roughly speaking, the intensity is the probability that an observed technical system, subject to risk, will fail in the next instant. Basically, there are two approaches for modeling the intensity, the Aalen model and the Cox model. In the Aalen model (additive model), the intensity is given by a linear combination of the covariates. In the Cox model (proportional hazards model), the ratio of the intensities of two objects is time-independent. The Aalen model is more flexible, but it requires a larger amount of data. Therefore, the Cox model is considered in the following.

2.2 The Cox Proportional-hazards Regression Model

One of the most popular regression models is the Cox model (or proportional hazards model). For each object $i$, $i = 1, \ldots, n$, there are, in addition to the possibly censored lifetime, $k$ observed covariates $Y_{i,1}, \ldots, Y_{i,k}$, which describe the object and the environmental conditions in more detail. Given the vector of covariates $Y$, the conditional hazard of the lifetime $T$ at time $t$ is

\[ \lambda(t; Y) = \lim_{h \to 0+} \frac{P(T \le t + h \mid T > t, Y)}{h}. \]

Cox suggested that this so-called intensity could be modeled as the product of an unspecified deterministic baseline hazard $\lambda_0$ and an exponential function with an argument linear in the covariates. This leads to the following model of the intensity for object $i$:

\[ \lambda_i(t) = \lambda_0(t) \, R_i(t) \, \exp\left( \beta_1 Y_{i,1} + \ldots + \beta_k Y_{i,k} \right), \]

where $R_i$ is the risk indicator, equal to one as long as object $i$ is observed (at risk), and $\beta = (\beta_1, \beta_2, \ldots, \beta_k)^T$ is the vector of the unknown regression parameters. Based on the lifetime data of an experiment, the regression parameters and the baseline hazard function $\lambda_0$ have to be estimated statistically. For each object, the data consist of the possibly censored failure time $T$; an indicator equal to one if $T$ is a true failure time and zero if it is censored; and the vector of explanatory variables $Y$.

The Cox model itself makes three assumptions: first, that the ratio of the hazards of two objects is the same at all times; secondly, that the explanatory variables act multiplicatively on the hazard; and thirdly, that conditionally on the explanatory variables, the failure times of two individuals are independent. As in all regression models, one also assumes that the explanatory variables have been transformed so that they may be entered without further transformation and that all interactions have been included explicitly.

The vector of the regression coefficients $\beta$ is estimated by maximizing the so-called partial likelihood. Having computed $\hat{\beta}$, the estimated vector of regression coefficients, one can calculate the estimate of the cumulative baseline hazard

\[ \Lambda_0(t) = \int_0^t \lambda_0(s) \, ds \]

by the estimator of Breslow, which also follows the maximum likelihood idea. The intensity function itself can be estimated by taking a smoothed derivative of the cumulative hazard.
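The two estimation steps can be illustrated with a minimal, self-contained sketch for a single covariate. Several points are assumptions made for illustration only: the toy data are invented (they are not measurements from this study), all function names are hypothetical, and the maximum of the partial likelihood is located by bisection on its derivative (the score), which is a simple substitute for the Newton-type iterations that statistical software typically uses. The prediction uses the relation $F(t \mid y) = 1 - \exp(-\Lambda_0(t) \, e^{\beta y})$, which follows from the definition of the hazard rate.

```python
import math

# Invented toy data: time in hours, failure indicator (0 = right censored),
# and one covariate (load torque in mNm).
times = [900.0, 1200.0, 1500.0, 500.0, 700.0, 1000.0, 300.0, 450.0, 600.0]
failed = [1, 1, 0, 1, 1, 1, 1, 1, 1]
load = [2.5, 2.5, 2.5, 5.0, 5.0, 5.0, 7.5, 7.5, 7.5]


def risk_set(t, times):
    """Indices of all objects still under observation at time t."""
    return [j for j, t_j in enumerate(times) if t_j >= t]


def partial_score(beta, times, failed, y):
    """Derivative of the log partial likelihood for a single covariate."""
    score = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, failed)):
        if d_i:  # censored objects enter only through the risk sets
            weights = [(math.exp(beta * y[j]), y[j]) for j in risk_set(t_i, times)]
            s0 = sum(w for w, _ in weights)
            s1 = sum(w * y_j for w, y_j in weights)
            score += y[i] - s1 / s0
    return score


def fit_beta(times, failed, y, lo=-10.0, hi=10.0, iterations=100):
    """Solve score(beta) = 0 by bisection; the score is decreasing in beta."""
    if not partial_score(lo, times, failed, y) > 0.0 > partial_score(hi, times, failed, y):
        raise ValueError("no root bracketed; the estimate may not exist")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if partial_score(mid, times, failed, y) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def breslow_steps(beta, times, failed, y):
    """Breslow estimate of the cumulative baseline hazard as (time, value) steps."""
    steps, total = [], 0.0
    for t_i in sorted(t for t, d in zip(times, failed) if d):
        total += 1.0 / sum(math.exp(beta * y[j]) for j in risk_set(t_i, times))
        steps.append((t_i, total))
    return steps


def failure_probability(t, y_new, beta, steps):
    """Predicted F(t | y_new) = 1 - exp(-Lambda_0(t) * exp(beta * y_new))."""
    cumulative = max((value for t_i, value in steps if t_i <= t), default=0.0)
    return 1.0 - math.exp(-cumulative * math.exp(beta * y_new))


beta_hat = fit_beta(times, failed, load)
steps = breslow_steps(beta_hat, times, failed, load)
p_untested = failure_probability(700.0, 6.5, beta_hat, steps)
print(f"beta = {beta_hat:.3f}, F(700 h | 6.5 mNm) = {p_untested:.3f}")
```

Because all risk sets contribute to the estimate of the regression coefficient and of the baseline hazard, the prediction for an untested covariate value such as 6.5 mNm draws on the data of every load level, not only on the neighbouring ones.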
The standard Cox model and the estimation methods are implemented in most statistical software tools [2, 4, 6].

3 Exemplification of Model Use

3.1 Calculation Results regarding DC Motor Data

To obtain an overview of the underlying data base for a particular prediction, failure times were associated with the empirical failure probabilities. This makes it possible to sketch the approximate shape and location of the failure probability functions. Fig. 6 shows failure time data of DC motors with brushes. The empirical distribution functions of the failure times for the load levels of 2.5, 3.75, 5, 6.25 and 7.5 mNm are plotted using dashed lines. A high load results in early failure times. The displayed data were recorded on real systems, which is why the deviations that occurred were recorded as well. The failure times partly exhibit high spreading. This spreading can have different, superposed causes, for example influences of the test bench, differences from production batch to batch, and differences from specimen to specimen. The longer the elapsed time, the larger the spreading of the failure times of the specimens considered.

A Cox model with the covariate load torque can be fitted to the data. After all model parameters are determined, prediction curves for different values of the covariate load torque can be plotted. In the graph, a prediction for the distribution function at the load level 6.5 mNm is displayed as an example. This curve does not lie between the empirical distribution functions for the load levels 6.25 mNm and 7.5 mNm. The reason is that the data of all load levels influence this prediction. The course of result plots generated in this manner for different load torques can be illustrated in a plot of moving curves.

[Fig. 6: plot titled "A Cox Model with only one Covariate: Load Torque"; load levels of DC motors, 12 V: 2.5, 3.75, 5, 6.25 and 7.5 mNm; prediction for 6.5 mNm.]
Fig. 6 Data base and prediction for a DC motor type with a nominal voltage of 12 V

Different voltage types exist for the investigated DC motors. These differ only in the applied coil, which results in a different nominal voltage. For different types (for example 12 V, 18 V and 24 V), we can assume that a related failure mechanism is present. Ideally, when two types are compared, their failure mechanisms result in probability distribution functions of analogous shape, according to the degree of relation between the two types (for example shifting and stretching).

To illustrate the use of the Cox model, Monte Carlo simulated data were generated (compare [3]). In detail, we specified failure times for four related motor types, each at different load levels. For the motor types 1, 2 and 4, failure times for the load levels of 2.5, 3.75, 5, 6.25 and 7.5 mNm were defined. For the motor type 3, failure times were specified only for the load levels of 2.5 and 7.5 mNm. Here, the differences in the failure times between the motor types are defined to be constant time lags (shifts in time). In Fig. 7, the empirical distribution functions for these simulated failure times are displayed (green dashed lines). The goal is to predict reliability key figures for motor type 3 at untested loads. For that purpose, a Cox model with the covariates load and motor type (indicator variables) was fitted to the specified data. In Fig. 7, the predicted distribution function for a motor of type 3 at a load of 4 mNm is displayed (orange solid line). The data of the system types 1, 2 and 4 provide additional information for the prediction.

For simplicity, this illustration assumes that similar failure patterns exist from system type to system type. This assumption might be valid only for system types based on similar technology. For example, if the geometry is scaled in size from motor type to motor type, the failure mechanism might be related and approximately similar (stretching or shifting of patterns in the failure probability functions at different load levels). In reality, the differences between design types might be a mixture of shifting and stretching.

[Fig. 7: plot of simulated data for Type 1, Type 2, Type 3 and Type 4 at load levels 2.5, 3.75, 5.0, 6.25 and 7.5 mNm; prediction: Type 3, 4 mNm.]
Fig. 7 Simulation of a prediction

3.2 Calculation Results regarding Planetary Gear Data

Analogously to the analysis of motor data presented above, the probability functions of the test lots at different operating conditions can also be calculated for the planetary gears. Here, a gear with 3 stages and a reduction ratio of 308:1 was investigated with 26 specimens. In Fig. 8, the empirical distribution functions of lifetimes for the associated 4 lots of specimens with the load levels of 300, 450, 600 and 750 mNm are displayed. After fitting a Cox model with the single covariate load torque, a prediction for the distribution function at the load level of 550 mNm is displayed as an example.

[Fig. 8: plot titled "A Cox Model with only one Covariate: Load Torque"; load levels for gears with 3 stages: 300, 450, 600 and 750 mNm, each at approx. 18 rpm; prediction for 550 mNm.]
Fig. 8 Data base and prediction of gears with 3 stages of similar type

To compare data of gears with similar stage reduction ratio but different numbers of stages, the operating point can be converted into quantities at the planetary gear. In the case considered, the failure is predominantly determined by the last planetary gear stage. The failures are mostly caused by gear wear or tooth root breakage. Hence, general quantities are converted to quantities at the tooth. In Fig. 9, the numbers of load cycles at the particular line load (which depends on the load torque at the gearbox output) are plotted in logarithmic scaling for gears with 2, 3 and 4 stages. Gears with a stage reduction ratio of 3.7 are displayed using red crosses, whereas gears with a stage reduction ratio of 6.8 are displayed using blue points. The failure times of the gear wheel component are determined by the weakest tooth in the particular gear. By fitting a line to the failure points of each stage reduction ratio, the component Wöhler curves can be located approximately. One can read off the order of magnitude of how many load cycles the gear teeth of the last stage endure at the particular load.

[Fig. 9: plot of line load versus number of load cycles (logarithmic axis from 10^5 to 10^9) for stage reduction ratio 6.8 and stage reduction ratio 3.7.]
Fig. 9 Component Wöhler curves for two different stage reduction ratios
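Fitting a line to failure points in logarithmic scaling can be sketched as an ordinary least-squares fit of $\log_{10} N$ against $\log_{10} q$, where $N$ is the number of load cycles and $q$ the line load. The following sketch rests on assumptions: both axes are taken to be logarithmic (a power-law form), the function names are hypothetical, and the points are synthetic values generated from an assumed power law so that the result can be checked. They are not the measured gear data.

```python
import math


def fit_loglog_line(line_loads, cycles):
    """Least-squares fit of log10(N) = intercept + slope * log10(q)."""
    xs = [math.log10(q) for q in line_loads]
    ys = [math.log10(n) for n in cycles]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def cycles_at(line_load, intercept, slope):
    """Number of load cycles read off the fitted line at a given line load."""
    return 10.0 ** (intercept + slope * math.log10(line_load))


# Synthetic points from the assumed law N = 1e8 * q**(-3), q in N/mm.
line_loads = [2.0, 3.0, 4.0, 5.0]
cycles = [1e8 / q ** 3 for q in line_loads]

intercept, slope = fit_loglog_line(line_loads, cycles)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"cycles at 3 N/mm: {cycles_at(3.0, intercept, slope):.3e}")
```

With real failure points the scatter around the line would be considerable, so the fitted line indicates only the order of magnitude of the endurable number of load cycles.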
It must be taken into account here that the number of load cycles says nothing about the revolution speed at which the gear was operated. At equal gearbox input speed and equal load, a gear with 2 stages reaches its lifetime-limiting number of load cycles clearly earlier than a similar gear with 3 stages. To obtain an overview of the lifetime distributions of the components using the Cox model, it is not necessary to compute Wöhler curves. The Cox model with the covariate line load can be fitted to the data directly. To obtain a prediction of the lifetime in hours instead of the number of load cycles until failure, the number of load cycles can be transformed into the time domain for a given revolution speed. Fig. 10 shows the prediction of the failure probabilities for two different stage reduction ratios at a line load of 3 N/mm.

[Fig. 10: plot of predicted failure probabilities; prediction for line load 3 N/mm; prediction for driving torque 404.2 mNm (stage reduction ratio 3.7); prediction for driving torque 346 mNm (stage reduction ratio 6.8).]
Fig. 10 Cox prediction of the failure probability for two different stage reduction ratios with one covariate line load for a gearbox output speed of 100 rpm

It must be emphasised that a reliability prediction is just an estimate. For simplicity, the confidence intervals were not displayed in the diagrams shown here. The precision of a failure probability estimate is mainly determined by the test lot size. Statistically sound statements require real experiments with an adequate test lot size.

4 Summary

For the investigated DC motors as well as the precision gears, Cox's regression model can be used to predict the failure probability. Difficulties arise when explanatory variables for important influences on the reliability were not captured. When different design sizes are considered, it might be useful to calculate comparable quantities from the observed measurements.

Here, we illustrated possibilities to predict the reliability of mechatronic drive systems using the Cox model. Further investigations and attempts towards verification are currently in progress. Another approach, which seems to be straightforward, is to interpolate among empirical distribution functions. Its realisation by means of a kernel estimator will be the subject of future investigations.

Acknowledgement

We thank the DFG (German Research Foundation) for the support and funding of the research project "Reliability prediction for mechatronic systems with the aid of statistical models exemplified by precision engineering components" (German title: "Zuverlässigkeitsprognose mechatronischer Systeme mit Hilfe statistischer Modelle am Beispiel feinwerktechnischer Komponenten"; Ge: Je 162 / 10-1, Schi 457 / 12-1).

5 References

[1] BEIER, Michael: Lebensdaueruntersuchungen an feinwerktechnischen Planetenradgetrieben mit Kunststoffverzahnung. University of Stuttgart, Institute of Design and Production in Precision Engineering, Dissertation in German, 2010.
[2] BERTSCHE, Bernd ; GÖHNER, Peter ; JENSEN, Uwe ; SCHINKÖTHE, Wolfgang ; WUNDERLICH, Hans-Joachim: Zuverlässigkeit mechatronischer Systeme - Grundlagen und Bewertung in frühen Entwicklungsphasen. Springer, Berlin, Heidelberg, 2009.
[3] BOBROWSKI, Sebastian ; DÖRING, Maik ; JENSEN, Uwe ; SCHINKÖTHE, Wolfgang: Reliability Prediction using the Cox Proportional Hazards Model. In: 56th International Scientific Colloquium, Ilmenau University of Technology, September 2011.
[4] COX, D.R.: Regression Models and Life Tables. In: Journal of the Royal Statistical Society, Series B 34 (1972), p. 187-220.
[5] KÖDER, Thilo: Zuverlässigkeit von mechatronischen Systemen am Beispiel feinwerktechnischer Antriebe. University of Stuttgart, Institute of Design and Production in Precision Engineering, Dissertation in German, 2006.
[6] MARTINUSSEN, Torben ; SCHEIKE, Thomas H.: Dynamic Regression Models for Survival Data. Springer, New York, 2006.