ENERGY. Big Data Analytics for SCADA: Machine Learning Models for Fault Detection and Turbine Performance. Elizabeth Traiger, Ph.D., M.Sc., 14 April 2016. SAFER, SMARTER, GREENER
Points to Convey: Big Data in Wind Industry; Analysis on Large Volume Data Practicalities; Into the Black Box: Machine Learning Basics; Supervised Learning: Gearbox Fault Detection; Unsupervised Learning: Random Forest Turbine Performance Classification; General Machine Learning Truths
Big Data in Wind Industry. Big Data: Volume, Velocity, Variety; beyond the capabilities of traditional data processing
Big Data in Wind Industry. Data sources: Atmospheric; Performance; SCADA; Vibration/Acceleration; Grid; Temperature; Market
Big Data in Wind Industry
Traditional Data Analysis Methodology | Big Data / Predictive Analytics
Model Driven | Data Driven
Rule Based | Pattern Based
Explanatory | Predictive
Time Averaged | Real Time
Processor Bound | Distributed
Analysis on Large Volume Data Practicalities
Analysis on Large Volume Data Practicalities. Structured: Wind Speed, Temperature, Yaw Angle, Power, Voltage. Unstructured: Wind Speed, Temperature, Yaw Angle, Market Price, Inspection Condition
Into the Black Box: Machine Learning Basics. Pattern Recognition; Machine Learning; Separation; Predictive; Generalization
Into the Black Box: Machine Learning Basics. Supervised: Classification, Regression. Unsupervised: Clustering, Dimension Reduction. Training Set; Validation Set
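The training set / validation set idea can be sketched in a few lines. This is a minimal illustration, not the workflow used in the talk; the record layout (wind speed, power, fault label), the 25% validation fraction, and the function name are all assumptions made for the example.

```python
import random


def train_validation_split(records, validation_fraction=0.25, seed=0):
    """Shuffle records and split them into a training set and a validation set."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    # The model is fitted on the first part and judged only on the held-out part.
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical labelled records: (wind_speed_m_s, power_kw, fault_label)
records = [(4.0 + 0.5 * i, 100.0 * i, i % 2) for i in range(20)]
training_set, validation_set = train_validation_split(records)
print(len(training_set), len(validation_set))
```

For time-ordered SCADA data a chronological split (train on earlier records, validate on later ones) is often preferable to a random shuffle, since neighbouring samples are strongly correlated.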
Into the Black Box: Machine Learning Basics. Source: https://s3.amazonaws.com/mlmastery/machinelearningalgorithms.png?s=iph8dvzbonmmouyrjzfq
Into the Black Box: Machine Learning Basics - Supervised. Learners: Representation; Evaluation; Optimization
Into the Black Box: Machine Learning Basics - Supervised. Source: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/
Supervised learning example: Gearbox Fault Classification. [Figure: condition versus time, with "Early Fault Identified" and "Total Failure" marked]
Supervised learning example: Gearbox Fault Classification. Input: Generator bearing temp. at T-2; Generator bearing temp. at T-1; Power output at T; Generator speed at T; Wind Speed. Model: Support Vector Machine. Output: Fault Classification. Source: By Cyc - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=3566688
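A sketch of how the five inputs on this slide could feed a support vector machine, using scikit-learn. Everything about the data is an assumption: the records are synthetic, the fault signature (a hotter, warming generator bearing) is invented for illustration, and the linear kernel and train/validation cut are arbitrary choices rather than the configuration used in the talk.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for SCADA records (NOT real turbine data).
wind_speed = rng.uniform(4.0, 20.0, n)
power = np.clip(wind_speed / 12.0, 0.0, 1.0) ** 3 * 2000.0 + rng.normal(0.0, 30.0, n)
gen_speed = 1000.0 + 40.0 * wind_speed + rng.normal(0.0, 20.0, n)
fault = rng.integers(0, 2, n)  # 1 = faulty, 0 = healthy (hypothetical labels)

# Assumed fault signature: the bearing runs hotter and keeps warming.
temp_t2 = 60.0 + 0.01 * power + 8.0 * fault + rng.normal(0.0, 1.5, n)
temp_t1 = temp_t2 + 2.0 * fault + rng.normal(0.0, 0.5, n)

# Feature order follows the slide: temp at T-2, temp at T-1, power at T,
# generator speed at T, wind speed.
X = np.column_stack([temp_t2, temp_t1, power, gen_speed, wind_speed])
y = fault

# Scale the features, then fit a linear SVM on the first 300 records.
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
model.fit(X[:300], y[:300])

# Score on the 100 held-out records.
validation_accuracy = model.score(X[300:], y[300:])
print(f"validation accuracy: {validation_accuracy:.2f}")
```

Scaling before the SVM matters here because the inputs span very different ranges (degrees, kilowatts, rpm); without it the largest-valued feature would dominate the margin.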
Into the Black Box: Machine Learning Basics - Unsupervised. Source: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms
Unsupervised learning example: Turbine Performance. Variables: Veer; Shear; TOD; TI; TE; Wind Speed; WD; AD; Power
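Several of these variables are derived rather than measured directly. The sketch below shows common textbook definitions for three of them, assuming TI means turbulence intensity and that Shear and Veer carry their usual meteorological meaning; the slide does not define its abbreviations, and the function names and sample values are hypothetical.

```python
import math
import statistics


def turbulence_intensity(wind_speeds):
    """Standard deviation of wind speed divided by its mean over one averaging window."""
    return statistics.pstdev(wind_speeds) / statistics.fmean(wind_speeds)


def shear_exponent(v_lower, v_upper, z_lower, z_upper):
    """Power-law shear exponent from wind speeds at two measurement heights."""
    return math.log(v_upper / v_lower) / math.log(z_upper / z_lower)


def veer_deg_per_m(dir_lower_deg, dir_upper_deg, z_lower, z_upper):
    """Change of wind direction with height, wrapped to [-180, 180), per metre."""
    delta = (dir_upper_deg - dir_lower_deg + 180.0) % 360.0 - 180.0
    return delta / (z_upper - z_lower)


# Hypothetical ten-minute window of wind speeds (m/s) and two mast heights (m).
window = [7.8, 8.4, 8.1, 7.5, 8.9, 8.2]
print(turbulence_intensity(window))
print(shear_exponent(7.0, 8.0, 40.0, 80.0))
print(veer_deg_per_m(350.0, 10.0, 40.0, 80.0))
```

The wrap in `veer_deg_per_m` keeps a change from 350 degrees to 10 degrees as +20 degrees rather than -340 degrees.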
Unsupervised learning example: Turbine Performance. Random Forest Dissimilarity
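One common way to obtain a random forest dissimilarity without labels is to train a forest to separate the observed data from a synthetic copy whose columns have been shuffled independently, then measure how often two observations share a leaf. The sketch below follows that construction with scikit-learn; it is an assumption that the talk used this variant, and the two-regime wind speed / power data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_dissimilarity(X, n_trees=200, seed=0):
    """Unsupervised random forest dissimilarity: sqrt(1 - proximity)."""
    rng = np.random.default_rng(seed)
    # Synthetic contrast class: shuffle every column on its own, which keeps
    # the marginal distributions but destroys the relations between variables.
    X_synth = np.column_stack([rng.permutation(col) for col in X.T])
    X_all = np.vstack([X, X_synth])
    y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X_synth))])

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # leaves[i, t] is the index of the leaf that observation i reaches in tree t.
    leaves = forest.apply(X)
    # Proximity: fraction of trees in which two observations share a leaf.
    proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
    return np.sqrt(1.0 - proximity)


# Synthetic example: two hypothetical performance regimes on one power curve.
rng = np.random.default_rng(1)
wind_speed = rng.uniform(5.0, 12.0, 80)
normal_regime = np.arange(80) < 40
power = np.where(normal_regime, 0.50, 0.35) * wind_speed**3 + rng.normal(0.0, 5.0, 80)
X = np.column_stack([wind_speed, power])

D = rf_dissimilarity(X)
print(D.shape)
```

The resulting matrix can be handed to any method that accepts precomputed distances, such as hierarchical clustering or multidimensional scaling, to look for groups of similar operating behaviour.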
Unsupervised learning example: Turbine Performance. [Figure labels: WS (AD Corrected); AD; WD; TI; TOD; TE]
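Assuming "WS (AD Corrected)" means wind speed normalized for air density, a widely used form of the correction for pitch-regulated turbines scales the measured wind speed by the cube root of the density ratio. The sketch below is illustrative only: the function name, the 1.225 kg/m^3 reference density, and the sample values are assumptions, and the slide does not say which correction was applied.

```python
REFERENCE_AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumed reference)


def density_corrected_wind_speed(wind_speed, air_density,
                                 reference_density=REFERENCE_AIR_DENSITY):
    """Normalize wind speed so that available power matches the reference density.

    Wind power scales with density * speed**3, hence the cube root.
    """
    return wind_speed * (air_density / reference_density) ** (1.0 / 3.0)


# Hypothetical measurement: 10 m/s at a site air density of 1.10 kg/m^3.
corrected = density_corrected_wind_speed(10.0, 1.10)
print(corrected)
```

Thinner air carries less energy at the same speed, so the corrected wind speed comes out lower than the measured one whenever the site density is below the reference.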
General Machine Learning Truths: Data is not enough; High dimension is no longer intuitive; Feature engineering is paramount; More data is better than a smart algorithm; No one model is a best fit; Embrace constant change; Uncertainty about Uncertainty
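The point that high dimension is no longer intuitive can be made concrete with a small experiment: for uniformly random points, the gap between the nearest and farthest neighbour shrinks relative to the nearest distance as the number of dimensions grows. This is a minimal sketch with an arbitrary point count and seed, not an analysis from the talk.

```python
import math
import random


def relative_contrast(dimension, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dimension)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dimension)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


# The contrast collapses as dimension grows: "near" and "far" start to look alike.
for dimension in (2, 10, 100, 1000):
    print(dimension, round(relative_contrast(dimension), 2))
```

When nearly all points are about equally far apart, distance-based methods such as nearest neighbours or clustering lose discriminating power, which is one reason feature selection and engineering matter so much.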
Theory References
1. Domingos, P., "A few useful things to know about machine learning," Commun. ACM 55, 10 (October 2012), 78-87. DOI: http://dx.doi.org/10.1145/2347736.2347755
2. Hastie, T., Tibshirani, R., and Friedman, J. H., The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer, 2011.
3. Ripley, B. D., and Hjort, N. L., Pattern Recognition and Neural Networks, 1st edition. New York, NY, USA: Cambridge University Press, 1995.
4. Witten, I., Frank, E., and Hall, M., Data Mining: Practical Machine Learning Tools and Techniques, 3rd edition. San Mateo, CA: Morgan Kaufmann, 2011.
Happy Learning. Elizabeth Traiger, Ph.D., M.Sc. elizabeth.traiger@dnvgl.com www.dnvgl.com