Big Data Analytics for SCADA

ENERGY. Big Data Analytics for SCADA: Machine Learning Models for Fault Detection and Turbine Performance. Elizabeth Traiger, Ph.D., M.Sc. 14 April 2016. SAFER, SMARTER, GREENER

Points to Convey: Big Data in Wind Industry; Analysis on Large Volume Data: Practicalities; Into the Black Box: Machine Learning Basics; Supervised Learning: Gearbox Fault Detection; Unsupervised Learning: Random Forest Turbine Performance Classification; General Machine Learning Truths

Big Data in Wind Industry. Big Data: Volume, Velocity, Varied; Beyond Capabilities of Traditional Data Processing

Big Data in Wind Industry: Atmospheric, Performance, SCADA, Vibration/Acceleration, Grid, Temperature, Market

Big Data in Wind Industry. Traditional Data Analysis Methodology: Model Driven, Rule Based, Explanatory, Time Averaged, Processor Bound. Big Data / Predictive Analytics: Data Driven, Pattern Based, Predictive, Real Time, Distributed.

Analysis on Large Volume Data: Practicalities

Analysis on Large Volume Data: Practicalities. Structured: Wind Speed, Temperature, Yaw Angle, Power, Voltage. Unstructured: Wind Speed, Temperature, Yaw Angle, Market Price, Inspection Condition.

Into the Black Box: Machine Learning Basics. Pattern Recognition, Machine Learning, Separation, Predictive, Generalization

Into the Black Box: Machine Learning Basics. Supervised: Classification, Regression. Unsupervised: Clustering, Dimension Reduction. Training Set, Validation Set.
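
The training set / validation set distinction on this slide can be illustrated with a minimal sketch. The function name, the 25% validation fraction, and the placeholder records are assumptions for illustration, not part of the presentation.

```python
import random


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle rows and split them into a training set and a validation set."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    # The model is fitted on the first part and checked on the held-out part.
    return shuffled[n_validation:], shuffled[:n_validation]


# Placeholder records standing in for labelled SCADA observations.
records = list(range(20))
train, validation = train_validation_split(records)
print(len(train), len(validation))
```

For time-ordered SCADA records, a chronological split (train on earlier data, validate on later data) may be the more honest check, since a random shuffle can place near-identical neighbouring samples on both sides of the split.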

Into the Black Box: Machine Learning Basics. Source: https://s3.amazonaws.com/mlmastery/machinelearningalgorithms.png?s=iph8dvzbonmmouyrjzfq

Into the Black Box: Machine Learning Basics - Supervised. Representation, Learners, Evaluation, Optimization
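
The three components named on this slide (representation, evaluation, optimization) can be made concrete with a small sketch. The choice of logistic regression, the toy data, and all function names are assumptions made for illustration; the presentation does not specify a particular learner here.

```python
import math


def predict(weights, bias, x):
    """Representation: a linear model passed through a sigmoid."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, bias, X, y):
    """Evaluation: mean negative log-likelihood of the labels."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(weights, bias, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def fit(X, y, learning_rate=0.5, steps=500):
    """Optimization: batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            grad_b += err
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy one-feature data set (illustrative values only).
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]
initial_loss = log_loss([0.0], 0.0, X, y)
weights, bias = fit(X, y)
final_loss = log_loss(weights, bias, X, y)
print(initial_loss > final_loss)
```

Swapping any one of the three functions (a different model form, a different loss, a different search procedure) yields a different learner, which is one way to read the slide's decomposition.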

Into the Black Box: Machine Learning Basics - Supervised. Source: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/

Supervised learning example: Gearbox Fault Classification. [Figure: Condition versus Time, marking Early Fault Identified and Total Failure]

Supervised learning example: Gearbox Fault Classification. Input: Generator bearing temp. at T-2; Generator bearing temp. at T-1; Power output at T; Generator speed at T; Wind Speed 3. Model: Support Vector Machine. Output: Fault Classification. Source: By Cyc - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=3566688
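
The lagged inputs listed on this slide can be assembled from aligned SCADA series as in the sketch below. It covers only the construction of the input rows, not the support vector machine itself. The function name, the row layout, the use of wind speed at T, and the sample values are all assumptions for illustration.

```python
def build_lagged_rows(bearing_temp, power, generator_speed, wind_speed):
    """Build one feature row per time step T from aligned SCADA series.

    Assumed row layout, following the slide's input list:
    [bearing temp at T-2, bearing temp at T-1, power at T,
     generator speed at T, wind speed at T]
    """
    n = len(bearing_temp)
    if not (len(power) == len(generator_speed) == len(wind_speed) == n):
        raise ValueError("all series must have the same length")
    rows = []
    # The first two time steps are skipped because they have no T-2 value.
    for t in range(2, n):
        rows.append([
            bearing_temp[t - 2],
            bearing_temp[t - 1],
            power[t],
            generator_speed[t],
            wind_speed[t],
        ])
    return rows


# Made-up sample values, for illustration only.
bearing_temp = [61.0, 62.5, 64.0, 66.5, 70.0]
power = [1500.0, 1520.0, 1490.0, 1510.0, 1505.0]
generator_speed = [1650.0, 1660.0, 1645.0, 1655.0, 1652.0]
wind_speed = [9.8, 10.1, 9.7, 10.0, 9.9]

rows = build_lagged_rows(bearing_temp, power, generator_speed, wind_speed)
for row in rows:
    print(row)
```

Rows built this way, paired with fault labels for each time step, could then be passed to a support vector classifier such as scikit-learn's `SVC`, typically after scaling the features to comparable ranges.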

Into the Black Box: Machine Learning Basics - Unsupervised. Source: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms

Unsupervised learning example: Turbine Performance. Veer, Shear, TOD, TI, TE, Wind Speed, WD, AD, Power

Unsupervised learning example: Turbine Performance. Random Forest Dissimilarity
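
One common way to obtain a random forest dissimilarity is to count how often two samples fall into the same leaf across the trees (their proximity) and take the square root of one minus that fraction. The sketch below shows only this last step and assumes that definition applies here; the slide does not state the exact formula used. The leaf assignments are made-up values; in practice they would come from a fitted forest, for example via scikit-learn's `RandomForestClassifier.apply`.

```python
import math


def forest_dissimilarity(leaf_ids):
    """Return a symmetric dissimilarity matrix from per-tree leaf assignments.

    leaf_ids[i][k] is the leaf that tree k assigns to sample i.
    """
    n = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    dissimilarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Proximity: fraction of trees in which both samples share a leaf.
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            proximity = shared / n_trees
            d = math.sqrt(1.0 - proximity)
            dissimilarity[i][j] = d
            dissimilarity[j][i] = d
    return dissimilarity


# Hypothetical leaf assignments: 3 samples, 4 trees.
leaf_ids = [
    [3, 7, 1, 4],
    [3, 7, 1, 5],
    [8, 2, 1, 9],
]
D = forest_dissimilarity(leaf_ids)
for row in D:
    print([round(value, 3) for value in row])
```

The resulting matrix can be handed to any clustering method that accepts pairwise distances, which is how a forest built for classification can support an unsupervised grouping of operating conditions.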

Unsupervised learning example: Turbine Performance. WS (AD Corrected), AD, WD, TI, TOD, TE
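
The slide does not say how the wind speed was corrected for air density. A common convention for pitch-regulated turbines is the density normalization used in power performance testing, which scales wind speed by the cube root of the ratio of measured to reference density. The sketch below assumes that convention and a reference density of 1.225 kg/m^3; both are assumptions, as are the function and constant names.

```python
REFERENCE_AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumed reference)


def density_corrected_wind_speed(wind_speed, air_density,
                                 reference_density=REFERENCE_AIR_DENSITY):
    """Scale a measured wind speed to the reference air density.

    Uses v_corrected = v * (rho / rho_ref) ** (1/3), so that the corrected
    speed carries the same kinetic power density at the reference density.
    """
    if air_density <= 0 or reference_density <= 0:
        raise ValueError("densities must be positive")
    return wind_speed * (air_density / reference_density) ** (1.0 / 3.0)


# Illustrative values: thinner air gives a lower corrected wind speed.
corrected = density_corrected_wind_speed(10.0, 1.10)
print(round(corrected, 3))
```

With this correction applied, air density no longer needs to explain variation in power through wind speed, which may be why the slide lists the corrected wind speed alongside AD as separate variables.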

General Machine Learning Truths: Data is not enough; High dimension is no longer intuitive; Feature engineering is paramount; More data is better than a smart algorithm; No one model is a best fit; Embrace constant change; Uncertainty about Uncertainty

Theory References
1. Pedro Domingos. 2012. A few useful things to know about machine learning. Commun. ACM 55, 10 (October 2012), 78-87. DOI = http://dx.doi.org/10.1145/2347736.2347755
2. Hastie, T., Tibshirani, R., and Friedman, J. H., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer, 2011.
3. Brian D. Ripley and N. L. Hjort. Pattern Recognition and Neural Networks. Cambridge University Press, New York, NY, USA, 1st edition, 1995.
4. I. Witten, E. Frank and M. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Mateo, CA, 3rd edition, 2011.

Happy Learning. Elizabeth Traiger, Ph.D., M.Sc., elizabeth.traiger@dnvgl.com, www.dnvgl.com