Gerard Mc Nulty, Systems Optimisation Ltd | gmcnulty@iol.ie / 0876697867 | B.A., B.A.I., C.Eng., F.I.E.I.




Data is important because it:
- Helps in achieving corporate aims
- Forms the basis of business decisions, engineering decisions, energy budgets, etc.
- Holds the key to providing useful insights into the energy usage of your process
- Can help with operational planning, short-term planning and strategic planning

Data and Energy Management
- You can't manage what you don't understand or control
- The days of seat-of-the-pants management have long gone
- Lack of data is no excuse for poor performance
- Cheap storage has encouraged corporations to gather huge amounts of data which are rarely used in decision-making processes: data overload
- The lack of friendly tools for analysing this data is probably the main cause of this problem

Using AI to improve your business

Artificial Intelligence Techniques
- Data mining for process and product improvements and management information systems
- Neural networks to mimic processes
- Intelligent sensors / soft sensors
- Supervisory control
- Advanced control
- Monitoring and diagnostic analysis
- Optimisation
- Scheduling
- Process modelling and simulation
- Knowledge-based systems for product improvements and management information systems

The Power of Data
Data files can be records of the behaviour and performance of machines, processes, the environment, etc. Accumulating data files has been perceived as just information gathering for future use, when in reality it can represent information overload or clutter unless relationships and patterns in the data can be derived. As such, data files can be viewed as reservoirs of knowledge that can be mined to discover relationships, patterns and rules. The objective of data mining is to learn from this data by extracting such useful knowledge.

Learning from Data
- Learning from data falls into two categories: symbolic and connectionist learning
- Symbolic learning generates rules and patterns from data files, such as decision trees
- Connectionist learning generates networks of processing units from the same data, such as neural networks
- By definition, symbolic learning generates results that are understandable to the human user

Learning from Data
Connectionist learning builds numeric computer models from data, typical members of this category being neural networks. While the accuracy of connectionist models can often be very good, they suffer from a lack of understandability: a 'black box' solution.
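As a sketch of the contrast, the snippet below learns the same threshold behaviour twice: once as a readable symbolic rule, and once as a single trained sigmoid neuron whose learned weights carry no obvious meaning. All data and numbers are invented purely for illustration.

```python
import math

# Hypothetical toy data: (solvent_temp_C, high_energy?) pairs -- invented
# purely to illustrate the two learning styles.
data = [(180.0, 0), (190.0, 0), (200.0, 0), (220.0, 1), (230.0, 1), (240.0, 1)]

# Symbolic learning output: a human-readable rule.
def symbolic_rule(t):
    return 1 if t > 210 else 0        # "IF temp > 210 THEN energy is high"

# Connectionist learning output: a single sigmoid neuron trained by
# gradient descent -- a numeric model with no readable rule inside.
w, b = 0.0, 0.0
for _ in range(5000):
    for t, y in data:
        x = (t - 210.0) / 30.0        # crude feature scaling
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        w -= 0.5 * (p - y) * x        # log-loss gradient step
        b -= 0.5 * (p - y)

def predict(t):
    x = (t - 210.0) / 30.0
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5 else 0

print([(t, symbolic_rule(t), predict(t)) for t, _ in data])
```

Both models classify the toy data identically, but only the symbolic rule can be read back to an operator.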

Combining the Techniques
The power of data mining, combined with the modelling and prediction accuracy of neural networks and the optimisation power of genetic algorithms, can provide an insight into process plant operations that was not easily obtainable before.
- Using data mining, industrial systems and processes can be mined to obtain important rules and patterns hidden in the data
- Neural networks can then be set to work to model processes and provide predictions of process outcomes for different input values
- Genetic optimisation techniques can use the information learned in the data mining exercise to optimise an aspect of the process such as energy usage, cost of production or production rates

Learning from Data
- It's an alternative knowledge engineering strategy if the data represents records of expert decision making
- It can derive new patterns and relationships that improve our understanding of a certain process and therefore enable us to make better decisions in the future
- Data models can be used to predict the outcome of future events

Learning from Data
The aim of data mining is to establish the significant input variables and then determine how they affect the output of the process: for example, which variables have the most significant effect on the energy used in a refrigeration process, a boiler's energy usage, etc. Data mining may extract many such pattern rules from a data file, and it may take only a handful of deduced rules to provide invaluable information about your process.
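A minimal sketch of this first step is to rank candidate input variables by their correlation with energy use. The plant log below is entirely invented for illustration; real tools use richer measures than Pearson correlation.

```python
import math

# Hypothetical plant log: each row is (ambient_temp, load, door_openings,
# energy_kwh). Illustrative numbers only.
rows = [
    (10, 40, 2, 310), (12, 42, 5, 330), (15, 45, 3, 360),
    (18, 50, 8, 405), (20, 55, 4, 440), (25, 60, 9, 505),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

energy = [r[3] for r in rows]
for i, name in enumerate(["ambient_temp", "load", "door_openings"]):
    r = pearson([row[i] for row in rows], energy)
    print(f"{name}: r = {r:+.2f}")
```

In this toy data, ambient temperature and load track energy use closely, while door openings correlate only weakly, suggesting where a deeper analysis should focus.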

Brewery Data Mining

Brewery Data Set

Conventional Data Analysis: Linear Statistical Modelling
- Developed in the 1970s
- Used only a few variables
- Used much computational resource
- Can't scan or exploit all the data
- Too many assumptions made, so a key variable may be omitted or its effect on the domain of interest unknown
- Accuracy variable and job-specific

The Basic Problems with Simple Data Analysis Systems
- Can't cope with multi-variable, non-linear processes: they assume a linear world that transforms all relations to a simple line with a known starting value
- Lack intelligence: the ability to think like humans
- Can't analyse data effectively, especially large data sets with interrelated dependencies between variables
- Only consider a subset of all plant data at a time, and may miss an important parameter
- Assume key properties such as specific heat capacity or thermal conductivity don't change with time

Linear and Non-Linear Models
A linear model uses parameters that are constant and do not vary throughout a simulation. This means that we can enter one fixed value for each parameter at the beginning of the simulation and it will remain the same throughout. A non-linear model introduces dependent parameters that are allowed to vary throughout the course of a simulation run; its use becomes necessary where interdependencies between parameters cannot be considered insignificant.

Linear Models
In a linear model, all the parameters are independent of each other. In the real world, however, parameters are always dependent upon other parameters to some degree, but in many cases the dependency is so small that it can be ignored. For example, the density of any solid material is dependent upon its temperature, but the variation is generally so small over normal temperature ranges that it can be ignored unless the temperature range is large. Where possible, it is always best to use a linear model, as it is simpler and faster running than a non-linear model, but if significant inter-dependencies exist a non-linear model is best.

Linear and Non-Linear Models
The choice between a linear and a non-linear model depends upon how significantly the values of any of the parameters involved vary in relation to any of the other parameters. To model a non-linear parameter, we must update the simulation material parameters at each iteration step of the simulation to take into account the change in, say, conductivity for the temperature change in that iteration.
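The difference can be sketched in a few lines: the linear run keeps one fixed parameter, while the non-linear run recomputes it from the current temperature at every iteration. The cooling model and the conductivity relation below are assumed, with illustrative values only.

```python
# Minimal sketch of linear vs non-linear simulation, with an assumed
# (illustrative) temperature dependence of the cooling coefficient.

def simulate(steps, k_of_T=None):
    """Cool a body toward 20 C; coefficient k may depend on temperature T."""
    T, k_fixed = 400.0, 0.05                   # initial temp, linear-model constant
    for _ in range(steps):
        k = k_of_T(T) if k_of_T else k_fixed   # non-linear: update k each step
        T -= k * (T - 20.0)                    # simple lumped cooling step
    return T

linear = simulate(50)
nonlinear = simulate(50, k_of_T=lambda T: 0.03 + 0.0001 * T)  # assumed relation
print(linear, nonlinear)
```

The linear run has a closed-form answer (geometric decay toward 20 °C); the non-linear run does not, which is precisely why the parameter must be refreshed inside the loop.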

Chaos Theory
Chaos theory is a field of study concerned with the behaviour of dynamic systems that are highly sensitive to initial conditions, rendering their outcomes chaotic and not predictable. Examples of systems with chaotic behaviour include:
- The weather
- Human reactions and behaviour (and their impact on HVAC KPIs)
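A standard illustration of this sensitivity is the logistic map in its chaotic regime. The sketch below runs it from two initial conditions that differ by only one part in a million and shows the trajectories diverging completely.

```python
# Sensitivity to initial conditions via the logistic map
# x_{n+1} = r * x_n * (1 - x_n), with r = 4 (a standard chaotic regime).

def trajectory(x0, steps=50, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = trajectory(0.200000)
b = trajectory(0.200001)   # initial condition perturbed by 1e-6

# After a few dozen steps the tiny perturbation has been amplified until the
# two trajectories bear no resemblance to each other.
print(max(abs(x - y) for x, y in zip(a, b)))
```

This is why long-range prediction of such systems is impractical even with a perfect model: measurement error in the starting state is enough to destroy the forecast.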

Data Mining Architecture
- Data warehouse server
- Data query and reporting: manual discovery
- Data visualisation client: semi-manual discovery
- Data mining: automatic pattern discovery

Data Warehouse/Mining Technologies
Operational data (orders, process, sales) is cleaned, transformed and reconstructed into a data warehouse, which feeds data mining tools used by decision makers.

Various Typical Energy Analysis Techniques

Typical Manual Visualisation Techniques

Problems with interpreting the many influencing factors associated with energy usage

Problems with interpreting the many influencing factors associated with energy usage - data overload

Understanding Performance Variability: Data Mining
In some circumstances, a more detailed analysis is appropriate:
- For major energy users
- Where energy is a complex issue affected by multiple influencing factors
- Where there is access to substantial historical data, for example from a data historian
Data mining has the following characteristics:
- It handles massive databases
- It finds patterns automatically
- It expresses the patterns as a set of rules

Data mining Output Decision Tree

Understanding Performance Variability: Data Mining
The decision tree shown above represents a set of rules generated in a data-mining analysis. The rules identify the key driver for the energy use of a refrigeration system and quantify the impact of that driver. The highlighted "route" through the tree is characterized by the following rule: if the solvent temperature is greater than 214 °C and less than 223 °C, then, based on the 86.67% probability identified under "Attributes" on the right-hand side of the tree, the energy use under these conditions is determined by the analysis to be 67,167 kWh. Rules are generated automatically in such an analysis; the user defines only the objectives and influencing factors. The process essentially subdivides historical operations into modes; where energy use differs, the modes are characterized by rules.
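The mode-splitting idea behind such a tree can be sketched with a single variance-reducing split. The historical records below are invented for illustration; real tools apply this recursively over many variables.

```python
# Hypothetical historical records: (solvent_temp_C, energy_kwh).
records = [(205, 41000), (208, 42000), (210, 40500),
           (216, 66000), (219, 67500), (221, 67000)]

def best_split(data):
    """Find the threshold minimising within-group variance (one tree split)."""
    def sse(ys):
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys)
    best = None
    xs = sorted(x for x, _ in data)
    for lo, hi in zip(xs, xs[1:]):         # candidate thresholds: midpoints
        t = (lo + hi) / 2.0
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, t, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

threshold, low_mode, high_mode = best_split(records)
print(f"IF temp > {threshold:.0f} C THEN energy ~ {high_mode:.0f} kWh "
      f"ELSE energy ~ {low_mode:.0f} kWh")
```

The output is exactly the kind of readable rule described above: the split divides past operation into a low-energy and a high-energy mode, each characterised by a temperature condition.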

A more complex analysis

A more complex analysis
A real analysis will create substantially more complex decision trees (as there are more complex rules), such as the one illustrated in the diagram above. Such a tree was able to:
- Identify the key drivers involved
- Quantify their impact on energy use
- Identify the best operating modes
The results identified a process point for the liquid flow and reagent use that determines, with a 50.59% probability, that energy consumption will be 193,965 kWh under these conditions.

Stages of an initial data-mining analysis
1. Understand process operation
2. Define performance objectives
3. Identify and collect historical operational data
4. Data cleaning
5. Data analysis (manual / visualisation / data mining)
6. Interpretation
7. Develop opportunities
8. Implementation

Boiler Analysis

Data Mining of Boiler Parameters

The result
The impact of manifold header pressure on the operating cost is illustrated in the decision tree that is partially shown above. In this case it shows that a higher steam pressure reduces the operating cost per unit of steam produced. In comparison, a simple linear plot of cost vs. manifold steam pressure would not clearly show the influence of manifold steam pressure, because of the changes within the data set that are happening simultaneously in many other factors that affect boiler performance.

Examples of Typical Applications of Advanced Computational Techniques in Energy Management
- Data mining to identify a complex process's KPIs
- Data mining to establish the level of importance process parameters contribute to energy usage in a particular process or domain of interest
- Modelling of utility plant operations using data mining
- Using captured data to model input/output relationships with neural networks: like transfer functions, but easier to do
- Using genetic algorithms to optimise plant operations after data mining has discovered the key process attributes (like classical operational optimisation using linear programming)

Establishing Targets in an EnMS
Targets are expected performance values that can be compared with actual performance to discover whether a plant or process is performing well. Targets take several forms, including the following:
- Historical average performance is a commonly used target; it can be used to alert operations staff when performance is below average
- The simplest form of such a target is the average energy use during an earlier period, for example the last year or the last month
- Often, targets will have some adjustment for external influencing factors, such as production rate or ambient temperature; typically, this adjustment is based on a regression or multiple-regression analysis
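The regression adjustment mentioned above can be sketched with an ordinary least-squares fit of energy against production rate. The history values are invented for illustration.

```python
# Regression-adjusted target: expected energy as a linear function of
# production rate, fitted to (assumed) historical data.

history = [(100, 520), (120, 590), (140, 655), (160, 730), (180, 790)]  # (units, kWh)

n = len(history)
sx = sum(x for x, _ in history); sy = sum(y for _, y in history)
sxx = sum(x * x for x, _ in history); sxy = sum(x * y for x, y in history)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # least-squares slope
intercept = (sy - slope * sx) / n

def target(production_units):
    """Expected kWh for a given production rate under the fitted model."""
    return intercept + slope * production_units

actual = 700.0
t150 = target(150)
print(f"target at 150 units: {t150:.0f} kWh; actual {actual:.0f} -> "
      f"{'under' if actual < t150 else 'over'} target")
```

Comparing actual consumption against the rate-adjusted target, rather than a flat average, stops a busy month from being flagged as poor performance simply because more was produced.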

Using Data Mining to set better targets
The accuracy and robustness of targets is vitally important in energy management systems. An incorrect target will be misleading: improvements may not be reflected in the calculations, or poor performance may not be identified. Poor targets result in a loss of confidence in monitoring and, ultimately, failure to achieve energy savings. A more sophisticated historical target can be developed using data mining and similar techniques: more data can be analysed, more influencing factors can be accounted for, and the root cause of poor targeting, non-linear relationships, can be handled effectively.

Real Time Targeting

Real Time Target Setting: Historical v Best Practice
A target produced from a detailed analysis of data collected frequently (for example, hourly or every 15 minutes) should be sufficiently accurate to implement on-line in real time. The benefits of this include more rapid identification of operating problems. The present use of historical average performance can be considered a benchmark against which future performance can be compared: it represents what typically would have happened had no changes (improvements) been made. A best-practice target identifies what a process or plant could achieve if it were operated well. It differs from average historical performance in that it is based on facts about the improvement potential, not just on out-of-date past performance.

Determining Best Practice Performance
Alternatively, best-practice targets can represent the best performance achieved in the past, given the particular (external) conditions. This can be discovered from historical operating data using data mining and similar techniques. A best-practice target is discovered by identifying periods of operation in the past where external conditions (temperatures, pressures, flows, etc.) were similar to those currently in place, and then selecting the best-performing period as the target.
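The selection step can be sketched directly: filter the history for periods whose conditions resemble the current ones, then take the best energy figure among them. The records and the similarity tolerance below are assumed for illustration.

```python
# Best-practice target: among past periods whose external conditions resemble
# the current ones, take the best (lowest) energy use.

history = [
    {"ambient": 12, "load": 50, "kwh": 440},
    {"ambient": 13, "load": 52, "kwh": 415},   # best comparable period
    {"ambient": 12, "load": 49, "kwh": 455},
    {"ambient": 25, "load": 80, "kwh": 690},   # conditions too different
]

def best_practice_target(current, history, tol=3):
    """Lowest kWh among periods within `tol` of the current conditions."""
    similar = [h for h in history
               if abs(h["ambient"] - current["ambient"]) <= tol
               and abs(h["load"] - current["load"]) <= tol]
    return min(h["kwh"] for h in similar) if similar else None

print(best_practice_target({"ambient": 12, "load": 51}, history))
```

Only the three comparable periods are considered, so the target reflects what was actually achievable under similar conditions rather than an average across all operation.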

Neural Networks
Neural networks can be used to empirically model non-linear data, to continually learn in an on-line mode, and to provide recommendations back to engineers, operators or directly to the control system to keep the unit tuned. To determine the optimum solution, the neural network is combined with evolutionary search algorithms to perform a sophisticated version of "what if" analysis, identifying appropriate set points for the current operating conditions.

Applications of Neural Networks
There are many process parameters that we cannot measure directly, although experienced process operators know the range of operational parameters that results in the best quality or process output. Examples include the stickiness of adhesives, glass quality, taste, etc. These measurements are often provided by laboratory analyses, resulting in infrequent feedback and substantial measurement delays, rendering automatic process control impossible. Inferential estimation is one method designed to overcome this problem; the technique has also been called 'sensor-data fusion' and 'soft-sensing'.

Rule Induction Decision Tree

Optimisation
- Industrial processes are rarely optimised and stray from the optimum easily
- Select the economically optimal trade-off between energy, raw material conversion rates, etc.
- Evaluate the potential return on investment resulting from optimised set points, and compare this result to the current set points

Optimisation using Genetic Algorithms
Previously, optimum solutions were expensive to find and required high computational power. Examples of optimisation:
- Optimising the production mix to maximise production and minimise energy costs
- Load management
- Process scheduling

Optimisation using Genetic Algorithms
Genetic algorithms evolve superior solutions from a population of solutions. Generations of solutions are improved by "survival of the fittest": for each generation, the solutions are assessed, the poor ones are discarded, and the good ones are cross-bred to give the next generation, getting closer to the optimum.
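The assess / discard / cross-breed loop above can be sketched in a few lines. The cost curve standing in for plant economics is invented; a real application would evaluate energy or production cost from plant models.

```python
import random

# Minimal GA sketch: evolve a set point (0..10 scale) that minimises an
# assumed cost curve. The fitness function is illustrative, not plant data.
random.seed(1)

def cost(x):
    return (x - 7.0) ** 2 + 3.0        # assumed: minimum cost near x = 7

pop = [random.uniform(0, 10) for _ in range(20)]
for generation in range(60):
    pop.sort(key=cost)                 # assess fitness
    survivors = pop[:10]               # discard the poorer half
    children = []
    for _ in range(10):                # cross-breed survivors, with mutation
        a, b = random.sample(survivors, 2)
        child = (a + b) / 2.0 + random.gauss(0, 0.1)
        children.append(min(10.0, max(0.0, child)))
    pop = survivors + children

best = min(pop, key=cost)
print(f"best set point ~ {best:.2f}, cost ~ {cost(best):.2f}")
```

Keeping the survivors unchanged (elitism) guarantees the best solution found so far is never lost, while mutation keeps the search from collapsing prematurely onto one value.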

The Benefits of Process Optimisation
- Increased yield
- Higher and more consistent quality
- Reduced energy use and environmental impact
- Fewer process and plant trips
- Increased throughput
- Reduced maintenance and longer plant life
- Reduced manpower
- Improved environmental performance
- Increased safety