Evaluation of Machine Learning Techniques for Green Energy Prediction




arXiv:1406.3726v1 [cs.LG] 14 Jun 2014

Ankur Sahai, University of Mainz, Germany

1 Objective

We evaluate machine learning techniques for green energy (wind, solar and biomass) prediction based on weather forecasts. Weather is described by multiple attributes (temperature, cloud cover, and wind speed/direction), which we treat as discrete random variables. One of our objectives is to predict the weather from previous weather data. Additionally, we are interested in: finding correlations (dependencies) between these variables in order to reduce the dimensionality of the data set; predicting missing data; predicting deviations in weather forecasts (for job scheduling within the green control center); finding clusters of closely related variables within the data (e.g. via PCA, which can be used to remove redundant variables); classification; fitting (non-linear, using SVMs) regression models; and training artificial neural networks on historical data so that they can be used for prediction in the future.

2 Machine learning techniques

We analyze the following machine learning techniques:

- Bayesian inference
- Neural networks
- Support vector machines
- Clustering technique: PCA

2.1 Bayesian learning

Bayesian learning uses prior information to make predictions about the future. It is based on Bayes' rule:

    P(X|Y) = P(X ∩ Y) / P(Y)                  (1)
           = P(Y|X) P(X) / P(Y)               (2)
           = P(Y|X) P(X) · (1 / P(Y))         (3)
           ∝ P(Y|X) P(X)                      (4)

If we consider the set of all events that X depends on, say Y_1, ..., Y_n, and assume that they are mutually exclusive and exhaustive (a partition of the sample space), then from (1) we get:

    X = (X ∩ Y_1) ∪ ... ∪ (X ∩ Y_n)                          (5)
    ⇒ P(X) = P(X ∩ Y_1) + ... + P(X ∩ Y_n)                   (6)
      P(X) = P(X|Y_1) P(Y_1) + ... + P(X|Y_n) P(Y_n)         (7)
      P(X) = Σ_{i=1}^{n} P(X|Y_i) P(Y_i)                     (8)

which is the law of total probability.

Bayesian networks represent the different random variables as the nodes of a directed acyclic graph and their conditional dependencies as its edges, and use inference algorithms to make predictions. Graphical techniques such as belief propagation or expectation maximization can be used to spread information within the Bayesian network. Expectation-maximization algorithms are used to model scenarios that involve latent variables and consist of two alternating steps (an expectation step and a maximization step).

The structure of a dynamic Bayesian network changes with time. A simple example is the Hidden Markov Model. A Markov model is a graph in which each node corresponds to a state and carries transition probabilities to the neighboring nodes (measuring how likely the system is to move to each neighboring state). In a Hidden Markov Model some states are hidden, in the sense that the behavior of the system in those states (their transition probabilities) is not directly known.

Belief propagation includes algorithms such as sum-product message passing. This technique uses the information stored in the neighboring nodes, together with conditional probabilities, to pass messages within the network. It can also be expressed as a linear programming formulation. If the input variables follow a convenient distribution such as a Gaussian, Bayesian inference becomes much easier to apply. One of its disadvantages is that, although conceptually simple, it is computationally hard. The technique can also be used to recover missing parameters/variables from their previous values (the prior distribution) and their correlation with other parameters.
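To make equations (2) and (8) concrete, here is a minimal Python sketch that updates a belief about cloud cover from an observed temperature band. The probability values are invented for illustration and are not taken from the paper.

```python
# Minimal sketch of Bayes' rule and the law of total probability,
# using made-up weather probabilities (illustrative only).

# Prior over cloud cover X: "clear" vs "overcast"
prior = {"clear": 0.6, "overcast": 0.4}

# Likelihood P(Y = "hot" | X): probability of a hot day given the cloud cover
likelihood_hot = {"clear": 0.7, "overcast": 0.2}

# Law of total probability, eq. (8): P(hot) = sum_i P(hot | X_i) P(X_i)
p_hot = sum(likelihood_hot[x] * prior[x] for x in prior)

# Bayes' rule, eq. (2): P(X | hot) = P(hot | X) P(X) / P(hot)
posterior = {x: likelihood_hot[x] * prior[x] / p_hot for x in prior}

print(f"P(hot) = {p_hot:.3f}")   # 0.7*0.6 + 0.2*0.4 = 0.50
print(posterior)                 # ≈ {'clear': 0.84, 'overcast': 0.16}
```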
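For the Hidden Markov Model mentioned above, the following sketch implements the standard forward algorithm for computing the likelihood of an observation sequence. The two weather states, the transition and emission matrices, and the observation encoding are all assumed for illustration.

```python
import numpy as np

# Hypothetical two-state weather HMM: states 0 = "sunny", 1 = "cloudy".
start = np.array([0.5, 0.5])              # initial state distribution
trans = np.array([[0.8, 0.2],             # P(next state | current state)
                  [0.4, 0.6]])
emit = np.array([[0.7, 0.2, 0.1],         # P(observation | state); observations:
                 [0.2, 0.3, 0.5]])        # 0 = low, 1 = medium, 2 = high cloud cover

def forward(observations):
    """Return P(observation sequence) under the HMM via the forward algorithm."""
    alpha = start * emit[:, observations[0]]       # initialization
    for obs in observations[1:]:
        alpha = (alpha @ trans) * emit[:, obs]     # recursion: propagate then weight by emission
    return alpha.sum()

print(forward([0, 1, 2]))   # likelihood of observing low, then medium, then high cloud cover
```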

2.2 Clustering techniques

These techniques are used to identify closely related data points in a given data set. K-means clustering is a commonly studied example, in which the data points have to be grouped into K clusters such that the total distance between each point and the mean of its assigned cluster is minimized. Such techniques can be used to identify patterns in a weather plot, say a temperature vs. cloud cover plot, where regions with a higher density of points can be identified (and used to infer that when the temperature takes a certain value, the cloud cover is more likely to take a certain value). Principal Component Analysis (PCA) is another such technique; it generates linearly uncorrelated variables (principal components) from a data set of multiple correlated variables.
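As an illustration of the clustering and dimensionality-reduction ideas above, the sketch below runs K-means and PCA on a small synthetic weather table. The data, the column choices and the use of scikit-learn are assumptions made for the example, not part of the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic weather table: columns = [temperature (C), cloud cover (%), wind speed (m/s)]
X = np.vstack([
    rng.normal([25, 20, 3], [3, 10, 1], size=(100, 3)),   # warm, mostly clear days
    rng.normal([12, 80, 6], [3, 10, 2], size=(100, 3)),   # cool, mostly overcast days
])
X_scaled = StandardScaler().fit_transform(X)

# K-means: group the days into 2 clusters (e.g. "clear" vs "overcast" regimes)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

# PCA: replace the correlated columns with linearly uncorrelated principal components
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

print("cluster sizes:", np.bincount(labels))
print("variance explained by 2 components:", pca.explained_variance_ratio_.sum())
```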

2.3 Binary Decision Trees

Decision trees can be used for scheduling jobs based on the deviation between the predicted and the actual weather conditions. One can compute different schedules for the different weather forecasts/predictions and store them in a binary decision tree. Later, depending on the deviation from the actual weather data, one can use the rules stored in the nodes of the decision tree to select one of the schedules.

2.4 Support Vector Machines

The SVM is a supervised machine learning technique used for binary classification of the input into different sets after training a model on previous data (a sketch appears at the end of this section). It can work in multiple dimensions, represented by different variables. An SVM works by building a model that is then used for classification. It is based on a kernel method, which can be used for classification, regression, dimensionality reduction or clustering. Kernels can further be used to make regression non-linear and to avoid over- or under-fitting. This is important because a linear fit is not possible for much of the data observed in the real world. In simple linear regression techniques, such as least-squares fitting, the regression function is chosen by fitting a line that minimizes the total squared error over the entire data set; for a data set that is very unevenly distributed, however, linear regression may not be the best-suited technique. Kernel techniques also make it easier to apply linear methods in higher-dimensional feature spaces, and they allow the comparison of objects consisting of multiple attributes.

Another way to understand SVMs is to view training as an optimization problem whose goal is to identify the hyperplane that has the maximum distance (margin) to the nearest data points of both classes. LIBSVM and SVMlight are two software libraries that implement SVMs.

2.5 Artificial Neural Networks

This technique is based on simulating a biological neural network, which consists of artificial neurons. By training the artificial neurons on historical data over a period of time, one can predict future outputs. It is based on modeling how the neurons connect (represented by connection weights) and pass information to each other for different inputs. A neuron collects the inputs from its neighboring neurons and, based on the magnitude of this combined input, decides whether to fire or not. An ANN can have multiple layers, e.g. an input, a hidden and an output layer.

Additionally, we suggest the following techniques: Gibbs sampling, ridge regression, mean-field methods, and RBF kernels.
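As a sketch of the kernel-based classification described in Section 2.4, the example below trains an RBF-kernel SVM on two synthetic weather regimes. scikit-learn (whose SVC implementation is backed by LIBSVM) is assumed to be available, and the data and labels are invented for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic features: [temperature, cloud cover, wind speed]; label 1 = "high solar output day"
X = np.vstack([
    rng.normal([25, 20, 3], [4, 12, 1.5], size=(150, 3)),   # mostly high-output days
    rng.normal([12, 75, 6], [4, 12, 2.0], size=(150, 3)),   # mostly low-output days
])
y = np.array([1] * 150 + [0] * 150)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# RBF kernel gives a non-linear decision boundary; features are standardized first
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
```

The RBF kernel is what provides the non-linear decision boundary that the section argues is needed when the data cannot be separated by a linear fit.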
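Similarly, for the feed-forward network of Section 2.5, here is a minimal sketch with a single hidden layer. The simulated mapping from weather features to power output and the use of scikit-learn's MLPRegressor are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Simulated history: features = [temperature, cloud cover, wind speed], target = power output (kW)
X = rng.uniform([0, 0, 0], [35, 100, 12], size=(1000, 3))
y = 2.0 * X[:, 2] ** 2 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# One hidden layer with 32 neurons; the connection weights are learned from the historical data
ann = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
ann.fit(X_train, y_train)

print("R^2 on held-out data:", ann.score(X_test, y_test))
```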

3 Conclusion

Bayesian inference can be used to predict the values of missing variables. Clustering techniques can be used to identify similarities between points in a data set and to remove redundancy. Binary decision trees can be used for dynamic scheduling (based on rules stored in the nodes) within the green control center. SVMs and ANNs have to be analyzed further to determine whether they are useful for weather prediction.

References

[1] Adrian E. Raftery, Tilmann Gneiting, Fadoua Balabdaoui and Michael Polakowski, "Using Bayesian Model Averaging to Calibrate Forecast Ensembles", Monthly Weather Review, Vol. 133, No. 5, pp. 1155-1174, May 2005.

[2] Antonio S. Cofino, Rafael Cano, Carmen Sordo and Jose M. Gutierrez, "Bayesian Networks for Probabilistic Weather Prediction", Proceedings of the 15th European Conference on Artificial Intelligence (ECAI 2002), pp. 695-699.

[3] William R. Burrows, Colin Price and Laurence J. Wilson, "Warm Season Lightning Probability Prediction for Canada and the Northern United States", Weather and Forecasting, Vol. 20, pp. 971-988, 2005.

[4] Thomas A. Henzinger, Anmol V. Singh, Vasu Singh, Thomas Wies and Damien Zufferey, Proceedings of the 3rd USENIX Conference on Hot Topics in Cloud Computing, pp. 1-1.

[5] Yulia Gel, Adrian E. Raftery, Tilmann Gneiting, Claudia Tebaldi, Doug Nychka, William Briggs, Mark S. Roulston and Veronica J. Berrocal, "Calibrated Probabilistic Mesoscale Weather Field Forecasting: The Geostatistical Output Perturbation Method", Journal of the American Statistical Association, Vol. 99, No. 467, pp. 575-590, September 2004.

[6] M. E. McIntyre, "Numerical Weather Prediction: A Vision of the Future", Weather, Vol. 43, No. 8, pp. 294-298, Blackwell Publishing Ltd., 1988.

[7] Inigo Goiri, Ryan Beauchea, Kien Le, Thu D. Nguyen, Md. E. Haque, Jordi Guitart, Jordi Torres and Ricardo Bianchini, "GreenSlot: Scheduling Energy Consumption in Green Datacenters", Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis.

[8] Inigo Goiri, Kien Le, Thu D. Nguyen, Jordi Guitart, Jordi Torres and Ricardo Bianchini, "GreenHadoop: Leveraging Green Energy in Data-Processing Frameworks", European Conference on Computer Systems (EuroSys 2012), Bern, Switzerland, April 10-13, 2012.

[9] Navin Sharma, Pranshu Sharma, David Irwin and Prashant Shenoy, "Predicting Solar Generation from Weather Forecasts Using Machine Learning", Proceedings of the Second IEEE International Conference on Smart Grid Communications (SmartGridComm), Brussels, Belgium, October 2011.