Renewable Energy 52 (2013) 118e127. Contents lists available at SciVerse ScienceDirect. Renewable Energy


Short-term solar power prediction using a support vector machine

Jianwu Zeng, Wei Qiao *
Department of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, NE, USA

Article history: Received 3 March 2012. Accepted 1 October 2012. Available online

Keywords: Autoregressive (AR) model; Radial basis function neural network (RBFNN); Short term; Solar power prediction (SPP); Support vector machine (SVM)

Abstract

This paper proposes a least-square (LS) support vector machine (SVM)-based model for short-term solar power prediction (SPP). The input of the model includes historical data of atmospheric transmissivity in a novel two-dimensional (2D) form and other meteorological variables, including sky cover, relative humidity, and wind speed. The output of the model is the predicted atmospheric transmissivity, which is then converted to solar power according to the latitude of the site and the time of the day. Computer simulations are carried out to validate the proposed model by using data obtained from the National Solar Radiation Database (NSRDB). Results show that the proposed model not only significantly outperforms a reference autoregressive (AR) model but also achieves better prediction accuracy than a radial basis function neural network (RBFNN)-based model. The superiority of transmissivity over the sigmoid function for data normalization is verified. Simulation studies also show that the use of additional meteorological variables, especially sky cover, improves the accuracy of SPP. © 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Due to the uncertainty and intermittency of solar energy, short-term solar power prediction (SPP) has become an important issue in increasing the penetration of solar energy sources in electric power grids.
Accurate short-term SPP not only helps optimize the integration of solar energy into electric power grids, but also ensures a favorable trading performance of solar energy in electricity markets [1]. However, the SPP accuracy largely depends on the meteorological and climatic conditions of a location [2], which makes SPP a challenging problem. The goal of SPP is to predict the solar radiation at a location at a specific time in the future. There are two main categories of SPP models: physical models and statistical models. Physical models use mathematical equations to describe the physics and dynamics of the atmosphere that influence solar radiation [1,3]. They work well for medium- and long-term SPP. Statistical models are mainly based on time series analysis [4]. They have lower complexity than physical models and can perform well for short-term SPP. Therefore, this paper focuses on statistical models for short-term SPP.

Autoregressive (AR) and autoregressive moving average (ARMA) [5] models are frequently used linear models in SPP [6,7]. These linear models are superior to the traditional persistence model. Artificial intelligence-based nonlinear models, such as the Takagi-Sugeno fuzzy model [8], wavelet-based models [9], and artificial neural network (ANN)-based models [10], have been shown to be superior to the linear models. Recently, the use of support vector machines (SVMs) for time series prediction has been studied [11], in applications such as temperature prediction [12], short-term wind prediction [13], and solar flare forecasting [14]. The success of SVMs in time series prediction is largely due to their good generalization ability. However, the effectiveness of using SVMs for short-term SPP has not been studied yet.

In SPP, feature selection plays a key role in determining the performance of the prediction models. Improper features will lead to poor regression in an SPP model.

* Corresponding author. E-mail address: wqiao@engr.unl.edu (W. Qiao).
It is known that solar radiation largely depends upon meteorological conditions. Some studies have shown that SPP models using certain extra meteorological variables can achieve better performance than those using solar radiation data only [15]. Sun duration, air temperature [16], cloudiness index (CI) [17], atmospheric transmissivity [18], day length, and time can be used as extra inputs for SPP models. A common approach to predicting solar power is to first predict the CI or the atmospheric transmissivity and then convert it to solar radiation according to a known formula [19]. Nevertheless, solar radiation was taken as a 1D time series in most of the existing work, which turned out to be inferior to a 2D representation [20]. The 2D representation of solar radiation makes it possible to further combine image processing methods with nonlinear prediction methods to improve the accuracy of SPP [21].

This paper proposes an SVM-based model for short-term SPP. Computer simulations are carried out to show the superiority of the

SVM-based model over an AR model and a radial basis function neural network (RBFNN) model by using data obtained from the National Solar Radiation Database (NSRDB). The 2D transmissivity and other meteorological variables, including sky cover, relative humidity, and wind speed, are used as the inputs for different models to predict the atmospheric transmissivity, which is then converted to solar power according to the latitude of the site and the time of the day.

The paper is organized as follows. Section 2 describes different SPP models. The novel 2D solar radiation representation and data normalization are illustrated in Section 3. Computer simulations are provided in Section 4 to demonstrate the superiority of the SVM-based model and investigate the factors that influence the performance of the model. Section 5 provides concluding remarks.

Acronyms
2D: two-dimensional
ANN: artificial neural network
AR: autoregressive
ARMA: autoregressive moving average
DOY: day of the year
LS: least square
NSRDB: National Solar Radiation Database
MAE: mean absolute error
MAPE: mean absolute percentage error
RBFNN: radial basis function neural network
SPP: solar power prediction
SVM: support vector machine
TOD: time of the day

2. The SPP models

Consider the original dataset consisting of $m$ variables, where one is the ground solar radiation time series $y_t$ ($t = 1, \ldots, N$), and the others are meteorological variables $X_t$, which form an $N \times (m-1)$ matrix. In this paper, $X_t^{(i)}$ is the $i$th ($i = 1, \ldots, m-1$) column of the matrix $X_t$ and represents the time series of the $i$th meteorological variable; $X^{(i)}(t)$ represents the value of $X_t^{(i)}$ at time $t$. The following normalized time series of solar radiation is used as the input and output of the prediction models:

$$\bar{y}_t = f(y_t) \tag{1}$$

where $f(\cdot)$ is the normalization function to be defined in Section 3.3, and $\bar{y}_t$ is the normalized value of the original time series $y_t$. Similarly, $X_t^{(i)}$ can be normalized to $\bar{X}_t^{(i)}$.

2.1. The AR model

The AR model expresses a time series as a linear function of its past values. The order of the AR model indicates how many past values are used. An AR model, AR($m \cdot D$), with an order of $m \cdot D$ can be written as:

$$\hat{y}(t+h) = [a_{11}, a_{12}, \ldots, a_{1D}, a_{21}, a_{22}, \ldots, a_{2D}, \ldots, a_{m1}, a_{m2}, \ldots, a_{mD}] \cdot x_t + e(t) \tag{2}$$

where $h$ is the prediction horizon; $\hat{y}(t+h)$ is the $h$-step-ahead predicted value of $\bar{y}(t+h)$, which is the $h$-step-ahead normalized solar radiation; $m$ is the number of input variables; $a_{ij}$ ($i = 1, \ldots, m$ and $j = 1, \ldots, D$) is the AR coefficient; $x_t = [\bar{X}^{(1)}(t), \bar{X}^{(1)}(t-1), \ldots, \bar{X}^{(1)}(t-D+1), \ldots, \bar{X}^{(m-1)}(t), \bar{X}^{(m-1)}(t-1), \ldots, \bar{X}^{(m-1)}(t-D+1), \bar{y}(t), \bar{y}(t-1), \ldots, \bar{y}(t-D+1)]^T$ contains the current and past values of the time series $\bar{X}_t^{(i)}$ ($i = 1, \ldots, m-1$) and $\bar{y}_t$; and $e(t)$ is a noise or error term, which is assumed to be a normally distributed random number.

2.2. The RBFNN-based model

RBFNNs are a class of feed-forward ANNs constructed based on function approximation theory. Fig. 1 shows the structure of an RBFNN, where $x_t^{(i)}$ denotes the $i$th component of $x_t$. It has three functionally distinct layers. The input layer is simply a set of sensory units. The second layer is a hidden layer of sufficient dimension, which uses RBFs as activation functions to perform a nonlinear transformation from the input space to a higher-dimensional hidden-unit space. The third layer performs a linear transformation from the hidden-unit space to the output space.

Fig. 1. The structure of an RBFNN.

The output of the RBFNN is given by:

$$\hat{y}(t+h) = \sum_{i=1}^{n} w_i \, \phi(x_t; c_i, \sigma_i) + w_0 \tag{3}$$

where $x_t \in \mathbb{R}^{m \cdot D}$ is the input regression vector, which is the same as that of the AR model; $n$ is the number of neurons (i.e., RBF units) in the hidden layer; $w_0$ is a bias term; $w_i$ is the weight between the hidden and output layers; and $\phi(\cdot)$ is the RBF activation function in the hidden layer, which is defined as:

$$\phi(x_t; c_i, \sigma_i) = \exp\left(-\frac{\|x_t - c_i\|^2}{2\sigma_i^2}\right) \tag{4}$$

where $c_i$ and $\sigma_i$ are the center and width of the RBF, respectively. The values of $c_i$ and $\sigma_i$ can be determined by different methods. The simplest method is to randomly choose a subset of the data points as the RBF centers. A more sophisticated approach is to cluster the data into an appropriate number of clusters, whose centers are then used as the centers of the RBF units. In this paper, a local Gaussian mixture model [22] with spherical covariance structure is created to determine the RBF centers by a K-means clustering algorithm [23]. The Gaussian mixture model is trained by using the expectation-maximization (EM) algorithm [24].
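As a rough illustration of the regression vector $x_t$ and the AR model in (2), the following Python sketch stacks the current and D − 1 past values of each normalized series and fits the coefficients by ordinary least squares. The least-squares fit, the function names, the column layout, and the synthetic data are assumptions made for illustration only; the paper does not state how the AR coefficients are estimated.

```python
import numpy as np

def build_regression_data(series, D, h):
    """Build regression vectors x_t and targets y(t+h).

    series: array of shape (N, m). The last column is assumed to hold the
            normalized solar radiation; the other m-1 columns hold the
            normalized meteorological variables (layout chosen for this sketch).
    Returns X with shape (n_samples, m*D) and y with shape (n_samples,).
    """
    series = np.asarray(series, dtype=float)
    N, m = series.shape
    rows, targets = [], []
    for t in range(D - 1, N - h):
        # Rows t, t-1, ..., t-D+1 of the data (newest first).
        window = series[t - D + 1:t + 1][::-1]
        # Variable-major ordering, matching x_t: all lags of variable 1,
        # then variable 2, ..., and finally the radiation itself.
        rows.append(window.T.reshape(-1))
        targets.append(series[t + h, -1])
    return np.array(rows), np.array(targets)

def fit_ar(X, y):
    """Estimate the AR coefficients a_ij by ordinary least squares (assumed)."""
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_ar(coeffs, X):
    """h-step-ahead prediction: inner product of coefficients and x_t."""
    return X @ coeffs

if __name__ == "__main__":
    # Synthetic data: one "meteorological" series x and a radiation series y.
    rng = np.random.default_rng(0)
    N = 300
    x = rng.random(N)
    y = np.zeros(N)
    for t in range(N - 1):
        y[t + 1] = 0.5 * y[t] + 0.3 * x[t]
    X, target = build_regression_data(np.column_stack([x, y]), D=1, h=1)
    print(fit_ar(X, target))
```

With D = 1 and h = 1 the regression vector reduces to the latest value of each variable; larger D simply appends older lags in the same variable-major order.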

It has been shown [25] that setting the widths of the RBFs equal to the variances of the corresponding mixture model tends to give poor results, because the widths are too small and there is insufficient overlap between the RBFs. In this paper, all the widths are set to the same value, which is proportional to the maximum Euclidean distance, $d_{\max}$, between RBF centers:

$$\sigma_i = k_\sigma \cdot d_{\max} \tag{5}$$

where $k_\sigma$ is a nonnegative scalar whose typical value is in the range of [0.1, 0.2] [26]. Given a time series $\bar{y}_t$, (3) can be written as:

$$\hat{y}_t = \Phi \cdot w \tag{6}$$

where $w = [w_0, w_1, \ldots, w_n]^T$ is the vector of the output weights and bias term, and $\Phi$ is the matrix of hidden-layer activations due to the input data $\bar{y}_t$. A sum-of-squares error function is defined by:

$$E = \frac{1}{2} \left\| \hat{y}_t - \bar{y}_t \right\|^2 \tag{7}$$

Since this error function is a quadratic function of the vector $w$, the pseudo-inverse can be used to determine the optimal $w$ that minimizes the error function:

$$w = \Phi^{+} \cdot \bar{y}_t \tag{8}$$

where $\Phi^{+} = (\Phi^T \Phi)^{-1} \Phi^T$. The Netlab toolbox [25] is used to construct the RBFNN in the simulation studies of this paper.

2.3. The SVM-based model

SVMs belong to the class of kernel methods. The use of an SVM for time series prediction can be expressed as follows:

$$\hat{y}(t+h) = v^T \varphi(x_t) + b \tag{9}$$

where $b \in \mathbb{R}$ is a bias term; $v \in \mathbb{R}^M$ is the weight vector; and $\varphi: \mathbb{R}^{m \cdot D} \to \mathbb{R}^M$ is a nonlinear feature map, which transforms the input vector $x_t \in \mathbb{R}^{m \cdot D}$ to a higher-dimensional vector $\varphi(x_t) \in \mathbb{R}^M$. Fig. 2 shows the structure of an SVM, where $\varphi^{(i)}(x_t)$ denotes the $i$th component of $\varphi(x_t)$. In an SVM, the historical data of the time series are mapped into a higher-dimensional feature space via the nonlinear mapping $\varphi$; linear regression is then used in the high-dimensional feature space to predict the time series, which is equivalent to solving a nonlinear regression problem in the low-dimensional space of the original time series [27].

Fig. 2. The structure of an SVM.

The key issue in solving such a prediction problem is to find the optimal values of the SVM parameters $v$ and $b$. This can be done by solving a constrained optimization problem [28]:

$$\min \; \frac{1}{2} v^T v + \gamma \sum_{t=1}^{N} e^2(t) \quad \text{s.t.} \quad \bar{y}(t+h) = v^T \varphi(x_t) + b + e(t), \quad t = 1, 2, \ldots, N \tag{10}$$

where $\bar{y}(t+h)$ is the observed value corresponding to $\hat{y}(t+h)$; $e(t)$ is the prediction error at time $t$; and $\gamma$ is a regularization parameter, which balances the fitting in the training stage against the generalization in the implementation stage. A $\gamma$ that is too large or too small might deteriorate the generalization ability of the SVM in the implementation stage. Problem (10) can be solved by using Lagrange multipliers, and the solution is expressed in its dual form. The resulting SVM of (9) is called a least-square (LS) SVM and can be represented as follows:

$$\hat{y}(t+h) = \sum_{i=1}^{N} \alpha_i K(x_t, x_i) + b \tag{11}$$

where $\alpha_i$ ($i = 1, \ldots, N$) is the nonnegative Lagrange multiplier of Problem (10), and $K(x_t, x_i) = \varphi(x_t)^T \varphi(x_i)$ is a positive-definite kernel function. The input samples associated with the nonzero Lagrange multipliers in (11) are referred to as the support vectors (SVs). The value of a multiplier $\alpha_i$ represents the contribution of the corresponding sample to the SVM; namely, a larger $\alpha_i$ indicates that its corresponding SV is more important. Commonly used kernel functions include linear, polynomial, and RBF kernels. An SVM with the following RBF kernel [29] is used in this paper:

$$K(x_t, x_i) = \exp\left(-\frac{\|x_t - x_i\|^2}{\sigma^2}\right) \tag{12}$$

where $\sigma$ is the width of the RBF kernel, which determines the influence area of the SVs over the data space.

3. Data representation and preprocessing

The National Solar Radiation Database (NSRDB) [30] is used to validate the effectiveness of the proposed model. The data were recorded from 1991 to 2005 at 1454 locations in the United States and contain 47 variables, including hourly solar radiation and other meteorological data. In this paper, three sites located in different regions of the United States are selected: Seattle (Station ID: 72793) in the northwest, Denver (Station ID: ) in the Midwest, and Miami (Station ID: 7222) in the southeast. The Denver data are used in the following illustration.

3.1. 2D representation

To visualize the benefits of using the 2D representation, the average values of the 14-year NSRDB data (Jan. 1, 1991–Dec. 31, 2004) are first considered as a 1D time series and then as a 2D image formed in raster-scan form, with the columns and rows corresponding to days and hours, respectively. Figs. 3 and 4 show the 1D and 2D representations of the 14-year average solar radiation data, respectively, where each data point is the average value of the data points at the same time in the 14 years. In Fig. 3, it is visually difficult to grasp the solar radiation characteristics within a day, although the seasonal behavior is obvious. In Fig. 4, both the daily and the seasonal behavior of solar radiation can be easily interpreted, where a larger value in the range of [0, 9] indicates stronger radiation. In winter, the sunrise-to-sunset period is shorter than in summer, while in summer the radiation at noon is the strongest of the whole year. Such a 2D representation provides a significant insight into not only the

radiation pattern as a function of time, but also the horizontal and vertical correlations within the 2D data. As shown in Fig. 4, the solar radiation values are close to zero from 8 pm to 6 am of the next day because there is no sunlight at night. It is unnecessary to use all recorded data for either training or testing. Therefore, only the data recorded during 6 am–8 pm are used in this paper.

Fig. 3. Hourly solar radiation data in a 1D time plot (horizontal axis: day of the year).

Fig. 4. A 2D image view of the solar radiation data (axes: day of the year and hour of the day).

3.2. Input dimension determination

The embedding dimension $D$ of the input of the prediction model is determined by the autocorrelation coefficients of $y_t$ over the 14 historical years (Jan. 1, 1991–Dec. 31, 2004), defined as follows:

$$\rho_k = \frac{1}{(N-k)\,\sigma_y^2} \sum_{t=k+1}^{N} \left[ y(t) - \mu_y \right] \left[ y(t-k) - \mu_y \right] \tag{13}$$

where $\rho_k$ ($k = 1, 2, \ldots, 6$) is called the autocorrelation coefficient at lag $k$; $\mu_y$ and $\sigma_y$ are the mean and standard deviation of the time series $y_t$, respectively; and $N$ is the total number of samples of the time series $y_t$. Due to the diurnal characteristic of solar radiation, i.e., the radiation from 9 pm to 5 am can be taken as zero, 15-h data instead of 24-h data are used in a day. The autocorrelation coefficients between two consecutive days can be easily calculated by replacing $k$ with 24 in (13).

Table 1 shows the autocorrelation coefficients between the current solar radiation and the historical solar radiation obtained from the 14-year average data. In Table 1, $i$ denotes an hour in a day (e.g., 1 pm) and $j$ denotes a day (e.g., September 1) in a year. The value in cell ($i-2$, $j-3$) represents the autocorrelation coefficient between the latest solar radiation data and the data obtained 3 days and 2 h (i.e., 74 h) ago, which is 0.647. An important observation from Table 1 is that there are strong correlations between the radiations not only in consecutive hours, but also in the same hours of consecutive days. The correlation between two consecutive days at the same hour is stronger than that between the current hour and 2 h earlier in the same day. Therefore, when constructing a prediction model, the data from the same hour of the previous day should be used with a higher priority than the data of the previous 2 h in the same day. In this study, the radiation data at the same hour of the former two days and the latest data are used as the input of the prediction model.

Table 1. Autocorrelation coefficients of solar radiation (rows: hours; columns: days $j-3$, $j-2$, $j-1$, $j$).

3.3. Normalization

Two functions, $f_1(\cdot)$ and $f_2(\cdot)$, are used for normalization of $y_t$ in this paper. The first is the sigmoid function:

$$\pi_t = f_1(y_t) = \frac{1}{1 + \exp\left(-\dfrac{y_t - \mu_y}{\sigma_y}\right)} \tag{14}$$

where $\pi_t$ is the normalized value of $y_t$ using the sigmoid function $f_1(\cdot)$, which is also used for normalizing the other meteorological variables $X_t^{(i)}$ ($i = 1, \ldots, m-1$). Another method for data normalization is based on the concept of transmissivity [31], which is defined as the ratio between the solar radiation received on the ground surface and the incoming solar radiation (extraterrestrial radiation) at the top of the atmosphere:

$$\tau_t = f_2(y_t) = \frac{y_t}{R_t} \tag{15}$$

where $\tau_t$ is the time series of the transmissivity, and $y_t$ and $R_t$ are the time series of the ground and extraterrestrial radiations, respectively. The extraterrestrial solar radiation can be accurately calculated by using geometry factors (latitude and longitude), day of the year (DOY), and time of the day (TOD). Therefore, once the transmissivity is known, the ground radiation can be easily obtained. The transmissivity takes time and weather variations into account. It not only reflects the ground radiation, but also contains certain weather information.

Fig. 5 compares the two normalization methods for radiation data. Fig. 5(a) shows the original radiation observed at the Denver station on July 16, 2000 and Feb. 26, 2001, where the $R_t$ curve indicates the seasonal variation of the solar radiation. The extraterrestrial radiation on July 16 is much stronger than that on Feb. 26. The ground radiation $y_t$ curve reflects the effect of the weather condition on the solar radiation. For example, the sky on Feb. 26 was clearer than that on July 16; otherwise, the ground radiation on July 16 should be much larger than that on Feb. 26. Fig. 5(b) shows the normalized values of ground radiation obtained by using the two methods. It can be seen that the sigmoid-normalized ground radiation values ($\pi_t$) on both days are similar; the sigmoid function fails to reveal the weather difference. On the other hand, the transmissivity reflects the weather condition: a larger $\tau$ corresponds to a clearer sky, which plays an important role in SPP. Due to this physical meaning, the normalization by the transmissivity is superior to that by the sigmoid function.

Fig. 5. Comparison of the two methods for normalization of the radiation data: (a) the original ground and extraterrestrial radiation curves; (b) the normalized ground radiation curve using sigmoid function and transmissivity.

Fig. 6 illustrates the flowchart of the overall SPP system, which includes four blocks: normalization, feature representation, prediction, and denormalization. The input of the SPP system includes the ground solar radiation $y_t$ and the other meteorological variables $X_t$. They are normalized by using the transmissivity or the sigmoid function, and the results are $\bar{y}_t$ and $\bar{X}_t^{(i)}$, respectively, where $\bar{y}_t = \pi_t$ if the sigmoid function is used for normalization and $\bar{y}_t = \tau_t$ if the transmissivity is used for normalization. After data normalization, the input vector $x_t$ of the prediction model is generated by the feature representation block. The output of the prediction model is $\hat{y}(t+h)$, which is the $h$-step-ahead predicted value of the normalized solar radiation. It is converted to the ground solar radiation by the denormalization block.

Fig. 6. Flowchart of the overall SPP system.

Fig. 7. Training and testing set generation.

3.4. Performance evaluation

The mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient ($\rho$) of the observed and

predicted values are used to evaluate the performance of the SPP models. Their definitions are expressed as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| \hat{y}(t) - y(t) \right| \tag{16}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{\hat{y}(t) - y(t)}{y(t)} \right| \tag{17}$$

$$\rho = \frac{\sum_{t=1}^{N} \left[ \hat{y}(t) - \mu_{\hat{y}} \right] \left[ y(t) - \mu_y \right]}{\sqrt{\sum_{t=1}^{N} \left[ \hat{y}(t) - \mu_{\hat{y}} \right]^2} \, \sqrt{\sum_{t=1}^{N} \left[ y(t) - \mu_y \right]^2}} \tag{18}$$

where $y$ and $\hat{y}$ are the observed and predicted values of ground solar radiation, respectively, and $\mu_{\hat{y}}$ represents the mean value of $\hat{y}$. Smaller values of MAE and MAPE imply a superior prediction performance of the model. $\rho$ ($|\rho| \le 1$) is a measure of linear correlation between two variables. A larger positive $\rho$ indicates that the samples are more correlated. In order to evaluate the improvement of one model over another, a parameter called skill is defined as follows:

$$\mathrm{skill} = \frac{|e_0 - e_1|}{e_0} \times 100\% \tag{19}$$

where $e_1$ and $e_0$ are the MAEs of the SPP using a new model and the reference model, respectively. A larger skill value indicates more improvement of the new model over the reference model.

Fig. 8. The MAE and MAPE as functions of the training length for the Denver dataset.

Fig. 9. One-hour-ahead prediction in Denver using SVM.

Fig. 10. Correlation between the real and predicted solar radiations in Denver.

Fig. 11. One-hour-ahead prediction in Seattle using SVM.

Fig. 12. Comparison of the MAEs and MAPEs of the AR, RBFNN, and SVM-based prediction models using the data in Denver and Seattle.

Fig. 13. Comparison of the 1-h-ahead predicted values of the three models with the observations in two consecutive days (9/5/2005 and 9/6/2005) in Seattle.

4. Simulation results

In this section, simulations are carried out for short-term SPP using the NSRDB. The original data are divided into two sets: the training set and the testing set. Fig. 7 shows the division of the data of ($k+1$) years that are used for training and testing, where $L_0$ is the length of the testing set; $s_0$ represents the first testing data sample; $m_0 = s_0 + L_0/2$ is the middle point of the testing set; $p_0$ and $p_{01}$, which are symmetric with respect to $m_0$, represent the first and the last data samples of the season, respectively; and $L_{j1}$ is the length of the data that are selected for training in the $j$th year, where $j = 0, 1, \ldots, k$, and $k$ is the number of historical years selected for generating training data. In this paper, $L_{11} = L_{21} = \cdots = L_{k1}$. The data samples in $[p_j, p_{j1}]$ of the $j$th year ($j = 1, \ldots, k$) as well as in $[p_0, s_0)$ of the testing year are selected to form the training set. In this study, the testing set is selected to be the data from Sept. 1, 2005 to Sept. 10, 2005, which has moderate numbers of sunny and cloudy days. Given the number of historical years $k$, the training data are automatically generated by the method shown in Fig. 7. The training set then contains the data from July 17 to Oct. 2 in previous years and from July 17 to Aug. 31 in 2005.
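The error measures of Section 3.4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the MAPE is multiplied by 100 so that it is reported in percent (an assumption based on the "MAPE (%)" axis label), and samples with zero observed radiation are assumed to have been removed beforehand, since (17) divides by the observation.

```python
import numpy as np

def mae(y_obs, y_pred):
    """Mean absolute error, as in (16)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_obs))

def mape(y_obs, y_pred):
    """Mean absolute percentage error, as in (17), scaled to percent.

    Assumes every observation is nonzero (e.g. night-time samples removed).
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_obs) / y_obs))

def correlation(y_obs, y_pred):
    """Correlation coefficient between observations and predictions, as in (18)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    d_obs = y_obs - y_obs.mean()
    d_pred = y_pred - y_pred.mean()
    return np.sum(d_pred * d_obs) / np.sqrt(np.sum(d_pred**2) * np.sum(d_obs**2))

if __name__ == "__main__":
    observed = [100.0, 200.0, 400.0]    # made-up radiation values in W/m^2
    predicted = [110.0, 190.0, 400.0]
    print(mae(observed, predicted), mape(observed, predicted),
          correlation(observed, predicted))
```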
The inputs of all three models include the latest observed solar radiation, the radiations at the hour of prediction in the previous two consecutive days, and the latest observations of some meteorological features, including sky cover, wind speed, and relative humidity. In addition, since there is no radiation at night, only the observations from 5 am to 9 pm are used. Therefore, for the 1-h prediction, the radiations from 6 am to 9 pm in a day are predicted. During testing, all of the predicted values are true out-of-sample forecasts, in which only the historical data samples are used. The predicted value is then compared to the actual measured value. The procedure is repeated for the next hour until it runs over the entire testing dataset.

4.1. Short-term prediction

Fig. 8 shows the MAE and MAPE as functions of the length of the training data (called the training length) for a prediction horizon of 1 h in Denver. As shown in Fig. 8, a longer training length is not always better. The MAE and MAPE decrease drastically as the training length increases up to six years. However, beyond six years the MAE and MAPE increase with the training length. Therefore, six years is selected as the best training length for the Denver dataset.

Figs. 9 and 10 show the 1-h-ahead prediction results in Denver using the SVM-based model, where the normalized error is defined as $(1 - \hat{y}_t / y_t) \times 100\%$. As shown in Fig. 9, the SVM-based model works well, especially during clear days (the 45th–105th hours), where the predicted values closely follow the observations. Large prediction errors mainly occur on those days when the ground radiation changes drastically. For instance, the weather conditions during the 30th–45th and 105th–120th hours make the prediction less accurate than in the 45th–105th hours. The error distribution shows that the majority of the prediction errors concentrate in a small range. More than 70% of the normalized errors are less than 10% in the case of 1-h-ahead SPP. Fig. 10 shows the correlation between the real and predicted solar radiations in Denver. As mentioned above, $\rho$ ($|\rho| \le 1$) is a measure of linear correlation between two variables. $\rho = 0.974$ indicates that the predicted and actual radiations are highly correlated. Moreover, such a conclusion can also be drawn from the slope of the fitted line, which is close to one (i.e., 45°) in this case. Therefore, the predicted values closely match the actual data, which indicates an accurate prediction. Fig. 11 shows the 1-h-ahead prediction in Seattle, which has fewer clear days than Denver, by using the SVM-based model. Similarly, more than 50% of the normalized errors are less than 10% and approximately 45% of the normalized errors concentrate within 5%.

4.2. Comparison of three SPP models

The AR model, which has been shown to be superior to the persistence model, is used as the reference model in this paper. Fig. 12 compares the AR, RBFNN, and SVM-based models for SPP using the data in Denver and Seattle. The RBFNN and SVM achieved much better results than the AR model in terms of accuracy. In Denver's 1-h-ahead prediction, the MAE of the AR model is 62 W/m², while the MAEs of the RBFNN and SVM are 43 W/m² and 33.7 W/m², respectively. The inferiority of the AR model shows that the linear model is worse than the nonlinear models at capturing the nonlinear characteristics of the solar radiation. Moreover, the prediction accuracy improvement brought by the proposed SVM-based model is better than that of the RBFNN. Fig. 13 compares the predicted values of the three models with the observations during two consecutive days in Seattle (i.e., the 75th–105th hours in Fig. 11). Likewise, the predicted values obtained from the SVM follow the observed values much more closely than those of the AR and RBFNN models. For example, on Sept. 6, 2005, the SVM-based model quickly captures the change in the solar radiation during the 22nd–28th hours in the morning and achieves smaller prediction errors than the other two models.

These results show that the SVM model is superior to the RBFNN model for SPP because the SVM has better generalization ability than the RBFNN. This can be explained from two aspects. First, the objective function of the SVM not only includes the error term used for training the RBFNN, but also has an extra weights term designed to balance the training and fitting errors, as shown in (10). Second, in the SVM approach the size of the hidden layer is obtained as the result of the quadratic programming. Therefore, the complexity of the SVM is automatically handled in the process of training the SVM, whereas in the RBFNN approach the number of hidden neurons needs to be provided before training.

Fig. 14. Comparison of the features' skill at the three sites (panels: Denver, Seattle, Miami; features: sky cover, wind, humidity; vertical axis: skill (%)).

4.3. Effectiveness analysis

Effectiveness analysis is conducted in this study to evaluate the different normalization methods and the use of additional meteorological variables. Table 2 compares the MAEs of SPP using the SVM with two different normalization methods, i.e., the sigmoid function and the transmissivity. The use of transmissivity reduces the prediction errors to some extent, because the transmissivity contains certain information that is useful for prediction.

Table 2. Comparison of MAEs using different normalization methods: sigmoid function (transmissivity).
Seattle (W/m²)    Denver (W/m²)    Miami (W/m²)
(34.23)           (33.773)         (62.864)
(48.53)           (48.77)          (78.285)
(62.275)          (58.174)         (91.476)

Table 3. Comparison of MAEs without (with) meteorological variables.
Seattle (W/m²)    Denver (W/m²)    Miami (W/m²)
(34.23)           (33.773)         (62.864)
(48.53)           (48.77)          (78.285)
(64.169)          (58.174)         (91.476)
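The LS-SVM training step behind these comparisons, i.e., Problem (10) with the RBF kernel (12), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the linear system is derived here from the optimality conditions of (10) as printed (with the penalty term written as gamma times the sum of squared errors), which gives a zero sum of the dual variables and a ridge term of 1/(2·gamma) on the kernel matrix.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel of (12): K(a, b) = exp(-||a - b||^2 / sigma^2)."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / sigma**2)

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM dual of (10) as one linear system.

    Unknowns are the bias b and the dual variables alpha:
        [ 0   1^T              ] [ b     ]   [ 0 ]
        [ 1   K + I/(2*gamma)  ] [ alpha ] = [ y ]
    """
    N = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(N) / (2.0 * gamma)
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]          # b, alpha

def lssvm_predict(X_train, alpha, b, X_new, sigma):
    """Prediction of (11): sum_i alpha_i * K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

if __name__ == "__main__":
    # Toy one-dimensional regression problem (synthetic, for illustration).
    X = np.linspace(0.0, 1.0, 30)[:, None]
    y = np.sin(2.0 * np.pi * X[:, 0])
    b, alpha = lssvm_fit(X, y, gamma=10.0, sigma=0.3)
    print(lssvm_predict(X, alpha, b, np.array([[0.25]]), sigma=0.3))
```

In practice gamma and sigma would be tuned on held-out data; the values above are arbitrary.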
Table 3 compares the SPP results of using the SVM with (the values in brackets) and without additional meteorological variables, i.e., sky cover, relative humidity, and wind speed. The results indicate that using additional meteorological information always improves the prediction accuracy in Seattle and Denver. The improvement in Denver is particularly significant. For example, at the Denver site when the prediction horizon is 3 h, the MAE is improved by 14 W/m², which corresponds to a 19% improvement over the prediction without using additional meteorological variables. However, the effect of using meteorological variables in Miami is the reverse, i.e., the use of meteorological variables worsens the prediction accuracy.

It is necessary to explore the importance of each meteorological variable (i.e., feature). The importance of a feature can be evaluated by measuring the skill of adding this feature. The skills of the three features at the different sites are compared in Fig. 14. It shows that sky cover plays a more important role in prediction than humidity and wind speed at all three sites. It also indicates that humidity and wind speed are not good features to be used individually; they might have an adverse effect on the prediction because the skills of humidity and wind speed might become negative, as shown in Fig. 14. However, the overall effectiveness of using meteorological variables is not simply a superposition of the effects of the different features. As shown in Table 3, the prediction accuracy was improved in Seattle and Denver when all three features were used, while in Miami low humidity and small wind variance provide little weather information for the prediction and therefore become redundant features. Moreover, relative humidity and wind speed are site-dependent features and should not be taken as general inputs for the prediction model.
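The skill arithmetic used in this discussion can be checked with a short Python sketch. Note the assumptions: the function below is a signed variant of (19) (the equation as printed takes an absolute value), chosen here so that a feature that worsens the MAE yields a negative skill, as described for humidity and wind speed; the function name and the MAE values in the example are illustrative and are not taken from the paper's tables.

```python
def skill(e_ref, e_new):
    """Signed skill in percent: positive when the new model has a lower MAE.

    e_ref: MAE of the reference model; e_new: MAE of the new model.
    """
    return 100.0 * (e_ref - e_new) / e_ref

if __name__ == "__main__":
    # Illustrative MAEs (W/m^2) that differ by 14 W/m^2.
    print(round(skill(72.2, 58.2), 1))
    # A feature that increases the MAE gives a negative skill.
    print(skill(50.0, 55.0))
```

With these illustrative numbers, a 14 W/m² reduction from a reference MAE of about 72 W/m² corresponds to a skill of roughly 19%, the same order as the Denver 3-h example quoted above.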
Unlike relative humidity and wind speed, the sky cover feature should be included as an input of the prediction model because it always improves the prediction accuracy. Taking Denver as an example, when sky cover is used as an additional input, the skill is 2.8% relative to the prediction that uses solar radiation only.

5. Conclusion and discussion

This paper has presented an SVM-based model for short-term SPP. Simulation studies using data from the NSRDB at three different sites have yielded several conclusions. First, in terms of prediction accuracy, the proposed SVM-based model significantly outperformed the AR model, because the SVM is better able to capture the nonlinear and time-varying nature of the solar radiation data than the AR model; in addition, the SVM model achieved better performance than the RBFNN model, which is largely due to the good generalization ability of the SVM. Second, the proposed model has used a novel 2D representation for hourly solar radiation, which gives more insight into the solar radiation pattern than the regular 1D representation. In addition, the normalization of solar radiation with the transmissivity has produced smaller prediction errors than the sigmoid function normalization, since the transmissivity contains extra useful information about weather features. Moreover, simulation results have also indicated that other meteorological variables should be used to improve the SPP. For example, sky cover is a prominent, site-independent feature that should be taken as an extra input of the model.

Acknowledgment

This material is based upon work supported in part by the U.S. National Science Foundation (NSF) under CAREER Award ECCS and the U.S. Federal Highway Administration (FHWA) under Agreement No. DTFH61-1-H-3. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the NSF or FHWA.

Predicting Solar Generation from Weather Forecasts Using Machine Learning

Predicting Solar Generation from Weather Forecasts Using Machine Learning Predicting Solar Generation from Weather Forecasts Using Machine Learning Navin Sharma, Pranshu Sharma, David Irwin, and Prashant Shenoy Department of Computer Science University of Massachusetts Amherst

More information

A Study on the Comparison of Electricity Forecasting Models: Korea and China

A Study on the Comparison of Electricity Forecasting Models: Korea and China Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison

More information

Solar Irradiance Forecasting Using Multi-layer Cloud Tracking and Numerical Weather Prediction

Solar Irradiance Forecasting Using Multi-layer Cloud Tracking and Numerical Weather Prediction Solar Irradiance Forecasting Using Multi-layer Cloud Tracking and Numerical Weather Prediction Jin Xu, Shinjae Yoo, Dantong Yu, Dong Huang, John Heiser, Paul Kalb Solar Energy Abundant, clean, and secure

More information

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt

More information

Power Prediction Analysis using Artificial Neural Network in MS Excel

Power Prediction Analysis using Artificial Neural Network in MS Excel Power Prediction Analysis using Artificial Neural Network in MS Excel NURHASHINMAH MAHAMAD, MUHAMAD KAMAL B. MOHAMMED AMIN Electronic System Engineering Department Malaysia Japan International Institute

More information

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network Qian Wu, Yahui Wang, Long Zhang and Li Shen Abstract Building electrical system fault diagnosis is the

More information

Machine Learning in FX Carry Basket Prediction

Machine Learning in FX Carry Basket Prediction Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines

More information

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S. AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

More information

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

More information

Applications of improved grey prediction model for power demand forecasting

Applications of improved grey prediction model for power demand forecasting Energy Conversion and Management 44 (2003) 2241 2249 www.elsevier.com/locate/enconman Applications of improved grey prediction model for power demand forecasting Che-Chiang Hsu a, *, Chia-Yon Chen b a

More information

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network , pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and

More information

Artificial Neural Network and Non-Linear Regression: A Comparative Study

Artificial Neural Network and Non-Linear Regression: A Comparative Study International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012 1 Artificial Neural Network and Non-Linear Regression: A Comparative Study Shraddha Srivastava 1, *, K.C.

More information

Automated Stellar Classification for Large Surveys with EKF and RBF Neural Networks

Automated Stellar Classification for Large Surveys with EKF and RBF Neural Networks Chin. J. Astron. Astrophys. Vol. 5 (2005), No. 2, 203 210 (http:/www.chjaa.org) Chinese Journal of Astronomy and Astrophysics Automated Stellar Classification for Large Surveys with EKF and RBF Neural

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Handling of incomplete data sets using ICA and SOM in data mining

Handling of incomplete data sets using ICA and SOM in data mining Neural Comput & Applic (2007) 16: 167 172 DOI 10.1007/s00521-006-0058-6 ORIGINAL ARTICLE Hongyi Peng Æ Siming Zhu Handling of incomplete data sets using ICA and SOM in data mining Received: 2 September

More information

SOLAR IRRADIANCE FORECASTING, BENCHMARKING of DIFFERENT TECHNIQUES and APPLICATIONS of ENERGY METEOROLOGY

SOLAR IRRADIANCE FORECASTING, BENCHMARKING of DIFFERENT TECHNIQUES and APPLICATIONS of ENERGY METEOROLOGY SOLAR IRRADIANCE FORECASTING, BENCHMARKING of DIFFERENT TECHNIQUES and APPLICATIONS of ENERGY METEOROLOGY Wolfgang Traunmüller 1 * and Gerald Steinmaurer 2 1 BLUE SKY Wetteranalysen, 4800 Attnang-Puchheim,

More information

Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

More information

Using artificial intelligence for data reduction in mechanical engineering

Using artificial intelligence for data reduction in mechanical engineering Using artificial intelligence for data reduction in mechanical engineering L. Mdlazi 1, C.J. Stander 1, P.S. Heyns 1, T. Marwala 2 1 Dynamic Systems Group Department of Mechanical and Aeronautical Engineering,

More information

Prediction Model for Crude Oil Price Using Artificial Neural Networks

Prediction Model for Crude Oil Price Using Artificial Neural Networks Applied Mathematical Sciences, Vol. 8, 2014, no. 80, 3953-3965 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.43193 Prediction Model for Crude Oil Price Using Artificial Neural Networks

More information

A Wavelet Based Prediction Method for Time Series

A Wavelet Based Prediction Method for Time Series A Wavelet Based Prediction Method for Time Series Cristina Stolojescu 1,2 Ion Railean 1,3 Sorin Moga 1 Philippe Lenca 1 and Alexandru Isar 2 1 Institut TELECOM; TELECOM Bretagne, UMR CNRS 3192 Lab-STICC;

More information

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

More information

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

More information

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Journal of Computational Information Systems 10: 17 (2014) 7629 7635 Available at http://www.jofcis.com A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Tian

More information

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling 1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information

More information

Statistical Learning for Short-Term Photovoltaic Power Predictions

Statistical Learning for Short-Term Photovoltaic Power Predictions Statistical Learning for Short-Term Photovoltaic Power Predictions Björn Wolff 1, Elke Lorenz 2, Oliver Kramer 1 1 Department of Computing Science 2 Institute of Physics, Energy and Semiconductor Research

More information

Predicting the Solar Resource and Power Load

Predicting the Solar Resource and Power Load 1 Predicting the Solar Resource and Power Load David Sehloff, Celso Torres Supervisor: Alex Cassidy, Dr. Arye Nehorai Department of Electrical and Systems Engineering Washington University in St. Louis

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

Sales Forecast for Pickup Truck Parts:

Sales Forecast for Pickup Truck Parts: Sales Forecast for Pickup Truck Parts: A Case Study on Brake Rubber Mojtaba Kamranfard University of Semnan Semnan, Iran mojtabakamranfard@gmail.com Kourosh Kiani Amirkabir University of Technology Tehran,

More information

Use of Artificial Neural Network in Data Mining For Weather Forecasting

Use of Artificial Neural Network in Data Mining For Weather Forecasting Use of Artificial Neural Network in Data Mining For Weather Forecasting Gaurav J. Sawale #, Dr. Sunil R. Gupta * # Department Computer Science & Engineering, P.R.M.I.T& R, Badnera. 1 gaurav.sawale@yahoo.co.in

More information

A system of direct radiation forecasting based on numerical weather predictions, satellite image and machine learning.

A system of direct radiation forecasting based on numerical weather predictions, satellite image and machine learning. A system of direct radiation forecasting based on numerical weather predictions, satellite image and machine learning. 31st Annual International Symposium on Forecasting Lourdes Ramírez Santigosa Martín

More information

DATA MINING-BASED PREDICTIVE MODEL TO DETERMINE PROJECT FINANCIAL SUCCESS USING PROJECT DEFINITION PARAMETERS

DATA MINING-BASED PREDICTIVE MODEL TO DETERMINE PROJECT FINANCIAL SUCCESS USING PROJECT DEFINITION PARAMETERS DATA MINING-BASED PREDICTIVE MODEL TO DETERMINE PROJECT FINANCIAL SUCCESS USING PROJECT DEFINITION PARAMETERS Seungtaek Lee, Changmin Kim, Yoora Park, Hyojoo Son, and Changwan Kim* Department of Architecture

More information

Supply Chain Forecasting Model Using Computational Intelligence Techniques

Supply Chain Forecasting Model Using Computational Intelligence Techniques CMU.J.Nat.Sci Special Issue on Manufacturing Technology (2011) Vol.10(1) 19 Supply Chain Forecasting Model Using Computational Intelligence Techniques Wimalin S. Laosiritaworn Department of Industrial

More information

THE SVM APPROACH FOR BOX JENKINS MODELS

THE SVM APPROACH FOR BOX JENKINS MODELS REVSTAT Statistical Journal Volume 7, Number 1, April 2009, 23 36 THE SVM APPROACH FOR BOX JENKINS MODELS Authors: Saeid Amiri Dep. of Energy and Technology, Swedish Univ. of Agriculture Sciences, P.O.Box

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

Data Mining mit der JMSL Numerical Library for Java Applications

Data Mining mit der JMSL Numerical Library for Java Applications Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale

More information

Predicting daily incoming solar energy from weather data

Predicting daily incoming solar energy from weather data Predicting daily incoming solar energy from weather data ROMAIN JUBAN, PATRICK QUACH Stanford University - CS229 Machine Learning December 12, 2013 Being able to accurately predict the solar power hitting

More information

Energy Load Mining Using Univariate Time Series Analysis

Energy Load Mining Using Univariate Time Series Analysis Energy Load Mining Using Univariate Time Series Analysis By: Taghreed Alghamdi & Ali Almadan 03/02/2015 Caruth Hall 0184 Energy Forecasting Energy Saving Energy consumption Introduction: Energy consumption.

More information

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling 1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information

More information

Optimizing the prediction models of the air quality state in cities

Optimizing the prediction models of the air quality state in cities Air Pollution XV 89 Optimizing the prediction models of the air quality state in cities J. Skrzypski, E. Jach-Szakiel & W. Kamiński Faculty of Process and Environmental Engineering, Technical University

More information

Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian Regressive

Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian Regressive Journal of Information & Computational Science 12:9 (215) 3637 3646 June 1, 215 Available at http://www.joics.com Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian

More information

A Novel Method for Predicting the Power Output of Distributed Renewable Energy Resources

A Novel Method for Predicting the Power Output of Distributed Renewable Energy Resources A Novel Method for Predicting the Power Output of Distributed Renewable Energy Resources Aris-Athanasios Panagopoulos1 Joint work with Georgios Chalkiadakis2 and Eftichios Koutroulis2 ( Predicting the

More information

Big Data Analytic Paradigms -From PCA to Deep Learning

Big Data Analytic Paradigms -From PCA to Deep Learning The Intersection of Robust Intelligence and Trust in Autonomous Systems: Papers from the AAAI Spring Symposium Big Data Analytic Paradigms -From PCA to Deep Learning Barnabas K. Tannahill Aerospace Electronics

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Modeling of System of Systems via Data Analytics Case for Big Data in SoS 1

Modeling of System of Systems via Data Analytics Case for Big Data in SoS 1 Modeling of System of Systems via Data Analytics Case for Big Data in SoS 1 Barnabas K. Tannahill Aerospace Electronics and Information Technology Division Southwest Research Institute San Antonio, TX,

More information

Forecasting Solar Power Generated by Grid Connected PV Systems Using Ensembles of Neural Networks

Forecasting Solar Power Generated by Grid Connected PV Systems Using Ensembles of Neural Networks Forecasting Solar Power Generated by Grid Connected PV Systems Using Ensembles of Neural Networks Mashud Rana Australian Energy Research Institute University of New South Wales NSW, Australia md.rana@unsw.edu.au

More information

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive

More information

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network Prince Gupta 1, Satanand Mishra 2, S.K.Pandey 3 1,3 VNS Group, RGPV, Bhopal, 2 CSIR-AMPRI, BHOPAL prince2010.gupta@gmail.com

More information

Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central

Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central Article Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central Yin Zhao, Yahya Abu Hasan School of Mathematical Sciences, Universiti Sains

More information

Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network

Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network Dušan Marček 1 Abstract Most models for the time series of stock prices have centered on autoregressive (AR)

More information

Flexible Neural Trees Ensemble for Stock Index Modeling

Flexible Neural Trees Ensemble for Stock Index Modeling Flexible Neural Trees Ensemble for Stock Index Modeling Yuehui Chen 1, Ju Yang 1, Bo Yang 1 and Ajith Abraham 2 1 School of Information Science and Engineering Jinan University, Jinan 250022, P.R.China

More information

Joseph Twagilimana, University of Louisville, Louisville, KY

Joseph Twagilimana, University of Louisville, Louisville, KY ST14 Comparing Time series, Generalized Linear Models and Artificial Neural Network Models for Transactional Data analysis Joseph Twagilimana, University of Louisville, Louisville, KY ABSTRACT The aim

More information

Dynamic intelligent cleaning model of dirty electric load data

Dynamic intelligent cleaning model of dirty electric load data Available online at www.sciencedirect.com Energy Conversion and Management 49 (2008) 564 569 www.elsevier.com/locate/enconman Dynamic intelligent cleaning model of dirty electric load data Zhang Xiaoxing

More information

Evaluation of Machine Learning Techniques for Green Energy Prediction

Evaluation of Machine Learning Techniques for Green Energy Prediction arxiv:1406.3726v1 [cs.lg] 14 Jun 2014 Evaluation of Machine Learning Techniques for Green Energy Prediction 1 Objective Ankur Sahai University of Mainz, Germany We evaluate Machine Learning techniques

More information

SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS. J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID

SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS. J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID Renewable Energy Laboratory Department of Mechanical and Industrial Engineering University of

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

A Simple Introduction to Support Vector Machines

A Simple Introduction to Support Vector Machines A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear

More information

INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr.

INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. Meisenbach M. Hable G. Winkler P. Meier Technology, Laboratory

More information

Forecasting Geographic Data Michael Leonard and Renee Samy, SAS Institute Inc. Cary, NC, USA

Forecasting Geographic Data Michael Leonard and Renee Samy, SAS Institute Inc. Cary, NC, USA Forecasting Geographic Data Michael Leonard and Renee Samy, SAS Institute Inc. Cary, NC, USA Abstract Virtually all businesses collect and use data that are associated with geographic locations, whether

More information

Meteorological Forecasting of DNI, clouds and aerosols

Meteorological Forecasting of DNI, clouds and aerosols Meteorological Forecasting of DNI, clouds and aerosols DNICast 1st End-User Workshop, Madrid, 2014-05-07 Heiner Körnich (SMHI), Jan Remund (Meteotest), Marion Schroedter-Homscheidt (DLR) Overview What

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Global Seasonal Phase Lag between Solar Heating and Surface Temperature

Global Seasonal Phase Lag between Solar Heating and Surface Temperature Global Seasonal Phase Lag between Solar Heating and Surface Temperature Summer REU Program Professor Tom Witten By Abstract There is a seasonal phase lag between solar heating from the sun and the surface

More information
