Online Forecasting of Stock Market Movement Direction Using the Improved Incremental Algorithm


Dalton Lunga and Tshilidzi Marwala
University of the Witwatersrand, School of Electrical and Information Engineering, Private Bag 3, Wits 2050, Johannesburg, South Africa
{d.lunga, marwala

Abstract. In this paper we present a particular implementation of the Learn++ algorithm: we investigate the predictability of financial movement direction with Learn++ by forecasting the daily movement direction of the Dow Jones. The Learn++ algorithm is derived from the AdaBoost algorithm, which relies on subsampling of the training data. The goal of concept learning, according to the probably approximately correct (PAC) weak learning model, is to use a set of examples to generate a description of another function, called the hypothesis, which is close to the concept. The hypothesis derived from weak learning is boosted to provide a better composite hypothesis that generalizes well in establishing the final classification boundary. The framework is implemented using a multilayer perceptron (MLP) as the weak learner. First, a weak learning algorithm, which tries to learn a class concept with a single-input perceptron, is established. The Learn++ algorithm is then applied to improve the learning capacity of the weak MLP and to introduce the concept of online incremental learning. The proposed framework is able to adapt as new data are introduced and is able to classify.

1 Introduction

The financial market is a complex, evolutionary, and nonlinear dynamical system. The field of financial forecasting is characterized by data intensity, noise, non-stationarity, an unstructured nature, a high degree of uncertainty, and hidden relationships [1]. Many factors interact in finance, including political events, general economic conditions, and traders' expectations. Therefore, predicting market price movements is quite difficult. Increasingly, according to academic investigations, movements in market prices are not random.
Rather, they behave in a highly nonlinear and dynamical manner. The standard random walk assumption of future prices may merely be a veil of randomness that shrouds a noisy nonlinear process [2]. Incremental learning is the solution to such scenarios. It can be defined as the process of extracting new information from an additional dataset that later becomes available, without losing prior knowledge. Various definitions and interpretations of incremental learning can be found in the literature, including online learning [3], relearning of previously misclassified instances, and growing and pruning of classifier architectures [4]. An algorithm possesses incremental learning capabilities if it meets the following criteria:

- Ability to acquire additional knowledge when new stock data are introduced.
- Ability to retain previously learned information about the stock closing prices.
- Ability to learn new classes of stock data if introduced by new data.

Some applications of online classification problems have been reported recently [5]. In most cases, the degree of accuracy and the acceptability of certain classifications are measured by the error of misclassified instances. Although Learn++ has mostly been applied to classification problems, we show in this paper that the choice of the Learn++ algorithm can boost a weak learning model to classify stock closing values with minimum error and reduced training time. For practitioners in financial markets, forecasting methods based on minimizing forecast error may not be adequate to meet their objectives. In other words, trading driven by a certain forecast with a small forecast error may not be as profitable as trading guided by an accurate prediction of the direction of movement. The main goal of this study is to explore the predictability of financial market movement direction using an ensemble of classifiers implemented using the Learn++ algorithm. This paper discusses ensemble systems, introduces the basic theory of incremental learning and the Learn++ algorithm, and gives the experimental scheme as well as the results obtained.

I. King et al. (Eds.): ICONIP 2006, Part III, LNCS 4234, pp , © Springer-Verlag Berlin Heidelberg 2006

2 Ensemble of Classifiers

Ensemble systems have attracted a great deal of attention over the last decade due to their empirical success over single classifier systems on a variety of applications. Such systems combine an ensemble of generally weak classifiers to take advantage of the so-called instability of the weak classifier. This instability causes the classifiers to construct sufficiently different decision boundaries for minor modifications in their training parameters, and as a result each classifier makes different errors on any given instance. A strategic combination of these classifiers, such as weighted majority voting [6], then eliminates the individual errors, generating a strong classifier. A rich collection of algorithms has been developed using multiple classifiers, such as AdaBoost [7], with the general goal of improving the generalization performance of the classification system. Using multiple classifiers for incremental learning, however, has been largely unexplored. Learn++, in part inspired by AdaBoost, was developed in response to recognizing the potential feasibility of an ensemble of classifiers in solving the incremental learning problem. Learn++ was initially introduced in [8] as an incremental learning algorithm for MLP-type networks. A more versatile form of the algorithm was presented in [9] for all supervised classifiers. We have recently recognized that the inherent voting mechanism of the algorithm can also be used to determine effectively the confidence of the classification system in its own decision making. In this work, we describe the Learn++ algorithm, along with representative results on incremental learning and confidence estimation obtained by applying the algorithm to predict the direction of movement of the Dow Jones Average Indicators.

3 Incremental Learning

An incremental learning algorithm is defined as an algorithm that learns new information from unseen data without necessitating access to previously used data [10]. The algorithm must also be able to learn new information from new data and still retain knowledge from the original data. Lastly, the algorithm must be able to learn new classes that may be introduced by new data. This type of learning algorithm is sometimes referred to as a memoryless online learning algorithm. Learning new information without requiring access to previously used data, however, raises the stability-plasticity dilemma [11]. This dilemma indicates that a completely stable classifier maintains the knowledge from previously seen data but fails to adjust in order to learn new information, while a completely plastic classifier is capable of learning new data but loses prior knowledge. The problem with the MLP is that it is a stable classifier and is not able to learn new information after it has been trained.

Different procedures have been implemented for incremental learning. One procedure for learning new information from additional data involves discarding the existing classifier and training a new classifier using the accumulated data. Other methods, such as pruning of networks, controlled modification of classifier weights, or growing of classifier architectures, are also referred to as incremental learning algorithms. Controlled modification involves adjusting the weights of the classifier using the misclassified instances only.
The above algorithms are capable of learning new information; however, they suffer from catastrophic forgetting and require access to old data. One approach evaluates the current performance of the classifier architecture. If the present architecture does not sufficiently represent the decision boundaries being learned, new decision clusters are generated in response to new patterns. Furthermore, this approach does not require access to old data and can accommodate new classes. However, the main shortcomings of this approach are cluster proliferation and extreme sensitivity to the selection of algorithm parameters. In this paper, Learn++ is implemented for online prediction of stock movement direction using the Dow Jones average indicators. The Learn++ algorithm is summarized in the next section.

4 Learn++

Learn++ is an incremental learning algorithm that uses an ensemble of classifiers that are combined using weighted majority voting. Learn++ was inspired by a boosting algorithm called adaptive boosting (AdaBoost).
Each classifier is trained using a training subset that is drawn according to a distribution D. The classifiers are trained using a weaklearn algorithm. The requirement for the weaklearn algorithm is that it must be able to give a classification rate of at least 50% initially. For each database $S_k$ that contains learning examples and their corresponding classes, Learn++ starts by initializing the weights, $w$, according to the distribution $D_t$, $t = 1, \dots, T$, where $T$ is the number of hypotheses. Initially the weights are uniform, which gives all instances equal probability of being selected into the first training subset, and the distribution is given by

$$D = \frac{1}{m} \qquad (1)$$

where $m$ represents the number of training examples in database $S_k$. The training data are then divided into a training subset $TR$ and a testing subset $TE$ to ensure weaklearn capability. The distribution is used to select the training subset $TR$ and the testing subset $TE$ from $S_k$. After the training and testing subsets have been selected, the weaklearn algorithm is implemented. The weaklearner is trained using the subset $TR$. A hypothesis $h_t$ obtained from the weaklearner is tested using both the training and testing subsets to obtain an error $\epsilon_t$:

$$\epsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i) \qquad (2)$$

The error is required to be less than 1/2; a normalized error $\beta_t$ is computed using:

$$\beta_t = \frac{\epsilon_t}{1 - \epsilon_t} \qquad (3)$$

If the error is greater than 1/2, the hypothesis is discarded, new training and testing subsets are selected according to $D_t$, and another hypothesis is computed. All classifiers generated so far are combined using weighted majority voting to obtain the composite hypothesis $H_t$:

$$H_t = \arg\max_{y \in Y} \sum_{t:\, h_t(x) = y} \log\frac{1}{\beta_t} \qquad (4)$$

Weighted majority voting gives higher voting weights to a hypothesis that performs well on its training and testing subsets. The error of the composite hypothesis is computed as in Eq. 5:

$$E_t = \sum_{i:\, H_t(x_i) \neq y_i} D_t(i) \qquad (5)$$

If the error is greater than 1/2, the current composite hypothesis is discarded and new training and testing data are selected according to the distribution $D_t$. Otherwise, if the error is less than 1/2, the normalized error of the composite hypothesis is computed as:

$$B_t = \frac{E_t}{1 - E_t} \qquad (6)$$
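
Equations (2) to (6) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the three hypotheses are given directly as hypothetical prediction lists rather than trained MLPs, the function names are invented for this example, and the class labels 1 and -1 follow the two movement directions used later in the paper.

```python
import math

def weighted_error(preds, labels, dist):
    """Eq. (2) / Eq. (5): distribution mass on the misclassified instances."""
    return sum(d for p, y, d in zip(preds, labels, dist) if p != y)

def normalized_error(err):
    """Eq. (3) / Eq. (6): err / (1 - err), defined here for 0 < err < 1/2."""
    if not 0.0 < err < 0.5:
        raise ValueError("error must lie strictly between 0 and 1/2")
    return err / (1.0 - err)

def weighted_majority_vote(votes, betas, classes=(1, -1)):
    """Eq. (4): choose the class with the largest sum of log(1 / beta_t)."""
    totals = {c: 0.0 for c in classes}
    for vote, beta in zip(votes, betas):
        totals[vote] += math.log(1.0 / beta)
    return max(classes, key=lambda c: totals[c])

# Hypothetical toy data: five instances under the uniform distribution of Eq. (1).
labels = [1, -1, 1, 1, -1]
dist = [1.0 / len(labels)] * len(labels)
hyps = [
    [1, -1, 1, -1, -1],   # h_1: one mistake
    [1, 1, -1, 1, -1],    # h_2: two mistakes
    [-1, -1, 1, 1, -1],   # h_3: one mistake
]
betas = [normalized_error(weighted_error(h, labels, dist)) for h in hyps]

# Composite hypothesis H_t evaluated on every instance, and its error E_t.
composite = [weighted_majority_vote([h[i] for h in hyps], betas)
             for i in range(len(labels))]
composite_error = weighted_error(composite, labels, dist)
```

In this toy case each hypothesis errs on different instances, so the weighted vote recovers every label even though no single hypothesis is perfect, which is the effect the ensemble relies on.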
The normalized error $B_t$ is used in the distribution update rule, where the weights of the correctly classified instances are reduced, consequently increasing the relative weights of the misclassified instances. This ensures that instances that were misclassified by the current hypothesis have a higher probability of being selected for the subsequent training set. The distribution update rule is given by

$$w_{t+1}(i) = w_t(i) \times B_t^{\,1 - [\![ H_t(x_i) \neq y_i ]\!]} \qquad (7)$$

where $[\![\cdot]\!]$ equals 1 if the condition holds and 0 otherwise, so that only correctly classified instances are scaled by $B_t < 1$. Once the $T$ hypotheses are created for each database, the final hypothesis is computed by combining the composite hypotheses using weighted majority voting:

$$H_{\mathrm{final}} = \arg\max_{y \in Y} \sum_{k=1}^{K} \sum_{t:\, h_t(x) = y} \log\frac{1}{\beta_t} \qquad (8)$$

5 Confidence Measurement

A closely related issue is the confidence of the classifier in its decision, with particular interest in whether the confidence of the algorithm improves as new data become available. The voting mechanism inherent in Learn++ hints at a practical approach for estimating confidence: decisions made with a vast majority of votes have better confidence than those made by a slight majority [12]. We have implemented McIver and Friedl's weighted exponential voting based confidence metric [13] with Learn++ as

$$C_i(x) = P(y = i \mid x) = \frac{\exp\left(F_i(x)\right)}{\sum_{k=1}^{N} \exp\left(F_k(x)\right)}, \qquad 0 \le C_i(x) \le 1 \qquad (9)$$

where $C_i(x)$ is the confidence assigned to instance $x$ when classified as class $i$, $F_i(x)$ is the total vote associated with the $i$th class for the instance $x$, and $N$ is the number of classes. The total vote $F_i(x)$ that class $i$ receives for a given instance is computed as

$$F_i(x) = \sum_{t=1}^{N} \begin{cases} \log\dfrac{1}{\beta_t}, & \text{if } h_t(x) = i \\ 0, & \text{otherwise} \end{cases} \qquad (10)$$

The confidence of the winning class is then taken as the confidence of the algorithm in making the decision with respect to the winning class. Since $C_i(x)$ is between 0 and 1, the confidences can be translated into linguistic indicators as shown in Table 1. These indicators are adopted and used in interpreting our experimental results.
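
The distribution update of Eq. (7) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name and toy values are hypothetical, the exponent is read in the way the prose describes (correctly classified instances are scaled down by B_t), and the final renormalization to a probability distribution is an assumption borrowed from common boosting practice that the text does not spell out.

```python
def update_distribution(weights, composite_preds, labels, big_b):
    """Scale correctly classified instances by B_t (< 1), then renormalize.

    The renormalization step is an assumption; Eq. (7) only gives the
    unnormalized weight update.
    """
    new_w = [w * (big_b if p == y else 1.0)
             for w, p, y in zip(weights, composite_preds, labels)]
    total = sum(new_w)
    return [w / total for w in new_w]

# Hypothetical toy case: four instances, the third one misclassified by H_t.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, -1, 1, -1]
composite_preds = [1, -1, -1, -1]

composite_error = 0.25                                   # E_t from Eq. (5)
big_b = composite_error / (1.0 - composite_error)        # B_t from Eq. (6)
new_dist = update_distribution(weights, composite_preds, labels, big_b)
```

After the update the misclassified instance carries half of the probability mass, so it is far more likely to be drawn into the next training subset.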
Equations (9) and (10) allow Learn++ to determine its own confidence in any classification it makes. The desired outcome of the confidence analysis is to observe a high confidence on correctly classified instances, and a low confidence on misclassified instances, so that the low confidence can be used to flag those instances that are being misclassified by the algorithm. A second desired outcome is to observe improved confidences on correctly classified instances and reduced confidence on misclassified instances, as new data become available, so that the incremental learning ability of the algorithm can be further confirmed.
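
A minimal sketch of Eqs. (9) and (10), together with the linguistic levels of Table 1, is given below. The function names, votes, and normalized errors are hypothetical values chosen for illustration, and the two classes 1 and -1 are the movement directions used in this paper.

```python
import math

def total_votes(votes, betas, classes):
    """Eq. (10): sum of log(1 / beta_t) over hypotheses voting for each class."""
    return {c: sum(math.log(1.0 / b) for v, b in zip(votes, betas) if v == c)
            for c in classes}

def confidences(votes, betas, classes=(1, -1)):
    """Eq. (9): exponentiated total votes normalized over all classes."""
    f = total_votes(votes, betas, classes)
    z = sum(math.exp(value) for value in f.values())
    return {c: math.exp(f[c]) / z for c in classes}

def confidence_level(c):
    """Map a confidence in [0, 1] to the linguistic levels of Table 1."""
    pct = 100.0 * c
    if pct >= 90:
        return "VH"
    if pct >= 80:
        return "H"
    if pct >= 70:
        return "M"
    if pct >= 60:
        return "L"
    return "VL"

# Hypothetical example: three hypotheses, two vote for class 1.
votes = [1, 1, -1]
betas = [0.25, 0.5, 0.25]
conf = confidences(votes, betas)
winner = max(conf, key=conf.get)
level = confidence_level(conf[winner])
```

Here class 1 wins with a confidence of about 0.67, which falls in the Low band of Table 1: the vote was won, but only by a modest margin.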

Table 1. Confidence estimation representation

Confidence range (%)    Confidence level
90 ≤ C ≤ 100            Very High (VH)
80 ≤ C < 90             High (H)
70 ≤ C < 80             Medium (M)
60 ≤ C < 70             Low (L)
C < 60                  Very Low (VL)

6 Forecasting Framework

6.1 Experimental Design

In our empirical analysis, we set out to examine the daily changes of the Dow Jones Index. The Dow Jones averages are unique in that they are price weighted rather than market capitalization weighted. Their component weightings are therefore affected only by changes in the stock prices, in contrast with the weightings of other indexes, which are affected by both price changes and changes in the number of shares outstanding [14]. When the averages were initially created, their values were calculated by simply adding up the component stock prices and dividing by the number of components. Later, the practice of adjusting the divisor was initiated to smooth out the effects of stock splits and other corporate actions. The Dow Jones Industrial Average measures the composite price performance of 30 highly capitalized stocks trading on the New York Stock Exchange (NYSE), representing a broad cross-section of US industries. Trading in the index has gained unprecedented popularity in major financial markets around the world. The increasing diversity of financial instruments related to the Dow Jones Index has broadened the dimension of global investment opportunity for both individual and institutional investors. There are two basic reasons for the success of these index trading vehicles. First, they provide an effective means for investors to hedge against potential market risks. Second, they create new profit-making opportunities for market speculators and arbitrageurs. Therefore, accurately forecasting the movement direction of stock prices has profound implications and significance for researchers and practitioners alike.

6.2 Model Input Selection

Most previous researchers have employed multivariate input. Several studies have examined the cross-sectional relationship between stock index and macroeconomic variables. The potential macroeconomic input variables used by the forecasting models include the term structure of interest rates (TS), short-term interest rate (ST), long-term interest rate (LT), consumer price index (CPI), industrial production (IP), government consumption (GC), private consumption (PC), gross national product (GNP), and gross domestic product (GDP). Macroeconomic variable data were not available for our study. Thus only the closing values of the Index were selected as inputs.
A one-step-ahead prediction of the Index was performed on a daily basis. The output of this prediction model was used as input to the Learn++ algorithm, which classifies it into one of two categories: 1, indicating an increase in the next day's predicted closing value compared to the previous day's closing value, or −1, indicating a decrease in the next day's predicted closing value compared to the previous day's closing value. Figure 1 depicts the conceptual model of all processes required for this study. The first prediction model can be written as in Eq. 11:

$$CV_t = F(cv_{t-1}, cv_{t-2}, cv_{t-3}, cv_{t-4}) \qquad (11)$$

where $CV_t$ is the predicted closing value at time $t$ and $cv_i$ denotes the closing value on day $i$, with $i = 1, 2, 3, \dots, t-1$. The second model takes the output of the first model as its input in predicting the direction of movement for the index. The classification prediction stage can be represented by Eq. 12:

$$\mathrm{Direction}_t = F(CV_t) \qquad (12)$$

where $CV_t$ is the first model's prediction of the fifth day's closing value when given the raw data at times $t-1$ to $t-4$. $\mathrm{Direction}_t$ is a categorical variable that indicates the movement direction of the Dow Jones Index at time $t$. If the Dow Jones Index at time $t$ is larger than that at time $t-1$, $\mathrm{Direction}_t$ is 1. Otherwise, $\mathrm{Direction}_t$ is −1.

Fig. 1. Proposed model for online stock forecasting

6.3 Experimental Results

The forecasting model described in the sections above is estimated and validated using in-sample data. The model estimation and selection process is then followed by an empirical evaluation based on the out-of-sample data. At this stage, the relative performance of the model is measured by the classification accuracy of the final hypothesis chosen for all given databases.
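
The input windows of Eq. 11 and the direction labels of Eq. 12 from Section 6.2 can be built from a series of closing values as in the sketch below. The function name and the price series are hypothetical, and the predictor F itself is deliberately left out because its form is not given at this point in the text; only the data construction is shown.

```python
def make_instances(closes, lags=4):
    """Pair each window of `lags` past closes with the next day's direction.

    The window holds cv_{t-4}, ..., cv_{t-1}; the label is 1 if the close at
    time t is larger than the close at time t-1, and -1 otherwise.
    """
    instances = []
    for t in range(lags, len(closes)):
        window = closes[t - lags:t]
        direction = 1 if closes[t] > closes[t - 1] else -1
        instances.append((window, direction))
    return instances

# Hypothetical closing values, not actual Dow Jones data.
closes = [100.0, 101.0, 100.5, 102.0, 103.0, 102.5, 104.0]
instances = make_instances(closes)
directions = [d for _, d in instances]
```

With seven closing values and four lags, three labeled instances result. A series of n closes yields n - 4 instances under this construction.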
The confidence of the algorithm in its own decision is used in establishing the accuracy of the predicted closing value category. The first experiment implements a one-step-ahead prediction of the next day's stock closing value. After predicting the next day's closing value, this value is fed into a classification model to indicate the direction of movement for the stock prices. The database consisted of 1476 instances of the Dow Jones average closing value during the period from January 2000 to November 2005; 1000 instances are used for training and all the remaining instances are used for validation. The two binary classes are 1, indicating an upward direction of returns in the Dow Jones stock, and −1, indicating a predicted fall or downward direction of movement for the Dow Jones stock. Four datasets $S_1$, $S_2$, $S_3$, $S_4$, where each dataset included exactly one quarter of the entire training data, were provided to Learn++ in four training sessions for incremental learning. For each training session $k$ ($k = 1, 2, 3, 4$), three weak hypotheses were generated by Learn++. Each hypothesis $h_1$, $h_2$, and $h_3$ of the $k$th training session was generated using a training subset $TR_t$ and a testing subset $TE_t$. The WeakLearner was a single-hidden-layer MLP with 15 hidden layer nodes and 1 output node, with an MSE goal of 0.1. The test set, Validate, consisted of 476 instances that were used for validation purposes. On average, the MLP hypothesis (the weaklearner) achieved little over 50% accuracy, which improved to over 80% when the hypotheses were combined using weighted majority voting. This demonstrates the performance improvement property of Learn++, as inherited from AdaBoost, on a given database. The data distribution and the percentage classification performance are given in Table 2. The performances listed are on the validation data, Validate, following each training session. Table 3 provides an actual breakdown of correctly classified and misclassified instances falling into each confidence range after each training session. The trends of the confidence estimates after subsequent training sessions are given in Table 4.
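
The session layout described above (1000 training instances split into four equal databases, 476 held out as Validate) can be sketched as follows. The function name is hypothetical, and the chronological, contiguous assignment of instances to sessions is an assumption made for illustration; the text does not state how instances were allocated to S1 to S4.

```python
def split_sessions(n_total=1476, n_train=1000, n_sessions=4):
    """Return index lists for the training sessions and the validation set.

    Assumes a contiguous, chronological split, which the paper does not
    confirm.
    """
    indices = list(range(n_total))
    train, validate = indices[:n_train], indices[n_train:]
    size = n_train // n_sessions
    sessions = [train[k * size:(k + 1) * size] for k in range(n_sessions)]
    return sessions, validate

sessions, validate = split_sessions()
session_sizes = [len(s) for s in sessions]
```

Each session then holds 250 instances, and Learn++ sees them one session at a time without revisiting earlier sessions.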
The desired outcome on the actual confidences is high to very high confidences on correctly classified instances, and low to very low confidences on misclassified instances. The desired outcome on confidence trends is increasing or steady confidences on correctly classified instances, and decreasing confidences on misclassified instances, as new data are introduced.

Table 2. Training and generalisation performance of Learn++
Columns: Database, Class(1), Class(−1), Test Performance (%)
Rows: S1, S2, S3, S4, Validate
[numeric entries not preserved in this transcription]

The performance shown in Table 2 indicates that the algorithm is improving its generalization capacity as new data become available. The improvement is modest, however, as the majority of the new information is already learned in the first training session. Table 3 indicates that the vast majority of correctly classified instances tend to have very high confidences, with continually improved confidences at consecutive training sessions. While a considerable portion of misclassified instances also had high confidence for this database, the general desired trends of increased confidence on correctly classified instances and decreasing confidence on misclassified ones were notable and dominant, as shown in Table 4.

Table 3. Confidence results
Columns: VH, H, M, VL, L
Rows: Correctly classified (S1, S2, S3, S4); Incorrectly classified (S1, S2, S3, S4)
[numeric entries not preserved in this transcription]

Table 4. Confidence trends for Dow Jones
Columns: Increasing, Steady, Decreasing
Rows: Correctly classified, Misclassified
[numeric entries not preserved in this transcription]

7 Conclusion

In this paper, we study the use of an incremental algorithm to predict the movement direction of financial markets. As demonstrated in our empirical analysis, Learn++ is observed to give good results in converting the weaklearner (MLP) into a strong learning algorithm that has confidence in all its decisions. The Learn++ algorithm is also able to assess the confidence of its own decisions. In general, the majority of correctly classified instances had very high confidence estimates, while lower confidence values were associated with misclassified instances. Therefore, classifications with low confidences can be used as a flag to further evaluate those instances. Furthermore, the algorithm also showed increasing confidences in correctly classified instances and decreasing confidences in misclassified instances after subsequent training sessions. This is a very comforting outcome, which further indicates that the algorithm can incrementally acquire new and novel information from additional data.

Acknowledgement

This research was fully funded by the National Research Foundation of the Republic of South Africa.

References

1. Carpenter, G., Grossberg, S., Marhuzon, N., Reynolds, J., Rosen, D.: Artmap: A neural network architecture for incremental learning supervised learning of analog multidimensional maps. In: Transactions in Neural Networks. Volume 3., IEEE (1992)
2. McNelis, P.D., ed.: Neural Networks in Finance: Gaining the predictive edge in the market. Elsevier Academic Press, Oxford-UK (2005)
3. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Science (1997)
4. Bishop, C., ed.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford-London (1995)
5. Vilakazi, B., Marwala, T., Mautla, R., Moloto, E.: Online bushing condition monitoring using computational intelligence. WSEAS Transactions on Power Systems 1 (2006)
6. Littlestone, N., Warmuth, M.: Weighted majority voting algorithm. Information and Computer Science 108 (1994)
7. Polikar, R., Byorick, J., Krause, S., Marino, A., Moreton, M.: Learn++: A classifier independent incremental learning algorithm. Proceedings of International Joint Conference on Neural Networks (2002)
8. Polikar, R.: Algorithms for enhancing pattern separability, feature selection and incremental learning with applications to gas sensing electronic noise systems. PhD thesis, Iowa State University, Ames (2000)
9. Freund, Y., Schapire, R.: A short introduction to boosting. Japanese Society for Artificial Intelligence 14 (1999)
10. Polikar, R., Udpa, L., Udpa, S., Honavar, V.: An incremental learning algorithm with confidence estimation for automated identification of NDE signals. Transactions on Ultrasonic Ferroelectrics, and Frequency Control 51 (2004)
11. Grossberg, S.: Nonlinear neural networks: principles, mechanisms and architectures. Neural Networks 1 (1988)
12. Byorick, J., Polikar, R.: Confidence estimation using the incremental learning algorithm. Lecture Notes in Computer Science 2714 (2003)
13. McIver, D., Friedl, M.: Estimating pixel-scale land cover classification confidence using nonparametric machine learning methods. Transactions on Geoscience and Remote Sensing 39 (2001)
14. Leung, M., Daouk, H., Chen, A.: Forecasting stock indices: a comparison of classification and level estimation models. International Journal of Forecasting
Class #6: Nonlinear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Nonlinear classification Linear Support Vector Machines
More informationSUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK
SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK N M Allinson and D Merritt 1 Introduction This contribution has two main sections. The first discusses some aspects of multilayer perceptrons,
More informationInternational Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
A ShortTerm Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:
More informationEvaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring
714 Evaluation of Feature election Methods for Predictive Modeling Using Neural Networks in Credits coring Raghavendra B. K. Dr. M.G.R. Educational and Research Institute, Chennai95 Email: raghavendra_bk@rediffmail.com
More informationSolving Regression Problems Using Competitive Ensemble Models
Solving Regression Problems Using Competitive Ensemble Models Yakov Frayman, Bernard F. Rolfe, and Geoffrey I. Webb School of Information Technology Deakin University Geelong, VIC, Australia {yfraym,brolfe,webb}@deakin.edu.au
More informationKnowledge Discovery in Stock Market Data
Knowledge Discovery in Stock Market Data Alfred Ultsch and Hermann LocarekJunge Abstract This work presents the results of a Data Mining and Knowledge Discovery approach on data from the stock markets
More informationEFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate
More informationKnowledge Based Descriptive Neural Networks
Knowledge Based Descriptive Neural Networks J. T. Yao Department of Computer Science, University or Regina Regina, Saskachewan, CANADA S4S 0A2 Email: jtyao@cs.uregina.ca Abstract This paper presents a
More informationIncremental Learning
Incremental Learning Abdelhamid Bouchachia Department of Informatics University of Klagenfurt Universitaetsstr. 6567 Klagenfurt, 9020 Austria voice: +43 463 2700 3525 fax: +43 463 2700 3599 email: hamid@isys.uniklu.ac.at
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right
More informationA New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication
2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management
More informationEnsembles and PMML in KNIME
Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany First.Last@UniKonstanz.De
More informationBoosting. riedmiller@informatik.unifreiburg.de
. Machine Learning Boosting Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät AlbertLudwigsUniversität Freiburg riedmiller@informatik.unifreiburg.de
More informationA Learning Algorithm For Neural Network Ensembles
A Learning Algorithm For Neural Network Ensembles H. D. Navone, P. M. Granitto, P. F. Verdes and H. A. Ceccatto Instituto de Física Rosario (CONICETUNR) Blvd. 27 de Febrero 210 Bis, 2000 Rosario. República
More informationA Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data
A Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data T. W. Liao, G. Wang, and E. Triantaphyllou Department of Industrial and Manufacturing Systems
More informationHow Boosting the Margin Can Also Boost Classifier Complexity
Lev Reyzin lev.reyzin@yale.edu Yale University, Department of Computer Science, 51 Prospect Street, New Haven, CT 652, USA Robert E. Schapire schapire@cs.princeton.edu Princeton University, Department
More information1311. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 1311 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationAnalecta Vol. 8, No. 2 ISSN 20647964
EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,
More informationREVIEW OF ENSEMBLE CLASSIFICATION
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IJCSMC, Vol. 2, Issue.
More informationBootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
More informationIncreasing Classification Accuracy. Data Mining: Bagging and Boosting. Bagging 1. Bagging 2. Bagging. Boosting Metalearning (stacking)
Data Mining: Bagging and Boosting Increasing Classification Accuracy Andrew Kusiak 2139 Seamans Center Iowa City, Iowa 522421527 andrewkusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel: 319335
More informationNeural Network based Vehicle Classification for Intelligent Traffic Control
Neural Network based Vehicle Classification for Intelligent Traffic Control Saeid Fazli 1, Shahram Mohammadi 2, Morteza Rahmani 3 1,2,3 Electrical Engineering Department, Zanjan University, Zanjan, IRAN
More informationLecture 6. Artificial Neural Networks
Lecture 6 Artificial Neural Networks 1 1 Artificial Neural Networks In this note we provide an overview of the key concepts that have led to the emergence of Artificial Neural Networks as a major paradigm
More informationIntroduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk
Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems
More informationHong Kong Stock Index Forecasting
Hong Kong Stock Index Forecasting Tong Fu Shuo Chen Chuanqi Wei tfu1@stanford.edu cslcb@stanford.edu chuanqi@stanford.edu Abstract Prediction of the movement of stock market is a longtime attractive topic
More informationIntroduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011
Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning
More informationEffect of Using Neural Networks in GABased School Timetabling
Effect of Using Neural Networks in GABased School Timetabling JANIS ZUTERS Department of Computer Science University of Latvia Raina bulv. 19, Riga, LV1050 LATVIA janis.zuters@lu.lv Abstract:  The school
More informationEnsemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008
Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles
More informationENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS
ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS Michael Affenzeller (a), Stephan M. Winkler (b), Stefan Forstenlechner (c), Gabriel Kronberger (d), Michael Kommenda (e), Stefan
More informationHYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION
HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan
More informationUsing News Articles to Predict Stock Price Movements
Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 gyozo@cs.ucsd.edu 21, June 15,
More informationCreating Shortterm Stockmarket Trading Strategies using Artificial Neural Networks: A Case Study
Creating Shortterm Stockmarket Trading Strategies using Artificial Neural Networks: A Case Study Bruce Vanstone, Tobias Hahn Abstract Developing shortterm stockmarket trading systems is a complex process,
More informationReal Stock Trading Using Soft Computing Models
Real Stock Trading Using Soft Computing Models Brent Doeksen 1, Ajith Abraham 2, Johnson Thomas 1 and Marcin Paprzycki 1 1 Computer Science Department, Oklahoma State University, OK 74106, USA, 2 School
More informationTraining Methods for Adaptive Boosting of Neural Networks for Character Recognition
Submission to NIPS*97, Category: Algorithms & Architectures, Preferred: Oral Training Methods for Adaptive Boosting of Neural Networks for Character Recognition Holger Schwenk Dept. IRO Université de Montréal
More informationFlexible Neural Trees Ensemble for Stock Index Modeling
Flexible Neural Trees Ensemble for Stock Index Modeling Yuehui Chen 1, Ju Yang 1, Bo Yang 1 and Ajith Abraham 2 1 School of Information Science and Engineering Jinan University, Jinan 250022, P.R.China
More informationInternational Journal of Electronics and Computer Science Engineering 1449
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN 22771956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
More informationTRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP
TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions
More information6.2.8 Neural networks for data mining
6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural
More informationTowards better accuracy for Spam predictions
Towards better accuracy for Spam predictions Chengyan Zhao Department of Computer Science University of Toronto Toronto, Ontario, Canada M5S 2E4 czhao@cs.toronto.edu Abstract Spam identification is crucial
More informationArtificial Neural Network and NonLinear Regression: A Comparative Study
International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012 1 Artificial Neural Network and NonLinear Regression: A Comparative Study Shraddha Srivastava 1, *, K.C.
More informationOn the effect of data set size on bias and variance in classification learning
On the effect of data set size on bias and variance in classification learning Abstract Damien Brain Geoffrey I Webb School of Computing and Mathematics Deakin University Geelong Vic 3217 With the advent
More informationAUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
More informationDECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com
More informationER Volatility Forecasting using GARCH models in R
Exchange Rate Volatility Forecasting Using GARCH models in R Roger Roth Martin Kammlander Markus Mayer June 9, 2009 Agenda Preliminaries 1 Preliminaries Importance of ER Forecasting Predicability of ERs
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationSupply Chain Forecasting Model Using Computational Intelligence Techniques
CMU.J.Nat.Sci Special Issue on Manufacturing Technology (2011) Vol.10(1) 19 Supply Chain Forecasting Model Using Computational Intelligence Techniques Wimalin S. Laosiritaworn Department of Industrial
More informationRandom forest algorithm in big data environment
Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest
More informationData quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
More informationImproved Neural Network Performance Using Principal Component Analysis on Matlab
Improved Neural Network Performance Using Principal Component Analysis on Matlab Improved Neural Network Performance Using Principal Component Analysis on Matlab Junita MohamadSaleh Senior Lecturer School
More informationData Mining Methods: Applications for Institutional Research
Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014
More informationII. RELATED WORK. Sentiment Mining
Sentiment Mining Using Ensemble Classification Models Matthew Whitehead and Larry Yaeger Indiana University School of Informatics 901 E. 10th St. Bloomington, IN 47408 {mewhiteh, larryy}@indiana.edu Abstract
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network?  Perceptron learners  Multilayer networks What is a Support
More informationOptimization of technical trading strategies and the profitability in security markets
Economics Letters 59 (1998) 249 254 Optimization of technical trading strategies and the profitability in security markets Ramazan Gençay 1, * University of Windsor, Department of Economics, 401 Sunset,
More informationNeural Networks and Back Propagation Algorithm
Neural Networks and Back Propagation Algorithm Mirza Cilimkovic Institute of Technology Blanchardstown Blanchardstown Road North Dublin 15 Ireland mirzac@gmail.com Abstract Neural Networks (NN) are important
More informationTowards applying Data Mining Techniques for Talent Mangement
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,
More informationFace Recognition For Remote Database Backup System
Face Recognition For Remote Database Backup System Aniza Mohamed Din, Faudziah Ahmad, Mohamad Farhan Mohamad Mohsin, Ku Ruhana KuMahamud, Mustafa Mufawak Theab 2 Graduate Department of Computer Science,UUM
More informationA New Quantitative Behavioral Model for Financial Prediction
2011 3rd International Conference on Information and Financial Engineering IPEDR vol.12 (2011) (2011) IACSIT Press, Singapore A New Quantitative Behavioral Model for Financial Prediction Thimmaraya Ramesh
More informationSEARCH AND CLASSIFICATION OF "INTERESTING" BUSINESS APPLICATIONS IN THE WORLD WIDE WEB USING A NEURAL NETWORK APPROACH
SEARCH AND CLASSIFICATION OF "INTERESTING" BUSINESS APPLICATIONS IN THE WORLD WIDE WEB USING A NEURAL NETWORK APPROACH Abstract Karl Kurbel, Kirti Singh, Frank Teuteberg Europe University Viadrina Frankfurt
More informationRecurrent Neural Networks
Recurrent Neural Networks Neural Computation : Lecture 12 John A. Bullinaria, 2015 1. Recurrent Neural Network Architectures 2. State Space Models and Dynamical Systems 3. Backpropagation Through Time
More informationThe Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
More informationA neural network model to forecast Japanese demand for travel to Hong Kong
Tourism Management 20 (1999) 89 97 A neural network model to forecast Japanese demand for travel to Hong Kong Rob Law*, Norman Au Department of Hotel and Tourism Management, The Hong Kong Polytechnic University,
More informationRule based Classification of BSE Stock Data with Data Mining
International Journal of Information Sciences and Application. ISSN 09742255 Volume 4, Number 1 (2012), pp. 19 International Research Publication House http://www.irphouse.com Rule based Classification
More informationDesigning a Decision Support System Model for Stock Investment Strategy
Designing a Decision Support System Model for Stock Investment Strategy Chai Chee Yong and Shakirah Mohd Taib Abstract Investors face the highest risks compared to other form of financial investments when
More informationPredicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationData Mining for Knowledge Management in Technology Enhanced Learning
Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 2729, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning
More informationCategorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors ChiaHui Chang and ZhiKai Ding Department of Computer Science and Information Engineering, National Central University, ChungLi,
More informationManjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India
Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone
More informationNeural Network and Genetic Algorithm Based Trading Systems. Donn S. Fishbein, MD, PhD Neuroquant.com
Neural Network and Genetic Algorithm Based Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com Consider the challenge of constructing a financial market trading system using commonly available technical
More information' ( ) * +, ..  '/0112! " " " #$#%#%#&