Learning Trading Negotiations Using Manually and Automatically Labelled Data


Heriberto Cuayáhuitl, Simon Keizer, Oliver Lemon
School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom
{h.cuayahuitl s.keizer

Abstract
Strategic conversational agents often need to trade resources with their opponent conversants, and trading strategically can lead to better results. While rule-based or supervised agents can be used for such a purpose, here we explore a learning approach based on automatically labelled examples from human players for automatic trading in the game of Settlers of Catan. Our experiments are based on data collected from human players trading in text-based natural language. We compare the performance of Bayes Nets, Conditional Random Fields, and Random Forests on the task of ranking trading offers, trained from both manually labelled and automatically labelled data. Our experimental results show that our best agent trained on automatic labels outperformed its counterpart trained on manual labels (with moderate annotator agreement) in terms of (a) predicting human trading negotiations better, and (b) winning more games.

Keywords: strategic interaction; supervised learning; semi-supervised learning; automatic labelling; board games

I. INTRODUCTION
Strategic conversation does not assume full cooperation during the interaction between agents [1]. In this paper, we will use a strategic card-trading board game to illustrate our approach. Board games with trading aspects aim not only at entertaining people, but also at training them in trading skills. Popular board games of this kind include Last Will, Settlers of Catan, and Power Grid, among others [2]. While these games can be played between humans, they can also be played between computers and humans. The trading behaviours of computer players are usually based on heuristics or optimisation methods.
The former include carefully tuned rules, and the latter include methods such as Monte-Carlo tree search [3] and reinforcement learning [4], [5], [6], or a combination of them [7]. However, their application is not trivial due to the complexity of the problem, e.g. large state-action spaces. On the one hand, unique situations in the game can be described by a number of variables (e.g. resources available), so that enumerating them would result in very large state spaces. On the other, the action space can also be large due to the wide range of unique negotiations (e.g. givable and receivable resources). While one can aim for optimising the whole game via compression of the search space, one can also aim for a specialised solution. The latter is the focus of this paper: we learn to trade only, rather than learning to play the whole game. In addition, while previous work has focused on optimising negotiation strategies [4], [3], our proposed approach focuses on learning human-like trading from human examples, despite the fact that in reality the best choice may not be the most human-like one, especially with non-expert player data.

Our scenario for strategic interaction is the game of Settlers of Catan, where players take the role of settlers on the fictitious island of Catan (see Figure 1). The board consists of 19 randomly arranged hexes: 3 hills, 3 mountains, 4 forests, 4 pastures, 4 fields and 1 desert. On this island, hills produce clay, mountains produce ore, pastures produce sheep, fields produce wheat, forests produce wood, and the desert produces nothing. In our setting, four players attempt to settle on the island by building settlements and cities connected by roads. To build, players need specific resource cards, for example: a road requires clay and wood; a settlement requires clay, sheep, wheat and wood; a city requires three ore cards and two wheat cards; and a development card requires ore, sheep and wheat.
Each player gets points, for example, by building a settlement (1 point) or a city (2 points), or by obtaining victory point cards (1 point each). A game consists of a sequence of turns, and each turn starts with a roll of the dice that can make the players obtain or lose resources (depending on the number rolled and the resources on the board). The player in turn can trade resources with the bank or other players, and can use available resources to build roads, settlements or cities. This game is highly strategic because players often face decisions about what resources to request and what resources to give away, which are influenced by what they need to build. A player can extend build-ups on locations connected to existing pieces, i.e. road, settlement or city, and all settlements and cities must be separated by at least 2 roads. The first player to reach 10 victory points wins and all others lose.

This paper extends our previous approach based on statistical inference for ranking trading negotiations [9], i.e. the exchange of resources for some others, from training on manually labelled data to training on automatically labelled data. We compare three statistical agents (Bayes Nets, Conditional Random Fields, and Random Forests) against rule-based and random agents as baselines, and show that our best

agent, trained on automatically labelled data, performed better than its counterpart trained on manually labelled data.

Figure 1. Example board of the game Settlers of Catan [8]. The top-middle dialogue box is a chat interface that displays the game history, including trading offers and responses from all players.

II. RELATED WORK
Machine learning techniques for strategic trading games have received little attention to date. Notable exceptions have applied reinforcement learning to board games. Tesauro [10] proposes reinforcement learning with multilayer neural networks for training an agent to play the game of Backgammon, and finds that agents trained with such an approach are able to match and even beat human performance. Pfeiffer [4] proposes hierarchical reinforcement learning for automatic decision making on object-placing and trading actions in the game of Settlers of Catan. He incorporates built-in knowledge to learn the behaviours of the game more quickly, and finds that the combination of learned and built-in knowledge is able to beat human players. Efstathiou and Lemon [6] used reinforcement learning in non-cooperative dialogue, focusing on a small 2-player trading problem with 3 resource types, but without using any real human dialogue data. This work showed that explicit manipulation moves (e.g. "I really need sheep") can be used to win when playing against adversaries who are gullible (i.e. they believe such statements) but also against adversaries who can detect manipulation and can punish the player for being manipulative [11]. More recently, Keizer et al. [12] compare training policies against hand-crafted traders and supervised traders created from human players. They found that, rather than training trading policies on hand-crafted rule-based heuristics, a more successful approach is to train trading policies from a supervised classifier trained from human examples.

Related work on supervised learning using manually and automatically labelled data has reported divergent strategies.
One strategy has been to train classifiers for natural language processing (NLP) tasks using automatically extracted examples. For example, Marcu and Echihabi [13] train a classifier for discourse relations and report a classification accuracy of up to 93%. On the other hand, Sporleder and Lascarides [14] compare classifiers for discourse relations trained on automatically extracted examples against those trained on manually labelled examples. The authors focus on a dataset with only moderate inter-annotator agreement (κ=0.592), and observe that classification accuracy drops substantially in the presence of ambiguous labels. The success of automatic labelling therefore seems to vary with the nature of the target dataset. In this paper, we will present further evidence that automatic labelling can lead to good results.

Some other supervised learning techniques, such as decision trees [15], preference learning [16], and deep neural networks [17], have been applied to train automated agents that know how to play board games. Since statistical inference has received little attention in previous work, with some exceptions [17], [9], we argue that it can play an important role in training strategic agents with human-like behaviour. In addition, statistical traders have not been trained from automatically labelled data before, and our results indicate that this approach represents a state-of-the-art method for learning trading negotiations.

Other related work has been carried out in the context of automated non-cooperative dialogue systems, where an agent may act to satisfy its own goals rather than those of other participants [5]. The game-theoretic underpinnings of non-cooperative behaviour have also been investigated [18]. Such automated agents are of interest when trying to persuade, argue, or debate, or in the area of believable characters in video games and educational simulations [5], [19].
Another arena in which non-cooperative dialogue behaviour has been investigated is negotiation [20], where hiding information (and even outright lying) can be advantageous.

Despite the machine learning efforts applied to strategic interactive games, other forms of learning remain to be explored. They include not only direct but also inverse reinforcement learning to learn from trial and error, semi-supervised learning to learn from labelled and unlabelled data, unsupervised learning to learn from unlabelled data, multi-agent systems to learn behaviours considering the strategies of opponents, transfer learning so that agents do not have to be trained from scratch, and active learning to learn to ask what to do in uncertain situations while playing the game, among others; see [15], [21], [22] for an overview. Another direction to explore in strategic games includes a combination of planning and learning, which has shown more promising results than either in isolation

[17], [7]. A further direction to explore includes end-to-end statistical training of language understanding [23], [24], game behaviour, and language generation [25], [26], [27] using a unified learning framework.

III. THE DATA AND TASK
We used a set of 32 logged games from 56 different players as described in [28]. Although they were carefully labelled by multiple annotators, they were difficult to annotate, as indicated by their moderate annotator agreement score of 0.62 according to the well-known kappa score [29]. The data correspond to 2512 trading negotiation events (also referred to as "training instances"), denoted as $D_m = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, where $x_i$ are vectors of features and $y_i$ are class labels (i.e. givable resources). Our data set reports an average of 44.8 turns per player.

An example trading negotiation in the game of Settlers of Catan in natural language is "I'll give anyone sheep for clay", which can be represented as follows, including the agent's available resources:

Givable(Sheep, all) Receivable(Clay, all) Resources(clay = 0, ore = 0, sheep = 4, wheat = 1, wood = 0) Buildups(roads = 2, settlements = 0, cities = 0).

From this illustrative example, $y_i$ = sheep and $x_i = \{0, 0, 4, 1, 0, 2, 0, 0, 1, 0, 0, 0, 0\}$ based on features 1-14 in Table I. Although this representation may look simple at first sight, it supports about 2.6 billion possible (and unique) negotiation events. Even though the class label is only the givable, we use receivables as features in ranking all the possible offers, so all offers including one givable and multiple receivables are in fact ranked (see footnote 2). Notice that not all of them are valid or legal at every point in time in the game.

Choosing the most human-like (in our case) trading negotiation can be seen as a ranking task, where we focus on computing a score representing the importance of each available trading negotiation (similar to the one above) in order to make the best choice, i.e. the most human-like.
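The encoding of the example above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the names `encode_offer` and `representation_size` are hypothetical, and clipping counts to the range 0..7 is an assumption based on the feature domains in Table I. The size computation multiplies the domain sizes of the eight count features, the five binary receivable features and the five givable classes, which gives roughly 2.68 billion and so appears consistent with the figure quoted in the text.

```python
RESOURCES = ("clay", "ore", "sheep", "wheat", "wood")
BUILDUPS = ("roads", "settlements", "cities")
MAX_COUNT = 7  # assumed upper bound of the {0...7} domains in Table I


def encode_offer(resources, buildups, receivables):
    """Return the observed features f1-f13 as a flat list of integers.

    resources:   dict resource -> number of cards held (f1-f5)
    buildups:    dict piece -> number built so far (f6-f8)
    receivables: set of resources requested from the opponent (f9-f13)
    """
    has = [min(resources.get(r, 0), MAX_COUNT) for r in RESOURCES]
    built = [min(buildups.get(b, 0), MAX_COUNT) for b in BUILDUPS]
    rec = [1 if r in receivables else 0 for r in RESOURCES]
    return has + built + rec


def representation_size():
    """Number of distinct (features, givable) combinations the encoding allows."""
    count_values = MAX_COUNT + 1             # 8 possible values per count feature
    n_count_features = len(RESOURCES) + len(BUILDUPS)
    return count_values ** n_count_features * 2 ** len(RESOURCES) * len(RESOURCES)


# "I'll give anyone sheep for clay": the class label is the givable, sheep.
x = encode_offer({"sheep": 4, "wheat": 1}, {"roads": 2}, {"clay"})
y = "sheep"
```

Keeping the givable out of the feature vector mirrors the text: the receivables act as observed evidence, and the givable is what the classifiers score.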
In this way, the quality of our learning agents will depend on the quality of the examples provided. To rank such trading negotiation alternatives, we train a set of statistical classifiers based on the feature set described in Table I. Our set of features includes the resources available (features $f_1$-$f_5$), the build-ups (features $f_6$-$f_8$) with a default minimum of 0 and maximum value of 7, the receivable resources in binary form to reduce data sparsity (features $f_9$-$f_{13}$), and the givable resource considered as the class prediction (feature $f_{14}$).

Table I. Feature set for learning trading negotiations from examples.

ID     Feature          Domain      Description
f1     hasclay          {0...7}     Num. clay units available
f2     hasore           {0...7}     Num. ore units available
f3     hassheep         {0...7}     Num. sheep units available
f4     haswheat         {0...7}     Num. wheat units available
f5     haswood          {0...7}     Num. wood units available
f6     hasroads         {0...7}     Num. roads built so far
f7     hassettlements   {0...7}     Num. settlements built so far
f8     hascities        {0...7}     Num. cities built so far
f9     recclay          Binary      Clay offered by opponent?
f10    recore           Binary      Ore offered by opponent?
f11    recsheep         Binary      Sheep offered by opponent?
f12    recwheat         Binary      Wheat offered by opponent?
f13    recwood          Binary      Wood offered by opponent?
f14    givable          Resource    Clay/Ore/Sheep/Wheat/Wood

Footnote 2: The feature set listed in Table I was chosen because it yielded the best performance in previous experiments from a pool of feature sets from both manual feature selection and automatic feature selection. Other feature sets that we explored include smaller domains (only binary features), larger domains (non-binary features), smaller and larger sets of features, and multiple givables rather than a single one, among others.

An example subdialogue between players is shown in Table II. The first column shows the player IDs, where the fourth player was silent. Each game had four players in total.
The second column shows the messages typed and shown in the top-middle dialogue box of Figure 1. The third column shows the semantics of textual messages. The last column shows the context of the trading negotiations, represented by features $f_1$-$f_8$ described in Table I. This sort of subdialogue occurs in the game and results in players accepting or rejecting trading offers from other players in turn.

IV. TRAINING APPROACHES
In this paper, we treat trading in strategic conversation as a classification task, where we train statistical classifiers either with manually labelled data (typical approach) or with automatically labelled data (our proposed approach).

A. Training with Manually Labelled Data
To train statistical agents in a supervised manner, we first use only one data set of manually labelled trading examples $D_m = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, where $x_i$ are vectors of features and $y_i$ are class labels. Each pair or tuple represents an instance used for training or testing by the learning methods described in Section V. See Figure 2(a) for an illustration.

B. Training with Automatically Labelled Data
We extend the previous approach by automatically re-labelling data set $D_m$ into $D_a = \{(x_1, y^p_1), \ldots, (x_N, y^p_N)\}$, where the $y^p_j$ represent our predicted labels using the automatic labeller described below. This approach is motivated by the fact that it can generate potentially more useful data than its original source. We then use $D_a$ to train the statistical classifiers described in Section V. See Figure 2(b) for an illustration. Our classifier for automatic labelling used as features the most common words in text-based trading messages

from human players (see footnote 3), and the class labels were Givable and Receivable. This binary classifier used a Random Forest with 100 decision trees; see Section V-C for more details. Specifically, the word-level features included the most common words at the left of a resource in focus, and the most common words in the right-hand context of the same resource in focus. In this way, the sentence "I give you sheep for clay" would be labelled as Givable(sheep) and Receivable(clay). From this illustrative example, the words "give" and "you" at the left of the resource "sheep" would be potentially relevant features for the class label Givable. Similarly, the word "for" at the left of the resource "clay" would be a potentially relevant feature for the label Receivable. In other words, our automatic labeller generated the semantics from raw text as illustrated in columns 2 and 3 in Table II.

Table II. Example trading negotiations from human players in the game of Settlers of Catan.

Player  Message                               Semantics                          Context (f1...f8)
A       Anyone wants to trade wood for clay   Givable(wood) Receivable(clay)     0,0,0,2,3,4,2,0
A                                                                                ,0,0,2,3,4,2,0
B       No-one wants wheat for clay?          Givable(wheat) Receivable(clay)    0,0,0,1,1,4,2,0
A       Wheat for clay?                       Givable(wheat) Receivable(clay)    0,0,1,2,2,4,2,0
C       Sheep for clay?                       Givable(sheep) Receivable(clay)    0,0,4,5,1,2,2,0
A       I got 1 sheep                         Give(sheep)                        0,0,1,2,2,4,2,0

Figure 2. Illustration of training approaches using manually labelled data $D_m$ and automatically labelled data $D_a$. The latter is created from an automatic labeller, trained from the original source $D_m$, that re-labels the data (see Section IV-B for further details).

We note that while manual labels referred to context beyond the sentence in a turn (e.g. one or more tradings before the one in focus), our automatic labeller only referred to the local context of the sentence in focus.
We also note that our automatic labels were agnostic about the players in focus, i.e. our automatic labeller did not take into account the sender and recipient players. Furthermore, we note that while the manual labels used 7 dialogue act types, our automatic labels focused on 3 dialogue act types (see Figures 3, 4, and 5). The smaller set of dialogue act types was used to reduce the complexity in the annotations.

Footnote 3: The common words in text-based trading messages are defined as those that appear more than the average number of words and symbols (e.g. dots, question mark) in the training data.

Figure 3. High-level proportion of dialogue act types in manual and automatic labels.
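The word-level features used by the automatic labeller can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the tokeniser, the context window of two tokens, and the function names `tokenize`, `common_words` and `context_features` are all hypothetical, and the "above-average count" rule is one possible reading of the definition of common words given in the footnote. The Random Forest that maps these features to Givable or Receivable is not shown.

```python
from collections import Counter

RESOURCES = {"clay", "ore", "sheep", "wheat", "wood"}


def tokenize(message):
    """Lowercase a message and split it, keeping punctuation as separate symbols."""
    for symbol in "?.!,":
        message = message.replace(symbol, f" {symbol} ")
    return message.lower().split()


def common_words(messages):
    """Words and symbols whose count exceeds the average count (assumed reading)."""
    counts = Counter(tok for m in messages for tok in tokenize(m))
    if not counts:
        return set()
    average = sum(counts.values()) / len(counts)
    return {word for word, count in counts.items() if count > average}


def context_features(message, window=2, vocabulary=None):
    """Collect left and right context words for every resource mentioned.

    window:     number of tokens inspected on each side (an assumption)
    vocabulary: optional set of common words; other tokens are dropped
    """
    tokens = tokenize(message)
    features = []
    for i, token in enumerate(tokens):
        if token not in RESOURCES:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        if vocabulary is not None:
            left = [w for w in left if w in vocabulary]
            right = [w for w in right if w in vocabulary]
        features.append({"resource": token, "left": left, "right": right})
    return features


# One feature record per resource mention; a binary classifier would then
# label each record as Givable or Receivable.
records = context_features("I give you sheep for clay")
```

For the sentence used in the text, the record for "sheep" has "give" and "you" in its left context, and the record for "clay" has "for" immediately to its left, which matches the cues described above.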

Figure 4. Detailed proportion of dialogue act types in manual labels.

Figure 5. Detailed proportion of dialogue act types in automatic labels.

V. STATISTICAL TRADING AGENTS
We compare the performance of the following statistical classifiers with the aim of finding the best predictor of human-like trading negotiations: (i) Bayesian Networks, (ii) Conditional Random Fields, and (iii) Random Forests.

A. Learning to Trade with Bayesian Nets
Our Bayesian agent is defined by

$P(x) = \prod_{i=1}^{n} P(x_i \mid pa(x_i))$,

where $x = \{x_1, \ldots, x_n\}$ is a set of random variables describing the context of the game, $pa(\cdot)$ denotes the set of parent random variables, and every variable is associated with a conditional probability distribution $P(x_i \mid pa(x_i))$. Two main tasks are involved in the creation of our Bayes net. First, parameter learning involves the estimation of conditional probability distributions (discrete in our case) from D based on maximum likelihood estimation with smoothing. Second, structure learning involves inducing the dependencies of random variables based on the K2 algorithm; see [30] for details. Once the Bayes net has been trained, we use the junction tree algorithm [31] for probabilistic inference of trades. The most probable human-like trade is selected according to

$y^* = \arg\max_{y \in Y} P(y \mid e(y))$,

where the contextual information of givable $y$ is defined by $e(y) = \{f_1 = val_1, \ldots, f_n = val_n\}$ with features $f_i$.

B. Learning to Trade with CRFs
This agent treats trading as a sequence labelling task, in which a sequence of game environment inputs is labelled with appropriate givable resources to support trades. The task is therefore to find a mapping between (observed) features, including available resources, build-ups, and receivables, and a (hidden) sequence of givables. We use the linear-chain Conditional Random Field (CRF) model for predicting human-like trades in the game of Settlers of Catan.
This model defines the posterior probability distribution of labels (givables in our case) $y = \{y_1, \ldots, y_{|y|}\}$ given features $x = \{x_1, \ldots, x_{|x|}\}$, as

$P(y \mid x) = \frac{1}{Z(x)} \prod_{t=1}^{T} \exp\left\{ \sum_{k=1}^{K} \theta_k \Phi_k(y_t, y_{t-1}, x_t) \right\}$,

where $Z(x)$ is a normalisation factor over all available vectors of contextual information $x$ such that the sum of all labellings is one. The parameters $\theta_k$ are weights associated with feature functions $\Phi_k(\cdot)$, which are real values describing the label state $y$ at time $t$ based on the previous label state $y_{t-1}$ and features $x_t$. The parameters $\theta_k$ are set to maximise the conditional likelihood of sequences of givables in the training data set. They are estimated using the gradient descent algorithm. After training, labels can be predicted for new sequences of observations. The most likely trading offer is expressed as

$y^* = \arg\max_{y} \Pr(y \mid x)$,

which is computed using the Viterbi and A* search algorithms; see [32] for further details.

C. Learning to Trade with Random Forests
This agent is trained using an ensemble of trees, which are used to vote for the class prediction at test time [33], [34]. A random forest is an ensemble learning method that constructs a set of random decision trees at training time, and uses them to generate the most popular class. We compute the probability distribution of a human-like trade as

$P(givable \mid evidence) = \frac{1}{Z} \prod_{b \in B} P_b(givable \mid evidence)$,

where $givable$ refers to the class prediction, $evidence$ refers to observed features 1-13, $P_b(\cdot \mid \cdot)$ is the posterior distribution of the $b$th tree, and $Z$ is a normalisation constant; see [35] for further details. In our experiments below, we fixed the number of decision trees to 100. Assuming that $Y$ is a set of givables at a particular point in time in the game, extracting the most human-like trading offer (givable $y^*$) given the collected evidence (context of the game) is defined as

$y^* = \arg\max_{y \in Y} \Pr(y \mid evidence)$.

VI. EXPERIMENTS AND RESULTS
Our evaluation metrics for assessing the predictive power of human-like trading include classification accuracy and precision-recall. These metrics are part of our offline evaluation, which reports performance on held-out data.
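To make the link between the scoring rule of Section V-C and the offline metrics concrete, the following Python sketch combines per-tree posteriors, picks the best legal givable, and computes accuracy, precision and recall on held-out labels. It is a sketch under assumptions rather than the evaluation code used in the experiments: the function names are hypothetical, the per-tree posteriors are taken as given dictionaries, and combining them by a normalised product is an assumption (averaging the tree posteriors is the other common choice).

```python
from math import prod

RESOURCES = ("clay", "ore", "sheep", "wheat", "wood")


def combine_trees(tree_posteriors):
    """Combine per-tree distributions over givables by a normalised product."""
    scores = {r: prod(p[r] for p in tree_posteriors) for r in RESOURCES}
    z = sum(scores.values())  # normalisation constant Z
    if z == 0.0:
        # Every givable was ruled out by at least one tree: fall back to uniform.
        return {r: 1.0 / len(RESOURCES) for r in RESOURCES}
    return {r: s / z for r, s in scores.items()}


def best_givable(posterior, legal_givables):
    """Pick the highest-scoring givable among those that are currently legal."""
    return max(legal_givables, key=lambda r: posterior[r])


def accuracy(gold, predicted):
    """Fraction of held-out instances whose predicted givable matches the label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def precision_recall(gold, predicted, label):
    """Per-class precision and recall; 0.0 is returned when a denominator is 0."""
    tp = sum(g == label and p == label for g, p in zip(gold, predicted))
    n_predicted = sum(p == label for p in predicted)
    n_gold = sum(g == label for g in gold)
    precision = tp / n_predicted if n_predicted else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return precision, recall


# Two toy trees standing in for the 100 trees of the forest.
trees = [
    {"clay": 0.1, "ore": 0.1, "sheep": 0.6, "wheat": 0.1, "wood": 0.1},
    {"clay": 0.2, "ore": 0.1, "sheep": 0.5, "wheat": 0.1, "wood": 0.1},
]
posterior = combine_trees(trees)
choice = best_givable(posterior, ["sheep", "wheat"])
```

Restricting the arg max to the legal givables reflects the remark in Section III that not every negotiation is valid at every point in the game.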

In addition, to assess the performance of our statistical classifiers while playing the game, we consider the following game-related metrics (in terms of averages): winning rate, victory points, offers made, successful offers, and pieces built. These metrics are part of our online evaluation, and are used to assess performance while playing the game of Settlers of Catan using a benchmark framework. Each of the classifiers (Bayes Net, Conditional Random Field, Random Forest) below was trained and evaluated in the same way. The only difference between models was the data source, i.e. manual labels or automatic labels.

A. Offline Evaluation
Table III shows the classification results of our statistical classifiers using the features listed in Table I, trained as described in Sections III, IV and V. Our evaluation used 10-fold cross-validation, i.e. average results over 10 rounds of 9 folds for training and 1 fold for validation. These folds mean that our automatic labeller was trained on 90% of the manually labelled data while the remaining 10% was used for validation. The classification accuracy of the automatic labeller was 80.23% according to the cross-validation. For the evaluation in the next section, we choose the automatic labeller with the highest classification accuracy.

Our observations from Table III can be described as follows. First, all our statistical classifiers substantially outperform a majority baseline. Second, predicting human trading negotiations is a difficult task: our best classifier, the Random Forest, achieves a classification accuracy of 65.7% when training on manual labels, and 84.8% when training on automatic labels. Third, all classifiers trained on automatic labels perform better than their counterparts trained on manual labels. In other words, automatic labels help to predict human trading behaviour better than manual labels.
This result suggests that automatic labels are useful for data sets that are difficult to annotate, like ours, which reported moderate annotator agreement in the manually labelled data [28]. Although this conclusion requires confirmation in other data sets, the next section reports an additional evaluation to confirm the good performance of the trained classifiers.

B. Online Evaluation
We also evaluated the statistical classifiers described in Sections III, IV and V by integrating them into the JSettlers benchmark framework [8] illustrated in Figure 1, where we use random and rule-based baseline negotiators (see footnote 4) as the opponents. It has to be noted that our evaluations here played strategic games at the semantic level, i.e. using dialogue acts such as those shown in column 3 of Table II. In addition, our trained agents were active only during the ranking of trading offers; the functionality of the rest of the game was based on the JSettlers framework [8]. We refer to this evaluation as online because the agents were used in the actual game to rank realistic trading negotiations. This means that all games were run using four automated agents: one statistical vs. three rule-based. We evaluate each classifier with 10,000 games in order to obtain significant comparisons given the randomness exhibited in the game. Such a number of games has been shown to produce meaningful comparisons [36].

Footnote 4: The baseline trading agent referred to as rule-based included the following parameters in all agents, see [36] for further details: TRY N BEST BUILD PLANS:0, FAVOUR DEV CARDS:-5.

Table IV shows the results of our online evaluation, which we describe as follows. First, note that random behaviour is substantially worse than rule-based behaviour, and that more (successful) offers do not contribute to more winning. Second, it can be noted that the rule-based agents obtain a winning rate of 25% because four players of the same kind play against each other.
Third, it can also be noted that only some of our agents using the trained classifiers outperform the rule-based agents, resulting in more winning, more victory points, and more pieces built, but not necessarily more offers. Fourth, taking into account the classification results in the previous section, it can be inferred that higher classification accuracy from average human players does not imply better winning rates in the case of Conditional Random Fields and Bayes nets; it does so only in the Random Forest case. Similar effects but from expert human traders remain to be investigated. Fifth, we can observe that the best results are obtained by the Random Forest trained on automatic labels $D_a$. It won 1% more games than the Random Forest trained on manual labels $D_m$. This difference was significant at p < 0.05 according to a two-tailed Wilcoxon signed-rank test. This result suggests that automatic labels are more useful than manual labels for training trading negotiators, at least in the case of manual labels with moderate annotator agreement. Manual labels with higher and even lower annotation agreements remain to be investigated.

VII. CONCLUSIONS AND FUTURE DIRECTIONS
The contribution of this paper is a learning approach for trading in strategic conversation, including an evaluation of statistical trading agents trained on manually and automatically labelled data. We have trained three statistical agents from manually and automatically labelled data, and then applied statistical inference for computing probabilistic scores for each trading negotiation. The obtained scores were used to rank the available trading negotiations, where the top choice (i.e. the most human-like) was used in the game. In an offline evaluation, the statistical agents showed that the

best classification result was obtained by a random forest classifier using automatic labels. In an online evaluation, the best agent (random forest) using automatic labels achieved a winning rate that was 1% better than its counterpart using manual labels with moderate annotator agreement. This result suggests that statistical classifiers should consider training from automatically labelled data, especially if initially labelled data does not report high inter-annotator agreement. This result is encouraging for training statistical agents from human examples that are difficult to annotate, in order to incorporate trainable behaviour in strategic conversational agents.

Table III. Offline evaluation: classification accuracy and precision-recall results of human trading negotiations in Settlers of Catan. Notation: man = training on manually labelled data $D_m$, and auto = training on automatically labelled data $D_a$.
Columns: Classifier; Accuracy; Precision; Recall; F-Measure.
Rows: Majority Baseline; Conditional Random Field (man); Bayesian Network (man); Random Forest (man); Conditional Random Field (auto); Bayesian Network (auto); Random Forest (auto).

Table IV. Online evaluation: game results comparing a statistical classifier vs. three rule-based traders, i.e. four players in total in each game. Each line shows average results over 10,000 test games. Notation: man = training on manually labelled data $D_m$, and auto = training on automatically labelled data $D_a$.
Columns: Comparison Between Trained Statistical Trader vs Opponent; Winning Rate (%); Victory Points; Offers Made; Successful Offers; Pieces Built.
Rows: Random (from legal offers) vs Rule-based; Rule-based vs Rule-based; Conditional Random Field (man) vs Rule-based; Bayesian Network (man) vs Rule-based; Random Forest (man) vs Rule-based; Conditional Random Field (auto) vs Rule-based; Bayesian Network (auto) vs Rule-based; Random Forest (auto) vs Rule-based.
Future research avenues include: training trading agents that take into account richer contextual information such as features from other players, and training them to play multiple games; training with other forms of machine learning, as commented in Section II; training agents not just from average players but from expert human traders in multiple domains; and evaluating trained agents against human players.

ACKNOWLEDGMENTS
Funding from the European Research Council (ERC) project STAC: Strategic Conversation is gratefully acknowledged. We would also like to thank the following members of the STAC project for helpful discussions: Markus Guhe, Eric Kow, Mihai Dobre, Ioannis Efstathiou, Wenshuo Tang, Verena Rieser, Alex Lascarides, and Nicholas Asher.

REFERENCES
[1] N. Asher and A. Lascarides, Strategic conversation, Semantics and Pragmatics, vol. 6, no. 2, pp. 1-62, August
[2] M. McFarlin, 10 great board games for traders, Futures Magazine, Oct. 2013, great-board-games-for-traders. [Online]. Available: 10-great-board-games-for-traders
[3] I. Szita, G. Chaslot, and P. Spronck, Monte-Carlo Tree Search in Settlers of Catan, in Proceedings of the 12th International Conference on Advances in Computer Games, ser. ACG '09. Berlin, Heidelberg: Springer-Verlag, 2010, pp.
[4] M. Pfeiffer, Reinforcement learning of strategies for Settlers of Catan, in International Conference on Computer Games: Artificial Intelligence, Design and Education.
[5] K. Georgila and D. Traum, Reinforcement learning of argumentation dialogue policies in negotiation, in Proc. of INTERSPEECH.
[6] I. Efstathiou and O. Lemon, Learning non-cooperative dialogue behaviours, in SIGDIAL.
[7] M. S. Dobre and A. Lascarides, Online learning and mining human play in complex games, in IEEE Conference on Computational Intelligence and Games, CIG.
[8] R. Thomas and K. J. Hammond, Java settlers: a research environment for studying multi-agent negotiation, in Intelligent User Interfaces (IUI), 2002, pp.

8 [9] H. Cuayáhuitl, S. Keizer, and O. Lemon, Learning to trade in strategic board games, in IJCAI Workshop on Computer Games (IJCAI-CGW), [10] G. Tesauro, Temporal difference learning and TD-gammon, Commun. ACM, vol. 38, no. 3, pp , [11] I. Efstathiou and O. Lemon, Learning to manage risk in noncooperative dialogues, in Proc. SEMDIAL, [12] S. Keizer, H. Cuayáhuitl, and O. Lemon, Learning Trade Negotiation Policies in Strategic Conversation, in Workshop on the Semantics and Pragmatics of Dialogue (godial), [13] D. Marcu and A. Echihabi, An unsupervised approach to recognizing discourse relations, in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), 2002, pp [14] C. Sporleder and A. Lascarides, Using automatically labelled examples to classify rhetorical relations: an assessment, Natural Language Engineering, vol. 14, no. 3, pp , [15] J. Fürnkranz, Machine learning in games: A survey, in Machines that Learn to Play Games, Chapter 2. Nova Science Publishers, 2000, pp [16] T. P. Runarsson and S. M. Lucas, Preference learning for move prediction and evaluation function approximation in othello, IEEE Trans. Comput. Intellig. and AI in Games, vol. 6, no. 3, pp , [17] C. J. Maddison, A. Huang, I. Sutskever, and D. Silver, Move Evaluation in Go Using Deep Convolutional Neural Networks, CoRR, vol. abs/ , [18] N. Asher and A. Lascarides, Commitments, beliefs and intentions in dialogue, in Proc. of SemDial, 2008, pp [19] J. Shim and R. Arkin, A Taxonomy of Robot Deception and its Benefits in HRI, in Proc. IEEE Systems, Man, and Cybernetics Conference, [20] D. Traum, Extended abstract: Computational models of non-cooperative dialogue, in Proc. of SIGdial Workshop on Discourse and Dialogue, [21] H. Cuayáhuitl, M. van Otterlo, N. Dethlefs, and L. 
Frommberger, Machine learning for interactive systems and robots: A brief introduction, in Proceedings of the 2 nd Workshop on Machine Learning for Interactive Systems: Bridging the Gap Between Perception, Action and Communication, ser. MLIS 13. New York, NY, USA: ACM, 2013, pp [22] O. Pietquin and M. Lopez, Machine learning for interactive systems: Challenges and future trends, in Proceedings of the Workshop Affect, Compagnon Artificiel (WACAI), [23] A. Cadilhac, N. Asher, F. Benamara, and A. Lascarides, Grounding strategic conversation: Using negotiation dialogues to predict trades in a win-lose game, in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing EMNLP, 2013, pp [24] H. Cuayáhuitl, N. Dethlefs, H. W. Hastie, and O. Lemon, Barge-in effects in bayesian dialogue act recognition and simulation, in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, December 8-12, 2013, 2013, pp [25] O. Lemon, Adaptive natural language generation in dialogue using Reinforcement Learning, in Proc. of the 12th SEMdial Workshop on on the Semantics and Pragmatics of Dialogues, London, UK, June [26] N. Dethlefs and H. Cuayáhuitl, Hierarchical reinforcement learning for situated natural language generation, Natural Language Engineering, vol. 21, [27] N. Dethlefs, H. W. Hastie, H. Cuayáhuitl, and O. Lemon, Conditional random fields for responsive surface realisation using global features, in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, 4-9 August 2013, Sofia, Bulgaria, Volume 1: Long Papers, 2013, pp [28] S. Afantenos, N. Asher, F. Benamara, A. Cadilhac, C. Dégremont, P. Denis, M. Guhe, S. Keizer, A. Lascarides, O. Lemon, P. Muller, S. Paul, V. Rieser, and L. Vieu, Developing a corpus of strategic conversation in The Settlers of Catan, in Workshop on the Semantics and Pragmatics of Dialogue (SeineDial), Paris, France, 2012, hal [29] J. 
Carletta, Assessing Agreement on Classification Tasks: The Kappa Statistic, Computational Linguistics, vol. 22, no. 2, pp , [30] G. Cooper and E. Herskovits, A Bayesian method for the induction of probabilistic networks from data, Machine Learning, vol. 9, no. 4, pp , [31] F. G. Cozman, Generalizing variable elimination in bayesian networks, in In Workshop on Probabilistic Reasoning in Artificial Intelligence, 2000, pp [32] T. Kudo, CRF++: Yet another crf toolkit, Software available at crfpp.sourceforge.net, [33] L. Breiman, Random forests, Machine Learning, vol. 45, no. 1, pp. 5 32, [34] T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning: data mining, inference and prediction, 2nd ed. Springer, [35] A. Criminisi, J. Shotton, and E. Konukoglu, Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning, Foundations and Trends in Computer Graphics and Vision, vol. 7, no. 2-3, pp , [36] M. Guhe and A. Lascarides, Game strategies for The Settlers of Catan, in 2014 IEEE Conference on Computational Intelligence and Games, CIG 2014, Dortmund, Germany, August 26-29, 2014, 2014, pp. 1 8.


Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

More information

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-19-B &

More information

ADVANCED MACHINE LEARNING. Introduction

ADVANCED MACHINE LEARNING. Introduction 1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures

More information

Sentiment analysis for news articles

Sentiment analysis for news articles Prashant Raina Sentiment analysis for news articles Wide range of applications in business and public policy Especially relevant given the popularity of online media Previous work Machine learning based

More information

Less naive Bayes spam detection

Less naive Bayes spam detection Less naive Bayes spam detection Hongming Yang Eindhoven University of Technology Dept. EE, Rm PT 3.27, P.O.Box 53, 5600MB Eindhoven The Netherlands. E-mail:h.m.yang@tue.nl also CoSiNe Connectivity Systems

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

CENG 734 Advanced Topics in Bioinformatics

CENG 734 Advanced Topics in Bioinformatics CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the

More information

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them Vangelis Karkaletsis and Constantine D. Spyropoulos NCSR Demokritos, Institute of Informatics & Telecommunications,

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee

UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee 1. Introduction There are two main approaches for companies to promote their products / services: through mass

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information

Tensor Factorization for Multi-Relational Learning

Tensor Factorization for Multi-Relational Learning Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany nickel@dbs.ifi.lmu.de 2 Siemens AG, Corporate

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

More information

Implementation of hybrid software architecture for Artificial Intelligence System

Implementation of hybrid software architecture for Artificial Intelligence System IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.1, January 2007 35 Implementation of hybrid software architecture for Artificial Intelligence System B.Vinayagasundaram and

More information

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Introduction http://stevenhoi.org/ Finance Recommender Systems Cyber Security Machine Learning Visual

More information