Trading Strategies and the CAT Tournament Protocol
MACHINE LEARNING PROJECT FINAL REPORT, FALL 2007

CLASSIFICATION OF TRADING STRATEGIES IN ADAPTIVE MARKETS

MARK GRUMAN and MANJUNATH NARAYANA
Abstract

In the CAT Tournament, specialist agents facilitate transactions between buyers and sellers with the intention of maximizing profit from commission and other fees. The agent must find a well-balanced strategy that attracts new buyers and sellers to its market while retaining the traders already subscribed to it. One approach is to classify the traders that interact with the agent according to the trading strategies they use. Although the set of trading strategies is very small (each trader is assigned one of four previously defined strategies), the data available to the agent during the competition does not explicitly associate strategies with traders. As a result, agents must gather data in which the relationship between traders and strategies is known and then construct a probabilistic model from the acquired data. Different strategies yielded data of varying frequency and quantity, making its raw form unusable for the k-means clustering technique. Only after discarding most of the data (thus compromising its integrity) was it possible to pass it to the k-means algorithm, which performed very poorly. A support vector machine (SVM) was also evaluated and yielded varying results. While the linear and sigmoid kernels performed only nominally better than completely random classification (32.8% and 28.2%, respectively), the SVM employing a radial kernel was able to predict strategies with nearly 60% accuracy. Classification using a Hidden Markov Model was even more successful, predicting strategies correctly 62.28% of the time. It was observed that a moderate number of states and mixture components yielded the best results when classifying with the HMM.
Additionally, two of the four strategies were observed to be more easily predictable using both the SVM and HMM classification techniques, suggesting that better classification could be achieved if the raw data from the other two strategies could be separated into two disjoint sets before performing classification.

1. Introduction

The field of catallactics, or the study of exchanges, has received significant attention in the Artificial Intelligence community over the past few years, in large part due to the progress that has been made in Machine Learning. In particular, significant attention has been given to designing efficient markets in which agents of numerous roles and preferences interact and exchange goods while utilizing various strategies and trading tactics. The CAT Tournament [1], an offshoot of the original Trading Agent Competition, is such a contest, in which agent brokers (hereafter referred to as specialists) attempt to lure buyers and sellers (hereafter referred to as traders) to their respective markets in hopes of maximizing profit. The variety of trading techniques employed by the traders, who also wish to maximize individual profit, serves as the main motivation for developing adaptive markets that actively respond to the ever-changing preferences of the buyers and sellers. In addition to enticing traders to use the specialist's market, the specialist must also make the adjustments to its strategy needed to retain the traders that have already decided to trade with it. Inevitably, trade-offs must be made between maintaining a similar strategy, which may limit the amount of profit that could be made, and altering the strategy to maximize profit at the risk of losing existing traders. Therefore, in order to make optimal decisions, the specialist must make inferences about the expected behavior of traders given any action it may take.

To do this, the specialist must analyze data it has collected from previous trading sessions to model the strategies employed by the traders. To simplify this task, the CAT protocol specifies that traders are most likely (and possibly required) to use one of four previously defined trading strategies, turning the learning problem (at least in part) into a clustering problem. Naturally, the relevance of such clustering also comes into question: one must decide whether the ability to classify all traders by the previously defined strategies is useful in maximizing the overall profit of the specialist agent. Intuitively, successful classification should allow the specialist to make better inferences about the overall effect of a potential change in trading strategy, although the degree to which these inferences improve can only be determined through research.
2. Strategies

The traders in the CAT Tournament are designed to choose from one of four strategies, which are explained in detail in [2]-[5]. An agent using the Gjerstad-Dickhaut (GD) double-auction strategy [3] keeps track of the number of bids accepted and rejected by the market at a particular price; subsequent bid values are chosen according to the probability that a bid will be accepted, given that history. A trader using the Roth-Erev (RE) strategy [5], rooted in learning in extensive-form games, alters its future bid values based on the profits that were observed for its previous bids. The Zero Intelligence-Constrained (ZI-C) strategy [4] involves generating random bids constrained between a maximum and a minimum value: a buyer using ZI-C will never bid more than what it believes a good is worth, and likewise a seller will never sell a good for less than the amount it cost the seller to obtain it. Finally, a trader using the Zero Intelligence Plus (ZIP) strategy [2] employs the same trading technique as a ZI-C agent, but it also updates its constraints based on feedback from the market. Thus, each process (except under the ZI-C strategy) receives feedback from the market in various forms and updates itself to generate new bids.

3. Data

Each trader makes a bid to the market and continues to update it until another trader in the same market accepts the bid price and a transaction takes place. We call each string of updated bids from the same trader a bid sequence. Given the competitive nature of the market, with several traders attempting to complete a transaction, a buyer's bid could be accepted by a seller at any point during the bidding process. This means that the number of bids a trader makes before successfully concluding a transaction is not constant; as a result, the number of bids in each bid sequence can vary significantly, and there is no upper bound on the length of a sequence.
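As a rough illustration of the two zero-intelligence rules above, the sketch below generates a constrained random bid (ZI-C) and a simplified ZIP margin update. The price bounds, learning rate, and function names are illustrative assumptions, not part of the CAT specification.

```python
import random

def zic_bid(limit_price, is_buyer, price_floor=0.0, price_cap=200.0, rng=random):
    # ZI-C: a uniformly random bid, constrained so the trader never trades at
    # a loss: buyers stay at or below their limit, sellers at or above cost.
    if is_buyer:
        return rng.uniform(price_floor, limit_price)
    return rng.uniform(limit_price, price_cap)

def zip_margin_update(limit_price, margin, target_price, beta=0.1):
    # ZIP (simplified form of the rule): nudge the shout price toward a target
    # price derived from market feedback, then return the updated profit margin.
    shout = limit_price * (1.0 + margin)
    new_shout = shout + beta * (target_price - shout)
    return new_shout / limit_price - 1.0
```

For example, a seller whose cost for a good is 30 will never shout an ask below 30, matching the constraint described above.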
Figure 1, which illustrates bid sequences for a single selling trader, shows that the number of bids in a sequence can range from one to an arbitrarily large number, depending on the state of the market and the strategies of the other traders in that market. The problem is further complicated when we consider multiple traders using different strategies. Figure 2 illustrates how the bid sequences might look in an actual competition. The data clearly poses many challenges, the most significant being the unpredictable lengths of the bid sequences.

Figure 1. Illustration of bid sequences from one seller.

Figure 2. Illustration of bid sequences from a sample market with multiple traders employing different bidding strategies.

The observed data is further distorted by the processes that generate new bids. As explained in Section 2, the underlying parameters of each bidding strategy are updated using market feedback. Each trader receives different feedback from the market depending on the values of its previous bid and its underlying parameters. Thus, the parameters of traders using even the same strategy are updated differently, and consequently we see differing bid prices from traders using the same strategy.

4. Data Collection

The traders' bidding data was collected through numerous simulations of a typical CAT competition. Although there were no problems collecting a sufficient amount of data (one could always run more simulations), the raw data could not immediately be used for classification purposes, for a number of reasons. The most significant obstacle was dealing with anonymized data; by "anonymized" we mean that the true origin of each bid was masked.
This occurred per the specification of the CAT Competition Protocol, in which bids shouted by traders first reached the server, which replaced the identity of the bid's source with a unique bid identifier used only for that particular sequence of bids. Once a transaction completed, the bid identifier was discarded and a new identifier was assigned to the next sequence of bids from the same trader. As a result, all bids were masked by the time they reached the specialist. Thus, the specialist could determine neither the true identity of each bid nor which bid sequences originated from the same trader. The random order in which the bids arrived also failed to help the specialist determine the true origin of each bid. Fortunately, the CAT source code (freely available to all CAT Competition participants) gives users access to all of the functional modules that make up the competition. With these additional resources, and a number of code modifications, the authors were able to obtain the required data in its unmasked form, allowing them to identify the true source of each bid that the specialist received. This data was collected under the assumption that it would be used for training only, since the identity of each bid would not be available to the specialist during an actual competition. Bid sequences were collected for 40 traders (10 traders of each strategy), with their identities unmasked. The authors decided that focusing on the selling traders' sequences alone would suffice to evaluate the efficacy of the classification strategy. In all, 2076 bid sequences were generated by the system. Two-thirds (1384) of these samples were randomly chosen as the training set, and one-third (692) was used for testing.

5. Classification

A number of classification techniques were employed to determine which technique yielded the best classification results.
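The two-thirds/one-third random split described above can be sketched as follows; the function name and seed are illustrative, and any labeled collection of variable-length bid sequences could be passed in.

```python
import random

def split_sequences(sequences, labels, train_frac=2/3, seed=0):
    # Shuffle the indices with a fixed seed, then cut the labeled bid
    # sequences into training and test sets in the given proportion.
    idx = list(range(len(sequences)))
    random.Random(seed).shuffle(idx)
    cut = round(len(idx) * train_frac)
    train = [(sequences[i], labels[i]) for i in idx[:cut]]
    test = [(sequences[i], labels[i]) for i in idx[cut:]]
    return train, test
```

With 2076 sequences this yields a 1384/692 split, the proportions used in the report.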
We discuss in detail the techniques that showed the most promising results and touch upon other techniques that were evaluated in a limited capacity.

I. K-means classification: The first classification method used the k-means classifier in MATLAB. To use this technique, the collected data had to be adjusted so that each data entry contained the same number of data points. As a result, a decision was made to include only the first bid of each recorded bid sequence. This decision rested on the observation that some (in fact, a large majority of) bid sequences consisted of only a single bid, so only the first bid could be used in order to guarantee that 1) each bid sequence was represented in the data, and 2) all four strategies were equally represented. Figure 3 illustrates the initial plot of the original raw data. The k-means clustering algorithm created 4 clusters that completely disagreed with the original raw data plot, as seen in Figure 4. This result was quite intuitive, given the seemingly random plot of the original raw data, which lacked any identifiable clusters, and it suggested that k-means clustering should not be used to classify the testing data.

Figure 3. Each bid sequence is represented by its first bid. In all, 135 bid sequences of each strategy are represented.

Although discarding all bids except the first from each bid sequence guaranteed equal representation of all four trading strategies, it compromised the integrity of the data by removing a significant number of data points from certain bid sequences; especially long bid sequences were affected most significantly by this approach. One alternative that was not tested would have involved treating each bid as its own sequence. Although this approach would avoid discarding a large number of data points, it is not clear whether the integrity of the data would be maintained, since the relationship between bids and the bid sequence they belonged to would be lost in the transformation.

Figure 4. Erroneous k-means clustering of the observed data.

II. SVM classification: Another attempt to classify the observed data was made using a Support Vector Machine (SVM). To perform classification with the SVM, the observed raw data was converted to the appropriate data format. Each sequence of bids was represented by a unique vector, and each bid in the sequence became a feature in the corresponding vector. Feature numbers were assigned incrementally, so a bid sequence made up of n bids was represented by a vector of features 1 through n. Each vector was assigned a class label according to its strategy (classes 1 through 4). An example of the data can be seen in Figure 5. SVM training and classification were performed using LibSVM v2.85.

Figure 5. Example of observed data in SVM-friendly format.

Prediction results using SVM classification varied greatly, depending largely on the type of kernel used for training. Training with the sigmoid kernel (with default gamma and coefficient values) yielded the worst results, predicting only 28.2% of the testing set correctly (only about 3% better than completely random prediction). Training with the linear kernel also yielded rather poor results, predicting only 32.8% of the testing data correctly. The polynomial kernel likewise performed poorly, predicting 38.4% of the testing data correctly with default parameters. The radial kernel, however, produced much better results: it predicted 53.8% of the testing data correctly under default parameters and 59.7% when gamma was set to 0.8. It was also observed that the ZI-C strategy and one other strategy were predicted with a high degree of accuracy, while the remaining two strategies were more difficult to classify.

The confusion matrix represents the quality of classification by the system. Each row represents the true class of a data instance and each column represents the predicted class. Thus, the confusion matrix indicates the errors within each class (how many instances of each class were successfully classified and how many were misclassified); high values along the diagonal indicate a successful classification system. Figure 6 presents the confusion matrices for various runs of the SVM classification with different kernels.
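The report used LibSVM v2.85; as a stand-in, the sketch below uses scikit-learn's SVC on toy data to show the same pipeline: pad or truncate variable-length bid sequences to fixed-length feature vectors, fit a radial (RBF) kernel SVM, and print a confusion matrix. The zero-padding choice and all toy parameters are assumptions, not the authors' exact preprocessing.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

def to_fixed_length(sequences, n_features):
    # Pad with zeros or truncate so every sample has the same number of
    # features, which an SVM (unlike an HMM) requires.
    X = np.zeros((len(sequences), n_features))
    for i, seq in enumerate(sequences):
        k = min(len(seq), n_features)
        X[i, :k] = seq[:k]
    return X

# Toy data: two fake "strategies", one bidding low, one bidding high.
rng = np.random.default_rng(0)
low = [rng.uniform(5, 20, size=rng.integers(1, 6)) for _ in range(40)]
high = [rng.uniform(80, 120, size=rng.integers(1, 6)) for _ in range(40)]
X = to_fixed_length(low + high, n_features=5)
y = np.array([0] * 40 + [1] * 40)

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(confusion_matrix(y, clf.predict(X)))  # high diagonal values = good fit
```

A row of the confusion matrix corresponds to a true class and a column to a predicted class, exactly as described above.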
Figure 6. Confusion matrices for different SVM kernels: linear (accuracy = 32.8%), radial with default gamma (accuracy = 53.8%), polynomial with default gamma = 1, coefficient, and degree (accuracy = 38.4%), and sigmoid with default gamma = 1 and coefficient (accuracy = 28.2%).

III. HMM classification: From the onset, the presence of hidden parameters in the data (bid identities, strategies, and the internal parameters within each strategy) and the time-dependent nature of the bid sequences suggested that a Hidden Markov Model should be used to classify the acquired dataset. The mixture-observation HMM [6] used for this dataset is shown in Figure 7. A Gaussian mixture model is assumed for each observation: X represents the current bid value and is the observed variable in the system, while the hidden variables are the state (Q) and the mixture parameters (M). Since there are 4 strategies, 4 separate HMMs are trained, one for each class of data, so that each HMM represents one of the 4 strategies. For the testing set, the likelihood of each data sample under each HMM is calculated, and the sample is assigned to the strategy represented by the HMM with the maximum likelihood value.

Figure 7. Mixture-observation Hidden Markov Model used for the CAT dataset.

The HMM-based classification showed an improvement over the SVM-based method used earlier, supporting the authors' initial expectation that a Hidden Markov Model could effectively represent the variables involved in the bidding process. Several HMM runs were executed with different values of the parameters (number of hidden states and number of mixture components). The HMM models took between 1 and 68 minutes to train, depending on the number of states and mixture components; in contrast, SVM training ranged from only a few seconds to 2-3 minutes. Nevertheless, the accuracy of the HMM method was always greater than 52%, and the authors were able to achieve about 62% accuracy by tuning these parameters. Figure 8 is a plot of the accuracies obtained in the different HMM runs, and Figure 9 shows the confusion matrices for a few runs of the HMM method.

Figure 8. Classification accuracy for different HMM parameters.
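The maximum-likelihood decision rule described above can be sketched without any HMM library: score a bid sequence under one HMM per strategy with the forward algorithm, then pick the best-scoring model. The single-state Gaussian toy models below are illustrative stand-ins for the trained mixture-observation HMMs, not the authors' models.

```python
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x))) for log-space probabilities.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def gauss_logpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def forward_loglik(seq, pi, A, mus, sigmas):
    # Log-space forward algorithm: log P(sequence | HMM) for an HMM with
    # one Gaussian emission per hidden state.
    n = len(pi)
    alpha = [math.log(pi[i]) + gauss_logpdf(seq[0], mus[i], sigmas[i]) for i in range(n)]
    for x in seq[1:]:
        alpha = [logsumexp([alpha[j] + math.log(A[j][i]) for j in range(n)])
                 + gauss_logpdf(x, mus[i], sigmas[i]) for i in range(n)]
    return logsumexp(alpha)

def classify(seq, models):
    # Assign the bid sequence to the strategy whose HMM scores it highest.
    return max(models, key=lambda name: forward_loglik(seq, *models[name]))

# Toy stand-ins: one single-state model per "strategy" (pi, A, means, stds).
models = {
    "low_bidder": ([1.0], [[1.0]], [10.0], [5.0]),
    "high_bidder": ([1.0], [[1.0]], [80.0], [5.0]),
}
print(classify([78.0, 80.0, 82.0], models))  # → high_bidder
```

Naturally, the toy decision boundary here is trivial; the point is only the classify-by-maximum-likelihood structure, which handles variable-length sequences without padding.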
From the HMM confusion matrices presented in this report, it is clear that the ZI-C strategy and one other strategy are the easiest to classify. This can also be observed in Figure 3, where the red dots (one strategy) clearly follow a recognizable pattern and the cyan dots (the ZI-C strategy) all fall within a reasonably small range of values. A third strategy was moderately difficult to classify, while the fourth proved extremely difficult, with performance not much better than purely random classification.

Figure 9. Confusion matrices for different HMM parameters.

6. Feature Reduction

A small experiment was also carried out to explore the efficacy of feature reduction of the observed dataset in the HMM classification framework. Standard reduction methods such as PCA could not be used because the dataset included instances of varying feature lengths (treating each bid as a single feature): most samples had only 1 feature, while some had far more. Experiments showed that when only the first 2 features were used, the HMM accuracy fell to 49.71%, while using only the first 10 features slightly improved upon the earlier best results. The authors therefore concluded that a moderate reduction in the number of features could improve performance while also maintaining the integrity of the observed data; further experiments would be required to determine the optimal number of features. The confusion matrices and accuracies for the 2-feature-only and 10-features-only HMM runs are presented in Figure 10.

Figure 10. Results of training an HMM using only the first 2 bids (accuracy = 49.71%) and only the first 10 bids (accuracy = 62.57%) of each bid sequence.

7. Other Methods

A number of other methods were explored in an attempt to improve upon the HMM results. We tried to fit a Conditional Random Field (CRF) model to the data using Kevin Murphy's MATLAB code [7]. The model failed to converge in many cases, however, and its accuracy on our dataset was close to random performance. We concluded that a deeper understanding of the CRF framework was required before we could effectively exploit the advantages offered by its discriminative model.

Since the SVM is among the most popular and successful classification methods in many applications, we also tried the time-dependent Fourier kernel [8], reasoning that the power of the SVM, combined with the time information provided by the Fourier kernel, might significantly improve classification accuracy. A Fourier kernel was calculated for all instances in the training and testing sets, as required by the LibSVM framework for user-defined kernels. Accuracies between 28% and 42% were observed for different configurations of the Fourier kernel, indicating no improvement over the standard SVM kernel functions. We briefly contemplated the pyramid match kernel [9], which is designed to work with datasets that have varying numbers of features, but due to lack of time we left it as future work.

8. Conclusions and Future Work

We found SVM classification to be reasonably accurate, with a best-case accuracy of 59.7%. However, the SVM by nature does not handle time-series
data well and is also adversely affected by data with varying feature lengths. The HMM proved a better solution for this dataset (yielding a prediction accuracy of almost 63% in the best case), because it is designed for time-series data and is not greatly affected by the varying lengths of the sequences. We used a simple HMM model for this project; it may be useful to experiment with more complicated models that represent the data-generation process more accurately. Given that the task was classification of the data into multiple classes, where random guessing would yield only about 25% accuracy, the achieved results were better than expected, suggesting that classifying trading strategies as a preliminary step to managing transactions between traders in the CAT virtual market is a reasonable approach for any CAT participant. The data generated for this project came from a single trading session; further rigorous testing for robustness across trading sessions should be carried out before applying the classification scheme in the CAT Tournament.

References

[1] Cliff et al. CAT Document 1. TAC Market Design: Communication Protocol Specification. July 2007.

[2] Dave Cliff. Minimal-intelligence agents for bargaining behaviours in market-based environments. Technical Report HP-97-91, Hewlett-Packard Research Laboratories, Bristol, England, 1997.

[3] S. Gjerstad and J. Dickhaut. Price formation in double auctions. Games and Economic Behaviour, 22:1-29, 1998.

[4] D. K. Gode and S. Sunder. Allocative efficiency of markets with zero-intelligence traders: Markets as a partial substitute for individual rationality. The Journal of Political Economy, 101(1), February 1993.

[5] A. E. Roth and I. Erev. Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8, 1995.

[6] Jeff A. Bilmes. What HMMs can do. IEICE Transactions on Information and Systems, E89-D(3), 2006.

[7] crf.html

[8] Stephan Rüping. SVM kernels for time series analysis. 2001.

[9] K. Grauman and T. Darrell. The Pyramid Match kernel: discriminative classification with sets of image features. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005.
More informationSupport Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano
More informationRethinking the Social Contract: An Agent- Based Model approach
6th International Conference on Industrial Engineering and Industrial Management. XVI Congreso de Ingeniería de Organización. Vigo, July 18-20, 2012 Rethinking the Social Contract: An Agent- Based Model
More informationSupport Vector Machines for Dynamic Biometric Handwriting Classification
Support Vector Machines for Dynamic Biometric Handwriting Classification Tobias Scheidat, Marcus Leich, Mark Alexander, and Claus Vielhauer Abstract Biometric user authentication is a recent topic in the
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationIntroduction to Logistic Regression
OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction
More informationEmissions Permits Auctions: an ABM Analysis
Emissions Permits Auctions: an ABM Analysis 1 Emissions Permits Auctions: an ABM Analysis Marta Posada, Cesáreo Hernández, Adolfo López-Paredes 1 University of Valladolid. INSISOC, Spain {posada, cesareo}@eis.uva.es
More informationRelational Learning for Football-Related Predictions
Relational Learning for Football-Related Predictions Jan Van Haaren and Guy Van den Broeck jan.vanhaaren@student.kuleuven.be, guy.vandenbroeck@cs.kuleuven.be Department of Computer Science Katholieke Universiteit
More informationIntrusion Detection via Machine Learning for SCADA System Protection
Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department
More informationAcknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
More informationGamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
More informationBIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
More informationActive Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
More informationCrowdfunding Support Tools: Predicting Success & Failure
Crowdfunding Support Tools: Predicting Success & Failure Michael D. Greenberg Bryan Pardo mdgreenb@u.northwestern.edu pardo@northwestern.edu Karthic Hariharan karthichariharan2012@u.northwes tern.edu Elizabeth
More informationProgramming Exercise 3: Multi-class Classification and Neural Networks
Programming Exercise 3: Multi-class Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement one-vs-all logistic regression and neural networks
More informationBayesian Statistics: Indian Buffet Process
Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note
More informationIntroduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
More informationEM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationlarge-scale machine learning revisited Léon Bottou Microsoft Research (NYC)
large-scale machine learning revisited Léon Bottou Microsoft Research (NYC) 1 three frequent ideas in machine learning. independent and identically distributed data This experimental paradigm has driven
More informationMachine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
More informationAutomated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion Transcripts Vitomir Kovanović v.kovanovic@ed.ac.uk Dragan Gašević dgasevic@acm.org School of Informatics, University of Edinburgh Edinburgh, United Kingdom v.kovanovic@ed.ac.uk
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationBeating the MLB Moneyline
Beating the MLB Moneyline Leland Chen llxchen@stanford.edu Andrew He andu@stanford.edu 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series
More informationSegmentation and Classification of Online Chats
Segmentation and Classification of Online Chats Justin Weisz Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 jweisz@cs.cmu.edu Abstract One method for analyzing textual chat
More informationCS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationPricing Algorithm and Network Marketing Budget
Adaptation to Dynamic Resource Availability in Ad-hoc Grids through a Learning Mechanism Behnaz Pourebrahimi, Koen Bertels Computer Engineering Laboratory, Delft University of Technology, The Netherlands
More informationT cell Epitope Prediction
Institute for Immunology and Informatics T cell Epitope Prediction EpiMatrix Eric Gustafson January 6, 2011 Overview Gathering raw data Popular sources Data Management Conservation Analysis Multiple Alignments
More informationBlind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections
Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections Maximilian Hung, Bohyun B. Kim, Xiling Zhang August 17, 2013 Abstract While current systems already provide
More informationCloud Computing. Computational Tasks Have value for task completion Require resources (Cores, Memory, Bandwidth) Compete for resources
Peter Key, Cloud Computing Computational Tasks Have value for task completion Require resources (Cores, Memory, Bandwidth) Compete for resources How much is a task or resource worth Can we use to price
More informationE-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce
More informationD-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
More informationTutorial Segmentation and Classification
MARKETING ENGINEERING FOR EXCEL TUTORIAL VERSION 1.0.8 Tutorial Segmentation and Classification Marketing Engineering for Excel is a Microsoft Excel add-in. The software runs from within Microsoft Excel
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationFOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS
FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS Leslie C.O. Tiong 1, David C.L. Ngo 2, and Yunli Lee 3 1 Sunway University, Malaysia,
More informationFacebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationIntroduction to Online Learning Theory
Introduction to Online Learning Theory Wojciech Kot lowski Institute of Computing Science, Poznań University of Technology IDSS, 04.06.2013 1 / 53 Outline 1 Example: Online (Stochastic) Gradient Descent
More informationIntroduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk
Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems
More informationSupport Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationEnvironmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
More informationHow To Optimize Online Video Ads
Proceedings of the 2011 IEEE Systems and Information Engineering Design Symposium, University of Virginia, Charlottesville, VA, USA, April 29, 2011 FridayPM1Modeling.2 Bidding Strategies Optimization for
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationEmotion Detection from Speech
Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction
More informationMachine Learning Capacity and Performance Analysis and R
Machine Learning and R May 3, 11 30 25 15 10 5 25 15 10 5 30 25 15 10 5 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 100 80 60 40 100 80 60 40 100 80 60 40 30 25 15 10 5 25 15 10
More informationLearning is a very general term denoting the way in which agents:
What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);
More informationK-Means Clustering Tutorial
K-Means Clustering Tutorial By Kardi Teknomo,PhD Preferable reference for this tutorial is Teknomo, Kardi. K-Means Clustering Tutorials. http:\\people.revoledu.com\kardi\ tutorial\kmean\ Last Update: July
More information1. Classification problems
Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification
More information1816 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 7, JULY 2006. Principal Components Null Space Analysis for Image and Video Classification
1816 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 7, JULY 2006 Principal Components Null Space Analysis for Image and Video Classification Namrata Vaswani, Member, IEEE, and Rama Chellappa, Fellow,
More informationSPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
More informationDesign of an FX trading system using Adaptive Reinforcement Learning
University Finance Seminar 17 March 2006 Design of an FX trading system using Adaptive Reinforcement Learning M A H Dempster Centre for Financial Research Judge Institute of Management University of &
More information10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html
10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html
More informationA Simple Introduction to Support Vector Machines
A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear
More informationMachine Learning Logistic Regression
Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.
More informationLecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions
SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0
More informationAnalyzing the Procurement Process in a Simplified Version of the TAC SCM Game
Analyzing the Procurement Process in a Simplified Version of the TAC SCM Game Hosna Jabbari December 9, 2005 Abstract The TAC supply chain management game presents a simulated market environment with automated
More informationLaboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling
Laboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling Avram Golbert 01574669 agolbert@gmail.com Abstract: While Artificial Intelligence research has shown great success in deterministic
More informationSIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS. J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID
SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID Renewable Energy Laboratory Department of Mechanical and Industrial Engineering University of
More informationPLAANN as a Classification Tool for Customer Intelligence in Banking
PLAANN as a Classification Tool for Customer Intelligence in Banking EUNITE World Competition in domain of Intelligent Technologies The Research Report Ireneusz Czarnowski and Piotr Jedrzejowicz Department
More informationTrading Tutorial. Microstructure 2 Liquidity. Rotman School of Management http://rit.rotman.utoronto.ca Page 1 of 9
Microstructure 2 Tutorial Introduction Welcome to Rotman Interactive Trader (RIT). This document begins assuming that you have installed and run the Rotman Interactive Trader client application (RIT Client).
More information