Analysis of Financial Data Using Non-Negative Matrix Factorization
International Mathematical Forum, 2008

Analysis of Financial Data Using Non-Negative Matrix Factorization

Konstantinos Drakakis
UCD CASL, University College Dublin, Belfield, Dublin, Ireland

Scott Rickard
UCD CASL, University College Dublin, Belfield, Dublin, Ireland

Ruairí de Fréin
UCD CASL, University College Dublin, Belfield, Dublin, Ireland

Andrzej Cichocki
Laboratory for Advanced Brain Signal Processing, Brain Science Institute, RIKEN, Hirosawa, Wako-shi, Saitama, Japan
[email protected]

Footnotes: The author is also affiliated with the School of Mathematics, University College Dublin. The author is also affiliated with Electronic & Electrical Engineering, University College Dublin.

Abstract

We apply Non-negative Matrix Factorization (NMF) to the problem of identifying underlying trends in stock market data. NMF is a recent and very successful tool for data analysis, including image and audio processing; we use it here to decompose a mixture of data, the daily closing prices of the 30 stocks which make up the Dow Jones Industrial Average, into its constituent parts, the underlying trends which
govern the financial marketplace. We demonstrate how to impose appropriate sparsity and smoothness constraints on the components of the decomposition. Also, we describe how the method clusters stocks together in performance-based groupings which can be used for portfolio diversification.

Introduction

The explosion in popularity of Non-negative Matrix Factorization (NMF) in recent times has led to it being applied to diverse fields such as PET [1], EEG analysis [], pattern recognition, feature extraction, denoising, dimensionality reduction and blind source separation [5]. NMF considers the following problem [8]:

Problem. Given a matrix Y = [y(1), y(2), \ldots, y(T)] \in \mathbb{R}^{m \times T}, NMF decomposes Y into the product of two matrices, A \in \mathbb{R}^{m \times r} and X = [x(1), x(2), \ldots, x(T)] \in \mathbb{R}^{r \times T}, where all matrices have exclusively non-negative elements.

We can view Y as a mixture of the rows of X weighted by the coefficients in the mixing matrix A. Thus, the rows of X can be thought of as underlying components from which the mixtures are created. We shall consider here decompositions which are approximative in nature, i.e.,

Y = AX + V,    (1)

where A \geq 0 and X \geq 0 component-wise, and V \in \mathbb{R}^{m \times T} represents a noise or error matrix.

One arena in which an understanding of the hidden components which drive the system is crucial (and potentially lucrative) is financial markets. It is widely believed that the stock market, and indeed individual stock prices, are determined by fluctuations in underlying but unknown factors (signals). If one could determine and predict these components, one would have the opportunity to leverage this knowledge for financial gain. In this paper, we use NMF to learn the components which drive the stock market; specifically, we use for Y the closing prices for the past 0 years of the 30 stocks which make up the Dow Jones Industrial Average. To the best of our knowledge, this is the first time NMF has been applied in a financial setting.

We begin with an overview of the NMF algorithm. Subsequently, we use its standard varieties (Frobenius and Kullback-Leibler) to represent each stock as a weighted sum of underlying components, and propose a novel method for making those components smooth. In fact, in some instances we are interested
in decompositions with sparse A and smooth X, and we modify the classical NMF techniques accordingly. Finally, based on the identified mixing matrix A, we propose a clustering that can be of potential use to investors, as it produces a performance-based grouping of the stocks instead of the traditional sector-based grouping.

Non-Negative Matrix Factorization

We start by introducing two standard NMF techniques proposed by Lee and Seung [8]. In their seminal work on NMF, [9] considered the squared Frobenius norm and the Kullback-Leibler (KL) objective functions:

Frobenius:  D_F(Y \| AX) = \sum_{ik} (y_{ik} - [AX]_{ik})^2    (2)

Kullback-Leibler:  D_{KL}(Y \| AX) = \sum_{ik} \left( y_{ik} \ln \frac{y_{ik}}{[AX]_{ik}} - y_{ik} + [AX]_{ik} \right)    (3)

A suitable step-size parameter was proposed in both cases, resulting in two alternating, multiplicative, gradient-descent updating algorithms:

Frobenius:
A \leftarrow A \circledast (Y X^T) \oslash (A X X^T),    (4)
X \leftarrow X \circledast (A^T Y) \oslash (A^T A X),    (5)

Kullback-Leibler:
A \leftarrow A \circledast \big[ (Y \oslash AX)\, X^T \big] \oslash [X \mathbf{1}_T, \ldots, X \mathbf{1}_T]^T,    (6)
X \leftarrow X \circledast \big[ A^T (Y \oslash AX) \big] \oslash [A^T \mathbf{1}_m, \ldots, A^T \mathbf{1}_m],    (7)

where \circledast represents element-wise multiplication, \oslash represents element-wise division, and \mathbf{1}_n is a column vector of n ones.

Lee and Seung argue that the advantage of having multiplicative updates is that they provide a simple update rule under which A and X never become negative, so that projection onto the positive orthant of the space is not necessary. Having alternating updates implies that the optimization is no longer convex (NMF is convex if only one of A or X is updated while the other is known), and therefore the solution it leads to may be neither unique nor optimal (exact).

Alternatively, NMF factorization can be achieved in many other ways: for example, multiplicative or additive gradient descent [9], projected gradient [10], exponentiated gradient [], 2nd-order Newton [12]; with different costs such as Kullback-Leibler, Frobenius and Amari-alpha divergence []; and with additive or projected sparsity [, 7] and/or orthogonality constraints [].
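As a concrete illustration, here is a minimal NumPy sketch of the two costs (2)-(3) and the corresponding multiplicative updates (4)-(7); a small epsilon guards the divisions, and all function names are illustrative rather than taken from the paper.

import numpy as np

EPS = 1e-12  # guards divisions and logarithms against zeros

def frobenius_cost(Y, A, X):
    return np.sum((Y - A @ X) ** 2)                        # cost (2)

def kl_cost(Y, A, X):
    AX = A @ X + EPS
    return np.sum(Y * np.log((Y + EPS) / AX) - Y + AX)     # cost (3)

def frobenius_step(Y, A, X):
    A = A * (Y @ X.T) / (A @ X @ X.T + EPS)                # update (4)
    X = X * (A.T @ Y) / (A.T @ A @ X + EPS)                # update (5)
    return A, X

def kl_step(Y, A, X):
    A = A * ((Y / (A @ X + EPS)) @ X.T) / (np.sum(X, axis=1)[None, :] + EPS)   # update (6)
    X = X * (A.T @ (Y / (A @ X + EPS))) / (np.sum(A, axis=0)[:, None] + EPS)   # update (7)
    return A, X

Each call performs one alternating pass over A and X; iterating these steps from random non-negative initializations yields the decompositions discussed below.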
Sparsity and smoothness

When it is known that the components of A and X have certain properties (e.g., sparsity), it is advantageous to steer the algorithm towards components with the desired properties. For example, [7] argues that explicitly incorporating a sparseness constraint improves the decompositions found for face image databases. Alternatively, [] apply a temporal smoothness constraint which they argue produces meaningful physical and physiological interpretations of clinical EEG recordings. We now give a brief overview of how sparsity and smoothness are implemented in practice. At the end of this section, we propose a novel method of producing a decomposition with a sparse mixing matrix and a smooth component matrix.

Consider the following constrained optimization problems with the Frobenius and KL costs []:

D_F^{\alpha_A, \alpha_X}(Y \| AX) = D_F(Y \| AX) + \alpha_A J_A(A) + \alpha_X J_X(X),    (8)
D_{KL}^{\alpha_A, \alpha_X}(Y \| AX) = D_{KL}(Y \| AX) + \alpha_A J_A(A) + \alpha_X J_X(X),
s.t. \forall i, j, k : x_{jk} \geq 0, \; a_{ij} \geq 0,

where α_A ≥ 0 and α_X ≥ 0 are regularization parameters, and the functions J_X(X) and J_A(A) are used to enforce certain application-dependent characteristics of the solution, such as sparsity and/or smoothness.

Regularization functions

The regularization terms J_A(A) and J_X(X) can be defined in many ways. Let us adopt the following definition of the L_p-norm of a given matrix C = [c_{mn}] \in \mathbb{R}^{M \times N}:

L_p(C) := \left( \sum_{m=1}^{M} \sum_{n=1}^{N} |c_{mn}|^p \right)^{1/p},

thus:

L_1 norm (p = 1), for a sparse solution:
J_A(A) = \sum_{i=1}^{m} \sum_{j=1}^{r} a_{ij}, \quad J_X(X) = \sum_{j=1}^{r} \sum_{k=1}^{T} x_{jk},
\nabla_A J_A(A) = \mathbf{1}_{m \times r}, \quad \nabla_X J_X(X) = \mathbf{1}_{r \times T};

Squared L_2 norm (p = 2), for a smooth solution:
J_A(A) = \sum_{i=1}^{m} \sum_{j=1}^{r} a_{ij}^2, \quad J_X(X) = \sum_{j=1}^{r} \sum_{k=1}^{T} x_{jk}^2,
[\nabla_A J_A(A)]_{ij} = 2 a_{ij}, \quad [\nabla_X J_X(X)]_{jk} = 2 x_{jk}.

There exist many other measures of sparsity and smoothness (see in particular [7], which proposes a sparsity measure with intuitively desirable properties), each with an associated modified algorithm.
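A short sketch of these two regularization terms and their gradients, to be plugged into the penalized updates of the next subsection; the function names are illustrative.

import numpy as np

def l1_penalty(C):
    return np.sum(C)              # J(C) for a sparse solution (entries of C are non-negative)

def l1_grad(C):
    return np.ones_like(C)        # its gradient: an all-ones matrix

def sq_l2_penalty(C):
    return np.sum(C ** 2)         # J(C) for a smooth solution

def sq_l2_grad(C):
    return 2.0 * C                # its gradient: 2C, element-wise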
Modified Frobenius and KL algorithms

The Lee-Seung algorithm with Frobenius cost can be regularized using additive penalty terms []. Using gradient descent, the following generalized updates can be derived:

a_{ij} \leftarrow a_{ij} \frac{ \big[ [Y X^T]_{ij} - \alpha_A [\nabla_A J_A(A)]_{ij} \big]_\varepsilon }{ [A X X^T]_{ij} },    (9)

x_{jk} \leftarrow x_{jk} \frac{ \big[ [A^T Y]_{jk} - \alpha_X [\nabla_X J_X(X)]_{jk} \big]_\varepsilon }{ [A^T A X]_{jk} },    (10)

a_{ij} \leftarrow \frac{ a_{ij} }{ \sum_{i=1}^{m} a_{ij} },    (11)

where the nonlinear operator [x]_ε = max{ε, x} has been included in the numerator to avoid the possibility of a negative numerator. Note that, in general, the signature matrix A is normalized at the end of each pair (A and X) of updates.

Regularization of the Lee and Seung algorithm with the Kullback-Leibler cost function was considered in []. The modified learning rules were found to perform poorly. In order to enforce sparsity while using the Kullback-Leibler cost function, a different approach can be used []:

a_{ij} \leftarrow \left( a_{ij} \frac{ \sum_{k=1}^{T} x_{jk} \, (y_{ik} / [AX]_{ik}) }{ \sum_{p=1}^{T} x_{jp} } \right)^{1 + \alpha_{Sa}},    (12)

x_{jk} \leftarrow \left( x_{jk} \frac{ \sum_{i=1}^{m} a_{ij} \, (y_{ik} / [AX]_{ik}) }{ \sum_{q=1}^{m} a_{qj} } \right)^{1 + \alpha_{Sx}},    (13)

a_{ij} \leftarrow \frac{ a_{ij} }{ \sum_{i=1}^{m} a_{ij} },    (14)

where α_Sa and α_Sx enforce sparse solutions. Typically, α_Sa, α_Sx ∈ [0.00, 0.00]. This is a rather intuitive or heuristic approach to sparsification. Raising the Lee-Seung learning rules to the power of (1 + α_Sa) or (1 + α_Sx), with α_Sa, α_Sx > 0 so that the exponents are greater than one, implies that the small values in the non-negative matrix tend to zero as the number of iterations increases, while the components of the non-negative matrix that are greater than one are forced to become larger.
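The two modified algorithms of this subsection can be sketched as follows: regularized_frobenius_step implements (9)-(11), with the penalty gradient subtracted inside the numerator, clipped at a small epsilon, and the columns of A re-normalized; sparse_kl_step implements (12)-(14), the plain Lee-Seung KL update raised element-wise to the power 1 + alpha. The grad_JA and grad_JX arguments could be, for example, l1_grad or sq_l2_grad from the sketch above; all names and default values are illustrative.

import numpy as np

EPS = 1e-9

def regularized_frobenius_step(Y, A, X, grad_JA, grad_JX, alpha_A=0.0, alpha_X=0.0):
    A = A * np.maximum(Y @ X.T - alpha_A * grad_JA(A), EPS) / (A @ X @ X.T + EPS)   # (9)
    X = X * np.maximum(A.T @ Y - alpha_X * grad_JX(X), EPS) / (A.T @ A @ X + EPS)   # (10)
    A = A / (np.sum(A, axis=0, keepdims=True) + EPS)                                # (11)
    return A, X

def sparse_kl_step(Y, A, X, alpha_Sa=0.001, alpha_Sx=0.001):
    R = Y / (A @ X + EPS)                                                           # y_ik / [AX]_ik
    A = (A * (R @ X.T) / (np.sum(X, axis=1)[None, :] + EPS)) ** (1.0 + alpha_Sa)    # (12)
    R = Y / (A @ X + EPS)
    X = (X * (A.T @ R) / (np.sum(A, axis=0)[:, None] + EPS)) ** (1.0 + alpha_Sx)    # (13)
    A = A / (np.sum(A, axis=0, keepdims=True) + EPS)                                # (14)
    return A, X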
In practice these learning rules are stable and the algorithm works well. Of course, this power-raising technique can also be used with the Frobenius norm, if we wish: in both cases it then reduces to an intermediate step of the algorithm performing the updates

x_{ij} \leftarrow x_{ij}^{1 + \alpha_{Sx}}, \qquad a_{ij} \leftarrow a_{ij}^{1 + \alpha_{Sa}}.    (15)

This is the smoothness and sparsity technique we will be using to derive our results.

Sparse A and smooth X

In many practical situations, we may desire a decomposition which has a sparse A and a smooth X (or, alternatively, a sparse X and a smooth A). We now propose a simple approach which produces such solutions. In the Frobenius case, we required a projection to correct the modified algorithm's solutions that occasionally lie outside the allowed non-negative space. The modified Kullback-Leibler method does not produce satisfactory results, which led to the heuristic and intuitive approach for sparsification of the solution []. If we, however, desire smooth solutions, we can modify (12) or (13) simply by setting α_Sa < 0 or α_Sx < 0. By setting, for example, α_Sa > 0 and α_Sx < 0, the solutions produced have a sparse A and a smooth X. This is the approach we use in this paper.

Stock market trends

We will apply NMF to the closing prices for the past 0 years of the 30 stocks that make up the Dow Jones Industrial Index. These stocks are traditionally grouped into sectors according to the nature of the activities or the object of trade of the company they correspond to (such as Technology, Basic Materials, Financial, etc.); full information about these stocks can be found in Table 1. The closing prices of stocks are certainly non-negative signals, so it makes sense to apply NMF to them; we actually analyze their natural logarithms and, to obtain better results, we ground them, namely we shift the time series of each stock vertically so that its minimum value becomes 0, as sketched below.
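A minimal sketch of this preprocessing (log prices, grounded so that each series has minimum 0), assuming prices is an array of daily closing prices with one row per stock; the names are illustrative.

import numpy as np

def preprocess(prices):
    """prices: (n_stocks, n_days) array of daily closing prices."""
    logp = np.log(prices)
    return logp - logp.min(axis=1, keepdims=True)   # each row now has minimum 0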
It is reasonable to assume that stocks that are part of the same market do not behave independently of each other, but are rather driven by some underlying forces, normally significantly fewer than the number of stocks themselves. In this way, groups of stocks exhibit correlated behavior, and these groups may or may not coincide with the sectors mentioned above. Obviously, the most interesting scenario would be a high-performance group almost orthogonal to the sectors (that is, containing one stock from most sectors), as investment in such a group would falsely appear to be diversified and would subject investors to unsuspected risk.

NMF is definitely an exemplary candidate method for determining these underlying driving components; after all, this is exactly what it has been successfully applied to in audio and image processing, where it has proved extremely efficient in demixing speech and separating overlapping images. In what follows, we will test some of the variants of NMF described above on our financial data.

One last point to be mentioned concerns the clustering algorithm that determines the groups of stocks; this is, however, an extensively studied problem in statistics, and most mathematical software packages include efficient algorithms that solve it automatically. In our case, we use the kmeans routine of MATLAB on the matrix A we determine at the end of each run.

Results

We divide this section into various parts, according to the features incorporated in the version of the NMF algorithm we use. Note that X is always normalized so that each row's maximum is equal to 1. We run the algorithms through a recursion of 000 steps, or until the cost stops decreasing, whichever comes first; a sketch of this protocol is given below.

In Figure 1 we show two things that should be kept in mind throughout the experiments: the original stock prices, so that we can compare them with the reconstructions we obtain through NMF, and the action of exponentiation, whereby exponents larger than 1 increase the range of values (making large values larger and small values smaller, relatively speaking), while exponents less than 1 suppress it.

Frobenius runs

In the following runs we use the Frobenius norm.

No smoothness, no sparsity

In this run we use vanilla NMF, making no effort to induce either extra sparsity or extra smoothness; in other words, α_Sx = α_Sa = 0 in (15). The recursion goes through all 000 steps, and the results are shown in Figure 2. We see that the components (rows of X) are already sufficiently smooth, and that the reconstruction of the data from the determined components is also very good. The cost is computed using (2).
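The run protocol described above can be sketched as follows, assuming the update and cost functions from the earlier sketches (Frobenius or Kullback-Leibler): a fixed number of alternating steps with an early stop once the cost no longer decreases, an optional exponentiation step (15) after each update (a positive alpha_Sa sparsifies A, a negative alpha_Sx smooths X), and a final rescaling so that each row of X has maximum 1. All names and default values are illustrative.

import numpy as np

def run_nmf(Y, r, step_fn, cost_fn, n_steps=1000, alpha_Sa=0.0, alpha_Sx=0.0, seed=0):
    rng = np.random.default_rng(seed)
    m, T = Y.shape
    A, X = rng.random((m, r)), rng.random((r, T))
    prev_cost = np.inf
    for _ in range(n_steps):
        A, X = step_fn(Y, A, X)                               # e.g. frobenius_step or kl_step
        A, X = A ** (1.0 + alpha_Sa), X ** (1.0 + alpha_Sx)   # exponentiation step (15)
        cost = cost_fn(Y, A, X)
        if cost >= prev_cost:                                 # stop once the cost stops decreasing
            break
        prev_cost = cost
    scale = X.max(axis=1, keepdims=True) + 1e-12
    return A * scale.T, X / scale   # each row of X has maximum 1; the product AX is unchanged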
 #   Symbol  Company                               Sector
 1   AA      Alcoa                                 Basic Materials
 2   AIG     American International Group          Financial
 3   AXP     American Express                      Financial
 4   BA      Boeing                                Industrial Goods
 5   C       Citigroup                             Financial
 6   CAT     Caterpillar                           Industrial Goods
 7   DD      DuPont                                Basic Materials
 8   DIS     Disney                                Services
 9   GE      General Electric                      Conglomerates
10   GM      General Motors                        Consumer Goods
11   HD      Home Depot                            Services
12   HON     Honeywell                             Industrial Goods
13   HPQ     Hewlett-Packard                       Technology
14   IBM     International Business Machines       Technology
15   INTC    Intel                                 Technology
16   JNJ     Johnson and Johnson                   Healthcare
17   JPM     JP Morgan Chase                       Financial
18   KO      Coca-Cola                             Consumer Goods
19   MCD     McDonald's                            Services
20   MMM     Minnesota Mining and Manufacturing    Conglomerates
21   MO      Altria                                Consumer Goods
22   MRK     Merck                                 Healthcare
23   MSFT    Microsoft                             Technology
24   PFE     Pfizer                                Healthcare
25   PG      Procter and Gamble                    Consumer Goods
26   T       AT&T                                  Technology
27   UTX     United Technologies                   Conglomerates
28   VZ      Verizon                               Technology
29   WMT     Wal-Mart                              Services
30   XOM     ExxonMobil                            Basic Materials

Table 1: The 30 stocks of the Dow Jones Industrial Average (for the past 0 years) and their sectors.
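For convenience in the clustering sketch given later, the 30 tickers of Table 1 can be kept in a list, in the same order (1-30) used throughout the paper, so that stock numbers can be mapped back to companies.

DJIA_TICKERS = [
    "AA", "AIG", "AXP", "BA", "C", "CAT", "DD", "DIS", "GE", "GM",
    "HD", "HON", "HPQ", "IBM", "INTC", "JNJ", "JPM", "KO", "MCD", "MMM",
    "MO", "MRK", "MSFT", "PFE", "PG", "T", "UTX", "VZ", "WMT", "XOM",
]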
Figure 1: (a) The original stock prices data; (b) graph of the functions f(x) = x^a for an exponent a above and one below 1, x > 0.

No smoothness, sparsity

In this run we use α_Sx = 0, α_Sa = 0.0 in (15) in order to obtain a sparse A; this turns out to produce components which oscillate too much (see Figure 3). Hence, extra sparsity for A should be used in parallel with extra smoothness for X. Note also that the recursion ended prematurely here, indicating that the cost had reached a minimum and was about to increase again.

Smoothness, no sparsity

In this run we use α_Sx = 0.0, α_Sa = 0 in (15) in order to obtain a smooth X; this time the recursion goes on till the end, and the components are smooth, but they are also very similar to each other, with a clear upward trend (see Figure 4). If we remove this common upward trend, the components look similar to those in Figure .

Other attempts to induce sparsity and smoothness

We saw earlier that our attempt to sparsify A by exponentiation led to oscillatory components in X. We can remove this oscillation either by using α_Sx < 0, or by brute force, namely by convolving the rows of X with a smoothing window at the end of each step of the algorithm. The latter method is especially appealing, as we can smooth the components to an arbitrary degree by choosing the width of the smoothing window (we used a Hamming window of width 0). Here, the components look very much like the ones in Figure .
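A sketch of this brute-force smoothing, assuming, as above, that it is applied to the rows of X at the end of each update step: each row is convolved with a normalized Hamming window whose width controls how coarsely the component is smoothed. The default width below is illustrative.

import numpy as np

def smooth_rows(X, width=11):
    w = np.hamming(width)
    w /= w.sum()          # normalise the window so the overall level of each row is preserved
    return np.apply_along_axis(lambda row: np.convolve(row, w, mode="same"), 1, X)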
Figure 2: Results of the Frobenius run with no extra smoothness or sparsity. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

Kullback-Leibler runs

In the following runs we use the Kullback-Leibler cost.

No smoothness, no sparsity

In this run we again use vanilla NMF, making no effort to induce either extra sparsity or extra smoothness: we set α_Sx = α_Sa = 0 in (15). The recursion goes through all 000 steps, and the results are shown in Figure 5. We see that the components (rows of X) are already sufficiently smooth, and that the reconstruction of the data from the determined components is also very good. The cost is computed using (3).

No smoothness, sparsity

In this run we use α_Sx = 0, α_Sa = 0.00 in (15) in order to obtain a sparse A; given the high oscillation we observed in the corresponding Frobenius run, we decreased the exponent used, and this time the result is satisfactory. Although the components are more oscillatory than before, they do not oscillate wildly
(see Figure 6). Finally, note that the recursion ended prematurely here.

Figure 3: Results of the Frobenius run with no extra smoothness but extra sparsity for A. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

Smoothness and sparsity

In this run we use α_Sx = 0.00, α_Sa = 0.00 in (15) in order to obtain a sparse A and a smooth X; the results are very nice, as Figure 7 shows.

Smoothness by convolution and sparsity

In this run we use α_Sx = 0, α_Sa = 0.00 in (15), but attempt to smooth the components by filtering. As convolution with a large Hamming window did not produce interesting results in the corresponding Frobenius runs, we now use a very small Hamming window instead, of width (after all, we are only interested in fine-scale smoothing).
Figure 4: Results of the Frobenius run with no extra sparsity but extra smoothness for X. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

Discussion of the various runs

In the experiments above we tried to sample a huge variety of parameters. Our results suggest the following:

- There is practically no difference between using the Frobenius cost and using the Kullback-Leibler cost.
- Attempts to sparsify A lead to oscillations in X that need to be smoothed out.
- Smoothing through exponentiation tends to perform both coarse- and fine-scale smoothing and reduces the overall oscillation of the components, yielding very similar components whose common trend then needs to be removed.
- Smoothing X through filtering (with a Hamming window, in our example) gives us more control over which scales we want to smooth, and is thus a method more suitable for our purposes.
Figure 5: Results of the Kullback-Leibler run with no extra smoothness or sparsity. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

Stock clustering

The determination of the underlying driving components of the market in the various runs above leads us to investigate which stocks behave similarly, in the sense that they rely mainly on (that is, are linear combinations of) the same components. This can easily be done by examining the rows of A determined at the end of each run and deciding which rows have their dominant coefficients at the same positions (columns). This is a well-studied problem in statistics, and MATLAB has a routine (kmeans) that does the grouping automatically upon being fed A and the number of clusters we seek; we set 7 clusters in all cases (a sketch using an equivalent routine is given below).

The results of the clustering can be seen in Table 2, which demonstrates clearly the success of NMF in the analysis of financial data. Indeed, we see that clusters (or at least sub-clusters) of stocks persist across the various runs, that is, they are not dependent on the parameters we run NMF with (cost, sparsity, smoothness, etc.). This implies that NMF is robust where financial data is concerned, and hence very suitable for its analysis.
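A sketch of this clustering step using scikit-learn's KMeans in place of MATLAB's kmeans: each stock is represented by its row of mixing coefficients in A, and the stocks are grouped into 7 clusters. DJIA_TICKERS is the list given after Table 1; all other names are illustrative.

from sklearn.cluster import KMeans

def cluster_stocks(A, tickers, n_clusters=7):
    """A: (n_stocks, r) mixing matrix; returns {cluster label: list of tickers}."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(A)
    groups = {}
    for ticker, label in zip(tickers, labels):
        groups.setdefault(label, []).append(ticker)
    return groups

# e.g. groups = cluster_stocks(A, DJIA_TICKERS)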
Figure 6: Results of the Kullback-Leibler run with no extra smoothness but extra sparsity for A. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

In addition, those persistent clusters, as we are about to see, are frequently orthogonal to the traditionally defined sectors, falsely suggesting good portfolio diversification options. More precisely, to give some concrete examples (stock numbers refer to Table 1):

- 3 and 5 are always grouped together, with one exception only (run ), and correspond to American Express and Citigroup (finance).
- 23 and 11 are always together, and correspond to Microsoft (technology) and Home Depot (services).
- 2, 9, and 24 are always grouped together, often along with 29, and correspond to AIG (financial), General Electric (conglomerates), Pfizer (healthcare), and Wal-Mart (services), so the group they form cuts across sectors.
- 21 and 25 are always together with one exception (run ), and correspond to Altria and Procter & Gamble, both in consumer goods.
- 13 and 17 are always together except in one run, and correspond to Hewlett-Packard (technology) and JP Morgan Chase (finance).
Figure 7: Results of the Kullback-Leibler run with extra smoothness for X and extra sparsity for A. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

- 7, 10, 19, 26, 28 are grouped together in all runs, often along with 12 and 22: this is a large group comprising DuPont (basic materials), General Motors (consumer goods), McDonald's (services), AT&T (technology), Verizon (technology), Honeywell (industrial goods), and Merck (healthcare), and it is truly remarkable as it intersects almost all sectors.

Conclusion and summary

We tested the main variations of the NMF algorithm on the analysis of financial data, namely the closing prices of the 30 stocks forming the Dow Jones Industrial Index for the past 0 years, in order to discover the underlying forces driving the financial market. We found that there is a tradeoff between the smoothness of those underlying components and the sparsity of the representation of the stock prices as linear combinations of these components, and proposed a novel method for sparsity and smoothness through exponentiation. But, more importantly, we also found that the clustering of the stocks
according to the components they depend on is robust in the parameters we use, in the sense that certain groups of stocks persist in all (or at least in most) clusterings we attempted, and that those groups often span many sectors, thus potentially giving investors a false sense of security that they have diversified their portfolio when in fact they have not. Conversely, investing in stocks belonging to different persistent groups, but not necessarily in different sectors, offers genuine portfolio diversification options.

Figure 8: Results of the Kullback-Leibler run with extra smoothness induced by filtering for X and extra sparsity for A. Panels: (a) reconstruction (rows of AX); (b) components (rows of X); (c) cost per step.

Acknowledgements

This material is based upon work supported by the Science Foundation Ireland under Grant No. 0/YI/I77.
Table 2: Clustering of the stocks according to the 7 runs performed, in the same order from top to bottom; the correspondence between numbers and stocks can be found in Table 1.

References

[1] B. Bodvarsson and M. Mørkebjerg. Analysis of dynamic PET data. Master's thesis, Informatics and Mathematical Modelling, Technical University of Denmark, Kgs. Lyngby. Supervised by Lars Kai Hansen.

[2] Z. Chen, A. Cichocki, and T. Rutkowski. Constrained non-negative matrix factorization method for EEG analysis in early detection of Alzheimer disease. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] A. Cichocki, S. Amari, R. Zdunek, R. Kompass, G. Hori, and Z. He. Extended SMART algorithms for non-negative matrix factorization. In ICAISC.

[4] A. Cichocki and R. Zdunek. NMFLAB User's Guide: MATLAB Toolbox for Non-Negative Matrix Factorization. Laboratory for Advanced Brain Signal Processing, RIKEN.

[5] A. Cichocki, R. Zdunek, and S. Amari. New algorithms for non-negative matrix factorization in applications to blind source separation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6] T. Feng, S. Li, H. Shum, and H. Zhang. Local non-negative matrix factorization as a visual representation. In Proceedings of the 2nd International Conference on Development and Learning (ICDL), Washington, DC, USA. IEEE Computer Society.

[7] P. Hoyer. Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research, 5:1457-1469, 2004.
[8] D. Lee and H. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788-791, 1999.

[9] D. Lee and H. Seung. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems (NIPS), 2000.

[10] C. Lin. Projected gradient methods for non-negative matrix factorization. Technical report, Department of Computer Science, National Taiwan University.

[11] W. Liu, N. Zheng, and X. Li. Nonnegative matrix factorization for EEG signal classification. Lecture Notes in Computer Science. Springer, Berlin/Heidelberg.

[12] R. Zdunek and A. Cichocki. Non-negative matrix factorization with quasi-Newton optimization. In ICAISC.

Received: April, 2008