PREDICTING MARKET VOLATILITY FEDERAL RESERVE BOARD MEETING MINUTES FROM

Similar documents
Predicting Market-Volatility from Federal Reserve Board Meeting Minutes NLP for Finance

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

Analysis of Tweets for Prediction of Indian Stock Markets

Why is Internal Audit so Hard?

Interday news-based prediction of stock prices and trading volume CHRISTIAN SÖYLAND. Master s thesis in Engineering Mathematics

Predicting the Stock Market with News Articles

Bagged Ensemble Classifiers for Sentiment Classification of Movie Reviews

Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015

Sentiment Analysis for Movie Reviews

Neural Networks for Sentiment Detection in Financial Text

Forecasting stock markets with Twitter

W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015

ARTIFICIAL INTELLIGENCE METHODS IN STOCK INDEX PREDICTION WITH THE USE OF NEWSPAPER ARTICLES

Equity forecast: Predicting long term stock price movement using machine learning

Applying Machine Learning to Stock Market Trading Bryce Taylor

Clustering Technique in Data Mining for Text Documents

Author Gender Identification of English Novels

Data Mining Yelp Data - Predicting rating stars from review text

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Predicting outcome of soccer matches using machine learning

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

Building a Question Classifier for a TREC-Style Question Answering System

Predicting Movie Revenue from IMDb Data

How can we discover stocks that will

Using News Articles to Predict Stock Price Movements

Experiments in Web Page Classification for Semantic Web

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Data Mining: STATISTICA

Predicting the NFL Using Twitter. Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah Smith

Machine Learning and Algorithmic Trading

Sentiment analysis on tweets in a financial domain

Big Data Text Mining and Visualization. Anton Heijs

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

5Strategic. decisions for a sound investment policy

An Introduction to. Metrics. used during. Software Development

Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction

Data Integration: Financial Domain-Driven Approach

ecommerce Web-Site Trust Assessment Framework Based on Web Mining Approach

Stock price reaction to analysts EPS forecast error

Introduction to Big Data Science

Patent Big Data Analysis by R Data Language for Technology Management

CSCI 5417 Information Retrieval Systems Jim Martin!

Machine Learning in Statistical Arbitrage

FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

Automated Content Analysis of Discussion Transcripts

QUANTIFIED THE IMPACT OF AGILE. Double your productivity. Improve Quality by 250% Balance your team performance. Cut Time to Market in half

A Sentiment Detection Engine for Internet Stock Message Boards

Textual Analysis of Stock Market Prediction Using Financial News Articles

A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks

CS224N Final Project: Sentiment analysis of news articles for financial signal prediction

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Digital vs. Analog Volume Controls

Statistical Feature Selection Techniques for Arabic Text Categorization

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING

II. RELATED WORK. Sentiment Mining

Big Data Analytics CSCI 4030

Effectiveness of term weighting approaches for sparse social media text sentiment analysis

A Proposed Prediction Model for Forecasting the Financial Market Value According to Diversity in Factor

Content-Based Recommendation

best technical analysis trading software

A Decision Support Approach based on Sentiment Analysis Combined with Data Mining for Customer Satisfaction Research

Technical Report. The KNIME Text Processing Feature:

New Developments in the Automatic Classification of Records. Inge Alberts, André Vellino, Craig Eby, Yves Marleau

ACQUA : Application-level Quality of Experience at Internet Access

Sentiment Analysis on Twitter with Stock Price and Significant Keyword Correlation. Abstract

Segmentation and Classification of Online Chats

TURKISH ORACLE USER GROUP

Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.

Semantic Sentiment Analysis of Twitter

A Web Content Mining Approach for Tag Cloud Generation

SOPS: Stock Prediction using Web Sentiment

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Keep Decypha-ing! What s in it for You?

Movie Classification Using k-means and Hierarchical Clustering

Data Mining - Evaluation of Classifiers

DSIP List (Diversified Stock Income Plan)

Data Mining. Nonlinear Classification

Probabilistic topic models for sentiment analysis on the Web

Maschinelles Lernen mit MATLAB

FABRICATION DRAWINGS A Paradigm Shift

Transcription:

PREDICTING MARKET VOLATILITY FROM FEDERAL RESERVE BOARD MEETING MINUTES Reza Bosagh Zadeh and Andreas Zollmann Lab Advisers: Noah Smith and Bryan Routledge

GOALS Make Money! Not really. Find interesting patterns in meeting minutes Meetings happen roughly 10 times a year Interest rate changes are decided, along with other qualitative assessments of US Economy Minutes freely available on the web for meetings from 1967 to 2008 Idea : Use established tools from NLP and ML

Efficient market hypothesis: excess returns per unit risk cannot be consistently generated using public information Stock prices react on news in split-seconds Automated analysis can outperform humans because of processing speed Strongest form of EMH: all market correction due to insiders Gidofalvi and Elkan (2003): News-based prediction model has highest predictive accuracy over the 20 minutes trading window before the publication time of the respective article

PREVIOUS ATTEMPTS Source: Text Mining Systems for Market Response to News: A Survey, Marc-André Mittermayer, Gerhard F. Knolmayer 2006

PAST WORK: FEATURES USED IN TEXT-BASED PREDICTION BOW, BObigrams, NPs, NNPs, NEs (frequency, TF-IDF score, or information gain) Lerman et al. (COLING 08): News-focus features: change in occurrence frequency of a word in the current day's news coverage compared to the average news coverage of the past N days Dependency features

CLASSIFICATION VS. REGRESSION Most past work: predicts increase or decrease in prices/volatility Kogan et al. (NAACL 09): predict indicator (stock volatility) directly using Support Vector Regression

TAPPING THE FED Our aim: use FOMC meeting minutes to predict financial indicators No previous attempts to our knowledge Boukus & Rosenberg: market participants do extract complex signals from these minutes found correlations of e.g. Treasury yields with specific themes of the meeting minutes using Latent Semantic Analysis

OUR ATTEMPT Predicting Prices is too hard. Focus on Volatility: Predict volatility of S&P 500 13-week Treasury Bills 10-year Treasury Notes

MACHINE LEARNING SETTING Take meeting minutes from minutes held on day t, and predict volatility n (look-ahead) days ahead. Not I.I.D. training data at all. But let s hide that under the carpet. Features: Bag of Words: unigrams, bigrams Dependency fragments Volatility from n days ago

FEATURES: BAG OF WORDS TF-IDF: IDF dampens effect of common words Log1P: Don t really need IDF since we already removed stopwords

DATA MINING Obtained meeting text from PDF and HTML files available at http://www.federalreserve.gov/monetarypolicy/fomc.htm Our corpus available at http://rezab.ca/useful/fomc_minutes.html Stemmed using Porter 2 stemmer. Removed stop-words using available online list.

PREDICT WHAT? Regression: Actual value of the volatility Classification: Two classes, volatility goes UP or DOWN First set of experiments: Classification for different indices.

S&P 500 CLASSIFICATION SHORT PERIODS Higher is better.

S&P 500 CLASSIFICATION LONG PERIODS

10 YEAR TREASURY NOTE CLASSIFICATION SHORT PERIODS

13 WEEK TREASURY BILLS CLASSIFICATION SHORT PERIODS

13 WEEK TREASURY BILLS CLASSIFICATION LONG PERIODS

EXAMPLE DECISION TREE This had 64% Accuracy

SVM PROMINENT TERMS

CONCLUSIONS FROM CLASSIFICATION Shorter Periods are easier to predict than longer periods 13 Week Treasury bills are easier to predict than S&P 500 and 10 Year Treasury notes Bigrams don t help in our case Dependency fragments don t help either Now onto Regression

S&P 500 REGRESSION SHORT PERIODS Now lower is better.

S&P 500 REGRESSION LONG PERIODS

CONCLUSIONS FROM REGRESSION Regression for S&P 500 is hard can t beat simple straw man baseline using only words Oddly enough, training on the previous volatility does worse than just predicting the previous volatility. Over-fitting happening with just two dimensions very surprising, a testament to the difficulty of the problem.