Why Big Data is not Big Hype in Economics and Finance?

Ariel M. Viale, Marshall E. Rinker School of Business, Palm Beach Atlantic University, West Palm Beach, April 2015

1. The Big Data Hype
2. Big Data as a Resourceful Toolbox
3. Big Data or Big Mistake?

What is Big Data?

A vague term often thrown around by people with something to sell (Harford, 2014). After the success of Google Flu Trends, it has been taken for granted as a quick, accurate, cheap, and theory-free method for understanding the world through data. More generally, what is referred to as big data is what we know as found data, i.e., the digital exhaust of web searches, credit card payments, mobile phones, etc.: any data set that is relatively cheap to collect given its size, is less structured, has high dimensionality, and can be updated in real time.

The Four Pillars of the Faith

- It gets uncannily accurate results.
- Causation has been knocked off its pedestal.
- N = All; consequently, sampling does not matter.
- The numbers speak for themselves (Wired). So it is theory-free!

The Four Pillars of the Faith: Accuracy

Four years after the Google Flu Trends publication in Nature, the flu outbreak claimed an unexpected victim: Google Flu Trends. When the slow-and-steady data from the CDC arrived, they showed that Google's estimates were overstated by almost a factor of two.

The Four Pillars of the Faith: Theory-free

Theory-free analysis of correlations is inevitably fragile. If you have no idea what is behind a correlation, you have no idea what might cause the correlation to break down. Statisticians have spent the past 200 years figuring out what traps lie in wait when we try to understand the world using data.

The Four Pillars of the Faith: N = All

When it comes to data, size isn't everything. N = All is not a good description of found data sets. Take Twitter as an example: Twitter users are not representative of the population as a whole. As any well-trained economist knows, a randomly chosen sample might not reflect the underlying population (sampling error), and the sample might not have been randomly chosen at all (sample selection bias). N = All is just an assumption rather than a fact about the data.
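
The difference between sampling error and sample selection bias can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population, the link between age and opinion, and the rule that younger people are more likely to end up in the "found" data are all hypothetical, as is the function name `simulate_selection_bias`.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simulate_selection_bias(population_size=100_000, seed=0):
    """Compare a small random sample with a large self-selected sample.

    Assumption (hypothetical): opinion rises with age, and younger people
    are more likely to show up in the found data set.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(population_size):
        age = rng.uniform(18, 80)
        opinion = 0.05 * age + rng.gauss(0, 1)
        population.append((age, opinion))
    true_mean = mean([opinion for _, opinion in population])

    # Small random sample: subject to sampling error only.
    random_sample = rng.sample(population, 1_000)
    random_mean = mean([opinion for _, opinion in random_sample])

    # Large found sample: inclusion probability falls linearly with age,
    # from 1 at age 18 to 0 at age 80 (selection bias).
    found_sample = [
        (age, opinion)
        for age, opinion in population
        if rng.random() < (80 - age) / 62
    ]
    found_mean = mean([opinion for _, opinion in found_sample])
    return true_mean, random_mean, found_mean, len(found_sample)


if __name__ == "__main__":
    true_mean, random_mean, found_mean, n_found = simulate_selection_bias()
    print(f"population mean:            {true_mean:.3f}")
    print(f"random sample (n=1000):     {random_mean:.3f}")
    print(f"found sample  (n={n_found}): {found_mean:.3f}")
```

Under these assumptions the found sample is far larger than the random one, yet its mean sits further from the population mean, because more observations do not correct for who is missing.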

The Four Pillars of the Faith: Statistical Sorcery?

Without careful analysis, the ratio of genuine patterns and correlations to spurious ones (the signal-to-noise ratio) quickly tends to zero.
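
A rough way to see why the signal-to-noise ratio collapses is to search many unrelated series for a correlation with a target. The sketch below uses pure noise throughout, so every "pattern" it finds is spurious by construction; the sample size, the threshold of 0.3, and the function names are illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def count_spurious(n_obs=50, n_predictors=1000, threshold=0.3, seed=1):
    """Count noise predictors whose |correlation| with a noise target
    exceeds the threshold. No genuine relationship exists in the data."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_predictors):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(target, candidate)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(k, "candidates ->", count_spurious(n_predictors=k), "spurious hits")
```

The number of genuine relationships stays at zero while the count of apparent ones grows with the number of candidates screened, which is the multiple-comparisons trap in miniature.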

Helpful Data-Driven Predictive Tools

- Machine Learning and Pattern Recognition Analysis.
- Clustering Analysis and Classification Algorithms.
- Neural Networks.
- Directed Acyclic Graphs.
- Bayesian Networks, etc.

With Same Old Problems

- Overfitting.
- Stationarity.
- Lucas Critique: if the predictive model is used to decide on a policy intervention, the final result may not be what the model predicts, because the policy change is anticipated and behavior changes.

It is somewhat ironic, considering that some of these techniques were developed in Computer Science to gain insight into the problem of causality. Judea Pearl's book Causality is foundational reading in Computer Science, in both Artificial Intelligence and Machine Learning.
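
Overfitting, the first of these problems, can be sketched by comparing a simple fitted line with a model that merely memorizes its training data. Everything here is an illustrative assumption: the data-generating process (y = 2x plus noise), the one-nearest-neighbor "memorizer", and all function names.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def nearest_neighbor_predict(train_x, train_y, x):
    """Predict with the outcome of the closest training point (memorization)."""
    idx = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[idx]


def mse(predictions, actual):
    """Mean squared prediction error."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actual)) / len(actual)


def make_data(rng, n):
    """Hypothetical data: y = 2x + standard normal noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def overfitting_demo(n=500, seed=2):
    rng = random.Random(seed)
    train_x, train_y = make_data(rng, n)
    test_x, test_y = make_data(rng, n)
    intercept, slope = fit_line(train_x, train_y)

    def line(x):
        return intercept + slope * x

    def memo(x):
        return nearest_neighbor_predict(train_x, train_y, x)

    return {
        "line_in": mse([line(x) for x in train_x], train_y),
        "line_out": mse([line(x) for x in test_x], test_y),
        "memo_in": mse([memo(x) for x in train_x], train_y),
        "memo_out": mse([memo(x) for x in test_x], test_y),
    }


if __name__ == "__main__":
    for name, value in overfitting_demo().items():
        print(f"{name}: {value:.3f}")
```

The memorizing model reproduces its training data perfectly but does worse than the line on fresh data, which is why in-sample fit alone says little about predictive value.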

Some Recent Applications

- Use of Government Administrative (Big) Data. For example, Piketty and Saez (2003) used IRS data to derive a historical series of income shares for top-percentile earners among US households and gain some insight into income inequality.
- To obtain new measures of private economic activity. For example, the Billion Prices Project (BPP), developed by Alberto Cavallo and Roberto Rigobon at MIT, publishes an alternative measure of retail price inflation obtained from online retail websites in more than fifty countries.
- Improving Government policymaking. For example, the Federal Reserve made its FRED service publicly available and integrated it into popular software such as Office, EViews, and Quandl.
- Use of highly granular data to reveal the role of specific institutional details and variations at the micro level that would otherwise be difficult to isolate. For example, in understanding the markets, the new hype in Finance relies on High Frequency Trading (HFT) data and Market Microstructure models to get a better understanding of the price discovery process.

How can we benefit from using Big Data without making a big mistake?

By treating it as another important resource for anyone analyzing data, not as a silver bullet. Keep in mind that some of the conceptual approaches, statistical methods, and challenges of Big Data are old and familiar to economists.

Challenges

- Data access: the TAQ and TORQ HFT databases from NYSE and NASDAQ are only accessible through WRDS. Other data sets are proprietary, e.g., FOREX signed order flow from dealers.
- Data processing: handling and cleaning messy data requires specific algorithms and deep knowledge of the data, for example the Lee & Ready method used with time-stamped HFT data. Most of the techniques require programming skills in software capable of managing large data sets: SQL, R, SAS, Matlab, Python, etc.
- Asking the right questions: the only way to avoid the trap of spurious inference is to get some formal training in the conceptual framework that seeks to explain the relations driving the data. Theory does matter! Formal statistical robustness checks and methods are a must! As an example, in Finance, when it comes to HFT data, we rely on formal, sometimes heavyweight econometrics and two canonical Market Microstructure models: 1) the Glosten-Milgrom dealer model; and 2) Kyle's model of the informed trader. Both are rooted in Microeconomics.
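
As an illustration of the data processing step, the following is a minimal sketch of trade classification in the spirit of the Lee & Ready method: a trade above the quote midpoint is treated as buyer-initiated, a trade below it as seller-initiated, and a trade at the midpoint is signed by the direction of the last price change (the tick test). It is a simplification, not the full procedure: it assumes each trade already comes paired with its prevailing bid and ask, ignores any timing adjustment between quotes and trades, and uses a hypothetical input layout with prices in integer cents.

```python
def classify_trades(trades):
    """Sign trades as +1 (buy), -1 (sell), or 0 (unclassified).

    `trades` is a time-ordered list of (price, bid, ask) tuples, with
    prices in integer cents (a hypothetical layout chosen to avoid
    floating-point comparisons at the midpoint).
    """
    signs = []
    last_price = None
    last_tick_sign = 0  # direction of the most recent non-zero price change
    for price, bid, ask in trades:
        if last_price is not None and price != last_price:
            last_tick_sign = 1 if price > last_price else -1

        midpoint = (bid + ask) / 2
        if price > midpoint:
            sign = 1            # quote rule: closer to the ask
        elif price < midpoint:
            sign = -1           # quote rule: closer to the bid
        else:
            sign = last_tick_sign  # tick test at the midpoint

        signs.append(sign)
        last_price = price
    return signs


if __name__ == "__main__":
    sample = [
        (1002, 1000, 1002),  # at the ask
        (1000, 1000, 1002),  # at the bid
        (1001, 1000, 1002),  # at the midpoint after an uptick
        (1001, 1000, 1002),  # at the midpoint, zero tick
        (1000, 999, 1001),   # at the midpoint after a downtick
    ]
    print(classify_trades(sample))
```

Summing these signs times trade size over an interval gives a signed order flow series of the kind used as an input to microstructure models.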

Liran Einav and Jonathan Levin (2013). The data revolution and economic analysis. NBER Working Paper, NBER and Stanford University.
Tim Harford (2014). Big data: Are we making a big mistake? Financial Times, March 28, 2014.


More information

Algorithmic Trading Session 1 Introduction. Oliver Steinki, CFA, FRM

Algorithmic Trading Session 1 Introduction. Oliver Steinki, CFA, FRM Algorithmic Trading Session 1 Introduction Oliver Steinki, CFA, FRM Outline An Introduction to Algorithmic Trading Definition, Research Areas, Relevance and Applications General Trading Overview Goals

More information

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open

More information

MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

More information

Data Science and Prediction*

Data Science and Prediction* Data Science and Prediction* Vasant Dhar Professor Editor-in-Chief, Big Data Co-Director, Center for Business Analytics, NYU Stern Faculty, Center for Data Science, NYU *Article in Communications of the

More information

Machine learning for algo trading

Machine learning for algo trading Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with

More information

Algorithmic Presentation to European Central Bank. Jean-Marc Orlando, EFX Global Head BNP PARIBAS

Algorithmic Presentation to European Central Bank. Jean-Marc Orlando, EFX Global Head BNP PARIBAS Algorithmic Presentation to European Central Bank Jean-Marc Orlando, EFX Global Head BNP PARIBAS 1 What s all the BUZZ about Algorithmic Trading /efx? 2 Why is Algorithmic Trading Exploding in the industry?

More information

Nancy Cartwright, Hunting Causes and Using Them: Approaches in Philosophy and Economics

Nancy Cartwright, Hunting Causes and Using Them: Approaches in Philosophy and Economics Review of Nancy Cartwright, Hunting Causes and Using Them: Approaches in Philosophy and Economics Kevin D. Hoover Departments of Economics and Philosophy Duke University Box 90097 Durham, North Carolina

More information

Predictive Modeling and Big Data

Predictive Modeling and Big Data Predictive Modeling and Presented by Eileen Burns, FSA, MAAA Milliman Agenda Current uses of predictive modeling in the life insurance industry Potential applications of 2 1 June 16, 2014 [Enter presentation

More information

Threat Intelligence: The More You Know the Less Damage They Can Do. Charles Kolodgy Research VP, Security Products

Threat Intelligence: The More You Know the Less Damage They Can Do. Charles Kolodgy Research VP, Security Products Threat Intelligence: The More You Know the Less Damage They Can Do Charles Kolodgy Research VP, Security Products IDC Visit us at IDC.com and follow us on Twitter: @IDC 2 Agenda Evolving Threat Environment

More information

PREDICTIVE ANALYTICS: PROVIDING NOVEL APPROACHES TO ENHANCE OUTCOMES RESEARCH LEVERAGING BIG AND COMPLEX DATA

PREDICTIVE ANALYTICS: PROVIDING NOVEL APPROACHES TO ENHANCE OUTCOMES RESEARCH LEVERAGING BIG AND COMPLEX DATA PREDICTIVE ANALYTICS: PROVIDING NOVEL APPROACHES TO ENHANCE OUTCOMES RESEARCH LEVERAGING BIG AND COMPLEX DATA IMS Symposium at ISPOR at Montreal June 2 nd, 2014 Agenda Topic Presenter Time Introduction:

More information

Customer Relationship Management using Adaptive Resonance Theory

Customer Relationship Management using Adaptive Resonance Theory Customer Relationship Management using Adaptive Resonance Theory Manjari Anand M.Tech.Scholar Zubair Khan Associate Professor Ravi S. Shukla Associate Professor ABSTRACT CRM is a kind of implemented model

More information

B2B opportunity predictiona Big Data and Advanced. Analytics Approach. Insert

B2B opportunity predictiona Big Data and Advanced. Analytics Approach. Insert B2B opportunity predictiona Big Data and Advanced Analytics Approach Vodafone Global Enterprise Manu Kumar, Head of Targeting, Optimization & Data Science Insert Agenda Why B2B opportunities are hard to

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

Delivering new insights and value to consumer products companies through big data

Delivering new insights and value to consumer products companies through big data IBM Software White Paper Consumer Products Delivering new insights and value to consumer products companies through big data 2 Delivering new insights and value to consumer products companies through big

More information

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

More information

No BI without Machine Learning

No BI without Machine Learning No BI without Machine Learning Francis Pieraut francis@qmining.com http://fraka6.blogspot.com/ 10 March 2011 MTI-820 ETS Too Much Data Supervised Learning (classification) Unsupervised Learning (clustering)

More information

Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016

Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016 Event driven trading new studies on innovative way of trading in Forex market Michał Osmoła INIME live 23 February 2016 Forex market From Wikipedia: The foreign exchange market (Forex, FX, or currency

More information

Financial Markets. Itay Goldstein. Wharton School, University of Pennsylvania

Financial Markets. Itay Goldstein. Wharton School, University of Pennsylvania Financial Markets Itay Goldstein Wharton School, University of Pennsylvania 1 Trading and Price Formation This line of the literature analyzes the formation of prices in financial markets in a setting

More information

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION Pilar Rey del Castillo May 2013 Introduction The exploitation of the vast amount of data originated from ICT tools and referring to a big variety

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

The impact of social media is pervasive. It has

The impact of social media is pervasive. It has Infosys Labs Briefings VOL 12 NO 1 2014 Social Enablement of Online Trading Platforms By Sivaram V. Thangam, Swaminathan Natarajan and Venugopal Subbarao Socially connected retail stock traders make better

More information

Supply Chain Best Practice: Demand Planning Using Point-of-Sale Data. An Oracle White Paper Updated October 2006

Supply Chain Best Practice: Demand Planning Using Point-of-Sale Data. An Oracle White Paper Updated October 2006 Supply Chain Best Practice: Demand Planning Using Point-of-Sale Data An Oracle White Paper Updated October 2006 Supply Chain Best Practice: Demand Planning Using Point-of-Sale Data Multiple forecasts based

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

News Trading and Speed

News Trading and Speed News Trading and Speed Thierry Foucault, Johan Hombert, Ioanid Roşu (HEC Paris) 6th Financial Risks International Forum March 25-26, 2013, Paris Johan Hombert (HEC Paris) News Trading and Speed 6th Financial

More information

Machine Learning. 01 - Introduction

Machine Learning. 01 - Introduction Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

More information

Machine Learning and Algorithmic Trading

Machine Learning and Algorithmic Trading Machine Learning and Algorithmic Trading In Fixed Income Markets Algorithmic Trading, computerized trading controlled by algorithms, is natural evolution of security markets. This area has evolved both

More information

Marketers: the future is ready for you now

Marketers: the future is ready for you now Content themes: Marketers: the future is ready for you now Brain game Finding faster growth Connected world Research excellence Extensive research from TNS proves that social media and search data can

More information

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots? Class 1 Data Mining Data Mining and Artificial Intelligence We are in the 21 st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.

More information

Data Science Will computer science and informatics eat our lunch?

Data Science Will computer science and informatics eat our lunch? Data Science Will computer science and informatics eat our lunch? Thomas Lumley University of Auckland (g)tslumley statschat.org.nz notstat schat.tumblr.com In the 1920s, the computing labs helped establish

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data 100 001 010 111 From Raw Data to 10011100 Actionable Insights with 00100111 MATLAB Analytics 01011100 11100001 1 Access and Explore Data For scientists the problem is not a lack of available but a deluge.

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Social Media Implementations

Social Media Implementations SEM Experience Analytics Social Media Implementations SEM Experience Analytics delivers real sentiment, meaning and trends within social media for many of the world s leading consumer brand companies.

More information

A Proposal for the use of Artificial Intelligence in Spend-Analytics

A Proposal for the use of Artificial Intelligence in Spend-Analytics A Proposal for the use of Artificial Intelligence in Spend-Analytics Mark Bishop, Sebastian Danicic, John Howroyd and Andrew Martin Our core team Mark Bishop PhD studied Cybernetics and Computer Science

More information

Getting the Most from Demographics: Things to Consider for Powerful Market Analysis

Getting the Most from Demographics: Things to Consider for Powerful Market Analysis Getting the Most from Demographics: Things to Consider for Powerful Market Analysis Charles J. Schwartz Principal, Intelligent Analytical Services Demographic analysis has become a fact of life in market

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index

Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index Rickard Nyman *, Paul Ormerod Centre for the Study of Decision Making Under Uncertainty,

More information

A Trading Strategy Based on the Lead-Lag Relationship of Spot and Futures Prices of the S&P 500

A Trading Strategy Based on the Lead-Lag Relationship of Spot and Futures Prices of the S&P 500 A Trading Strategy Based on the Lead-Lag Relationship of Spot and Futures Prices of the S&P 500 FE8827 Quantitative Trading Strategies 2010/11 Mini-Term 5 Nanyang Technological University Submitted By:

More information

Financial Econometrics and Volatility Models Introduction to High Frequency Data

Financial Econometrics and Volatility Models Introduction to High Frequency Data Financial Econometrics and Volatility Models Introduction to High Frequency Data Eric Zivot May 17, 2010 Lecture Outline Introduction and Motivation High Frequency Data Sources Challenges to Statistical

More information

Spam Filtering using Naïve Bayesian Classification

Spam Filtering using Naïve Bayesian Classification Spam Filtering using Naïve Bayesian Classification Presented by: Samer Younes Outline What is spam anyway? Some statistics Why is Spam a Problem Major Techniques for Classifying Spam Transport Level Filtering

More information

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577 T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or

More information

The Intersection of Big Data, Data Science, and The Internet of Things

The Intersection of Big Data, Data Science, and The Internet of Things The Intersection of Big Data, Data Science, and The Internet of Things Bebo White SLAC National Accelerator Laboratory/ Stanford University bebo@slac.stanford.edu SLAC is a US national laboratory operated

More information

Role Description. Position of a Data Scientist Machine Learning at Fractal Analytics

Role Description. Position of a Data Scientist Machine Learning at Fractal Analytics Opportunity to work with leading analytics firm that creates Insights, Impact and Innovation. Role Description Position of a Data Scientist Machine Learning at Fractal Analytics March 2014 About the Company

More information

Performance optimization in retail business using real-time predictive analytics

Performance optimization in retail business using real-time predictive analytics Lecture Notes in Management Science (2015) Vol. 7, 45 49 ISSN 2008-0050 (Print), ISSN 1927-0097 (Online) Performance optimization in retail business using real-time predictive analytics Nizar Zaarour 1

More information

Exploring Big Data in Social Networks

Exploring Big Data in Social Networks Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about

More information

Certificate Program in Applied Big Data Analytics in Dubai. A Collaborative Program offered by INSOFE and Synergy-BI

Certificate Program in Applied Big Data Analytics in Dubai. A Collaborative Program offered by INSOFE and Synergy-BI Certificate Program in Applied Big Data Analytics in Dubai A Collaborative Program offered by INSOFE and Synergy-BI Program Overview Today s manager needs to be extremely data savvy. They need to work

More information

10/24/2015. Review the extant Marketing Literature to provide initial answers to the MSI research priorities. Review Big Marketing Data Analytics

10/24/2015. Review the extant Marketing Literature to provide initial answers to the MSI research priorities. Review Big Marketing Data Analytics Review the extant Marketing Literature to provide initial answers to the MSI research priorities Review Big Marketing Data Analytics Identify open issues and an outlook for the future Our Framework Types

More information

Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A. www.statlit.org/pdf/2014-schield-dsi2-slides.pdf 1

Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A. www.statlit.org/pdf/2014-schield-dsi2-slides.pdf 1 Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A 1 Business Analytics vs. Data Science by Milo Schield Member: International Statistical Institute US Rep: International Statistical Literacy

More information

Big Data and Economics, Big Data and Economies. Susan Athey, Stanford University Disclosure: The author consults for Microsoft.

Big Data and Economics, Big Data and Economies. Susan Athey, Stanford University Disclosure: The author consults for Microsoft. Big Data and Economics, Big Data and Economies Susan Athey, Stanford University Disclosure: The author consults for Microsoft. Lenses on big data 1. The science and practice of using big data 2. Management

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Machine Learning and Econometrics. Hal Varian Jan 2014

Machine Learning and Econometrics. Hal Varian Jan 2014 Machine Learning and Econometrics Hal Varian Jan 2014 Definitions Machine learning, data mining, predictive analytics, etc. all use data to predict some variable as a function of other variables. May or

More information

Advanced analytics at your hands

Advanced analytics at your hands 2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously

More information

Characterizing Task Usage Shapes in Google s Compute Clusters

Characterizing Task Usage Shapes in Google s Compute Clusters Characterizing Task Usage Shapes in Google s Compute Clusters Qi Zhang University of Waterloo qzhang@uwaterloo.ca Joseph L. Hellerstein Google Inc. jlh@google.com Raouf Boutaba University of Waterloo rboutaba@uwaterloo.ca

More information

Students will become familiar with the Brandeis Datastream installation as the primary source of pricing, financial and economic data.

Students will become familiar with the Brandeis Datastream installation as the primary source of pricing, financial and economic data. BUS 211f (1) Information Management: Financial Data in a Quantitative Investment Framework Spring 2004 Fridays 9:10am noon Lemberg Academic Center, Room 54 Prof. Hugh Lagan Crowther C (781) 640-3354 hugh@crowther-investment.com

More information

Cross Validation. Dr. Thomas Jensen Expedia.com

Cross Validation. Dr. Thomas Jensen Expedia.com Cross Validation Dr. Thomas Jensen Expedia.com About Me PhD from ETH Used to be a statistician at Link, now Senior Business Analyst at Expedia Manage a database with 720,000 Hotels that are not on contract

More information

AN INTRODUCTION TO BACKTESTING WITH PYTHON AND PANDAS

AN INTRODUCTION TO BACKTESTING WITH PYTHON AND PANDAS AN INTRODUCTION TO BACKTESTING WITH PYTHON AND PANDAS Michael Halls-Moore - QuantStart.com WHAT S THIS TALK ABOUT? A talk of two halves! In the first half we talk about quantitative trading and backtesting

More information

Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

More information

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

More information