How To Improve Data Quality

Size: px
Start display at page:

Download "How To Improve Data Quality"

Transcription

1 Big Data Promises and Pitfalls David J. Hand Imperial College, London and Winton Capital Management July 2015 Policy making in the Big Data Era 1

2 The world of data is changing Not something which happens and we can then settle into the new world Rather change is the only constant tomorrow s IT will be different from today s (think Facebook, Twitter, YouTube, Instagram, etc) The future you have tomorrow won't be the same future you had yesterday. Chuck Palahniuk In times of change, learners inherit the Earth, while the learned find themselves beautifully equipped to deal with a world that no longer exists. Eric Hoffer Policy making in the Big Data Era 2

3 What s new in today s world of data? source of data: automatic acquisition of data speed of acquisition of data: streaming data size of data sets: big data diversity of data complexity of data Policy making in the Big Data Era 3

4 Source: Modern data capture technologies Automatic data collection: electronic measurements: point of sale credit card terminals, petrol pumps, contactless travel cards, phone records, s, GPS, CCTV cameras,... Properties of automatic data collection: immediate complete??? untouched by human hands??? Policy making in the Big Data Era 4

5 Internet of things Social media data data directly from the web Administrative data Data not primarily collected for research purposes e.g. supermarket purchases, credit card transactions, tax records, education records, health records, transport movements,... Administrative data research is secondary analysis, so may not be ideal for the research purpose issues of consent may arise changes to the collection procedures may change nature of data quality issues different from those of surveys selection distortion who s in the database? Policy making in the Big Data Era 5

6 The Administrative Data Research Network Aim: to facilitate access to and linkage of de identified administrative data routinely collected by government departments and other public sector organisations Four centres: England, NI, Scotland, Wales + ADS Partnerships with Nat Stats Institutes UK wide governance Safe and secure data access Accredited researchers Public engagement Policy making in the Big Data Era 6

7 Speed: realtime data collection and analysis This has several major implications, threatening the position of NSIs? e.g. timeliness e.g. analytic tools and methods Policy making in the Big Data Era 7

8 Timeliness The balance of timeliness against accuracy Example: UK GDP 1 st estimate: 44% of the data available by 25 days, 2 nd estimate: 88% by 55 days, 3 rd estimate: 85 days Policy making in the Big Data Era 8

9 What about inflation rate?: Elaborate procedure to collect sample data Contrast with direct recording from transactions: Google has created a new inflation measure the Google Price Index based on the cost of goods sold online which could prove more accurate and up to date than official statistics. Google's mountain of web shopping data could also be used by the online group for economic forecasting ahead of the publication of official statistics, the Financial Times reported. Google's chief economist Hal Varian said that he is working on predicting the present by using real time search data to forecast official figures which often are published at least a month after the period they cover. Policy making in the Big Data Era 9

10 Analytic tools and methods Streaming data : the data keep on coming, like water from a hose Permanently executing analytic tools, processing the data as it arrives anomalies changes summaries (trends, averages, variability, maxima,...) Realtime automatic analysis Contrast the more familiar: a fixed database Policy making in the Big Data Era 10

11 Note: perhaps we cannot store the data once it has been processed In that case we need to know what questions we will ask We cannot later ask arbitrary questions, but only those that can be answered from our summary statistics Summarising a stream Subsetting a stream: sampling, but requires different approaches from classical survey sampling Filtering a stream: accept only those cases which meet some criterion Policy making in the Big Data Era 11

12 Size: Large data sets Census: the largest set of data collected by NSIs until relatively recently? But now administrative data, register based data some countries (e.g. Scandinavian) ahead of the field transaction data social media, Google searches, twitter messages, transaction logs, phone logs, transport logs,... Policy making in the Big Data Era 12

13 Diversity of data Survey, census, administrative, transaction, experimental,... Numerical tables, image, text, signal, networks,... Different kinds of data have different properties e.g. survey data: answers to the questions you choose but slow and expensive to collect, response bias? e.g. transaction data: fine granularity, both spatial and temporal, immediate, but may not address the question you want an opportunity: Perhaps data of different kinds can be combined synergistically, to overcome the problems of each individual kind Policy making in the Big Data Era 13

14 Stitching different kinds of data together Linking Matching Merging Technical challenges have begun to be addressed in different fields e.g. medical combining information from scans with traditional numeric and text files Great potential benefit from cross disciplinary collaboration and awareness Policy making in the Big Data Era 14

15 survey and census data is what people say: administrative and transaction data is what people do Policy making in the Big Data Era 15

16 survey and census data is what people say: administrative and transaction data is what people do New forms of data are closer to social reality? Policy making in the Big Data Era 16

17 Complexity of data Especially, networks and linked data Social networks Cybersecurity Fraud detection Policy making in the Big Data Era 17

18 Ping Pong Fraud Ring Model (HBOS Data Mining Team, 2007) Within the Mortgage Industry, in addition to individual fraudulent applications, fraud can also occur in groups often referred to as Fraud Rings. These groups can consist of applicants, brokers and/or other professionals that get together to cheat the system. The purpose of the Ping Pong Fraud Ring Model is to assist the Mortgage Fraud team in identifying these Fraud Rings: 1. Ping Pong clustering: aggregating applications that have common names, addresses, telephone numbers, employers, etc 2. Ranking: inconsistency rules are used to rank the clusters in order of severity. 3. Linkage analysis: identifying links between customers, brokers, employers, and/or solicitors. Policy making in the Big Data Era 18

19 OTHER CRITICAL ISSUES Data quality most big data that have received popular attention are not the output of instruments designed to produce valid and reliable data amenable for scientific analysis Lazer et al 2014 Caution: No data set is without potential quality issues Role and reputation of NSIs, as standard bearers of quality Policy making in the Big Data Era 19

20 Example: Mistaken idea that admin data/transaction data has no quality issues Quality Assurance of Administrative Data Administrative Data Quality Assurance Toolkit UK Statistics Authority, January 2015 Policy making in the Big Data Era 20

21 Top Tips UKSA Administrative Data Quality Assurance Toolkit, 2015 These five tips summarise the main pointers for statistical producers to develop a good understanding of the quality issues of administrative data: Don t trust the safeguards: Check if safeguards are functioning effectively Get involved: work and share with suppliers, such as through secondments and webinars, to develop a common understanding Raise a red flag: identify potential data quality concerns using input and output quality indicators and investigate anomalies See the big picture: identify what investigations and audits have been conducted and what they found Corroborate the evidence: confirm the levels and the trends shown by the stats derived from the admin data Policy making in the Big Data Era 21

22 Example: transaction implies a complex sociological selection process e.g. reject inference in credit scoring Policy making in the Big Data Era 22

23 Example: Automatic data capture can go wrong Wind speed in metres per second, against time Policy making in the Big Data Era 23

24 Example: Proportion of homeowner missing values vs time Policy making in the Big Data Era 24

25 The rate of change of technology Implications: for series based on a particular technology, what happens when it changes, or disappears altogether (Windows XP) surveys have been around forever, but Google Trends? Policy making in the Big Data Era 25

26 Statistical complications manipulation vs inference two types of statistical model regression to the mean selection bias individual behaviour vs population behaviour impact of stated targets, feedback, gaming, Policy making in the Big Data Era 26

27 Data manipulation: Sorting, adding, selecting, matching, concatenating, aggregating, etc mathematical/computing challenges Examples Google maps to find directions apps for train and bus status find the shop with the item you seek locate that out of print book identify the name of that piece of music Policy making in the Big Data Era 27

28 Inference A statement about the entire population, some other population, the future, etc. statistical challenges Examples which teaching strategies are most effective? social influences on sport participation and consequent impact on population health and on NHS costs Policy making in the Big Data Era 28

29 Two types of statistical model Theory driven Based on underlying theories of social behaviour, policy impact, economics, etc e.g. microeconomic theories of how agents interact Data driven Based purely on empirical relationships found in the data e.g. empirical logistic regression trees in credit scoring Policy making in the Big Data Era 29

30 Regression to the mean Policy making in the Big Data Era 30

31 Google flu trends In February 2013,... Nature reported that [Google Flu Trends] was predicting more than double the proportion of doctor visits for influenza like illness than the Centers for Disease Control and Prevention, which bases its estimates on surveillance reports from laboratories across the United States... despite the fact that GFT was built to predict CDC reports. Lazer et al 2014 Policy making in the Big Data Era 31

32 Initial version: find best matches from 50 million search terms for 1152 data points Overfitting! part flu detector, part winter detector Updated, 2009: still consistently overestimated flu prevalence (100 out of 108 consecutive weeks); autocorrelated errors, etc Some understanding of classical time series modelling could help Policy making in the Big Data Era 32

33 Correlation = Policy making in the Big Data Era 33

34 Predicting the financial markets Preis T, Moat H.S., and Stanley H.E. (2013) Quantifying trading behavior in financial markets using Google trends. Scientific Reports, 3, By analyzing changes in Google query volumes for search terms related to finance, we find patterns that may be interpreted as early warning signs of stock market moves. Policy making in the Big Data Era 34

35 Chose 98 search terms related to stock markets Implemented a strategy which sold DJIA following an increase in searches and bought following a decrease in searches Concluded: "We find that returns from the Google Trends strategies are significantly higher overall than returns from the random strategies (<R>=0.60; t=8.65, df=97, p<0.001, one sample t test)." But this is an inappropriate test Telling us nothing about the effectiveness of the strategy Policy making in the Big Data Era 35

36 1) the 98 terms were purposively sampled, not a random sample from a population of possible terms: the authors confused fixed and random effects; 2) the standard deviation used in the denominator of their t test is irrelevant to the variability of the mean return: it s the wrong statistic; 3) they ignored the correlation between the random aspect of the different words returns: anyone working in the financial sector will be aware of the critical importance of correlation! Policy making in the Big Data Era 36

37 Google Price Index My earlier quote about the Google Price index was from 2010 Where is the GPI now? [Hal Varian, Google s Chief Economist] responded that the GPI was never intended to be a public project, data source, etc. It was simply a project internal to Google that got hyped up by the press happened to the google price index Policy making in the Big Data Era 37

38 Crime maps For many years, all forces have mapped crimes and incidents to help them focus investigations, analyse hot spots and tackle crime vigorously. The information now on the forces websites has a different more communityfocused perspective and means the public can now look at crime levels in their community simply by putting their postcode into their local police force s website. Policy making in the Big Data Era 38

39 More than 5.2 million people have not reported crimes for fear of deterring home buyers or renters since the online crime map was launched in February 2011 A quarter (24 per cent) of people would not report a crime for fear it would harm their chances of selling or renting their property /news Policy making in the Big Data Era 39

40 Trust, privacy,... people readily give data to supermarkets, travel companies, phone companies, credit card companies,.. government misuse, targeting subgroups. Formal separation of statistical offices from government? Anonymisation De identification Trusted Third Party linkage methods need for sta s cians to learn new skills broader data scientists Policy making in the Big Data Era 40

41 Interaction with the commercial world The credit bureau model: everyone benefits NSI takes fine granularity commercial (e.g. transaction) data, combines it with public data, analyses it, and feeds back the results to the commercial body for a price NSI retains control, protects public Policy making in the Big Data Era 41

42 The skills gap Big data and open data are only useful if you have the skills to extract information New skills are needed What are they? How many NSI statisticians are at ease with real time transaction data? The major sources of inaccuracy in survey samples are (?) selection bias and sampling variability What are the major sources of error in administrative data? Policy making in the Big Data Era 42

43 An ethical framework Context: purpose of collecting data, using data Consent and choice Reasonable breadth and depth of data Substantiated data Ownership of results Fairness to all parties, compensation Considered consequences Access to data by subjects Accountability Policy making in the Big Data Era 43

44 Conclusions huge opportunities technical and social challenges but we need the capability to grasp the opportunities we need people with the skills Policy making in the Big Data Era 44

45 Conclusions huge opportunities technical and social challenges but we need the capability to grasp the opportunities we need people with the skills And above all: Policy making in the Big Data Era 45

46 Data are evidence Policy making in the Big Data Era 46

David J. Hand Imperial College, London

David J. Hand Imperial College, London David J. Hand Imperial College, London Discovery vs distortion the importance of quality in learning from data David J. Hand Imperial College, London and Winton Capital Management 14 July 2015 Learning

More information

Finding Patterns the Challenge of Big Data 1

Finding Patterns the Challenge of Big Data 1 Finding Patterns The Challenge of Big Data David J. Hand Imperial College, London and Winton Capital Management November 2015 Finding Patterns the Challenge of Big Data 1 we are on the cusp of a tremendous

More information

Big Data Hope or Hype?

Big Data Hope or Hype? Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big

More information

Big Data, Official Statistics and Social Science Research: Emerging Data Challenges

Big Data, Official Statistics and Social Science Research: Emerging Data Challenges Big Data, Official Statistics and Social Science Research: Emerging Data Challenges Professor Paul Cheung Director, United Nations Statistics Division Building the Global Information System Elements of

More information

Statistical Challenges with Big Data in Management Science

Statistical Challenges with Big Data in Management Science Statistical Challenges with Big Data in Management Science Arnab Kumar Laha Indian Institute of Management Ahmedabad Analytics vs Reporting Competitive Advantage Reporting Prescriptive Analytics (Decision

More information

The Impact of Big Data on Social Research David Rhind Sharon Witherspoon

The Impact of Big Data on Social Research David Rhind Sharon Witherspoon The Impact of Big Data on Social Research David Rhind Sharon Witherspoon 1 www.nuffieldfoundation.org The landscape to be covered What is Big Data? Just consultants hype? Key questions for SRA Technology

More information

Healthcare data analytics. Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw

Healthcare data analytics. Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw Healthcare data analytics Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw Outline Data Science Enabling technologies Grand goals Issues Google flu trend Privacy Conclusion Analytics

More information

Opportunities and Limitations of Big Data

Opportunities and Limitations of Big Data Opportunities and Limitations of Big Data Karl Schmedders University of Zurich and Swiss Finance Institute «Big Data: Little Ethics?» HWZ-Darden-Conference June 4, 2015 On fortune.com this morning: Apple's

More information

What is Prospect Analytics?

What is Prospect Analytics? What is Prospect Analytics? Everything you need to know about this new sphere of sales and marketing technology and how it can improve your business Table of Contents Executive Summary... 2 The Power of

More information

On promises and pitfalls of big data: A case study of Google Flu Trends

On promises and pitfalls of big data: A case study of Google Flu Trends On promises and pitfalls of big data: A case study of Google Flu Trends Paul Ormerod*, Rickard Nyman* & Alexander Bentley** *Centre for the Study of Decision-Making Uncertainty University College London

More information

Demystifying Big Data. James Rawlins Senior Consultant

Demystifying Big Data. James Rawlins Senior Consultant Demystifying Big Data James Rawlins Senior Consultant What s the Big Idea? Objectives for the session Understand what is meant by Big Data Provide some context around what is Big Are we big? Understand

More information

Administrative Data Quality Assurance Toolkit

Administrative Data Quality Assurance Toolkit Administrative Data Quality Assurance Toolkit Version 1 January 2015 1 Administrative Data Quality Assurance Toolkit This toolkit is intended to help statistical assessors review the areas of practice

More information

Big Data for Social Good. Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research

Big Data for Social Good. Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research Big Data for Social Good Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research 6.8 billion subscriptions 96% of world s population (ITU) Mobile penetration of 120%

More information

Data Protection Act. Conducting privacy impact assessments code of practice

Data Protection Act. Conducting privacy impact assessments code of practice Data Protection Act Conducting privacy impact assessments code of practice 1 Conducting privacy impact assessments code of practice Data Protection Act Contents Information Commissioner s foreword... 3

More information

WSIC Integrated Care Record FAQs

WSIC Integrated Care Record FAQs WSIC Integrated Care Record FAQs How your information is shared now Today, all the places where you receive care keep records about you. They can usually only share information from your records by letter,

More information

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems

More information

Big Data Big Security Problems? Ivan Damgård, Aarhus University

Big Data Big Security Problems? Ivan Damgård, Aarhus University Big Data Big Security Problems? Ivan Damgård, Aarhus University Content A survey of some security and privacy issues related to big data. Will organize according to who is collecting/storing data! Intelligence

More information

Is Big Data Good for our Health? You Bet. Here s Why. By Cameron Warren and Merav Yuravlivker

Is Big Data Good for our Health? You Bet. Here s Why. By Cameron Warren and Merav Yuravlivker Is Big Data Good for our Health? You Bet. Here s Why. By Cameron Warren and Merav Yuravlivker The term Big Data is increasingly used in our everyday lives. But each mention of it means something different,

More information

Help to Buy (Equity Loan scheme) and Help to Buy: NewBuy statistics: Data to 30 September 2015,

Help to Buy (Equity Loan scheme) and Help to Buy: NewBuy statistics: Data to 30 September 2015, Help to Buy (Equity Loan scheme) and Help to Buy: NewBuy statistics: Data to 30 September 2015, England In the first 30 months of the Help to Buy: Equity Loan scheme (to 30 September 2015), 62,569 properties

More information

The Evolution of NHS Scotland into a World-Class Healthcare Provider

The Evolution of NHS Scotland into a World-Class Healthcare Provider View the Replay on YouTube The Evolution of NHS Scotland into a World-Class Healthcare Provider FairWarning Executive Webinar Series 15 May 2013 Today s Panel Dr. Daniel Beaumont Information Assurance

More information

Predictive Analytics: Extracts from Red Olive foundational course

Predictive Analytics: Extracts from Red Olive foundational course Predictive Analytics: Extracts from Red Olive foundational course For more details or to speak about a tailored course for your organisation please contact: Jefferson Lynch: jefferson.lynch@red-olive.co.uk

More information

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Software Engineering for Big Data CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Big Data Big data technologies describe a new generation of technologies that aim

More information

Changing the shape of British retirement. www.homewise.co.uk

Changing the shape of British retirement. www.homewise.co.uk Changing the shape of British retirement Free Phone 0800 043 33 66 About us Welcome to Homewise Like many people, you may have spent years looking forward to a dream retirement lifestyle, perhaps wanting

More information

Consumer needs not being met by UK grocery market A British Brands Group research publication

Consumer needs not being met by UK grocery market A British Brands Group research publication Consumer needs not being met by UK grocery market A British Brands Group research publication INTRODUCTION The British Brands Group provides the voice for brand manufacturers in the UK. It is a membership

More information

Harness the power of data to drive marketing ROI

Harness the power of data to drive marketing ROI Harness the power of data to drive marketing ROI I need to get better results from my marketing......and improve my return on investment. Are you directing spend where it ll have the greatest effect? MAKING

More information

Mario Guarracino. Data warehousing

Mario Guarracino. Data warehousing Data warehousing Introduction Since the mid-nineties, it became clear that the databases for analysis and business intelligence need to be separate from operational. In this lecture we will review the

More information

What is the Ideal Labour Code: A Measurement Perspective from the United States Mike Horrigan US Bureau of Labor Statistics

What is the Ideal Labour Code: A Measurement Perspective from the United States Mike Horrigan US Bureau of Labor Statistics What is the Ideal Labour Code: A Measurement Perspective from the United States Mike Horrigan US Bureau of Labor Statistics As with all public policy issues and debates, the quality of data that serves

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

Beyond 2011: Population Coverage Survey Field Test 2013 December 2013

Beyond 2011: Population Coverage Survey Field Test 2013 December 2013 Beyond 2011 Beyond 2011: Population Coverage Survey Field Test 2013 December 2013 Background The Beyond 2011 Programme in the Office for National Statistics (ONS) is currently reviewing options for taking

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Policy on Public and School Bus Closed Circuit Television Systems (CCTV)

Policy on Public and School Bus Closed Circuit Television Systems (CCTV) DEPARTMENT OF TRANSPORT Policy on Public and School Bus Closed Circuit Television Systems (CCTV) Responsibility of: Public Transport Division TRIM File: DDPI2010/3680 Effective Date: July 2010 Version

More information

KAVE ecosystem unlocks the potential of Big Data Building blocks for scalable, manageable and cost-efficient data analysis

KAVE ecosystem unlocks the potential of Big Data Building blocks for scalable, manageable and cost-efficient data analysis 1 KAVE ecosystem unlocks the potential of Big Data KAVE ecosystem unlocks the potential of Big Data Building blocks for scalable, manageable and cost-efficient data analysis Advisory www.kpmg.com/nl 2

More information

Asset management firms and the risk of market abuse

Asset management firms and the risk of market abuse Financial Conduct Authority Asset management firms and the risk of market abuse February 2015 Thematic Review TR15/1 TR15/1 Asset management firms and the risk of market abuse Contents 1. Executive summary

More information

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia The Impact of Big Data on Classic Machine Learning Algorithms Thomas Jensen, Senior Business Analyst @ Expedia Who am I? Senior Business Analyst @ Expedia Working within the competitive intelligence unit

More information

Ten Mistakes to Avoid

Ten Mistakes to Avoid EXCLUSIVELY FOR TDWI PREMIUM MEMBERS TDWI RESEARCH SECOND QUARTER 2014 Ten Mistakes to Avoid In Big Data Analytics Projects By Fern Halper tdwi.org Ten Mistakes to Avoid In Big Data Analytics Projects

More information

BIG DATA FUNDAMENTALS

BIG DATA FUNDAMENTALS BIG DATA FUNDAMENTALS Timeframe Minimum of 30 hours Use the concepts of volume, velocity, variety, veracity and value to define big data Learning outcomes Critically evaluate the need for big data management

More information

Exploiting the Single Customer View to maximise the value of customer relationships

Exploiting the Single Customer View to maximise the value of customer relationships Exploiting the Single Customer View to maximise the value of customer relationships October 2011 Contents 1. Executive summary 2. Introduction 3. What is a single customer view? 4. Obstacles to achieving

More information

Direct Marketing of Insurance. Integration of Marketing, Pricing and Underwriting

Direct Marketing of Insurance. Integration of Marketing, Pricing and Underwriting Direct Marketing of Insurance Integration of Marketing, Pricing and Underwriting As insurers move to direct distribution and database marketing, new approaches to the business, integrating the marketing,

More information

Delivery Plan for the Long Term Strategy for Population Surveys in Scotland 2009 2019

Delivery Plan for the Long Term Strategy for Population Surveys in Scotland 2009 2019 Delivery Plan for the Long Term Strategy for Population Surveys in Scotland 2009 2019 1. The Scottish Government s Long Term Strategy for Population Surveys in Scotland 2009 2019 was publish in June 2009.

More information

Managing your credit cards, scores and reports The do s and don ts

Managing your credit cards, scores and reports The do s and don ts Managing your credit cards, scores and reports The do s and don ts Copyright 2012 Vicki Wusche Helping you to Make More Money From Property www.mmmfp.com Credit cards the do s and don t Introduction I

More information

Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index

Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index Rickard Nyman *, Paul Ormerod Centre for the Study of Decision Making Under Uncertainty,

More information

Chair: Stephen Darvill (Logica) Raporteur: Edward Phelps (EURIM) SUMMARY OF ROUND TABLE STATEMENTS AND DISCUSSION

Chair: Stephen Darvill (Logica) Raporteur: Edward Phelps (EURIM) SUMMARY OF ROUND TABLE STATEMENTS AND DISCUSSION 1 Summary Report of the Directors Round Table on Information Governance, 1600-1800, 24 th November 2008, The Boothroyd Room, Portcullis House, Westminster Chair: Stephen Darvill (Logica) Raporteur: Edward

More information

SMALL BUSINESS REPUTATION & THE CYBER RISK

SMALL BUSINESS REPUTATION & THE CYBER RISK SMALL BUSINESS REPUTATION & THE CYBER RISK Executive summary In the past few years there has been a rapid expansion in the development and adoption of new communications technologies which continue to

More information

Strategic plan. Outline

Strategic plan. Outline Strategic plan Outline 1 Introduction Our vision Our role Our mandate 2 About us Our governance Our structure 3 Context Our development Camden 4 Resources Funding Partners 5 Operating model How we will

More information

Statistics, Big Data and Data Science!?

Statistics, Big Data and Data Science!? Statistics, Big Data and Data Science!? Prof. Dr. Göran Kauermann Ludwig-Maximilians-Universität Munich, Germany Statistics, Big Data and Data Science Statistics Founded around 1900 with the seminal work

More information

1. Understanding Big Data

1. Understanding Big Data Big Data and its Real Impact on Your Security & Privacy Framework: A Pragmatic Overview Erik Luysterborg Partner, Deloitte EMEA Data Protection & Privacy leader Prague, SCCE, March 22 nd 2016 1. 2016 Deloitte

More information

Economic Commentaries

Economic Commentaries n Economic Commentaries Data and statistics are a cornerstone of the Riksbank s work. In recent years, the supply of data has increased dramatically and this trend is set to continue as an ever-greater

More information

5 Point Social Media Action Plan.

5 Point Social Media Action Plan. 5 Point Social Media Action Plan. Workshop delivered by Ian Gibbins, IG Media Marketing Ltd (ian@igmediamarketing.com, tel: 01733 241537) On behalf of the Chambers Communications Sector Introduction: There

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Medical Big Data Interpretation

Medical Big Data Interpretation Medical Big Data Interpretation Vice president of the Xiangya Hospital, Central South University The director of the ministry of mobile medical education key laboratory Professor Jianzhong Hu BIG DATA

More information

A RE YOU SUFFERING FROM A DATA PROBLEM?

A RE YOU SUFFERING FROM A DATA PROBLEM? June 2012 A RE YOU SUFFERING FROM A DATA PROBLEM? DO YOU NEED A DATA MANAGEMENT STRATEGY? Most businesses today suffer from a data problem. Yet many don t even know it. How do you know if you have a data

More information

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics The Data Engineer Mike Tamir Chief Science Officer Galvanize Steven Miller Global Leader Academic Programs IBM Analytics Alessandro Gagliardi Lead Faculty Galvanize Businesses are quickly realizing that

More information

New Solutions New Opportunities

New Solutions New Opportunities New Solutions New Opportunities What is Penton Marketing Services Penton Marketing Services offers a full range of content solutions, digital services and lead nurturing and qualifying services that leverage

More information

The big data dilemma an inquiry by the House of Commons Select Committee on Science and Technology

The big data dilemma an inquiry by the House of Commons Select Committee on Science and Technology The big data dilemma an inquiry by the House of Commons Select Committee on Science and Technology Evidence from the UK Computing Research Committee Definitive. 1 September 2015 The UK Computing Research

More information

Big Data and Analytics

Big Data and Analytics INSIDE TRACK Analyst commentary with a real-world edge Big Data and Analytics Dazzling new solutions or irritating new hype? By Tony Lock, November 2012 Originally published on http://www.theregister.co.uk/

More information

Understand how PPC can help you achieve your marketing objectives at every stage of the sales funnel.

Understand how PPC can help you achieve your marketing objectives at every stage of the sales funnel. 1 Understand how PPC can help you achieve your marketing objectives at every stage of the sales funnel. 2 WHO IS THIS GUIDE FOR? This guide is written for marketers who want to get more from their digital

More information

incommmsec Whitepaper: Increasing role of employee monitoring within public and private sector organisations.

incommmsec Whitepaper: Increasing role of employee monitoring within public and private sector organisations. Whitepaper: Increasing role of employee monitoring within public and private sector organisations. GOOD GOVERNANCE OR BIG BROTHER STATE? So what do we mean by Network or Employee monitoring? For the purpose

More information

How To Buy Insurance Online From An Insurer

How To Buy Insurance Online From An Insurer Casualty Actuarial Society Annual Meeting Brace Yourselves For Direct Sales To Small-Business Insurance Consumers! Sam Friedman, Insurance Research Leader, Deloitte Center for Financial Services Donna

More information

Lead Scoring. Five steps to getting started. wowanalytics. 730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.

Lead Scoring. Five steps to getting started. wowanalytics. 730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc. Lead Scoring Five steps to getting started supported and sponsored by wowanalytics 730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.com LEAD SCORING 5 steps to getting

More information

INSURANCE N BRIEF. i n s u r a n c e b r o k e r s Issue No.6 Autumn 2013

INSURANCE N BRIEF. i n s u r a n c e b r o k e r s Issue No.6 Autumn 2013 INSURANCE N BRIEF i n s u r a n c e b r o k e r s Issue No.6 Autumn 2013 Contents Are your fleet insurers fighting your corner when it comes to motor claims? Insurance Fraud: Don t Be a Victim Professional

More information

Identification of problem solving approaches and appropriate tools (not only R even though this is important)

Identification of problem solving approaches and appropriate tools (not only R even though this is important) Lecture Outline Thus, the lecture will contain: 1. Introduction 2. How to frame the business problem 3. How to transfer it to a problem which can be solved with analytics methods 4. Data identification

More information

Social Media & Its Importance For Brands. September 2015

Social Media & Its Importance For Brands. September 2015 Social Media & Its Importance For Brands September 2015 Contents 3. Introduction 4. Key Points 5. Conclusions 7. Frequency Of Using Social Media 8. Interacting With Consumers 10. Types Of Company Ever

More information

Information for registrants. What happens if a concern is raised about me?

Information for registrants. What happens if a concern is raised about me? Information for registrants What happens if a concern is raised about me? Contents About this brochure 1 What is fitness to practise? 1 What can I expect from you? 3 How are fitness to practise concerns

More information

730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.com

730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.com Lead Scoring: Five Steps to Getting Started 730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.com Introduction Lead scoring applies mathematical formulas to rank potential

More information

Business Plan 2012/13

Business Plan 2012/13 Business Plan 2012/13 Contents Introduction 3 About the NFA..4 Priorities for 2012/13 4 Resources.6 Reporting Arrangements.6 Objective 1 7 To raise the profile and awareness of fraud among individuals,

More information

Whether the government is correct in describing the UK as the whiplash capital of the world

Whether the government is correct in describing the UK as the whiplash capital of the world Whiplash and the cost of motor insurance: what s behind the insurance industry claims Submission to the Transport Committee by Thompsons Solicitors April 2013 About Thompsons Thompsons is the UK s most

More information

Quick Guide to NHS Continuing Healthcare Funding

Quick Guide to NHS Continuing Healthcare Funding Quick Guide to NHS Continuing Healthcare Funding Contents 1 How can Farley Dwek help families with their eligibility for NHS Continuing Healthcare Funding? 3 2 3 4 5 6 7 8 Why is it important for families

More information

(Big) Data Analytics: From Word Counts to Population Opinions

(Big) Data Analytics: From Word Counts to Population Opinions (Big) Data Analytics: From Word Counts to Population Opinions Mark Keane Insight@University College Dublin October 2014 ~ RSS ~ Edinburgh September 2014/EPIC 2 September 2014/EPIC 3 September 2014/EPIC

More information

Your guide to using new media

Your guide to using new media Your guide to using new media A comprehensive guide for the charity and voluntary sector with tips on how to make the most of new, low cost communication tools such as social media and email marketing.

More information

Flood Incident Management - the next ten years

Flood Incident Management - the next ten years Flood Incident Management - the next ten years Consultation on the future of FlM in England and Wales, Aug Oct 2012 Table of Contents 1. About this consultation 3 1.1 Introduction 1.2 Why are we consulting?

More information

Google: Trust, Choice, and Privacy

Google: Trust, Choice, and Privacy Google: Trust, Choice, and Privacy Gus Meuli, Caitlin Finn Trust is hard to earn, easy to loose, and nearly impossible to win back. 1 This statement seems to ring true in the constantly changing world

More information

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH 205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

More information

CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE

CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE Michael Diederich, Microsoft CMG Research & Insights Introduction The rise of social media platforms like Facebook and Twitter has created new

More information

Basic research methods. Basic research methods. Question: BRM.2. Question: BRM.1

Basic research methods. Basic research methods. Question: BRM.2. Question: BRM.1 BRM.1 The proportion of individuals with a particular disease who die from that condition is called... BRM.2 This study design examines factors that may contribute to a condition by comparing subjects

More information

AGA Kansas City Chapter Data Analytics & Continuous Monitoring

AGA Kansas City Chapter Data Analytics & Continuous Monitoring AGA Kansas City Chapter Data Analytics & Continuous Monitoring Agenda Market Overview & Drivers for Change Key challenges that organizations face Data Analytics What is data analytics and how can it help

More information

Web analytics: Data Collected via the Internet

Web analytics: Data Collected via the Internet Database Marketing Fall 2016 Web analytics (incl real-time data) Collaborative filtering Facebook advertising Mobile marketing Slide set 8 1 Web analytics: Data Collected via the Internet Customers can

More information

AQA Level 3 Technical Level Business

AQA Level 3 Technical Level Business AQA Level 3 Technical Level Business Marketing principles Unit Number: Y/506/6086 Specimen Question Paper Time allowed: 2 hours Instructions Use black ink or black ball-point pen Answer all questions You

More information

Procedure for Managing a Privacy Breach

Procedure for Managing a Privacy Breach Procedure for Managing a Privacy Breach (From the Privacy Policy and Procedures available at: http://www.mun.ca/policy/site/view/index.php?privacy ) A privacy breach occurs when there is unauthorized access

More information

BIG DATA LITTLE DATA AND CRM

BIG DATA LITTLE DATA AND CRM BIG DATA LITTLE DATA AND CRM 11 Tips to Enable Data Using Microsoft Dynamics CRM in Todays Big Data Age A Caltech CRM Publication george@caltech.co.uk 1 IS THIS BOOK RIGHT FOR ME? Is this e-book right

More information

A U T H O R S : G a n e s h S r i n i v a s a n a n d S a n d e e p W a g h Social Media Analytics

A U T H O R S : G a n e s h S r i n i v a s a n a n d S a n d e e p W a g h Social Media Analytics contents A U T H O R S : G a n e s h S r i n i v a s a n a n d S a n d e e p W a g h Social Media Analytics Abstract... 2 Need of Social Content Analytics... 3 Social Media Content Analytics... 4 Inferences

More information

Nagarjuna College Of

Nagarjuna College Of Nagarjuna College Of Information Technology (Bachelor in Information Management) TRIBHUVAN UNIVERSITY Project Report on World s successful data mining and data warehousing projects Submitted By: Submitted

More information

How To Predict Stock Price With Mood Based Models

How To Predict Stock Price With Mood Based Models Twitter Mood Predicts the Stock Market Xiao-Jun Zeng School of Computer Science University of Manchester x.zeng@manchester.ac.uk Outline Introduction and Motivation Approach Framework Twitter mood model

More information

5 WAYS TO DOUBLE YOUR WEB SITE S SALES IN THE NEXT 12 MONTHS

5 WAYS TO DOUBLE YOUR WEB SITE S SALES IN THE NEXT 12 MONTHS HIGH CONVERSION LAB 5 WAYS TO DOUBLE YOUR WEB SITE S SALES IN THE NEXT 12 MONTHS By Ryan Berg Head Profit Scientist at HighConversionLab.com The Struggle to Build a Thriving Online Business as the Internet

More information

Halton Borough Council. Privacy Notice

Halton Borough Council. Privacy Notice Halton Borough Council Privacy Notice Halton Borough Council is registered as a data controller under the Data Protection Act as we collect and process personal information about you. The information we

More information

Impact of Genetic Testing on Life Insurance

Impact of Genetic Testing on Life Insurance Agenda, Volume 10, Number 1, 2003, pages 61-72 Impact of Genetic Testing on Life Insurance he Human Genome project generates immense interest in the scientific community though there are also important

More information

Sentiment Analysis of Twitter Feeds for the Prediction of Stock Market Movement

Sentiment Analysis of Twitter Feeds for the Prediction of Stock Market Movement Sentiment Analysis of Twitter Feeds for the Prediction of Stock Market Movement Ray Chen, Marius Lazer Abstract In this paper, we investigate the relationship between Twitter feed content and stock market

More information

Paid Search Services

Paid Search Services Paid Search Services Results-driven, mathematical PPC Pay Per Click advertising is widely acknowledged as the most powerful form of direct marketing. Not only are you gaining access to people that want

More information

The Best Reasons to Purchase a House Now!

The Best Reasons to Purchase a House Now! Interactive Marketing and Site Tools Study In the introduction to their report Competitive Strategy In The Age Of The Customer, Forrester stated In this age of the customer, the only sustainable competitive

More information

How To Share Your Health Records With The National Health Service

How To Share Your Health Records With The National Health Service HOW WE USE YOUR PERSONAL INFORMATION Information Leaflet Your Health. Our Priority. Page 2 of 9 Introduction This Leaflet explains why the NHS collects information about you and how it is used, your right

More information

The Importance of Sharing Medical Data

The Importance of Sharing Medical Data Diving into the Data Pool Exploring public views about the way medical data is shared Report from public event on 31 October 2013 Should it be easier for medical data to be shared to help research? What

More information

CYBER STREETWISE. Open for Business

CYBER STREETWISE. Open for Business CYBER STREETWISE Open for Business As digital technologies transform the way we live and work, they also change the way that business is being done. There are massive opportunities for businesses that

More information

Statistical & Technical Team

Statistical & Technical Team Statistical & Technical Team A Practical Guide to Sampling This guide is brought to you by the Statistical and Technical Team, who form part of the VFM Development Team. They are responsible for advice

More information

Safer food supply chains why assessments are great news for your business

Safer food supply chains why assessments are great news for your business Safer food supply chains why assessments are great news for your business Article By Vel Pillay, a food safety expert for LRQA America; and Cor Groenveld, Global Food Product Manager of LRQA and chairman

More information

MICRO-COMPUTER BASED REAL ESTATE DECISION MAKING AND INFORMATION MANAGEMENT - AN INTEGRATED APPROACH

MICRO-COMPUTER BASED REAL ESTATE DECISION MAKING AND INFORMATION MANAGEMENT - AN INTEGRATED APPROACH MICRO-COMPUTER BASED REAL ESTATE DECISION MAKING AND INFORMATION MANAGEMENT - AN INTEGRATED APPROACH Introduction Paul Kershaw, Rob Kooymans and Peter Rossini Department of Property Resource Management

More information

Uninterrupted Operational Intelligence and Business Analytics

Uninterrupted Operational Intelligence and Business Analytics Atre Group, Inc. Uninterrupted Operational Intelligence and Business Analytics To provide value from all data to all users from all channels by sensing and responding ASAP March 19, 2014 San Mateo, California,

More information

How to look better online

How to look better online Online Reputation Management How to look better online Online Reputation Management for CEOs, Rising Stars, VIPs and Their Organizations Shannon M. Wilkinson Table of Contents Why This Guide?... 1 Introduction...

More information

PUBLIC HEALTH MEETS SOCIAL MEDIA: MINING HEALTH INFO FROM TWITTER

PUBLIC HEALTH MEETS SOCIAL MEDIA: MINING HEALTH INFO FROM TWITTER PUBLIC HEALTH MEETS SOCIAL MEDIA: MINING HEALTH INFO FROM TWITTER Michael Paul (@mjp39) Johns Hopkins University Crowdsourcing and Human Computation Lecture 18 Learning about the real world through Twitter

More information

Setting the Standard for Safe City Projects in the United States

Setting the Standard for Safe City Projects in the United States Leading Safe Cities Setting the Standard for Safe City Projects in the United States Edge360 is a provider of Safe City solutions to State & Local governments, helping our clients ensure they have a secure,

More information